Syndicated Actor Model
The Syndicated Actor Model (SAM) [Garnock-Jones 2017] is an approach to concurrency based on the Communicating Event-Loop Actor Model [De Koster et al 2016] as pioneered by E [Miller 2006] and AmbientTalk [Van Cutsem et al 2007].
While other Actor-like models take message-passing as fundamental, the SAM builds on a different underlying primitive: eventually-consistent replication of state among actors. Message-passing follows as a derived operation.
This fundamental difference integrates Tuplespace- [Gelernter and Carriero 1992] and publish/subscribe-like ideas with concurrent object-oriented programming, and makes the SAM well-suited for building programs and systems that are reactive, robust to change, and graceful in the face of partial failure.
Outline. This document first describes the primitives of SAM interaction, and then briefly illustrates their application to distributed state management and handling of partial failure. It goes on to present the idea of a dataspace, an integration of Tuplespace- and publish/subscribe-like ideas with the SAM. Finally, it discusses the SAM's generalization of object capabilities to allow for control not only over invocation of object behaviour but subscription to object state.
Throughout, we will limit discussion to interaction among actors connected directly to one another: that is, to interaction within a single scope. Scopes can be treated as "subnets" and connected together: see the Syndicate protocol specification.
For more on the SAM, on the concept of "conversational concurrency" that the model is a response to, and on other aspects of the larger project that the SAM is a part of, please see https://syndicate-lang.org/about/ and Garnock-Jones' 2017 dissertation.
Concepts and components of SAM interaction
A number of inter-related ideas must be taken together to make sense of SAM interaction. This section will outline the essentials.
For core concepts of Actor models generally, see De Koster et al.'s outstanding 2016 survey paper, which lays out a taxonomy of Actor systems as well as introducing solid definitions for terms such as "actor", "message", and so on.
Actors, Entities, Assertions and Messages
The SAM is based around actors which not only exchange messages, but publish ("assert") selected portions of their internal state ("assertions") to their peers in a publish/subscribe, reactive manner. Assertions and messages in the SAM are semi-structured data: their structure allows for pattern-matching and content-based routing.
Assertions are published and withdrawn freely throughout each actor's lifetime. When an actor terminates, all its published assertions are automatically withdrawn. This holds for both normal and exceptional termination: crashing actors are cleaned up, too.
An actor in the SAM comprises
- an inbox, for receiving events from peers;
- a state, "all the state that is synchronously accessible by that actor" (De Koster et al 2016);
- a collection of entities; and
- a collection of outbound assertions, the data to be automatically retracted upon actor termination.
The term "entity" in the SAM denotes a reactive object, owned by a specific actor [note 1]. Entities, not actors, are the unit of addressing in the SAM. Every published assertion and every sent message is targeted at some entity. Entities never outlive their actors (when an actor terminates, its entities become unresponsive), but they may have lifetimes shorter than those of their owning actors.
Local interactions, among objects (entities) within the state of the same actor, occur synchronously. All other interactions are considered "remote", and occur asynchronously.
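The bookkeeping behind automatic retraction can be sketched in a few lines of Python. This is a minimal illustration rather than the API of any Syndicate implementation: the names Entity, Actor, assert_at, retract and terminate are assumptions made for the example, the inbox and entity collection are omitted, and delivery is a direct call instead of an asynchronous event.

```python
import itertools
from typing import Any, Dict

_handles = itertools.count()  # hypothetical source of unique assertion handles


class Entity:
    """A reactive object that records what peers currently assert at it."""

    def __init__(self) -> None:
        self.assertions: Dict[int, Any] = {}  # handle -> asserted value

    def on_assert(self, handle: int, value: Any) -> None:
        self.assertions[handle] = value

    def on_retract(self, handle: int) -> None:
        self.assertions.pop(handle, None)


class Actor:
    """Tracks outbound assertions so that termination can withdraw them."""

    def __init__(self) -> None:
        self.alive = True
        self.outbound: Dict[int, Entity] = {}  # handle -> target entity

    def assert_at(self, target: Entity, value: Any) -> int:
        handle = next(_handles)
        self.outbound[handle] = target
        target.on_assert(handle, value)  # simplification: delivered at once
        return handle

    def retract(self, handle: int) -> None:
        target = self.outbound.pop(handle)
        target.on_retract(handle)

    def terminate(self) -> None:
        # Normal exit and crash share this path: every outbound assertion goes.
        for handle in list(self.outbound):
            self.retract(handle)
        self.alive = False
```

The point of the sketch is that the peer entity never has to ask whether the asserting actor is still alive: the collection of outbound assertions is exactly the data needed to clean up after it.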
Turns
Each time an event arrives at an actor's inbox, the actor takes a turn. De Koster et al. define turns as follows:
A turn is defined as the processing of a single message by an actor. In other words, a turn defines the process of an actor taking a message from its inbox and processing that message to completion.
In the SAM, a turn comprises
- the event that triggered the turn and the entity addressed by the event,
- the entity's execution of its response to the event, and
- the collection of pending actions produced during execution.
If a turn proceeds to completion without an exception or other crash, its pending actions are delivered to their target entities/actors. If, on the other hand, the turn is aborted for some reason, its pending actions are discarded. This transactional "commit" or "rollback" of a turn is familiar from other event-loop-style models such as Ken [Yoo et al 2012].
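The commit-or-rollback behaviour of a turn can be illustrated with a small Python sketch. The names Turn, enqueue and run_turn are hypothetical, chosen for this example only, and actions are modelled as plain callables instead of SAM events.

```python
from typing import Callable, List

Action = Callable[[], None]


class Turn:
    """Collects the actions an entity schedules while handling one event."""

    def __init__(self) -> None:
        self.pending: List[Action] = []

    def enqueue(self, action: Action) -> None:
        self.pending.append(action)


def run_turn(handler: Callable[[Turn], None]) -> bool:
    """Run one turn; deliver pending actions only if the handler completes."""
    turn = Turn()
    try:
        handler(turn)
    except Exception:
        return False  # abort: pending actions are discarded, never delivered
    for action in turn.pending:
        action()  # commit: deliver in the order the actions were scheduled
    return True
```

Because nothing leaves the turn until the handler has finished, peers never observe the partial effects of a turn that later crashes.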
Events and Actions
SAM events convey a new assertion, retraction of a previously-established assertion, delivery of a message, or a request for synchronisation.
In response to an event, an actor (entity) schedules actions to be performed at the end of the turn. Actions include not only publication and retraction of assertions, transmission of messages, and issuing of synchronisation requests, but also termination of the running actor and creation of new actors to run alongside it.
Entity References are Object Capabilities
As mentioned above, entities are the unit of addressing in the SAM. Assertions and message bodies may include references to entities. Actors receiving such references may then use them as targets for later assertions and messages. Entity references act as object capabilities, very similar to those offered by E [Miller 2006].
Entity references play many roles in SAM interactions, but two are of particular importance. First, entity references are used to simulate functions and continuations for encoding remote procedure calls (RPCs). Second, entity references can act like consumers or subscribers, receiving asynchronous notifications about state changes from peers.
Illustrative Examples
To show some of the potential of the SAM, we will explore two representative examples: a distributed spreadsheet, and a cellular modem server.
Spreadsheet cell
Imagine a collection of actors representing portions of a spreadsheet, each containing entities representing spreadsheet cells. Each cell entity publishes public aspects of its state to interested peers: namely, its current value. It also responds to messages instructing it to update its formula. In pseudocode:
 1: define entity Cell(formula):
 2:     subscribers ← ∅
 3:     on assertion from a peer of interest in our value,
 4:         add peer, the entity reference carried in the assertion of interest, to subscribers
 5:     on retraction of previously-expressed interest from some peer,
 6:         remove peer from subscribers
 7:     assert subscriptions to other Cells (using entity references in formula)
 8:     on message conveying a new formula,
 9:         formula ← newFormula
10:         replace subscription assertions using references in new formula
11:     on assertion conveying updated contents relevant to formula,
12:         value ← eval(formula)
13:     continuously, whenever value or subscribers changes,
14:         assert the contents of value to every peer in subscribers,
15:         retracting previously-asserted values
Much of the subscription-management behaviour of Cell is generic: lines 2–6 managing the subscribers set and lines 13–14 iterating over it will be common to any entity wishing to allow observers to track portions of its state. This observation leads to the factoring-out of dataspaces, introduced below.
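To make the amount of generic bookkeeping concrete, here is a rough Python rendering of the Cell pseudocode. It is a sketch under several assumptions that are not part of the SAM: formulas are plain functions over a dictionary of input values, peers are notified by direct synchronous calls to a hypothetical on_value method, and cyclic formulas are not handled.

```python
from typing import Callable, Dict, Iterable, List, Optional, Set

Formula = Callable[[Dict[str, float]], float]


class Cell:
    def __init__(self, name: str, formula: Formula,
                 sources: Iterable["Cell"] = ()) -> None:
        self.name = name
        self.value: Optional[float] = None
        self.subscribers: Set["Cell"] = set()
        self.sources: List["Cell"] = []
        self.inputs: Dict[str, float] = {}  # latest value seen per source
        self.set_formula(formula, sources)

    # Generic subscriber management (lines 2-6 of the pseudocode).
    def subscribe(self, peer: "Cell") -> None:
        self.subscribers.add(peer)
        peer.on_value(self.name, self.value)

    def unsubscribe(self, peer: "Cell") -> None:
        self.subscribers.discard(peer)

    # Cell-specific behaviour (lines 7-12).
    def set_formula(self, formula: Formula,
                    sources: Iterable["Cell"] = ()) -> None:
        for src in self.sources:
            src.unsubscribe(self)
        self.formula, self.sources, self.inputs = formula, list(sources), {}
        for src in self.sources:
            src.subscribe(self)
        self._recompute()

    def on_value(self, name: str, value: float) -> None:
        self.inputs[name] = value
        self._recompute()

    # Generic publication to subscribers (lines 13-15).
    def _recompute(self) -> None:
        if len(self.inputs) != len(self.sources):
            return  # still waiting for some source to report
        new_value = self.formula(self.inputs)
        if new_value != self.value:
            self.value = new_value
            for peer in list(self.subscribers):
                peer.on_value(self.name, new_value)
```

For instance, with a1 = Cell("A1", lambda v: 2) and b1 = Cell("B1", lambda v: v["A1"] + 1, [a1]), calling a1.set_formula(lambda v: 10) propagates to b1. Only set_formula and on_value are specific to spreadsheet cells; the rest would be repeated in any entity that publishes state to observers.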
Cellular modem server
Imagine an actor implementing a simple driver for a cellular modem, that accepts requests (as Hayes modem command strings) paired with continuations represented as entity references. Any responses the modem sends in reply to a command string are delivered to the continuation entity as a SAM message.
define entity Server():
    on assertion Request(commandString, replyEntity)
        output commandString via modem serial port
        collect response(s) from modem serial port
        send response(s) as a message to replyEntity

define entity Client(serverRef):
    define entity k:
        on message containing responses,
            retract the Request assertion
            (and continue with other tasks)
    assert Request("AT+CMGS=...", k) to serverRef
This is almost a standard continuation-passing style encoding of remote procedure call [note 2]. However, there is one important difference: the request is sent to the remote object not as a message, but as an assertion. Assertions, unlike messages, have a lifetime and so can act to set up a conversational frame within which further interaction can take place.
Here, subsequent interaction appears at first glance to be limited to transmission of a response message to replyEntity. But what if the Server were to crash before sending a response?
Erlang [Armstrong 2003] pioneered the use of "links" and "monitors" to detect failure of a remote peer during an interaction; "broken promises" and a suite of special system messages such as __whenBroken and __reactToLostClient [Miller 2006, chapter 17] do the same for E. The SAM instead uses retraction of previous assertions to signal failure.
To see how this works, we must step away from the pseudocode above and examine the context where serverRef is discovered for eventual use with Client. If an assertion, rather than a message, conveys serverRef to the client actor, then that assertion is automatically retracted when Server crashes. The client actor, interpreting the retraction as failure, can choose to respond appropriately.
The ubiquity of these patterns of service discovery and failure signalling also contributed, along with the patterns of generic publisher/subscriber state management mentioned above, to the factoring-out of dataspaces.
Dataspaces
A special kind of syndicated actor entity, a dataspace, routes and replicates published data according to actors' interests.
define entity Dataspace():
    allAssertions ← new Bag()
    allSubscribers ← new Set()
    on assertion of semi-structured datum a,
        add a to allAssertions
        if a appears exactly once now in allAssertions,
            if a matches Observe(pattern, subscriberRef),
                add (pattern, subscriberRef) to allSubscribers
                for x in allAssertions, if x matches pattern,
                    assert x at subscriberRef
            otherwise,
                for (p, s) in allSubscribers, if a matches p,
                    assert a at s
    on retraction of previously-asserted a,
        remove a from allAssertions
        if a no longer appears at all in allAssertions,
            retract a from all subscribers to whom it was forwarded
            if a matches Observe(pattern, subscriberRef),
                remove (pattern, subscriberRef) from allSubscribers
                retract all assertions previously sent to subscriberRef
Assertions sent to a dataspace are routed by pattern-matching. Subscriptions—tuples associating a pattern with a subscriber entity—are placed in the dataspace as assertions like any other.
A dataspace entity behaves very similarly to a tuplespace [Gelernter and Carriero 1992]. However, there are two key differences.
The first is that, while tuples in a tuplespace are "generative" [Gelernter 1985], taking on independent existence once created and potentially remaining in a tuplespace indefinitely, SAM assertions never outlive their asserting actors. This means that assertions placed at a dataspace only exist as long as they are actively maintained. If an actor terminates or crashes, all its assertions are withdrawn, including those targeted at a dataspace entity. The dataspace, following its definition, forwards all withdrawals on to interested subscribers.
The second is that assertion of a value is idempotent: multiple assertions of the same value [note 3] are indistinguishable, to observers, from a single assertion. In other words, assertions at a dataspace are deduplicated.
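The Dataspace pseudocode and its deduplicating behaviour can be approximated in Python as follows. This is a sketch, not a real Syndicate dataspace: WILD, matches and Recorder are hypothetical helpers, patterns are plain tuples with a wildcard marker instead of the real pattern language, and events are delivered by direct method calls. Like the pseudocode, it forwards Observe records only to the subscriber they name.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Any, List, Set, Tuple

WILD = object()  # hypothetical wildcard, standing in for "_" in patterns


@dataclass(frozen=True)
class Observe:
    pattern: Any
    subscriber: Any


def matches(pattern: Any, value: Any) -> bool:
    """Structural match over tuples; WILD matches anything."""
    if pattern is WILD:
        return True
    if isinstance(pattern, tuple) and isinstance(value, tuple):
        return len(pattern) == len(value) and all(
            matches(p, v) for p, v in zip(pattern, value))
    return pattern == value


class Dataspace:
    def __init__(self) -> None:
        self.all_assertions: Counter = Counter()  # a bag: value -> count
        self.all_subscribers: Set[Tuple[Any, Any]] = set()

    def on_assert(self, a: Any) -> None:
        self.all_assertions[a] += 1
        if self.all_assertions[a] != 1:
            return  # duplicate: observers already know about a
        if isinstance(a, Observe):
            self.all_subscribers.add((a.pattern, a.subscriber))
            for x in list(self.all_assertions):
                if matches(a.pattern, x):
                    a.subscriber.on_assert(x)
        else:
            for p, s in list(self.all_subscribers):
                if matches(p, a):
                    s.on_assert(a)

    def on_retract(self, a: Any) -> None:
        if self.all_assertions[a] == 0:
            return  # never asserted: nothing to withdraw
        if self.all_assertions[a] > 1:
            self.all_assertions[a] -= 1
            return  # other copies remain: observers see no change
        if isinstance(a, Observe):
            self.all_subscribers.discard((a.pattern, a.subscriber))
            for x in list(self.all_assertions):
                if matches(a.pattern, x):
                    a.subscriber.on_retract(x)
        else:
            for p, s in list(self.all_subscribers):
                if matches(p, a):
                    s.on_retract(a)
        del self.all_assertions[a]


class Recorder:
    """Example subscriber that logs the events it receives."""

    def __init__(self) -> None:
        self.events: List[Tuple[str, Any]] = []

    def on_assert(self, value: Any) -> None:
        self.events.append(("+", value))

    def on_retract(self, value: Any) -> None:
        self.events.append(("-", value))
```

Asserting the same record twice and then subscribing produces a single "+" event at the Recorder, and the matching "-" event arrives only once both copies have been retracted.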
Applications of dataspaces
Dataspaces have many uses. They are ubiquitous in SAM programs. The form of state replication embodied in dataspaces subsumes Erlang-style links and monitors, publish/subscribe, tuplespaces, presence notifications, directory/naming services, and so on.
Subscription management
The very essence of a dataspace entity is subscription management. Entities wishing to manage collections of subscribers can cooperate with dataspaces: they may either manage a private dataspace entity, or share a dataspace with other entities. For example, in the spreadsheet cell example above, each cell could use its own private dataspace, or all cells could share a dataspace by embedding their values in a record alongside some name for the cell.
Service directory and service discovery
Assertions placed at a dataspace may include entity references. This makes a dataspace an ideal implementation of a service directory. Services advertise their existence by asserting service presence [Konieczny et al 2009] records including their names alongside relevant entity references:
Service("name", serviceRef)
Clients discover services by asserting interest in such records using patterns:
Observe(⌜Service("name", _)⌝, clientRef)
Whenever some matching Service record has been asserted by a server, the dataspace asserts the corresponding record to clientRef. (The real dataspace pattern language includes binding, not discussed here; see "Patterns over assertions" in the Syndicate protocol documentation.)
Failure signalling
Since assertions of service presence are withdrawn on failure, and withdrawals are propagated to interested subscribers, service clients like clientRef above will be automatically notified whenever serviceRef goes out of service. The same principle can also be applied in other similar settings.
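Discovery and failure signalling can be seen together in a compact Python sketch. Everything here is an assumption made for illustration: MiniDataspace is a cut-down stand-in that omits deduplication, it tags each record with a hypothetical owner name so that termination can be simulated in one call (in the SAM it is the terminated actor's own outbound assertions that are retracted), and Client simply remembers the most recent service reference it was told about.

```python
from typing import Any, List, Optional, Tuple

WILD = object()  # hypothetical wildcard, standing in for "_"


def matches(pattern: Any, value: Any) -> bool:
    if pattern is WILD:
        return True
    if isinstance(pattern, tuple) and isinstance(value, tuple):
        return len(pattern) == len(value) and all(
            matches(p, v) for p, v in zip(pattern, value))
    return pattern == value


class MiniDataspace:
    """Cut-down dataspace: owner-tagged records, no deduplication."""

    def __init__(self) -> None:
        self.records: List[Tuple[str, Any]] = []    # (owner, value)
        self.observers: List[Tuple[Any, Any]] = []  # (pattern, subscriber)

    def observe(self, pattern: Any, subscriber: Any) -> None:
        self.observers.append((pattern, subscriber))
        for _, value in self.records:
            if matches(pattern, value):
                subscriber.on_assert(value)

    def assert_(self, owner: str, value: Any) -> None:
        self.records.append((owner, value))
        for pattern, subscriber in self.observers:
            if matches(pattern, value):
                subscriber.on_assert(value)

    def actor_terminated(self, owner: str) -> None:
        # Withdraw everything the terminated actor had asserted here.
        gone = [v for o, v in self.records if o == owner]
        self.records = [(o, v) for o, v in self.records if o != owner]
        for value in gone:
            for pattern, subscriber in self.observers:
                if matches(pattern, value):
                    subscriber.on_retract(value)


class Client:
    """Tracks which service reference, if any, is currently available."""

    def __init__(self) -> None:
        self.service_ref: Optional[Any] = None

    def on_assert(self, record: Any) -> None:
        _, _, ref = record
        self.service_ref = ref

    def on_retract(self, record: Any) -> None:
        self.service_ref = None
```

A client that observes ("Service", "modem", WILD) learns the service reference when the record appears and sees it cleared again when the owning actor terminates, without any separate monitoring mechanism.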
Independence from service identity
There's no need to separate service discovery from service interaction. A client may assert its request directly at the dataspace; a service may subscribe to requests in the same direct way:
(client:)
ServiceRequest("name", arg1, arg2, ..., replyRef)
(server:)
Observe(⌜ServiceRequest("name", ?a, ?b, ..., ?k)⌝, serviceRef)
In fact, there are benefits to doing things this way. If the service should crash mid-transaction, then when it restarts, the incomplete ServiceRequest record will remain, and it can pick up where it left off. The client has become decoupled from the specific identity of the service provider, allowing flexibility that wasn't available before.
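The ?a, ?b and ?k placeholders in the server's pattern bind the corresponding parts of each matching request. A toy matcher with binding might look like the following Python sketch. It is not the Syndicate pattern language: Capture, DISCARD and match are hypothetical names, and records are modelled as plain tuples.

```python
from typing import Any, Dict, Optional

DISCARD = object()  # stands in for "_": matches anything, binds nothing


class Capture:
    """Hypothetical binding marker, standing in for ?a, ?b, ... in the text."""

    def __init__(self, name: str) -> None:
        self.name = name


def match(pattern: Any, value: Any) -> Optional[Dict[str, Any]]:
    """Return the bindings if value matches pattern, otherwise None."""
    bindings: Dict[str, Any] = {}

    def walk(p: Any, v: Any) -> bool:
        if p is DISCARD:
            return True
        if isinstance(p, Capture):
            bindings[p.name] = v
            return True
        if isinstance(p, tuple) and isinstance(v, tuple):
            return len(p) == len(v) and all(
                walk(pi, vi) for pi, vi in zip(p, v))
        return p == v

    return bindings if walk(pattern, value) else None
```

A server holding the pattern ("ServiceRequest", "name", Capture("a"), Capture("b"), Capture("k")) would receive the two arguments and the reply reference of each matching request as bindings, and would see nothing for requests addressed to a different service name.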
Asserting interest in assertions of interest
Subscriptions at a dataspace are assertions like any other. This opens up the possibility of reacting to subscriptions:
Observe(⌜Observe(⌜...⌝, _)⌝, r)
This allows dataspace subscribers to express interest in which other subscribers are present.
In many cases, explicit assertion of presence (via, e.g., the Service records above) is the right thing to do, but from time to time it can make sense for clients to treat the presence of some subscriber interested in their requests as sufficient indication of service presence to go ahead [note 4].
Illustrative Examples revisited
Now that we have Dataspaces in our toolbelt, let's revisit the spreadsheet cell and cellular modem examples from above.
Spreadsheet cell with a dataspace
define entity Cell(dataspaceRef, name, formula):
    continuously, whenever value changes,
        assert CellValue(name, value) to dataspaceRef
    continuously, whenever formula changes,
        for each name n in formula,
            define entity k:
                on assertion of nValue,
                    value ← (re)evaluation based on formula, nValue, and other nValues
            assert Observe(⌜CellValue(n, ?nValue)⌝, k) to dataspaceRef
    on message conveying a new formula,
        formula ← newFormula
The cell is able to outsource all subscription management to the dataspaceRef it is given. Its behaviour function is looking much closer to an abstract prose specification of a spreadsheet cell.
Cellular modem server with a dataspace
There are many ways to implement RPC using dataspaces [note 2], each with different characteristics. This implementation uses anonymous service instances, implicit service names, asserted requests, and message-based responses:
define entity Server(dataspaceRef):
    define entity serviceRef:
        on assertion of commandString and replyEntity
            output commandString via modem serial port
            collect response(s) from modem serial port
            send response(s) as a message to replyEntity
    assert Observe(⌜Request(?commandString, ?replyEntity)⌝, serviceRef) to dataspaceRef

define entity Client(dataspaceRef):
    define entity k:
        on message containing responses,
            retract the Request assertion
            (and continue with other tasks)
    assert Request("AT+CMGS=...", k) to dataspaceRef
If the service crashes before replying, the client's request remains outstanding, and a service supervisor [Armstrong 2003, section 4.3.2] can reset the modem and start a fresh service instance. The client remains blissfully unaware that anything untoward happened.
We may also consider a variation where the client wishes to retract or modify its request in case of service crash. To do this, the client must pay more attention to the conversational frame of its interaction with the server. In the pseudocode above, no explicit service discovery step is used, but the client could reason about the server's lifecycle by observing the (disappearance of) presence of the server's subscription to requests: Observe(⌜Observe(⌜Request(⌞_⌟, ⌞_⌟)⌝, _)⌝, ...).
Object-capabilities for access control
Object capabilities are the only properly compositional way to secure a distributed system [note 5]. They are a natural fit for Actor-style systems, as demonstrated by E and its various descendants [Miller 2006, Van Cutsem et al 2007, Stiegler and Tie 2010, Yoo et al 2012 and others], so it makes sense that they would work well for the Syndicated Actor Model.
The main difference between SAM capabilities and those in E-style Actor models is that syndicated capabilities express pattern-matching-based restrictions on the assertions that may be directed toward a given entity, as well as the messages that may be sent its way.
Combined with the fact that subscription is expressed with assertions like any other, this yields a mechanism offering control over state replication and observation of replicated state as well as ordinary message-passing and RPC.
In the SAM, a capability is a triple of
- target actor reference,
- target entity reference within that actor, and
- an attenuation describing accepted assertions and messages.
An "attenuation" is a piece of syntax including patterns over semi-structured data. When an assertion or message is directed to the underlying entity by way of an attenuated capability, the asserted value or message body is checked against the patterns in the attenuation. Values not matching are discarded silently [note 6].
Restricting method calls. For example, a reference to the dataspace where our cellular modem server example is running could be attenuated to only allow assertions of the form Request("ATA", _). This would limit holders of the capability to causing the modem to answer an incoming call ("ATA").
Restricting subscriptions. As another example, a reference to the dataspace where our spreadsheet cells are running could be attenuated to only allow assertions of the form Observe(⌜CellValue("B13", _)⌝, _). This would limit holders of the capability to reading the contents (or presence) of cell B13.
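The filtering step performed by an attenuated capability can be sketched in Python. This is an illustration under stated assumptions, not the SAM's capability format: Capability, attenuate, Inbox, WILD and matches are hypothetical names, the target actor reference from the triple is omitted, and only assertions (not messages) are shown.

```python
from typing import Any, List, Optional, Sequence

WILD = object()  # hypothetical wildcard, standing in for "_"


def matches(pattern: Any, value: Any) -> bool:
    if pattern is WILD:
        return True
    if isinstance(pattern, tuple) and isinstance(value, tuple):
        return len(pattern) == len(value) and all(
            matches(p, v) for p, v in zip(pattern, value))
    return pattern == value


class Inbox:
    """Stand-in for the underlying entity; records what reaches it."""

    def __init__(self) -> None:
        self.received: List[Any] = []

    def assert_(self, value: Any) -> None:
        self.received.append(value)


class Capability:
    """A reference to a target plus an optional attenuation."""

    def __init__(self, target: Any,
                 patterns: Optional[Sequence[Any]] = None) -> None:
        self.target = target
        self.patterns = patterns  # None means unattenuated

    def attenuate(self, patterns: Sequence[Any]) -> "Capability":
        # The new capability wraps this one, so it can only narrow access.
        return Capability(self, list(patterns))

    def assert_(self, value: Any) -> None:
        if self.patterns is not None and not any(
                matches(p, value) for p in self.patterns):
            return  # rejected values are discarded silently
        self.target.assert_(value)
```

With answer_only = Capability(dataspace).attenuate([("Request", "ATA", WILD)]), a holder of answer_only can assert ("Request", "ATA", k) but any other command string is dropped before it reaches the dataspace. Because attenuate wraps the existing capability, further attenuation can only narrow what gets through.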
Conclusion
We have looked at the concepts involved in the Syndicated Actor Model (SAM), an Actor-like approach to concurrency that offers a form of concurrent object-oriented programming with intrinsic publish/subscribe support. The notion of a dataspace factors out common interaction patterns and decouples SAM components from one another in useful ways. Object capabilities are used in the SAM not only to restrict access to the behaviour offered by objects, but to restrict the kinds of subscriptions that can be established to the state published by SAM objects.
While we have examined some of the high level forms of interaction among entities residing in SAM actors, we have not explored techniques for effectively structuring the internals of such actors. For this, the SAM offers the concept of "facets", which relate directly to conversational contexts; for a discussion of these, see Garnock-Jones' 2017 dissertation, especially chapter 2, chapter 5, chapter 8 and section 11.1. A less formal discussion of facets can also be found on the Syndicate project website.
Bibliography
[Armstrong 2003] Armstrong, Joe. “Making Reliable Distributed Systems in the Presence of Software Errors.” PhD, Royal Institute of Technology, Stockholm, 2003.
[De Koster et al 2016] De Koster, Joeri, Tom Van Cutsem, and Wolfgang De Meuter. “43 Years of Actors: A Taxonomy of Actor Models and Their Key Properties.” In Proc. AGERE. Amsterdam, The Netherlands, 2016.
[Felleisen 1991] Felleisen, Matthias. “On the Expressive Power of Programming Languages.” Science of Computer Programming 17, no. 1–3 (1991): 35–75.
[Fischer et al 1985] Fischer, Michael J., Nancy A. Lynch, and Michael S. Paterson. “Impossibility of Distributed Consensus with One Faulty Process.” Journal of the ACM 32, no. 2 (April 1985): 374–382.
[Garnock-Jones 2017] Garnock-Jones, Tony. “Conversational Concurrency.” PhD, Northeastern University, 2017.
[Gelernter 1985] Gelernter, David. “Generative Communication in Linda.” ACM TOPLAS 7, no. 1 (January 2, 1985): 80–112.
[Gelernter and Carriero 1992] Gelernter, David, and Nicholas Carriero. “Coordination Languages and Their Significance.” Communications of the ACM 35, no. 2 (February 1, 1992): 97–107.
[Karp 2015] Karp, Alan H. “Access Control for IoT: A Position Paper.” In IEEE Workshop on Security and Privacy for IoT. Washington, DC, USA, 2015.
[Konieczny et al 2009] Konieczny, Eric, Ryan Ashcraft, David Cunningham, and Sandeep Maripuri. “Establishing Presence within the Service-Oriented Environment.” In IEEE Aerospace Conference. Big Sky, Montana, 2009.
[Miller 2006] Miller, Mark S. “Robust Composition: Towards a Unified Approach to Access Control and Concurrency Control.” PhD, Johns Hopkins University, 2006.
[Morris 1968] Morris, James Hiram, Jr. “Lambda-Calculus Models of Programming Languages.” PhD, Massachusetts Institute of Technology, 1968.
[Stiegler and Tie 2010] Stiegler, Marc, and Jing Tie. “Introduction to Waterken Programming.” Technical Report. Hewlett-Packard Labs, August 6, 2010.
[Van Cutsem et al 2007] Van Cutsem, Tom, Stijn Mostinckx, Elisa González Boix, Jessie Dedecker, and Wolfgang De Meuter. “AmbientTalk: Object-Oriented Event-Driven Programming in Mobile Ad Hoc Networks.” In Proc. XXVI Int. Conf. of the Chilean Soc. of Comp. Sci. (SCCC’07). Iquique, Chile, 2007.
[Yoo et al 2012] Yoo, Sunghwan, Charles Killian, Terence Kelly, Hyoun Kyu Cho, and Steven Plite. “Composable Reliability for Asynchronous Systems.” In Proc. USENIX Annual Technical Conference. Boston, Massachusetts, 2012.
Notes
1. The terminology used in the SAM connects to the names used in E [Miller 2006] as follows: our actors are E's vats; our entities are E's objects.
2. Many variations on RPC are discussed in section 8.7 of Garnock-Jones' 2017 dissertation.
3. Here the thorny question of the equivalence of entity references rears its head. Preserves specifies an equivalence over its Values that is generic in the equivalence over embedded values such as entity references. The ideal equivalence here would be observational equivalence [Morris 1968, Felleisen 1991]: two references are the same when they react indistinguishably to assertions and messages. However, this isn't something that can be practically implemented except in relatively limited circumstances. Fortunately, in most cases, pointer equivalence of entity references is good enough to work with, and that's what I've implemented to date (modulo details such as structural comparison of attenuations attached to a reference etc.).
5. Karp [2015] offers a good justification of this claim along with a worked example of object-capabilities in a personal-computing setting. The capabilities are ordinary E-style capabilities rather than SAM-style capabilities, but the conclusions hold.
6. You might be wondering "why silent discard of assertions rejected by an attenuation filter?", or more generally, "why discard assertions and messages silently on any kind of failure?" The answer is related to the famous Fischer/Lynch/Paterson (FLP) result [Fischer et al 1985], where one cannot distinguish between a failed process and a slow process. By extending the reasoning to a process that simply ignores some or all of its inputs, we see that offering any kind of response at the SAM level in case of failure or rejection would be a false comfort, because nothing would prevent successful delivery of a message to a recipient which then simply discards it. Instead, processes have to agree ahead of time on the conversational frame in which they will communicate. The SAM encourages a programming style where assertions are used to set up a conversational frame, and then other interactions happen in the context of the information carried in those assertions; see the section where we revisit the cellular modem server with the components decoupled and placed in a conversational frame by addition of a dataspace to the system. Finally, and with all this said, debugging-level notifications of rejected or discarded messages have their place: it's just the SAM itself that does not include feedback of this kind. Implementations are encouraged to offer such aids to debugging.