Negotiating protocols between actors or clients

Copy of an email I just sent to the SWICG mailing list: Negotiating protocols between actors or clients from a on 2024-11-02 (public-swicg@w3.org from November 2024)

Hello all,

I know that the CG meeting next week is likely to have its time filled with discussions of potential charters, but I’d like to bring up a potential CG meeting agenda item:

PROPOSAL: The SocialCG should convene a task force to document protocols built on top of ActivityPub, including how to negotiate protocols client-to-client or actor-to-actor.

The roles and responsibilities of this task force would include, to start with:

  1. Defining a mechanism to signal which protocols are supported, or in other words, which total set of behaviors will be carried out when an activity is received at an outbox or delivered to an inbox. Behaviors can be broken down into the following categories:

1.1. Server behaviors. Which actions can be carried out automatically by a server? How can an actor detect that a server will understand and carry out side effects? Example: When an Announce is received in an inbox, a server might automatically add it to the shares collection… or it might not.

1.2. Client behaviors. Which actions are intended for clients to understand and do something with them? How can an actor negotiate a session between two clients? Example: Alice is using an E2EE messenger (or chess application), and wants to establish an E2EE conversation (or chess game) specifically with a client attached to Bob’s actor or profile.

1.3. Actor behaviors. Which actions can the actor take or respond with, in response to a given activity? How can an actor be expected to act? Example: When Alice receives an activity referencing an object that has set inReplyTo or context or etc. referencing an object belonging to Alice, then Alice might want to Add that object to an appropriate collection. This behavior might also occur at the client level, if the client is configured to automatically manage those collections. This behavior might also occur at the server level, if the server is configured to automatically manage those collections.

  2. Documenting one or more common profiles for a total set of behaviors as described above.

2.1. Option: A protocol for the management and replication of resources across servers. Define behaviors for Create/Update/Delete/Add/Remove at the server level, client level, and actor level. Define where and how resources are stored, how long resources are stored or cached, etc.

2.2. Option: A protocol for publishing “posts” to a “profile”. Define what is a “post”, what is a “profile”, etc. (Some overlap with the Forum TF here, especially once you get to the level of defining “conversation” and “forum”.)

2.3. Option: A protocol for publishing “activities” to an “activity stream”. Define generic processing and shape that an activity must fit. Define how to identify these activities as simple notifications, in cases where side effects might not be needed or desired.

  3. Generating one or more reports for the above items.

I would love to know what other people think about this agenda item or general issue, and I welcome discussion of this issue both on the mailing list and in other venues.

if this is a response to ChuuniIRC’s multihoming goal? honestly, go for it. at the very least, we’d love to see how it plays out on fedi.

Sent this to list too myself, but adding here for posterity. FEP-9fde: Mechanism for servers to expose supported operations is probably relevant.

I don’t think it is, no. NodeInfo is not required and it also doesn’t apply to actors or clients.

This is mostly a response to a recurring refrain that a) ActivityPub does not sufficiently describe the protocols being used on the fediverse, and b) broad interoperability and alignment between such protocols is impossible as long as they remain formally undefined. You need a target for compatibility, and “whatever Mastodon does”, “whatever Lemmy does”, etc. is a constantly moving target.

The answer to both of these concerns is to recognize and document implicit protocols that are layered on top of ActivityPub, in the same way that ActivityPub S2S is layered on top of (or, more precisely, is a specialization of) Linked Data Notifications (LDN). LDN guarantees you that the thing arriving in your inbox is “a notification represented by linked data”; ActivityPub guarantees you that the thing arriving in your inbox is “an AS2 document that is specifically an Activity”; the protocols of the fediverse generally imply a certain shape to those activities and their objects. More importantly, they expect behaviors or side effects.

For example, the “Mastodon protocol” defines entities like “accounts” and “statuses”, which are not the same as “actors” and “objects”. A “status” created according to the “Mastodon protocol” is attached to an “account”, and deleting the “account” will delete the “status”. A “status” has certain required properties; an “account” has certain required properties; sending an activity to an inbox controlled by Mastodon will trigger certain processing behaviors. All this and more make up a “protocol” – a complete set of composable behaviors.

A simpler “activity stream” protocol might not do all of these things. It might only define publishing streams of activities, relying on the “Follow protocol” of base ActivityPub. A consumer who follows such an “activity stream” might be expected to only browse the stream of activities. Side effects triggered by certain activities could be layered on top of this by a separate protocol or by an application like IFTTT – for example, when the application sees that a certain actor performed an activity with a type of Listen and the object is an Audio object, then the Audio object might be added to an OrderedCollection representing a playlist of audio to check out later.
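The Listen-to-playlist side effect described above can be sketched as a small rule that runs separately from the stream itself. This is only an illustration: the function name `apply_listen_rule` and the example URLs are hypothetical, and the playlist is a plain in-memory dict shaped like an AS2 OrderedCollection rather than anything stored on a real server.

```python
from typing import Any


def apply_listen_rule(activity: dict[str, Any], playlist: dict[str, Any]) -> bool:
    """Queue the Audio object of a Listen activity in the playlist.

    Returns True only if the playlist was modified.
    """
    obj = activity.get("object")
    # Only react to Listen activities that embed their object.
    if activity.get("type") != "Listen" or not isinstance(obj, dict):
        return False
    # Only Audio objects with an id are eligible for the playlist.
    if obj.get("type") != "Audio" or "id" not in obj:
        return False
    items = playlist.setdefault("orderedItems", [])
    if obj["id"] in items:
        return False  # already queued, nothing to do
    items.append(obj["id"])
    playlist["totalItems"] = len(items)
    return True


# Hypothetical example data.
playlist = {"type": "OrderedCollection", "orderedItems": [], "totalItems": 0}
listen = {
    "type": "Listen",
    "actor": "https://example.org/alice",
    "object": {"type": "Audio", "id": "https://example.org/audio/1"},
}
apply_listen_rule(listen, playlist)
```

The point of keeping the rule outside the stream is that a consumer who only wants to browse activities never has to run it.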

A “resource management protocol” might be declared by e.g. a Solid pod, or some other personal data storage service. You could set up data repos on multiple providers, and then have each of those repos follow each other, so that any modification on one repo is mirrored to the other repos – for example, an HTTP PUT for a given Solid resource could be used to trigger an Update notification that makes its way over to the other pods.

The ultimate goal is that, when looking at any resource that has an ldp:inbox, you have “more than zero” knowledge about what might happen if you send something to that inbox. Less assuming, more knowing.

ah yeah…

we remember how fedi and activitypub were touted as having the ability to interoperate between video media (peertube), microblogging media (mastodon), pictures (pixelfed?), wikis, etc but that never materialized.

we understand the idea of protocol negotiation, but what is the expected outcome? like, not the details of protocol negotiation, but what would a client or server implementation be doing in order to improve interoperability? what would the DX and UX look like? or in other words, how are you expecting implementers and users to interact with this?

as a user i want to be able to make an informed decision on whether i should or should not send (or have my client/user-agent send) arbitrary activities to arbitrary actors. i want to be able to look at an actor and say things like:

  • “yep, i have a pretty good idea that if i send them a Florp Ping, then they’ll understand what that means.”
  • “eh, i shouldn’t bother sending this actor anything other than a Create Note where the Create has these properties and the Note has these properties, and i understand that this will be converted to a Status and i need to send them an explicit Delete Note later because they don’t understand HTTP caching headers…”
  • “so this actor is specifically a conversation manager, and in order to participate in any conversations i need to send it a Create where the object of that activity has at least context and content, otherwise it will get ignored entirely.”
  • “wait, this isn’t an activitypub actor at all, it’s just a resource with an inbox. it’s following some other protocol entirely and expecting LDN payloads of a different type or shape.”

worst case scenario, the information isn’t available, so we fall back to the current behavior: spray and pray. send them the whole firehose, let god sort it out.
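a rough sketch of that decision, assuming actors could declare their protocols somewhere. the property name in `PROTOCOLS_KEY` and the protocol identifier `FLORP` are made up for illustration; nothing like them is defined by ActivityPub or AS2 today, which is the gap the proposal is about.

```python
from typing import Any, Optional

# Hypothetical property; no such term exists in ActivityPub or AS2.
PROTOCOLS_KEY = "https://example.org/ns#protocols"


def declared_protocols(actor: dict[str, Any]) -> Optional[set[str]]:
    """Return the protocols an actor declares, or None if it declares nothing."""
    value = actor.get(PROTOCOLS_KEY)
    if value is None:
        return None
    if isinstance(value, str):
        return {value}
    return set(value)


def delivery_decision(actor: dict[str, Any], required_protocol: str) -> str:
    """Return 'send', 'skip', or 'unknown' for a given actor and protocol."""
    protocols = declared_protocols(actor)
    if protocols is None:
        # No information available: fall back to current behavior.
        return "unknown"
    return "send" if required_protocol in protocols else "skip"


FLORP = "https://example.org/protocols/florp"  # hypothetical protocol identifier
bob = {"id": "https://example.org/bob", PROTOCOLS_KEY: [FLORP]}
carol = {"id": "https://example.org/carol"}

print(delivery_decision(bob, FLORP))    # send
print(delivery_decision(carol, FLORP))  # unknown
```

the three-way result matters: "unknown" is not the same as "skip", so a sender can still choose to spray and pray when nothing is declared.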

Would this be like syncing resources and posts on multiple servers, similar to how Zot Protocol deals with cloning channels (i.e. your channel can reside on more than one server and is synced)?

This would be nice information to know.

It is good, since remote servers and other platforms can know what is allowed and not allowed. It also means that we would have to consider what other platforms are capable of when designing our platform. For example, if we know a platform won’t accept a comment in a particular situation, we can remove the comment box on our interface. That is not necessarily a bad thing.

so like how lemmy federates upvotes while mastodon doesn’t know what to do with them? (wait does lemmy have upvotes we’ve never used lemmy)

I like the overall idea (very much). However, I think it would be better to not explain the proposal using LDN.

Amy Guy describes AP as a specialization of LDN. That’s different than being layered on top of it. The only overlap I see is that they use the same JSON-LD term for inbox and that term represents an HTTP POST endpoint. Beyond that, I don’t see much similarity. LDN is based on RDF and Linked Data, AP is not. LDN supports other content types like Turtle and HTML. Dereferencing an AP inbox results in a data structure that’s not compatible with LDN (AP uses paging, as:orderedItems instead of ldp:contains to reference “notifications”, etc.). LDN specifies discovery techniques not supported by AP. In general, AP and LDN implementations are not interoperable.

https://dr.amy.gy/chapter5

kind of. i was thinking of a generic “personal data storage repo service” along the lines of Solid when writing that bit, so you could have an HTTP PUT in Solid translated into an AP Update. every pod follows every other pod, so the Update reaches all other pods and gets applied to all data stores equally.
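a minimal sketch of that translation step, assuming the pod already knows its own actor id and followers collection. the function name `update_for_put` and all URLs are hypothetical, and actual delivery to the other pods' inboxes is left out entirely.

```python
import uuid
from datetime import datetime, timezone
from typing import Any


def update_for_put(actor_id: str, resource_url: str, followers_url: str) -> dict[str, Any]:
    """Build an AS2 Update activity describing a resource changed via HTTP PUT."""
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "id": f"urn:uuid:{uuid.uuid4()}",
        "type": "Update",
        "actor": actor_id,
        "object": resource_url,
        # Addressed to the followers collection, i.e. the other pods.
        "to": [followers_url],
        "published": datetime.now(timezone.utc).isoformat(),
    }


# Hypothetical pod and resource URLs.
activity = update_for_put(
    actor_id="https://pod-a.example/actor",
    resource_url="https://pod-a.example/notes/1",
    followers_url="https://pod-a.example/actor/followers",
)
```

whether a receiving pod then fetches the resource and applies the change is exactly the kind of behavior a "resource management" profile would have to spell out.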

i can amend the word choice but i think that even as a “specialization” there’s still enough of an overlap for it to make sense to make the comparison that “AP inbox is (mostly) just LDN inbox but restricted to AS2 Activity payloads”. it’s less about dereferencing (HTTP GET) and more about what happens when you deliver (HTTP POST). i say “mostly” because there are side effects that AP prescribes on top of the activity being “just a simple notification”. (it’s those side effects i’d like to unpack and make composable.)

unrelatedly, thank you for the link to Standards for the Social Web because i have never seen it before. the content looks similar to that of Social Web Protocols though (also written by Amy Guy). there is a section in Social Web Protocols that describes how AP and LDN might interop re: the inbox.

There is no such thing as “protocol built on top of ActivityPub”.

If by “protocols” you mean “application features”, then it is true, ActivityPub spec doesn’t sufficiently describe them. It would be better if it didn’t describe them at all, but this is not a big problem.

This is not true. Various Fediverse applications are, in fact, interoperating.

No, that is not a moving target.


I think some kind of negotiation between servers might be useful, but if you are starting from false assumptions, I doubt that anything good could come out of that work.

why not? “no such thing” is a strong claim to make against the existence of such protocols.

i do not mean “application features”. i mean “set of behaviors”. the thing that makes a “protocol” is sufficiently defining those behaviors, such that two entities can communicate in a concisely bounded manner.

fediverse applications are interoperating, but only where their implicit protocols overlap. if i send mitra a Florp Ping, what will it do? what if i send it a Move Rook from origin:d8 to target:d5? what if i send it an EnvelopedMessage?

even within the scope of what currently exists, various interpretations of semantics abound. does a Delete leave behind a Tombstone or not? can a Delete be Undone? do i even need to send a Delete at all? does Deleting an actor also Delete objects that are attributedTo that actor? various implementations have made some decision or another regarding these questions and more, and the answers aren’t all the same.

this isn’t about negotiation between servers. this is about negotiation between clients and actors. not everything happens within the purview of the server, and not everything should happen within the purview of the server.

i don’t think the assumptions are false. in fact, i think we agree at least somewhat that

but the fact is that if you want to have meaningful communication between two entities, it’s not enough to pass messages back and forth. you need to define the semantics for those messages. activitypub currently only guarantees one thing: “it is an activity”. it says nothing of resource lifecycles, nothing of posts/statuses and profiles/accounts, nothing of even streams of activities. you could have different implementations of “activitypub” be mutually unintelligible, and none of them would even be wrong necessarily. i described three potential “profiles” above that only somewhat overlap. compare the interpretation of a Create under these profiles:

  • AP core: “surprisingly few side effects”, show it in the inbox like any other activity
  • “activity stream” profile: publish the Create activity within the stream of activities
  • “resource management” profile: generate a copy of the object in the local storage/data repo
  • “social posts” profile: check to see that the actor can be converted to an “account” or “profile” first. if it’s not missing any of the required properties (which?) then check to see if the object can be converted to a “status” or “post”. if it’s not missing any of the required properties (which?) then create the “post”/“status” and attach it to the “profile”/“account”. (notice that i haven’t even gotten to how to define or convert to/from any of these entities.)
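the four interpretations above can be sketched as one function that branches on the profile. everything here is an assumption made for illustration: the profile names, the `state` dict, and especially `ASSUMED_POST_PROPERTIES`, since which properties are actually required is the open question. the check on whether the actor converts to an "account" is skipped to keep the sketch short.

```python
from typing import Any

# Assumed required properties for a "post"; the real set is undefined ("which?").
ASSUMED_POST_PROPERTIES = {"id", "attributedTo", "content"}


def empty_state() -> dict[str, list]:
    """In-memory stand-ins for an inbox, a stream, a data store, and a post list."""
    return {"inbox": [], "stream": [], "store": [], "posts": []}


def handle_create(profile: str, activity: dict[str, Any], state: dict[str, list]) -> str:
    """Apply an incoming Create under one hypothetical profile and describe the effect."""
    obj = activity.get("object")
    if profile == "core":
        state["inbox"].append(activity)
        return "shown in inbox"
    if profile == "activity-stream":
        state["stream"].append(activity)
        return "published in stream"
    if profile == "resource-management":
        state["store"].append(obj)
        return "object copied to local store"
    if profile == "social-posts":
        # Refuse objects that cannot be converted under the assumed requirements.
        if not isinstance(obj, dict) or not ASSUMED_POST_PROPERTIES.issubset(obj):
            return "ignored: object cannot be converted to a post"
        state["posts"].append(
            {"url": obj["id"], "author": obj["attributedTo"], "text": obj["content"]}
        )
        return "post created"
    raise ValueError(f"unknown profile: {profile}")


create = {
    "type": "Create",
    "actor": "https://example.org/alice",
    "object": {
        "id": "https://example.org/notes/1",
        "type": "Note",
        "attributedTo": "https://example.org/alice",
        "content": "hello",
    },
}
for name in ("core", "activity-stream", "resource-management", "social-posts"):
    print(name, "->", handle_create(name, create, empty_state()))
```

the same Create produces four different outcomes, and the sender currently has no way to tell which one it triggered.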

if i as a user intend a Create activity to be just a notification, and i do not want, wish for, or consent to the replication of my resource, then how do i have any way of even beginning to approach my choice of addressees or delivery targets? by AP core, the baseline behavior of S2S Create is just “show it in the inbox”. if i had foreknowledge of the protocols that certain actors were adhering to, then i can make an informed decision when choosing which actors to address. if i send my Create to Alice who is using a generic ActivityPub server following AP core and a generic ActivityPub client that just shows Alice’s inbox as a collection of activities similar to an email inbox, then it’s fine. if i send my Create to an application following the resource replication protocol, then i should be aware that i’ve just potentially replicated my object to some datastore. if i send my Create to an actor being managed by a “social posts” instance (monolithic client+server), then it would be helpful to know that i might have just led to the mirroring of a shadow “profile” based on whatever got converted out of my actor and my object.

the behavior to unpack here is “what happens when this inbox receives a Create”. there are potential behaviors at the server, client, and user level. similar behaviors can be unpacked for other activities. common patterns and connections can be drawn. profiles can be written to describe bundles of such behaviors. and none of this belongs in core ActivityPub. it belongs in protocols built on top of ActivityPub… which should be clearly described, documented, and signaled.

Having some protocol to sync entities would be nice. Hubzilla has to use Zot Protocol for that since it does not exist in ActivityPub. Although I am sure we could come up with some way to do it in ActivityPub. It would just be a matter of getting people to agree on how to do it.

But Hubzilla also has a lot of unique features, like web pages, wikis, articles, photo album, and cloud storage. And no one else seems to have those features. So there has been little push to try to sync stuff over ActivityPub since we can do it in Zot and every platform that we sync with uses Zot.

If such a protocol exists, there should be a specification, but I haven’t seen any. There is a related claim that “no one implements ActivityPub”, which is disproven by the fact that all popular Fediverse applications interoperate.

Adding a new header to HTTP request doesn’t create a new protocol. Similarly, attaching a behavior to a specific property or a type doesn’t create a new protocol.

You can create a “protocol on top of ActivityPub”, but I think that would be counterproductive and harmful.

These sets of behaviors are just implementation details. Applications support different features, so activities and objects they produce are also different. Information about that usually can be found in software documentation (plus, a special kind of document has been invented for this purpose - FEDERATION.md).

Implementation details are not protocols.

It will ignore them.

I am using a platform that uses multiple protocols (ActivityPub, Zot, OpenWebAuth, WebFinger, etc.), so from that perspective, I wonder if adding something like this to ActivityPub is really needed. We could just implement SOLID as another addon, for example.

To add something like this to ActivityPub, I think you would need at least two independent platforms that plan to support it. One, at the bare minimum, if they want to use it to talk to themselves.

For example, nomadic identity is finally getting added to ActivityPub because Mitra and a new platform called Forte decided to implement it for their own use. Hopefully it catches on, but at least they can use it among themselves.

It might be good to add personal data storage repo service directly to ActivityPub, but you would have to weigh the pros and cons of adding it to ActivityPub versus just implementing something like SOLID directly as an additional protocol.

That’s illogical. Just because implementations interoperate doesn’t mean they are conformant to ActivityPub or implement ActivityPub at all. In other words, Fediverse applications interoperated even before ActivityPub was created.

ActivityPub, as specified, is insufficient for effective interop. Try handing a new developer only the AP, AS2, LDN, and JSON-LD specifications and asking them to implement an interoperable Fediverse server (interoperable with Mastodon, for example). That approach is very unlikely to be successful. I believe that popular Fediverse applications interoperate because they’ve copied Mastodon’s design and implementation decisions (or buggy/quirky behavior, in a few cases).

The “implementation detail” argument doesn’t make sense to me either (although it’s true that protocols are not just implementation details). If servers interoperate, they do because they have implemented a common, implicit protocol that @trwnh describes. Sure, there are many ways to implement those implicit protocols, but if the “implementation details” are inconsistent with the protocol, then interop is likely to fail.

The “protocol” is the common abstract behavior that allows interop. I think that equating a protocol with the implementation of the protocol is mixing levels of abstraction in an unproductive (and arguably, incorrect) way.

One might view @trwnh ’s proposal as a more formal and structured version of Federation.md. That FEP only requires Federation.md to be a Markdown file (an empty file would be conformant) and be placed in the project root. The resulting ad hoc documents are better than nothing, but I haven’t found them to be especially useful for interop.

in general, you could use Create/Update/Delete to manipulate resources, and Add/Remove to manipulate collections of resources. but it requires both sides of the communication to have shared understanding of what Create/Update/Delete/Add/Remove are supposed to mean, semantically. there is a difference between the “activity as notification” and “activity as command/procedure”. the former can be displayed as-is in a browser or viewer of some sort, with no additional considerations. the latter has side effects, so you need to agree on the side effects or else you will have misunderstandings like one side intending a Delete as a “soft delete” that they can Undo later, but the other side interprets it as a “hard delete” and irreversibly drops all data. similar considerations for how, when, or if deletes should cascade – does deleting a collection delete objects in that collection? does deleting an actor delete objects attributed to that actor? discrepancies can have wide-reaching consequences.
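the soft-versus-hard mismatch can be shown with two toy receivers. this is only a sketch: the `store` and `trash` dicts and the function names are hypothetical, though `Tombstone`, `formerType`, and `deleted` are real AS2 vocabulary terms.

```python
from datetime import datetime, timezone
from typing import Any

Store = dict[str, dict[str, Any]]


def soft_delete(store: Store, trash: Store, object_id: str) -> None:
    """Replace the object with a Tombstone but keep the original for a later Undo."""
    original = store[object_id]
    trash[object_id] = original
    store[object_id] = {
        "id": object_id,
        "type": "Tombstone",
        "formerType": original.get("type"),
        "deleted": datetime.now(timezone.utc).isoformat(),
    }


def hard_delete(store: Store, trash: Store, object_id: str) -> None:
    """Drop the object irreversibly; nothing is kept for an Undo."""
    store.pop(object_id, None)
    trash.pop(object_id, None)


def undo_delete(store: Store, trash: Store, object_id: str) -> bool:
    """Restore a soft-deleted object. Returns False if nothing can be restored."""
    if object_id not in trash:
        return False
    store[object_id] = trash.pop(object_id)
    return True


NOTE_ID = "https://example.org/notes/1"  # hypothetical object id

soft_store: Store = {NOTE_ID: {"id": NOTE_ID, "type": "Note", "content": "hi"}}
soft_trash: Store = {}
soft_delete(soft_store, soft_trash, NOTE_ID)
soft_restored = undo_delete(soft_store, soft_trash, NOTE_ID)

hard_store: Store = {NOTE_ID: {"id": NOTE_ID, "type": "Note", "content": "hi"}}
hard_trash: Store = {}
hard_delete(hard_store, hard_trash, NOTE_ID)
hard_restored = undo_delete(hard_store, hard_trash, NOTE_ID)
```

both receivers accepted the same Delete, but only one of them can honor the Undo that the sender assumed was possible.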

not really what i meant, so i’ll try to give a better example:

right now in Solid, you use HTTP verbs to manipulate your resources in your data pod. so an HTTP POST will create something, an HTTP PUT/PATCH will update it, and an HTTP DELETE will delete it. but these actions only take effect on the local Solid pod, and pods aren’t replicated. there is some work on a “Solid Notifications Protocol” which allows entities to “communicate about changes affecting a resource.” this protocol allows for subscribing to “channels” commonly powered by WebSockets. but the core notification payload is often represented using AS2 Vocab:

{
  "@context": [
    "https://www.w3.org/ns/activitystreams",
    "https://www.w3.org/ns/solid/notifications-context/v1"
  ],
  "id": "urn:uuid:fc8b5af4-bd7e-4fd1-a649-afcbd0e1c083",
  "type": "Update",
  "object": "https://example.org/guinan/profile",
  "state": "128f-MtYev",
  "published": "2021-08-05T01:01:49.550Z"
}