Layered ActivityPub

After several attempts at working with ActivityPub, I think it needs more layers, and I’d love some discussion. For now I’ve nicknamed the idea Layered ActivityPub, or LAP. It’s not meant to replace ActivityPub but rather to act as its backend; traditional AP interfaces can be implemented on top of it.

This is by no means an exact spec but rather a concept.

Note: I’m marking layers with an “a” suffix so we can refer back to them if we later decide to add or remove some.

Layer 7.1a, Activity Transport Layer

Basic Terms:

  • Peer, any involved group.
  • Server, the peer receiving a request.
  • Client, the peer signing and sending it.

This layer does not care about profiles, domains, accounts, anything. Its only concepts are Messages, which are opaque binary blobs plus some basic headers.

How it works is simple: you sign, using RFC 9421, all your requests. A request’s body, content type, and some additional headers, are a Message. To create a message, you POST it, including your public key (which is part of the message, and thus signed as well) in a header. The server will reply linking to where it stored the message.
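As a rough illustration of what would actually get signed, the sketch below builds an RFC 9421 signature base for a POSTed Message. The `lap-public-key` header name, the list of covered components and the key ID are assumptions made up for this example; the proposal does not fix any of them. The signing step itself (for example Ed25519) is left to a crypto library.

```python
import base64
import hashlib


def content_digest(body: bytes) -> str:
    """Return an RFC 9530 Content-Digest value (SHA-256) for the body."""
    digest = hashlib.sha256(body).digest()
    return "sha-256=:" + base64.b64encode(digest).decode("ascii") + ":"


def signature_base(method: str, headers: dict, covered: list, params: str) -> str:
    """Build an RFC 9421 signature base: one line per covered component,
    followed by the @signature-params line."""
    lines = []
    for name in covered:
        value = method.upper() if name == "@method" else headers[name]
        lines.append(f'"{name}": {value}')
    component_list = " ".join(f'"{name}"' for name in covered)
    lines.append(f'"@signature-params": ({component_list}){params}')
    return "\n".join(lines)


# Hypothetical Message: an opaque body plus a few headers, including the key.
body = b'{"hello": "world"}'
headers = {
    "content-type": "application/json",
    "content-digest": content_digest(body),
    # Assumed header name; the proposal only says the key travels "in a header".
    "lap-public-key": "did:key:z6MkhaXgBZDvotDkL5257faiztiGiC2QtKLGpbnnEGta2doK",
}
covered = ["@method", "content-type", "content-digest", "lap-public-key"]
base = signature_base("POST", headers, covered, ';created=1700000000;keyid="client-key"')
print(base)  # these are the bytes the client's private key would sign
```

Because the key header is one of the covered components, the public key is signed along with the rest of the Message, as described above.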

To read a Message, one must send a signed GET request, which the recipient is free to refuse based on the key’s permissions (to the best of its knowledge).

It is also possible to request a server to DELETE a Message using the same contract.

Now, here’s the catch: these GETs and DELETEs are also messages. The server could store them, but it is not at all required to.

Keys can additionally request that a server DELETE the keys themselves, thus removing all messages signed only by them, if any. For messages signed by multiple clients, a server is expected to DELETE on an && basis, that is, only once every signing key has asked for deletion.
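A minimal sketch of that deletion rule, assuming "&&" means a Message disappears only when all of its signers have asked to be deleted. The `Store` class and its method names are hypothetical and only model the bookkeeping, not the HTTP exchange.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    """In-memory stand-in for a server's Message storage (hypothetical)."""
    messages: dict = field(default_factory=dict)      # message id -> set of signer keys
    deleted_keys: set = field(default_factory=set)    # keys that asked to be deleted

    def post(self, message_id: str, signers: set) -> None:
        self.messages[message_id] = set(signers)

    def delete_key(self, key: str) -> list:
        """Record that `key` asked to be deleted, then drop every Message
        whose signers have all asked for deletion (the assumed && rule)."""
        self.deleted_keys.add(key)
        removed = [mid for mid, signers in self.messages.items()
                   if signers <= self.deleted_keys]
        for mid in removed:
            del self.messages[mid]
        return removed


store = Store()
store.post("m1", {"alice-key"})
store.post("m2", {"alice-key", "bob-key"})
print(store.delete_key("alice-key"))  # only m1 goes; m2 is still co-signed by bob-key
print(store.delete_key("bob-key"))    # now m2 goes as well
```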

Discovery

Via WebFinger.

Open questions

  • How to globally reference a message in URI form?
  • Should a DELETE reply with the deleted message, with the location where the DELETE message itself was stored (if any), or with something else?
  • WebFinger discovery specifics

Layer 7.2a, Activity Access Layer

At this layer, we use the previous one to define access to Messages. Here we define the concept of an Actor, that is, an arbitrary URI of the scheme web+lap:aupro:resource, where aupro is an Authentication Protocol (Aupro).

An example Actor can be a Key: web+lap:client:did:key:z6MkhaXgBZDvotDkL5257faiztiGiC2QtKLGpbnnEGta2doK.
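To make the URI shape concrete, here is a small sketch that splits an Actor URI into its Aupro and resource parts. It assumes the Aupro name never contains a colon, so everything after the second colon belongs to the resource; the `Actor` tuple and `parse_actor` function are hypothetical names.

```python
from typing import NamedTuple

PREFIX = "web+lap:"


class Actor(NamedTuple):
    aupro: str     # e.g. "client" or "group"
    resource: str  # opaque to this layer; interpreted by the Aupro


def parse_actor(uri: str) -> Actor:
    """Split web+lap:<aupro>:<resource> into its two parts."""
    if not uri.startswith(PREFIX):
        raise ValueError(f"not a web+lap URI: {uri!r}")
    aupro, sep, resource = uri[len(PREFIX):].partition(":")
    if not sep or not aupro or not resource:
        raise ValueError(f"expected web+lap:<aupro>:<resource>, got {uri!r}")
    return Actor(aupro, resource)


actor = parse_actor(
    "web+lap:client:did:key:z6MkhaXgBZDvotDkL5257faiztiGiC2QtKLGpbnnEGta2doK"
)
print(actor.aupro)     # client
print(actor.resource)  # did:key:z6Mkha...
```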

Servers implementing AAL MUST advertise their supported Aupros. Aupros specify who an Actor is and how they authenticate.

Message owners may POST messages giving other Actors access and permissions to their Messages. Servers are required to maintain a log of such messages, perhaps requiring a client to combine all its permission-changing messages into a single signed one to save space. Permission messages are required to be at least as public as the effects of their changes.

An additional Aupro can be one of groups, allowing an Actor to give multiple other Actors permissions at once instead of listing them individually.

These groups will also be reusable by other protocols that further distribute messages, say, on explicit mentions, follows, etc.
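The permission log and the group Aupro could fit together roughly as sketched below. This is only an illustration under assumptions: the `AccessLog` class, the permission names and the group URIs are made up, and revocation is left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class AccessLog:
    """Append-only log of permission grants plus group membership (hypothetical)."""
    grants: list = field(default_factory=list)   # (message id, grantee actor, permission)
    groups: dict = field(default_factory=dict)   # group actor -> set of member actors

    def grant(self, message_id: str, actor: str, permission: str) -> None:
        self.grants.append((message_id, actor, permission))

    def members(self, actor: str, seen=None) -> set:
        """Expand an actor into itself plus, for groups, all nested members."""
        seen = set() if seen is None else seen
        if actor in seen:            # guards against groups that contain each other
            return seen
        seen.add(actor)
        for member in self.groups.get(actor, ()):
            self.members(member, seen)
        return seen

    def allowed(self, message_id: str, actor: str, permission: str) -> bool:
        return any(
            mid == message_id and perm == permission and actor in self.members(grantee)
            for mid, grantee, perm in self.grants
        )


log = AccessLog()
log.groups["web+lap:group:friends"] = {"web+lap:client:alice", "web+lap:client:bob"}
log.grant("m1", "web+lap:group:friends", "read")   # one grant covers every member
print(log.allowed("m1", "web+lap:client:bob", "read"))
print(log.allowed("m1", "web+lap:client:carol", "read"))
```

Granting to the group once, rather than to each member, is what lets later layers reuse the same group for followers or mentions.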

Open Questions

  • How to advertise AAL Aupro support?
  • Can groups create/delete/manage permissions collectively instead of delegating this right to each individual member actor? See also FEP 541d (or whatever it’s called): Federated Democracy

I hope the picture I’ve laid out is clear enough:

  • Layer 7.1a only cares about transporting Messages from Clients (Keys) to Servers, and does not care about their contents save for the header in which the public key is transferred.
  • Layer 7.2a defines how Servers should manage permissions, Actors (the entities given said permissions), and Aupros, which define Actors.
    • The built-in client Aupro defines Keys to be Actors.
    • The also-built-in group Aupro defines Groups (perhaps “collections”, “circles”? Idk) that delegate their permissions to all member Actors, and uses the same permission system to manage group members.
    • A potential Federated Democracy Aupro could define a complement to Groups where, to exercise permissions, all member Actors, a certain percentage of them, etc. must agree.
  • Layer 7.3a will define Profiles, which will take part in negotiating how to send Messages from Profile A to Profile B, where profiles are assigned 0..1:1 to Actors.
  • Layer 7.4a will define Follows, Followed, Blocks groups etc., and there we have the core of ActivityPub implemented.

Would love to hear your thoughts. Sorry this is so disorganized.

Cc @aschrijver @silverpill because I know you and to start a discussion.

What is the problem that you are trying to solve? I mean, we already have FEPs for object proofs (which seem to be the goal of your activity layer) and I believe there has been some extensive discussion about DIDs being a valid alternative method for dereferencing URIs.

Unless I am missing some specific issue that cannot be addressed with the FEPs in development, wouldn’t it be better to put weight behind them first?

Several.

  1. ActivityPub has an identity crisis. It is not sure whether it is a Transport- or an Application-layer protocol. This is not inherently problematic, but it is annoying.
  2. Hostility to existing Internet standards, e.g. Content Negotiation. JSON-LD is used with the justification that it allows infinite extensibility; however, this is true of HTTP as well, and unlike JSON-LD, it’s possible to add headers that are not included in the RFC 9421 signature.
  3. Lack of canonical representation. JSON-LD inherently allows for multiple equivalent representations of the same object. This, in turn, makes implementing ActivityPub much harder by nature compared to the same protocol if it used JSON with namespaces, without the JSON-LD.
  4. Multiple sources of truth. In ActivityPub, you have two distinct identifiers: your RFC 9421 key, and your acct-URI, which complicates implementation and opens the door to man-in-the-middle and DNS spoofing attacks[1].
    • In the case of a discrepancy, whom to believe? The account owner, or the DNS entry owner? The ActivityPub specification does not answer this question, but in the wild, it’s DNS; this means that ICANN has indirect, yet full and complete control over the Fediverse[2], and that the Fediverse cannot adapt to changes in names’ meaning.
    • TL;DR: DNS should be used as a part of handshake processes as it was meant to, not stored in databases, ever.
  5. Actors are not opaque, and instead are transparently attributed to domain names, forcing servers to sanitize content or limit the functionality they offer, to prevent their users from pretending they represent the server itself.
  6. Conflation of content authentication (LAP Key, AP Actor), authorization (LAP & AP Actor), and authorship (LAP Profile, AP Actor). In other words: what device did this come from[3], are they allowed to do it, and who wrote this? These are not the same questions, but ActivityPub treats them as if they are.
  7. Complete lack of a federated permission protocol.
    • If example.social sent me a “private-but-not-really” Message, and I don’t know what that means, I, a layer 7.123a developer, am now responsible for guaranteeing their data is still safe, instead of this being the responsibility of the layer 7.2a developer, or even better, if I never stored it in the first place and instead 3XX’d my clients to the foreign server, which ActivityPub cannot do[4].
    • LAP very easily allows having multiple profiles per actor, giving multiple people access to the same profile, multiple devices/keys/sign in methods (as you can add your own Aupro that just asks firebase or something[5]), at the protocol level. You can also implement OpenID on top of it very easily. This is something that AP simply does not have, architecturally.
  8. Speaking of the client, it doesn’t even know ActivityPub! ActivityPub C2S de facto does not exist, as it is simply the server, repackaged, and JSON-LD does not have the capabilities needed. A REST-over-LAP API will, however, and it is the same API used by foreign servers, as LAP inherently does not have the concept of “my” instance.
  9. Single identity, multiple services. Just use different clients with the same keys, or different keys, but assign your profile to a group actor containing both. You could even make your own device management thingy.
  10. TL;DR: FEPs are the USB-4 of the Fediverse.

Yes and no. FEPs are duct tape: pile up enough of them and you’ll get roughly what I am proposing, except without proper layer separation due to legacy concerns. In other words, this proposal can be said to be a method by which FEPs can be implemented.


  1. If we relied on DNS by design like email servers, that’s “fine”. If we only relied on the key, same logic. However, relying on both opens the door to tricking a client/server into signing messages it otherwise would not. ↩︎

  2. What if a bad party acquires control of the ICANN root server, say an authoritarian regime rises and seizes control of the organization? This means the internet is now splintered into multiple root DNS servers, and the Fediverse will be most impacted: even Mastodon is more vulnerable to this than e-mail servers, and it is very simple compared to, say, PeerTube. Generally, DNS is a mess. Useful for figuring out what stuff humans refer to, but it’s just not designed for everything else. ↩︎

  3. Layer 7.1a does require 3rd parties to forward Messages as-is instead of signing them themselves, but it still “came from” the original device/authentication server/etc. ↩︎

  4. In my proposal, you only need to save layer 7.2a permission and membership changes, and layer 7.3a negotiation changes, and even for that you don’t really need the signatures. For everything else, if you weren’t directly notified by a client, you can simply redirect to another server; HTTP headers can be added to ask the target server to actually save, too. With minimal configuration, you can save everything, only save specific accounts, whatever you want. ↩︎

  5. Note that every piece of software you use will have to support it too, but a generic OAuth Aupro is relatively simple to design. ↩︎

Is it possible what @Laxystem is looking for is a more elaborate protocol suite that integrates AP? Could this proposal to include AP in the Solid specs achieve some of the same goals?

Yes, I think a better separation of the two would be a good idea & would make it easier to build modular applications on top of it. I’m not sure we need a major redefinition of the protocol to achieve this, though; we just need to identify which pieces of functionality belong to which “layer”, and then make incremental improvements to separate them (such as FEP-0499).

I will look at the specifics of the layer model later, but want to mention a deeper problem related to the subject first …

ActivityPub needs a consistent domain model

Discussions like these, in various shapes and sizes, have been going on for years. They always find traction that is just so-so, and part of the reason is that everyone has their own understanding of the terminology and language being used. When it comes to consistent language and abstractions, the AP world represents a jungle. 🌴

I also agree on the need to be able to distinguish between different layers. As I argued before on this forum, in the thread on the 3-stage bottom-up standardization process: we need one consistent ubiquitous language when we discuss protocol capabilities and requirements. Otherwise everyone leaves any discussion with their own idea of what was just discussed.

Ubiquitous Language (UL) is the terminology that all stakeholders agree to use consistently when referring to concepts of a particular domain so that there is a common understanding of the domain.

Or in a picture that looks like this:

All stakeholders, code, specs and other resources use the same terminology
(source: infoq: ddd context mapping)

In the UL we should answer “What layers does the protocol have?”, “What is an Extension?”, “What is a Vocabulary?”, “Are these properly named, or can we improve for better common understanding?”, “What is the complete set of terminology we use, how do things fit together?”.


Update: I posted a toot mentioning an additional challenge, where resistance to change should be overcome and where app focus prevails over ecosystem health concerns.

I have two responses to that, I think.

  1. While I agree on the theoretical level, wouldn’t multiple FEPs over one single unified specification make ActivityPub harder to implement? In other words, I’d argue we first need to settle on a v2, then introduce FEPs to gradually convert v1 into v2.
  2. I think most if not all of what I detailed above here can be implemented via FEPs, and I’ll make sure to build my demos so that they’d also accept a ‘dumb’ ActivityPub-only protocol as if it understood these concepts. But, I’m not sure a permission system can (or should) be implemented in an FEP as it might be better to break compat to prevent existing systems from misunderstanding it.

Though I must add that I generally have an ideological opposition to not separating the actual content, the protocol-level effects (e.g. pings) and the target of the message as FEP 0499 does. It makes forwarding the message much harder, forcing it to be parsed; you can’t make a dumb, level 2-style ActivityPub relay server à la Bluesky relays. If we draw layers on top of the existing functionality, this might be a problem.

ActivityPub is extremely flexible and versatile. I have often called it more of a protocol framework than a protocol in its own right. There’s no spec that you implement and gain universal interop with the “distributed service bus and social fabric” or anything like that.

From that perspective the FEP Process is extremely valuable, as it gives anyone a platform where their particular protocol twist or flavour is presented to a larger dev community to gain popularity and adoption. Best practices thus bubble up from the grassroots decentralized ecosystem that way. It follows a kind of fitness formula for natural selection, based on practice in the field.

At the same time it also causes a Big Ball of Mud anti-pattern, in a way, though that applies more to what we see when looking at what crosses “over the wire” than to the FEP itself, which is just an unorganised list of that.

A minimum level of clarity on how abstractions and technical mechanisms conceptually fit together is still required, and missing. For the FEP process itself, an improvement might be a further classification that distinguishes core protocol capabilities from optional ones and/or extensions in the Application layer (defining wire formats and msg exchange patterns for particular domains, e.g. “Podcasting”).

Update: TL;DR: the FEP is great, but it lacks a backbone domain model against which to position its constituent parts in a consistent manner. FEP docs thus remain loose pieces in a puzzle, and it is unclear how they fit into the whole (except to established expert AP devs).

Arguably, that’s what HTTP is supposed to be.

It seems to differ significantly from traditional ActivityPub. Does it use the same message format? For example, what would a Follow-Accept flow look like in Layered ActivityPub?

I believe most of those problems can be solved without creating a new protocol.

It is nearly impossible to bootstrap a new network. On the other hand, ActivityPub already has users, has many desirable properties (works at scale, resists centralization, cheap to run, easy to implement, …). It is also very flexible: you can go very far without losing backward compatibility. For example, I figured out how to implement nomadic identity without creating a new incompatible protocol.

Yes, the FEP is a marketplace of ideas, and the work on v2 requires central planning, but I think it must be an incremental process, where ideas are implemented, refined, and the good ones are selected for inclusion in v2. That is what I plan to do.

An alternative formulation of the core idea behind FEP-0499 is to use raw HTTP messages, i.e. to specify the delivery inboxes as URL values of an HTTP header. But at that point, the endpoint becomes generic and can be used to amplify an HTTP POST to arbitrary HTTP resources/endpoints. Would that be better? Maybe, if you wanted to work with HTTP messages instead of ActivityPub activities. I’d be open to hearing your thoughts on this alternative take. See also: FEP-0499: Delivering to multiple inboxes with a multibox endpoint

Arguably, HTTP is a message protocol and not a framework. It is Layer 7, the application layer. If you want to virtualize another protocol on top of HTTP, you can do that, but you don’t need to – you can just use HTTP as-is, assuming you’re fine with the request-response model of messaging.

No, but I’m not a creative namer. It was that, or a random LoTR character.

Yes.

Well, the concept of a Profile I briefly mentioned above would be essentially a combination of an Actor + “to reach this profile, send to Servers X, Y, Z.”.

  • To set up a Profile, the actor will send a Message detailing how to reach it (e.g. going to example.social).
  • Then, it’d tell example.social it is interested in everything sent by another actor.
  • example.social will simply do nothing if it doesn’t know this actor; if it does, it will fetch it and forward that message to the followed actor (you could just use ActivityPub’s username system as a handshake for following people).
  • Then the followed actor could add this actor to the Group it uses to send Messages to its followers, or add it to the Group it uses to block people (which is backed by the permission system), or send a “please do not let this actor follow me” message.
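The steps above can be sketched very roughly as follows. Everything here is an assumption for illustration: the `Profile` fields, the `route_follow` and `handle_follow` names, and the idea that the followed side answers with a simple yes or no. Nothing of the sort has been specified yet.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Profile:
    """An Actor plus the Servers through which it can be reached (hypothetical)."""
    actor: str
    servers: list
    followers: set = field(default_factory=set)   # Group used to send to followers
    blocked: set = field(default_factory=set)     # Group used to block people

    def handle_follow(self, follower: str, accept: bool) -> str:
        """Followed side: add the follower to one of the two Groups."""
        if follower in self.blocked:
            return "ignored"
        if accept:
            self.followers.add(follower)
            return "added"
        self.blocked.add(follower)
        return "blocked"


def route_follow(known: dict, target: str) -> Optional[list]:
    """Server side: do nothing for unknown actors, otherwise return the
    Servers the follow request should be forwarded to."""
    profile = known.get(target)
    return None if profile is None else profile.servers


bob = Profile("web+lap:client:bob", ["https://example.social"])
known = {bob.actor: bob}
print(route_follow(known, "web+lap:client:nobody"))    # unknown actor: nothing to do
print(route_follow(known, bob.actor))                  # forward to bob's Servers
print(bob.handle_follow("web+lap:client:alice", accept=True))
```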

Oh, I agree. I’mma try making this implementable on an AP base.

@silverpill vis a vis content types
Ok, I gave this thought.

There are essentially two things I’m trying to do here:

  • regular LAP-LAP interactions
  • LAP-AP compat

We don’t like JSON-LD because we want to be able to reconstruct a Message from, say, an SQL database and still have the signature be valid - if we’re using JSON-LD, that’ll never work.
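A small illustration of the underlying problem, using plain JSON only: a signature covers bytes, so rebuilding the same object from stored fields can already produce different bytes. JSON-LD adds further equivalent forms on top of this (compacted, expanded, different contexts). The `canonical` helper below is an assumed fixed serialization for the example, not a standard canonicalization scheme.

```python
import hashlib
import json

# Bytes exactly as they were received and signed.
original = b'{"type": "Note", "content": "hi"}'

# Pretend the fields were stored in separate database columns and later
# reassembled in a different order.
stored = json.loads(original)
rebuilt = json.dumps({"content": stored["content"], "type": stored["type"]}).encode()

same_data = json.loads(rebuilt) == stored
same_bytes = hashlib.sha256(rebuilt).digest() == hashlib.sha256(original).digest()
print(same_data, same_bytes)  # same object, but the signed bytes no longer match


def canonical(obj) -> bytes:
    """Assumed fixed serialization: sorted keys, no optional whitespace."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode()


# With one agreed serialization, any copy of the data yields the same bytes.
print(canonical(stored) == canonical(json.loads(rebuilt)))
```

Either the original bytes are stored as an opaque blob, or every party agrees on a single serialization to sign over.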

With that being said, the reasoning for this is to support structures not relying on an always-online server as a primary source of truth, which ActivityPub is inherently incompatible with.

That is, if a Message’s Content-Type is application/ld+json; profile="https://www.w3.org/ns/activitystreams", it does not have nor need a single representation anyway due to how ActivityPub is designed: we cannot relay such a Message, and must always redirect to the original AP server[1]. But if it is a fully-LAP message - e.g. application/json; profile="https://lap.example" or whatevs - we can use a JSON-LD subset ActivityPub will still understand.

On top of this, there are some concepts that are inherently different between LAP and AP. What AP represents as:

{
  "@context": "https://www.w3.org/ns/activitystreams",
  "type": "Follow",
  "actor": "acct:alice@example.social",
  "object": "acct:bob@example.social"
}

{
  "@context": "https://www.w3.org/ns/activitystreams",
  "type": "Accept",
  "actor": "acct:bob@example.social",
  "object": {
    "type": "Follow",
    "actor": "acct:alice@example.social",
    "object": "acct:bob@example.social"
  }
}

LAP represents as (pseudo-syntax; actor is provided by a layer 7.1a header):

{
  "@context": "https://www.w3.org/ns/activitystreams",
  "type": "Follow",
  "object": "web+lap:key:<DID public key URI>"
}

{
  "@context": "https://www.w3.org/ns/activitystreams",
  "type": "Add",
  "object": "web+lap:key:<DID public key URI>",
  "target": "web+lap:group:<a Group of their choosing>"
}

So a translation layer will be needed (an ActivityPub Actor could be represented as an LAP Group(Key, ActivityPub(Key)), where ActivityPub is a translating Aupro based on the same Key).
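A minimal sketch of what such a translation could look like for the Follow and Accept examples above. The `key_of` and `followers_group_of` lookups are hypothetical callbacks (how an acct URI maps to a key or a Group is exactly the part still to be designed), and the returned sender stands in for the layer 7.1a header.

```python
AS_CONTEXT = "https://www.w3.org/ns/activitystreams"


def translate(activity: dict, key_of, followers_group_of) -> tuple:
    """Turn an AP Follow or Accept(Follow) into (sender actor, LAP body).

    key_of(acct) -> DID key string; followers_group_of(acct) -> Group URI.
    Both are assumed lookups supplied by the caller.
    """
    sender = "web+lap:key:" + key_of(activity["actor"])
    obj = activity.get("object")
    if activity["type"] == "Follow":
        body = {
            "@context": AS_CONTEXT,
            "type": "Follow",
            "object": "web+lap:key:" + key_of(obj),
        }
    elif (activity["type"] == "Accept"
          and isinstance(obj, dict) and obj.get("type") == "Follow"):
        # Accepting a Follow becomes adding the follower to a Group.
        body = {
            "@context": AS_CONTEXT,
            "type": "Add",
            "object": "web+lap:key:" + key_of(obj["actor"]),
            "target": followers_group_of(activity["actor"]),
        }
    else:
        raise ValueError("unsupported activity: " + str(activity.get("type")))
    return sender, body


# Made-up key and group identifiers, for illustration only.
KEYS = {"acct:alice@example.social": "did:key:zAlice",
        "acct:bob@example.social": "did:key:zBob"}
follow = {"@context": AS_CONTEXT, "type": "Follow",
          "actor": "acct:alice@example.social", "object": "acct:bob@example.social"}
accept = {"@context": AS_CONTEXT, "type": "Accept",
          "actor": "acct:bob@example.social", "object": follow}

group_of = lambda acct: "web+lap:group:followers-" + KEYS[acct]
print(translate(follow, KEYS.__getitem__, group_of))
print(translate(accept, KEYS.__getitem__, group_of))
```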


  1. That is, in layers where Message body is not opaque. It is totally possible to store Messages as binary blobs in a dumb Layer 7.1a relay. ↩︎

Why do you want to move actor out of the message? Wouldn’t that tie everything to HTTP transport, even more than it is tied in ActivityPub?

I think it’s better when messages are self-contained. And if an ActivityPub message includes an embedded signature (e.g. FEP-8b32), it can be relayed, and its recipients don’t need to contact the original server.

How are those URIs resolved?