Desired changes for a future revision of ActivityPub and ActivityStreams

trwnh · September 17, 2024, 10:29pm

@codenamedmitri says in A brief and unromantic history of ActivityPub - #17 by codenamedmitri

There was a previous discussion touching on this subject: Should we fork AS/AP specs to Codeberg, create vNext drafts?

To summarize, we can break changes up into three categories:

minor corrections and clarifications
backwards-compatible changes and progressive enhancements
backwards-incompatible breaking changes

We also have some issues that have piled up on the GitHub repos for AS2 and AP, labeled “next version”:

codenamedmitri · September 17, 2024, 10:52pm

Awesome, thank you so much!

aschrijver · September 18, 2024, 5:38am

One question where discussions have raged on for years is whether AP is JSON-first or rather a Linked Data spec, as mentioned in ActivityPub: A Linked Data spec or JSON spec with Linked Data profile?

There were long discussion threads on the fedi around this theme recently (particularly pinging @hrefna, @jenniferplusplus and @stevebate), and many others now lost in fedi history and link rot unfortunately. Key question is:

Is the linked data based extension mechanism sufficient to specify robust protocol extensions?

trwnh · September 18, 2024, 6:04am

i get the feeling that the popular intent is JSON-first with LD compatibility, but if you want to do any sort of extensibility then it immediately becomes an LD spec or else you will go utterly mad trying to deal with the infinite possibilities and inherent ambiguity and lack of context. Or you force everyone to use the same context as you, which is… decidedly not how context works, and goes against the whole point of having context in the first place.

So basically, I think LD is definitely sufficient for robust extensions. It’s also necessary.

stevebate · September 18, 2024, 6:13am

I’m not totally clear about what the “linked data based extension mechanism” is in ActivityPub. @evan has proposed a process for extending the JSON-LD context, but if that’s the whole mechanism, then I believe it’s insufficient (no structural definitions and almost no data constraints on serialized AP data/messages, for example). For non-LD consumers, the term definitions in the JSON-LD context are practically useless.

There’s a convention in the (plain) JSON community, to use “$schema” to refer to JSON Schema URIs for a given message. For non-LD usage (which @evan has stated is the primary intended usage of AP), I think that this approach could be more useful to developers.

I’m not claiming it’s a great idea, but I’ve also been wondering if we could fork the spec to create an ActivityPub-LD (with Linked Data (RDF) support) spec and vanilla JSON ActivityPub (not LD-focused) spec. The latter could use JSON Schema to define data structure and constraints. The former would fix the existing problems with ActivityPub in an LD context (like the errors in the C2S partial update approach). We’d need to work through some interop issues, if that’s a goal, but at least it would be less ambiguous for developers.

To be clear, I’d be more a fan of ActivityPub-LD, but I believe the current lack of clarity in the recommendation is a disservice to both LD and non-LD developers (who are a large majority of developers at this point).

Theoretically, we could revive the AS2 OWL ontology, fix it, update it to support ActivityPub, and make it normative. It could then provide additional semantics for the JSON-LD terms. However, that’s probably more in the scope of ActivityPub-LD since most developers aren’t going to be interested in learning about Linked Data, RDF, description logic, ontologies, and so on. @evan has accused me of conjuring dark and evil forces for even discussing it.

naturzukunft · September 18, 2024, 7:54am

I had the same thoughts. But I think it’s better to have an LD spec and a description of “how to plain JSON”.

SorteKanin · September 18, 2024, 8:19am

In no particular order:

Any kind of mechanism to make it possible to migrate from one AP server implementation to another on the same domain without having to keep legacy IDs, see also this thread. This will likely not be backwards-compatible but I think it’s very important for the long-term success of the fediverse. Otherwise domains will be stuck with the same implementations (or maybe forks) forever. That seems bad.
Eschew JSON-LD and just use JSON. To me, JSON-LD seems too complicated for its own good. Most (AFAIK) implementations don’t bother actually handling it as JSON-LD and just treat it as JSON. Personally I had never even heard of JSON-LD before I read about ActivityPub. This made understanding the spec majorly difficult as JSON-LD seemed very strange. I think using plain JSON would simplify things for almost no disadvantage. The argument that JSON-LD allows better extensibility falls kinda flat when most implementations do not even properly use JSON-LD. The different representations that JSON-LD has also make treating it as plain JSON problematic, but AFAIK the spec doesn’t require using JSON-LD. It all seems very weird and again, plain JSON would just be simpler and I think simplicity should be prioritized a lot more than it has been. I believe extensibility could still be achieved quite painlessly with plain JSON. This is not backwards-compatible either.
Support Dislike in the same fashion as Like, with a dislikes collection. The asymmetry here seems strange to me. Should be backwards-compatible.
Support emoji reactions in the base spec. This is a very common thing on most social media so it is kind of strange that ActivityPub doesn’t support it natively. See also this thread. If added as a separate thing (which would seem fine to me), it should be backwards-compatible. It could also replace Likes/Dislikes and those could be modelled with thumbs-up and thumbs-down reactions, but that would not be backwards-compatible.
Get rid of HTTP Signatures and instead provide a way to put signatures directly inside the JSON objects. This means that every object is fully self-describing and self-authenticating. The fact that the signature is associated with the request and not the object seems weird to me since it is the object that needs to be verified. In fact you need to cross-reference the HTTP Signature and the object anyway to properly verify the object, which should be a good hint that the signature should be on the object, not the request. Probably not backwards-compatible.
Provide some way to send activities in bulk. I.e. instead of 1 request = 1 activity, allow 1 request = many activities. This just seems like a reasonable optimization as you can send a lot of activities to a shared inbox at once that way. If signatures are on the objects themselves, each object can also include a signature on its own and so the request doesn’t need the HTTP Signature (which would be difficult to provide for many activities at once). Probably not backwards-compatible.
Provide better semantics for forums/reddit-like implementations in the base spec. As it is, ActivityPub is unfortunately quite focused on microblogging and the following of individual actors for content. However, social media like forums or reddit don’t work like that. You don’t follow actors, you follow “categories” or collections of posts. The way apps like Lemmy currently work around this is by modelling communities (subreddits) as Groups, where the group is a collection of all the people subscribed to the community. But this seems unnatural, as the more semantically accurate thing would be to follow a kind of category or collection, not an actor. It’s also slightly hacky as the community has to Announce (or boost in mastodon terms) every single piece of content (posts, comments, votes) to all the followers, even if that content comes from external users, which seems like a strange way to model it. Honestly don’t know if this could already be done more cleverly but long story short this use case seems poorly supported by ActivityPub right now. This could maybe be done backwards-compatibly?
This could be very hard, but possibly provide some standard way to signify roles and permissions for actors. For instance, it is useful to know that a certain user is an admin of another instance and has certain permissions because of that. Currently Lemmy hardcodes a “moderator” role and assumes certain permissions based on that, but that seems rigid and not very flexible. Ideally roles and/or permissions could somehow be specified flexibly. Not sure if that is backwards-compatible.
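
The object-level signature and bulk-delivery items above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `proof` property name, the sorted-key JSON serialization (a stand-in for a real canonicalization scheme), and above all the HMAC with a shared secret, which only stands in for a real public-key signature because the Python standard library has no asymmetric signing. The point it shows is that once the proof travels inside the object, each item of a batch can be checked on its own, without reference to the HTTP request that carried it.

```python
import hashlib
import hmac
import json

PROOF_KEY = "proof"  # hypothetical property name, not defined by any spec


def canonical_bytes(obj: dict) -> bytes:
    """Serialize deterministically, leaving out any existing proof."""
    body = {k: v for k, v in obj.items() if k != PROOF_KEY}
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_object(obj: dict, secret: bytes) -> dict:
    """Return a copy of obj that carries its own proof value."""
    # Assumption: HMAC-SHA256 stands in for a real public-key signature.
    digest = hmac.new(secret, canonical_bytes(obj), hashlib.sha256).hexdigest()
    return {**obj, PROOF_KEY: digest}


def verify_object(obj: dict, secret: bytes) -> bool:
    """Check the embedded proof without looking at any HTTP request."""
    expected = hmac.new(secret, canonical_bytes(obj), hashlib.sha256).hexdigest()
    return hmac.compare_digest(str(obj.get(PROOF_KEY, "")), expected)


def verify_batch(activities: list[dict], secret: bytes) -> list[bool]:
    """Verify every activity of a bulk delivery independently."""
    return [verify_object(activity, secret) for activity in activities]
```

A receiver could then accept the valid items of a batch and reject the rest, instead of accepting or rejecting the whole request.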

Laxystem · September 18, 2024, 10:41am

Get rid of all types that are not unambiguously defined, and provide a first contact mechanism to declare supported extensions.

SorteKanin · September 18, 2024, 11:00am

I’m also personally quite confused about the distinction between Article, Document, Note and Page. They don’t seem to have any meaningful difference aside from the type name. It’d be great if this was consolidated into a single Post or something like that.

I also find it weird that Link is not a subtype of object.

On another note, I think in general inheritance is just bad and the Activity Vocabulary is built around an inheritance hierarchy. It should instead define each type independently and then say what behaviors are expected from each type. This is kind of analogous to hardcore object-oriented programming (such as in C++/Java) versus a type-class/trait system (such as in Rust/Haskell).

FenTiger · September 18, 2024, 2:03pm

A few of mine:

Formally specify that “updated” is intended to be a machine-readable field that’s useful for “most recent wins” conflict resolution. That is, if you receive two different versions of the same object, you should keep the one with the most recent “updated” timestamp. I’m pretty sure Mastodon does this already, so hopefully this won’t be controversial.
Formally specify how URLs with fragment IDs should be resolved.
The sharedInbox mechanism would be more useful if it had a way to say “deliver to these specific recipients”, rather than making the receiver try to work it out from the addressing fields (without any knowledge of which recipients may have already been handled). Basically the same thing as the “envelope recipients” in SMTP. I’m not sure how best to add this - maybe a HTTP header, a new “Deliver” activity, or something else.
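
The “most recent wins” rule for `updated` described above could look roughly like the following sketch. The function names are hypothetical, and two choices are assumptions rather than anything the spec says: falling back to `published` when `updated` is absent, and treating a timestamp without an offset as UTC.

```python
from datetime import datetime, timezone


def parse_timestamp(value: str) -> datetime:
    """Parse an xsd:dateTime string such as 2024-09-18T14:03:00Z."""
    # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11,
    # so normalize it to an explicit UTC offset first.
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # Assumption: treat timestamps without an offset as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def version_time(obj: dict) -> datetime:
    """Use `updated`, then `published`, then the oldest possible time."""
    raw = obj.get("updated") or obj.get("published")
    if raw is None:
        return datetime.min.replace(tzinfo=timezone.utc)
    return parse_timestamp(raw)


def most_recent_wins(stored: dict, incoming: dict) -> dict:
    """Keep whichever version of the same object was updated most recently."""
    if stored.get("id") != incoming.get("id"):
        raise ValueError("not two versions of the same object")
    # Ties keep the stored copy, so replayed deliveries change nothing.
    return incoming if version_time(incoming) > version_time(stored) else stored
```

Because the comparison depends only on the timestamps, the result is the same whichever order the two versions arrive in.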

trwnh · September 18, 2024, 7:07pm

FWIW here are my personal takes, at least for now:

Revive the Multibox endpoint, or some similar mechanism for specifying exact inboxes to deliver to.

The sharedInbox we ended up with has some pretty egregious shortcomings with respect to the receiving server having to interpret how it should handle delivery, and the popular use of followers collections in the addressing properties introduces a dependency on state synchronization that would otherwise be completely avoidable. For example, you may think you know who the local followers are for any given remote actor, but you actually don’t! The local actor may have been silently removed from the followers collection, or you might have gotten a removal activity that you didn’t understand for some reason and therefore didn’t process the correct side-effects for it. The problem only gets worse when you start to consider things like bto/bcc, or addressing arbitrary collections of actors a la “Circles” or “Aspects”.

Make Groups make sense

Looking at other vocabularies, you have things like foaf:Group which is a subclass of foaf:Agent that can have foaf:member statements. In VCard, you have vcard:Group which also uses vcard:hasMember to point to constituent VCards. There is no such mechanism for AS2 Groups. It could make sense to use the Join/Leave activities to manage membership, where membership is denoted by an additional special collection. It could also make sense to extend this mechanism to Organization, although I imagine some hierarchy-related vocabulary might make more sense there.
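
As a rough sketch of the Join/Leave idea, the side effects on a membership collection might be computed as below. The member set stands in for the “additional special collection”, which AS2 does not define today; the function names are hypothetical, and a real implementation would presumably also need some approval policy before honoring a Join.

```python
from typing import Optional


def ref_id(value) -> Optional[str]:
    """Accept either a bare IRI string or an embedded object with an id."""
    if isinstance(value, str):
        return value
    if isinstance(value, dict) and isinstance(value.get("id"), str):
        return value["id"]
    return None


def apply_membership(members: frozenset, group_id: str, activity: dict) -> frozenset:
    """Return the group's member set after a Join or Leave aimed at it."""
    actor = ref_id(activity.get("actor"))
    if actor is None or ref_id(activity.get("object")) != group_id:
        return members  # not aimed at this group, so no side effect
    kind = activity.get("type")
    if kind == "Join":
        return members | {actor}
    if kind == "Leave":
        return members - {actor}
    return members  # any other activity type leaves membership untouched
```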

Resolve the `as:Public` issue

The JSON-LD context defines Public as as:Public which in theory should expand to https://www.w3.org/ns/activitystreams#Public, and in theory, all three should be equivalent – but in reality, because the JSON-LD context also defines as: as a prefix mapping to https://www.w3.org/ns/activitystreams#, compaction will always generate as:Public and never generate anything else. Further compounding this issue is the fact that the properties you’d use such a Public term in… are not defined to take advantage of vocabulary mappings. Properties like to, cc, and so on are defined as @type: @id and not as @type: @vocab. Not that they necessarily should be defined as @type: @vocab, but if the intent is to be able to refer to Public, then this reference should be within a term that is defined as @type: @vocab. This means defining a new property dedicated specifically to denoting that an object is addressed to or intended for a specific class instead of specific actors. This is similar to how Web Access Control has predicates for both agent and also agentClass, the latter of which allows you to specify Agent or AuthenticatedAgent (while the former allows you to specify other people’s WebIDs). There’s also agentGroup for denoting instances of vcard:Group whose members are allowed to access the resource.
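
In practice, a consumer that treats AS2 as plain JSON ends up checking for all three spellings by hand. A minimal sketch of that workaround follows; the helper names are hypothetical, and the set of properties inspected is an assumption about where a Public reference may appear.

```python
AS_NS = "https://www.w3.org/ns/activitystreams#"

# The full IRI, the prefixed form produced by compaction, and the bare term.
PUBLIC_FORMS = frozenset({AS_NS + "Public", "as:Public", "Public"})

# Assumption: these are the properties worth inspecting.
ADDRESSING_PROPERTIES = ("to", "bto", "cc", "bcc", "audience")


def as_list(value) -> list:
    """Plain-JSON AS2 may give a single value or an array of values."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def target_id(entry):
    """An addressing entry may be an IRI string or an embedded object."""
    if isinstance(entry, dict):
        return entry.get("id")
    return entry


def is_public(obj: dict) -> bool:
    """True if any addressing property names the Public collection."""
    for prop in ADDRESSING_PROPERTIES:
        for entry in as_list(obj.get(prop)):
            tid = target_id(entry)
            if isinstance(tid, str) and tid in PUBLIC_FORMS:
                return True
    return False
```

A dedicated property defined as `@type: @vocab`, as suggested above, would make this kind of string matching unnecessary.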

Better support for access control and not just delivery

Continuing from the above point, properties like to, cc, audience are intended to trigger ActivityPub delivery. But there are cases where you might want to allow an actor to fetch a resource without necessarily delivering to them. It might be possible to just use WAC for this, I’m not entirely sure. But there are going to be undefined interactions with the existing audience addressing properties, which seem to imply a limited form of access control already.

Redefine `Mention` in a way that doesn’t depend on microsyntax?

Right now, a Mention is a special type of Link that… “represents an @mention”? Even though it doesn’t even require an @ character? Basically, the definition of Mention doesn’t line up with anything useful in reality. There’s an aspect of linking to something that maps to a “user”, but this is artificial because ActivityPub doesn’t really have a concept of “users” or “accounts”, only “actors”. It’s not meant to generate a notification either, because that’s what to and cc are for. So what is it really for? What should it be for? Referring to an actor without generating a notification? It seems to be entangled with the more general issue of microsyntax, but surely there’s some semantic meaning we can extract out of it. Otherwise, we might as well just tag a Link instead of subclassing it as a Mention. For prior art or related concepts, we can look at Webmentions, where it is possible to mention pages or resources and not just people.

silverpill · September 18, 2024, 7:12pm

I think the ActivityPub spec is mostly good as it is.

Minor corrections and clarifications are very welcome. In particular, a new version of the spec needs to somehow address common misconceptions such as “ActivityPub requires JSON-LD” and “identities are attached to domain names”.
Some requirements can be relaxed to enable new kinds of applications. New protocol features can be added, but only if there’s overwhelming support from implementers (only HTTP signatures and Webfinger qualify as such).
No backwards-incompatible changes. ActivityPub should be treated similarly to other widely used web standards.

strypey · September 20, 2024, 2:29pm

This is another example of a place where the AP spec could delegate this to FEPs, ideally with a mechanism for allowing servers to signal which they support, and which they prefer.

It’s a shame an FEP-style process wasn’t standardised along with the AP spec, because then a lot of the deadlocks in the drafting discussions could have been resolved by delegating them to FEPs, allowing a lot of dead wood and pet preferences to be pruned out of the draft. If there’s anything I’d add to the agenda for a new Social Web Working Group, it’s formally standardising the FEP process.

They’re pretty common on the web in general, but this isn’t a good argument for hardcoding one approach to them in the HTTP spec. Again, emoji reactions are something much better handled in FEPs. Why? A few reasons, including:

They are non-essential to federation between two or more social web servers.
There are plenty of conceivable use cases for AP federation where they would be surplus to requirement.
They would add avoidable complexity to the core AP spec.

But the most important reason is that there are bound to be a range of ways to implement the federation of emoji reactions, different emoji sets that can be used and so on. If there is disagreement about the best approach, or new approaches emerge, more than one emoji reaction FEP can be defined. Implementers are free to choose which one(s) they want to support, while remaining AP-compliant.

SorteKanin · September 20, 2024, 3:19pm

Couldn’t you say the same thing about many of the “native” activities? Like Like or Block or Question or any of the other more “obscure” native activity types. I don’t really see why an emoji reaction doesn’t fit in there. To be honest, the activity vocabulary seems very arbitrary in that sense to me (as I also mentioned above, Article, Document, Note and Page? What?).

In a way I agree with you and I wish the base spec didn’t include all these super weirdly specific things (Travel? Offer? TentativeAccept?) but it does. I guess I’d be fine with reducing the base vocabulary and having more types defined outside the base spec.

strypey · September 20, 2024, 5:17pm

Perhaps I’m showing my ignorance here, or perhaps we’re speaking at cross-purposes.

My comments were focused on the spec for the ActivityPub federation protocol. As I understand it, the Activities you mention are defined in the spec for the Activity Vocabulary, which extends the ActivityStreams data format. If you wanted to embed emoji reactions deeper than the FEP level, adding a Reaction activity to AS could well be the way to do it. Although I’m not familiar enough with the details of AS to know whether or not there’s already an Activity for that, and it wouldn’t surprise me if there is.

I could well be wrong here (I’m a fediverse evangelist not an AP implementer), but I don’t think the AP spec says that an implementation must be able to send and accept all the Activities in the Activity Vocabulary. If it does, then Mastodon certainly isn’t compliant. I suspect what it does say is that if you send and receive an Activity, you must do so as defined in the AS spec and its Activity Vocabulary.

Can anyone confirm or correct me on this?

SorteKanin · September 20, 2024, 5:47pm

I think that is correct, you don’t need to cover all activity types (though you could choose to display them in an actor’s inbox unchanged even if you don’t understand the activity, but almost nobody does C2S so whatever).

However that still means that the ones defined in the Activity Vocabulary have a kind of “canonical interpretation”. Like you couldn’t choose to interpret the “Travel” type in a different way than what the Activity Vocabulary says, otherwise you wouldn’t follow the spec. So if you wanted to do “Travel” in a different way (in the same way that you suggest there might be different implementations of emoji reactions), then that would not be standard. That’s how I see it at least.

Honestly, I just find the idea of the activity vocabulary and the way that it seems to “define the world” with these arbitrary categories quite strange. I feel like it should’ve been possible to define AP with a simpler, smaller set of (purely JSON, not JSON-LD) objects. For instance, consolidating Article, Document, Note and Page into a single “Post” object.

Maybe you could even go further and have a very general “Content” object that specifies a media type and then you could lob Image, Audio and Video in there as well. Then you can work with more general media types, which might be easier. I haven’t thought about it in super detail but I do feel the current categories are quite arbitrary. In general I find ActivityPub more semantically complicated than it could be.

arcanicanis · September 20, 2024, 8:19pm

I’m not sure how to articulate a specific design choice to favorably accomplish this, but maybe it would be nice if it were a requirement for Activities to be dereferenceable by their ID (and maybe have posts have some link back to their last activity?).

The crux of my annoyance is that you operationally get stuck with storing the body of an Activity twice: once for the activity itself, then again in another table for the post content (for a Note, Article, etc). Users can query individual posts from another instance by the post’s ID, which returns the bare post object without its Activity envelope (Create/Update); thus if you try modeling a database where there’s an ‘activities’ table, along with a ‘posts’ table that just points to the latest activity that holds the latest representation of the post, the whole model concept falls apart if you have the post without its originating/latest activity (such as from manually querying a remote post).

Maybe it could be inverted where the activity ID MUST be dereferenceable, while the Note/Article/etc just has some virtual [non-dereferenceable] identifier instead (that’s scoped under the user)? I don’t know the exact solution, but it’s one of the big itches I have that just compounds database bloat.

trwnh · September 21, 2024, 2:14pm

This is why @context exists. Travel in the ActivityStreams context document expands to https://www.w3.org/ns/activitystreams#Travel and refers specifically to the concept or notion of a Travel activity as defined by the Activity Vocabulary. If someone wanted to have different semantics for travelling, they can do that in a different vocabulary.

This is what you get “for free” with JSON-LD, and what you stand to lose by moving to “pure JSON”: a complete lack of context, and therefore a complete lack of decentralized extensibility. It also creates a VERY fragile handling of how to name any given property or class, because you don’t have anything else to work with other than a single unqualified bare string. You wouldn’t be able to tell if actor meant “an entity performing an activity” or “a person who played a performance role in a movie”. You would suddenly need a central body to decide which use of the term actor is “correct”, and they would have to decide to potentially rename terms to resolve this ambiguity, which would break all existing implementations.

Or, the much simpler solution ahead-of-time is to be unambiguous with what you meant in the first place. You and I should both know exactly what is meant by https://www.w3.org/ns/activitystreams#actor versus what is meant by http://schema.org/actor. The @context mechanism in JSON-LD allows you to not have to type out the whole IRI every single time, using your own mapping of terms as you please. And the wonderful thing is that it’s possible that you and I don’t need to agree to use the same context – I can use a context that maps doer to https://www.w3.org/ns/activitystreams#actor, and you can use a context that maps as2_activity_performer to the same IRI. Expanding the term to a full IRI should arrive at the same IRI, and thus we know for sure that we both meant the same thing.
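
The “different terms, same IRI” point can be shown with a deliberately tiny sketch. This is not the JSON-LD expansion algorithm: it only handles a flat, inline `@context` that maps terms directly to IRIs, and the `expand_terms` name is hypothetical. A real implementation would hand this job to a JSON-LD processor.

```python
AS_ACTOR = "https://www.w3.org/ns/activitystreams#actor"
SCHEMA_ACTOR = "http://schema.org/actor"


def expand_terms(document: dict) -> dict:
    """Replace each top-level key with the IRI its own @context maps it to."""
    context = document.get("@context", {})
    expanded = {}
    for key, value in document.items():
        if key == "@context":
            continue
        # Keys without a mapping are kept as-is in this toy version.
        expanded[context.get(key, key)] = value
    return expanded


mine = {
    "@context": {"doer": AS_ACTOR},
    "doer": "https://example.com/alice",
}
yours = {
    "@context": {"as2_activity_performer": AS_ACTOR},
    "as2_activity_performer": "https://example.com/alice",
}
movie = {
    "@context": {"actor": SCHEMA_ACTOR},
    "actor": "https://example.com/alice",
}

# Different shorthand, same meaning once expanded.
assert expand_terms(mine) == expand_terms(yours)
# A similar-looking bare term, but a different meaning once expanded.
assert expand_terms(movie) != expand_terms(mine)
```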

The juggling act that the Activity Streams 2.0 spec had to do is to decide that the AS2 vocabulary reigns supreme, that AS2 documents “MUST NOT override or change the normative context.” This is what makes the “plain JSON” interpretation of AS2 possible, but it also means that using shorthand terms for extensions is suddenly much more difficult than it needs to be. You end up with a problem where, for best results, two people need to agree to use exactly the same context, otherwise their extensions will not match in shorthand. The only way around this is to generate your AS2 documents with full IRIs always, or otherwise you expect that any “plain JSON” consumer shares the same context. (For more about this specific issue, see FEP-e229: Best practices for extensibility.)

trwnh · September 21, 2024, 2:18pm

I think that this requirement already exists: ActivityPub

Implementations don’t always comply with this, though. Maybe they don’t realize activities are objects, too. Whatever the reason, they are wrong.

But we are missing a mechanism to go from a non-activity object to its “activity log” of sorts. I think that, plus some kind of revision mechanism, would be a good thing to work on.

stevebate · September 21, 2024, 4:08pm

It’s there if you use Linked Data / RDF, but … nevermind.