NextGen ActivityPub Social API

stevebate · November 12, 2024, 5:41am

The current ActivityPub Social API (C2S API) is not widely implemented in servers and almost no clients exist for it. Part of the reason is that the standard C2S feature set is very sparse compared to the Mastodon Client API, which is supported by many clients. For example, the AP Social API is missing features like:

Search
Streaming events
Timelines (different from inbox/outbox, at user and server/fediverse levels)
Authentication / Authorization
Bookmarks (and custom collection discovery and management, in general)
Media upload (described in nonnormative wiki)
Actor/Account notes

Given almost all popular Fediverse clients are using the Mastodon Client API, many servers have decided to implement it to support those clients. The Mastodon team has requested that server implementers not do this, but the lack of any good alternative will result in this practice continuing.

I’m wondering if it would be useful to define these features as a group of NextGen Social API FEPs. If so, one challenge is transitioning ActivityPub clients and servers to this extended Social API.

I’ve done some experimentation with software that will provide a façade for the Mastodon Client API and requires no Mastodon software changes. It’s a special kind of proxy that implements the extended Social API features but passes through requests to the Mastodon API where it makes sense. The Mastodon Account and Status URIs don’t change so it doesn’t break the existing social graph. The Mastodon Client API remains usable with this approach, so current clients will continue to work during the migration phase.

The façade can be deployed without any support from the Mastodon team and, theoretically, would work with any server implementation that correctly implements the Mastodon Client API.
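To make the pass-through idea a bit more concrete, here is a minimal sketch (not the actual façade) of how a C2S `Create`/`Note` posted to an outbox might be translated into a Mastodon Client API call. The function names and the visibility mapping are assumptions made for illustration; the target is Mastodon's `POST /api/v1/statuses` endpoint with its `status`, `visibility` and `spoiler_text` parameters. The sketch also assumes the note content can be sent as plain text, which a real façade would have to handle more carefully.

```python
PUBLIC = "https://www.w3.org/ns/activitystreams#Public"


def as_list(value):
    """Normalize an AS2 property that may be absent, a single value, or a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def create_note_to_mastodon_request(activity):
    """Describe the Mastodon API request equivalent to a C2S Create/Note.

    Returns (method, path, params), or None when the activity cannot simply
    be passed through and the facade would have to handle it itself.
    """
    obj = activity.get("object")
    if activity.get("type") != "Create" or not isinstance(obj, dict):
        return None
    if obj.get("type") != "Note":
        return None

    # Assumed, simplified mapping from AS2 addressing to Mastodon visibility.
    if PUBLIC in as_list(obj.get("to")):
        visibility = "public"
    elif PUBLIC in as_list(obj.get("cc")):
        visibility = "unlisted"
    else:
        visibility = "private"

    params = {"status": obj.get("content", ""), "visibility": visibility}
    if obj.get("summary"):
        # Mastodon exposes content warnings as spoiler_text.
        params["spoiler_text"] = obj["summary"]
    return ("POST", "/api/v1/statuses", params)
```

Anything that returns `None` here (a `Follow`, an `Add` to a custom collection, and so on) is where the façade would need its own logic rather than a pass-through.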

This is one implementation intended to support Mastodon-related migration. However, other servers could implement the FEPs “natively” rather than using a façade. After the FEPs stabilize, even the Mastodon team could do this and eliminate the need for the façade.

I can think of other features that would go beyond what the Mastodon Client API offers, but in this post I’m focusing on features that would give the Social API approximate feature parity with the Mastodon Client API.

This would be a relatively big project. I’m curious how much interest there would be in doing something along these lines.

aschrijver · November 12, 2024, 7:31am

I think it would be very useful. The intent of the AP recommendation is specifically to facilitate both server and client. Yet C2S has always remained elusive, with people musing about its potential and impact, yet the exploration requiring too much effort to pull off. Information on these attempts, like @yvolk’s AndStatus C2S open issue, is dispersed and prone to link rot.

The specs alone aren’t enough to lift C2S off the ground, esp. with the shortcut of a battle-tested Mastodon API to do the job. So FEPs can show the way.

Two projects, or subprojects, then: the FEPs and the façade, where the latter can serve as a transition path and as a reference implementation from the client perspective. The façade project is a cool idea, and important to bring the “Mastoverse” along, yet there may be enough interest in implementing natively to get the FEP project going.

I posted to draw some fedi attention (boosts appreciated):

In terms of finding interested native implementers the more specialty fedi projects are good candidates to nudge with this inquiry. I am thinking of @srosset81 ActivityPods, @naturzukunft rdfPub, @acka47 SkoHub. Also pinging @hrefna and @thisismissem with their good perspectives on next-gen social web.

I will not become directly involved myself, but can lend a helping hand here and there like I do now. Also, should you be interested, I can offer fedi.foundation as a ready-to-go portal to communicate about the effort. It is a multi-author magazine site meant as a research/dev portal, where “foundation” refers to the Social Web technology base (and not to a non-profit).

PS. @dansup of Pixelfed replied to my toot that he is considering adding C2S support to Loops, the short video platform that is in early access now.

stevebate · November 12, 2024, 7:42am

Right. Actually, I’m thinking of three related projects:

FEP definitions (each one could be considered a subproject in itself)
Mastodon Social API Façade
NextGen Social API reference client implementation

The latter would demonstrate using the API. It wouldn’t be intended to be as polished as existing clients or to compete with them.

Even if some servers aggressively implement the FEPs natively, the façade could convince client developers to add support for the new API without the risk of losing large numbers of Mastodon users.

trwnh · November 12, 2024, 11:50am

i think that

is pretty spot-on, but at the same time needs a bit of reframing. i think that the primary reason that C2S was not adopted widely is not purely because of “missing features” but more accurately a sort of “impedance mismatch”. the C2S api is simply trying to do something different from what a social network is trying to do.

by which i mean:

a social network wants to enable its users to do all of the above things and more;
AP C2S was written to enable its users to maybe perform simple resource manipulations and/or push notifications to some recipients.

i’ve encountered this dynamic in trying to build a playground / “test server” for what i consider to be “minimum viable activitypub”: basically LDN + the AP addressing properties. this approach is fully compliant with the “Federated Server” profile (pre-errata), and while it discounts a few SHOULD recommendations, it does so because they are inapplicable. the way it works is like so:

there is an outbox endpoint.
- it accepts as2 activities for now (but could be extended to accept arbitrary RDF payloads for LDN purposes).
- its sole function is to look at the values in as:to as:cc as:audience as:bto as:bcc, try to discover any ldp:inbox for each resource, and then deliver to the discovered inboxes.
  - this makes it compliant with 7.1.1 outbox requirements for the “Federated Server” conformance profile.
  - it does not perform any of the C2S side effects, so it is not compliant with the “Server” conformance profile.
there is an inbox endpoint.
- it accepts as2 activities for now (but could be extended to accept arbitrary RDF payloads for LDN purposes)
- its sole function is to receive notifications.
  - it does not perform any S2S side effects, but this is because the side effects are inapplicable – the inbox server does not store any representations other than writing the notification payload to disk and allowing it to be read via HTTP GET.
  - consequently, since all the S2S side effects are SHOULD statements that are inapplicable, we maintain compliance with the “Federated Server” conformance profile.
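The outbox behaviour described above can be sketched roughly as follows. This is an illustration under assumptions, not the actual test server: `fetch` and `post` are hypothetical injected callables standing in for HTTP GET and POST, recipients are assumed to expose their inbox under the JSON key `inbox`, and collections are not expanded.

```python
ADDRESSING = ("to", "cc", "audience", "bto", "bcc")


def as_list(value):
    """Normalize an AS2 property that may be absent, a single value, or a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def recipients(activity):
    """Collect unique recipient IDs from the addressing properties, in order."""
    seen = []
    for prop in ADDRESSING:
        for item in as_list(activity.get(prop)):
            rid = item.get("id") if isinstance(item, dict) else item
            if rid and rid not in seen:
                seen.append(rid)
    return seen


def deliver(activity, fetch, post):
    """Deliver the activity to every inbox that can be discovered.

    fetch(url) returns the addressed resource as a dict (or None);
    post(inbox_url, payload) sends the payload. Returns the inboxes used.
    """
    # ActivityPub requires bto/bcc to be removed before delivery.
    payload = {k: v for k, v in activity.items() if k not in ("bto", "bcc")}
    delivered = []
    for rid in recipients(activity):
        resource = fetch(rid) or {}
        inbox = resource.get("inbox")
        if inbox and inbox not in delivered:
            post(inbox, payload)
            delivered.append(inbox)
    return delivered
```

Note that nothing here stores the activity or applies side effects, which matches the "delivery only" role of this outbox.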

of course there are some natural extension points:

as described above, supporting arbitrary RDF payloads would make this compliant with LDN, and this is not terribly hard to do since LDN has provisions for discovering supported serializations (HTTP OPTIONS to check the Accept-Post headers).
i may want to handle Follow activities at some point. the “Follow protocol” described in AP is probably one of the more/most well-defined parts of the “protocol”. the challenge is in wiring it up in a way that flows naturally between entities.
- in particular, because of the Accept/Reject requirement, the only way this could ever be handled on the server is by handling it automatically. i would want to support other options, of course.
  - it may be handled at the server level automatically (e.g. if the actor is not blocked, then respond with an Accept immediately.)
  - it may be handled at the client level automatically or manually (with more capacity for client-side rules and automations, based on information the server does not have).
  - it may be handled at the user level manually (e.g. if the server does not handle it, and the client does not handle it, then the user may manually read the Follow in their inbox and manually send an Accept/Reject to that actor).
i may want to add integration between the outbox and/or inbox with a “storage service”.
- the “storage service” would store the activities sent and received.
- the “storage service” might also have its own inbox which you might use as an interface for communicating with the “storage service” (if such a protocol/profile is defined). at least in Solid right now they use HTTP verbs but you could imagine using AS2 Create/Update/Delete/Add/Remove to interact with such a “storage service” both remotely and automatically. the user would simply add the “storage service” as a recipient on their activity before sending it to the outbox; the activity would then flow from the user’s outbox to the “storage service” inbox where it would have side effects.
  - if this were not done, then the user or their client would have to manually POST to multiple outboxes instead of POSTing to a single outbox that then delivers to multiple inboxes.
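The three levels of Follow handling described above (server, client, user) could be wired up as a simple fall-through chain. This is only a sketch: the policy callables, their return values, and the helper names are assumptions, and `follow["actor"]` is assumed to be a plain ID string.

```python
AS2_CONTEXT = "https://www.w3.org/ns/activitystreams"


def respond(kind, follow, actor_id):
    """Build an Accept or Reject addressed back to the follower."""
    return {
        "@context": AS2_CONTEXT,
        "type": kind,
        "actor": actor_id,
        "object": follow,
        "to": [follow["actor"]],
    }


def accept_unless_blocked(blocked):
    """Example server-level policy: accept immediately if not blocked."""
    def policy(follow):
        # Returning None means "no decision at this level".
        return None if follow["actor"] in blocked else "Accept"
    return policy


def handle_follow(follow, actor_id, server_policy=None, client_policy=None):
    """Try server rules, then client rules; otherwise leave it to the user.

    Each policy takes the Follow and returns "Accept", "Reject", or None.
    Returns (level, response); response is None when the Follow simply
    stays in the inbox for the user to answer manually.
    """
    for level, policy in (("server", server_policy), ("client", client_policy)):
        decision = policy(follow) if policy else None
        if decision in ("Accept", "Reject"):
            return level, respond(decision, follow, actor_id)
    return "user", None
```

The returned Accept/Reject would then be sent through the outbox like any other activity, so the delivery path stays the same regardless of which level made the decision.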

one thing that falls out of this experiment is that the inbox and outbox don’t have to be served by the same software. in fact, if you’re looking at it from the perspective of the 3 conformance profiles, then you end up with:

the outbox can comply with the “Server” conformance profile by allowing for resource manipulation via Create/Update; the outbox can also comply with the “Federated Server” conformance profile by allowing for delivery to inboxes.
the inbox can comply with the “Federated Server” conformance profile by adding activities into itself.

given this reframing, i would wonder what role AP C2S plays if applied to a more “social network” use case. or put another way: which additional APIs are needed to provide a good “social network” experience? what are the assumptions that can, should, or must be made? are we following the “instance” model, or do we instead have a concept of a singular “activitypub server”, or are we operating at the level of “http endpoints”?

i think before we really get around to “fixing c2s” we need to start with the foundational work and figure out what the invariants are. what are the things that any implementer can depend on to be true?

it might start to look like not just one API, but several APIs as part of an “API suite”.

for example:

the “social notifications API” would allow for sending push notifications.
- it must interface with a delivery service in order to perform side effects.
the “social publishing API” would allow for resource manipulation.
- it must interface with a storage service in order to perform side effects.
- it may generate notifications by interfacing with the delivery service.
…?

additional functionality would slot into this “suite” as appropriate. the challenge is in identifying the boundaries and natural points of integration.
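One way to picture the "suite" is as small APIs that depend only on narrow service interfaces. The sketch below is purely illustrative; the class and method names are assumptions and not part of any spec.

```python
from typing import Optional, Protocol


class DeliveryService(Protocol):
    def deliver(self, activity: dict) -> None: ...


class StorageService(Protocol):
    def apply(self, activity: dict) -> None: ...


class NotificationsAPI:
    """'social notifications API': only triggers delivery."""

    def __init__(self, delivery: DeliveryService):
        self.delivery = delivery

    def send(self, activity: dict) -> None:
        self.delivery.deliver(activity)


class PublishingAPI:
    """'social publishing API': manipulates resources, may also notify."""

    def __init__(self, storage: StorageService,
                 delivery: Optional[DeliveryService] = None):
        self.storage = storage
        self.delivery = delivery

    def publish(self, activity: dict) -> None:
        # The side effect (Create/Update/Delete/...) happens in storage.
        self.storage.apply(activity)
        # Notification is optional and goes through the delivery service.
        if self.delivery is not None:
            self.delivery.deliver(activity)
```

Because the two APIs share nothing except the service interfaces, they could be served by different software, in line with the observation above that inbox and outbox need not live in the same process.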

naturzukunft · November 12, 2024, 2:37pm

phew, there’s always so much content to read, I’m not too keen on it. I started rdf-pub because I couldn’t find a project that I could control with C2S. So I thought I’d have to build something myself. The project is a side job, because I have the incentive to work with rdf / linked data and not with json.

I don’t really like the term ‘social API’. I like activity-pub because it is very powerful due to the use of json-ld and you can send many types ‘back and forth’. What exactly ‘social’ means in the sense of a domain model is unclear to me ,-)

only microblogging (I’m afraid that’s what ‘Social API’ stands for) is not my goal.

My goal is to provide an AP server s2s/c2s that allows client implementers to build clients on top of a C2S AP API. I realise that there is still a lot to define. For me, a client is ‘also’ an adapter to https://www.kartevonmorgen.org or https://wechange.de.

naturzukunft · November 12, 2024, 2:48pm

here is a quick diagram that roughly shows my vision that I would like to realise with linked data. I’m not sure if I’ll live long enough to realise it

If ‘Social API’ makes something like this possible, it’s exciting for me

stevebate · November 12, 2024, 3:10pm

I’m using the terminology from the AP specification. However, I don’t feel strongly about the naming. In Evan’s ActivityPub book he’s calling C2S the “ActivityPub API” or “social-network API”. Neither of those specific names is used in the AP spec, AFAIK. Of course, “C2S” is not used in the spec either but it’s easier to write than “client to server”.

Note that C2S and S2S are both APIs and protocols, per the spec. From the Abstract:

The ActivityPub protocol is a decentralized social networking protocol based upon the ActivityStreams 2.0 data format. It provides a client to server API for creating, updating and deleting content, as well as a federated server to server API for delivering notifications and content.

and Section 2.1:

This specification defines two closely related and interacting protocols:

A client to server protocol, or “Social API” …

A server to server protocol, or “Federation Protocol” …

Hopefully that’s clear now.

That’s not what I have in mind and I don’t believe that was the intention of using that name in the spec, but it should at least support that use case reasonably well given it’s currently the primary activity on the Fediverse. However, I don’t know any reason it can’t support other Fediverse use cases too (forums, media sharing, event scheduling, long-form content, something we haven’t even considered yet, etc.).

naturzukunft · November 12, 2024, 3:20pm

@stevebate I wouldn’t have expected that from you either.

I think I or rdfPub would be a native implementer of any kind of activity-pub client to server specifications.
And I would welcome it if there were client developers building on it.

stevebate · November 12, 2024, 4:24pm

I think this might not be relevant to what I’m proposing? Unfortunately, like many aspects of AP, it’s not completely clear, but the “Federated Server” profile seems to be related to the S2S protocol and not directly related to C2S (it depends on what the undefined “implementation of the entirety of the federation protocols” means). What I’m discussing would only be directly related to the “ActivityPub conformant Client” (client side of C2S) and “ActivityPub conformant Server” (server side of C2S) protocols. It is not dependent on the “Federated Server” protocol (although the inverse dependency could exist because of the optional federation protocol dependence on an outbox).

I think the outbox delivery to local inboxes is also part of the existing “Server” conformance profile, independent of the “Federated Server” behaviors. I think it’s reasonable to consider inbox delivery for local server actors to be “Server” behavior and delivery to remote/federated inboxes as “Federated Server” behavior.

So far, I’m not seeing a clear C2S reframing here (after subtracting the S2S Federated Server considerations). The FEPs/APIs I listed at the start of the thread are my evolving list for additional features that could help AP C2S be a good alternative to the widely used and implemented Mastodon Client API.

To be clear, this project wouldn’t solve many of the problems that degrade the Fediverse social network experience (like incomplete replies, inaccurate interaction statistics, lack of visibility/reach for small instances, network-level search, and so on). Other projects can focus (or are focusing) on those.

(Tangent…) This didn’t make sense to me.

the inbox server does not store any representations other than writing the notification payload to disk and allowing it to be read via HTTP GET.

It’s not clear to me if this is intended to be an extension of AP S2S or not. I don’t know of any current requirement to write either an AP Activity or an LDN notification to disk (but, obviously, it’s usually practical to do so). For AP, the federated inbox “notification” (e.g., AP Create/Note Activity) would typically (always?) have a URI from a remote server. Using HTTP GET on that URI will retrieve the “notification” from the remote server, not from the inbox’s server.

Dereferencing an inbox in either AP or LDN might return the (cached) remote representations for the resources the inbox references or it could return a set of URIs (or, in AP, return an HTTP error status, or always return an empty set, or …).

In any case, if you want to dig into this further, let’s start a separate thread.

trwnh · November 12, 2024, 5:11pm

I think “social” in this case has to do with “other people/entities”, that is, your interactions are not happening in a vacuum but are also being broadcasted or delivered to others. This is more than “only microblogging”, and in fact it doesn’t have to involve microblogging at all! The two pillars here are “notifications” (C2S being used to trigger S2S delivery) and “resource management” (C2S being used to Create/Update/Delete/Add/Remove).

trwnh · November 12, 2024, 5:23pm

Loosely: Client = client side of C2S; Server = server side of C2S; Federated Server = delivery S2S; but I bring up my explorations of “minimal Federated Server” as an introduction to my main point, which is that

trwnh:

i would wonder what role AP C2S plays if applied to a more “social network” use case. or put another way: which additional APIs are needed to provide a good “social network” experience? what are the assumptions that can, should, or must be made? are we following the “instance” model, or do we instead have a concept of a singular “activitypub server”, or are we operating at the level of “http endpoints”?

i think before we really get around to “fixing c2s” we need to start with the foundational work and figure out what the invariants are. what are the things that any implementer can depend on to be true?

it might start to look like not just one API, but several APIs as part of an “API suite”.

for example:

the “social notifications API” would allow for sending push notifications.
- it must interface with a delivery service in order to perform side effects.
the “social publishing API” would allow for resource manipulation.
- it must interface with a storage service in order to perform side effects.
- it may generate notifications by interfacing with the delivery service.
…?

additional functionality would slot into this “suite” as appropriate. the challenge is in identifying the boundaries and natural points of integration.

I think I can go back and details/summary collapse the expository bits but by “reframing” I meant in terms of thinking of C2S primarily as a vehicle for inbox delivery, and also with the intention of manipulating resources via Create/Update/Delete/Add/Remove. Right now, it feels like a lot of the general overlooking of C2S is that it “doesn’t do anything useful”, but that’s only because what it’s “useful” for is not what most people are looking for out of a social network. It can at best be considered a component of such a social networking experience (the “publishing” bit and/or the “notifications” bit), but not the whole experience.

by “writing the notification payload to disk” that was part of the section where i was describing the application i wrote / am writing (the “minimal ap s2s federated server” implementation). you can of course implement it in other ways but we don’t have to dig further into what is a proof-of-concept used as a jumping point into my main argument (the self-quote above).

aschrijver · November 12, 2024, 6:54pm

FYI Tangential, but interesting fediverse discussion unfolding, started by @hrefna:

aschrijver · November 12, 2024, 7:21pm

In this toot @helge mentions:

My personal acceptance criteria (of a social networking protocol) is: When building a recipe sharing app, a developer has to worry more about converting three tea spoons of salts into sensible units, than what the exact data structure that represents a message is, or how other applications display the recipe (if they even do).

And indeed, I feel that the current reality with protocol decay, tech debt and whack-a-mole programming has totally shifted conversation to nitty-gritty technical hardtalk.

@naturzukunft refers to domain modeling. If that were a supported way to deliver social experiences, it would lead to a very versatile social web. The DX should be such that the developer can be free again to focus on the solution and how it fulfills people’s needs.

The promise of AS/AP is to conceptually offer a network of actors that form social graphs and exchange activities. Yet no DX of any project comes close to giving a dev direct power at that level, as one gets mired in low-level impl details.

trwnh · November 13, 2024, 12:49am

i’m wondering what might make sense as the “natural next step” to integrate into such a suite. obviously there are going to be some components that are easier than others to “refactor” into open standards, or otherwise represent using existing technologies:

authentication/authorization is a bit open-ended but can be made to work if you negotiate allowed schemes between any client/user-agent and the service in question. (WWW-Authenticate?)
- it MUST interface with an identity service. the identity service:
  - MAY grant some kind of token (OAuth or JWT)
  - MAY require signatures (DPoP or HTTP Message Signatures, perhaps against keys included in a Controller Document or DID Document)
  - MAY use http basic auth but generally SHOULD NOT if at all avoidable (could require explicit configuration to enable this as an option, so not every deployment has this option enabled)
account notes can for example be handled by Web Annotations without too much extra work. just give your actor a dedicated, private Annotation Container into which they can create annotations about other actors (or other objects, really).
- it MUST interface with a storage service in order to store annotations.
media upload can build upon the non-normative wiki description and just formalize it, perhaps with an exploratory phase in which the api design is evaluated against the needs that a user may have in uploading media.
- it MUST interface with a storage service in order to store media and maybe also media descriptor documents (in AS2, most likely).
custom collection discovery and management could be done via the as:streams property but i feel like it maybe probably shouldn’t? the more natural usage of streams is to allow publishing multiple “streams” of activities (not just an outbox), but this whole area definitely deserves more thought. this is also an area that is going to inherently require more coordination because the natural way to point to various collections is going to be through purpose-fit semantic properties.
- it MUST interface with a storage service in order to manipulate collections.
- it MAY interface with whichever service is serving the actor document (storage or identity, but could be its own service) – especially in the case where extra properties are introduced.
searching, querying, filtering, etc. against collections is also in theory possible using SPARQL but perhaps there’s something lighter-weight that could work? the facade approach is likely extra helpful here because the user may want a client that translates more “natural language” queries into the appropriate SPARQL.
- it MUST interface with the storage service to provide a source of data.
- it SHOULD interface with the identity service to allow querying private data.
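As an example of the account notes idea, a private note about another actor could be expressed with the Web Annotation data model and created in a container using the Web Annotation Protocol. This is a sketch under assumptions: the helper names and the container URL are hypothetical, and authentication is left out entirely.

```python
import json

ANNO_CONTEXT = "http://www.w3.org/ns/anno.jsonld"


def account_note(author_id, target_actor_id, text):
    """Build a Web Annotation whose target is another actor."""
    return {
        "@context": ANNO_CONTEXT,
        "type": "Annotation",
        "creator": author_id,
        "motivation": "commenting",
        "body": {"type": "TextualBody", "value": text, "format": "text/plain"},
        "target": target_actor_id,
    }


def create_request(container_url, annotation):
    """Describe the POST that would add the annotation to a container.

    Returns (method, url, headers, body) instead of sending anything.
    """
    headers = {
        "Content-Type": 'application/ld+json; profile="%s"' % ANNO_CONTEXT,
    }
    return ("POST", container_url, headers, json.dumps(annotation))
```

The same shape works for notes about any object, not just actors, since `target` is simply a URI.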

and then there’s stuff that i’m a bit fuzzy on the details of or have no real idea how it could be done:

timelines. i can imagine that this is mostly in the realm of Clients who can sync all activities from the inbox (and/or directly from actor outboxes?) and then filter for e.g. Create+Announce and sort in (base case) “reverse chronological order”, which is… how do you determine that ordering? just based off of the order of activities in the inboxes or outboxes? that seems like a bad idea. it would be even worse to use as:published for this. there probably needs to be some way to store metadata about received activities, separately from the inbox’s items themselves.
- it MAY?? interface with a storage service to store such metadata
“mark as read” for the inbox. i know there’s FEP-c4ad but that doesn’t seem like a particularly good idea as currently formulated. the “ideal” here seems to be a way to store lightweight information about which items have been read, which items are still unread, and/or which item is the most recently read (a la “read markers” or “timeline markers” in mastodon). this kind of goes with the above point as well – a mechanism to store metadata about other objects would probably go well here.
- again, it MAY?? interface with a storage service to store such metadata.
streaming. maybe taking a page out of solid’s book we could use something like WebSocketChannel2023
- it SHOULD interface with the inbox service at least, most likely.
- it MAY interface with other services, i’ve not really thought too much about this yet.
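For the timeline and "mark as read" points, one possible shape is a small metadata record kept separately from the inbox items. The sketch below assumes (and this is only an assumption) that the receiver records an arrival sequence number per activity and a single "last read" marker; all names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class InboxMetadata:
    """Metadata about received activities, stored apart from the inbox."""
    received_at: dict = field(default_factory=dict)  # activity id -> sequence
    last_read: Optional[str] = None                   # id of last-read item
    _counter: int = 0

    def record(self, activity_id):
        """Remember the arrival order of an activity (first arrival wins)."""
        if activity_id not in self.received_at:
            self._counter += 1
            self.received_at[activity_id] = self._counter


def home_timeline(inbox_items, meta):
    """Create/Announce activities, most recently received first."""
    wanted = [
        a for a in inbox_items
        if a.get("type") in ("Create", "Announce") and a["id"] in meta.received_at
    ]
    return sorted(wanted, key=lambda a: meta.received_at[a["id"]], reverse=True)


def unread(inbox_items, meta):
    """Items that arrived after the read marker, in inbox order."""
    marker = meta.received_at.get(meta.last_read, 0)
    return [a for a in inbox_items if meta.received_at.get(a["id"], 0) > marker]
```

Ordering by a receiver-side sequence sidesteps both the inbox's own ordering and `as:published`, at the cost of requiring somewhere to keep that metadata.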

i guess that’s as good a start as any into thinking about all this. if there are any other considerations anyone would like to add, or suggestions for other apis that are missing from the above list, then go for it.

thisismissem · November 13, 2024, 1:19am

I’d be strongly asking if you even need streaming; maybe you might want eventsource subscriptions to collections or something. But maintaining a high performance streaming server that scales is no small amount of work. Polling efficiently may be better.

trwnh · November 13, 2024, 1:30am

well, i did leave it last on the list for a reason ^_^;

stevebate · November 13, 2024, 6:49am

That mostly overlaps with what I’ve been thinking. The biggest difference is that a server-internal “storage service” doesn’t seem to be necessary for defining the exposed API surface.

For search, there are several possibilities. I think SPARQL wouldn’t be the best fit given ActivityPub servers rarely use Linked Data / RDF technologies. Based on my experience with SPARQL, I believe it wouldn’t be something most developers would want to implement for non-LD data. However, it would be the natural choice for ActivityPub-LD (if that ever became a thing). Given the plain JSON orientation of the ActivityPub transport serialization, something like jq or GraphQL might be an option. However, for the core search, a simple keyword-based full-text search (with a few AP-specific search operators?) may be sufficient, and more advanced search and query capabilities would be optional.
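To give a feel for what "keyword search with a few AP-specific operators" might look like, here is a small sketch. The operator names (`from:`, `type:`) are invented for illustration and are not taken from any FEP, and the matching is naive substring search rather than a real full-text index.

```python
OPERATORS = ("from", "type")  # hypothetical AP-specific operators


def parse_query(query):
    """Split a query into operator filters and lowercase keywords."""
    operators, keywords = {}, []
    for token in query.split():
        key, sep, value = token.partition(":")
        if sep and key in OPERATORS and value:
            operators[key] = value
        else:
            keywords.append(token.lower())
    return operators, keywords


def matches(obj, query):
    """Check one AS2 object against the query."""
    operators, keywords = parse_query(query)
    if "from" in operators and obj.get("attributedTo") != operators["from"]:
        return False
    if "type" in operators and obj.get("type") != operators["type"]:
        return False
    # Search the usual human-readable fields; every keyword must appear.
    text = " ".join(
        str(obj.get(f, "")) for f in ("name", "summary", "content")
    ).lower()
    return all(k in text for k in keywords)
```

For example, `type:Note hello` would match a Note whose content contains "hello", while richer query languages (SPARQL, GraphQL) could remain optional extras.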

Speaking of optional features, I didn’t list it in the original post but I believe we’ll need support for some form of a capabilities profile (something similar, in intent, to the current FEPs related to that topic). There may be an implicit core set of required capabilities with additional metadata to indicate support for optional SPARQL and/or GraphQL endpoints (for example) if a server chooses to implement those features.

My thinking is that most of these kinds of features can be defined in separate FEPs and a NG Social API FEP would select a core set of them and reference the optional capabilities that developers might want to consider.

It’s an interesting point about the event streaming feature. It might not be a core feature (since notification pull is needed anyway for backfilling). I had been thinking of WebSockets, but other options to consider are SSE, WebPub, STOMP (easy to implement, would integrate well with message queues and pubsub middleware), LDES/TREE (but really only a good choice for ActivityPub-LD).

On the timeline topic, it’s more involved than event sourcing an actor’s inbox. That’s a “home” timeline (in Mastodon terms), but there are other timelines (“local”, “public”/“federated”, “account list”, “hashtag”, etc.) that cannot be event sourced from the inbox. I’d like to see something like a DSL for defining timelines that supports the Mastodon-style timelines but is much more flexible. For example, it could combine accounts and multiple hashtags, have custom ordering, and/or use advanced filtering (maybe use an LLM to not include posts with a very negative tone, for example). I can see this being a relatively large FEP.
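As a very rough sketch of what a timeline definition might look like when expressed as data, consider the following. Everything here is hypothetical (field names, semantics, the choice to OR accounts with hashtags); it is only meant to show that accounts, hashtags, custom ordering and pluggable filters can be combined in one declarative spec.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimelineSpec:
    accounts: frozenset = frozenset()   # actor IDs to include
    hashtags: frozenset = frozenset()   # lowercase tag names without '#'
    exclude: tuple = ()                 # predicates; any True drops the post
    order_by: str = "published"         # property used for ordering
    descending: bool = True


def hashtags_of(obj):
    """Lowercase hashtag names found in an object's tag list."""
    return {
        t.get("name", "").lstrip("#").lower()
        for t in obj.get("tag", [])
        if isinstance(t, dict) and t.get("type") == "Hashtag"
    }


def build(spec, objects):
    """Select and order objects according to the spec."""
    def selected(o):
        by_account = o.get("attributedTo") in spec.accounts
        by_tag = bool(hashtags_of(o) & spec.hashtags)
        return (by_account or by_tag) and not any(p(o) for p in spec.exclude)

    return sorted(
        (o for o in objects if selected(o)),
        key=lambda o: o.get(spec.order_by, ""),
        reverse=spec.descending,
    )
```

An "advanced filter" such as a tone classifier would just be another callable in `exclude`, which keeps the spec itself independent of how the filtering is implemented.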

DameO · November 17, 2024, 4:56pm

ActivityPods would be a good choice as it already makes use of the C2S spec. There’s active development with working applications, and you can look at Mastopod, the ActivityPods Mastodon-style client, for reference.