Implementing ActivityPub Client-to-Server

Compared to other APIs, the ActivityPub Client-to-Server (AP C2S) API allows more flexibility and control over activities and objects, plus proper multi-instance support (origin URIs instead of IDs local to the instance).

The Pleroma backend has the groundwork for AP C2S in place but has made almost no progress on it for a while.
AndStatus is a standalone client which supports AP C2S among other APIs.

So far, that’s all I’m aware of.
I think we should see how to kickstart this API into one that can be used effectively (adding some endpoints/fields, standardizing the different auth methods, …).

Proposed by: Haelwenn (lanodan) Monnier


As a matter of pragmatics to facilitate AP C2S adoption, I recommend recycling the Mastodon authentication API for limited clients, e.g. apps like AndStatus with a limited repertoire of supported media and object types. The endpoints object in the actor can reference these endpoints as long as the server doesn’t differentiate between Mastodon API and AP C2S clients. C2S clients with open-ended mechanisms, like providing a raw format or leveraging OS mechanisms to open unknown media types, should use the actor endpoints object for discovery and fall back to Mastodon endpoints. Servers offering capabilities that might not display well on a limited client can detect this limitation and provide an appropriate presentation. Clients currently supporting the Mastodon API would have a path to incrementally add AP C2S support.
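A minimal sketch of that discovery-with-fallback idea, assuming the actor document has already been fetched as a dict. The `oauthAuthorizationEndpoint` and `oauthTokenEndpoint` names come from the ActivityPub `endpoints` object; the fallback paths are Mastodon's conventional OAuth paths, and the function name and example URLs are hypothetical.

```python
from urllib.parse import urljoin

# Mastodon's conventional OAuth paths, used only when the actor advertises nothing.
MASTODON_FALLBACK = {
    "oauthAuthorizationEndpoint": "/oauth/authorize",
    "oauthTokenEndpoint": "/oauth/token",
}


def resolve_oauth_endpoints(actor: dict) -> dict:
    """Prefer the actor's `endpoints` object; fall back to Mastodon-style paths."""
    advertised = actor.get("endpoints")
    # `endpoints` may also be a link to a separate document; this sketch
    # does not dereference it and simply falls back in that case.
    if not isinstance(advertised, dict):
        advertised = {}
    resolved = {}
    for name, fallback_path in MASTODON_FALLBACK.items():
        value = advertised.get(name)
        if isinstance(value, str) and value:
            resolved[name] = value
        else:
            # Joining an absolute path keeps only the scheme and host of the actor id.
            resolved[name] = urljoin(actor["id"], fallback_path)
    return resolved
```

A client that already speaks the Mastodon API could call this first and keep its existing OAuth code path unchanged, which is the incremental route described above.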

The stickiest of the remaining details is that OAuth2 Dynamic Client Registration still assumes that clients know about a server’s scopes. Of course, we don’t have defined scopes for AP. This could be resolved in three ways: a simple set of scopes with very general application, copying Mastodon’s scopes, or proposing a simple BNF for describing dynamic scopes.

Since I prefer the last approach…

Scope ::= Activity [ “:” , Object type [ “:” , Media Type ]]
Virtual ::= “ANY” | “ADMIN” | “ALL”
Virtual Media ::= “TEXT” | “BIN” | Virtual

Where ANY is any type except administrative ones, ADMIN is administrative-specific types (like an update to one’s own actor), and ALL includes both ANY and ADMIN. TEXT is any byte stream intended to be human readable and BIN is any byte stream or encoding not intended to be human readable. Activity would be any object defined as extending the type “Activity,” or Virtual. Object type can be any object or link type that can be specified as the type property in an item that serves as the object of an activity, or Virtual. Media Type is any valid media type or Virtual Media.
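To make the proposed grammar concrete, here is a rough parser sketch. It follows the three-part form given above (Activity, optional Object type, optional Media Type); the `Scope` class and `parse_scope` function are hypothetical names, and activity and object names are deliberately not checked against a fixed vocabulary, since servers may define their own extension types.

```python
from dataclasses import dataclass
from typing import Optional

VIRTUAL = {"ANY", "ADMIN", "ALL"}
VIRTUAL_MEDIA = {"TEXT", "BIN"} | VIRTUAL


@dataclass(frozen=True)
class Scope:
    activity: str
    object_type: Optional[str] = None
    media_type: Optional[str] = None


def parse_scope(text: str) -> Scope:
    """Parse 'Activity[:ObjectType[:MediaType]]' into its parts."""
    parts = text.split(":")
    if not 1 <= len(parts) <= 3 or any(not part for part in parts):
        raise ValueError(f"malformed scope: {text!r}")
    # Pad with None so the optional trailing parts are always present.
    activity, object_type, media_type = (parts + [None, None])[:3]
    if media_type is not None:
        # Accept a virtual media keyword or something shaped like type/subtype.
        if media_type not in VIRTUAL_MEDIA and "/" not in media_type:
            raise ValueError(f"not a media type: {media_type!r}")
    return Scope(activity, object_type, media_type)
```

For example, `parse_scope("Create:Note:TEXT")` would describe permission to create human-readable Notes, while `parse_scope("Like")` leaves the object and media parts unconstrained.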

I am not personally invested in this structure, but I am invested in the idea of having a means to dynamically define scopes, so I’ve proposed a specific method for devising a schema here in hopes that anyone who finds the specifics disagreeable will be motivated to make a comparable effort

It’s really great seeing all the advocacy for AP C2S in the presentations and from various teams

In case it’s not fairly implied, I think AP C2S should support OAuth 2.1, or at least OAuth2 and OpenID Connect, with Dynamic Client Registration. Basic auth is okay for testing and maybe single-user instances, but OAuth2 isn’t terrible to implement.
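As an illustration of what a Dynamic Client Registration request could carry, here is a sketch that only builds the JSON body. The field names are the client metadata defined by OAuth 2.0 Dynamic Client Registration (RFC 7591); the function name, the redirect URI, and the scope strings (which reuse the scope grammar proposed in this thread) are assumptions, since no AP scopes are actually defined.

```python
import json


def build_registration_request(client_name, redirect_uris, scopes):
    """Build the JSON body a client would POST to a registration endpoint."""
    return {
        "client_name": client_name,
        "redirect_uris": list(redirect_uris),
        "grant_types": ["authorization_code", "refresh_token"],
        "response_types": ["code"],
        # A native or browser app is a public client with no client secret.
        "token_endpoint_auth_method": "none",
        # RFC 7591 expresses scope as a single space-separated string.
        "scope": " ".join(scopes),
    }


body = build_registration_request(
    "Example C2S client",                # hypothetical client name
    ["https://client.example/callback"],  # hypothetical redirect URI
    ["Create:Note:TEXT", "Like:ANY"],     # hypothetical scopes
)
payload = json.dumps(body)
```

The server would still need some way to advertise which scopes it supports, which is the open question raised above.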


Great idea.

I’m also concerned about apps. What can we do to encourage the authors of apps like Tusky to add AP C2S support?

Yeah, Pleroma allows using the existing Mastodon API authentication for AP C2S endpoints, as well as basic auth (which I wouldn’t really recommend, as it doesn’t support 2FA and you can end up storing a password rather than a token).

Could you elaborate a bit on the media type? Is it something like MimeType? And not really sure where Virtual and Virtual Media fit in the scoping.

And while your proposal really makes sense, I’m not so sure about it: for example, Pleroma added an EmojiReact activity, which works like a Like plus a Unicode emoji, and a ChatMessage object for nicely private-scoped messages between users.
The same goes for the rare implementations where ActivityStreams isn’t the root namespace (and so tend to have {"type": "as:Note", …} instead of the more usual {"type": "Note", …}).
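A scope check (or a client) would have to treat both spellings as the same type. The sketch below is a shortcut that strips the `as:` prefix or the full ActivityStreams namespace IRI; it is not real JSON-LD processing, which would resolve prefixes through the document's `@context`, and the function name is hypothetical.

```python
AS_PREFIX = "as:"
AS_NAMESPACE = "https://www.w3.org/ns/activitystreams#"


def normalize_type(value):
    """Return a list of type names with ActivityStreams prefixes stripped."""
    # `type` may be a single string or a list of strings.
    names = value if isinstance(value, list) else [value]
    normalized = []
    for name in names:
        for prefix in (AS_NAMESPACE, AS_PREFIX):
            if name.startswith(prefix):
                name = name[len(prefix):]
                break
        normalized.append(name)
    return normalized
```

Extension types such as EmojiReact pass through unchanged, so a scope list could still name them explicitly.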

Elaborating a bit on the activity/object part: the Mastodon API basically doesn’t allow creating any object other than Note, Question (polls) and sort-of Answer. So Article, Video, Image, Page, … go away.

And one thing that I think app devs could end up liking in ActivityStreams is how orthogonal it is: known and quite limited set of endpoints and one kind of format for both sending and receiving.
The sad part is that ActivityStreams can be a bit of an adventure to support, as it’s rather too flexible in its presentation (Pleroma normalizes it to one format, by the way, so it could make it easier to bootstrap a new client).

Meanwhile, the Mastodon API needs something like O(3n) formats and, at bare minimum, O(n) endpoints, where n is the number of actions (like, bookmark, …).

Right. I think it’s usually worth repeating that the Mastodon API works very well if you’re Mastodon, but less and less well the less like Mastodon you are.

I don’t think it’s a problem. It would be more interesting for clients to be able to split a post into the various types and send the relevant objects to the right servers. E.g., a post containing an image could send the Note to Mastodon, and the image to PixelFed. Then you’d have generic clients that can dispatch the objects according to their types, so that server implementations can concentrate on doing one thing and doing it well. And users would then be able to choose their favorite platform for each type. I’m all for moving away from this one big thing that does it all and evolves into a huge piece of bloatware: the Web was not intended that way, and that’s how you build centralization.

Another example: it would be awesome to write a blog about a new song, attaching a video clip to it and have the client:

  • extract audio from the video clip and upload to Funkwhale
  • upload the video to PeerTube
  • post the whole piece to WriteFreely (including the link to the video and audio files, as one URL each[1])
  • extract the first paragraph including the video link and post a Note to Pleroma
  • etc. ditto for other servers the user wants to use.
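The dispatching client described in the list above could start from a simple routing table. This is only a sketch: the outbox URLs are made up, the function name is hypothetical, and nothing here implies that the servers named in the list currently accept C2S posts.

```python
# Hypothetical mapping from object type to the outbox the user picked for it.
ROUTES = {
    "Audio": "https://audio.example/users/alice/outbox",
    "Video": "https://video.example/users/alice/outbox",
    "Article": "https://blog.example/users/alice/outbox",
    "Note": "https://social.example/users/alice/outbox",
}


def plan_dispatch(objects, routes):
    """Group objects by destination outbox; unknown types are returned separately."""
    plan, unrouted = {}, []
    for obj in objects:
        outbox = routes.get(obj.get("type"))
        if outbox is None:
            unrouted.append(obj)
        else:
            plan.setdefault(outbox, []).append(obj)
    return plan, unrouted
```

The client would then POST each group to its outbox and stitch the resulting URLs back into the Article and the Note.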

@rhiaro is that close to what you described in The ActivityPub Panel? In other words, move away from the app paradigm, make it possible to cooperate among ActivityPub software implementations rather than to compete.

  1. the video and audio could be attached to the post and also posted as is, on their own lines into the post, to be interpreted by WriteFreely as a “OneBox” – much like in Discourse where a video URL is turned into a full-blown video player, etc.


I imagine it more like servers being agnostic to the type so you can authenticate with and post to your preferred (for whatever reason) server from any client. I see the server as a simple storage device. It’s then other clients which present the data back to you. Clients may be read only (like a feed reader) or may be integrated read-write clients (like you’d expect from a fully fledged system like Mastodon or Pixelfed) but they’re fetching data from your outbox on the server you’ve authenticated with. The client can then choose to display the activity/object types they know how to handle. Basically I’d like to see UI/presentation for both reading and writing completely separate from the backend/storage.
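On the reading side, "display what you know how to handle" can be sketched as a filter over outbox items. The function and parameter names are hypothetical, and the sketch assumes the items have already been fetched and carry embedded objects.

```python
def select_displayable(outbox_items, activity_types, object_types):
    """Keep only activities whose own type and object type this client can render."""
    selected = []
    for activity in outbox_items:
        obj = activity.get("object")
        # `object` may be a bare IRI; such entries would need a fetch first,
        # which this sketch skips.
        if not isinstance(obj, dict):
            continue
        if activity.get("type") in activity_types and obj.get("type") in object_types:
            selected.append(activity)
    return selected
```

A photo-oriented client might pass `{"Create"}` and `{"Image"}`, while a feed reader might accept `{"Create", "Announce"}` with `{"Note", "Article"}`; both read from the same outbox.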

None of this is new - remoteStorage and Solid operate this way.

I totally see that this gets more complicated where media is concerned, and special endpoints (endpoints, rather than servers) for things like photos and videos may need to be defined. Which was totally a thing in ActivityPub but proper handling of it got postponed due to time constraints.


It could make sense to have some kind of an ActivityPub C2S proxy but outside of Mastodon, there isn’t much Activity/Object mangling and they could support a large chunk of ActivityPub when it’s not implementation-specific (Funkwhale’s Track) or niche (most of ForgeFed I believe).

For example I think an AP C2S client for Pleroma/Friendica/… could look like Tumblr’s dashboard where you could choose to post between Text, Quote, Link, Chat, Audio, Video.

And I think it would be better for clients to have multi-server abilities themselves, so for example a quick-and-dirty picture goes to Pleroma but a nice shot goes to PixelFed (or different accounts on the same software, like some folks do on YouTube).

My suggested approach was very much informed by the sensible progress already made in pleroma working with AndStatus. That was good work. I just inferred the completion of the pattern and suggested it for general application

As for the specific starting point for dynamically describing scopes, I didn’t expect it to hold in detail, but make the final compilation ‘Scope ::= Name Space [ “:” , Activity [ “:” , Object type [ “:” , Media Type ]]]’ and we’re still looking good. I forgot to mention that Activity also needs a READ type. There should probably be a convention for the server to describe supported scopes

Yes, Media Type is the same as MIME Type. I’m just using the same terminology as was used in the Social WG docs to avoid problems if someone ends up doing a copy and paste to bootstrap a formal description

Virtual* is just a macro to reuse some allowed exceptions to compose an EBNF because I’m a lazy typist

This great post from @erincandescent should be mentioned here (as:copyOfPost: The ActivityPub Panel)

One thing that I think is key but maybe not obvious from the spec: The AP Client-to-Server API isn’t just aimed at mobile or desktop apps. That’s the obvious and straightforward use case, but not the only one.

What the C2S API gives you is delegation. It lets some other agent - which might be a mobile app, or might be a web app - act on your behalf. And, importantly, it gives those agents effectively equal power to the server itself has.

In the extreme, you could have a server which has no UI of its own - it only exposes ActivityStreams JSON. Maybe not the most user friendly, but permitted.

With the C2S spec you can build a PeerTube which acts as a true adjunct to your “primary” AP server - you upload videos to it, but they get posted to your main identity hosted elsewhere. Thinking about things on-the-wire, you can envisage an entry in your outbox which might look like

    {
        "id": "",
        "type": "Create",
        "actor": "",
        "object": {
            "id": "",
            …
        }
    }

AP C2S allows an additional degree of federation and decentralisation. And it doesn’t involve e.g. PeerTube giving up features either - it can still give you a timeline and all of those sorts of things
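As a sketch of that "adjunct" arrangement, a video service acting as a C2S client might build an activity like the one below and POST it to the outbox of the user's primary server. All URLs and names in the usage are hypothetical; the activity carries no `id` because, under the ActivityPub C2S rules, the receiving server assigns one.

```python
PUBLIC = "https://www.w3.org/ns/activitystreams#Public"


def build_video_create(actor_id, video_url, name):
    """Build a Create activity announcing a video hosted on another service."""
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Create",
        "actor": actor_id,
        "to": [PUBLIC],
        "object": {
            "type": "Video",
            "attributedTo": actor_id,
            "name": name,
            # The media stays on the video service; only the link is posted.
            "url": video_url,
            "to": [PUBLIC],
        },
    }


activity = build_video_create(
    "https://social.example/users/alice",   # identity on the primary server
    "https://video.example/videos/42.mp4",  # file hosted by the video service
    "My holiday clip",
)
```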

Some of the difficulty comes from the world and mental models that exist. In the Q&A, Chris commented on the trouble he had convincing any of the existing social networks to come to the SocialWG, and commented on the interoperability troubles we might have had trying to bring the existing worlds (the “old guard” of federated social networks, almost) together.

In some ways, I think Mastodon is the “last of the old guard”. Mastodon’s data model isn’t (or wasn’t) ActivityStreams, and its worldview isn’t the same either. They’re building something which follows the same basic model as Twitter (with their own twist), and that’s fine, but it’s intentionally limited.

However, when that intentionally limited implementation is 90% of the population of your network, it can colour perceptions of it. ActivityPub can support things wildly different from Mastodon, but when people think ActivityPub they also think Mastodon-likes. Some of the functionality can cause interoperability issues also. However, just in terms of the major proprietary social networks, there shouldn’t be anything stopping you from building something in the model of Facebook or Google+, or any number of other designs on top of it. And in terms of something much older, you should totally be able to have an ActivityPub enabled blog and you should also be able to blend all of these things together into one cohesive whole.

Systems built since tend to target AP directly, and have data models which more directly correspond to AS2 and its functionality. It would be a lot easier for these to implement the AP C2S spec, and it would be a great world where things did.

(In case my comments above might be misinterpreted: I do think that Mastodon adopting AP is perhaps the best thing that has happened to the spec; without it, the AP fediverse would almost certainly be a lot smaller and less successful. So while I lament here some of the downsides of having Mastodon as the “de facto standard” AP implementation, I’m still thankful for it. If we had built the most elegant social networking protocol in the world but there had been nobody to talk to, it would have all been for naught.)


Shared Notes from the session

AndStatus has implemented some C2S with Pleroma.

  • pump.io

Pleroma can:

  • Create Notes

  • Likes

  • But not really keen on opening it up due to large surface area


  • Used to be that messages would pass through Transmogrifier

  • Also has standard Mastodon API

  • These were 2 ways of validating

  • Now has per-type(?) fine grained validators

  • Still missing 1 big validator: Notes

Current status/future:

  • Interested in using OAuth

  • Uploads are similar to Mastodon API (1 parameter to make an object)

  • Pleroma piggybacks on the Mastodon flow

  • Pleroma does update the object with ‘id’ and ‘published’ and other fields before federating

Friendica:

  • has a challenge with LD-Signatures. For example: C creates and signs an object, which makes it hard for S to add the ‘id’ and ‘published’ fields without invalidating the signature

  • Friendica could do LD-Sig on the Server side in the C2S communication

  • Key management challenge
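The signing problem in these notes can be demonstrated with a stand-in. Real LD-Signatures canonicalize the RDF graph and sign with the actor's key; the sketch below only hashes sorted-key JSON, which is an assumption made for brevity, but it shows the same effect: anything the server adds after the client signed changes what would be verified. The id and timestamp are made-up values.

```python
import hashlib
import json


def digest(obj):
    """Hash a stable JSON serialization (a stand-in for a real signature input)."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


# What the client created and "signed".
client_note = {"type": "Note", "content": "hello"}
client_digest = digest(client_note)

# What the server federates after filling in its own fields.
server_note = dict(
    client_note,
    id="https://social.example/objects/1",  # hypothetical server-assigned id
    published="2020-10-04T12:00:00Z",       # hypothetical timestamp
)

signature_still_valid = digest(server_note) == client_digest  # False
```

Signing on the server side, as suggested for Friendica, avoids the mismatch but moves the key management question to the server.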

On Media upload…

  • pictures are always cached today

  • How to handle media upload when sharing the AS payload?

  • How to properly proxy media (remote URLs) when client views federated message.

Q: What motivating examples need additional endpoints and/or auth?

  • Uploads may require an additional endpoint.

  • End to end encryption is exposing ActivityPub objects to the client.

  • Key management? (maybe not)

  • How to do search properly (messages, hashtags), so that the client can properly “view” the data it cares about

  • How to properly search inbox/outbox to properly build state and filter.

Q: What about auth? OAuth 2.1?

Q: How about storing the /me aka my AP-handle in the browser?

  • registerProtocolHandler FAILED - website could have browser ask user to save handle information

  • C2S allows interacting with different instances with the same software

  • In indieweb they have indieauth to remember common handles


How the term “C2S” makes you think of your own client:

  • Services can be C2S of one another

  • Does not need to be a UI

  • Really a conceptual

  • Is this related to ‘micropub’? How does Hubzilla do this use case?

  • Instead of OAuth, capabilities (OCAP) may fit really well, allowing posting on behalf

  • Quote from @erincandescent:

“What the C2S API gives you is delegation. It lets some other agent - which might be a mobile app, or might be a web app - act on your behalf. And, importantly, it gives those agents effectively equal power to the server itself has […] AP C2S allows an additional degree of federation and decentralisation.”

C2S has a chicken-and-egg problem. Hackathon opportunity! cj “volunteered” lain to track.

“C2S is still a heck of a discovery process”:

Q: How to do search with AP C2S?

  • Doesn’t fit well into REST (maybe GraphQL)

  • “How to get all X messages” in the inbox/outbox

  • How to provide state to the user from a stream?

  • Referenced data could make it easier by providing some more stateful information about the actor.

  • Could have implicit collections, or dynamically-generated collections
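Without a search endpoint, "get all X messages" currently means walking the collection on the client. A rough sketch, assuming paged OrderedCollection documents linked by `next` and the common `Hashtag` tag convention; the function names are hypothetical, and `fetch` is a caller-supplied function that returns a page as a dict.

```python
def iter_collection(first_page_id, fetch):
    """Yield every item of a paged collection by following `next` links."""
    page_id = first_page_id
    while page_id:
        page = fetch(page_id)
        yield from page.get("orderedItems", [])
        page_id = page.get("next")


def with_hashtag(items, hashtag):
    """Yield activities whose embedded object carries the given hashtag."""
    wanted = hashtag.lower()
    for activity in items:
        obj = activity.get("object")
        tags = obj.get("tag", []) if isinstance(obj, dict) else []
        if isinstance(tags, dict):  # a single tag may appear without a list
            tags = [tags]
        if any(
            isinstance(tag, dict)
            and tag.get("type") == "Hashtag"
            and tag.get("name", "").lower() == wanted
            for tag in tags
        ):
            yield activity
```

This costs one request per page and rebuilds state from scratch each time, which is the motivation for the implicit or dynamically generated collections mentioned above.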

Q: How does C2S following another contact work? Problem: when client sends “Accept” message, must include id on client?
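One reading of the question above is that the client can only send an Accept if it knows the id of the Follow it is accepting. A sketch under that assumption, with a hypothetical function name: the Accept references the Follow by id and is addressed back to the follower.

```python
def build_accept(actor_id, follow):
    """Build an Accept for a received Follow, referencing it by id."""
    follow_id = follow.get("id")
    if not follow_id:
        # Without an id there is nothing stable for the Accept to point at.
        raise ValueError("cannot accept a Follow that has no id")
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Accept",
        "actor": actor_id,
        "to": [follow["actor"]],
        "object": follow_id,
    }
```

The client would POST the result to its own outbox like any other activity.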

Next steps:

  • Hackathon

  • Really want feedback from AP authors on C2S and how it should work (particularly with filtering the inbox/outbox to build a “meaningful” view)