I’m not ready to write this up as a FEP yet, because I’ve not thought through everything. However, I think it is a good idea for something that is currently missing in ActivityPub.
Basically, it boils down to defining a new endpoint of the actor called websocket. See here for other endpoints. This endpoint SHOULD only be visible when the actor object is queried by the actor. The actor is identified through the Client2Server authentication/authorization mechanism.
The websocket endpoint MUST provide an RFC 6455 compliant endpoint, with the following properties:
An event of type inbox is sent whenever a new item is added to the Actor’s inbox. This event contains the new inbox item as data.
When an event of type outbox is received by the server, it is treated similarly to a POST to the outbox endpoint of the Actor. See 6.2 Create Activity and following.
When an event of type proxy is received whose data is an object URI, the server replies with an event of type proxied whose data is the requested object.
An event of type store allows the client to store a blob on the server. The server returns the URL of the stored blob.
An event of type fetch is used to retrieve blobs, both from the server and from the wider internet.
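To make the list above a bit more concrete, here is a rough sketch of how a server might dispatch these events. Nothing here is specified anywhere: the JSON envelope with `type` and `data` keys, the `error` reply, and the helper names `post_to_outbox` and `fetch_object` are all assumptions for illustration. The store and fetch events are left out because their payloads are not defined yet.

```python
import json
from typing import Callable


def handle_client_event(
    raw: str,
    post_to_outbox: Callable[[dict], str],  # hypothetical: stores the activity, returns its id
    fetch_object: Callable[[str], dict],    # hypothetical: resolves an object URI
) -> dict:
    """Dispatch one client-to-server event and return the reply event."""
    event = json.loads(raw)
    kind, data = event.get("type"), event.get("data")
    if kind == "outbox":
        # Treated like a POST to the actor's outbox; replying with the new id
        # is an assumption, the proposal does not define a reply.
        return {"type": "outbox", "data": {"id": post_to_outbox(data)}}
    if kind == "proxy":
        # data is an object URI; the reply carries the requested object.
        return {"type": "proxied", "data": fetch_object(data)}
    # Hypothetical error event for anything this sketch does not handle.
    return {"type": "error", "data": f"unsupported event type: {kind!r}"}


def inbox_event(activity: dict) -> str:
    """Serialize the server-to-client event sent when an item lands in the inbox."""
    return json.dumps({"type": "inbox", "data": activity})
```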
The use case is to avoid polling when using Client To Server.
i can see the point of streaming inbox events over websocket, but what is the point of the other ones? it seems to be unnecessarily replicating standard activitypub flows for no discernible benefit. for example, i don’t see a reason to post to outbox via websocket instead of via a regular http call. the only thing you would ever need to poll is the inbox, no? and websockets are the equivalent of push notifications, right?
Tolkien wrote a book with a phrase that said something like
One connection to rule them all
I think that pretty much describes why I want to put activities in the outbox through the websocket.
On the technical side of things: Implementing putting things in the outbox through a websocket is probably cleaner. One can send the new id of the object and activity through the socket. And once the request is done being processed, i.e. sent to all followers, one can send a “done sending”. This is not easily possible with HTTP.
The delivery notification is interesting, but I agree with the other comments that posting to the outbox is already covered.
You may want some kind of incremental delivery status notification since delivery to some recipients of an activity may complete quickly and others may take days or never complete if the target server has permanently gone offline.
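A minimal sketch of what such incremental status reporting could look like, assuming one status event per recipient. The `DeliveryStatus` shape, the state names, and the `send` callback are made up for illustration; a real server would deliver concurrently and retry over hours or days rather than give up on the first failure.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator


@dataclass
class DeliveryStatus:
    activity_id: str
    recipient: str
    state: str  # hypothetical state names: "delivered" or "failed"


def deliver_with_status(
    activity_id: str,
    recipients: Iterable[str],
    send: Callable[[str], None],  # hypothetical: delivers to one inbox, raises on failure
) -> Iterator[DeliveryStatus]:
    """Yield one status per recipient as soon as its delivery attempt settles."""
    for recipient in recipients:
        try:
            send(recipient)
            state = "delivered"
        except ConnectionError:
            # The target server may be offline, possibly for good.
            state = "failed"
        yield DeliveryStatus(activity_id, recipient, state)
```

Each yielded status could then be pushed to the client as its own event, instead of one final “done sending”.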
I think it makes more sense to use SSE or WebSockets to handle notifications for inbox items, because it adds a new and unique capability to ActivityPub that doesn’t exist elsewhere. I agree with the rest of the community that it doesn’t make sense to use websockets to reimplement features that are already required to be transmitted over HTTPS, like outbox POST events. This just doubles the amount of work required to implement an ActivityPub server for no clear benefit. Taking your “one connection to rule them all” approach would fragment the ecosystem: some clients would support only websockets, some would support only POST, and users would have no way of knowing which is which.
I’m curious why you think this isn’t possible with HTTP. It would consume exactly the same amount of resources as a websocket connection: you keep the POST request open until the state of the async job is settled, and then you return the response. Most web performance advice would tell you not to do this because it would tie up a connection slot, but that’s the exact same reason why websockets are so hard to scale anyway, so if you’re already committed to paying the cost of implementing WebSockets, there’s no reason not to wait asynchronously before resolving your POST request.
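As a rough illustration of holding the POST open, here is a sketch using asyncio. The `deliver` coroutine and the response dictionaries are stand-ins, not any real framework's API, and the timeout with a 202 fallback is my own addition so that the connection is not held forever.

```python
import asyncio
from typing import Awaitable, Callable


async def handle_outbox_post(
    activity: dict,
    deliver: Callable[[dict], Awaitable[None]],  # hypothetical: sends to all followers
    timeout: float = 30.0,
) -> dict:
    """Keep the request open until delivery settles, or give up waiting."""
    job = asyncio.create_task(deliver(activity))
    try:
        # shield() keeps the delivery job running even if we stop waiting for it.
        await asyncio.wait_for(asyncio.shield(job), timeout)
    except asyncio.TimeoutError:
        # Still in progress: report "accepted" rather than blocking further.
        return {"status": 202, "id": activity["id"], "delivery": "pending"}
    return {"status": 201, "id": activity["id"], "delivery": "done"}
```

The waiting request occupies a connection for as long as delivery takes, which is the same cost an open websocket would have.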
Otherwise, I’m not exactly sure of the details, but you could also imagine offering this as an optional upgrade to the outbox POST request: look at the Connection and Upgrade headers to determine whether the client is offering an HTTP connection upgrade, and then serve the “extended status info” only over websockets, with incremental updates. I’m not sure how good browser support is for this use case, but my understanding is that the underlying protocols easily support it.
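A small sketch of the header check, assuming the request headers are available as a plain dictionary. The function name is made up; it only looks at the Connection and Upgrade headers and ignores the Sec-WebSocket-* fields a full handshake also needs.

```python
def wants_websocket_upgrade(headers: dict) -> bool:
    """Return True if the request headers offer a WebSocket upgrade."""
    # Header names are case-insensitive, so normalize them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    # Connection may carry several tokens, e.g. "keep-alive, Upgrade".
    connection = {t.strip().lower() for t in lowered.get("connection", "").split(",")}
    upgrade = {t.strip().lower() for t in lowered.get("upgrade", "").split(",")}
    return "upgrade" in connection and "websocket" in upgrade
```

One caveat: RFC 6455 defines the opening handshake as a GET request, so upgrading a POST this way would go beyond the standard handshake and would need to be specified separately.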
I think I’m being sold on using Server-Sent Events. In addition to “one-way communication is enough”, I also want to mention that it is clearly specified. There are fields I can assign values to. It should also be familiar to a lot of Fediverse developers, as it is used in Mastodon.
My current idea on how to use the fields is:
1. type is the type of object the data is appended to. Suggested types: inbox, outbox, and meta (for status events on the server, e.g. a 5-minute warning before a server reboot).
2. data contains the JSON string of the activity being added. For the inbox, this is the activity as added to the inbox.
3. id specifies an id used for recovery of the connection. If possible, this id should also be compatible with inbox fetches.
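A sketch of how these three fields could map onto the wire format. Note that what I call type above is the `event:` field in a Server-Sent Events stream. The in-memory `log` of `(id, frame)` pairs and both function names are assumptions for illustration; the recovery part relies on the `Last-Event-ID` header that a reconnecting client sends.

```python
import json
from typing import List, Optional, Tuple


def format_sse(event_type: str, activity: dict, event_id: str) -> str:
    """Build one Server-Sent Events frame for an activity."""
    # json.dumps without indent produces a single line, so one data: field is enough.
    payload = json.dumps(activity)
    return f"event: {event_type}\nid: {event_id}\ndata: {payload}\n\n"


def replay_after(log: List[Tuple[str, str]], last_event_id: Optional[str]) -> List[str]:
    """Return the frames a reconnecting client missed, based on its Last-Event-ID."""
    ids = [event_id for event_id, _ in log]
    if last_event_id is None or last_event_id not in ids:
        # Unknown or missing id: this sketch simply replays everything it has.
        return [frame for _, frame in log]
    return [frame for _, frame in log[ids.index(last_event_id) + 1:]]
```

If the id is also usable as a position in the inbox collection, a client could fall back to a regular inbox fetch when the server no longer has the missed events buffered.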
Ad 1. I’m still undecided if requiring anything except type: inbox is a good idea. I’m almost tempted to leave it up to the server and specify inboxStream to contain all new inbox elements, and then outboxStream and stream