I’m reading ActivityPub with an eye to how it uses HTTP, and am fairly new to the protocol, so apologies in advance if I’ve missed something obvious or already discussed.
Section 7.1 defines delivery as the heart of server-to-server interaction in ActivityPub:
An HTTP POST request (with authorization of the submitting user) is then made to the inbox, with the Activity as the body of the request.
As far as I can tell, pretty much all server-to-server interactions defined in ActivityPub use this mechanism, since they are all described using the ‘deliver / delivery’ terminology.
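To make the shape of that request concrete, here is a minimal Python sketch of how I understand a delivery. The inbox URL, actor URL and activity contents are made up for illustration, and I've left out authorization entirely, since I'm not assuming any particular mechanism. The request is only built here, not sent.

```python
import json
import urllib.request

# Media type ActivityPub uses for activities in POST bodies.
AS2_TYPE = 'application/ld+json; profile="https://www.w3.org/ns/activitystreams"'


def build_delivery_request(inbox_url: str, activity: dict) -> urllib.request.Request:
    """Build (but do not send) the POST that delivers an activity to an inbox."""
    body = json.dumps(activity).encode("utf-8")
    return urllib.request.Request(
        inbox_url,
        data=body,
        method="POST",
        headers={"Content-Type": AS2_TYPE},
    )


# Hypothetical activity and inbox, for illustration only.
activity = {
    "@context": "https://www.w3.org/ns/activitystreams",
    "type": "Create",
    "actor": "https://origin.example/users/alice",
    "object": {"type": "Note", "content": "Hello"},
}
req = build_delivery_request("https://remote.example/users/bob/inbox", activity)
```

The point is just that every delivery is a distinct, uncacheable write that the receiving server has to process itself.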
Section 1, however, implies that there’s an alternative mechanism for server-to-server updates:
- You can GET from someone’s outbox to see what messages they’ve posted (or at least the ones you’re authorized to see). (client-to-server and/or server-to-server)
Of course, if that last one (GET’ing from someone’s outbox) was the only way to see what people have sent, this wouldn’t be a very efficient federation protocol! Indeed, federation happens usually by servers posting messages sent by actors to actors on other servers’ inboxes.
This last statement is very interesting to me. GET is cacheable, whereas POST is not (at least in this particular use case). GET can be scaled out by a proxy cache, which can serve hundreds of thousands of requests per second on modern hardware, and be geographically distributed very easily because it’s a generic function of HTTP; POST handling in most implementations requires application-specific code that often struggles to achieve single-digit thousands of requests a second (or even hundreds).
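As a rough sketch of what GET-based polling could look like, the Python below keeps the last ETag it saw and revalidates with If-None-Match. It assumes the outbox server emits an ETag, which ActivityPub does not require as far as I can see. `OutboxPoller` and the `fetch` callable are names I've invented; `fetch` stands in for whatever HTTP client is in use.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

AS2_TYPE = 'application/ld+json; profile="https://www.w3.org/ns/activitystreams"'

# fetch(url, request_headers) -> (status, response_headers, body)
Fetch = Callable[[str, Dict[str, str]], Tuple[int, Dict[str, str], bytes]]


@dataclass
class OutboxPoller:
    """Polls an outbox with conditional GETs (hypothetical helper)."""

    outbox_url: str
    etag: Optional[str] = None
    cached_body: Optional[bytes] = None

    def poll(self, fetch: Fetch) -> bool:
        """Return True if the outbox changed since the last poll."""
        headers = {"Accept": AS2_TYPE}
        if self.etag is not None:
            # Ask the server (or any cache in front of it) to revalidate.
            headers["If-None-Match"] = self.etag
        status, response_headers, body = fetch(self.outbox_url, headers)
        if status == 304:
            return False  # Nothing new; keep the cached representation.
        if status == 200:
            self.etag = response_headers.get("ETag")
            self.cached_body = body
            return True
        raise RuntimeError(f"unexpected status {status}")
```

A shared cache in front of the outbox can answer both the 200 and the 304 case without touching the application, which is where the scaling benefit comes from.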
GET is also resilient, because it’s safe and idempotent; if a client doesn’t have a complete view of the state of the server, it can simply repeat requests until it does. For POST delivery, Section 7.1 addresses failure with:
For federated servers performing delivery to a third party server, delivery SHOULD be performed asynchronously, and SHOULD additionally retry delivery to recipients if it fails due to network error.
This isn’t specific enough to ensure that messages will be delivered reliably and interoperably – implementations will make different decisions about when and how they ‘give up’ on failure.
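To illustrate how much room that leaves, here is a small Python sketch of retry with exponential backoff. Nothing in it comes from the spec: the base delay, factor and retry count are all knobs I've made up, and two implementations could pick very different values while both following the SHOULD. `send` is a placeholder for the actual delivery POST and is assumed to raise `OSError` on network failure.

```python
import time
from typing import Callable, List, Sequence, Tuple


def backoff_delays(base_seconds: float, factor: float, max_retries: int) -> List[float]:
    """Delays to wait before each retry, growing exponentially."""
    return [base_seconds * factor**i for i in range(max_retries)]


def deliver_with_retry(
    send: Callable[[], None],
    delays: Sequence[float],
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[bool, int]:
    """Try once immediately, then once after each delay.

    Returns (delivered, attempts). Gives up when the delays run out.
    """
    attempts = 0
    for delay in [0.0, *delays]:
        if delay:
            sleep(delay)
        attempts += 1
        try:
            send()
            return True, attempts
        except OSError:
            continue  # Network error: fall through to the next retry.
    return False, attempts


# Two equally "conformant" policies with very different give-up points.
impatient = backoff_delays(base_seconds=1, factor=2, max_retries=3)   # gives up after ~7 s
patient = backoff_delays(base_seconds=60, factor=4, max_retries=6)    # keeps trying for ~23 h
```

A receiver that is down for an hour gets every message from the second sender and none from the first, and neither side has any way to know which policy its peer uses.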
Has there been much discussion of these aspects of the delivery protocol? My initial sense is that it would be helpful to have a negotiation mechanism that allows two servers to agree on how updates will flow between them, so that (for example) one that desires timely updates can use POST, whereas one that wants to take advantage of caching mechanisms for scale can ask its peers to use GET polling. That work might also include extensions to communicate expectations about retry behaviour on POSTs.