CORS restrictions

I’m noticing that many servers (Mastodon, Friendica, et al.) restrict cross-origin access via CORS in a way that effectively prevents C2S web applications from resolving ActivityPub objects. For example, suppose a client served from any domain other than example.com receives the following from the user’s inbox Collection:

{
  "id": "https://example.com/123",
  "type": "Note",
  "content": "Hello, world!",
  "attributedTo": "https://example.com/actor/456"
}

A web-based client cannot display any information about the actor to whom the note is attributed beyond their ID, unless the CORS headers for https://example.com/actor/456 allow cross-origin GET requests. Without those permissions, C2S applications cannot practically be written for browsers, because the browser will block significant functionality.
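For public objects, the server-side fix is small: the object endpoint just needs to include CORS response headers permitting cross-origin reads. A sketch of what such a response might look like (the header values shown are illustrative, not a recommendation for any particular server):

```http
HTTP/1.1 200 OK
Content-Type: application/activity+json
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: GET
Access-Control-Allow-Headers: Accept, Authorization
```

With `Access-Control-Allow-Origin: *`, a browser on any origin can read the response to a plain GET; credentialed requests would need a specific origin echoed back instead of the wildcard.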

As a workaround, I’m using an API on my server application that takes an ActivityPub ID and performs a GET request for that ID, bypassing the CORS restrictions. This degrades performance unnecessarily, and it punches a gaping hole through the CORS requirement in the first place.
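Concretely, the client swaps the remote ID for a call to its own origin. A minimal sketch of how a client might build such a proxied URL (the `/api/resolve` endpoint and base URL are hypothetical names for illustration):

```python
from urllib.parse import quote

def proxy_url(ap_id: str, base: str = "https://myserver.example/api/resolve") -> str:
    """Build a same-origin URL asking our own server to fetch a remote object.

    The /api/resolve endpoint and base URL are hypothetical. The point is that
    the browser only ever talks to its own origin, so CORS never applies; the
    server performs the actual cross-origin GET on the client's behalf.
    """
    return f"{base}?id={quote(ap_id, safe='')}"
```

The ActivityPub ID is percent-encoded in full so it survives as a single query parameter.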

ActivityPub servers should serve GET requests to their object endpoints for all comers.

Degrades performance how? Implementing a trusted proxy allows you to implement caching, prefetching, and other types of performance improvements that are impossible with each client making its own independent request. This isn’t a “gaping hole” in CORS requirements, it’s CORS working as designed. CORS is a browser-based extension of the same-origin requirement, and it’s designed to restrict the ways in which a site can use the browser’s credentials and the client’s network to request cross-origin websites. Since the proxied requests aren’t using the cross-origin credentials, and they go through the proxy’s network, there’s no CORS concern.
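For illustration, the kind of shared caching a trusted proxy enables might be sketched like this (a deliberately tiny in-memory TTL cache; a real proxy would also have to key authenticated objects per-user rather than sharing them):

```python
import time

class TTLCache:
    """Tiny in-memory cache a proxy might put in front of remote fetches (illustrative)."""

    def __init__(self, ttl_seconds: float = 300.0):
        self.ttl = ttl_seconds
        self._store = {}  # ap_id -> (expires_at, object)

    def get(self, ap_id):
        entry = self._store.get(ap_id)
        if entry and entry[0] > time.monotonic():
            return entry[1]
        # Expired or missing: drop any stale entry and report a miss
        self._store.pop(ap_id, None)
        return None

    def put(self, ap_id, obj):
        self._store[ap_id] = (time.monotonic() + self.ttl, obj)
```

One shared hit here saves every client behind the proxy a round-trip to the remote server, which is the performance argument in a nutshell.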

IMO, a trusted proxy should be a requirement for implementing a privacy-preserving C2S client. As a user, I would find it very surprising if viewing a post from a user, or even just viewing my home feed, exposed my IP address to untrusted remote servers. Personally, I would prefer we add language in the opposite direction: clients should proxy requests to remote origins through a trusted proxy server, to avoid disclosing the user’s IP address.

Degrades performance how? Implementing a trusted proxy allows you to implement caching, prefetching, and other types of performance improvements that are impossible with each client making its own independent request.

Cache invalidation is one of the two hard things in computer science, the others being naming things and off-by-one errors. Introducing another layer of caching on top of the ones browsers already have is counter-productive: many objects require authentication for access, in which case you cannot share them between users. You’re also introducing additional latency on top of every request. It’s another service for every server to implement, and to secure so that people cannot simply create an account and get an anonymous proxy with which to access the web (how would you secure such a service so that it cannot access anything other than ActivityPub endpoints?). And every C2S client needs to know how to discover it.
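On the “how would you secure such a service” question: at minimum a proxy has to refuse plain-HTTP targets and IP-literal targets pointing into private address space, or it becomes an open relay into the operator’s internal network. A rough sketch of that check (not complete; a production proxy would also need to resolve hostnames and re-check the resulting addresses to defend against DNS rebinding):

```python
import ipaddress
from urllib.parse import urlparse

def is_safe_proxy_target(url: str) -> bool:
    """Reject targets an object-resolving proxy should never fetch (illustrative)."""
    parsed = urlparse(url)
    if parsed.scheme != "https":
        return False  # ActivityPub IDs are https URLs; refuse everything else
    host = parsed.hostname or ""
    if not host:
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # A hostname, not an IP literal; a real proxy must still resolve it
        # and re-apply these checks to the resolved addresses.
        return True
    # Refuse literal IPs in private, loopback, or link-local ranges
    return not (ip.is_private or ip.is_loopback or ip.is_link_local)
```

Even with checks like these, nothing here confines the proxy to ActivityPub endpoints specifically, which is the poster’s point: any public https URL still looks like a valid target.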

As a user, i would find it very surprising if viewing a post from a user or even just viewing my home feed exposed my IP address to untrusted remote servers.

You’re in for a major surprise then: browsers already pull resources of all kinds from all over the web, from CDNs and from static services like S3. There are solutions to these problems already: VPNs, request headers, and blocking third-party cookies. None of those problems will get any better by proxying requests, and not proxying requests won’t make the current situation any worse than it already is.

Last, many ActivityPub objects require authentication to retrieve, so the receiving server will likely know the identity of the user making the request. Their IP address is of minimal value at that point; if you want to hide your IP address, you should be using a VPN, or the <img> tag is going to give you away anyhow. There is nothing preventing a server from appending a unique identifier to each bit of media on any object it serves up, e.g., https://example.com/my-image.jpeg?id=123456.


I want to share what I think is the canonical guidance on CORS in general:

In my opinion, allowing cross-origin access to your host is a massively socially conscious improvement to your origin. I, for one, would greatly appreciate it if more ActivityPub systems would allow CORS.

It’s important to me to think of ActivityPub not just as its own protocol, but as part of the wider web. Allowing that web to be woven among many hosts will be a healthy, cross-connecting thing for the web and for ActivityPub.


A last aside: I think it’s perfectly acceptable for people to want a trusted proxy if they feel it’s valuable. What I’m arguing for is the possibility that others might not, and that server implementations should support both choices.

Server implementations should not be forced into being complicit in your poor privacy practices; use the proxy.

Roughly translated: “we know better than you, so conform to the lowest standard of interoperability.” Not a very effective practice.

A browser-based C2S client is fundamentally flawed: the spec should not demand that implementations relax their security standards to enable you to expose your users to unnecessary risk.

A browser-based C2S client is fundamentally flawed

Though you state it as a fact, that’s actually an opinion; any non-browser client would have the same “fundamental flaws” but is not similarly prevented from exposing just as much of the user’s information. Why hamstring only browser-based clients?