FEP-96ff: Explicit signalling of ActivityPub Semantics

silverpill · February 17, 2024, 7:27pm

Hello!

This is a discussion thread for the proposed FEP-96ff: Explicit signalling of ActivityPub Semantics.
Please use this thread to discuss the proposed FEP and any potential problems
or improvements that can be addressed.

Summary

A number of vulnerabilities have occurred in ActivityPub implementations due to
“type confusion” attacks - where unrelated files on the same hostname as an ActivityPub
implementation are processed as objects with ActivityPub semantics.

Such attacks have been mitigated by carefully validating the Content-Type header (and
by implementations ensuring that users cannot create files with the application/activity+json
or application/ld+json content types), but it would bolster such defences if messages
intended to be processed with ActivityPub semantics were explicitly signalled as such.

Additionally, ActivityPub nominally supports transfer syntaxes other than JSON-LD (such
as any other RDF syntax like Turtle; or potentially a more bandwidth efficient syntax such
as a hypothetical CBOR-LD). Strict content type filtering permanently prevents usage of
such syntaxes in the future.
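
For readers unfamiliar with the existing mitigation, here is a minimal Python sketch of the kind of Content-Type check the summary refers to. The function names are hypothetical, and the parameter parsing is deliberately simplistic (it assumes no `;` inside quoted parameter values). It accepts `application/activity+json`, and `application/ld+json` only when the `profile` parameter names the Activity Streams namespace, which are the two media types the ActivityPub Recommendation lists.

```python
AS2_PROFILE = "https://www.w3.org/ns/activitystreams"


def parse_media_type(header_value):
    """Split a Content-Type value into (media_type, params).

    Simplified: assumes parameter values never contain ';'.
    """
    media_type, *raw_params = header_value.split(";")
    params = {}
    for raw in raw_params:
        name, sep, value = raw.partition("=")
        if sep:
            params[name.strip().lower()] = value.strip().strip('"')
    return media_type.strip().lower(), params


def is_activitypub_content_type(header_value):
    """Return True if the header names one of the ActivityPub media types."""
    media_type, params = parse_media_type(header_value)
    if media_type == "application/activity+json":
        return True
    if media_type == "application/ld+json":
        # The profile parameter may hold several space-separated URIs.
        return AS2_PROFILE in params.get("profile", "").split()
    return False
```

A check like this rejects anything with another media type outright, which is also why it cannot accommodate other RDF syntaxes without being loosened.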

erincandescent · February 17, 2024, 8:33pm

The RFC2119 text has gotten a bit mangled when copied there - perhaps that paragraph should be deleted?

The primary purpose of this document is to (eventually) make all use of ActivityPub semantics explicitly signalled - effectively, semantics should be opt-in.

silverpill · February 17, 2024, 9:47pm

Sure, I removed the last paragraph. Per convention we include the whole Summary section in the introduction post.

tesaguri · February 18, 2024, 2:49am

The ActivityStreams 2 syntax can be used independently of ActivityPub, and non-ActivityPub systems such as Cohost produce ActivityStreams 2 documents.

Isn’t the problem of spoofed attributions also applicable to Activity Streams in general? Or, do you have another concern unique to ActivityPub?

My pet theory is that one of the points of a common data model like Activity Streams is interoperability between different ecosystems, e.g. ActivityPub consumers can share contents from static Activity Streams servers, and I fear completely rejecting non-ActivityPub producers defeats that merit.

erincandescent · February 18, 2024, 6:45pm

Yes, but the exact semantics expected may differ. This doesn’t preclude interoperability with non-ActivityPub systems, but does say “Hey this isn’t ActivityPub and you have to be (potentially differently) careful what you do with it.”

tesaguri · February 19, 2024, 11:16am

While an ability to signal different semantics is a nice addition, that seems to me to be a different matter from the security issue mentioned in the proposal.

As a more conservative precaution against the type confusion attacks, how about using the profile link relation type (RFC 6906) with a value of the Activity Streams namespace (i.e. Link: <https://www.w3.org/ns/activitystreams>;rel="profile")? This still achieves the proposal’s goal of supporting RDF syntaxes other than JSON-LD, and it’s also applicable to non-ActivityPub producers.
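
To make the suggestion concrete, a consumer-side check for that header could look roughly like the Python sketch below. The helper names are hypothetical, and the Link parser is intentionally minimal: it handles the common `<target>; rel="..."` form with comma-separated links, not every corner of the Link header grammar (for example, commas inside quoted parameter values).

```python
import re

AS2_NAMESPACE = "https://www.w3.org/ns/activitystreams"

# Matches "<target>" followed by zero or more ";param" segments.
_LINK_RE = re.compile(r"<([^>]*)>((?:\s*;\s*[^;,]*)*)")


def parse_link_header(value):
    """Return a list of (target, [rel, ...]) pairs from a Link header value."""
    links = []
    for match in _LINK_RE.finditer(value):
        rels = []
        for param in match.group(2).split(";"):
            name, sep, val = param.partition("=")
            if sep and name.strip().lower() == "rel":
                # A rel value may contain several space-separated relation types.
                rels.extend(val.strip().strip('"').lower().split())
        links.append((match.group(1), rels))
    return links


def has_link(value, target, rel):
    """Return True if the header links to `target` with relation `rel`."""
    return any(t == target and rel in rels for t, rels in parse_link_header(value))
```

With this, `has_link(header, AS2_NAMESPACE, "profile")` is true for the header above regardless of which RDF syntax the response body uses.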

lanodan · February 19, 2024, 8:33pm

One thing that’s entirely missing from it is deployment. Surely you don’t mean for that to be a hard requirement for federation, right? That would be a complete disaster.

Meaning implementations need to signal their support of it.

(btw replying through email seems broken, no idea why, nothing wrong in my server logs…)

erincandescent · February 19, 2024, 9:36pm

If implemented with the “An implementation MAY” behaviour as specified, then this is backwards compatible with existing implementations (unless they’re including a different Link: <>, rel=type value, which to my knowledge no implementation does).

If implemented without that behaviour, then yes, you don’t have backwards compatibility.

The idea is that if this becomes pervasively deployed then you can just remove the MAY behaviour.

If you get back a response from an implementation with a Link: <https://www.w3.org/TR/activitypub/>;rel="type" header, then the implementation supports 96ff. Perhaps it could be explicit that you must implement it everywhere?
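
The decision logic described here could be sketched in Python as follows. This is one reading of the thread, not text from the FEP: the function name and the `allow_legacy` flag are hypothetical, and the inputs are assumed to have been extracted already (the list of rel="type" link targets from the response, and the result of a conventional Content-Type check).

```python
AP_TYPE = "https://www.w3.org/TR/activitypub/"


def accepts_activitypub_semantics(type_links, content_type_is_ap, allow_legacy=True):
    """Decide whether a response should be processed with ActivityPub semantics.

    type_links: targets of all Link headers with rel="type" on the response.
    content_type_is_ap: result of an existing Content-Type validation.
    allow_legacy: the optional ("MAY") backwards-compatible fallback.
    """
    if AP_TYPE in type_links:
        # Explicit signal: the server says this is an ActivityPub resource.
        return True
    if type_links:
        # Only other rel="type" values are present; assumed here to mean
        # the resource is explicitly something else.
        return False
    # No signal at all: fall back to Content-Type only while the legacy
    # behaviour is enabled.
    return allow_legacy and content_type_is_ap
```

Once signalling is pervasively deployed, a consumer would call this with `allow_legacy=False`, so that only explicitly signalled responses are accepted.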

erincandescent · February 19, 2024, 9:42pm

What does “ActivityStreams with the ActivityStreams profile” actually mean? I think anything we do should convey that this is ActivityPub. It may be dissatisfying that doing this breaks the “open world assumption”, but it seems like doing this is the easiest way for us to avoid type confusion attacks.

With regards to rel=profile vs rel=type:

rel=type is about server capabilities. LDP uses it to tell you “This is an LDP server, and this HTTP resource is an LDP resource”; here we use it to say “This is an ActivityPub server, and this HTTP resource is an ActivityPub resource”
rel=profile is about data semantics.

tesaguri · February 20, 2024, 11:26am

Do you intend something like Content-Type: application/ld+json; profile="https://www.w3.org/ns/activitystreams"\r\nLink: <https://www.w3.org/ns/activitystreams>; rel="profile" by “ActivityStreams with the ActivityStreams profile”? In that case, we’d have no problem with interpreting the profile parameters as idempotent, I guess?

I’m not suggesting to replace your proposal. Instead, I’m suggesting to extend the “MAY” behavior to make it applicable to other RDF syntaxes. (You may not like extending the requirement for backward compatibility with something new, though.)

While the “Hey this isn’t ActivityPub” semantics sounds nice for making some decisions on compatibility considerations, I believe we need a requirement clearer than just “be careful” as a security precaution, which should be fool-proof, and yet I suppose that rejecting anything that doesn’t explicitly support ActivityPub is not what everyone wants.

Content-Type and profile may not perfectly fit the purpose of expressing the server’s intention, but I suppose that it would still be a reasonable compromise to assume the responsibility of servers that present the Activity Streams media type to ensure a minimum level of integrity of data they publish. In that case, we might at least want to update the security considerations of the IANA registration, though.

By the way, regarding security requirements in FEPs in general, the discussion in the following topic may be interesting:

(Not that I have an opinion on it. I’m linking to it merely for an informational purpose.)

tesaguri · February 20, 2024, 12:29pm

How does the proposal interact with the specification profiles of ActivityPub?

The Recommendation defines two conformance classes for servers:

ActivityPub conformant Server
ActivityPub conformant Federated Server

If a server only implements one of the profiles, can the server still express the type link relation?

Since not many servers implement both profiles, maybe we don’t want to restrict the proposal to servers that support both. But in that case, could a client be sure that a server supports federation when it fetches a resource and gets the Link header, for example?

erincandescent · March 1, 2024, 1:01am

I don’t see any real reason for it to not be implemented for both.

The main purpose of the rel=type in the header is to let you know the server isn’t a confused deputy; if it implements one of the profiles and not the other, that’s irrelevant to whether it’s confused. You might find some things you want to do don’t work, but that’s OK.

stevebate · March 14, 2024, 5:24pm

To help me understand, if the referenced server implementations had been checking the content-type header validity, would this have prevented the “type confusion”? If not, why?