Unique activity IDs

julian · June 12, 2024, 3:30pm

Is there a requirement that an activity ID be unique?

The reason I ask is that it seems prudent to save a list of encountered activities and drop those that have been seen before.
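A minimal sketch of that kind of deduplication, assuming an in-memory, size-bounded record of IDs. The names `SeenActivities` and `should_process` are hypothetical and are not NodeBB's actual implementation:

```python
from collections import OrderedDict


class SeenActivities:
    """Bounded, in-memory record of activity IDs already processed."""

    def __init__(self, max_size=10_000):
        self._ids = OrderedDict()
        self._max_size = max_size

    def is_new(self, activity_id):
        """Return True the first time an ID is seen, False on repeats."""
        if activity_id in self._ids:
            return False
        self._ids[activity_id] = None
        if len(self._ids) > self._max_size:
            self._ids.popitem(last=False)  # evict the oldest entry
        return True


def should_process(seen, activity):
    """Decide whether an incoming activity (a dict) should be handled."""
    activity_id = activity.get("id")
    if activity_id is None:
        # No ID to compare against, so it cannot be deduplicated.
        return True
    return seen.is_new(activity_id)
```

With a check like this, a second activity that reuses an earlier ID is silently dropped, which is only safe if senders really do keep their IDs unique.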

However, that caused me to run head-first into a NodeBB regression, because we ourselves don't actually send unique activity IDs.

For example, a Follow-Undo(Follow)-Follow chain would have the two Follows with the same ID, since we just construct them ad-hoc based on request data.

Easy fix is to throw in a timestamp there, but it got me wondering about whether there were uniqueness expectations at all, or whether I was being overzealous in checking for it.
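To illustrate the collision and the timestamp fix, here is a rough sketch. The URL shape and both function names are assumptions made up for this example, not how NodeBB actually builds its IDs:

```python
import time
from typing import Optional
from urllib.parse import quote


def follow_id_adhoc(actor: str, target: str) -> str:
    """Build an ID purely from request data: identical for every repeat Follow."""
    return f"{actor}#activity/follow/{quote(target, safe='')}"


def follow_id_timestamped(actor: str, target: str, now_ms: Optional[int] = None) -> str:
    """Append a millisecond timestamp so a later Follow gets a different ID."""
    if now_ms is None:
        now_ms = time.time_ns() // 1_000_000
    return f"{follow_id_adhoc(actor, target)}/{now_ms}"


actor = "https://example.org/uid/1"
target = "https://example.net/users/alice"

# Follow -> Undo(Follow) -> Follow: the ad-hoc scheme repeats the same ID.
assert follow_id_adhoc(actor, target) == follow_id_adhoc(actor, target)

# With a timestamp, the two Follows are distinguishable.
first = follow_id_timestamped(actor, target, now_ms=1718206200000)
second = follow_id_timestamped(actor, target, now_ms=1718206260000)
assert first != second
```

Two Follows generated within the same millisecond would still collide under this scheme, so a random component (for example a UUID) may be a safer choice if that case matters.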

hrefna · June 12, 2024, 3:37pm

@julian That's… an excellent question.

So the requirement as I understand it is that the id must be publicly resolvable, which would imply a uniqueness constraint.

Still, the verbiage doesn't say that it must be publicly resolvable _to the object in question_, IIRC? I'd have to look this up to confirm it and am not in a position to do so right now, but that's an interesting question.

evan · June 12, 2024, 3:39pm

@julian there is absolutely a uniqueness expectation in there.

silverpill · June 12, 2024, 3:41pm

@julian Yes, IDs should be unique:

> All Objects in [ActivityStreams] should have unique global identifiers. ActivityPub extends this requirement; all objects distributed by the ActivityPub protocol MUST have unique global identifiers, unless they are intentionally transient

-- https://www.w3.org/TR/activitypub/#obj-id

bumblefudge · June 14, 2024, 9:56am

Can I trick anyone thinking about deduplication and identifier schemes into reviewing my FEP and proposing changes or additions? I’m a huge fan of content-identifiers (whether the IPFS kind or others) so understanding dedup and performance needs inherent to the protocol informs some research I’m doing on content-identified AP: