Some discussion regarding NodeBB’s handling of soft deleted posts and Discourse’s parallel implementation prompted the creation of this FEP, which attempts to describe how the concept of soft deletion can be published without the introduction of new activities—using as:Delete as-is and relying on a backreference check for Tombstone in order to signal a soft delete.
@Claire, in Feb 2002, you created a topic where you mentioned soft deletes. While this isn’t strictly related to Undo(Delete), this FEP recommends thinking of a received Delete as an instruction to invalidate the cache, and re-fetch, which would give you a better answer as to how to handle the received Delete or Undo(Delete).
>Request the object (via its id) from the origin server directly
Couldn't the Delete activity itself indicate the type of operation?
For example, if the Delete contains an embedded Tombstone, then treat it as a soft delete. Otherwise, treat it as a hard delete.
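A minimal sketch of that rule, assuming the activity has already been parsed from JSON into a dict and that `type` is a single string. `classify_delete` is a hypothetical helper name, not something the FEP defines:

```python
from typing import Any, Mapping


def classify_delete(activity: Mapping[str, Any]) -> str:
    """Return "soft" if the Delete embeds a Tombstone, otherwise "hard"."""
    if activity.get("type") != "Delete":
        raise ValueError("not a Delete activity")
    obj = activity.get("object")
    # An embedded object is a JSON object (dict); a bare reference is a string id.
    if isinstance(obj, Mapping) and obj.get("type") == "Tombstone":
        return "soft"
    return "hard"
```

A Delete whose `object` is only an id string falls into the "hard" branch here, which is the behaviour proposed in this post rather than what the FEP text says.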
>The Forums and Threaded Discussions Task Force (ForumWG) has identified a common nomenclature when referring to organized objects in a threaded discussion model.
I find this nomenclature a bit confusing. Commented on the linked issue.
The assumption is that the object is not embedded. If it is, then it stands to reason that the embedded object can be used as is. I’ll call it out in that section, thanks.
What would happen if you receive a Delete for an object that you believe to have been soft deleted, but now it shows up as an object instead of a Tombstone? Like, it was undeleted by the time you receive the Delete or something?
Likewise, what if you receive an Undo(Delete) and, when you fetch the referenced object, you get back a Tombstone instead of the object?
It’d be good to document those cases, because I think the answers are:
If you receive a Delete and fetching the object returns a live object (not a 404, 410, or Tombstone), then you discard the Delete
If you receive an Undo(Delete) and fetching the object returns a 404, 410, or Tombstone, then you discard the Undo(Delete)
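Those two rules could be sketched roughly as below. This assumes the receiver has already re-fetched the object and knows the HTTP status plus the `type` of the returned document (if any). `should_apply` is a hypothetical name, and treating any other status (for example a 500) as "discard" is my own assumption:

```python
from typing import Optional

GONE_STATUSES = {404, 410}


def should_apply(activity_type: str, status: int, fetched_type: Optional[str]) -> bool:
    """Check whether a Delete or Undo(Delete) agrees with what the origin serves now."""
    looks_deleted = status in GONE_STATUSES or (status == 200 and fetched_type == "Tombstone")
    looks_live = status == 200 and fetched_type not in (None, "Tombstone")
    if activity_type == "Delete":
        # Discard the Delete if the origin still serves a live object.
        return looks_deleted
    if activity_type == "Undo":
        # Discard the Undo(Delete) if the origin still reports the object as gone.
        return looks_live
    raise ValueError(f"unsupported activity type: {activity_type}")
```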
this regards soft deletion + context collections (as a collection of posts). This topic started at
I’m curious what should happen if the context contains three elements ap-obj, reply, and reply2. reply2 is a reply of reply. Now reply is deleted. How many elements does the context then contain?
@silverpill said that for mitra the context would contain 1 element ap-obj.
The scenario as Gherkin:
Background:
  Given A new user called "Alice"
  And A new user called "Bob"
  And An ActivityPub object called "ap-obj"
Scenario: Reply to reply with parent reply deleted
  Given "Alice" replied to "ap-obj" with "Nice post!" as "reply"
  And "Bob" replied to "reply" with "Good point!" as "reply2"
  When "Alice" deletes "reply"
  Then For "Alice", the "context" collection of "ap-obj" contains "?" elements
Per my understanding, when processing a deletion of reply, you would not presume deletion of any or all downstream objects. Only the referenced object is deleted.
Deleting multiple objects at once would require multiple activities, or perhaps a single (and as-yet undefined) "batch" style activity.
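To illustrate that reading, here is a small sketch using an in-memory dict as a stand-in for a server's object store. `soft_delete` and `context_items` are hypothetical helpers. Whether the Tombstone itself is still listed in the context is left as a flag, because the thread does not settle that:

```python
from typing import Dict, List


def soft_delete(store: Dict[str, dict], object_id: str) -> None:
    """Replace only the referenced object with a Tombstone; replies are untouched."""
    old = store[object_id]
    store[object_id] = {"id": object_id, "type": "Tombstone", "formerType": old["type"]}


def context_items(store: Dict[str, dict], include_tombstones: bool = True) -> List[str]:
    """List the ids in the context, optionally hiding Tombstones."""
    return [
        oid for oid, obj in store.items()
        if include_tombstones or obj["type"] != "Tombstone"
    ]


store = {
    "ap-obj": {"id": "ap-obj", "type": "Note"},
    "reply": {"id": "reply", "type": "Note", "inReplyTo": "ap-obj"},
    "reply2": {"id": "reply2", "type": "Note", "inReplyTo": "reply"},
}
soft_delete(store, "reply")
print(context_items(store))                            # Tombstone counted
print(context_items(store, include_tombstones=False))  # Tombstone hidden
```

Either way, `reply2` stays in the context under this interpretation, which differs from the one-element result reported for mitra above.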
I understand the need to be able to undo deletions, this is something we face at Mastodon for the edge case of appealing moderation decisions (currently, most moderation decisions can be reversed upon appeal, but not post deletion).
I have some concerns with the FEP as it stands regarding performance, and ensuring consistency wrt. chronology of events, caching and possible out-of-order activities.
Indeed, performance-wise, the FEP asks recipients of a `Delete` to fetch the object that has just been deleted. This means that for a post that has, over its lifetime, reached a thousand different servers, in addition to ideally reaching all of those servers again (either directly or through inbox forwarding), the authoring server must now handle all of these servers fetching the now-deleted post all at once. I fear this is an especially bad instance of the thundering herd issue.
As for ensuring consistency wrt. chronology of events, we face a lot of challenges:
depending on their architecture, servers may emit outgoing activities (or process incoming ones) out-of-order (for instance, Mastodon queues jobs into work queues, but if there are multiple workers, a later job can finish before an earlier job does)
due to network failures, servers may fail to deliver an activity on time and retry later
due to caching (e.g. Mastodon offers short-time caching on reverse proxies, but does not invalidate the reverse-proxy cache when the resource is changed), fetched data might actually be older than just-delivered data
The ActivityPub primer makes note of this but offers no solutions besides “The receiving server, if it receives an activity that refers to an unknown activity, should store that activity for later processing.” While this is relatively easy to do when an object cannot be brought back once it’s deleted, this breaks down if you can undo the `Delete`, and I have seen no solution offered for that in the current FEP.
Using `published` in activities and `published`/`updated` or similar in objects might help with that, but I’m afraid this might not be enough because of the one-second resolution of `xsd:dateTime` (and it would require extra care that the lifecycle of an object is indeed serialized with a monotonic time).
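A minimal sketch of the timestamp comparison idea, assuming both values are ISO 8601 strings that carry a UTC offset or a trailing `Z`. `parse_ts` and `is_newer` are hypothetical names. Two events with identical timestamps remain a tie that this sketch does not resolve:

```python
from datetime import datetime


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp, including fractional seconds."""
    # Older Python versions reject a trailing "Z" in fromisoformat(), so normalize it.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def is_newer(incoming: str, stored: str) -> bool:
    """Apply an incoming activity only if it is strictly newer than the stored state."""
    return parse_ts(incoming) > parse_ts(stored)
```

A receiver could compare the `published` of an incoming `Delete` or `Undo` against the last timestamp it recorded for the object and drop the activity when `is_newer` returns `False`.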
Regarding the performance issue, and avoiding the thundering herd problem, one could simply embed the object itself (so, a Delete with an expanded Tombstone in object) into the activity. You could additionally sign it (LD Signature) or attach a proof (Object Integrity Proofs) if necessary.
As for sub-second resolution of updated/published... is xsd:dateTime required? I've honestly just been sending ISO 8601 strings, which include millisecond-level accuracy.
When a Delete activity is encountered, the referenced object MAY be either the full object or a reference to one.
If object is a reference, the server MUST request the object (via its id) from the origin server directly.
Emphasis is mine. In situations where you choose to embed the full object in the activity, then you are not bound by the MUST to refetch the object.
Now, when talking about hard deletes, you cannot literally embed a non-existent object, so a re-fetch would be necessary, although I am hoping that 404 handlers are a great deal faster.
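Roughly, the embedded-versus-reference handling could look like the sketch below. The `fetch` callable is a hypothetical stand-in for whatever HTTP client the server uses, assumed to return a `(status, body)` pair, and `resolve_deleted_object` is likewise an illustrative name:

```python
from typing import Any, Callable, Mapping, Optional, Tuple

Fetch = Callable[[str], Tuple[int, Optional[dict]]]


def resolve_deleted_object(activity: Mapping[str, Any], fetch: Fetch) -> Optional[dict]:
    """Return the object a Delete refers to, or None if the origin reports it gone."""
    obj = activity["object"]
    if isinstance(obj, Mapping):
        # Embedded object: use it as-is, no request to the origin server.
        return dict(obj)
    # Bare reference: ask the origin server what the object looks like now.
    status, body = fetch(obj)
    if status in (404, 410):
        return None  # hard delete
    return body
```

Only the reference branch touches the network, so a sender that embeds the Tombstone spares itself the wave of refetches.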
I like published. I can add that in to the FEP if it makes it easier to handle situations where multiple Deletes and Updates are encountered out-of-order due to network congestion, parallel processing, etc.
`xsd:dateTime` is required as per Activity Vocabulary but I skimmed over the definition too fast, it definitely allows fractional seconds!
It appears I must have read too fast once again, and was confused by the “Unexpected responses” section.
That can still be an issue, negative hits are still expensive and in general you may not want to cache them (to avoid an attacker targeting something that does not exist yet).
i think this requirement can be removed, as the behavior on receiving a Delete is up to the receiver and not the sender. that’s also where the issue lies – receivers assuming Delete is a permanent removal. any or all of the following behaviors on receiving a Delete are “valid” in some sense:
do nothing to the object, just store the activity
expunge object from HTTP cache
expunge object from AS2/RDF dataset
edit the object to say it is “deleted”
convert object to a Tombstone
prevent reuse of the object.id
fetch the object using HTTP GET and handle caching/refetching using HTTP cache control headers
having a reference doesn’t imply needing to fetch it if you already have information about it. if you don’t already have information about it then you can also choose not to fetch on Delete activities. the point of having an id is that you can choose whether or not to obtain additional information! that’s what linked data is founded on – the linking. every link is in effect a boundary between two records of information.
if the goal is to prevent receivers from completely purging an object, then you can’t really do this. if the goal is to stop receivers from preventing reuse of the id, then recommend that they SHOULD NOT do this.
more generally i would ask you to consider two different senses of “deletion”:
type: Tombstone
formerType: Note
content: "the text is still there but the account was deleted"
attributedTo:
  type: Tombstone
  formerType: Person
or this:
type: Tombstone
formerType: Note
content: "the text is still there but the account was deleted"
attributedTo: <someone>
# GET someone HTTP/1.1
# HTTP/1.1 404 Not Found
Okay, I am perfectly fine to relax the requirement from a MUST to a SHOULD.
Does that resolve the thundering herd concern acceptably?
Other solutions would entail:
Setting explicit null as object (yes @trwnh@mastodon.social this is yet another example of a place where null makes sense!) if the object is hard deleted.
Sending an ETag header with the Delete activity. When re-requesting, send that same value in If-None-Match (the conditional header that carries ETag values) and the server receiving that request can opt to terminate execution early with an HTTP 304.
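A rough sketch of the conditional refetch, bearing in mind that in HTTP an ETag value is normally echoed back in `If-None-Match`. It assumes the receiver kept the ETag that arrived with the Delete. `conditional_headers` and `needs_processing` are hypothetical helpers, and no real HTTP client is shown:

```python
from typing import Dict, Optional


def conditional_headers(etag: Optional[str]) -> Dict[str, str]:
    """Build request headers for refetching an object, conditional on its ETag."""
    headers = {"Accept": "application/activity+json"}
    if etag:
        # The origin can answer 304 Not Modified if its current ETag matches this value.
        headers["If-None-Match"] = etag
    return headers


def needs_processing(status: int) -> bool:
    """A 304 means the copy we already hold is current, so we can stop early."""
    return status != 304
```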