FEP-bad1: Object history collection

Hello!

This is a discussion thread for the proposed FEP-bad1: Object history collection.
Please use this thread to discuss the proposed FEP and any potential problems
or improvements that can be addressed.

Summary

[AS2-Core] provides examples 18, 19, 32 which represent the “history” of an object.

Particularly in example 32, we see an object being Created, Updated, and Deleted. However, there is no property dedicated to advertising a collection fit for this purpose. This FEP attempts to define one.

cc @trwnh

An object’s history collection will necessarily be ordered chronologically, although whether the ordering should be forward or reverse chronological is an open question. At the time of writing this FEP, [ActivityPub] Section 5 contains the following language:

An OrderedCollection MUST be presented consistently in reverse chronological order.

This language indicates that if OrderedCollection is used, the ordering MUST be reverse chronological.

An OrderedCollection defined in an extension is not subject to this requirement. See: ActivityPub errata/Proposed - W3C Wiki

It’s not specific to this FEP, but the ambiguity of reverse chronological ordering is problematic in general. According to the Activity specification:

What property is used to determine the reverse chronological order is intentionally left as an implementation detail.

Especially for an Object’s Activity history, it seems we’d want it to be ordered specifically by the time the Activity was performed, rather than by receive time, store time, or some other time-related property.
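To make that concrete, here is a minimal sketch in Python. It assumes each activity carries a `published` timestamp as an ISO 8601 string with an explicit UTC offset, and that `published` reflects when the activity was performed; the specification guarantees neither. The helper name `sort_history` is hypothetical.

```python
from datetime import datetime


def sort_history(activities, newest_first=True):
    """Order activities by their own `published` time, not by receive or store time."""

    def performed_at(activity):
        # Assumption: `published` is an ISO 8601 string with an explicit offset.
        return datetime.fromisoformat(activity["published"])

    return sorted(activities, key=performed_at, reverse=newest_first)


# Illustrative data, listed in the order a server might have received it.
history = [
    {"type": "Update", "published": "2023-06-22T10:00:00+00:00"},
    {"type": "Create", "published": "2023-06-21T09:00:00+00:00"},
    {"type": "Delete", "published": "2023-06-23T08:00:00+00:00"},
]

print([a["type"] for a in sort_history(history)])
```

Passing `newest_first=False` gives forward chronological order instead, so the same key works whichever direction ends up being chosen.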

Side comment: It feels like we are starting to define special properties and collections like this as a workaround for not having a general-purpose Object query capability. This is not a criticism of this FEP, just an observation of a trend I’m noticing. I could see this leading to a plethora of properties over time for special-purpose, inflexible “queries” and associated indexing. However, I also understand that defining a general-purpose query capability (maybe SPARQL-ish with authorization filtering) would be challenging and a topic for another discussion thread.

I suppose it could be reconstructed with something like a SPARQL query for where object == some id, which would return all activities targeting that object. You could filter the author’s outbox for this, and really, you could filter any collection that you would expect to contain all activities related to that object. But I do agree that this sort of filtering or querying support is more and more needed, as we are discovering with the submission of several recent FEPs.

The closest prior effort to this is FEP-5bf0, which proposes using streams for pre-filtered sub-collections, but my current thinking is that there should be an endpoint defined in endpoints against which you can submit SPARQL queries. I just don’t know enough about SPARQL yet to come up with a fully fleshed-out proposal. Additionally, endpoints are generally only exposed on actors, so either endpoints needs to be attached to objects as well (which doesn’t make sense for most of the endpoints, such as the OAuth ones), or the specific property needs to be attached to the object directly. Something like Collection.sparqlEndpoint? Or more indirectly via attributedTo.endpoints.sparqlEndpoint? This is the domain of some other FEP, though…
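For illustration, the "object == some id" filtering can be sketched in plain Python, with no SPARQL involved. This assumes the collection's items are already loaded as dicts, and that `object` is either a bare id string or an embedded object with an `id`. The names `object_id` and `activities_for` and the example.com URLs are hypothetical.

```python
def object_id(value):
    """Return the id of an `object` value that is an id string or an embedded object."""
    if isinstance(value, str):
        return value
    if isinstance(value, dict):
        return value.get("id")
    return None


def activities_for(items, target_id):
    """Keep only the activities whose `object` refers to target_id."""
    return [a for a in items if object_id(a.get("object")) == target_id]


# Illustrative outbox contents.
outbox = [
    {"type": "Create", "object": {"id": "https://example.com/notes/1", "type": "Note"}},
    {"type": "Like", "object": "https://example.com/notes/2"},
    {"type": "Update", "object": {"id": "https://example.com/notes/1", "type": "Note"}},
    {"type": "Delete", "object": "https://example.com/notes/1"},
]

related = activities_for(outbox, "https://example.com/notes/1")
print([a["type"] for a in related])
```

This only finds activities that happen to be in the collection you filter, which is why the choice of collection matters.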

For this FEP, I think it’s useful enough to have an object’s history explicitly presented, for use cases where other consumers don’t particularly care to run their own queries.

Per the discussion on the PR, there is also another alternative:

RFC5829 defines the following rels:

  • version-history
  • latest-version
  • predecessor-version
  • successor-version

it sounds like litepub:formerRepresentations is semantically equivalent to version-history, so my preference would be to formally name such a property versionHistory.

the challenge is in storing each revision of an object, or generating it on-the-fly. which ID do you use? do you use an ID at all? i could see something like so:

id: <some-post>
type: Note
content: This post has been edited.
published: 2023-06-21
updated: 2023-06-22
versionHistory:
  - id: <some-post/history>
    type: OrderedCollection
    orderedItems:
      - id: <some-post/history/1>
        type: Note
        content: This post has not been edited.
        published: 2023-06-21
        versionHistory: <some-post/history>

either the Update activity, during processing, should generate <some-post/history/1> as an exact copy of the object as it was just before the edit, with this versionHistory being created at the time of the first edit; or possibly the versionHistory and <some-post/history/1> should both be created along with the original object.
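a minimal python sketch of the first option, under some assumptions: the stored object is a plain dict, the copy is taken before the changes are applied, versionHistory is a single embedded OrderedCollection rather than a list, and new revisions are appended (forward chronological, which is itself an open question). the helper name `apply_update` and the example.com id are hypothetical.

```python
import copy


def apply_update(stored, changes):
    """Snapshot `stored` into its history collection, then apply `changes` in place."""
    history_id = stored["id"] + "/history"
    # Create the collection lazily, at the time of the first edit.
    history = stored.setdefault("versionHistory", {
        "id": history_id,
        "type": "OrderedCollection",
        "orderedItems": [],
    })
    # Exact copy of the object before this edit, minus the nested history itself.
    snapshot = copy.deepcopy(
        {k: v for k, v in stored.items() if k != "versionHistory"}
    )
    snapshot["id"] = f"{history_id}/{len(history['orderedItems']) + 1}"
    snapshot["versionHistory"] = history_id
    history["orderedItems"].append(snapshot)
    # Assumption: `changes` never contains `id` or `versionHistory`.
    stored.update(changes)
    return stored


post = {
    "id": "https://example.com/some-post",
    "type": "Note",
    "content": "This post has not been edited.",
    "published": "2023-06-21",
}
apply_update(post, {"content": "This post has been edited.", "updated": "2023-06-22"})
print(post["versionHistory"]["orderedItems"][0]["id"])
```

the second option (creating the history together with the original object) would mean running the same snapshot step at Create time instead.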

this approach probably works for mastodon API use-cases but not so well for anyone trying to reconstruct or represent actual history (file creation, update, deletion, etc). maybe that’s fine. i still want to explore exposing this via result though…

I was thinking of an “instance actor” endpoint.

Isn’t a version history something different from what this FEP is proposing? Given a time-series of change events (chrono-sorted Activities), one can materialize a version history, but my understanding is that the FEP doesn’t represent that explicitly.
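As a rough illustration of that materialization step, here is a Python sketch that replays an already-sorted list of activities into successive object states. It assumes Create and Update embed the object, that an Update's fields are merged over the previous state, and that a Delete leaves a Tombstone; real implementations may differ on all three. The name `materialize_versions` and the example.com id are hypothetical.

```python
def materialize_versions(activities):
    """Replay chronologically sorted Create/Update/Delete activities into object states."""
    versions = []
    current = None
    for activity in activities:
        kind = activity["type"]
        if kind == "Create":
            current = dict(activity["object"])
        elif kind == "Update" and current is not None:
            # Assumption: the Update's fields overwrite those of the previous state.
            current = {**current, **activity["object"]}
        elif kind == "Delete" and current is not None:
            current = {"id": current["id"], "type": "Tombstone"}
        else:
            continue  # unrelated or out-of-order activity; skipped in this sketch
        versions.append(current)
    return versions


# Illustrative change events, oldest first.
events = [
    {"type": "Create", "object": {"id": "https://example.com/notes/1", "type": "Note", "content": "v1"}},
    {"type": "Update", "object": {"id": "https://example.com/notes/1", "content": "v2"}},
    {"type": "Delete", "object": "https://example.com/notes/1"},
]

versions = materialize_versions(events)
print([v["type"] for v in versions])
```

The history collection holds the events; the version history is a derived view of them, which is the distinction being drawn here.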

I see there is sometimes interesting information in the PR comments for FEPs. Has there been any discussion about that and whether we should just announce FEPs here and discuss them in the related issue comments?

Correct, this FEP doesn’t describe a versionHistory with objects representing revisions, it describes a history collection with Create/Update/Delete activities targeting the object. The versionHistory is presented as an alternative take that would be defined in a separate FEP. I was just reproducing the PR comment here in-thread so it could be more easily tracked.