Publishing / Processing Collections of Activities

Another practical issue @devnull and I faced when integrating NodeBB and Discourse is that the Discourse plugin has been publishing collections of activities to its followers if “Full topic” is enabled (i.e. all posts in a topic will be federated).

I built the plugin this way to accommodate an initial publication delay after the first post is made in a topic. Other posts may be made during that delay period, so the thinking was that, rather than sending multiple POST requests to the followers of that actor at the time of publication, you would just send them a single collection of activities. In other words:

  1. Topic is created.
  2. A few posts are made.
  3. Publication delay period ends.
  4. Publish the topic’s full Collection of Activities to followers.

As Mastodon also supports processing a collection of activities, this seemed to make some sense to me. However, as I think @trwnh warned me, this has not turned out to be popular. There might still be something in this approach, but I’m going to move to publishing activities sequentially, one per POST, even if there are several to publish at one time.
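
As a rough sketch of what publishing sequentially could look like (this is not the Discourse plugin's code; `publish_sequentially` and the injected `send` callable are hypothetical names, and `send` is assumed to perform one signed POST and return whether it succeeded):

```python
from typing import Callable, Iterable, List, Optional, Tuple

Activity = dict  # a parsed ActivityStreams activity (assumption: plain dicts)


def publish_sequentially(
    activities: Iterable[Activity],
    inboxes: Iterable[str],
    send: Callable[[str, Activity], bool],
) -> List[Tuple[str, Optional[str]]]:
    """Deliver each queued activity as its own POST, in order.

    Returns the (inbox, activity id) pairs that failed, so a caller
    can retry them individually instead of resending a whole batch.
    """
    inbox_list = list(inboxes)
    failed = []
    for activity in activities:  # preserve the order the posts were made in
        for inbox in inbox_list:
            # `send` is hypothetical: one activity per request, never a Collection
            if not send(inbox, activity):
                failed.append((inbox, activity.get("id")))
    return failed
```

Because every activity is its own request, a failure is always attributable to exactly one activity and one inbox.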

That said, I think there may be normative support for representing a topic (/thread) as a Collection, so the processing of Collections may become more of a practice in time.

Collections don’t typically arrive in inboxes, yeah. It’s a Mastodon implementation quirk that they wire up their inbox handlers to their collection handler, to minimize the possible code paths. The collection handler is also the first step when fetching an object. This is done because Mastodon supports the ‘featured’ collection as a way to signal “pinned posts”, and they cache the first page of pinned posts when discovering a profile.

This has come up tangentially in various topics over the years and hasn’t had its own thread, but basically, the assumption being made is that each activity represents a single action. You don’t typically see multiple activity types, multiple objects, or entire collections of activities being posted… and that last one is technically a spec violation too (although it only violates ActivityPub, not LDN). Part of the issue is that it’s unclear how to handle partial failures. For example, say a Create has 10 objects, and 7 of them are processed successfully but 3 of them are discarded due to parsing issues. Was the Create activity successful? Or if you willfully violate AP and POST a Collection of Activities… you may have a similar “partial failure” to deal with.

Now, it’s still possible to work around these concerns. But it requires careful thought and broader spec revision and implementer buy-in. The “safe” thing to do for now is to only ever have one object per activity per POST.
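
To make the “one object per activity per POST” rule concrete, here is a minimal sketch of a check a sender or receiver could run on a payload. The function name `is_single_action` is hypothetical, and the sketch assumes transitive activities only (intransitive ones with no `object` would need their own handling):

```python
# Collection types defined by the ActivityStreams 2.0 vocabulary
COLLECTION_TYPES = {
    "Collection",
    "OrderedCollection",
    "CollectionPage",
    "OrderedCollectionPage",
}


def is_single_action(payload: dict) -> bool:
    """Return True if the payload is one activity with one type and one object."""
    types = payload.get("type")
    if isinstance(types, list):
        if len(types) != 1:  # multiple activity types in one payload
            return False
        types = types[0]
    if not isinstance(types, str) or types in COLLECTION_TYPES:
        return False  # missing type, or a Collection posted as an activity
    obj = payload.get("object")
    if isinstance(obj, list):
        return len(obj) == 1  # multiple objects invite partial failures
    return obj is not None  # assumption: transitive activities only


# A plain Create passes; a Collection of activities does not.
print(is_single_action({"type": "Create", "object": {"type": "Note"}}))
print(is_single_action({"type": "OrderedCollection", "orderedItems": []}))
```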

Ok, once this is merged, the Discourse plugin will always publish activities one at a time, even if publication is delayed.

Perhaps this can be viewed from a different angle, as a collection synchronization problem?

For prior work see FEP-8fcf: Followers collection synchronization across servers. I think the mechanism described there can be generalized to other types of collections.
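
As I read FEP-8fcf, the sender advertises a digest built by XORing together the SHA-256 hashes of the follower ids, and the receiver compares it with a digest of its own view. A sketch of the same idea applied to any list of item ids (say, the posts in a topic) might look like this; `collection_digest` is a hypothetical name, and applying it beyond followers collections is an assumption, not something the FEP specifies:

```python
import hashlib
from typing import Iterable


def collection_digest(item_ids: Iterable[str]) -> str:
    """XOR the SHA-256 digests of every item id and return the result as hex.

    XOR is commutative, so the digest does not depend on item order.
    """
    acc = bytes(32)  # 256 zero bits, the XOR identity
    for item_id in item_ids:
        h = hashlib.sha256(item_id.encode("utf-8")).digest()
        acc = bytes(a ^ b for a, b in zip(acc, h))
    return acc.hex()


def in_sync(local_ids: Iterable[str], remote_digest: str) -> bool:
    """Compare the local view of a collection against a remote digest."""
    return collection_digest(local_ids) == remote_digest
```

If the digests differ, the receiver would fetch the collection and reconcile it, rather than relying on having received every individual activity.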

Thanks @angus@socialhub.activitypub.rocks, I think this is (one of) the last pieces needed before two-way federation starts working. Hopefully it will be merged and updated soon 😄
