Negotiating protocols between actors or clients

trwnh · November 3, 2024, 4:59am

this is the key part. you hand someone jsonld, as2, ldn, and ap – what are they missing? hand them apwf and ap-http-sig and they can maybe send messages now that don’t get dropped on the floor, but what actually goes in those messages? if you want to trigger a certain behavior, what is the message you need to send to trigger that behavior? and if you send the same message to someone else, will they understand it in the same way?

put another way: the C2S outbox has more meaningfully defined side effects than the S2S inbox. you could view the “resource management protocol” as basically an extension of C2S side effects to S2S targets, a kind of “remote C2S” that serves as a counterpart to your standard C2S against your local server.

the other side of this is that having an actual defined protocol means you can not only define behaviors, but you actually have a stable target to test those behaviors.

anecdotally, i know more than one person whose current approach to interop testing is literally to just spin up an instance of every single fedi software they care to target and blindly send those instances various messages to test for which behaviors they can observe heuristically.

because right now, the only answer most people have is to go look at “ActivityPub - Mastodon documentation” or similar for whichever implementation they intend to primarily target.

as @stevebate points out, it’s more than “just implementation details.” if an “implementation detail” prevents federation entirely, then is it really just an “implementation detail”? what about if it significantly changes semantics? i would personally reserve the term “implementation detail” for things like showing a Person.image as the background image of the header on their “profile” rather than as the background image of the entire page. another “implementation detail” might be injecting a “read more” link if the height of the rendered content in a “post” exceeds a certain max height. but when any “implementation detail” starts to have semantic consequences, it ceases to be an “implementation detail” and becomes a “protocol behavior”. things like “you can only have 1 actor, 1 attributedTo, 1 inReplyTo” have consequences; if you violate these expectations then your “status” will not be created at all. your activity will be dropped entirely.
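To make the last point concrete, here is a minimal sketch of a receiver that drops activities violating single-value expectations. The function names (`count_values`, `accepts`) and the exact rules are assumptions for illustration, based only on the "1 actor, 1 attributedTo, 1 inReplyTo" example above; they are not taken from any implementation's source.

```python
from typing import Any


def count_values(value: Any) -> int:
    """Count the values a JSON property carries: absent = 0, list = its length, else 1."""
    if value is None:
        return 0
    if isinstance(value, list):
        return len(value)
    return 1


def accepts(activity: dict) -> bool:
    """Return True only if the activity meets the assumed cardinality rules.

    Assumed rules (hypothetical): exactly one actor, exactly one
    attributedTo on the object, and at most one inReplyTo on the object.
    """
    if count_values(activity.get("actor")) != 1:
        return False
    obj = activity.get("object")
    if not isinstance(obj, dict):
        return False
    if count_values(obj.get("attributedTo")) != 1:
        return False
    if count_values(obj.get("inReplyTo")) > 1:
        return False
    return True


# Two activities that are both valid AS2, but only one survives this receiver.
single_reply = {
    "type": "Create",
    "actor": "https://a.example/users/1",
    "object": {
        "type": "Note",
        "attributedTo": "https://a.example/users/1",
        "inReplyTo": "https://b.example/notes/1",
    },
}
multi_reply = {
    "type": "Create",
    "actor": "https://a.example/users/1",
    "object": {
        "type": "Note",
        "attributedTo": "https://a.example/users/1",
        "inReplyTo": ["https://b.example/notes/1", "https://c.example/notes/2"],
    },
}
```

Under these assumed rules, `accepts(single_reply)` is true and `accepts(multi_reply)` is false, even though both payloads are well-formed activities. The sender gets no signal about why the second one was dropped.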

there is some consideration that needs to be given for extensibility, though. protocol behaviors are not always set in stone. a protocol that today demands a maximum of 1 inReplyTo might relax its requirements in the future to allow multiple inReplyTo. if and when it does, this should ideally be signaled somehow.

aschrijver · November 3, 2024, 7:24am

Some helpful resources related to the general topic:

silverpill · November 5, 2024, 1:24pm

This is not true. Fediverse applications interoperate because they implement ActivityPub and because developers worked together on improving interoperability. Some of those applications existed before Mastodon.

Mastodon-driven development is certainly a thing (especially among those who discovered Fediverse in 2022), but it (obviously) has an opposite effect: such applications don’t communicate with anything else.

That protocol is called “ActivityPub”.

Or “ActivityPub with HttpSignatures”, if you prefer.

This is exactly what I am trying to say: there is no Mastodon “protocol”, or Lemmy “protocol”. Only ActivityPub.

The idea that we have 100 different protocols in Fediverse, one protocol per application, is absurd.

trwnh · November 5, 2024, 2:53pm

can you not see how this is reductive?

there are behaviors that are not described by activitypub. there are semantics that are not defined by activitypub. there are processing rules that are not described by activitypub. to say that “activitypub” sufficiently defines the interop requirements of every “fediverse” application is the real absurdity.

this is like saying that some other protocol is “http” and “only http”. activitypub is not much more than “http post to ldp:inbox where the payload is an as2 activity”. if we were building exclusively messaging applications that read the raw activities directly from the inbox, then sure, that’s basically activitypub.

on top of that, you have the core c2s side effects regarding resource manipulation, and the core s2s side effects which mostly amount to “maybe add the activity to a collection”. but on top of that, you have further requirements if you want to interop with various applications.

i’m saying that these requirements are not and cannot be dismissed as “just implementation details”. they’re part of the protocol – not the formally defined “activitypub protocol”, but an undefined and implicit protocol layered on top of it that includes additional semantic, behavioral, and processing requirements. this divergence should be captured in a profile, and the existence of a profile enables things like conformance and compliance and testing.

again:

stevebate · November 5, 2024, 2:56pm

Like I said, try giving the AP spec alone (or even the set of specs I listed) to a developer and see if they can develop an interoperable “ActivityPub” server. I claim they won’t be able to do it. The ActivityPub “spec” has far too many unspecified or underspecified aspects.

Or… ActivityPub with HttpSignatures and a Mastodon-defined technique for WebFinger resolution of account identifiers that depends on a Mastodon-defined JSON-LD context and a specific subset of supported AP/AS2 entities, Mastodon-specific constraints for which properties are optional and required, expected property cardinality, expected authorization/visibility behaviors, and so on…

Maybe you haven’t seen the arguments between the teams implementing the partially, but problematically, interoperable Lemmy and Mastodon protocols?

Yes, that statement is absurd. Nobody claimed that (unless I missed it). To quote @hrefna,

If you disagree with someone but consistently misstate their position, that doesn’t make your argument stronger. Just for the record.
(Hrefna (DHC), Hachyderm.io)

In any case, I think I understand your position although I’m puzzled why you have it. Maybe protocol means something different for you?

What risks do you see with a proposal like the one @trwnh has described?

trwnh · November 5, 2024, 3:24pm

for my part, the risks that i see are:

- timing. is this kind of effort “too early”? have the implicit protocols settled to the point that they can be sufficiently profiled?
- correctness. how can we avoid codifying incorrect or improper behaviors into such a profile? do we defer to implementers in cases where they violate the intent or letter of the spec, or do we pressure them to align with the standard?
- scope. requirements may be loosened or tightened over time, so what goes into the profile and what stays outside of it? how should each profile account for extensibility?

while considering these risks, i have intentionally left the “options” for profiles as open-ended as possible, giving only suggestions for a few possible profiles instead of mandating specific ones to pursue. but it’s never too early to give proper thought to which behaviors are in play across the ecosystem, and i think it would be useful to be able to at least identify these behaviors at a server/client/actor level, while striving to enable producers to signal these behaviors in some way at these different levels. i don’t know what the “final form” of such a signal^[1] might look like, but that’s what the responsibility of such a task force would be.

[1] (preliminarily, one might expect a mechanism to identify clients attached to a given actor, or servers responsible for a given endpoint. given such a mechanism, it would become possible to describe those clients and servers in the same way that we currently can describe actors. perhaps the clients and servers could be actors themselves. i don’t want to make any sweeping statements yet, but these are all possibilities.)

silverpill · November 5, 2024, 4:56pm

And you just said that again:

If Mastodon and Lemmy are protocols, then all other ActivityPub implementations are also protocols.

(And no, I haven’t seen arguments between Mastodon and Lemmy devs. My software can interoperate with Lemmy just fine.)

As I already said, there is only one protocol. This proposal promotes a false idea of multiple incompatible protocols within ActivityPub network, which goes against other efforts to improve interoperability.

Other points of this proposal just don’t make sense to me. Documenting behaviors is a good idea, but this is a developer’s job.

trwnh · November 5, 2024, 5:20pm

the position isn’t “there are 100 protocols and they’re completely separate”. the position is “there are protocols that require more than just activitypub, and they partially overlap”.

the idea that “there is only one protocol” promotes a false idea that just because something uses activitypub it will be in any way compatible with something else that uses activitypub. this is like claiming that any two applications using http will be interoperable. the messages being “valid” is not enough for the messages to be “meaningful”. the payload of the HTTP POST is subject to additional requirements outside of activitypub. if you say “just drop anything you don’t understand”, then this is not meaningful interoperability, this is a failure state.

this is a standards job. behaviors should be standardized. if you don’t have standard behaviors, you don’t have a useful protocol.

silverpill · November 5, 2024, 5:27pm

“profile” is a better word than “protocol”, if you are talking about sets of features.

However, I don’t think developers actually want to implement feature sets. Nobody wants to build a clone of an existing product. Documenting features, on the other hand, is valuable, because developers could then pick what they want.

silverpill · November 5, 2024, 5:39pm

Well, if you don’t listen to developers, what exactly are you going to “standardize”?

So far, things that were discussed in this thread have little to do with what is actually happening in the network.

stevebate · November 5, 2024, 6:31pm

You are grossly misrepresenting my position and using that misrepresentation to support a clearly illogical conclusion. You mention two protocols that are implemented by multiple applications (many, in the Mastodon protocol case), not 100 protocols. Nobody ever suggested per-application protocols. It wouldn’t even make sense for interoperability.

There’s a Discourse search feature that will help you find them (if you are interested).

WisTex · November 9, 2024, 11:26am

We had something similar to this discussion come up with whether Zot6, which Hubzilla uses, is Zot Protocol or Nomad Protocol. The history behind it is that Hubzilla implemented Zot version 6, and Streams implemented a later version (11 or 12 or something like that). But between version 6 and version 11, the name Zot was discontinued and all future versions were called Nomad.

After many discussions, we came to the conclusion that Zot 6 and Nomad 12 are both “implementations” of the same protocol, which is called the Nomad protocol.

I think we have the same situation here. Every platform has its own implementation, with varying levels of compatibility, but we are still using the same ActivityPub Protocol.

silverpill · November 9, 2024, 6:12pm

At least there were versions in Zot/Nomad. In ActivityPub world, no such boundaries exist, formal or informal.

trwnh · November 9, 2024, 6:59pm

“or informal”? you’re not seriously saying that all messages are universally understood across every single software that claims to implement activitypub?

maybe you can claim there is no formal recognition of the divergence of behaviors, but to claim that even informally they categorically don’t exist is… well, i don’t know what to call it. i’m stumped.

it seems pretty clear to me that activitypub only guarantees that a payload will be an as2 activity. there is no meaningful way that anyone can claim to support every possible activity, unless they treat the activity as purely a notification and assign it no behaviors whatsoever. in this “minimal profile”, we might say the only side effect of any activity is “add it to the inbox”. the second we add any other behavior, we’ve got undocumented behavior that exists informally.

but you’re saying that we don’t. we might as well just assume that every single consumer behaves in exactly the same way. what way is that? well, beyond the guidance of the spec, we don’t know. that some implementations require a certain property and some don’t is apparently not a real difference, despite this leading to some implementations completely dropping your message while others are completely fine with it. i guess we don’t need to worry about inconsequential “features” like “having your message not be processed at all”.

silverpill · November 9, 2024, 10:14pm

Every software behaves in its own unique way. A single piece of software is not a protocol.

stevebate · November 10, 2024, 5:59am

If progress on this proposal is slow in the SocialCG (because of the focus on WG rechartering), another option is to write an FEP for one of the AP-based protocols. The Mastodon protocol, used by most current Fediverse server implementations, seems like a good candidate.

The protocol definition could be adopted by a SocialCG task force later, if one is formed.

stevebate · November 10, 2024, 6:08am

I don’t think repeating this statement over and over is helpful. Nobody is suggesting that a single piece of software is a network protocol. If I missed it, and someone is suggesting that, then maybe it’s a topic for a separate thread?

trwnh · November 10, 2024, 10:57am

I think we discussed it very very briefly in the meeting yesterday, but the outcome of about 1.5 minutes of discussion was “read the proposal, participate on the mailing list, and revisit it in December”. There was some consideration for whether new task forces should spin up now, or wait until we have CG/WG charters, but the answer seems to be “spin up now to start exploring the issues, then adopt the staging process once it’s ratified, then write reports/etc using that staging process.” Consequently, two task forces were proposed and resolved: a Geosocial TF for better supporting Place representations, and a Group TF for defining and supporting membership, joining/leaving, etc. as it relates to how grouped entities usually work in other vocabs and systems (foaf, vcard, …).

I don’t think I want to jump directly to a FEP documenting such things just yet, as I think there is potentially more value in working on a framework for defining and signaling such profiles and protocols first. This is going to be relevant for the Forum TF and the new Group TF because they are introducing new behaviors. We might informally make statements like the following:

- “If there is any context.attributedTo, you should send your interaction to them in the same way that you might send your interaction to any inReplyTo.attributedTo currently.” (Coincidentally, the latter half of this statement is currently just an informal social courtesy.)
- “If there is a members collection present, you can assume that you can send Join/Leave activities to this actor, in the same way that you might take the presence of a followers collection to assume that you can send Follow activities to this actor.” (Again, the latter half of this statement is not explicitly stated in any formal guidance; consequently, a lot of fedi just assumes everyone supports Follow activities.)
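The two informal statements above amount to heuristics that a sender applies today. A rough sketch, assuming embedded (already-fetched) objects and treating `members` as the proposed property from the statement rather than an established one; the function names are hypothetical:

```python
from typing import Any


def as_list(value: Any) -> list:
    """Normalize a JSON property to a list (absent = empty list)."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def inferred_activities(actor: dict) -> set[str]:
    """Guess which activities an actor handles from the collections it exposes.

    This is an assumption-based heuristic, not a guarantee: nothing formally
    says that having `followers` means Follow is processed.
    """
    supported: set[str] = set()
    if "followers" in actor:
        supported.add("Follow")
    if "members" in actor:  # hypothetical property from the statement above
        supported.update({"Join", "Leave"})
    return supported


def courtesy_recipients(obj: dict) -> list[str]:
    """Collect attributedTo ids from embedded inReplyTo and context objects.

    Only embedded objects are inspected; dereferencing ids is out of scope.
    """
    recipients: list[str] = []
    for prop in ("inReplyTo", "context"):
        for target in as_list(obj.get(prop)):
            if not isinstance(target, dict):
                continue
            for who in as_list(target.get("attributedTo")):
                if isinstance(who, str) and who not in recipients:
                    recipients.append(who)
    return recipients
```

Both functions only guess. The sender still has no way to confirm that the recipient will process what it receives.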

But the really nice thing to have would be an unambiguous machine-readable way to declare beforehand that certain Activities will be processed in certain ways. Less assuming, more knowing. That way you don’t bother sending your chess moves to people who don’t understand them and quite probably never asked for them, and you don’t bother sending your e2ee chats to people who will never be able to participate and who are definitely not ignoring you even if it looks that way.

stevebate · November 10, 2024, 11:41am

You could be right. Writing a FEP to document your framework proposal is yet another option. It can be useful to have a concrete starting point for discussion and iterative improvement. However, applying the framework to at least one real-world protocol can give valuable insights about the strengths and shortcomings of the framework.

trwnh · November 10, 2024, 11:43am

I think if it’s okay with everyone, I’d like to have this thread be about constructive feedback on or ideas for the general goal of “how can we describe server/client/controller behaviors”, as that is the preliminary focus of exploration. I am especially interested in prior art for this kind of thing, or any ideas on what a solution might look like.

For my part, I am vaguely aware of the ProfileNegotiation/I-D-Profile-Negotiation repository on GitHub, which holds the Internet-Draft “Indicating and Negotiating Profiles in HTTP” (see “Indicating, Discovering, Negotiating, and Writing Profiled Representations” for the live version). It is an effort (currently expired as of January 28 2024) to deal with the issue of having to register a new media type every time you make a slight change to semantics (for example, to enable content negotiation using an Accept header) when a profile would otherwise work fine (except for the lack of an Accept-Profile header, which the I-D defines). Perhaps there is something useful to extract here when applied to HTTP messages, but it would probably be more useful to have something you can declare on the resource itself, somewhere along the lines of “if you send this to my inbox, this is what I will do with it.” Perhaps it can go on the inbox itself. Perhaps both. I’m not sure yet.

More importantly, I think that bundling everything up in a profile is kind of a substitute for actually describing the processing considerations. You are depending on the human to (at some point) be aware of the profile’s existence and what it entails. In any case, if this is the best we can do, then that’s at least better than what we have now! You could have a message declare that its media type is application/activity+json and its profile is http://joinmastodon.org/profile/v1 and that would get you a fair bit of the way toward the goal of knowing what the intended processing of the message would be (“get the Create.object and convert it to a Status”). Every Mastodon-protocol-supporting inbox could declare this profile and it becomes a stand-in; if you say this is the only supported profile, then I can send you messages with this profile and have something more than a blind assumption that you will handle it in the expected way. I shouldn’t send you messages with a profile of https://w3id.org/fep/xxxx/profile/chess unless you declare support for your inbox being able to handle that. If you don’t support any of this profile stuff then the fallback is to continue doing what you’re doing now, just spray and pray.
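As a sketch of the sender-side decision described above: no such declaration mechanism exists yet, so the `supportedProfiles` property and the `decide` function are purely hypothetical. The two profile URLs are the ones used as examples in this post.

```python
from enum import Enum

MASTODON_PROFILE = "http://joinmastodon.org/profile/v1"
CHESS_PROFILE = "https://w3id.org/fep/xxxx/profile/chess"  # placeholder URL as written above


class Decision(Enum):
    SEND = "send"    # the inbox declares support for this profile
    SKIP = "skip"    # the inbox declares profiles, but not this one
    GUESS = "guess"  # nothing declared: fall back to today's blind sending


def decide(inbox_description: dict, message_profile: str) -> Decision:
    """Decide whether to deliver a message with the given profile to an inbox.

    `supportedProfiles` is a hypothetical property name, used here only
    to illustrate the idea of an inbox declaring what it will process.
    """
    declared = inbox_description.get("supportedProfiles")
    if declared is None:
        return Decision.GUESS
    if isinstance(declared, str):
        declared = [declared]
    return Decision.SEND if message_profile in declared else Decision.SKIP


mastodon_inbox = {
    "id": "https://m.example/users/1/inbox",
    "supportedProfiles": [MASTODON_PROFILE],
}
legacy_inbox = {"id": "https://old.example/inbox"}
```

With these inputs, a chess move would be skipped for `mastodon_inbox` and only guessed at for `legacy_inbox`, which matches the "spray and pray" fallback.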

Future steps might involve breaking up profiles into more granular declarations, either like “mini-profiles”, or otherwise having actual directly-parseable machine instructions for how any given activity is to be processed.

I still think for now that we can target a few profiles for eventual formalization:

- “notification profile” or “minimal profile” will take an Activity and add it to the inbox. nothing more, nothing less.
- “resource management profile” or “remote c2s profile” takes Create/Update/Delete and Add/Remove and performs the c2s side effects not just on outbox receipt, but also on inbox receipt.
- “statuses and accounts profile” or “mastodon profile” will take an Activity and use it to manipulate some resources according to mastodon’s definitions of Status and Account. this profile can probably be defined in a somewhat-minimal sense and then allow extending with other profiles.
  - for example, an “emoji react profile” might be another minimal building block to add on top of the “mastodon profile”.
  - the real challenge here is in identifying where the boundaries are. anything that might be overridden should probably be part of a separate profile. i think the basic building blocks might end up breaking roughly along the lines of activity shapes, but i still need to think about how to represent the limitations on Account entities, because this is rightfully a separate axis from message shape. if we collapse it into a 1D single axis then you end up with combinatorial explosion where each profile has multiple sub-versions depending on the shape of not just the activity but also the actor/attributedTo.
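A rough sketch of the first two profiles as inbox handlers, to show how one can build on the other. The `Server` class and handler names are hypothetical, and the side effects are deliberately simplified: Update replaces the stored object wholesale, Delete just removes it, Add/Remove are omitted, and no authorization check is performed.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Toy receiver state: a list of received activities and a resource store."""
    inbox: list = field(default_factory=list)
    resources: dict = field(default_factory=dict)  # object id -> object


def minimal_profile(server: Server, activity: dict) -> None:
    """'notification profile': the only side effect is adding to the inbox."""
    server.inbox.append(activity)


def resource_management_profile(server: Server, activity: dict) -> None:
    """'remote c2s profile' (simplified): also apply Create/Update/Delete."""
    minimal_profile(server, activity)
    kind = activity.get("type")
    obj = activity.get("object")
    if kind in ("Create", "Update"):
        if isinstance(obj, dict) and isinstance(obj.get("id"), str):
            server.resources[obj["id"]] = obj
    elif kind == "Delete":
        # The object may be embedded or referenced by id.
        target = obj.get("id") if isinstance(obj, dict) else obj
        if isinstance(target, str):
            server.resources.pop(target, None)


server = Server()
note = {"id": "https://a.example/notes/1", "type": "Note", "content": "hello"}
resource_management_profile(server, {"type": "Create", "object": note})
resource_management_profile(server, {"type": "Like", "object": note["id"]})
```

After these two calls the inbox holds both activities but only the Create changed the resource store. The Like falls through to the minimal behavior.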