We have written documentation on exactly how Lemmy federation works.

This first document gives a high-level overview, without being too technical. It is roughly based on the federation.md proposal.

https://join-lemmy.org/docs/en/federation/overview.html

We also have a separate document that describes the exact JSON format used by Lemmy, and how the various fields are interpreted.

https://join-lemmy.org/docs/en/federation/lemmy_protocol.html

The links go to our test instance (which is already federating). It will also be available from dev.lemmy.ml once we deploy the next version.

If anything is unclear or missing from those docs, please comment here and we will add it.

8 Likes

In the second documentation given, it is said that:

Field Name         Mandatory  Description
name               yes        Name of the actor
preferredUsername  no         Displayname
// …
    "name": "riker",
    "preferredUsername": "Jean-Luc Picard",
// …

It seems that Lemmy uses preferredUsername field for display names and name for usernames, which is in contrast to the ActivityPub spec and fediverse implementations such as Mastodon:

// …
 "name": "Alyssa P. Hacker",
 "preferredUsername": "alyssa",
// …

Would it improve compatibility to follow the practice, or is it an intended design?
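
To make the difference concrete, here is a minimal Python sketch of how a consumer that follows the ActivityPub convention (preferredUsername as the handle, name as the display name) would read both snippets above. The function name actor_labels is hypothetical and not part of Lemmy or Mastodon.

```python
from __future__ import annotations


def actor_labels(actor: dict) -> tuple[str, str]:
    """Return (handle, display_name) following the ActivityPub convention."""
    # preferredUsername is treated as the short handle.
    handle = actor.get("preferredUsername", "")
    # name is treated as the display name; fall back to the handle if unset.
    display_name = actor.get("name") or handle
    return handle, display_name


# Field values copied from the two snippets quoted in this post.
mastodon_style = {"name": "Alyssa P. Hacker", "preferredUsername": "alyssa"}
lemmy_draft = {"name": "riker", "preferredUsername": "Jean-Luc Picard"}

print(actor_labels(mastodon_style))
print(actor_labels(lemmy_draft))
```

With the Lemmy draft, such a consumer would end up treating "Jean-Luc Picard" as the handle and "riker" as the display name, which is the compatibility concern raised here.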

1 Like

Thanks a lot for noticing this and pointing it out! It would have been a real problem if we found out about this after releasing federation.

That said, who thought this was a good idea? It’s completely counterintuitive.

1 Like

It seems ActivityStreams came first with name, and ActivityPub could only add a new property on top of that. (Especially since name is one of the few special properties that can also be a natural language map, and it seems preferredUsername is not)

Also, in our brief talks on Fedi, I know you do not have a goal to become ActivityPub compliant at this time. I was wondering if it is something being built towards on the long-term roadmap, even if not a priority anytime soon?

3 Likes

So far our goal has been to enable federation between Lemmy instances, and we are almost finished with that (it’s a matter of weeks before we enable allowlist-based federation on our main instance). Regarding ActivityPub compliance, we didn’t expect that it would be so much additional work, so we didn’t think to turn it into a milestone for our NLnet funding. But we definitely want that, and I think we can get it done in parallel within the next few months.

The link to the documentation is dead

1 Like

https://join-lemmy.org/docs/en/federation/overview.html

2 Likes

This is some pretty wonderful documentation. Well done!

I’d like to make an observation on the User object

Lemmy Protocol - Lemmy Documentation → User

There are 2 entities described explicitly, and one implicitly, using the id alias:

  1. "id": "#key"

This is great. The PEM format offers a lot of flexibility. FYI: There is an emerging new standard too in social coding around ssh keys having username.keys.

  2. "id": "https://enterprise.lemmy.ml/u/picard"

This (implicitly) pertains to the HTTP document, so all headers and meta data are tied to this. The created date, the etag etc.

  3. "id": "https://enterprise.lemmy.ml/u/picard"

This is the same as (2), but it explicitly relates to the fields in the User object. It’s a convenient way to get started, but mixing (2) and (3) causes problems as time goes on, because automated agents can find it hard to tell which is which. In Solid we future-proofed this, and got a clear separation of concerns, by adding #me to the User id field. I’d recommend this as good practice that will save you pain in the future (been there!)

I think the choice of schemas is quite practical and has a large network effect. I’m personally going to move much more to a JSON (with scattered html) model for schemas and alias them to existing fields. All the various formats on linked data are great for inclusion but harder for a parser. I’m going to move more to a json first approach, both for the self-description and the context.

Great work, in any case, and you’ve given some inspiration to me to make some fedi based services and docs + how to do it.

Two questions:

The second one “ActivityPub API Outline” is an empty html document (here).
Did it move somehow, is it maybe https://join-lemmy.org/docs/en/federation/lemmy_protocol.html now?
If so, the links in the first doc should be updated too.

And there might be a typo in the first document:

When a new Comment is created for a Post, both the Post ID and the parent Comment ID (if it exists) are written to the in_reply_to field. This allows assigning it to the correct Post, and building the Comment tree. It is then sent to the Community inbox as Create/Note

The ActivityStreams (as) property is named inReplyTo.

I also have some observations about the “Lemmy Protocol Federation” doc, which I am summing up here:

Context

[
  "https://www.w3.org/ns/activitystreams",
  {
    "stickied": "as:stickied",
[…]

See e.g. Activity Vocabulary or https://www.w3.org/ns/activitystreams.jsonld

• So, I wonder about as:stickied - did you maybe mean toot:featured ?
Extensions to the official namespace as (apart from the new “alsoKnownAs”) are documented here Activity Streams extensions - W3C Wiki

The boolean toot:featured is the proposed sticky/pinned post thing and your documentation also says

“True means that it is shown on top of the community”

@nutomic @dessalines Let us avoid duplicates. I am working on a consolidated vocabulary including all Community Extensions.

• The as:moderators Collection should become YOURNAMESPACE:moderators
• “expires” could be as:endTime which is a native property.

Last but not least, pt is usually the prefix for the PeerTube namespace.

1 Like

Yes, the link you posted is the correct one now. Unfortunately it seems like I can’t edit the original post, probably because it is too old.

I fixed the name of inReplyTo.

About context, the truth is that I don’t really understand how it works, and I haven’t found anyone who does. Lemmy just adds it in case it’s needed by other software, but objects and activities are parsed as simple JSON.

The trailing # is easy to add, I didn’t know that was significant.

Mastodon’s toot:featured field contains a collection of all stickied objects, while Lemmy sets as:stickied as a boolean directly on the stickied objects, with no collection. So changing that would require some rewrite, which is low priority for me.
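
A rough Python sketch of the two shapes described above, and of how one could be derived from the other. The post URLs and the helper names are made up for illustration, and the "OrderedCollection" type is an assumption rather than something taken from either implementation.

```python
# Lemmy style: each object carries its own boolean flag.
posts = [
    {"id": "https://example.com/post/1", "stickied": True},
    {"id": "https://example.com/post/2", "stickied": False},
    {"id": "https://example.com/post/3", "stickied": True},
]


def featured_from_flags(objects):
    """Build a featured-style collection from per-object stickied flags."""
    return {
        "type": "OrderedCollection",  # assumed collection type
        "orderedItems": [o["id"] for o in objects if o.get("stickied")],
    }


def is_featured(object_id, collection):
    """Collection style: an object is pinned if the collection lists its id."""
    return object_id in collection["orderedItems"]


featured = featured_from_flags(posts)
print(featured["orderedItems"])
print(is_featured("https://example.com/post/2", featured))
```

Going from flags to a collection is a simple filter, but serving the collection means querying for all stickied objects of a community, which is where the rewrite effort mentioned above would come in.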

You are right that we should probably define our own namespace. The problem is that I don’t know how to do that, or how to verify that it is valid. By the way, there are also many fields which are not part of the context at all, but those are all optional and can be ignored if you want (same as stickied or moderators).

Thanks for answering fast, I’ve (hopefully) corrected the links in the original post.

About context, the truth is that I don’t really understand how it works

What is important: We speak about @context and not context.
The first is a useful underlying property from the JSON-LD specification.

The second is explained in “ActivityVocabulary”;
It is a native ActivityStreams property meant to group things.

anyway

about @context

Without using it, Sir Tim Berners-Lee would award Lemmy 3 of 5 stars :slight_smile:
When you use it, you can earn the 4th star.

The spec tries to explain @context; I recommend reading it in this order:

  1. Section Extensibility in the underlying spec. for “ActivityVocabulary”: Activity Streams 2.0
  2. Section Context in the very underlying spec. JSON-LD 1.1

Let me try: :slight_smile:
It is to “use URIs to denote things, so that people can point at your stuff”
See the benefits
In short:
Any property in the JSON document is not a word but a URI.
We do not want to repeat things, so in the @context field we can

  • define a base URI for unprefixed properties (it is https://www.w3.org/ns/activitystreams# unless specified otherwise)
  • define prefixes which are like “shortcuts” and denoted by :
  • define a property and its behaviour specifically

Then any property becomes a unique URI which can also point to a definition of the property that is both machine-readable and human-readable. With multilanguage labels as a bonus (like in wikidata or redaktor).

Now for example

{ "type": ["adidas:Offer"] } or { "type": ["puma:Offer"] }

can have different specified meanings (and, given the companies’ history, probably do).


Which brings me to

Mastodon’s toot:featured field contains a collection of all stickied objects

You are right, sorry. Let us think about it in federated terms.
Both make sense and so you should use your own namespace.

Trying to highlight the differences.
On the one hand, when an application shows the Outbox of an Actor (e.g. under the Profile), the Mastodon approach makes sense because you do not want to parse the whole Outbox just to find all the sticky items.
On the other hand when you treat Objects of different Actors, like when viewing your Inbox, “lemmy stickied” is fine to just show e.g. an Icon or “sticky”-label …

Please also note that id and type are themselves aliases (for @id and @type), specified by the “ActivityStreams 2.0 Terms”.
But the @context itself is independent. Since yours is consistent I could cache it for generator = Lemmy (recommending to use generator property).
But in the @context itself it must still be @id and @type (note the @ !)
The schema namespace is “http://schema.org/” (no “#”, it exactly replaces the “sc:”).
And if you want to alias things from “as”, you need to specify what “as” is.

tl;dr
proposing:

{
  "@context": [
    "https://www.w3.org/ns/activitystreams",
    "https://w3id.org/security/v1",
    {
      "as": "https://www.w3.org/ns/activitystreams#",
      "lm": "https://join-lemmy.org#",
      "pt": "https://joinpeertube.org/ns#",
      "sc": "http://schema.org/",
      "comments_enabled": {
        "@type": "sc:Boolean",
        "@id": "pt:commentsEnabled"
      },
      "matrixUserId": {
        "@type": "@id",
        "@id": "as:alsoKnownAs"
      },
      "moderators": {
        "@type": "@id",
        "@id": "lm:moderators"
      },
      "stickied": {
        "@type": "sc:Boolean",
        "@id": "lm:stickied"
      }
    }
  ]
}

Basically to avoid confusion for people not using JSON-LD/@context, I would rename ‘comments_enabled’ to ‘commentsEnabled’ as in peertube.
And: regarding @type, normally AP uses "xsd": "http://www.w3.org/2001/XMLSchema#" to describe functional datatypes like xsd:boolean.

@nutomic et al.
Please also attend the monthly meetings, held on the second Tuesday of each month. We have discussed, and keep discussing, all the @context and context things …

1 Like

OT’ish… While the namespacing is an improvement, application-specific namespaces are still non-optimal. Why would Lemmy reference a Peertube namespace to model Commenting features? This relates to the discussion in A namespace for things defined in FEPs and the fact that we never really figured out how to specify AP vocab extension in ways most beneficial for reuse across the ecosystem.

Because it is specified like this here

https://www.w3.org/TR/activitystreams-core/#h-extensibility

I tried multiple times to get @rhiaro in the loop; meanwhile I am stuck too, but this is why I would like to at least continue collecting extensions, even if nobody seems to be interested.

PS: The lemmy namespace is for the newly introduced lm:stickied feature.
The aliases are just there so that you do not need prefixes anywhere in your document; as Melvin and others noted before, that is easier to consume.

In general, I agree.
The thing is regarding mastodon and peertube it is too late cause they have already created their own namespaces (which we can only alias …)

But (cc @nutomic ) I would agree that it would be better to start now with a common namespace.
Do @aschrijver and @acka47 want to help?
I have created a huge rdfs/owl/skos turtle file which is collecting everything
It is huge. The minimal version is already 700kb (without all the roles in the shown vocab !) …

What do we need? → Presenting the SkoHub Vocabs Prototype | Skohub Blog
Step 2 would need an exchange/contact of repo owner @aschrijver and @acka47 then and I can push it.

/ edit
also pinging @cpmoser cause https://yuforium.com/ns/activitypub

/ edit2
There is also a problem with the PeerTube @context: neither as:dislikes nor as:comments exists.

      "dislikes": {
        "@id": "as:dislikes",
        "@type": "@id"
      },
      "comments": {
        "@id": "as:comments",
        "@type": "@id"
      }

Yes. Though I’d like to restrict to generic procedures for extension and find best-practices for that to document.

Isn’t your SkoHub vocab just one particular example? We should discuss it in a different topic, and maybe document it in a Hedgedoc pad in parallel for the time being (or alternatively have a wiki post + discussion thread).

Well, this was a different SKOS file [just for attributions (Roles) and location]
Worked on all real used terms already.
Not every implementor replied yet and I will finish the uncommented things (e.g. pleroma) after work.

Here is what I have for now
see asSkos.ttl (valid turtle file)

[edit]
When describing @context above, I forgot to mention Manu’s wonderful intro video to JSON-LD.

So, we would need a repo and a name. And first the SkoHub steps.
Each term has an inbox, so we can just write messages to them to discuss. It is all federated.
But we can also publish at next meeting.
Finally I can generate a JSON-LD context and maybe JSON Schema out of the turtle file.

PS: Thanks to everyone who has fed it so far.

To be honest, I’m not really interested in learning how @context works. Like I said, Lemmy doesn’t use the field at all, and only sends it for the benefit of other software. It is defined in this file, so I suggest you make a pull request to change it (or I can do that if you prefer).

About featured/stickied items, it is true that the way Mastodon does it makes more sense. Our implementation is simply a reflection of how it’s stored in the database, because it was much easier to implement that way, and no one has complained so far. You can open an issue to change it.

comments_enabled is just a typo in the context, the actual field is called commentsEnabled.

And I don’t really have time to do video chats; for me it’s very much preferable to talk like this via forum posts, GitHub issues or Matrix chat.

2 Likes

Yes, I will make a pull request.
It is really misleading, because people already thought that things like
wrongNamespace:stickied (Lemmy) or
wrongNamespace:dislikes or wrongNamespace:comments (PeerTube)
exist in the ActivityStreams namespace.

This is possible too but it must be formally decided in a meeting where a chair of the SocialCG is present.
Just talked with Dr. Amy Guy about it in fedi.
Also, there is nothing special about creating your own namespace: it’s just a URI (it need not be a URL); it is really just a name associated with a space.

But people federate them on in the as namespace and others stumble upon it etc. :wink:

1 Like

for the benefit of other software, sending an incorrect @context is actually worse than sending no @context at all – all you really need to know is that in a json-ld aware software, the “plain json” property names derive their namespace from the @context property like so:

  • any URI (like https://www.w3.org/ns/activitystreams) gets fetched for an application/ld+json context document (like https://www.w3.org/ns/activitystreams.jsonld); all properties within that document get added to the understood context
  • aliases can be defined by mapping a prefix property to its expanded form, e.g. "as": "https://www.w3.org/ns/activitystreams#" maps the as: prefix to the full URI prefix (you can see this in the context document near the top)

so for example the following are all supposed to be equivalent:

  • Public (when @context includes https://www.w3.org/ns/activitystreams)
  • as:Public (when @context includes "as": "https://www.w3.org/ns/activitystreams#")
  • https://www.w3.org/ns/activitystreams#Public (no @context needed)

the purpose of json-ld normalization is to convert all properties to a fully-qualified URI like https://www.w3.org/ns/activitystreams#Public; this removes all ambiguity. a json-ld parser would not check for Public, it would check for https://www.w3.org/ns/activitystreams#Public in order to be absolutely sure we both mean “the activitystreams definition of public” and not “some other definition of public”.
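
the three equivalent forms above can be illustrated with a small python sketch. this is not a json-ld processor, only a simplified stand-in for the expansion step; expand_term and its default_ns parameter are hypothetical names, and default_ns stands in for whatever the fetched context document would define for unprefixed terms.

```python
from __future__ import annotations

AS_NS = "https://www.w3.org/ns/activitystreams#"


def expand_term(term: str, prefixes: dict[str, str], default_ns: str | None = None) -> str:
    """Expand a plain, prefixed, or absolute term to a full URI (simplified)."""
    # Already a full URI: nothing to do.
    if term.startswith(("http://", "https://")):
        return term
    # Compact form such as "as:Public": swap the prefix for its namespace.
    prefix, sep, suffix = term.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + suffix
    # Unprefixed term: resolved against the namespace supplied by the context.
    if default_ns is not None:
        return default_ns + term
    raise ValueError(f"cannot expand {term!r} without a matching context")


print(expand_term("Public", {}, default_ns=AS_NS))
print(expand_term("as:Public", {"as": AS_NS}))
print(expand_term(AS_NS + "Public", {}))
```

all three calls produce the same fully-qualified URI, which is the value a json-ld aware consumer would compare against.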


as far as lemmy’s @context goes, i see the following issues:

  • "stickied": "as:stickied", implies that stickied exists in the as: namespace, but it does not
    • likewise for "moderators": "as:moderators"
  • "pt": "https://join-lemmy.org#", implies that the pt: namespace is owned by / expands to lemmy’s domain; i assume this is supposed to be peertube
  • "matrixUserId" has a nonsensical id; based on a sample user payload it seems like it just maps to a matrix identifier?
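
the first issue in the list can be checked mechanically. below is a hedged python sketch of such a check: it flags context entries that point into the as: namespace but whose local name is not in a supplied set of known terms. the function name is hypothetical, and KNOWN_SUBSET is a tiny hand-picked subset used for illustration only, not the full activitystreams vocabulary; the "expires" entry reuses the as:endTime suggestion made earlier in the thread.

```python
def undefined_as_terms(context_entries: dict, known_as_terms: set) -> list:
    """Return context keys mapped to as:<name> where <name> is not a known term."""
    flagged = []
    for term, mapping in context_entries.items():
        # A mapping is either a plain string or an object with an "@id".
        iri = mapping.get("@id") if isinstance(mapping, dict) else mapping
        if isinstance(iri, str) and iri.startswith("as:"):
            local_name = iri[len("as:"):]
            if local_name not in known_as_terms:
                flagged.append(term)
    return flagged


# Illustrative subset only; a real check would load the full vocabulary.
KNOWN_SUBSET = {"inReplyTo", "endTime", "alsoKnownAs"}

old_style_entries = {
    "stickied": "as:stickied",
    "moderators": "as:moderators",
    "expires": "as:endTime",
}

print(undefined_as_terms(old_style_entries, KNOWN_SUBSET))
```

with a complete term list, a check like this could run in tests so that new fields do not end up in the as: namespace by accident.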

using JSON-LD Playground i came up with the following sample:

  "@context": [
  "https://www.w3.org/ns/activitystreams",
  {
    "pt": "https://joinpeertube.org/ns#",
    "lemmy": "https://join-lemmy.org/ns#",
    "sc": "http://schema.org/",
    "sensitive": "as:sensitive",
    "stickied": {
      "@type": "sc:Boolean",
      "@id": "lemmy:stickied"
    },
    "matrixUserId": {
      "@type": "sc:Text",
      "@id": "lemmy:matrixUserId"
    },
    "commentsEnabled": "pt:commentsEnabled",
    "moderators": {
      "@type": "@id",
      "@id": "lemmy:moderators"
    }
  },
  "https://w3id.org/security/v1"
]

PR here: Fix: Use correctly parseable JSON-LD context by trwnh · Pull Request #2299 · LemmyNet/lemmy · GitHub

3 Likes