Practices around JSON formatting of JSON-LD messages

I think the JSON-vs-JSON-LD debate keeps going in circles because both sides are right – about different things. The trap is treating AP as one homogeneous payload space.

My working line: the split runs along content type.

  • Expressive/media content (microblog, image, video, likes, boosts): JSON is enough. Meaning comes from social convention, performance matters, JSON-LD processing is overhead without payback.
  • Structured factual content (food items, recipes, barter offers, flea-market listings, events, places, needs/offers, value flows): Linked Data helps every single time. This data only becomes useful when it can be aggregated across instances, languages and domains. There won’t be a Mastodon-sized actor defining the convention for recipes or for local barter networks – RDF is the only mechanism that lets a hundred small projects interoperate without one of them becoming the de-facto standard.

And: AI is happening, with or without us. Outside our bubble, agents are being built that act on behalf of people. The honest question isn’t whether we can stop that, but whether we provide the layer on which it can become accountable – or leave the field to platforms that ignore the very values we showed up here for. A machine working with RDF doesn’t guess – it follows unambiguous URIs and uses shared vocabularies. Ducking out means handing the field to those who already have too much power.

Concretely, we don’t need to reinvent JSON-LD for any of this. The AS/AP context that every implementation already carries does the crucial work for free: every property is resolvable as an IRI. That’s something to build on – inside the standard or alongside it. We’d close that door the moment we said “no more JSON-LD”. The harder question lies elsewhere anyway: what happens when an implementation supports 50 FEPs and pulls in 50 contexts? Composition, versioning, preloading – that’s where Linked Data has to prove itself in the fediverse.
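To make the preloading question a bit more tangible, here is a minimal sketch of a context registry that ships with the implementation and never fetches over the network. It is only an illustration under stated assumptions: the AS2 entry is a two-term stand-in for the real context document, the FEP context URL and its `ingredient` term are hypothetical, and `load_context` and `resolve_term` are names made up for this example.

```python
AS2 = "https://www.w3.org/ns/activitystreams"

# Preloaded contexts, keyed by URL. The AS2 entry is a two-term stand-in for
# the real context document; the FEP URL and its term are hypothetical.
PRELOADED = {
    AS2: {"name": AS2 + "#name", "content": AS2 + "#content"},
    "https://fep.example/ns/recipes/v1": {
        "ingredient": "https://fep.example/ns/recipes#ingredient",
    },
}


def load_context(url):
    """Return a bundled context; refuse to fetch unknown ones over the network."""
    try:
        return PRELOADED[url]
    except KeyError:
        raise LookupError(f"context not preloaded: {url}") from None


def resolve_term(term, context_urls):
    """Resolve a term to an IRI using the listed contexts, later ones taking precedence."""
    iri = None
    for url in context_urls:
        iri = load_context(url).get(term, iri)
    return iri


print(resolve_term("name", [AS2]))
print(resolve_term("ingredient", [AS2, "https://fep.example/ns/recipes/v1"]))
```

Putting the version into the context URL (the `v1` here) is one possible way to handle versioning; whether that scales to 50 contexts is exactly the open question.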


A small note about myself: I probably won’t take an active part in the further discussion – please don’t read it as ignorance if I don’t reply to direct references. I’ll be reading along.

I wasn’t discussing media type profiles in that quote (or anywhere else), so that could be why you don’t see the point. I was discussing JSON-LD, how different orderings of context URIs can change the expanded terms, and how that would result in ambiguity for JSON consumers using the JSON-LD document as described in the initial post of this thread. I’m assuming there’s no “authority” defining a specific order of a set of URIs in the context (since the number of all possible combinations of all possible context URIs would be extremely large).
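A rough sketch of the ordering issue, assuming a heavily simplified model of context processing in which each context is a flat term-to-IRI mapping and a later context overrides an earlier one. Both contexts and their IRIs are hypothetical, and real JSON-LD processing has more rules (protected terms, scoped contexts) than this shows.

```python
# Simplified model of JSON-LD context processing: each context is a plain
# term -> IRI mapping, and contexts listed later override earlier ones.

# Hypothetical contexts; neither IRI refers to a real vocabulary.
CONTEXT_A = {"name": "https://a.example/ns#name"}
CONTEXT_B = {"name": "https://b.example/ns#name"}


def merge_contexts(contexts):
    """Apply contexts in order; the last definition of a term wins."""
    active = {}
    for ctx in contexts:
        active.update(ctx)
    return active


def expand_keys(document, contexts):
    """Replace each known key with its IRI; keys without a mapping are dropped."""
    active = merge_contexts(contexts)
    return {active[key]: value for key, value in document.items() if key in active}


doc = {"name": "Alice"}
ab = expand_keys(doc, [CONTEXT_A, CONTEXT_B])
ba = expand_keys(doc, [CONTEXT_B, CONTEXT_A])
print(ab)  # "name" expands to the IRI from CONTEXT_B
print(ba)  # "name" expands to the IRI from CONTEXT_A
```

A consumer that reads the document as plain JSON sees the same `name` key in both cases and has no way to tell which of the two IRIs was meant.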

It might be a problem with a “profile”, but that can’t be known since we aren’t defining, in this thread, the details of what a “profile” (application domain profile) is or how it could avoid these issues. All we can say at this point is that it is definitely a problem with JSON-LD (which you’ve confirmed).

🤨

I don’t think it would be unreasonable to conclude that your premise throughout the thread has been to

but it is weird that any details of this “more effective extension process” are left totally unspecified and it is simply assumed that such a process exists and is better.

If your position is:

Then my position is: No, this hasn’t worked, because the Fediverse actually fails to have any extensibility. If you are speaking to Mastodon, then Mastodon expects all terms to be used in the same way as Mastodon. You are speaking some Mastodon profile, but you never negotiated such a profile; it was forced upon you. This means Mastodon is the de facto authority in all communications with Mastodon. If you disagree with Mastodon, Mastodon will ignore your definition and use its own. “Ad hoc community discussions” don’t let people do what they want, they just attempt to reconcile any term conflicts.

The litmus test for decentralized extensibility is that you can use a term, even a term someone else has already used, and it won’t cause confusion, because decentralized extensibility doesn’t require any communication ahead of time. If decentralized extensibility is supported, your system will generally survive any first encounter with any other system without any conflicts. (There is a different problem where some possibly important information may not be understood, but there will at least be some least common denominator of information you both understand, and no information you actively disagree about.)
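As an illustration of that first encounter, here is a minimal sketch assuming both sides key their data by full IRIs. The `barter.example` vocabulary, the message contents and the `first_encounter` helper are all hypothetical.

```python
# IRIs the receiving system understands (a hypothetical, tiny subset).
UNDERSTOOD = {
    "https://www.w3.org/ns/activitystreams#name",
    "https://www.w3.org/ns/activitystreams#content",
}

# Incoming message keyed by IRIs; the last key comes from a vocabulary
# the receiver has never seen before.
incoming = {
    "https://www.w3.org/ns/activitystreams#name": "Flea market",
    "https://www.w3.org/ns/activitystreams#content": "Saturday, 10:00",
    "https://barter.example/ns#status": "open",
}


def first_encounter(message, understood):
    """Split a message into the part the receiver understands and the part it ignores."""
    shared = {k: v for k, v in message.items() if k in understood}
    ignored = {k: v for k, v in message.items() if k not in understood}
    return shared, ignored


shared, ignored = first_encounter(incoming, UNDERSTOOD)
print(sorted(shared))
print(sorted(ignored))
```

The unknown `status` property is ignored rather than misread, because its IRI cannot collide with any `status` the receiver might define for itself.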

I have already said that if you don’t want to require any JSON-LD processing, then leaving it in partially expanded form is an option (i.e. compact against AS2 only, don’t augment the context). It’s not idiomatic compared to most JSON you might have seen before, but it results in exactly one unambiguous representation where every key is an identifier and every value is an array of objects, each carrying either @id or @value. You can parse it consistently using only a JSON parser, or you can elect to compact it yourself against any additional context you yourself understand (without forcing anyone else to understand it).

Asking to throw away JSON-LD is asking to throw away the minimal constraints that make anything unambiguous. You will need to find some other mechanism that tells you what terms mean and which values are actually references. Let’s say you go with HAL (the Hypertext Application Language), despite it being an expired Internet-Draft as of April 2024:

GET /orders HTTP/1.1
Host: example.org
Accept: application/hal+json

HTTP/1.1 200 OK
Content-Type: application/hal+json

{
  "_links": {
    "self": { "href": "/orders" },
    "next": { "href": "/orders?page=2" },
    "find": { "href": "/orders{?id}", "templated": true }
  },
  "_embedded": {
    "orders": [{
        "_links": {
          "self": { "href": "/orders/123" },
          "basket": { "href": "/baskets/98712" },
          "customer": { "href": "/customers/7809" }
        },
        "total": 30.00,
        "currency": "USD",
        "status": "shipped"
      },{
        "_links": {
          "self": { "href": "/orders/124" },
          "basket": { "href": "/baskets/97213" },
          "customer": { "href": "/customers/12369" }
        },
        "total": 20.00,
        "currency": "USD",
        "status": "processing"
    }]
  },
  "currentlyProcessing": 14,
  "shippedToday": 20
}

Okay, now we have “link objects” at least, so we don’t get confused on which things are references and which things are values. But we don’t know what any of those terms mean.
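For comparison, the expanded JSON-LD form mentioned before the HAL example answers both questions at once: keys are IRIs, and each value says whether it is a reference or a literal. Below is a minimal sketch that reads a single expanded node with nothing but a JSON parser. The node and the `example.org/ns#` vocabulary are hypothetical, and the helper name is made up for this example.

```python
import json

# Hypothetical expanded node: every property key is a full IRI and every value
# is a list of objects carrying either "@id" (reference) or "@value" (literal).
EXPANDED = """
{
  "@id": "https://example.org/orders/123",
  "https://example.org/ns#customer": [{"@id": "https://example.org/customers/7809"}],
  "https://example.org/ns#total": [{"@value": 30.0}],
  "https://example.org/ns#status": [{"@value": "shipped"}]
}
"""


def split_references_and_literals(node):
    """Walk one expanded node and sort its properties into references and literals."""
    references, literals = {}, {}
    for key, values in node.items():
        if key.startswith("@"):
            continue  # keywords such as @id and @type are not properties
        for item in values:
            if "@id" in item:
                references.setdefault(key, []).append(item["@id"])
            elif "@value" in item:
                literals.setdefault(key, []).append(item["@value"])
    return references, literals


references, literals = split_references_and_literals(json.loads(EXPANDED))
print(references)
print(literals)
```

No context processing is involved here; the price is the verbose, non-idiomatic shape of the JSON.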

HAL does support compact URIs (CURIEs), which can help if everything you use is a CURIE, but doesn’t help with anything that isn’t a CURIE:

{
    "_links": {
        "self": { "href": "/orders" },
        "curies": [{ "name": "ea", "href": "http://example.com/docs/rels/{rel}", "templated": true }],
        "next": { "href": "/orders?page=2" },
        "ea:find": {
            "href": "/orders{?id}",
            "templated": true
        },
        "ea:admin": [{
            "href": "/admins/2",
            "title": "Fred"
        }, {
            "href": "/admins/5",
            "title": "Kate"
        }]
    },
    "currentlyProcessing": 14,
    "shippedToday": 20,
    "_embedded": {
        "ea:order": [{
            "_links": {
                "self": { "href": "/orders/123" },
                "ea:basket": { "href": "/baskets/98712" },
                "ea:customer": { "href": "/customers/7809" }
            },
            "total": 30.00,
            "currency": "USD",
            "status": "shipped"
        }, {
            "_links": {
                "self": { "href": "/orders/124" },
                "ea:basket": { "href": "/baskets/97213" },
                "ea:customer": { "href": "/customers/12369" }
            },
            "total": 20.00,
            "currency": "USD",
            "status": "processing"
        }]
    }
}

Here, you can say that _links with a relation of “self” and “next” come from the IANA Link Relations registry, while an “ea:” namespace can be expanded wherever encountered to get a full URI. But then you have unexpanded properties like “total”, “status”, “currency”, “currentlyProcessing”, and “shippedToday”. These terms are undefined, and thus only meaningful in the context of the centralized server. But it’s not like HAL claimed to be a decentralized solution – it was only intended as a set of conventions for “API” use cases, where a single application controls the state of every resource and you can access that state with the principles of REST and HATEOAS. The central problem HAL concerns itself with is having an unambiguous way to express links and “follow your nose”.
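The gap can be sketched in a few lines, assuming the `curies` entry from the HAL example above and a hypothetical `expand_curie` helper. The sketch does not model the IANA registry lookup for `self` and `next`, so those come back unexpanded here as well.

```python
# CURIE definition copied from the HAL example above.
CURIES = [{"name": "ea", "href": "http://example.com/docs/rels/{rel}", "templated": True}]


def expand_curie(term, curies):
    """Return the full URI for a 'prefix:rel' term, or None if no CURIE applies."""
    prefix, sep, rel = term.partition(":")
    if not sep:
        return None  # bare term such as "total": nothing to expand
    for curie in curies:
        if curie["name"] == prefix:
            return curie["href"].replace("{rel}", rel)
    return None


terms = ["ea:find", "ea:order", "self", "total", "currentlyProcessing"]
expanded = {term: expand_curie(term, CURIES) for term in terms}
for term, uri in expanded.items():
    print(term, "->", uri)
```

Only the `ea:` terms end up with a URI; `total` and `currentlyProcessing` stay bare strings whose meaning depends on knowing which server you are talking to.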

If the Fediverse continues to ignore the fact that anyone can disagree with anyone, then it will remain fragmented and confused. This is what leads to either de jure or de facto consolidation of power in the hands of the most powerful party, who can set the terms for everyone else communicating within their authority. Yes, you can disagree with Mastodon as long as you’re not talking to Mastodon. But you also don’t know if you’re talking to Mastodon specifically. So what do you do, then – attempt to sniff NodeInfo? This replicates all the problems of user agent sniffing. There needs to be an in-band way of negotiating terms with mutual agreement for that session that doesn’t leak outside the session. Otherwise, as bad as things are right now, they will get even worse.

This is all pretty off-topic for “How do JSON-LD users communicate in a way that is least painful for JSON users”, unless your answer to that question is “JSON-LD users shouldn’t communicate at all with the Fediverse except on some unstated terms decided ahead of time, out of band, and entirely by the more naive peer”. I’d wager this isn’t exactly the most valid reason to netsplit, out of all the reasons to netsplit I’ve heard. (Anti-abuse and ideological opposition are more compelling reasons, incidentally, but those are other discussions for other times.)

Fediverse-we-have is at a major inflection point

And so I did. Seeing all the discussion here, for which I thank you all, I decided to put in far more effort than originally intended and to delve into much deeper subject matter. The article quotes @SorteKanin and mentions others in this thread, especially @silverpill as the pillar who upholds the FEP process today. I read the discussion with interest but skipped the implementation details, to take a broader perspective.

All of you are right, and yet none of you are. There exists no clear path forward.
The road simply hasn’t been paved, and it’s unclear who can pave it.
The open standard is stuck, fossilised, unable to naturally evolve.
The chaotic grassroots ecosystem has moved on, unable to healthily grow.
The installed base that’s content with the status quo has increasing inertia against change.

The only options that exist in the fediverse today are:

  1. Either force-push changes through the formal W3C process, and hope for adoption.
  2. Or force-push changes as a post-facto interoperability leader, and hope for popular uptake.

Given the fundamental architectural impact of most of the options, both are extremely hard. Where @SorteKanin observes that open standards should not be controversial, I follow up and reformulate: open standard designs should avoid misconceptions from the very start, because misconceptions are cheapest to fix immediately and their cost only increases over time.

Grassroots open standards to evolve the fediverse

Pragmatic ways to solve the linked data conundrum

The Protosocial AP overlay intends to split into two separate APIs exposed to the solution layers:

  1. Protosocial API. Actor-based message bus-like abstraction, closed-world, plain JSON.
  2. Knowledge API. Optional extension that offers full-fledged, open-world Linked Data support.