FEP-2e40: The FEP Vocabulary Extension Process

aschrijver · February 16, 2023, 9:12am

I think we are in agreement on the general idea for the FEP. It is also great that you have made a list of some of the properties that are up for FEP consideration.

I also agree with these examples, but beyond treating them as exemplary we need not focus further on the particulars; let’s just focus on the process itself. There are likely a whole bunch of cases to be considered afterwards once the process is in place.

You mean by means of the @vocab keyword? And the mention of Default Vocabularies?

I think “Vocabulary” is appropriate. For instance the JSON-LD spec also mentions:

For example, the prefix foaf may be used as a shorthand for the Friend-of-a-Friend vocabulary.

We also have vocabularies in the same way. The Security Context vocabulary, the Mastodon Toot vocabulary, the ActivityStreams Vocabulary.

A question remains: When is something a vocabulary?

Above I mentioned in par. Human Readable: "Also note that in terms of how FEPs relate to [vocabularies] there is no 1-to-1 mapping at the moment. And maybe there shouldn’t be one either, so that FEPs and [vocabularies] (which define AP vocab extensions) are entirely separate."

A FEP might define one or more properties to be added to an existing vocabulary, or multiple vocabs, or introduce a new vocab, etc. Individual FEP specifications in that regard aren’t vocabularies, where each document gets their own namespace. But there still can be one universal FEP vocabulary where particular properties are added that are generic additions to the AP protocol stack.

It gets a bit confusing, but once again I see the aforementioned division…

So 2) are docs, mostly related to core protocol extensions that are universal and relevant to any fedi project. And which also define the processes we use, i.e. the FEP process itself, and the Vocabulary process.

And in 3) we may have a FEP vocabulary collecting core extensions in its own @context and I like the ideas of @trwnh in the comment above, but we also get a growing set of other vocabs that are often used and their use increases general interoperability, albeit sometimes in very particular domains. In this list of vocabs Mastodon might also submit their @context doc pointing to their own domain and docs location (example of a post-facto interop adoption).

It is the Process that we are defining now. How many vocabs there are and how they are named and defined is something that is a second step, imho.

But I think there’s more than just a FEP vocab (though this one is our starter for what we have already been specifying in prior FEP docs). Like e.g. I could well foresee a need to have some vocab around video platforms soon as more video-related apps are making their entrance to the fedi. Another example is the Podcasting scene that is working on their own vocabulary. See: activitypub-spec-work.

aschrijver · February 16, 2023, 9:16am

Though the background and uses are a bit different in our case, I still think for the process we can take inspiration from the aforementioned DID Specification Registries document.

trwnh · February 16, 2023, 9:27am

a vocabulary is a collection of related terms. i think it makes more sense to say that each FEP proposes its own vocabulary, rather than the FEP process providing a vocabulary. rather, the FEP process might instead provide a bundled context that contains the vocabularies of each finalized proposal. or, with the way i floated above, you could import the context for only one FEP at a time, so you can pick and choose which FEPs you adopt
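The "bundled context" idea can be sketched as follows. This is a minimal illustration, assuming each finalized FEP ships a flat term-to-IRI mapping; the FEP ids, term names, example.org IRIs, and the `bundle` helper are all hypothetical placeholders, not anything defined by the FEP process.

```python
# Hypothetical per-FEP contexts: FEP ids, terms and IRIs are placeholders.
FEP_CONTEXTS = {
    "aaaa": {"termOne": "https://example.org/fep/aaaa#termOne"},
    "bbbb": {"termTwo": "https://example.org/fep/bbbb#termTwo"},
}


def bundle(contexts: dict, adopted: list) -> dict:
    """Merge the contexts of the adopted FEPs into one JSON-LD context."""
    bundled = {}
    for fep_id in adopted:
        for term, iri in contexts[fep_id].items():
            # Refuse to silently redefine a term that another FEP already claimed.
            if bundled.get(term, iri) != iri:
                raise ValueError(f"term collision on {term!r}")
            bundled[term] = iri
    return {"@context": bundled}
```

Passing only `["aaaa"]` corresponds to picking a single FEP, while passing every finalized id corresponds to the bundled context.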

aschrijver · February 16, 2023, 9:34am

Yes, but in that case the FEP process needs to be modified to reflect that, as currently it is not about related things, but a FEP might just say “I want to add this one property in this particular location/use case”.

AS Groups is an example. “Here is how groups should work, when we encounter them from the AS vocab”.

This is what I am also trying to say. Maybe I should draw a diagram of it all.

An example for FEP-bar… someone submitting a “Video Transcriptions” FEP and saying:

I have this new vocab proposal for video transcription.
But I want to propose this type or property to fep-vocab as it is more generally applicable.
And to those folks implementing bar-vocab I suggest they standardize this property.

(Note: The diagram can be opened in diagrams.net and directly edited there)

helge · February 16, 2023, 10:01am

This discussion is currently moving faster than I can absorb information. I’ll take it as a positive sign that FEP-1570 will fill a need. Anyway, I’ve rewritten the first two parts of the FEP and I think they now describe the goals better:

Summary

Current usage of ActivityPub relies on the ActivityStreams namespace [AS-NS]
combined with custom extensions [Mastodon NS]. As far as I can tell, no
best practices or formal process exist for adding new namespaces.

This FEP will remedy this by

Create a FEP Vocabulary based on identified best practices

Define a process to add new entries to this FEP Vocabulary without a risk of Term collision

Define a process to elevate Terms to be common

Define a process to create specialized Vocabularies

Use [FEP-61CE] as an example of how this process can be used.

Background and Terminology

The JSON-LD context is introduced in 3.1 The Context of [JSON-LD]. The context of an object is specified by its @context property.

One can think of the context as defining certain strings to be equivalent. For example Note, as:Note, and https://www.w3.org/ns/activitystreams#Note all represent the same thing. More details can be found in 3.2. IRIs. Following [JSON-LD], we will refer to all three strings mentioned above as a Term.
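To make the equivalence concrete, here is a toy expansion routine. It is only a sketch: the hand-written `CONTEXT` holds one prefix and one term, the `expand` helper is hypothetical, and a real application would use a full JSON-LD processor instead.

```python
# Toy expansion of JSON-LD terms; not a full JSON-LD processor.
AS = "https://www.w3.org/ns/activitystreams#"

# Hypothetical, hand-written context with one prefix and one term.
CONTEXT = {"as": AS, "Note": "as:Note"}


def expand(term: str, context: dict) -> str:
    """Expand a term or compact IRI to an absolute IRI."""
    if "://" in term:
        return term  # already an absolute IRI
    if term in context:
        # A plain term: look it up, then expand whatever it maps to.
        return expand(context[term], context)
    prefix, sep, suffix = term.partition(":")
    if sep and prefix in context:
        # A compact IRI such as "as:Note": prefix IRI + suffix.
        return context[prefix] + suffix
    raise KeyError(term)
```

With this context, `Note`, `as:Note`, and the full IRI all expand to the same string.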

The second useful aspect of this is that one can define the used terms through the provided URL: https://www.w3.org/ns/activitystreams#Note. Clicking on it will let you easily find the definition of the Note Type.

We will refer to the combination of Context and easily accessible documentation for the terms as a Vocabulary.

The [DID Spec] states Best Practices on how to name Terms in 3. The Registration Process. We will reuse OR adapt them for our purposes.

aschrijver · February 16, 2023, 10:04am

Great, thx! Yes, I’m also spending time I don’t have… I’ll let it rest for a bit and revisit later to review docs.

weex · February 19, 2023, 8:52pm

Merged the latest changes.

aschrijver · February 20, 2023, 9:59am

2 posts were merged into an existing topic: FEP-a4ed: The Fediverse Enhancement Proposal Process

weex · February 21, 2023, 9:24pm

@helge Per this issue, it seems best to move the namespaces folder into feps/assets/fep-xxxx to maintain a clean top-level FEP filesystem. If you can envision multiple namespaces in the future then feps/namespaces/fep-xxxx could also work.

helge · February 22, 2023, 7:34am

Hi Weex. Is it okay if I wait to move the folder with the next draft?

Thanks!

trwnh · February 22, 2023, 12:55pm

if top-level folders are a concern, then i think it would be better to have the top-level folder and make this a revision to a4ed, not a separate FEP. once that revision is accepted by the community, then you would create the associated context folder.

helge · March 2, 2023, 9:41am

I’ve updated the FEP document: fep/fep-2e40.md at main - fep - Codeberg.org

trwnh · March 2, 2023, 8:21pm

proposed changes:

i think it’s weird that we’re using /feps/assets/fep-2e40/namespace.json instead of something top-level like /contexts/fep/context.jsonld and /contexts/fep/xxxx/context.jsonld. if this needs to be a revision to a4ed then that’s fine.
defining "fep": "https://w3id.org/fep#" is suboptimal and we should instead define "fep": "https://w3id.org/fep/" to allow for cleaner usage and better modularity as more FEPs are proposed. this would prevent conflicts in editing more cleanly and naturally than specifying a MUST clause about not editing other people’s terms.
fep-xxxx-term is really awkward; incidentally, the dashes look really out-of-place with language ecosystem standards. more saliently, we should define it as term directly, but define it in an FEP-specific context file that gets compiled into the “common” one once accepted and elevated. the compact IRI fep:xxxx/term or fep:xxxx#term can be used to refer to the term before acceptance and/or elevation, but only if we take the above changes as well, since having a singular “fep” resource and using #fragment at the base IRI level prevents us from doing this.

once again, the proposed w3id mapping would look something like this:

https://w3id.org/fep$1
- Accept: application/ld+json => https://codeberg.org/fediverse/fep/contexts/fep$1/context.jsonld
- Accept: */* => https://codeberg.org/fediverse/fep/contexts/fep$1

this way you don’t have to keep changing the w3id redirect mapping either.

if $1 is blank, then you get taken to the common context file or to the root of the base folder /contexts/fep.
the compact iri fep:xxxx auto-expands to https://w3id.org/fep/xxxx which gets redirected to https://codeberg.org/fediverse/fep/contexts/fep/xxxx by default and https://codeberg.org/fediverse/fep/contexts/fep/xxxx/context.jsonld if you specify the jsonld accept header
the compact iri fep:xxxx#term auto-expands to https://w3id.org/fep/xxxx#term which gets redirected to https://codeberg.org/fediverse/fep/contexts/fep/xxxx#term by default and https://codeberg.org/fediverse/fep/contexts/fep/xxxx/context.jsonld#term if you specify the jsonld accept header (i think this is how fragments work by default?)
to support compact iris of the form fep:xxxx/term (which auto-expands to https://w3id.org/fep/xxxx/term), you would need a new rule to handle something like https://w3id.org/fep/$1/$2 and map it to something like https://codeberg.org/fediverse/fep/contexts/fep/$1/$2 for the default case (and have some text file there), and for the jsonld case you would maybe map it to something like https://codeberg.org/fediverse/fep/contexts/fep/$1/context.jsonld#$2, but i don’t think you can really do this easily, and i don’t know enough about Apache’s .htaccess rule syntax to make it work. it’s probably doable, but it might be easier to just recommend (or require?) FEPs to use fep:xxxx#term instead of fep:xxxx/term. it might be a good idea to actually test how the redirects work in practice, first.
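The proposed mapping can be modelled in a few lines. This is a sketch of the intended end result only, not of the actual w3id `.htaccess` rules; the `expand_fep` and `redirect` helpers are hypothetical. One assumption is made explicit in the comments: the fragment is never sent to the server, and the client re-applies it to the redirect target when the `Location` header carries no fragment of its own.

```python
W3ID_BASE = "https://w3id.org/fep"
TARGET_BASE = "https://codeberg.org/fediverse/fep/contexts/fep"


def expand_fep(compact: str) -> str:
    """Expand 'fep:...' using the proposed prefix 'https://w3id.org/fep/'."""
    prefix, _, suffix = compact.partition(":")
    if prefix != "fep":
        raise ValueError(compact)
    return W3ID_BASE + "/" + suffix


def redirect(url: str, accept: str = "*/*") -> str:
    """Model where a client ends up after the proposed https://w3id.org/fep$1 rule."""
    if not url.startswith(W3ID_BASE):
        raise ValueError(url)
    rest = url[len(W3ID_BASE):]  # this is $1, e.g. "/xxxx#term" or ""
    # The server only sees the path; the client keeps the fragment and
    # re-attaches it after following the redirect.
    path, sep, fragment = rest.partition("#")
    target = TARGET_BASE + path
    if accept == "application/ld+json":
        target += "/context.jsonld"
    return target + sep + fragment
```

For example, `redirect(expand_fep("fep:xxxx#term"), "application/ld+json")` yields the `.../fep/xxxx/context.jsonld#term` location described above, and an empty `$1` lands on the common context. The `fep:xxxx/term` form is deliberately not handled, since it would need the extra rule discussed in the last point.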

helge · March 3, 2023, 12:32pm

Changing a4ed creates too much friction to make 2e40 worthwhile for me.

"https://w3id.org/fep#" both simpler in usage and used everywhere else. I’m not introducing a new usage pattern. Again the friction created would make 2e40 not a worthwhile goal to pursue. 2e40 is a stopgap measure, to bring some order to the
“@context” extension process.

I have no idea how this would be implemented using json-ld. The first dash is there, because that’s how the FEPs are called. The second dash is there, because it cannot be a colon due to 9.1 Terms.

In the interest of keeping things simple

If the stuff about “secondary vocabularies” is confusing, I’m fine with dropping it. Including it feels a bit like reading tea leaves.

trwnh · March 3, 2023, 1:09pm

where is “everywhere else”, and how is it “simpler”? what’s “new” about it? i’m not sure i understand the objections here. i made the above proposed changes specifically in order to avoid conflicts. it seems worthwhile to do this in a robust way that doesn’t incur future problems…

for inspiration, look at the schema dot org context file, and how they assign URIs. they don’t do it like https://schema.org#Person, they do it like https://schema.org/Person. it’s a very clean way of routing things. there’s no good reason to prefer the former, and in fact, there are reasons to avoid using the former. this goes back to the much earlier points about FEPs being self-contained and, in effect, each FEP is its own “spec”. there is no authoritative unity in the FEP process. this is why i recommend each FEP maintain its own context and vocab definitions as-needed – promotion to “common” status becomes as simple as copying the term definition from the FEP context to the common context.

helge · March 3, 2023, 1:52pm

Schema.org is awful to use in my eyes. I get a headache trying to find anything on it.

The effort necessary to decide if one can achieve what I propose with any other pattern than

"fep": "https://w3id.org/fep#"

is large. I don’t know how I would even go about adapting my FEP to the pattern you propose. So if you want another pattern, you will have to propose an alternative FEP.

Setting up the w3id pull request can wait …

melvincarvalho · May 6, 2023, 8:20am

Some thoughts:

https://w3id.org/fep# is a good idea
A vocab in Linked Data is a machine readable description of a certain URI. The context is syntactic sugar that saves typing. The human readable thing doesn’t really have a name, I suppose it’s a spec. There is much confusion where any or all of these things are called vocabs, and you also have redirects and content negotiation. Let’s loosely remember the vocabs are machine readable.
You could also have more than one vocab under the w3id/fep namespace fep/cooking fep/movies etc. But it seems you want just 1, at least to start with
Something quick i knocked up in under a day is ASX Vocabulary, this is just food for thought, ignore the terms, they are dummy for now, but we could fill in proper terms
COOL URIs don’t change is an important best practice. You shouldn’t define terms and then change them later, or promote them, because that breaks inbound links, links in random places, a link written on a napkin. You should create links and then annotate them, which is what vocabs are about. Don’t delete links, add new ones.
More than one vocab or extension system can work together.
Probably best to avoid content addressable stuff and URNs until we at least can get http working with a nice proof of concept. I suggest something like urn:json:foo in the future for so-called ‘unregistered’ terms, but this can be set in the @vocab
This tool will help you do the .htaccess: https://htaccess.madewithlove.com/. If you put the proposal in the FEP it might help

That’s all for now. How about we start prototyping this with real terms to see how it looks? I could set up a fep repo in ontologies · GitHub if you like, but do you have an example of some terms that would be useful?

helge · June 15, 2023, 5:58am

I’ve been recently reminded of the lofty goals of this FEP:

This FEP will: Create a FEP Vocabulary based on identified best practices

I think when writing this (a long time ago), I was mainly interested in linking context and documentation, see here. I think I can contribute a few more things to the list, that should probably be included.

While I don’t think MultiKey gets the details right, it provides an excellent example of how to define an object. If you look at the definition, you see that only a short list of properties is allowed. This pattern allows one to serialize complicated objects, while obtaining some level of type safety.
Instead of introducing a new value for “@type” for every type of something, it is much more convenient to have a dedicated “subtype” value. See DataIntegrityProof. Here the "type": "DataIntegrityProof" and cryptoSuite defines the subtype. The alternative can be viewed here. One ends up with a lot of “classes”. This should be avoided as a “vocabulary” should be short and to the point.
I don’t think the pattern of “contentMap” is good. Specifying languages seems more natural as done here. See the following code block. That pattern is much more convenient when specifying a single language. By setting "contentMap": null in a context, one can actually force the following representation.

{
  "@context": "https://www.w3.org/ns/activitystreams",
  "content": [{
    "@language": "en",
    "@value": "mooooo"
  },{
    "@language": "de",
    "@value": "muuuuuh"
  }]
}

I personally think that the pattern of including as:Public in the recipients instead of using a boolean flag is preferable. I would claim that avoiding dictionaries with lots of boolean flags is something every software developer learns at some point. So I’m not sure. My comment on contentMap goes in the same direction. A list of key, value objects is generally preferable to a dictionary.
Not sure about this one: Expanding the vocabulary should have a certain amount of friction. It’s along the line of wanting a short vocabulary, and wanting to combine subtypes into a supertype (see DataIntegrityProof). Making it hard to add terms might be a way to encourage this.
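As an illustration of the contentMap point, the two shapes can be converted mechanically. This is a small sketch under the assumption that the input is a plain language-tag-to-string dictionary, as contentMap is in ActivityStreams; the helper name is hypothetical.

```python
def content_map_to_values(content_map: dict) -> list:
    """Turn {"en": "..."} into a list of {"@language": ..., "@value": ...} objects."""
    return [
        {"@language": language, "@value": text}
        for language, text in content_map.items()
    ]


# The document from the example above, rebuilt from a contentMap-style dict.
document = {
    "@context": "https://www.w3.org/ns/activitystreams",
    "content": content_map_to_values({"en": "mooooo", "de": "muuuuuh"}),
}
```

The list form keeps each language and value together in one object, which is the "list of key, value objects" shape preferred above.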

Does anybody have any additions for this list of best practices? I’ll try to read through the linked-data discussions in this forum before making an update to fep-2e40. There is probably more good stuff to include.

to be included above:

fep-fffd provides an example of not introducing new terms and reusing old ones.
I’m still unclear what a good answer to the question from Vocabulary vs Extension is. I get the feeling that having 2 vocabularies, FEP dealing with transport as ActivityPub does, and a second one for domain specific stuff might be a good model. Of course, this view of ActivityPub as a “transport layer” is deeply shaded by my vision of a future Fediverse.
This picks up what @silverpill said here. Using Gherkin as a syntax to describe additions to the vocabulary, including examples, might work.

Scenario: 
When "eventSource" as "fep:eventSource" is added to the FEP context
And its description is
  """
  The term eventSource is to be used as part of the endpoints of an [ActivityPub] Actor. It specifies an endpoint, where the Client can receive push notifications using the Server-Sent Events protocol of activities being added to collections on the server. By default the inbox collection of the Actor is used. By specifying the X-ActivityPub-Collection header a different collection can be specified to retrieve push notifications from.
  """
Then the following example is a valid document
  """
  {
   "@context": "https://www.w3id.org/fep",
   "type": "Person",
   "id": "https://example.com/client_actor",
   "inbox": "https://example.com/client_actor/inbox",
   "outbox": "https://example.com/client_actor/outbox",
   "preferredUsername": "actor",
   "endpoints": {
     "proxyUrl": "https://example.com/client_actor/proxyUrl",
     "eventSource": "https://example.com/client_actor/eventSource"
   }
  }
  """

This should even enable one to automatically generate the documentation + context with minimal parsing effort as Gherkin + test runner do it.
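A minimal sketch of the context half of that idea, assuming step text shaped exactly like the scenario above. The regular expression and the function name are illustrative and are not taken from any existing tool; a real setup would hook into a Gherkin test runner rather than scan raw text.

```python
import re

# Matches steps of the form: When "<term>" as "<iri>" is added to the FEP context
STEP = re.compile(
    r'When "(?P<term>[^"]+)" as "(?P<iri>[^"]+)" is added to the FEP context'
)


def context_from_feature(text: str) -> dict:
    """Collect term -> IRI pairs from every matching step in a feature file."""
    return {match["term"]: match["iri"] for match in STEP.finditer(text)}


feature = '''Scenario:
When "eventSource" as "fep:eventSource" is added to the FEP context
'''
context = context_from_feature(feature)
```

Here `context` ends up as `{"eventSource": "fep:eventSource"}`, which could then be written out under an `@context` key.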

helge · June 17, 2023, 10:47am

I’ve built a first version of automation. See here.

The inputs are Gherkin files such as fediverse-vocabulary/fep-c390.feature at main - fediverse-vocabulary - Codeberg.org

Resulting context: https://helge.codeberg.page/fediverse-vocabulary/vocabulary.json
Resulting documentation:

https://helge.codeberg.page/fediverse-vocabulary/docs.html

One should note that the deep links such as https://helge.codeberg.page/fediverse-vocabulary/docs.html#VerifiableIdentityStatement already work.

I believe that one could probably fully automate the generation. So the process of adding new terms/types would be to make a pull request. Automated processes then ensure quality, i.e.

terms are camel case
types are pascal case
no conflicts
examples using everything new
the vocabulary is compatible with the old examples
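The first three checks are simple enough to sketch. This is an assumption-laden illustration, not the actual automation: the function name, the regular expressions, and the flat set of existing names are all hypothetical, and the two example-based checks are left out because they need a JSON-LD processor.

```python
import re

CAMEL_CASE = re.compile(r"^[a-z][a-zA-Z0-9]*$")   # e.g. eventSource
PASCAL_CASE = re.compile(r"^[A-Z][a-zA-Z0-9]*$")  # e.g. VerifiableIdentityStatement


def check_addition(existing: set, terms: list, types: list) -> list:
    """Return a list of human-readable problems; an empty list means the checks pass."""
    problems = []
    for term in terms:
        if not CAMEL_CASE.match(term):
            problems.append(f"term {term!r} is not camel case")
    for type_name in types:
        if not PASCAL_CASE.match(type_name):
            problems.append(f"type {type_name!r} is not pascal case")
    for name in terms + types:
        if name in existing:
            problems.append(f"{name!r} conflicts with an existing entry")
    return problems
```

A pull request would then be mergeable only when `check_addition` returns an empty list for the names it introduces.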

If no problems are found, the pull request is merged, and new vocabulary/documentation is generated. All community members in good standing can create pull requests. This also means that this process is no longer tied into FEP.

stevebate · June 18, 2023, 10:30am

As metadata for context generation, this makes sense. As a test case, I’m not so sure. Is the documentation really required for the test definition? I’m guessing the associated step doesn’t do anything with that information. Is it possible for the documentation to be represented as some kind of test metadata?