FEP-2e40: The FEP Vocabulary Extension Process

helge · June 15, 2023, 5:58am

I’ve been recently reminded of the lofty goals of this FEP:

This FEP will: Create a FEP Vocabulary based on identified best practices

I think when writing this (a long time ago), I was mainly interested in linking context and documentation, see here. I think I can contribute a few more things to the list, that should probably be included.

While I don’t think MultiKey gets the details right, it provides an excellent example of how to define an object. If you look at the definition, you see that only a short list of properties is allowed. This pattern allows one to serialize complicated objects, while obtaining some level of type safety.
Instead of introducing a new value for “@type” for every type of something, it is much more convenient to have a dedicated “subtype” value. See DataIntegrityProof. Here the "type": "DataIntegrityProof" and cryptoSuite defines the subtype. The alternative can be viewed here. One ends up with a lot of “classes”. This should be avoided as a “vocabulary” should be short and to the point.
I don’t think the pattern of “contentMap” is good. Specifying languages seems more natural as done here. See the following code block. That pattern much more convenient when specifying a single language. By setting "contentMap": null in a context, one can actually force the following representation.

{
  "@context": "https://www.w3.org/ns/activitystreams",
  "content": [{
    "@language": "en",
    "@value": "mooooo"
  },{
    "@language": "de",
    "@value": "muuuuuh"
  }]
}

I personally think that the pattern of including as:Public in the recipients instead of using a boolean flag is preferable. I would claim that avoiding dictionaries with lots of boolean flags is something every software developer learns at some point. So I’m not sure. My comment of contentMap kind of goes into the same direction. A list of key, value objects is generally preferable to a dictionary.
Not sure about this one: Expanding the vocabulary should have a certain amount of friction. It’s along the line of wanting a short vocabulary, and wanting to combine subtypes into a supertype (see DataIntegrityProof). Making adding terms hard, might be a way to encourage this.

Anybody any additions for this list of best practices? I’ll try to read through the linked-data discussions in this forum before making an update to fep-2e40. There is probably more good stuff to include.

to included above:

fep-fffd provides an example of not introducing new terms and reusing old ones.
I’m still unclear what a good answer to the question from Vocabulary vs Extension is. I get the feeling that having 2 vocabularies, FEP dealing with transport as ActivityPub does, and a second one for domain specific stuff might be a good model. Of course, this view of ActivityPub as a “transport layer” is deeply shaded by my vision of a future Fediverse.
This picks up what @silverpill said here. Using Gherkin as a syntax to describe additions to the vocabulary, including examples might work.

Scenario: 
When "eventSource" as "fep:eventSource" is added to the FEP context
And its description is
  """
  The term eventSource is to be as part of the endpoints of an [ActivityPub] Actor. It specifies an endpoint, where the Client can receive push notifications using the Server Side Events protocol of activities being added to collections on the server. By default the inbox collection of the Actor is used. By specifying the X-ActivityPub-Collection header a different collection can be specified to retrieve push notifications from.
  """
Then the following example is a valid document
  """
  {
   "@context": "https://www.w3id.org/fep",
   "type": "Person",
   "id": "https://example.com/client_actor",
   "inbox": "https://example.com/client_actor/inbox",
   "outbox": "https://example.com/client_actor/outbox",
   "preferredUsername": "actor",
   "endpoints": {
     "proxyUrl": "https://example.com/client_actor/proxyUrl",
     "eventSource": "https://example.com/client_actor/eventSource"
   }
  }
  """

This should even enable one to automatically generate the documentation + context with minimal parsing effort as Gherkin + test runner do it.