FEP-1570: The FEP Ontology Process

Great, thx! Yes, I’m also spending time I don’t have… I’ll let it rest for a bit and revisit later to review docs.

Merged the latest changes.

2 posts were merged into an existing topic: FEP-a4ed: The Fediverse Enhancement Proposal Process

@helge Per this issue, it seems best to move the namespaces folder into feps/assets/fep-xxxx to maintain a clean top-level FEP filesystem. If you can envision multiple namespaces in the future then feps/namespaces/fep-xxxx could also work.

Hi Weex. Is it okay if I wait to move the folder with the next draft?

Thanks!

if top-level folders are a concern, then i think it would be better to have the top-level folder and make this a revision to a4ed, not a separate FEP. once that revision is accepted by the community, then you would create the associated context folder.

I’ve updated the FEP document: fep/fep-2e40.md at main - fep - Codeberg.org

proposed changes:

  • i think it’s weird that we’re using /feps/assets/fep-2e40/namespace.json instead of something top-level like /contexts/fep/context.jsonld and /contexts/fep/xxxx/context.jsonld. if this needs to be a revision to a4ed then that’s fine.

  • defining "fep": "https://w3id.org/fep#" is suboptimal and we should instead define "fep": "https://w3id.org/fep/" to allow for cleaner usage and better modularity as more FEPs are proposed. this would prevent conflicts in editing more cleanly and naturally than specifying a MUST clause about not editing other people’s terms.

  • fep-xxxx-term is really awkward; incidentally, the dashes look really out-of-place with language ecosystem standards. more saliently, we should define it as term directly, but define it in an FEP-specific context file that gets compiled into the “common” one once accepted and elevated. the compact IRI fep:xxxx/term or fep:xxxx#term can be used to refer to the term before acceptance and/or elevation, but only if we take the above changes as well, since having a singular “fep” resource and using #fragment at the base IRI level prevents us from doing this.
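
To make the difference between the two prefix definitions concrete, here is a minimal sketch of compact IRI expansion. It assumes the simple case where JSON-LD expands `prefix:suffix` by concatenating the prefix's IRI with the suffix; the function name and the two context dictionaries are hypothetical and only illustrate the point above.

```python
def expand_compact_iri(value: str, context: dict[str, str]) -> str:
    """Expand prefix:suffix by plain concatenation of prefix IRI and suffix."""
    prefix, sep, suffix = value.partition(":")
    # Leave the value alone if there is no known prefix or it is already an absolute IRI.
    if not sep or prefix not in context or suffix.startswith("//"):
        return value
    return context[prefix] + suffix


# The two candidate prefix definitions discussed above.
hash_context = {"fep": "https://w3id.org/fep#"}
slash_context = {"fep": "https://w3id.org/fep/"}

if __name__ == "__main__":
    for compact in ("fep:xxxx#term", "fep:xxxx/term"):
        print(compact, "->", expand_compact_iri(compact, slash_context))
    # With the hash-based prefix, the per-FEP path ends up inside the fragment.
    print("fep:xxxx/term ->", expand_compact_iri("fep:xxxx/term", hash_context))
```

With the slash-based prefix each FEP gets its own resource (`https://w3id.org/fep/xxxx`), while the hash-based prefix puts everything into fragments of the single `https://w3id.org/fep` resource.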

once again, the proposed w3id mapping would look something like this:

  • https://w3id.org/fep$1
    • Accept: application/ld+json => https://codeberg.org/fediverse/fep/contexts/fep$1/context.jsonld
    • Accept: */* => https://codeberg.org/fediverse/fep/contexts/fep$1

this way you don’t have to keep changing the w3id redirect mapping either.

  • if $1 is blank, then you get taken to the common context file or to the root of the base folder /contexts/fep.
  • the compact iri fep:xxxx auto-expands to https://w3id.org/fep/xxxx which gets redirected to https://codeberg.org/fediverse/fep/contexts/fep/xxxx by default and https://codeberg.org/fediverse/fep/contexts/fep/xxxx/context.jsonld if you specify the jsonld accept header
  • the compact iri fep:xxxx#term auto-expands to https://w3id.org/fep/xxxx#term which gets redirected to https://codeberg.org/fediverse/fep/contexts/fep/xxxx#term by default and https://codeberg.org/fediverse/fep/contexts/fep/xxxx/context.jsonld#term if you specify the jsonld accept header (i think this is how fragments work by default?)
  • to support compact iris of the form fep:xxxx/term (which auto-expands to https://w3id.org/fep/xxxx/term), you would need a new rule to handle something like https://w3id.org/fep/$1/$2 and map it to something like https://codeberg.org/fediverse/fep/contexts/fep/$1/$2 for the default case (and have some text file there), and for the jsonld case you would maybe map it to something like https://codeberg.org/fediverse/fep/contexts/fep/$1/context.jsonld#$2, but i don’t think you can really do this easily, and i don’t know enough about Apache’s .htaccess rule syntax to make it work. it’s probably doable, but it might be easier to just recommend (or require?) FEPs to use fep:xxxx#term instead of fep:xxxx/term. it might be a good idea to actually test how the redirects work in practice, first.
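
The mapping described in the bullets above can be sketched as a plain function. This is not .htaccess syntax and not a tested w3id rule set, just the intended routing logic; the two-segment branch is the hypothetical extra rule for `fep:xxxx/term`, and the function name is made up for illustration. As far as HTTP goes, the `#term` fragment is never sent to the server and clients normally re-apply it to the redirect target, so the fragment case should not need a rule of its own.

```python
BASE = "https://codeberg.org/fediverse/fep/contexts/fep"


def redirect_target(path: str, accept: str = "*/*") -> str:
    """Map the part of the URL after https://w3id.org/fep to a Codeberg URL."""
    segments = [segment for segment in path.split("/") if segment]
    wants_jsonld = "application/ld+json" in accept

    if len(segments) == 2:
        # Hypothetical extra rule for fep:xxxx/term, i.e. /fep/$1/$2.
        fep_id, term = segments
        if wants_jsonld:
            return f"{BASE}/{fep_id}/context.jsonld#{term}"
        return f"{BASE}/{fep_id}/{term}"

    # Base rule: /fep$1, where $1 is empty or "/xxxx".
    suffix = "".join(f"/{segment}" for segment in segments)
    if wants_jsonld:
        return f"{BASE}{suffix}/context.jsonld"
    return f"{BASE}{suffix}"


if __name__ == "__main__":
    for path in ("", "/xxxx", "/xxxx/term"):
        print(repr(path), "->", redirect_target(path))
        print(repr(path), "(ld+json) ->", redirect_target(path, "application/ld+json"))
```

Testing the real redirects, as suggested above, would still be needed to confirm how w3id and Codeberg behave in practice.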

Changing a4ed creates too much friction to make 2e40 worthwhile for me.

"https://w3id.org/fep#" is both simpler in usage and used everywhere else. I’m not introducing a new usage pattern. Again, the friction created would make 2e40 not a worthwhile goal to pursue. 2e40 is a stopgap measure to bring some order to the “@context” extension process.

I have no idea how this would be implemented using json-ld. The first dash is there, because that’s how the FEPs are called. The second dash is there, because it cannot be a colon due to 9.1 Terms.

In the interest of keeping things simple

If the stuff about “secondary vocabularies” is confusing, I’m fine with dropping it. Including it feels a bit like reading tea leaves.

where is “everywhere else”, and how is it “simpler”? what’s “new” about it? i’m not sure i understand the objections here. i made the above proposed changes specifically in order to avoid conflicts. it seems worthwhile to do this in a robust way that doesn’t incur future problems…

for inspiration, look at the schema dot org context file, and how they assign URIs. they don’t do it like https://schema.org#Person, they do it like https://schema.org/Person. it’s a very clean way of routing things. there’s no good reason to prefer the former, and in fact, there are reasons to avoid using the former. this goes back to the much earlier points about FEPs being self-contained and, in effect, each FEP is its own “spec”. there is no authoritative unity in the FEP process. this is why i recommend each FEP maintain its own context and vocab definitions as-needed – promotion to “common” status becomes as simple as copying the term definition from the FEP context to the common context.

Schema.org is awful to use in my eyes. I get a headache trying to find anything on it.

The effort necessary to decide if one can achieve what I propose with any other pattern than

"fep": "https://w3id.org/fep#"

is large. I don’t know how I would even go about adapting my FEP to the pattern you propose. So if you want another pattern, you will have to propose an alternative FEP.

Setting up the w3id pull request can wait …

Some thoughts:

  • https://w3id.org/fep# is a good idea

  • A vocab in Linked Data is a machine readable description of a certain URI. The context is syntactic sugar that saves typing. The human readable thing doesn’t really have a name, I suppose it’s a spec. There is much confusion where any or all of these things are called vocabs, and you also have redirects and content negotiation. Let’s loosely remember the vocabs are machine readable.

  • You could also have more than one vocab under the w3id/fep namespace: fep/cooking, fep/movies, etc. But it seems you want just 1, at least to start with

  • Something quick I knocked up in under a day is ASX Vocabulary. This is just food for thought; ignore the terms, they are dummies for now, but we could fill in proper terms

  • “Cool URIs don’t change” is an important best practice. You shouldn’t define terms and then change them later, or promote them, because that breaks inbound links, links in random places, a link written on a napkin. You should create links and then annotate them, which is what vocabs are about. Don’t delete links, add new ones.

  • More than one vocab or extension system can work together.

  • Probably best to avoid content addressable stuff and URNs until we at least can get http working with a nice proof of concept. I suggest something like urn:json:foo in the future for so-called ‘unregistered’ terms, but this can be set in the @vocab

  • This tool will help you do the .htaccess: https://htaccess.madewithlove.com/ . If you put the proposal in the FEP, it might help

That’s all for now. How about we start prototyping this with real terms to see how it looks? I could set up a fep repo in ontologies · GitHub if you like, but do you have an example of some terms that would be useful?

I’ve been recently reminded of the lofty goals of this FEP:

This FEP will: Create a FEP Vocabulary based on identified best practices

I think when writing this (a long time ago), I was mainly interested in linking context and documentation, see here. I think I can contribute a few more things to the list, that should probably be included.

  • While I don’t think MultiKey gets the details right, it provides an excellent example of how to define an object. If you look at the definition, you see that only a short list of properties is allowed. This pattern allows one to serialize complicated objects, while obtaining some level of type safety.
  • Instead of introducing a new value for “@type” for every type of something, it is much more convenient to have a dedicated “subtype” value. See DataIntegrityProof. Here "type": "DataIntegrityProof" stays fixed and cryptosuite defines the subtype. The alternative can be viewed here. One ends up with a lot of “classes”. This should be avoided as a “vocabulary” should be short and to the point.
  • I don’t think the pattern of “contentMap” is good. Specifying languages seems more natural as done here. See the following code block. That pattern is much more convenient when specifying a single language. By setting "contentMap": null in a context, one can actually force the following representation.
{
  "@context": "https://www.w3.org/ns/activitystreams",
  "content": [{
    "@language": "en",
    "@value": "mooooo"
  },{
    "@language": "de",
    "@value": "muuuuuh"
  }]
}
  • I personally think that the pattern of including as:Public in the recipients instead of using a boolean flag is preferable. I would claim that avoiding dictionaries with lots of boolean flags is something every software developer learns at some point. So I’m not sure. My comment on contentMap goes in the same direction. A list of key, value objects is generally preferable to a dictionary.

  • Not sure about this one: Expanding the vocabulary should have a certain amount of friction. It’s along the line of wanting a short vocabulary, and wanting to combine subtypes into a supertype (see DataIntegrityProof). Making adding terms hard, might be a way to encourage this.
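
To illustrate the contentMap point from the list above, here is a minimal sketch that turns a `contentMap` dictionary into the list of language-tagged values shown in the JSON example. The function name is hypothetical, and this is plain dictionary manipulation, not something a JSON-LD processor does for you.

```python
def content_map_to_values(content_map: dict[str, str]) -> list[dict[str, str]]:
    """Turn {"en": "..."} into [{"@language": "en", "@value": "..."}]."""
    return [
        {"@language": language, "@value": text}
        for language, text in content_map.items()
    ]


if __name__ == "__main__":
    note = {"contentMap": {"en": "mooooo", "de": "muuuuuh"}}
    print(content_map_to_values(note["contentMap"]))
```

A single-language post then becomes a one-element list (or a single object) instead of a dictionary with one key.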

Does anybody have additions to this list of best practices? I’ll try to read through the linked-data discussions in this forum before making an update to fep-2e40. There is probably more good stuff to include.

To be included above:

  • fep-fffd provides an example of not introducing new terms and reusing old ones.
  • I’m still unclear what a good answer to the question from Vocabulary vs Extension is. I get the feeling that having 2 vocabularies, FEP dealing with transport as ActivityPub does, and a second one for domain specific stuff might be a good model. Of course, this view of ActivityPub as a “transport layer” is deeply shaded by my vision of a future Fediverse.
  • This picks up what @silverpill said here. Using Gherkin as a syntax to describe additions to the vocabulary, including examples might work.
Scenario: 
When "eventSource" as "fep:eventSource" is added to the FEP context
And its description is
  """
  The term eventSource is to be used as part of the endpoints of an [ActivityPub] Actor. It specifies an endpoint where the Client can receive push notifications, using the Server-Sent Events protocol, of activities being added to collections on the server. By default the inbox collection of the Actor is used. By specifying the X-ActivityPub-Collection header, a different collection can be specified to retrieve push notifications from.
  """
Then the following example is a valid document
  """
  {
   "@context": "https://www.w3id.org/fep",
   "type": "Person",
   "id": "https://example.com/client_actor",
   "inbox": "https://example.com/client_actor/inbox",
   "outbox": "https://example.com/client_actor/outbox",
   "preferredUsername": "actor",
   "endpoints": {
     "proxyUrl": "https://example.com/client_actor/proxyUrl",
     "eventSource": "https://example.com/client_actor/eventSource"
   }
  }
  """

This should even enable one to automatically generate the documentation + context with minimal parsing effort as Gherkin + test runner do it.
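
As a rough illustration of the "minimal parsing effort" claim, the sketch below extracts term definitions from the `When "…" as "…" is added to the FEP context` step and collects them into a context dictionary. The regular expression and function name are assumptions based only on the example scenario above; a real setup would more likely hook into a Gherkin test runner.

```python
import re

# Matches the step wording used in the example scenario above.
STEP = re.compile(
    r'^When "(?P<term>[^"]+)" as "(?P<iri>[^"]+)" is added to the FEP context$'
)


def context_from_feature(feature_text: str) -> dict[str, str]:
    """Collect term -> IRI pairs from all matching steps in a feature file."""
    context: dict[str, str] = {}
    for line in feature_text.splitlines():
        match = STEP.match(line.strip())
        if match:
            context[match["term"]] = match["iri"]
    return context


if __name__ == "__main__":
    feature = (
        "Scenario:\n"
        'When "eventSource" as "fep:eventSource" is added to the FEP context\n'
        "And its description is\n"
    )
    print({"@context": context_from_feature(feature)})
```

The description and example blocks could be picked up the same way to generate the documentation.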

I’ve built a first version of the automation. See here.

The inputs are Gherkin files such as fediverse-vocabulary/fep-c390.feature at main - fediverse-vocabulary - Codeberg.org

Resulting context: https://helge.codeberg.page/fediverse-vocabulary/vocabulary.json
Resulting documentation:

https://helge.codeberg.page/fediverse-vocabulary/docs.html

One should note that the deep links such as https://helge.codeberg.page/fediverse-vocabulary/docs.html#VerifiableIdentityStatement already work.

I believe that one could probably fully automate the generation. So the process of adding new terms/types would be to make a pull request. Automated processes then ensure quality, i.e.

  • terms are camel case
  • types are pascal case
  • no conflicts
  • examples using everything new
  • the vocabulary is compatible with the old examples

If no problems are found, the pull request is merged, and new vocabulary/documentation is generated. All community members in good standing can create pull requests. This also means that this process is no longer tied into FEP.
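
A minimal sketch of what the first three automated checks could look like, assuming the existing vocabulary and the additions are available as simple name-to-IRI dictionaries. The function name, the data layout, and the example IRIs are hypothetical; the example-based checks from the list are not covered here.

```python
import re

CAMEL_CASE = re.compile(r"^[a-z][a-zA-Z0-9]*$")
PASCAL_CASE = re.compile(r"^[A-Z][a-zA-Z0-9]*$")


def check_additions(
    existing: dict[str, str],
    new_terms: dict[str, str],
    new_types: dict[str, str],
) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for name in new_terms:
        if not CAMEL_CASE.match(name):
            problems.append(f"term {name!r} is not camel case")
    for name in new_types:
        if not PASCAL_CASE.match(name):
            problems.append(f"type {name!r} is not pascal case")
    # A conflict is a name that already exists with a different IRI.
    for name, iri in {**new_terms, **new_types}.items():
        if name in existing and existing[name] != iri:
            problems.append(f"{name!r} conflicts with {existing[name]!r}")
    return problems


if __name__ == "__main__":
    existing = {"eventSource": "fep:eventSource"}
    print(check_additions(existing, {"EventSource": "fep:other"}, {}))
```

A pull request check would then fail whenever the returned list is non-empty.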

As metadata for context generation, this makes sense. As a test case, I’m not so sure. Is the documentation really required for the test definition? I’m guessing the associated step doesn’t do anything with that information. Is it possible for the documentation to be represented as some kind of test metadata?

After taking another look at how the feature is written, I agree it needs a rework. Not exactly sure where it needs to end up. The current format is also lacking the ability to define something like MultiKey did:

{
  "@context": {
    "id": "@id",
    "type": "@type",
    "@protected": true,
    "Multikey": {
      "@id": "https://w3id.org/security#Multikey",
      "@context": {
        "@protected": true,
        "id": "@id",
        "type": "@type",
        "controller": {
          "@id": "https://w3id.org/security#controller",
          "@type": "@id"
        },
        "revoked": {
          "@id": "https://w3id.org/security#revoked",
          "@type": "http://www.w3.org/2001/XMLSchema#dateTime"
        },
        "publicKeyMultibase": {
          "@id": "https://w3id.org/security#publicKeyMultibase",
          "@type": "https://w3id.org/security#multibase"
        },
        "secretKeyMultibase": {
          "@id": "https://w3id.org/security#secretKeyMultibase",
          "@type": "https://w3id.org/security#multibase"
        }
      }
    }
  }
}

This would lead to some level of type safety being enforced through json-ld. This is certainly not useful for something that varies a lot, e.g. a Note, but I think is quite appropriate for cryptographic material.
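
To show what "some level of type safety" could mean in practice, here is a rough sketch that lists the properties of an object which the scoped Multikey context above does not define. It is not a JSON-LD processor: it only compares keys, and it assumes the Multikey context is the only one in effect (with other contexts or an `@vocab` present, more terms would be defined). The dictionary is a shortened copy of the scoped context and the function name is hypothetical.

```python
# Term -> IRI pairs from the scoped context of Multikey shown above.
MULTIKEY_SCOPED_CONTEXT = {
    "@protected": True,
    "id": "@id",
    "type": "@type",
    "controller": "https://w3id.org/security#controller",
    "revoked": "https://w3id.org/security#revoked",
    "publicKeyMultibase": "https://w3id.org/security#publicKeyMultibase",
    "secretKeyMultibase": "https://w3id.org/security#secretKeyMultibase",
}


def undefined_properties(obj: dict, scoped_context: dict) -> list[str]:
    """Return the keys of obj that the scoped context does not define."""
    allowed = {key for key in scoped_context if not key.startswith("@")}
    return sorted(
        key for key in obj if not key.startswith("@") and key not in allowed
    )


if __name__ == "__main__":
    key = {
        "id": "https://example.com/actor#key",
        "type": "Multikey",
        "controller": "https://example.com/actor",
        "publicKeyMultibase": "z...",  # placeholder, not a real key
        "owner": "https://example.com/actor",  # not part of the context
    }
    print(undefined_properties(key, MULTIKEY_SCOPED_CONTEXT))
```

Under the stated assumption, a property like `owner` has no definition and would not survive JSON-LD expansion, which is the kind of restriction described above.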

💯 I think these are both really good points. I think the current version of FEP-2e40 is over-engineered. I completely ignored that having other processes to create vocabularies might be a possibility when writing it.

I’ve reworked the file format a bit, the current result can be seen at

In order to test it, I added the commonly used terms from the security vocabulary to it (see here), and I can say that the tests work to find mistakes. At least the mistakes I tend to make.

I’ve now updated the index page. Most things about references now work as intended:

https://helge.codeberg.page/fediverse-vocabulary/

Basically, what remains is to update the FEP with the new format and add purls for the context file and documentation.

I’m withdrawing fep-2e40 as it is no longer compatible with the FEP repository model …

Do you have any plans for fediverse-vocabulary? I think such a resource could be valuable to the community.

There are a few problems with the approach:

From a technical standpoint: The examples as shown are incompatible with Mastodon. One would need to use:

{
  "@context": [
    "https://purl.archive.org/fedi-2023/context.json",
    "https://www.w3.org/ns/activitystreams"
  ],
  "id": "https://example.com/note/1",
  "type": "Note",
  "content": "This is a note containing a #tag",
  "tag": {
    "type": "Hashtag",
    "name": "#tag",
    "href": "https://example.com/hashtags/tag"
  }
}

if one wants Mastodon to understand the objects.

From a more philosophical standpoint: I’m no longer sure if a centralized context is a good idea. It might be a better idea to use “Fediverse Vocabulary” to just collect definitions of URIs used.

The entire thing needs thinking. Fortunately, the linked Fediverse Vocabulary provides an excellent base on how stuff might be built.
