Content-addressing and URN resolution

rustra · April 28, 2021, 6:28pm

CPub, a general semantic ActivityPub server, uses content-addressable RDF with ERIS.

An object’s id should be a dereferenceable URI. For URLs this is no problem, since the content can be loaded over HTTP. URNs, however, need some other resolution mechanism.

Any existing ActivityPub client will be surprised to see something like this:

{
  "@context": "https://www.w3.org/ns/activitystreams#",
  "id": "urn:erisx2:AAADAF4BGFGNTYG65ZX3JW7A75VBCMRWN7B2WGIQ7PAQGBHOKYGMRESC2QZVSBDYW5P45R6FRRPBZYGZ5H6XTWSNR6HEXKOD735UG3B7IQ",
  "object": "urn:erisx2:AAAFVNIJDY22FZJZBL2LEWM7XR6P7GHUJ52NWP2E4EBR4O2LSBAOSW2YHA5DA5726X53KPABXYYO5ZRWLIOOLTEXNPE6FIBQW23LDEEMEM",
  "to": "local:rustra",
  "type": "Create"
}

For that reason we need a mechanism to resolve URNs to URLs / resources.

One proposal would be to agree on fixed endpoints, similar to /.well-known. RFC 2169 describes a trivial convention for using HTTP in URN resolution.

The general approach used to encode resolution service requests is quite simple:
GET /uri-res/<service>?<uri> HTTP/1.0

Examples of services:

N2L (URN to URL);
N2Ls (URN to URLs);
N2R (URN to Resource);
N2Rs (URN to Resources).
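
A minimal Python sketch of how a client could assemble such a request URL, assuming only the four services listed above. The function name `resolution_url` and the idea of passing the server's base URL as a parameter are illustrative assumptions, not part of RFC 2169 or CPub.

```python
# Services listed in the post above; RFC 2169 is the authority on the full set.
SERVICES = {"N2L", "N2Ls", "N2R", "N2Rs"}


def resolution_url(base: str, service: str, urn: str) -> str:
    """Build a /uri-res/<service>?<uri> URL on the given server (hypothetical helper)."""
    if service not in SERVICES:
        raise ValueError(f"unknown resolution service: {service}")
    if not urn.lower().startswith("urn:"):
        raise ValueError(f"not a URN: {urn}")
    # The URN is appended as the raw query string, as in the GET line above.
    return f"{base.rstrip('/')}/uri-res/{service}?{urn}"


if __name__ == "__main__":
    print(resolution_url("https://openEngiadina.example", "N2R", "urn:erisx2:AAAD"))
```

A client would then issue a plain HTTP GET against the returned URL; which server to ask is exactly the open question of this thread.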

Please share your opinion on RFC 2169 and any ideas about other possible ways of resolving URNs.

how · April 28, 2021, 8:35pm

RDF 1.1 Concepts section on skolemization mentions the genid well-known IRI:

Systems that want Skolem IRIs to be recognizable outside of the system boundaries SHOULD use a well-known IRI [RFC5785] with the registered name genid . This is an IRI that uses the HTTP or HTTPS scheme, or another scheme that has been specified to use well-known IRIs; and whose path component starts with /.well-known/genid/ .

For example, the authority responsible for the domain example.com could mint the following recognizable Skolem IRI:

http://example.com/.well-known/genid/d26a2d0e98334696f4ad70a677abc1f6

I could imagine exploiting this well-known endpoint to address ERIS URNs:

https://openEngiadina.example/.well-known/genid/urn:erisx2:AAADAF4BGFGNTYG65ZX3JW7A75VBCMRWN7B2WGIQ7PAQGBHOKYGMRESC2QZVSBDYW5P45R6FRRPBZYGZ5H6XTWSNR6HEXKOD735UG3B7IQ

This would avoid having to provide a template as WebFinger does, to register a new well-known IRI, or to use uri-lists. Any conformant server could expose such a URI and return the decoded content that the URN identifies.

The simplest solution would then be for implementations that do not resolve ERIS URNs locally to try this endpoint on the origin server, and for the others to use their own resolver.
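
As a rough Python sketch of that fallback, the URL to try on the origin server could be derived as follows. The helper names `origin_of` and `genid_url` are hypothetical, and taking the origin from the URL of whoever sent the activity is an assumption made for illustration.

```python
from urllib.parse import urlsplit


def origin_of(url: str) -> str:
    """Return scheme://host[:port] of an HTTP(S) URL (hypothetical helper)."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an HTTP(S) URL: {url}")
    return f"{parts.scheme}://{parts.netloc}"


def genid_url(origin_server_url: str, urn: str) -> str:
    """Build the /.well-known/genid/<urn> URL on the origin server."""
    # The URN is placed in the path unencoded, as in the example above.
    return f"{origin_of(origin_server_url)}/.well-known/genid/{urn}"


if __name__ == "__main__":
    print(genid_url("https://example.org/users/alice", "urn:erisx2:AAAD"))
```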

Of course that does not solve the problem if the activity is received from a server that did not dereference it and simply forwards it, and does not support ERIS.

There MAY be a known service that will implement this, but it would kind of re-centralize the service. E.g., we could agree to host a URN to URI resolver at a known URL, but the idea makes me shiver. What’s the point of using content-addressing if it is to fall back to a single domain?

Other resources that might be useful:

NodeInfo (e.g., use the metadata to indicate your URN resolvers)
RFC 2169 - A Trivial Convention for using HTTP in URN Resolution (see Appendix A on text/uri-list proposal)
RFC 2483 - URI Resolution Services Necessary for URN Resolution (for text/uri-list specification)
Well-Known URIs for a complete list of potential candidates for squatting

Another idea would be to use the URN as the identifier, and use AlsoKnownAs for the URI (maybe the genid).

cjs · April 28, 2021, 9:10pm

In addition to what @how said, it would be nice to fully understand if such an adoption entrenches HTTP as being the single ActivityPub transmission protocol and makes extending to other protocols not dependent on DNS harder to support in the future.

That is, whether the solution of mapping URN =(http)=> URL makes it harder for other mappings to be adopted or developed (e.g. URN =(ssb)=> SSBID).

rustra · April 29, 2021, 12:19am

Looking at Christopher’s Magenc introduction, I’ve found that Magnet URIs are most probably what we need. They can be generated by anyone who already has the content, without the need for a central authority to issue them. The Magnet URI standard provides parameters such as as (Acceptable Source) and xs (eXact Source), which make it possible to point to the ActivityPub instance that originally generated the content.

By the way, an xs example from Wikipedia uses the RFC 2169 endpoints mentioned above:
xs=http://192.0.2.27:6346/uri-res/N2R?urn:sha1:FINYVGHENTHSMNDSQQYDNLPONVBZTICF

So a Magnet URI for ERIS-encoded content could look like:
magnet:?xt=urn:erisx2:AAADAF4BGFGNTYG65ZX3JW7A75VBCMRWN7B2WGIQ7PAQGBHOKYGMRESC2QZVSBDYW5P45R6FRRPBZYGZ5H6XTWSNR6HEXKOD735UG3B7IQ&xs=https://openEngiadina.example/uri-res/N2R?urn:erisx2:AAADAF4BGFGNTYG65ZX3JW7A75VBCMRWN7B2WGIQ7PAQGBHOKYGMRESC2QZVSBDYW5P45R6FRRPBZYGZ5H6XTWSNR6HEXKOD735UG3B7IQ

And together with text/uri-list, it’s possible to refer to other ways of accessing the content (e.g. IPFS).
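
A small Python sketch of how such a Magnet URI could be assembled from an ERIS URN and one or more exact-source URLs. The function name `magnet_uri` is hypothetical, and the choice to percent-encode everything except ":", "/" and "?" (so that an "&" inside a source URL cannot split the parameter) is an assumption of this sketch, not something the thread specifies.

```python
from urllib.parse import quote


def magnet_uri(urn: str, exact_sources=()) -> str:
    """Build magnet:?xt=<urn>&xs=<url>... (illustrative helper)."""
    params = [f"xt={quote(urn, safe=':')}"]
    # Keep ":", "/" and "?" readable, as in the example above;
    # everything else outside the unreserved set is percent-encoded.
    params += [f"xs={quote(url, safe=':/?')}" for url in exact_sources]
    return "magnet:?" + "&".join(params)


if __name__ == "__main__":
    urn = "urn:erisx2:AAAD"  # shortened placeholder, not a real ERIS URN
    print(magnet_uri(urn, [f"https://openEngiadina.example/uri-res/N2R?{urn}"]))
```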

rustra · April 29, 2021, 11:37am

@yvolk, could you kindly share your thoughts on possible support for Magnet URIs?

yvolk · April 29, 2021, 6:47pm

My current opinion, after a brief reading of the linked sources, is that a Magnet URI:

is not for making URIs shorter,
does not require us to agree immediately on “URI to URL” resolution / content discovery services.

On the contrary:

it is packed content metadata needed for identifying and retrieving the content,
in the simplest case it contains one or more URLs where the content may be found, so it easily replaces a content id in the form of a URL,
it contains a content hash that allows content referred to by different sources (e.g. the same image sent by different Actors) to be identified and deduplicated, maybe even without an immediate need to implement hash calculation (just by comparing the hashes in two Magnet URIs),
it is a compromise between pointing to a concrete location and at least a potential ability to search for and retrieve content from anywhere.

So I think that minimal support for Magnet URIs containing an actual download URL can easily be added to a Client application, with the bonus of getting much more out of it in the future, step by step.
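
A minimal Python sketch of what such client-side support might look like: parse the Magnet URI, collect the download URLs, and compare content identifiers without computing any hash. The function names are hypothetical, and treating both xs and as values as download candidates is an assumption of this sketch.

```python
from urllib.parse import parse_qs


def parse_magnet(uri: str) -> dict:
    """Split a Magnet URI into its xt, xs and as parameter lists."""
    scheme, sep, query = uri.partition("?")
    if scheme != "magnet:" or not sep:
        raise ValueError(f"not a Magnet URI: {uri}")
    params = parse_qs(query)
    return {key: params.get(key, []) for key in ("xt", "xs", "as")}


def download_urls(uri: str) -> list:
    """Exact sources first, then acceptable sources."""
    parsed = parse_magnet(uri)
    return parsed["xs"] + parsed["as"]


def same_content(uri_a: str, uri_b: str) -> bool:
    """True if both URIs share at least one xt value (no hashing needed)."""
    return bool(set(parse_magnet(uri_a)["xt"]) & set(parse_magnet(uri_b)["xt"]))


if __name__ == "__main__":
    a = "magnet:?xt=urn:erisx2:AAAD&xs=https://a.example/uri-res/N2R?urn:erisx2:AAAD"
    b = "magnet:?xt=urn:erisx2:AAAD&xs=https://b.example/uri-res/N2R?urn:erisx2:AAAD"
    print(download_urls(a), same_content(a, b))
```

The comparison only says that two URIs claim the same content; verifying that claim would still require decoding the content and checking it against the URN.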

rustra · April 29, 2021, 7:09pm

Thanks, everybody, for your opinions! Now I see which direction to move in.

macgirvin · April 29, 2021, 7:52pm

can never be edited/updated. This may not concern some people, but is a deal-breaker for others.

rustra · April 30, 2021, 8:34am

@macgirvin, I suppose this is a question of mutability in general: immutable Magnet URIs are just a consequence of immutable data.

@pukkamustard is researching data structures that are based on immutable data yet allow mutability, as part of the DREAM / DROMEDAR project. Your concern should be addressed there as well.