FEP-f1d5: NodeInfo in Fediverse Software

source: https://codeberg.org/fediverse/fep/src/branch/main/feps/fep-f1d5.md

(Previously WITHDRAWN from https://git.activitypub.dev/ActivityPubDev/Fediverse-Enhancement-Proposals/src/branch/main/feps/fep-f1d5.md)



authors: @cjs, @silverpill
status: FINAL
dateReceived: 2020-12-13
discussionsTo: #50 - [TRACKING] FEP-f1d5: NodeInfo in Fediverse Software - fep - Codeberg.org

Summary

NodeInfo is a protocol intended to standardize the way server-level metadata is
provided to the public. Tools and clients can use this metadata to assess server
health or to help end users choose which servers and software to use on the
Fediverse.

History

NodeInfo was developed prior to the ActivityPub protocol [ActivityPub] and was
targeted for use by diaspora, friendica, and redmatrix software. Some of the
original protocols it encapsulated include diaspora, pumpio, and gnusocial.

The NodeInfo specification is very strict in its schema, often requiring
regex validation and a closed set of enumerated possible values. As an objection
to this strictness, the NodeInfo2 fork was created; it removes some validation
of fields and logically restructures some of the metadata. Building on NodeInfo
and NodeInfo2, ServiceInfo was briefly explored [ServiceInfo].

This FEP does not attempt to document the specific protocol details. For
that, see the [NodeInfoRepository] and [NodeInfo2Repository]. It attempts to
clarify the history and identify shortcomings with the current approaches, to
bring context to developers of Fediverse Software.

Requirements

The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”,
“SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this specification are to
be interpreted as described in [RFC-2119].

Fediverse software SHOULD implement NodeInfo [NodeInfoRepository].
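
As a rough illustration of what implementing NodeInfo involves, the sketch below
builds the two JSON documents a server would typically serve: a discovery
document (commonly at /.well-known/nodeinfo) and the NodeInfo document it links
to. It assumes schema version 2.0. The host social.example, the /nodeinfo/2.0
path, the function names, and all values are hypothetical. The authoritative
field list is in [NodeInfoRepository], not here.

```python
import json

# Link relation used by the discovery document for schema 2.0
# (assumption: verify against [NodeInfoRepository]).
NODEINFO_2_0_REL = "http://nodeinfo.diaspora.software/ns/schema/2.0"


def discovery_document(base_url: str) -> dict:
    """Build the discovery document pointing at the NodeInfo document.

    The /nodeinfo/2.0 path is a hypothetical choice made by this sketch.
    """
    return {"links": [{"rel": NODEINFO_2_0_REL, "href": f"{base_url}/nodeinfo/2.0"}]}


def nodeinfo_document(name: str, version: str, total_users: int,
                      local_posts: int, local_comments: int,
                      open_registrations: bool) -> dict:
    """Build a minimal NodeInfo 2.0-style document (illustrative values only)."""
    return {
        "version": "2.0",
        "software": {"name": name, "version": version},
        "protocols": ["activitypub"],
        "services": {"inbound": [], "outbound": []},
        "openRegistrations": open_registrations,
        "usage": {
            "users": {"total": total_users},
            "localPosts": local_posts,
            "localComments": local_comments,
        },
        "metadata": {},
    }


if __name__ == "__main__":
    print(json.dumps(discovery_document("https://social.example"), indent=2))
    print(json.dumps(nodeinfo_document("examplesoft", "1.0.0", 12, 340, 56, True), indent=2))
```

Several of the fields filled in here (software.version, openRegistrations,
localPosts, localComments) are exactly the ones the caveats in this FEP object
to.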

Caveats

At the time of this FEP’s writing, the objections to the current state of
NodeInfo that have been identified by the community are listed below. Note that
any technical alternatives identified are meant to be illustrative and not
prescriptive:

  • The software.name regex is unnecessarily strict. For example, no uppercase
    letters, no spaces, no letters outside the English alphabet, and no special
    characters besides hyphen are permitted.
  • The software.version field is required, which is unnecessarily strict.
    Forcibly requiring software to divulge version information is potentially a
    security issue.
  • The inbound and outbound elements are specified as a closed set of enums
    instead of simple strings. Protocol versioning then manifests as renaming,
    that is, having to add a new enum value, which results in unclear version
    management.
  • The openRegistrations field is required, so Fediverse software MUST have an
    openRegistrations concept.
  • NodeInfo lacks an extensible method for identifying and versioning other
    features, such as HTTP Signatures, webfinger, or OAuth. Whereas the
    specification is very strict, the metadata is too lax.
  • The usage.users object is not denormalized into custom pairs of
    (activity counts, time period in days), which would let implementations
    report the time periods that make sense for the software.
  • The usage.users assumes that user identity is tied to a specific instance of
    running software. It is unclear how to count total users when user identity
    is: spread across multiple servers, spread across multiple groups, or present
    within multiple collections of users. Multiple software instances could each
    have a reasonable claim to counting the user as “using” their software, which
    globally results in users being counted more than once.
  • The usage.users activity counts likewise assume that user identity is tied
    to a specific instance of running software. For the same reasons as above,
    where the total user counts may result in duplicate counts of the same user
    across all running software, the activity counts activeHalfyear and
    activeMonth may also result in a globally inflated count.
  • The activeHalfyear and activeMonth are ill-named properties for describing
    the time periods of 180 days and 30 days, respectively. A “half of one year”
    is 180 days 0% of the time and roughly 182.5 days only 75% of the time. A
    month is 30 days only 33% of the time.
  • The localPosts and localComments are not denormalized into pairs of
    (kind, counts) for software that, for example, hosts audio files, hosts
    videos, or software that does not have comments, or does not have posts.
  • The localPosts and localComments are required, which is problematic for
    software that does not have comments, or does not have posts.
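
To make the first caveat concrete, here is a minimal sketch of the kind of check
the software.name rule implies. The pattern below is an assumption based on the
description in the bullet (lowercase letters, digits, and hyphen only); consult
[NodeInfoRepository] for the exact regex. The function name is hypothetical.

```python
import re

# Assumed character class for software.name: lowercase ASCII letters, digits,
# and hyphen. This mirrors the restrictions described above; it is not copied
# from the schema.
SOFTWARE_NAME_PATTERN = re.compile(r"[a-z0-9-]+")


def is_valid_software_name(name: str) -> bool:
    """Return True if the whole name consists only of the assumed characters."""
    return SOFTWARE_NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["mastodon", "gnu-social", "GNU Social", "PeerTube", "écriture"]:
        print(candidate, is_valid_software_name(candidate))
```

Under this assumed pattern, several names from the Implementations list would
have to be lowercased or hyphenated before they validate.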

Implementations

Servers

This list is not comprehensive:

  • Mastodon
  • Matrix
  • Pleroma
  • PeerTube
  • WriteFreely
  • Friendica
  • Diaspora
  • PixelFed
  • Misskey
  • Funkwhale
  • Smithereen
  • Plume
  • GNU Social
  • lemmy
  • zap
  • Socialhome
  • epicyon
  • apcore

Clients

References

Copyright

CC0 1.0 Universal (CC0 1.0) Public Domain Dedication

To the extent possible under law, the authors of this Fediverse Enhancement
Proposal have waived all copyright and related or neighboring rights to this
work.

@cjs the draft is now available in the FEP Index.

Smithereen also implements NodeInfo :eyes:

(partially — I don’t keep track of account activity, at least not yet)

However. The concept of posts and comments fits my presentation nicely because I have both and they are very distinct in the UI, but what if there are different kinds of content? For example, what if I add photo albums? How would I count these and photos within, or would I ignore them entirely? The idea of “kind, count” pairs does indeed feel like the way to go.

Thanks @grishka, I’ll update the FEP to add Smithereen – an oversight on my part (sorry!). Hopefully the bullet in the FEP that reads…

  • The localPosts and localComments are not denormalized into pairs of (kind, counts) for software that, for example, hosts audio files, hosts videos, or software that does not have comments, or does not have posts.

…captures your comments around photo albums. You’re correct that ideas like (kind, count) pairs are a possible way to go, but I’m trying not to be too technically prescriptive/dictatorial in solutions and just give a general sense of what alternatives could look like. I can update the FEP to make that clearer.
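
For a general sense of what (kind, count) pairs could look like, here is a purely illustrative sketch. Nothing in it is part of NodeInfo: the ContentCount type, the content_counts function, and the kind names (including "photoAlbums" and "photos") are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentCount:
    """One hypothetical (kind, count) pair describing local content."""
    kind: str
    count: int


def content_counts(counts_by_kind: dict[str, int]) -> list[dict]:
    """Turn a mapping of kind -> count into a sorted list of JSON-ready pairs.

    Kinds the software does not have are simply absent from the mapping, so
    nothing has to be reported as a required zero.
    """
    pairs = [ContentCount(kind, count) for kind, count in counts_by_kind.items()]
    return [{"kind": p.kind, "count": p.count} for p in sorted(pairs, key=lambda p: p.kind)]


if __name__ == "__main__":
    # Hypothetical server with posts, comments, photo albums, and photos.
    print(content_counts({"posts": 120, "comments": 45, "photoAlbums": 3, "photos": 57}))
```

Software without comments would just omit that kind, which also sidesteps the objection that localPosts and localComments are required.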

Additionally, Mike provided the following feedback to me here:

  • Providing the version should never be mandatory (this is potentially a critical security issue).
  • Count of “users” should likewise not be mandatory, and if it is, the concept needs to be better defined.

In my projects, a single user can be the controlling party behind an unlimited number of communication channels which may or may not be related. The same account registrant can also create groups and “collections” (aka sub-channels). They can also clone any of these communication channels to multiple servers. It isn’t always obvious what percentage of that user’s overall activity was actually carried out on a different website. Should they be able to claim this as some percentage of a single user’s activity? If not, the counts will be wrong. Either one site will claim 100% of the user and the others will have to remove that user from their own tallies (resulting in a fictional total for any of these sites), or they will each claim 1 user and the tallies will likewise be fictional.
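
The double-counting problem described above can be sketched with a few lines of set arithmetic. The server names and user identifiers are made up, and the sketch assumes a global user identity exists at all, which is exactly what is unclear in practice.

```python
def reported_total(servers: dict[str, set[str]]) -> int:
    """Sum of the per-server user counts, as an aggregator of usage.users would see it."""
    return sum(len(users) for users in servers.values())


def distinct_total(servers: dict[str, set[str]]) -> int:
    """Number of distinct users across all servers, counting each identity once."""
    return len(set().union(*servers.values()))


if __name__ == "__main__":
    # Hypothetical: "alice" has cloned her channel to all three servers.
    servers = {
        "a.example": {"alice", "bob"},
        "b.example": {"alice"},
        "c.example": {"alice", "carol"},
    }
    print(reported_total(servers), distinct_total(servers))
```

Each server has a reasonable claim to "alice", so every per-server count is defensible on its own while their sum exceeds the number of people involved.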

I’m inclined to disagree on the first bullet, but fully agree on the rest.

My disagreement on the first bullet rests on the assumption that the “critical security issue” is the disclosure of a software version number that identifies the software as having a vulnerability (one that may be patched in a later version). Simply removing the version number obscures the problem rather than solving it, and is security-by-obscurity. Adversaries will simply try the exploit on a server that hasn’t disclosed a version anyway, and I would argue that non-malicious users and developers alike would then have less information available to identify how many servers are still vulnerable and to contact admins to upgrade ASAP. That is a net detriment, in my opinion.

As such, I’d like to avoid putting the first bullet in the FEP.

I agree with Mike’s second bullet and elaboration, and would like to modify the FEP to include them.

To update: I agree with Mike that software.version should not be a required field. It is unnecessarily strict. I was just hoping to avoid citing “potential security issue” as the specific reason. I don’t feel as strongly about it as others do, so I’ve gone ahead and included it with that specific rationale.

I guess software.version is required as a way to ensure compatibility. If it could be replaced with some protocol or API version, it might alleviate the security issue, i.e., by telling machines “expect this API” instead of telling attackers “I’m running that vulnerable version”. Since AP has a stable version, this field is indeed currently unnecessary in the AP context.

And this is how we come to the need to have some sort of capability negotiation between servers :wink:

Imagine people hardcoding checks for software names and versions. This can’t end well.

Another interesting thing about instance metadata — Mastodon’s /api/v1/instance has kinda become the de-facto standard for whatever isn’t representable in nodeinfo.

I’m not sure this is the right place to discuss this, but how fixed is the list of elements that would be made available in NodeInfo?
Where should I discuss potential addition of new entries?

I have been wondering about how to decrease the incentives for various people to use the APIs or to scrape Fediverse platforms to get information about the network structure (or specific subparts of it).

I wrote about this a bit on Mastodon, including a specific proposal for server information that would complement Mastodon’s “peer” and “activity” information in the API.

Basically the idea would be to provide a privacy-preserving method to show an instance’s connectivity / the Fediverse structure, without accessing individual information.

If I remember right, NodeInfo’s author seems particularly keen on making all fields closed, bounded enumerations instead of free-form text in the specification. Hence the (temporary) NodeInfo2 fork.

Probably the NodeInfo repository. Once that is completed, getting Fediverse software to adopt it would possibly mean going to the appropriate software repositories.

Also, the above deserves its own topic.

I remember that the name of the software was specified as an enum. Obviously, that one was never respected.

Note: This FEP was re-submitted by @silverpill to fep/fep-f1d5.md at main - fep - Codeberg.org. Sticking with this thread for simplicity.

The draft of this proposal was withdrawn due to inactivity: #26 - Set status to WITHDRAWN - fep - Codeberg.org, so I re-submitted it. If there are no objections, I will later request finalization, in accordance with the FEP process.

I have requested finalization of FEP-f1d5: #50 - [TRACKING] FEP-f1d5: NodeInfo in Fediverse Software - fep - Codeberg.org

By chance, I discovered a hackish way to update the featured link, and it worked. It is documented here on Meta: Can I add featured links after I create a topic? - #2 by cocococosti - support - Discourse Meta

Hurray :tada:

This Fediverse Enhancement Proposal is now FINAL :rocket:

NodeInfo v2.2 is being prepared: next (v2.2) nodeinfo schema version · Issue #76 · jhass/nodeinfo · GitHub

Is there a volunteer to:

  • collect all the criticisms of nodeinfo from this FEP
  • collect things that cause validation errors at
    https://helge.codeberg.page/fediverse-features-tests/#__tabbed_1_2

and see if we can get them addressed in nodeinfo 2.2?

For example, protocols being from an enum is probably a bad idea.

I don’t necessarily agree with all objections listed in FEP-f1d5. For some of them, there is an open issue already (e.g. openRegistrations).

I’ll review open issues and proposed changes, maybe propose something too, but if some particular thing is bothering you, it would be better to open an issue yourself.