Resolving the Note vs. Article distinction

Background

Activity Vocabulary - 3.3 Object Types:

Note: Represents a short written work typically less than a single paragraph in length.
Article: represents any kind of multi-paragraph written work.

Example 48 (Article):

{
  "@context": "https://www.w3.org/ns/activitystreams",
  "type": "Article",
  "name": "What a Crazy Day I Had",
  "content": "<div>... you will never believe ...</div>",
  "attributedTo": "http://sally.example.org"
}

Example 53 (Note):

{
  "@context": "https://www.w3.org/ns/activitystreams",
  "type": "Note",
  "name": "A Word of Warning",
  "content": "Looks like it is going to rain today. Bring an umbrella!"
}

Semantically, the difference is never explicitly defined (how do you define a “paragraph”?), so the current fediverse has assumed that an Article should be viewed natively on the remote website, while a Note can be displayed inline as a status. Thus, Note is used to represent a status update, and much of the network simply defaults to Note. The distinction is assumed to be one of formatting, but once again this is not an explicit definition (how do you define “formatting”?).
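As a rough illustration of that informal convention, a receiving client might pick a presentation mode from the type alone. This is only a sketch: the helper name `display_mode` and the mode strings are made up here, and no implementation is claimed to work exactly this way.

```python
def display_mode(obj: dict) -> str:
    """Pick a presentation mode from the object's type (hypothetical helper)."""
    obj_type = obj.get("type")
    # "type" may be a single string or a list of strings in ActivityStreams 2.0.
    types = obj_type if isinstance(obj_type, list) else [obj_type]
    if "Article" in types:
        # Assumed convention: show a title/link and read it on the remote site.
        return "link-to-origin"
    if "Note" in types:
        # Assumed convention: render the content directly as a status.
        return "inline"
    return "unsupported"
```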

Disambiguation

Going purely from the Activity Vocabulary descriptions and examples, I would possibly assume one or both of the following:

  • Note SHOULD be plain text, Article SHOULD use HTML (or should these be a MUST?)
  • Note SHOULD NOT use newlines (but is technically allowed to do so)

However, there is ActivityPub 3.3, Example 8:

{
  "@context": ["https://www.w3.org/ns/activitystreams",
               {"@language": "en"}],
  "type": "Note",
  "id": "http://postparty.example/p/2415",
  "content": "<p>I <em>really</em> like strawberries!</p>",
  "source": {
    "content": "I *really* like strawberries!",
    "mediaType": "text/markdown"}
}

This example Note uses HTML for its content, in order to demonstrate the source property.

Also, ActivityPub Example 4:

{"@context": "https://www.w3.org/ns/activitystreams",
 "type": "Create",
 "id": "https://chatty.example/ben/p/51086",
 "to": ["https://social.example/alyssa/"],
 "actor": "https://chatty.example/ben/",
 "object": {"type": "Note",
            "id": "https://chatty.example/ben/p/51085",
            "attributedTo": "https://chatty.example/ben/",
            "to": ["https://social.example/alyssa/"],
            "inReplyTo": "https://social.example/alyssa/posts/49e2d03d-b53a-4c4c-a95c-94a6abf45a19",
            "content": "<p>Argh, yeah, sorry, I'll get it back to you tomorrow.</p>
                        <p>I was reviewing the section on register machines,
                           since it's been a while since I wrote one.</p>"}}

This example Note uses two <p> elements, representing two short paragraphs (once again not “less than a single paragraph”).

So even the specs themselves are inconsistent on any distinction.
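The two tentative rules from earlier (plain text only, no newlines) can be checked mechanically against the spec examples. The sketch below uses only the Python standard library; the function name `note_rule_violations` and the wording of the messages are hypothetical, and the rules themselves are my guesses rather than anything the specs require.

```python
from html.parser import HTMLParser


class _TagCounter(HTMLParser):
    """Counts all start tags, and <p> start tags separately."""

    def __init__(self) -> None:
        super().__init__()
        self.tags = 0
        self.paragraphs = 0

    def handle_starttag(self, tag, attrs):
        self.tags += 1
        if tag == "p":
            self.paragraphs += 1


def note_rule_violations(content: str) -> list[str]:
    """Return which of the two guessed Note rules the content breaks."""
    counter = _TagCounter()
    counter.feed(content)
    counter.close()
    problems = []
    if counter.tags:
        problems.append("contains HTML markup")
    if "\n" in content or counter.paragraphs > 1:
        problems.append("spans more than one line or paragraph")
    return problems
```

Under these guessed rules, the Example 53 content passes, the Example 8 content fails the plain-text rule, and the Example 4 content fails both.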

How much does this actually matter?

Arguably not much, since implementations often convert Note and Article into their own internal schema for statuses anyway. But it could still be beneficial to set a clearer distinction going forward on how these types should ideally be assigned.
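To make the "internal schema" point concrete, here is a minimal sketch of such a conversion. The `Status` class, its field names, and `to_status` are all hypothetical and do not describe any specific implementation's model; the point is only that both types end up in the same shape.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Status:
    """Hypothetical internal record shared by Notes and Articles."""
    uri: Optional[str]
    title: Optional[str]
    body: str
    original_type: str


def to_status(obj: dict) -> Status:
    """Map a Note or Article onto the same internal shape."""
    obj_type = obj.get("type")
    if obj_type not in ("Note", "Article"):
        raise ValueError(f"unsupported object type: {obj_type!r}")
    return Status(
        uri=obj.get("id"),
        title=obj.get("name"),
        body=obj.get("content", ""),
        original_type=obj_type,  # kept only so the distinction is not lost
    )
```

Once both types share this shape, the only trace of the original distinction is the `original_type` field.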

The distinction I make between Article and Note isn’t directly related to its content but to how it’s supposed to be presented and used. Articles are more for blogs, where you have about one post per day, so articles should be easy to find again through a list of articles/tags and maybe some search features. Notes are more for things like microblogging, where you can easily have hundreds in a day and they aren’t that easy to find again, even with good keywords in full-text search.

Also, I find that formatting is actually very useful for notes, because it lets you express as much or more with less (like a word list vs. a paragraph).

This question comes up a lot in the fediverse because these are the two most commonly used object types, but one could also ask about the actor distinction between Organization and Group, or Application and Service. And so far I’ve only seen the Document type used in the wild for images with a textual description (as if Image couldn’t have one in the first place), but Document has no inherent meaning either.

Note: Pleroma keeps the distinction between Articles and Notes internally. There are no real differences in the Mastodon API, though there could be a query filter.

For prior art, I can think of semantic HTML’s <article> being a section of HTML that can be reproduced elsewhere in its entirety.

W3Schools:

The <article> tag specifies independent, self-contained content.

An article should make sense on its own and it should be possible to distribute it independently from the rest of the site.

Potential sources for the element:

Forum post
Blog post
News story
Comment

MDN:

The HTML <article> element represents a self-contained composition in a document, page, application, or site, which is intended to be independently distributable or reusable (e.g., in syndication). Examples include: a forum post, a magazine or newspaper article, or a blog entry.

This could be a distinction worth making: maybe an Article should roughly map to an <article>, whereas a Note is just arbitrary text?
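If that mapping were adopted, rendering could look something like the sketch below. The `render` function and its output markup are assumptions made purely for illustration. Treating Note content as plain text is the hypothetical rule being explored here, not current practice, and a real implementation would need to sanitize Article HTML before emitting it.

```python
from html import escape


def render(obj: dict) -> str:
    """Render an Article as a self-contained <article>, a Note as plain text."""
    content = obj.get("content", "")
    if obj.get("type") == "Article":
        title = obj.get("name")
        heading = f"<h1>{escape(title)}</h1>" if title else ""
        # Assumption: content is trusted HTML here; real code must sanitize it.
        return f"<article>{heading}{content}</article>"
    # Hypothetical rule: a Note is arbitrary text, so escape it verbatim.
    return f"<p>{escape(content)}</p>"
```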


Those distinctions at least seem clearer to me. There’s still ambiguity but much less:

  • Org = “Acme Inc”
  • Group = “Persons working for Acme Inc”
  • Application = “ChessBot”
  • Service = “Acme Mailing List / Relay”

And those arguably matter even less, because the relevant bit is that it’s an Actor. You could use Profile, or really anything else; the only functional thing you should care about is whether it has inbox/outbox (and maybe the other optional collections). So an Actor is defined by those properties, but Note/Article are not really defined by anything except a really vague semantic advisory about what a “paragraph” is.
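In code, that property-based view of an Actor amounts to a duck-typing check that ignores the declared type entirely. A minimal sketch, with the function name `looks_like_actor` being my own invention:

```python
def looks_like_actor(obj: dict) -> bool:
    """Treat anything with a non-empty inbox and outbox as an actor."""
    # The declared "type" is deliberately not consulted.
    return all(bool(obj.get(key)) for key in ("inbox", "outbox"))
```

There is no equivalent structural test for telling a Note from an Article, which is the asymmetry being pointed out.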


Practically speaking, though… if there indeed isn’t a big difference between Article and Note, then I’m kind of worried about what side effects one can expect when federating out content and having to choose a type. The only tangible thing I’ve seen in existing implementations that opt for Article rather than Note is that they support the use of URL slugs.
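For reference, a slug here means a readable URL segment derived from the title. Below is a minimal sketch of deriving one from `name`; the function `slugify` and its exact normalization rules are assumptions for illustration and are not taken from any particular implementation.

```python
import re


def slugify(name: str) -> str:
    """Lowercase the title and join its ASCII alphanumeric runs with hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower())
    return slug.strip("-")
```

Applied to the Article example's name, "What a Crazy Day I Had", this yields `what-a-crazy-day-i-had`.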