Help, trying to get my head around ActivityPub!

I’m new to the whole ActivityPub scene and it all seems very exciting but I’m not quite sure how it all works.

For starters, things like Mastodon, Friendica, PeerTube and Pixelfed are all services based on the ActivityPub protocol (according to: The many branches of the Fediverse) but what exactly does this mean in practice? For example, what, if anything, can a PeerTube client (or server?) do with a Mastodon ‘toot’ object? Ditto, I assume PeerTube has some kind of custom object type for sharing videos - what can Mastodon do with these?

My understanding is that a ‘toot’ is a Mastodon extension of an ActivityPub ‘note’ object, so perhaps PeerTube can treat ‘toots’ as if they were ‘notes’, but without providing any support for the extra capabilities of ‘toots’?

But I’m really just guessing here. I’m a C++ coder by trade, so I tend to think of things like this as ‘class hierarchies’, which seems reasonable enough given what I’ve read so far, but I’m still a bit puzzled by how useful it is to be able to ‘publish’ subclasses of a type that only one specialized service can recognize.


Hope that makes a bit of sense!


ActivityPub is a protocol to pass activities back and forth between actors. When an actor performs an activity, it sends a message to another actor’s inbox – like email, but for web services; you HTTP POST some JSON to the inbox endpoint.
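To make the "HTTP POST some JSON to the inbox" idea concrete, here is a minimal sketch in Python. The actor and inbox URLs are made up for illustration, and the request is only built, never sent. Real servers generally also expect the request to be signed, which is left out here.

```python
import json
import urllib.request

# Hypothetical actor and inbox URLs, used only for illustration.
SENDER = "https://example.social/users/alice"
RECIPIENT_INBOX = "https://example.video/accounts/bob/inbox"


def build_create_note(actor: str, content: str) -> dict:
    """Wrap a Note in a Create activity, as plain JSON-compatible data."""
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Create",
        "actor": actor,
        "object": {
            "type": "Note",
            "attributedTo": actor,
            "content": content,
        },
    }


def build_inbox_request(inbox_url: str, activity: dict) -> urllib.request.Request:
    """Prepare (but do not send) the HTTP POST that delivers an activity."""
    body = json.dumps(activity).encode("utf-8")
    return urllib.request.Request(
        inbox_url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/activity+json"},
    )


activity = build_create_note(SENDER, "<p>Hello, fediverse!</p>")
request = build_inbox_request(RECIPIENT_INBOX, activity)
print(request.get_method(), request.full_url)
```

Actually delivering it would mean passing `request` to `urllib.request.urlopen`, after adding whatever authentication the receiving server requires.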

Using the backing ActivityStreams 2.0 core spec and a subset of its vocabulary, you can represent actions taken on a remote service and handle them on your own service. ActivityPub defines side effects for certain activities like Follow, Create, Announce, Like, Update, Delete, Accept, Reject, Add, Remove, Undo. When receiving these activities in an inbox, the server will perform certain actions on the actor’s behalf.
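As a rough illustration of "side effects on receipt", the sketch below dispatches incoming activities by type. The in-memory `followers` and `likes` stores and the handler names are assumptions made for this example. It also simplifies heavily: a real server would normally answer a Follow with an Accept or Reject rather than silently recording it.

```python
from typing import Callable

# In-memory state standing in for a real database (illustrative only).
followers: set[str] = set()
likes: dict[str, set[str]] = {}


def on_follow(activity: dict) -> None:
    # Simplification: auto-accept the follow without sending an Accept back.
    followers.add(activity["actor"])


def on_like(activity: dict) -> None:
    # Assumes "object" is the ID (a string) of the liked object.
    likes.setdefault(activity["object"], set()).add(activity["actor"])


def on_undo(activity: dict) -> None:
    # Only undoing an embedded Follow is handled in this sketch.
    undone = activity.get("object")
    if isinstance(undone, dict) and undone.get("type") == "Follow":
        followers.discard(activity["actor"])


HANDLERS: dict[str, Callable[[dict], None]] = {
    "Follow": on_follow,
    "Like": on_like,
    "Undo": on_undo,
}


def receive(activity: dict) -> bool:
    """Apply the side effect for a known activity type; ignore anything else."""
    kind = activity.get("type")
    handler = HANDLERS.get(kind) if isinstance(kind, str) else None
    if handler is None:
        return False
    handler(activity)
    return True
```

Returning `False` for an unknown type mirrors the "ignore what you don't understand" approach rather than treating the message as an error.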

They’re using standard objects, actually. Mastodon serializes and federates toots/posts as Note objects (basically unstructured blobs of text). PeerTube serializes and federates its videos as Video, and its comments as Note. When PeerTube sends out a Create Video and it arrives in Mastodon, Mastodon will transform it into a status with the name of and link to the video. When replying to that status within Mastodon, Mastodon will send out a Create Note whose inReplyTo points to the PeerTube video by its ID. PeerTube will interpret that Note as a comment on the video.
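The round trip described above might look roughly like this. It is a sketch of the idea, not the actual Mastodon or PeerTube code: the function names are invented, and it assumes the video's `url` is a plain string, falling back to `id` when it is not (real objects may carry a richer `url` value).

```python
AS2_CONTEXT = "https://www.w3.org/ns/activitystreams"


def video_to_status_text(create_activity: dict) -> str:
    """Reduce a Create Video activity to 'name + link' text for a status."""
    video = create_activity["object"]
    name = video.get("name", "Untitled video")
    url = video.get("url")
    # Fall back to the object ID when url is missing or not a plain string.
    link = url if isinstance(url, str) else video["id"]
    return f"{name} {link}"


def build_reply(actor: str, video_id: str, content: str) -> dict:
    """Build the Create Note a microblog server could send back as a reply."""
    return {
        "@context": AS2_CONTEXT,
        "type": "Create",
        "actor": actor,
        "object": {
            "type": "Note",
            "attributedTo": actor,
            "inReplyTo": video_id,
            "content": content,
        },
    }


def is_comment_on(activity: dict, video_id: str) -> bool:
    """How a video server might recognise the reply as a comment."""
    obj = activity.get("object")
    return (
        isinstance(obj, dict)
        and obj.get("type") == "Note"
        and obj.get("inReplyTo") == video_id
    )
```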

In lay terms, it’s like using Twitter to comment on a YouTube video. The shared protocol allows the services to communicate together. You can follow/subscribe and reply/comment and favourite/like and each system will do its best to understand.

More or less – the principle behind federating out AS2 documents is basically “use what you need”, or rather, “use what you understand”. If you don’t understand something, you MUST ignore it.

There is no class hierarchy and there are no meaningful subclasses. Everything is an Object or a Link. Everything is derived from there – pretty much all properties stem from the Object type, with little to no regard to inheritance.

In fact, it is more appropriate to understand the AS2 types as interfaces rather than classes. An actor is just an object with inbox and outbox – it does not have to be specifically a Person or Group or Organization or Application or Service. An activity is just an object with an actor.
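In code, the "interfaces, not classes" view comes down to checking for properties instead of checking the type. A minimal sketch, with hypothetical helper names and a made-up extension type:

```python
def is_actor(obj: dict) -> bool:
    """An actor is any object exposing both an inbox and an outbox."""
    return "inbox" in obj and "outbox" in obj


def is_activity(obj: dict) -> bool:
    """An activity is any object that names an actor."""
    return "actor" in obj


# "Robot" is not a standard AS2 type, yet it still qualifies as an actor here.
robot = {
    "type": "Robot",
    "id": "https://example.org/robots/1",
    "inbox": "https://example.org/robots/1/inbox",
    "outbox": "https://example.org/robots/1/outbox",
}
print(is_actor(robot), is_activity(robot))
```

Neither check looks at `type` at all, which is what lets extension types keep working.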

In practice, the standard vocabulary can get you pretty far, so you’ll often see implementations sticking to it unless they have a real need to extend it.


I’m also still learning ActivityPub and AS2. I agree that thinking of the AS2 types as interfaces is more accurate than thinking of them as classes (in a programming language sense). However, there does seem to be a clear specialization or inheritance hierarchy. The AS2 specification explicitly refers to the types as specializations of other types (with Object and Link being the exceptions). For example, it states that an “Activity is a subtype of Object”.

In AS2, there is no inbox or outbox associated with Actor types (in contrast to the AP spec). The Activity Vocabulary document only mentions “inbox” in a non-normative section. The AS2 specification says Actors are “capable of performing activities” but isn’t specific about the nature of those activities (just the AS2 Activity types or more general activities?).

Would a publish-only bot be an example of an Actor that might have no inbox?

If I created an AP-enabled directory of things like people (Person) or software applications (Software/Application), would it be proper to model the directory entries as Actors even if they have no inbox or outbox?

This only has to do with which properties are supported. It has no additional meaning. Inheritance falls apart when you consider things like:

  • IntransitiveActivity is a subtype of Activity that removes properties instead of adding them
  • the difference (or lack thereof) between a Document and an Image
  • Collection vs OrderedCollection being largely the same thing and the real distinction being the use of “items” vs “orderedItems”
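For the last point, a consumer can read a collection by looking at which property is present rather than at the declared type. This is a sketch under that assumption, and `collection_items` is a name invented for the example:

```python
def collection_items(collection: dict) -> list:
    """Return the members of a collection, whichever property carries them."""
    for key in ("orderedItems", "items"):
        if key in collection:
            value = collection[key]
            # A single member may appear as a bare value instead of a list.
            return list(value) if isinstance(value, list) else [value]
    return []
```

The declared `type` is never consulted, so a Collection carrying `orderedItems` and an OrderedCollection carrying `items` are both read without complaint.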

This is because ActivityPub is an extension to AS2.

Before ActivityPub, you could use ActivityStreams as a sort of internal schema within your application. If you wanted to keep track of actions performed by your users, for example, AS2 could be a suitable framework for building an API against it. ActivityPub formalized this to some extent with the Client-to-Server portion of the spec. It also defines side-effects for a subset of activity types.

You could do no-inbox, but then other software might not recognize it as a valid actor. In any case, you’d need an inbox to receive Follow activities. The outbox is less needed, but MUST still be present in order to be a valid actor.


Ok, so there is a standard set of ‘stock’ objects - notes, videos etc - in the ActivityPub spec that all ActivityPub services implement, even if it’s just in the form of a notification of some kind? Thought I’d seen the spec for a ‘toot’ somewhere but quite possibly not…

In that case, how does an ActivityPub based service provide an ‘enhanced’ (or ‘intentionally limited’) version of an object? For example, how does Mastodon enforce its N-character limit on notes? Is this just on the client side when text is entered and/or displayed, or do Mastodon servers truncate notes to N characters, or something else?

I guess I’ve been thinking about how hard it’d be to implement some kind of service that is more ‘journalist-friendly’ than Mastodon is (at least in the opinion of a number of journalists recently arrived from Twitter), maybe ‘notes with simple markdown’ including ‘inline links to other notes’ etc so journalists can ‘quote toot’ etc.

I get that the author(s) of Mastodon don’t want to do that for whatever reason, and I’m fine with that (and am NOT claiming to have anything like an informed opinion on whether quote tooting would be a net improvement or not!), but it’s something a lot of journalists coming from Twitter obviously want, and I’m a bit confused about what it’d mean if an ActivityPub service came along that did do that, or even if it could (are notes always limited to plain text?!). Would they suddenly get blocked by the Fediverse en masse for being quote-tooting A-holes?!? Or on servers that didn’t block them, how would/could other clients see their ‘enhanced notes’ etc?


Mastodon enforces validation on its API so clients can only submit up to N characters of content. There’s no truncation or otherwise.

If it’s understood then it will be displayed. If not, then it won’t be. Notes (and anything in the content field in general) are HTML by default unless otherwise specified. As far as Mastodon is concerned, it will sanitize your incoming HTML so that it only contains <p>, <span>, <br>, and <a>.
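An allow-list sanitizer of the kind described can be sketched with Python's standard `html.parser`. This is not Mastodon's implementation, only an illustration of the idea. The class name, the decision to keep only http(s) `href` values on links, and the lack of tag balancing are all assumptions of this sketch.

```python
from html import escape
from html.parser import HTMLParser

ALLOWED_TAGS = {"p", "span", "br", "a"}


class AllowListSanitizer(HTMLParser):
    """Re-emit only allow-listed tags; keep the text of everything else."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED_TAGS:
            return  # drop the tag itself; its text content is still kept
        if tag == "a":
            href = dict(attrs).get("href") or ""
            # Assumption of this sketch: only http(s) links keep their href.
            if href.startswith(("http://", "https://")):
                self.parts.append(f'<a href="{escape(href, quote=True)}">')
            else:
                self.parts.append("<a>")
        else:
            self.parts.append(f"<{tag}>")  # all other attributes are dropped

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS and tag != "br":
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        self.parts.append(escape(data, quote=False))


def sanitize(html_text: str) -> str:
    parser = AllowListSanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.parts)
```

Anything outside the allow-list loses its markup but keeps its text, which is why richer formatting from other software tends to arrive as plain paragraphs.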

OK, that’s for client->server but what about server->server? What would happen if a user on a Friendica server sent a note > MAX chars to a user on a Mastodon server? Would the note be rejected? Would the recipient/sender be informed?

I’m still kind of left wondering what the best way to think of Mastodon, Friendica etc is! Are they all effectively ‘just’ implementations of ActivityPub at heart? What does Mastodon have that Friendica doesn’t (and vice versa) if they’re all using the same activity/object definitions? Is it mainly down to the clients in that case?

I’m currently in the process of joining a Friendica server so I’ll be able to try to figure out some of this stuff for myself soon, and will stop bugging you guys (so much) but thanks for everyone’s help so far!


Interesting question. Please share what you learn. From digging in the source code, it looks like there’s no length limit for the “text” field in the database schema. However, the StatusLengthValidator (with a 500 character hard-coded limit) validates the Status (created from the AP Note) before the data is inserted into the database. So, it appears to me that the Status will be effectively dropped when it fails validation.

Statements like “X is a subtype of Activity” do have meaning, at least to me. For example, when I’m writing AP-related software, I process an Activity subtype in a different way than a Person or a Video.

Variants of the word “inherit” are used over 50 times in the AS2 specification. It seems they at least had something like that in mind even if the result has some rough edges. Whether the AS2 subtype or inheritance terminology is useful in practice is a matter of opinion.

I don’t know about the Friendica side, but I tried sending a Note with ~1000 characters to a Mastodon instance configured for 500 max. I was surprised to see the complete Status text show up in my Mastodon home timeline. Apparently, the inbound server-to-server communication bypasses the validation of the Status text size.

I’m not sure I’d rely on that “feature” though since the Mastodon developers could change that behavior at any time.
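The behaviour observed in that test can be modelled as two entry points, where only the local one validates length. This is a sketch of what was seen from the outside, not Mastodon's code: the function names and the returned dictionaries are invented, and the 500 figure is the limit mentioned above.

```python
MAX_CHARS = 500  # the limit discussed in this thread


class StatusTooLong(ValueError):
    """Raised when a locally authored status exceeds the limit."""


def create_status_from_api(text: str) -> dict:
    """Local path: a client posting through the server's own API."""
    if len(text) > MAX_CHARS:
        raise StatusTooLong(f"{len(text)} characters exceeds {MAX_CHARS}")
    return {"text": text, "local": True}


def create_status_from_federation(note: dict) -> dict:
    """Federated path: the remote Note is stored whatever its length."""
    return {"text": note.get("content", ""), "local": False}
```

With these two functions, a 1000-character string is refused by `create_status_from_api` but accepted unchanged by `create_status_from_federation`, matching the result of the experiment.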

It would arrive just fine. The only thing Mastodon enforces on incoming messages is sanitization so that the HTML only contains p span br a as stated above.

If we were to compare to the current big networks, it would be like “Twitter and Facebook, but they can talk to each other”. Think about it. There’s no fundamental difference between what happens on Facebook and what happens on Twitter. You’re just sending blobs of text around, maybe with some media attached. The only thing different is the presentation around it.

The use of the word “inherit” in the AS2 spec has to do with inheriting properties from the root. Only properties are inherited, and in some cases even disinherited (as IntransitiveActivity is a subtype of Activity with the only difference being the lack of the object property). Note that wherever the word “inherit” is used, it is almost always verbatim in the sentence “Inherits all properties from Object”. This is because all properties live on the Object.

An Activity is simply an Object that has actor. What really matters is the side effects defined for each activity type within the ActivityPub extension to AS2. And outside of Activity subtypes (some of which have side-effects), the whole type system is largely irrelevant beyond hinting at the author’s intent.

  • For example, you may assume that a .png Document should be treated differently from a .png Image, but on a spec level there is no difference.

  • There is also, for example, a mistake that many implementations make, where they assume an actor MUST be one of Person, Group, Organization, Application, or Service, and thus fail to account for extension types entirely.

  • Yet another example: Note and Article are not actually distinct except for Mastodon refusing to show Article in full.

  • You can also use Collection and OrderedCollection interchangeably, as the real distinction is in whether you use the items vs orderedItems property – and in fact, you may need to use orderedItems on a regular Collection if its ordering is not strictly “reverse chronological”, since ActivityPub reserves OrderedCollection for reverse chronological collections where the newest items are always first.

So both in theory and in practice you should only be identifying Activities by the presence of the actor property. Whether you understand the specific type of the Activity is a separate concern – maybe you don’t understand Flag activities, for example, but you can still tell it’s an Activity and not a generic Object because of that actor property. That way, you don’t mistakenly try to wrap it in a Create or anything like that.
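The wrapping decision described here fits in a few lines. A minimal sketch, where `ensure_activity` is a hypothetical helper name:

```python
AS2_CONTEXT = "https://www.w3.org/ns/activitystreams"


def ensure_activity(obj: dict, actor: str) -> dict:
    """Pass activities through untouched; wrap anything else in a Create."""
    if "actor" in obj:
        # Already an activity, even if its specific type is unfamiliar.
        return obj
    return {
        "@context": AS2_CONTEXT,
        "type": "Create",
        "actor": actor,
        "object": obj,
    }
```

A Flag activity from an unfamiliar vocabulary passes straight through because it carries `actor`, while a bare Note gets wrapped.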

As far as I can tell, Mastodon only enforces the 500 character limit on people using Mastodon. If you reply to a Mastodon post via ActivityPub, you are not limited to 500 characters.

I’m guessing that they do this because the alternative is to either randomly truncate the post at 500 characters, which would not be a good user experience, or they link to the comment on the originating community server, which they seem to be reluctant to do since that sends you away from Mastodon, which would also not be a good user experience. If they reject anything over 500 characters, that would not be a good user experience either, and would give people the impression that communications with Mastodon are unreliable (since your ActivityPub client might not tell you why your post was rejected).

Basically, they can limit posts to 500 characters in Mastodon because they can stop it before it gets posted. If the post comes from anyone other than Mastodon, the post is already the size it is, and to preserve a good user experience, they just display it, even if it is 1000 characters long.