The ActivityPub test suite

cwebber · November 5, 2019, 2:52pm

test.activitypub.rocks has been down for a long time, as pretty much everyone knows. More or less it’s server issues, and I can be blamed since I run the server (and wrote the test suite).

Why hasn’t it gone back up though? Mainly because I’m a) very busy and b) aware of how poorly the test suite fulfills the needs of the community, which makes it hard to feel motivated to put in the work.

Here are some quirks about test.activitypub.rocks:

It was done in a rush to meet the standards deadline. The main purpose was to collect implementation reports, and given the rush some things are funny about it.
It was done in such a rush that it’s a completely separate program that’s just bundled straight into the same codebase as the test ActivityPub implementation I was writing, Pubstrate (which has some good ideas but which I won’t recommend either).
Technically it’s two test suites. Scratch that, it’s a test suite and a questionnaire:
- There’s a test suite for the client-to-server protocol, which really does resemble a test suite; it makes many requests against your server and sees if they work right. (It’s buggy, though.)
- There’s a mostly-questionnaire (but it runs a couple of tests, I think) for the server-to-server protocol. Why? Because a) performing a test suite on an asynchronous protocol is already hard enough (when does the test complete? it can be done with timeouts, but it’s gross), b) many of the requirements needed some authentication or authorization to take place, but due to process reasons we couldn’t specify what that was, and c) at that point we already had an unusual amount of interop, so we decided that if multiple implementations could federate on a feature and could confirm that in the questionnaire, we could consider that sufficient. (Even weirder, some of the questions would involve the test server doing something to your server, but then still asking the user to confirm whether the thing worked.)
A proper test suite wouldn’t work like this. It would be fully automated.
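
To make the "when does the test complete?" problem concrete, here is a minimal sketch of the timeout approach in Python. It is not taken from Pubstrate or the existing test suite; `wait_for_delivery` and `check_inbox` are hypothetical names, and the idea is only to poll until the expected activity shows up or a deadline passes.

```python
import time
from typing import Callable, Optional


def wait_for_delivery(
    check_inbox: Callable[[], Optional[dict]],
    timeout: float = 5.0,
    interval: float = 0.1,
) -> Optional[dict]:
    """Poll check_inbox until it returns an activity or the timeout expires.

    Returns the activity, or None if nothing arrived in time. A None result
    cannot distinguish "never delivered" from "delivered too late", which is
    the awkward part of testing an asynchronous protocol this way.
    """
    deadline = time.monotonic() + timeout
    while True:
        activity = check_inbox()
        if activity is not None:
            return activity
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)
```

A test would pass in a `check_inbox` callable that looks for the activity it just asked the remote server to send, and treat `None` as a failure (or as "inconclusive", depending on how strict the suite wants to be).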

It does look pretty though, that’s the only thing I’ll give it credit for.

Where to from here? There are really a few options:

Do nothing, things remain broken. Obviously not ideal!
I get the test suite back up. Again, it’s hard for me to get motivated about this but I could probably do it. But then I fear that once I get it up, all I’ll get in response is “THIS is the test suite? WTF, this isn’t a test suite at all!”
Someone else could get the test suite up based on the current codebase and I could transfer running it to them. Notably someone tried this already but they weren’t familiar with Guix, and that’s currently the easiest way to get it up and running.
Someone else could write a new, much better version of the test suite. In fact we discussed this on a recent SocialCG call and I was asked how I would feel about that, and I more or less said I thought that would be great. But someone needs to do it. Who? Maybe it’s you!

So… what next?

rocco · November 6, 2019, 9:56am

I favor the last option:

Someone else could write a new, much better version of the test suite

It’s even possible that I can help, but I’m a guile/scheme noob so there’s probably not much I can do with the legacy test suite.

To make sure it is not just me who is scared by Scheme, Elixir & friends, I did a quick, unscientific search for ActivityPub-related repos on GitHub and got 233 repos with this language distribution:

number of repos	programming language
30	Python
25	Go
23	PHP
21	JavaScript
16	Rust
13	Ruby
10	TypeScript
9	HTML
7	Elixir
5	Clojure

IMHO the test suite should be implemented in a language which is both popular among activitypub hackers, and has good support for testing asynchronous protocols.

cjs · November 6, 2019, 3:16pm

I think that’s a nice-to-have. If someone is generous enough with their time to do a rewrite, I think that they should use the language they’re most comfortable with.

However, if they happened to pick Go, and happened to want to use go-fed apcore (which would help motivate my current slow-go near the finish line), I would gladly offer heavy collaboration (voice/video chat).

rocco · November 6, 2019, 4:30pm

Golang can do.

Tentative requirements for this webapp:

test suite results are generated as static HTML files that can be accessed publicly from a permalink; there is a summary page for all implementations, with the latest test
to start a test, the user has to log in to an “implementer dashboard” using her socialhub.activitypub.rocks account (we can use Discourse as an SSO provider, there is an official golang implementation) and request it by filling in a form
test requests are stored in a queue and asynchronously processed; basic queue management: list tests, test status, cancel test, retry test
notifications (“test failed”, “test passed” etc.) are posted to a discourse category (https://socialhub.activitypub.rocks/c/tests), quoting the user (@rocco) so that she is pinged
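
As a rough illustration of the queue requirement above (list tests, test status, cancel test, retry test), here is a small in-memory sketch in Python. Everything in it is an assumption made for illustration: the names `RunRequest`, `RunQueue` and `Status`, the set of states, and the rule about which states can be cancelled or retried. A real webapp would persist the queue and run the tests in a worker.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class Status(Enum):
    QUEUED = "queued"
    RUNNING = "running"
    PASSED = "passed"
    FAILED = "failed"
    CANCELLED = "cancelled"


@dataclass
class RunRequest:
    id: int
    user: str    # hypothetical: the Discourse username of the implementer
    target: str  # hypothetical: URL of the implementation under test
    status: Status = Status.QUEUED


class RunQueue:
    """In-memory stand-in for the test request queue."""

    def __init__(self) -> None:
        self._requests: Dict[int, RunRequest] = {}
        self._next_id = 1

    def submit(self, user: str, target: str) -> RunRequest:
        request = RunRequest(self._next_id, user, target)
        self._requests[request.id] = request
        self._next_id += 1
        return request

    def list_requests(self) -> List[RunRequest]:
        return list(self._requests.values())

    def status(self, request_id: int) -> Status:
        return self._requests[request_id].status

    def cancel(self, request_id: int) -> bool:
        # Only requests that have not finished yet can be cancelled.
        request = self._requests[request_id]
        if request.status in (Status.QUEUED, Status.RUNNING):
            request.status = Status.CANCELLED
            return True
        return False

    def retry(self, request_id: int) -> bool:
        # Only failed or cancelled requests go back into the queue.
        request = self._requests[request_id]
        if request.status in (Status.FAILED, Status.CANCELLED):
            request.status = Status.QUEUED
            return True
        return False
```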

Types of test suites:

user has a client up and running:
- c2s: test user’s client against our server
user has a server up and running:
- c2s: test user’s server against our client
- s2s: test user’s server against our server

The webapp must be dockerized so that it can be run standalone with a development local discourse instance.

It should also be possible to run each test suite locally from the CLI, so that the same tests can be run as local tests or as part of CI for each implementation (Go implementations will benefit more from this).

yvolk · March 31, 2020, 5:17am

The AndStatus app for Android is a social networking client that has most of the features the ActivityPub test suite listed for the Client-to-Server protocol. And it works with at least one real server software: Pleroma. More details are here: https://github.com/andstatus/andstatus/issues/499

In this sense the app may be viewed as a semi-automated test suite for a server’s client-to-server implementation.

I am willing to extend the application’s “ActivityPub tester” features. The only showstopper is the absence of server-side implementations. What are we talking about here if almost nobody develops the Client-to-Server part of the ActivityPub specification?

AndStatus already has a large automated self-test suite, so it’s not a problem to create (actually, compose from existing blocks) another test suite, focused on the features that are needed for ActivityPub testing.

The app is written in Java.

how · March 31, 2020, 7:07am

Welcome @yvolk! Indeed C2S seems to have been neglected for too long. I wish it would be used though, since a full client implementation would make “apps” fade away and the protocol shine.

OMG, why are you having this discussion there instead of here?

Sweet! I’m sure you’ll find here people willing to help. Yes, people?

yvolk · March 31, 2020, 6:42pm

OMG, why are you having this discussion there instead of here?

That discussion is about the concrete implementation of a concrete application. It gives real ground and subjects for “higher level” discussions and decision making.

I hope that having a working client app can motivate server-side developers to implement client-to-server support AND that such an app will be used as a development/test tool. In fact, it is already used this way, helping to find programming and conceptual mistakes in Pleroma’s ActivityPub C2S implementation.

strypey · April 12, 2020, 12:29am

It’s great to see Pleroma and AndStatus working together on AP C2S! Is there a thread here where we can follow progress on this, and point other devs to it who want to try implementing it too, e.g. in the C2S category?

yvolk · April 12, 2020, 5:56pm

We just created this topic About AndStatus to discuss C2S
Thanks to @how

how · April 28, 2020, 6:59pm

Someone on the Fediverse asked for the test suite again. @rocco do you think you’d have some time to work on it? Maybe @yvolk can help as well. What do you people need to help around? I could activate Discourse SSO provider if this can help, @rocco.

schmittlauch · April 29, 2020, 9:26am

Also, the person asking had the impression that ActivityPub is dead because of the test suite being down.
So depending on how bad the old test suite is and how much effort is needed to bring it up again, it might make sense to bring the old test suite back up while a new one is being worked on, @cwebber.

Sebastian · May 6, 2020, 6:30am

[ attachment ]

and pinging @dansup

WClayFerguson · August 18, 2020, 4:38am

I think the main thing holding back lots of developers from adding ActivityPub support to lots of platforms is the fact that although the AP Spec looks great and seems very good, it is nearly impossible to make much progress with coding because when you try to federate against Mastodon or some other AP site, stuff will not work, and there’s absolutely no way to know what’s going wrong.

Without a test suite tool, the only way to make progress is to get a copy of Pleroma or Mastodon, and install it, run a server yourself, and basically reverse engineer it by watching the HTTP traffic or logging.
Obviously this is a hurdle not many are going to get over. It’s even worse than that, because both Ruby and Elixir are oddball languages that hardly anyone is using, so the odds are that 99% of the devs that want to implement an AP server are just going to find it nearly impossible.

So the net result is that there is MUCH lower and slower adoption of ActivityPub in the world. Once there’s a test suite available the ecosystem will really take off rapidly, but not until then.

=======
Update: There’s one other alternative that ‘could’ work instead of a test suite tool, and that is a set of complete “Request and Response” conversations for various example scenarios (AP actors and actions), that contains all the HTTP Request headers, URL parameters, and JSON responses, so that there’s no “guess work” having to be done to interpret the spec.
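
As a sketch of what one such recorded conversation could look like, here is a client-to-server exchange written down as Python data, plus a tiny checker. The hostnames, paths, ids and the bearer token are made up for illustration, and `check_response` is a hypothetical helper. The only parts taken from the ActivityPub spec are the content type and the rule that a server answers a successful outbox POST with 201 Created and a Location header.

```python
import json

# Illustrative C2S exchange: a client submits a Create activity to an outbox.
EXCHANGE = {
    "request": {
        "method": "POST",
        "url": "https://social.example/users/alice/outbox",
        "headers": {
            "Content-Type": 'application/ld+json; profile="https://www.w3.org/ns/activitystreams"',
            # Assumption: the spec does not define the auth scheme.
            "Authorization": "Bearer <token>",
        },
        "body": {
            "@context": "https://www.w3.org/ns/activitystreams",
            "type": "Create",
            "actor": "https://social.example/users/alice",
            "to": ["https://www.w3.org/ns/activitystreams#Public"],
            "object": {"type": "Note", "content": "Hello, fediverse!"},
        },
    },
    "response": {
        "status": 201,
        "headers": {"Location": "https://social.example/users/alice/activities/1"},
    },
}


def check_response(expected: dict, status: int, headers: dict) -> list:
    """Compare a real response against the recorded one.

    Checks the status code and that each expected header is present
    (header names are compared case-insensitively, values are not compared
    because ids differ between servers). Returns a list of problems.
    """
    problems = []
    if status != expected["status"]:
        problems.append(f"expected status {expected['status']}, got {status}")
    present = {name.lower() for name in headers}
    for name in expected["headers"]:
        if name.lower() not in present:
            problems.append(f"missing header {name}")
    return problems


if __name__ == "__main__":
    # Print the request body the way it would go over the wire.
    print(json.dumps(EXCHANGE["request"]["body"], indent=2))
```

A collection of exchanges like this, one per scenario, could double as documentation and as fixtures for an automated suite.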

aschrijver · August 18, 2020, 5:44am

I agree. There is an alternative test suite. Check Unofficial Test Suite (go-fed/testsuite) created by @cjs. IMHO, given this is the only functioning test suite, it could as well become the official one.

The testsuite should be made discoverable, though, like with a mention on activitypub.rocks. Same for @dansup’s FediDB (which is out of closed beta, I think?). See Introducing FediDB - DevTools for ActivityPub.

There’s an ongoing discussion related to this: Easing the onboarding of new developers to the Fediverse.

Sebastian · August 18, 2020, 6:19am

Added to

@cjs @rocco
If someone would propose a session for the Conf about Tests, I would join
Also authors of various tools are at the Conference, e.g.
https://tinysubversions.com/notes/activitypub-tool/

how · August 22, 2020, 9:39am

I would be happy to help find funding to fix the test suite if people step up to implement it.

WClayFerguson · August 25, 2020, 2:21pm

One of my goals is to add ActivityPub into my project (to make it Federated) and I’d be willing to do implementation in TypeScript and/or Java for a test suite if I had funding, because I could accomplish both objectives at the same time.

But as I said before, we really wouldn’t even need a TestSuite that much (for new apps at least) if we simply had complete JSON examples including both Request JSON and Response JSON for most of the various C2S and S2S use cases.

These examples would add so much clarity that most of the other documentation would not even be needed… at least to get something up and running, which is the primary challenge.

nilesh · August 30, 2020, 9:26pm

This is useful for me too. I have implemented partial ActivityPub support in an open-source Rails app and it would be nice to know if it meets the actual spec or if it falls short.

How about building a Postman collection as a start? It supports fully automated test cases and can generate code in multiple languages.

marnanel · October 2, 2020, 3:39pm

This sounds like a possible #hackathon topic.

I think a test suite naturally falls into two parts. One part checks protocols which can be accessed anonymously. This involves AP services such as looking up a user’s inbox and paging through it, as well as non-AP services commonly used with AP, such as WebFinger and the other .well-known services.
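
For the anonymous part, a WebFinger check is probably the simplest place to start. The sketch below is only an illustration: `webfinger_url` and `find_actor_url` are hypothetical helper names, the handle is made up, and fetching the URL over HTTP is left out so the example stays self-contained. It builds the lookup URL defined by WebFinger (RFC 7033) and then looks in the returned JSON document for a `self` link with an ActivityPub media type, which is how actors are commonly advertised.

```python
from typing import Optional
from urllib.parse import urlencode

# Media types that identify an ActivityPub actor document.
ACTIVITY_TYPES = {
    "application/activity+json",
    'application/ld+json; profile="https://www.w3.org/ns/activitystreams"',
}


def webfinger_url(handle: str) -> str:
    """Build the WebFinger lookup URL for a handle like 'alice@social.example'."""
    user, _, host = handle.lstrip("@").partition("@")
    if not user or not host:
        raise ValueError(f"not a user@host handle: {handle!r}")
    query = urlencode({"resource": f"acct:{user}@{host}"})
    return f"https://{host}/.well-known/webfinger?{query}"


def find_actor_url(jrd: dict) -> Optional[str]:
    """Return the actor URL from a parsed WebFinger response, or None."""
    for link in jrd.get("links", []):
        if link.get("rel") == "self" and link.get("type") in ACTIVITY_TYPES:
            return link.get("href")
    return None
```

A validator would fetch `webfinger_url(handle)`, parse the JSON body, and report a failure if `find_actor_url` returns `None`.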

The other, more complicated part, checks AP support. This will need to be able to create ephemeral accounts on the fly (like test1234@validator.example.com), each lasting a few hours, and allow you to follow them and send them messages, with a dashboard showing what has been received.

You should also be able to request that the accounts send you messages— but not to set the content, to stop the validator being used as an anonymous message service. For similar reasons, I’m also supposing that each temporary account would be locked into communicating with a single server, specified at creation time.
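
The ephemeral accounts described above could be modelled roughly as follows. This is a sketch under assumptions, not a design anyone in this thread has agreed on: the names `EphemeralAccount` and `create_account`, the three-hour default lifetime, and the `validator.example.com` domain (borrowed from the example handle above) are all placeholders.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from urllib.parse import urlsplit


@dataclass(frozen=True)
class EphemeralAccount:
    handle: str           # e.g. test1234@validator.example.com
    allowed_host: str     # the single server this account may talk to
    expires_at: datetime  # after this moment the account is dead

    def is_active(self, now: datetime) -> bool:
        return now < self.expires_at

    def may_talk_to(self, actor_url: str, now: datetime) -> bool:
        """Allow traffic only while active and only with the locked-in server."""
        return self.is_active(now) and urlsplit(actor_url).hostname == self.allowed_host


def create_account(
    allowed_host: str,
    now: datetime,
    lifetime: timedelta = timedelta(hours=3),
    domain: str = "validator.example.com",
) -> EphemeralAccount:
    # Random four-digit suffix; a real service would also check for collisions.
    name = f"test{secrets.randbelow(10000):04d}"
    return EphemeralAccount(f"{name}@{domain}", allowed_host.lower(), now + lifetime)
```

Usage would look like `create_account("social.example", datetime.now(timezone.utc))`; every incoming or outgoing message is then gated on `may_talk_to`.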

cjs · October 2, 2020, 4:00pm

You may be interested in the BoF session: The ActivityPub Test suite