Status of a Robust ActivityPub Test Suite?

Even with Gherkin, this feels like “starting a new test suite” to me. (I realize you may have a different definition of “test suite” than I do.)

Although the tests I’m running are written in Python, they don’t have any server-specific implementation details in them. I’m going to be experimenting with representing at least some of the tests in Gherkin. If that works well, I envision that the current “driver” code would evolve into the Gherkin step implementations for specific servers.
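A minimal sketch of how server-neutral tests and per-server “driver” code might be separated, so that the driver could later back Gherkin step implementations. Every name here (`ServerDriver`, `InMemoryDriver`, `step_actor_posts_note`, the `server.example` URLs) is a hypothetical illustration, not code from the actual test suite:

```python
from abc import ABC, abstractmethod


class ServerDriver(ABC):
    """Hypothetical server-neutral interface that tests (or steps) talk to."""

    @abstractmethod
    def create_actor(self, name: str) -> str:
        """Create an actor and return its id (a URI)."""

    @abstractmethod
    def post_to_outbox(self, actor_id: str, activity: dict) -> str:
        """Submit an activity to the actor's outbox and return the activity id."""

    @abstractmethod
    def get_outbox(self, actor_id: str) -> list:
        """Return the ids of the activities in the actor's outbox."""


class InMemoryDriver(ServerDriver):
    """Stand-in for a real server driver, present only so the sketch runs."""

    def __init__(self):
        self._outboxes = {}

    def create_actor(self, name):
        actor_id = f"https://server.example/users/{name}"
        self._outboxes[actor_id] = []
        return actor_id

    def post_to_outbox(self, actor_id, activity):
        outbox = self._outboxes[actor_id]
        activity_id = f"{actor_id}/activities/{len(outbox) + 1}"
        outbox.append(activity_id)
        return activity_id

    def get_outbox(self, actor_id):
        return list(self._outboxes[actor_id])


def step_actor_posts_note(driver: ServerDriver, actor_id: str, content: str) -> str:
    """Body of a would-be step such as 'When the actor posts a note'."""
    activity = {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Create",
        "actor": actor_id,
        "object": {"type": "Note", "content": content},
    }
    # Only the driver knows how a specific server accepts the activity.
    return driver.post_to_outbox(actor_id, activity)
```

The step function contains no server-specific details; a real driver for a given server would replace `InMemoryDriver` without changing the step or the scenario text.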

I’ve done some experimentation with Gherkin already, but I haven’t found a representation that I like (one that captures the protocol’s behavioral requirements with minimal extraneous technical detail).

I also believe I’d need some of the features I’ve implemented in the Python tests, namely the ones that conditionally skip or modify tests based on configured server capabilities (and known bugs, etc.). That would require customizing whatever Gherkin runner is used (e.g., “behave”, in Python), and some runners have more hooks for customization than others.
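As a rough sketch of what capability-based skipping could look like, the function below decides from a scenario’s tags whether it should be skipped. The tag conventions (`requires.<capability>`, `bug.<id>`) and the function name are assumptions made up for illustration, not something the existing tests or any runner define:

```python
def skip_reason(tags, capabilities, known_bugs=()):
    """Return a reason to skip a scenario, or None if it should run.

    Hypothetical tag conventions:
      requires.<capability>  skip unless the server has that capability
      bug.<id>               skip if the server is known to have that bug
    """
    # Sort the tags so the reported reason is deterministic.
    for tag in sorted(tags):
        if tag.startswith("requires."):
            capability = tag[len("requires."):]
            if capability not in capabilities:
                return f"server lacks capability: {capability}"
        elif tag.startswith("bug."):
            bug = tag[len("bug."):]
            if bug in known_bugs:
                return f"known server bug: {bug}"
    return None
```

With “behave”, a function like this could plausibly be called from a `before_scenario(context, scenario)` hook in `environment.py`, passing the scenario’s tags and calling `scenario.skip(reason)` when a reason comes back. Modifying a test (rather than skipping it) would need more than this, which is where the runner’s available hooks start to matter.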

So, while I think there could be value in declarative Gherkin test definitions, there’s still a lot of work to be done to create the step implementations and supporting code for the servers being tested. The Diaspora tests have about 3000 lines of Gherkin and about 2000 lines of Ruby step-related code, for example.