Test cases for FEPs

A small set of test cases was included in the submission of FEP-4adb, and the author, @helge, requested this topic to discuss their inclusion.

A couple of dimensions to consider:

  • Scope for the FEP repo: does this solve an issue we can all agree needs to be addressed in this project?
  • Administrative overhead and expectations of automation around such testing

And from the pull request, the author provides the following comment:

I’ve added a Gherkin feature with corresponding behave test of bovine to this pull request.
I’m 90% sure that the format I’ve used isn’t great yet. For example, the feature and the bovine-specific code are mixed up in the file system.
I’m 100% sure that I’m interested in hearing feedback on how to combine test cases + fep before trying to come up with a solution. My hope would be that this type of test is reusable across different implementations, and thus we are able to get some minimal standard of common test coverage for fundamentals.

What are your thoughts? Have you seen testing integrated in other standards processes?

Have you seen testing integrated in other standards processes?

I may not fully understand the question, but there are test suites like this:
Test Suite for W3C Linked Data Platform 1.0

By testing, I assume you mean automated testing of some kind for compliance to the specifications (?).

As I’m sure you know, there was an ActivityPub test suite developed during the specification process but it was never really finished.

For the FEP Gherkin tests, I think they could potentially be useful. I’d like to see them more heavily documented and cover the FEP requirements more completely. I’m not an expert on Gherkin, but I do know it can be sensitive to small changes in the pseudo-natural language test specifications. However, that may not be as significant an issue for a small test suite.

Thanks for the link. Yes, I mean automated testing. I don’t think it hurts to have it in there, but as it is code meant for automation, I worry about the burden of maintenance as the frameworks and libraries it relies on change over time.

I’m still undecided about whether it’s a good idea to include the tests. However, if we’re going to include any tests, I think it’s a good idea to make them programming language-independent like these. That could be Gherkin or something like the test language used for LDP.

For an FEP like this one, I wonder if the entire FEP could be a Gherkin feature specification with the algorithm details described in the feature documentation and the examples represented by the Scenario test cases. It’s just a thought, but it would be an interesting option if there are tools for nicely formatting Gherkin feature specifications.

Can you point out where an actual test case is described? Clicking through the site, I’m not finding anything I can identify as a test, i.e. an assertion. I found

https://dvcs.w3.org/hg/ldpwg/raw-file/tip/tests/ldp-testsuite.html#test-case-example

but the linked Example seems like a description of who wrote the test and does not contain an assertion.

Further examples done by W3C

  • JSON-LD Expansion test cases contains a lot of test cases for the JSON-LD expansion method. These test cases are all in the form “input document” → “output document”.
  • did-core test suite … not 100% sure what happens there, but my impression is that it consists of JavaScript tests. So it’s very focused on a single language.
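
To illustrate the “input document” → “output document” style, here is a minimal Python sketch of a data-driven runner. The case format (the `name`, `input` and `expected` keys), the `run_cases` helper and the toy `lowercase_keys` transform are all assumptions made up for illustration; this is not the actual manifest format of the JSON-LD test suite.

```python
import json
from typing import Any, Callable

# Hypothetical case format: each case pairs an input document with the
# output document the implementation is expected to produce.
CASES = json.loads("""
[
  {"name": "keys are lowercased",
   "input": {"Type": "Note"},
   "expected": {"type": "Note"}},
  {"name": "empty document",
   "input": {},
   "expected": {}}
]
""")


def lowercase_keys(document: dict) -> dict:
    """Toy stand-in for the function under test (not JSON-LD expansion)."""
    return {key.lower(): value for key, value in document.items()}


def run_cases(transform: Callable[[Any], Any], cases: list) -> list:
    """Run every case and return the names of the cases that failed."""
    failed = []
    for case in cases:
        if transform(case["input"]) != case["expected"]:
            failed.append(case["name"])
    return failed


# An empty list means every case passed.
print(run_cases(lowercase_keys, CASES))
```

Because the cases are plain data, the same file could in principle be consumed by a runner written in any language; only the small runner has to be rewritten per implementation.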

:information_source: One approach could be to have a first feature repeating the examples from the FEP and a second one going in more details (in particular covering all kinds of bad cases).

After doing some digging, I found this…

ldp-testsuite/ldp-earl-manifest-client-only.ttl at master · w3c/ldp-testsuite · GitHub

These are client-side tests and appear to be manual tests, similar to the AP server test suite. The automated test metadata appears to be embedded in the Java source code for the testing applications.

Example:

	@Test(
			groups = {MUST},
			description = "LDP servers exposing LDPCs MUST advertise their "
					+ "LDP support by exposing a HTTP Link header with a "
					+ "target URI matching the type of container (see below) "
					+ "the server supports, and a link relation type of type "
					+ "(that is, rel='type') in all responses to requests made "
					+ "to the LDPC's HTTP Request-URI.")
	@SpecTest(
			specRefUri = LdpTestSuite.SPEC_URI + "#ldpc-linktypehdr",
			testMethod = METHOD.AUTOMATED,
			approval = STATUS.WG_APPROVED,
			comment = "Covers only part of the specification requirement. "
					+ "DirectContainerTest.testHttpLinkHeader and "
					+ "IndirectContainerTest.testContainerSupportsHttpLinkHeader "
					+ "covers the rest.")
	public void testContainerSupportsHttpLinkHeader() {
...

Maybe they generate EARL test data from those annotations? I don’t know.

I’ve investigated Gherkin tooling a bit more and I’m quite disappointed. For example, I could not find something that takes a .feature file and checks for syntax mistakes. When checking for an auto formatter, I found:

which might also do the job of a syntax check.


The second thing I have: it might be a good idea to tag the features corresponding to a FEP with that FEP’s identifier, e.g.

  @fep-4adb
  Scenario: User found
    Given Webfinger response:
      """
      {
        "links": [
          {
            "href": "https://having.examples.rocks/endpoints/mooo",
            "rel": "self",
            "type": "application/activity+json"
          }
        ],
        "subject": "acct:moocow@having.examples.rocks"
      }
      """
    When Looking up "acct:moocow@having.examples.rocks"
    Then Lookup at "https://having.examples.rocks/.well-known/webfinger?resource=acct%3Amoocow%40having.examples.rocks"
    Then ActivityPub Object Id is "https://having.examples.rocks/endpoints/mooo"

This would allow one to run only the features for the FEPs one has implemented.
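
As a rough illustration of what the step definitions behind this scenario have to check, here is a Python sketch using only the standard library. The names `webfinger_url` and `activitypub_id` are hypothetical and are not bovine’s API, and the rule for picking the link (`rel` of `self` with type `application/activity+json`) is an assumption read off the example above rather than taken from the FEP text.

```python
from typing import Optional
from urllib.parse import quote


def webfinger_url(account: str) -> str:
    """Build the WebFinger lookup URL for an acct: URI."""
    if not account.startswith("acct:") or "@" not in account:
        raise ValueError(f"not an acct: URI with a domain: {account!r}")
    domain = account.rsplit("@", 1)[1]
    # safe="" makes quote() percent-encode ":" and "@" as well.
    resource = quote(account, safe="")
    return f"https://{domain}/.well-known/webfinger?resource={resource}"


def activitypub_id(webfinger_response: dict) -> Optional[str]:
    """Return the href of the first ActivityPub "self" link, if any."""
    for link in webfinger_response.get("links", []):
        if (link.get("rel") == "self"
                and link.get("type") == "application/activity+json"):
            return link.get("href")
    return None


# The WebFinger response from the scenario above.
response = {
    "links": [
        {
            "href": "https://having.examples.rocks/endpoints/mooo",
            "rel": "self",
            "type": "application/activity+json",
        }
    ],
    "subject": "acct:moocow@having.examples.rocks",
}
print(webfinger_url(response["subject"]))
print(activitypub_id(response))
```

With behave, functions like these would be called from the step implementations, and tagged scenarios can be selected with its `--tags` option, e.g. `behave --tags=fep-4adb`.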

Support depends quite a bit on the tools you use and the language kits that are available. In VSCode / VSCodium, for instance, there are these handy plugins: https://github.com/alexkrechik/VSCucumberAutoComplete and the official Cucumber plugin https://github.com/cucumber/vscode

I was thinking about “automation” when I wrote the above. Basically debating how to integrate features into continuous integration.

Currently in bovine, I run:

      - cd features
      - git clone --branch fep-4adb https://codeberg.org/helge/fep.git
      - cd ..
      - poetry run behave

to execute the features corresponding to the FEP.

After starting to work out my own workflow, I’m tending towards a “fep-feature” repository:

  • Features will need to be checked out by all implementers whenever they want to run them.
  • It might be possible to automate generating an implementation matrix, listing which projects implement which FEP.

Here implement means “green for feature tests”.
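
A minimal sketch of how such a matrix might be generated, assuming each project publishes, per FEP tag, whether all of its feature tests were green. The input format, the project names and the `implementation_matrix` helper are hypothetical and only serve as illustration.

```python
# Hypothetical input: for each project, whether the features tagged with a
# given FEP all passed ("green") in that project's test run.
RESULTS = {
    "project-a": {"fep-4adb": True},
    "project-b": {"fep-4adb": False},
}


def implementation_matrix(results: dict) -> str:
    """Render a plain-text table of projects versus FEPs."""
    # Collect every FEP tag that any project reports on.
    feps = sorted({fep for per_fep in results.values() for fep in per_fep})
    lines = ["project | " + " | ".join(feps)]
    for project in sorted(results):
        # A FEP with no reported result counts as not implemented.
        cells = [
            "yes" if results[project].get(fep, False) else "no"
            for fep in feps
        ]
        lines.append(project + " | " + " | ".join(cells))
    return "\n".join(lines)


print(implementation_matrix(RESULTS))
```

Treating a missing result as “no” keeps the table conservative: a project only shows up as implementing a FEP when it has actually reported a green run.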

There’s no rush on a decision here but so far there doesn’t seem to be a strong case for the tests. I can see doing so if a reference implementation were provided for a complicated protocol, but perhaps it’s better there to develop a set of FEPs that describe modular parts of such a protocol.

In the case of the Linked Data suite, the repo hasn’t seen an update in 4 years, which doesn’t bode well and somewhat proves my point about overhead.

Perhaps it’s best to leave tests to the implementations and keep FEPs agile.

I agree. The case for including something like Gherkin features with submitted FEPs is currently:

  • I like the principle of it.

I don’t think that the community currently has the appetite for this. I’m surprised, given that the broken ActivityPub Test Suite is still a matter of debate, but I guess people like the idea of tests and not the nitty-gritty details of actually getting them.

My current plan is to start a repository for myself with features related to FEPs. If interest arises, people will be able to contribute / turn it into a more official process.

As it amuses me, just seen on codeberg.org:

Just to check that nobody has any objections to this, before I remove the demo folder from fep-4adb.

New home for tests: helge/fediverse-features: Gherkin Features for BDD in the fediverse - fediverse-features - Codeberg.org

In general I feel that Gherkin behaviour tests:

  • Offer a nice language-independent way to specify behaviours.
  • Add something extra to a free-text description when the Gherkin text is concise.
  • Can be understood by non-technical folks, while the format helps guide towards implementation.
  • Lend themselves to automation in test suites for any language where BDD libs are available.
  • Still offer enough guidance to write tests for languages where BDD libs aren’t present.

So it may be that Gherkin scripts are usable enough for FEPs to be strongly recommended as an accompaniment to the free-form specification text. That would then entail:

  • Documenting the FEP process for ways to deal with behaviour testing.
  • Likely having Gherkin scripts in the FEP subdirectory.

We are not there yet. So I think it is fine if you experiment more with Gherkin features in your own repo. Only to the extent that they truly belong to a FEP should they be present in the fep repo (though they can cross-reference other information sources; where possible, FEPs should be self-contained, imho).

I have now written/recycled a feature for HTTP signatures. My hope is that by using common features across many applications, one would be able to reduce interoperability concerns. As HTTP signatures are a common stumbling block, this seems like a good test case.

The step definitions I have added for bovine are here. One should note that this is probably not yet what I want. My interest was more in getting the Gherkin written than in having it perfectly test bovine yet.
