FEP-dc88: Formatting Mathematics

Hello!

This is a discussion thread for the proposed FEP-dc88: Formatting Mathematics.
Please use this thread to discuss the proposed FEP and any potential problems
or improvements that can be addressed.

Summary

This FEP recommends a method for formatting mathematics in ActivityPub
post content in [MathML Core]. Furthermore, this FEP describes how to
sanitize and convert such mathematics to plain text, if an
implementation does not wish to support mathematical formatting.

Thanks for writing this! This FEP will hopefully help make the Mathematics Fedi better.

Two quick questions:

  • Do I understand it correctly that this is a different process than mathstodon has implemented?
  • How does MathML Core relate to MathML? My Firefox has problems rendering the MathML Core samples that I pasted into an HTML file, but MathML usually works fine.

Do I understand it correctly that this is a different process than mathstodon has implemented?

Yes, all implementations that I know about (mathstodon and types.pl) currently have added client-side LaTeX-like formatting software to their respective web-apps. What this does is render text between certain delimiters (it is currently \( and \) but they have tried quite a few, and several have been incompatible) as math.

But in my opinion this isn’t a reasonable solution, since interoperability is hard (even if everybody could agree on delimiters, there are different quirks between different engines) and it can render text from non-math instances incorrectly.
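To illustrate the second problem, here is a minimal sketch of delimiter-based detection. It is not the code mathstodon or types.pl actually run; the regular expression and the two sample posts are assumptions made up for illustration.

```python
import re

# Hypothetical pattern: a non-greedy match of anything between \( and \).
INLINE_MATH = re.compile(r"\\\((.+?)\\\)")


def find_math_spans(text: str) -> list[str]:
    """Return the substrings a delimiter-based renderer would treat as math."""
    return INLINE_MATH.findall(text)


# A post written on a math instance: the author intends the span as LaTeX.
math_post = r"Euler: \(e^{i\pi} + 1 = 0\)"
# A post from a non-math instance: \( and \) are sed capture-group syntax.
shell_post = r"Use sed 's/\(foo\)/\1bar/' to append bar."

print(find_math_spans(math_post))   # the LaTeX source, as intended
print(find_math_spans(shell_post))  # "foo" is also picked up as math
```

The renderer has no way to tell the two posts apart, which is why marking math explicitly in the content looks more robust than guessing from delimiters.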

How does MathML Core relate to MathML? My Firefox has problems rendering the MathML Core samples that I pasted into an HTML file, but MathML usually works fine.

MathML is a much broader specification, and while it was originally included as a part of HTML5, it was never implemented in full by browsers (Firefox tried).
MathML Core is a much leaner, browser-focused specification, and is implemented by all major browsers. The MDN page about MathML Core has a better description that I copied from this FEP, as well as a browser compatibility list.

Can you explain a little bit more about the goals of the requirements in the “Formatting Mathematics” section? For example, the requirements as follows:

A math element MUST contain one <semantics> child element, and no other children. The <semantics> element MUST contain a [MathML Core] expression as its first child, and at least one <annotation> element. The encoding property of this <annotation> element SHOULD be "application/x-tex", but MAY be "text/plain", and MUST contain a plain-text description of the mathematics—preferably in the authored format. The implementation MAY include additional <annotation> or <annotation-xml> elements with other semantic information.

I think, but am not 100% confident, that this requirement is so that clients which do not wish to render math content can remove math elements. If so, this should be explained further since it was very unclear to me while reading the FEP and the language used seemed very opaque.
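To make the quoted requirement concrete, here is a minimal sketch of a structural check using Python's standard xml.etree.ElementTree. The function names are hypothetical, it assumes the math element is well-formed XML, and it does not validate that the first child really is a MathML Core expression.

```python
import xml.etree.ElementTree as ET

# Encodings the quoted text allows for the plain-text annotation.
TEXT_ENCODINGS = {"application/x-tex", "text/plain"}


def local_name(element: ET.Element) -> str:
    """Return the tag name without its '{namespace}' prefix, if any."""
    return element.tag.rsplit("}", 1)[-1]


def check_math_structure(markup: str) -> list[str]:
    """Return a list of problems found; an empty list means the checks passed."""
    root = ET.fromstring(markup)
    if local_name(root) != "math":
        return ["root element is not <math>"]
    children = list(root)
    if len(children) != 1 or local_name(children[0]) != "semantics":
        return ["<math> must contain exactly one <semantics> child"]
    parts = list(children[0])
    problems = []
    # The first child must be the expression itself, not an annotation.
    if not parts or local_name(parts[0]) in ("annotation", "annotation-xml"):
        problems.append("<semantics> must start with a MathML Core expression")
    # At least one non-empty <annotation> with a plain-text encoding.
    annotations = [
        part for part in parts
        if local_name(part) == "annotation"
        and part.get("encoding") in TEXT_ENCODINGS
        and (part.text or "").strip()
    ]
    if not annotations:
        problems.append("no plain-text <annotation> found")
    return problems
```

A math element that passes this check always carries a text form, so a client that cannot render MathML has something to fall back on.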

This requirement was also specifically confusing to me: "The implementation SHALL NOT remove any [Semantic Attributes] or MathML core Elements and instead should replace a math element with text." What is the purpose of this requirement? How is “[replacing] a math element with text” not “removing a Math ML core [sic] Element”? In either case an element is removed, no? Also, I don’t understand how the requirement “The implementation MAY remove a math element completely” from the later section is compatible with the requirement "The implementation SHALL NOT remove any … MathML core Elements". Isn’t that a contradiction?

“The implementation” is to be interpreted as an ActivityPub conformant Client, ActivityPub conformant Server or ActivityPub conformant Federated Server as described in [ActivityPub] which wishes to produce or consume mathematically formatted content.

Nit: only ActivityPub Clients produce and consume mathematically formatted content. In the ActivityPub specification world, Servers and Federated servers only relay mathematically formatted content produced by others.

Mastodon tracking issue: Implement FEP-dc88: Formatting Mathematics · Issue #26943 · mastodon/mastodon · GitHub
Glitch-soc tracking issue: Implement FEP-dc88: Formatting Mathematics · Issue #2410 · glitch-soc/mastodon · GitHub
Akkoma tracking issue: #641 - [feat] Implement FEP-dc88: Formatting Mathematics - akkoma - Akkoma Development

Initial glitch-soc proof of concept: Allow MathML Core tags in post content by 4e554c4c · Pull Request #1933 · glitch-soc/mastodon · GitHub
Initial akkoma proof of concept: #642 - WIP: FEP-dc88: Formatting Mathematics - akkoma - Akkoma Development

Lessons learned from initial proofs of concept

The akkoma proof of concept does not seem to be viable due to VueJS incorrectly adding MathML tags to the wrong namespace in the DOM. This looks like it is going to be resolved soon (in the next few months) but can serve as an example of potential issues in client implementations.

Yep! This is the goal. I attempted to clarify this in the “Sanitizing Mathematically Formatted Text” section, but this did not appear to work!
Perhaps some background should be added explaining that one of the main issues with mathematical formatting up to this point has been that MathML is not ‘round trippable’, and that this specification aims to improve that for plain-text preferring clients.

My attempt here was to give two alternative methods of sanitizing elements. Care must be taken not to partially sanitize a MathML expression by removing semantic information (the attribute linethickness=0 on mfrac tags is a great example here). Instead, the implementation should strip the <math> node completely and display text.
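A minimal sketch of the "strip the whole node and display text" approach, again in Python with the standard library. It assumes the post content is well-formed XML, takes the first non-empty annotation as the text form, and falls back to the element's own text when there is none; the function names and that fallback are my assumptions, not wording from the FEP.

```python
import html
import xml.etree.ElementTree as ET


def local_name(element: ET.Element) -> str:
    """Return the tag name without its '{namespace}' prefix, if any."""
    return element.tag.rsplit("}", 1)[-1]


def annotation_text(math: ET.Element) -> str:
    """Pick the first non-empty annotation, else all text in the element."""
    for node in math.iter():
        if local_name(node) == "annotation" and (node.text or "").strip():
            return node.text.strip()
    return "".join(math.itertext()).strip()


def replace_math_with_text(content: str) -> str:
    """Replace every <math> element in well-formed markup with its text form."""
    root = ET.fromstring(f"<div>{content}</div>")  # wrapper gives one root
    for parent in list(root.iter()):
        previous = None
        for child in list(parent):
            if local_name(child) != "math":
                previous = child
                continue
            # Splice the text form and the trailing text into the neighbours.
            text = annotation_text(child) + (child.tail or "")
            if previous is None:
                parent.text = (parent.text or "") + text
            else:
                previous.tail = (previous.tail or "") + text
            parent.remove(child)
    # Serialize the wrapper's contents without the wrapper itself.
    leading = html.escape(root.text or "", quote=False)
    return leading + "".join(ET.tostring(c, encoding="unicode") for c in root)
```

The math node is removed as a unit, so no partially sanitized MathML (for example an mfrac that has lost its linethickness attribute) is ever displayed.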

This wasn’t the easiest to lay out precisely. I will try to re-work how to state explicitly that the <math> node can be removed completely.

Noted!

Looking at this, and playing around, I noticed that in order to properly render MathML Core, one needs to include

<head>
<meta charset="utf-8">
</head>

in one’s document for it to work. Silly people, who type in vi index.html to test something and then type down plain HTML, might find this information useful.


Something entirely different: text/markdown+math is not a valid IANA MIME type, see

https://www.iana.org/assignments/markdown-variants/markdown-variants.xhtml

Would it be possible to use text/markdown+quarto in the example? I think one has to replace \( eq \) with $$ eq $$ for that.

I wasn’t attempting to draft using quarto here, and I don’t think we need to aim for a registered MIME type for the contentType of the source property. For example, the ActivityPub standard playfully uses text/x-org here, which isn’t a registered MIME type either.

My idea was that math instances are going to provide a custom way to draft LaTeX in markdown or plaintext (probably by surrounding it with instance-specific delimiters) and could provide a custom source contentType to represent this.

In any case, the source property isn’t very important in this specification, and was there to represent a valid text encoding of the message. So I suppose it could be changed to text/plain.

First implementation live at https://types.pl
Relevant PR: Allow MathML Core tags in post content by 4e554c4c · Pull Request #13 · ralsei/types.pl · GitHub

Example ActivityPub object: pounce :verified_blobcat:: "This is a test MathML post that is publicly avail…" - types.pl

Wordpress tracking issue: Feature Request: Allow MathML Core · Issue #452 · Automattic/wordpress-activitypub · GitHub

types.pl seems to be running in AUTHORIZED_FETCH mode, so the ActivityPub JSON payload is not easily available for inspection.

reference object that does not require AUTHORIZED_FETCH

also implemented here https://genau.qwertqwefsday.eu/notes/9k1517t772

JSON-LD representation
{
    "@context": [
        "https://www.w3.org/ns/activitystreams",
        "https://w3id.org/security/v1",
        {
            "xsd": "http://www.w3.org/2001/XMLSchema#",
            "manuallyApprovesFollowers": {
                "@id": "as:manuallyApprovesFollowers",
                "@type": "xsd:boolean"
            },
            "sensitive": {
                "@id": "as:sensitive",
                "@type": "xsd:boolean"
            },
            "Hashtag": "as:Hashtag",
            "toot": "http://joinmastodon.org/ns#",
            "Emoji": "toot:Emoji",
            "featured": {
                "@id": "toot:featured",
                "@type": "@id"
            },
            "discoverable": {
                "@id": "toot:discoverable",
                "@type": "xsd:boolean"
            },
            "quoteUri": {
                "@id": "http://fedibird.com/ns#quoteUri",
                "@type": "@id"
            },
            "schema": "http://schema.org/",
            "PropertyValue": {
                "@id": "schema:PropertyValue",
                "@context": {
                    "value": "schema:value",
                    "name": "schema:name"
                }
            },
            "misskey": "https://misskey-hub.net/ns#",
            "_misskey_quote": {
                "@id": "misskey:_misskey_quote",
                "@type": "@id"
            },
            "_misskey_talk": {
                "@id": "misskey:_misskey_talk",
                "@type": "xsd:boolean"
            },
            "isCat": {
                "@id": "misskey:isCat",
                "@type": "xsd:boolean"
            },
            "vcard": "http://www.w3.org/2006/vcard/ns#"
        }
    ],
    "id": "https://genau.qwertqwefsday.eu/notes/9k1517t772",
    "type": "Note",
    "attributedTo": "https://genau.qwertqwefsday.eu/users/8oxbqesrd1",
    "summary": null,
    "content": "<p><math xmlns=\"http://www.w3.org/1998/Math/MathML\"><semantics><mrow><mi>b</mi><mi>l</mi><mi>a</mi><mi>b</mi></mrow><annotation encoding=\"application/x-tex\">blab</annotation></semantics></math></p>",
    "source": {
        "content": "\\(blab\\)",
        "mediaType": "text/x.misskeymarkdown"
    },
    "published": "2023-09-24T14:47:24.619Z",
    "to": [
        "https://www.w3.org/ns/activitystreams#Public"
    ],
    "cc": [
        "https://genau.qwertqwefsday.eu/users/8oxbqesrd1/followers"
    ],
    "inReplyTo": null,
    "attachment": [],
    "sensitive": false,
    "tag": []
}

I’m a bit concerned about a fallback level for implementations that do not implement this at all. Without any accommodation, formulas will just be sanitized away completely, leaving a probably slightly nonsensical message. This problem is not unknown, though, e.g. previous versions of Mastodon removed strikethrough. I myself do not use formulas that often, so I am not really bothered if some users don’t understand my post, but others might have issues with that.

oh awesome! is there a relevant (miss, calc, etc.)key PR or issue that I can reference?

Indeed, this is a valid concern. My justification is that math currently federates in an odd way, and this FEP improves upon that somewhat. Other than that, I’m not sure how to provide a fallback for mathematics that works well with existing scrubbers.

There are these two commits for Foundkey, one for outputting MathML and one for parsing MathML. “Parsing” is just taking the TeX annotation and putting it back into the respective formula notation of MFM, so it will be rendered again by clients. (If there is no TeX annotation, it takes the plaintext and renders it as a code block.)
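The “parsing” direction can be sketched roughly as follows. This is a Python illustration, not the Foundkey code. The \( and \) delimiters match the source property of the example object above, while the function name and the triple-backtick code fence are assumptions.

```python
import xml.etree.ElementTree as ET

FENCE = "`" * 3  # assumed code-block delimiter


def math_to_mfm(markup: str) -> str:
    """Turn one well-formed <math> element back into formula notation."""
    math = ET.fromstring(markup)
    for node in math.iter():
        is_annotation = node.tag.rsplit("}", 1)[-1] == "annotation"
        if is_annotation and node.get("encoding") == "application/x-tex":
            # TeX annotation found: wrap it in the inline formula delimiters.
            return "\\(" + (node.text or "") + "\\)"
    # No TeX annotation: fall back to the element's plain text in a code block.
    plain = "".join(math.itertext()).strip()
    return f"{FENCE}\n{plain}\n{FENCE}"
```

Applied to the math element of the example object above, this gives back \(blab\), the same string as in its source property.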