What MLS means for Matrix and the future of decentralised group chat

strypey · August 30, 2024, 9:20am

In a comment on a thread about the eternal search for a FarceBook killer, @erlend_sh asked about how Matrix devs are planning to use MLS. To keep that discussion on topic, I’m replying here.

I’m about as far as you could get from an expert on cryptographic standards. For an official position, ask someone involved in Matrix dev, not your friendly neighbourhood evangelist ; )

Afterthought: Now that I’ve written all this out, it strikes me that you’re probably right that the group E2EE protocol in the Matrix standard will continue to be called Megolm. But under the hood it will be a totally different engine.

FWIW here’s my back-of-a-napkin understanding. Any detail in here could be wrong, and I’m keen to be corrected wherever that’s the case. But I’m pretty confident of the overall picture I’m painting.

The Matrix standard includes two protocols for E2EE: Olm for encrypting 1:1 direct chats (one-to-one), and Megolm for encrypting group chats (3 people or more). Both are implemented in encryption libraries like libolm.

Both Olm and Megolm (like OMEMO in XMPP) are based on Signal’s Double Ratchet, which was designed for the 1:1 use case and has inherent scaling limits. So while Megolm does allow Matrix apps to E2EE group chats, it’s always been an ugly hack. Despite the best efforts of Matrix devs, the Megolm implementations create all sorts of race-condition bugs in both servers and apps (the infamous ‘can’t decrypt, won’t decrypt’ issues). Some of the underlying problems are discussed here in relation to group chat in Signal.
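To make the ‘can’t decrypt’ failure mode a bit more concrete, here’s a toy sketch of a sender-key style hash ratchet. This is not the real Megolm algorithm, and every name in it (`advance`, `message_key`, `key_for_index`) is made up for illustration. The only idea it borrows is that the sender shares a ratchet state at some message index, and a recipient can only derive keys from that index onwards. A device that never gets the share, or gets it late, has nothing to derive the earlier keys from.

```python
import hashlib
import hmac
from typing import Optional


def advance(chain_key: bytes) -> bytes:
    """Step the toy ratchet forward by one message (one-way, cannot be reversed)."""
    return hmac.new(chain_key, b"advance", hashlib.sha256).digest()


def message_key(chain_key: bytes) -> bytes:
    """Derive the key for the current message from the chain key."""
    return hmac.new(chain_key, b"message", hashlib.sha256).digest()


def key_for_index(shared_key: bytes, shared_index: int, wanted_index: int) -> Optional[bytes]:
    """Derive the message key at wanted_index from a chain key shared at shared_index.

    Returns None for messages sent before the share: the ratchet only
    moves forward, so those keys are out of reach for this recipient.
    """
    if wanted_index < shared_index:
        return None
    chain_key = shared_key
    for _ in range(wanted_index - shared_index):
        chain_key = advance(chain_key)
    return message_key(chain_key)


# Hypothetical scenario: the sender starts a session at index 0.
sender_key_0 = hashlib.sha256(b"toy session secret").digest()
# Alice gets the key share straight away; Bob's share only covers index 2 onwards.
sender_key_2 = advance(advance(sender_key_0))

alice_key_3 = key_for_index(sender_key_0, 0, 3)
bob_key_3 = key_for_index(sender_key_2, 2, 3)
bob_key_1 = key_for_index(sender_key_2, 2, 1)
```

In this toy model Alice and Bob agree on the key for message 3, but Bob gets `None` for message 1. Real clients have to get these key shares to every device in the room, which is where timing and ordering problems can creep in.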

MLS is based on work done at Wire, to support secure and efficient E2EE of large group chats. It takes a significantly different approach than the Double Ratchet, and can support E2EE in much larger groups, sending larger message bodies. So my understanding is that the MLS approach will replace the Double Ratchet for E2EE of Matrix group chats.
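A rough way to see the scaling difference is to count how many separate encryptions a member has to do when they refresh their keys. The sketch below is back-of-a-napkin arithmetic, not a model of either protocol. It assumes a pairwise scheme where a fresh key is encrypted to each other member individually, versus a balanced binary tree like the ratchet tree MLS uses, where the best case is roughly one encryption per level of the tree. Both function names are hypothetical, and real MLS costs can drift back towards linear when parts of the tree are blank.

```python
def pairwise_fanout_cost(members: int) -> int:
    """Encryptions needed to send a fresh key to every other member one by one."""
    return max(members - 1, 0)


def tree_update_cost(members: int) -> int:
    """Best-case encryptions for one update in a balanced binary tree of members.

    This is ceil(log2(members)), computed with integer arithmetic
    to avoid floating-point rounding surprises.
    """
    if members <= 1:
        return 0
    return (members - 1).bit_length()


# Compare the two assumptions across a few group sizes.
for size in (2, 10, 1000, 100_000):
    print(size, pairwise_fanout_cost(size), tree_update_cost(size))
```

For a two-person chat the two numbers are the same, which fits with the idea that the gap only really matters as groups get big.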

MLS may also replace the Double Ratchet for 1:1 chats. I presume it’s easier to maintain one encryption protocol in libolm, than two totally different ones. But I really have no idea about this. There may be reasons why the Double Ratchet is so much more efficient than MLS for 1:1 that it’s worth the maintenance burden of using both. You’d have to ask an implementer ; )