"Mastodon" cancelling streaming API access

strypey · September 28, 2024, 6:34am

This question is completely OT for the discussion “an FEP for NodeInfo yet?”, so I’ll ask it in a new thread:

Can you clarify what you mean by this? Do you mean it was disabled in a Mastodon version release (which one?), or in a particular Mastodon instance(s)?

Zero · September 28, 2024, 9:24am

Anonymous streaming API access was disabled in Mastodon proper because a few services started using the anonymous streaming API to scrape instances for “trending” information and other search capabilities. The change was labeled in a commit, but I don’t think the release mentioned it (I couldn’t find which release; I am not good with GitHub). Many people in the pull request were upset about the change, but the position of the developer was that there was no legitimate use for anonymous streaming access (what’s “open web”?). I admit that it was being used for things a bunch of people really disliked, though.

My interpretation of events was that this significant breaking change was made without telegraphing intent beforehand or giving a deadline, but that might be an unreasonable burden on projects; opinions vary.

And again, just to be clear, I was not working on a scraper. I was doing Fedi Game Jam and developing a 2D “MMO” where you logged in to my game using OAuth against your home server, and it would procedurally generate a shared world for people on your server based on what was happening in real time on your Mastodon/Pleroma/Akkoma instance. It was implemented using the anonymous streaming API.

This happened around August 2023 I believe.

strypey · September 28, 2024, 9:33am

I infer from the rest of your comment that you meant a Mastodon software version release. But FWIW, saying …

… clarifies nothing for me. As well as Mastodon-the-software, “Mastodon proper” could refer to mastodon.social, or Mastodon-the-Social-Network, a fantasy creature that a lot of people are convinced is a real animal. For example, see the Wikipedia article for Mastodon (Social Network) and its talks page.

Zero · September 28, 2024, 9:35am

I’ll do better next time, “Mastodon-the-software” would have worked.

I’m also very opinionated on things like this so I apologize if my initial statement was misleading or inflammatory.

strypey · September 28, 2024, 9:47am

No worries, ambiguous language is a pet bugbear of mine, as you can see right back in 2017, in the comments about “branding” (a reputation-laundering term I wouldn’t use nowadays) at the end of my Brief History of the fediverse piece.

Not at all, and so am I

Again, seeking clarification. To a lay reader (i.e. me), this reads like it’s saying that servers running Mastodon are no longer able to display posts to anonymous web browsers. But I just checked and that’s not the case. So what exactly is the functionality that’s been lost to the open web due to this software change in Mastodon?

Zero · September 28, 2024, 10:09am

There is an API endpoint in Mastodon/Pleroma that gives you a WebSocket and then pushes new posts to you after you subscribe to a timeline through it. This is an alternative to using the HTTP web API, which requires repeated polling and paging to iterate over new posts. The streaming API is very fast (almost instant) and doesn’t require the polling, which is subject to rate-limiting in some cases.

Prior to the change, you could connect to most Mastodon instances anonymously, and subscribe to the public timeline and it would stream new data to you. After the change, it only worked if you had a user account and passed an authentication token.
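
To make the before/after concrete, here is a minimal sketch of what a streaming consumer has to build and decode. It assumes the documented `/api/v1/streaming` WebSocket endpoint with `stream` and `access_token` query parameters, and frames shaped like `{"event": ..., "payload": ...}` where the payload of an `update` is itself a JSON-encoded string. The host `example.social`, the token value, and both helper names are hypothetical, and no network connection is made here.

```python
import json
from urllib.parse import urlencode


def streaming_url(host, stream="public", access_token=None):
    """Build a WebSocket URL for a Mastodon-style streaming endpoint.

    Assumption: the token is passed as a query parameter. Before the
    change described above, the token could simply be omitted.
    """
    params = {"stream": stream}
    if access_token is not None:
        params["access_token"] = access_token
    return f"wss://{host}/api/v1/streaming?{urlencode(params)}"


def parse_event(raw):
    """Decode one text frame into (event_name, payload).

    Only "update" payloads are decoded a second time here; other
    events (for example "delete") are returned with the payload as-is.
    """
    msg = json.loads(raw)
    payload = msg.get("payload")
    if msg.get("event") == "update" and isinstance(payload, str):
        payload = json.loads(payload)
    return msg.get("event"), payload


# Anonymous (pre-change) versus authenticated (post-change) URLs.
print(streaming_url("example.social"))
print(streaming_url("example.social", access_token="USER_TOKEN"))
```

A real client would hand the URL to a WebSocket library and call `parse_event` on every text frame it receives.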

During my development it stopped working for Mastodon and so I quit working on my project because that would have knocked out like 80% of my prior intended audience.

I was probably being too ambitious but my perception prior to this was you could do some really cool social things with these APIs. There are possible alternative methods that could be explored but they would require individual servers granting access to application developers, which is burdensome for the developer as there are thousands of servers and hundreds of developers in an n:m relationship. But I understand the concern. Sometimes cool ideas just aren’t possible because the developers had to make a tradeoff.

strypey · September 28, 2024, 10:18am

Ah yes, I’m with you now. This was a major battle in the SearchWars. A subject on which a huge amount has been written (including by me), so I won’t bore you with a monologue. Suffice to say, IMHO the fundamental problem underlying this needs to be solved by “#84 - Mapping all possible posting scopes” (fediverse/fediverse-ideas on Codeberg.org), and by making it trivial for people to choose when to post to the open web, and when not to.

EDIT: Some of the earliest privacy scandals at FarceBook involved posting scopes being reset or flipped, so that things posted privately became publicly visible without the consent of the people who’d posted them. Unsurprisingly, people are not fond of this, and will lash out with great fury if they suspect a repeat.

nightpool · September 28, 2024, 3:28pm

In my time running a Mastodon instance, anonymous streaming API access was a huge vector for brigading, harassment and spam. Many instances already had it disabled; it was really only mastodon.social and a couple of other very big instances that had it enabled.

You could see in real time as a post made it to mastodon.social’s public timeline with a certain hashtag and then immediately the trolls would start flooding in from channer instances. There were a lot of IRC and Telegram bots that would watch for certain keywords and tell attackers about new posts on Mastodon.social matching those keywords.

I don’t think it’s “anti-open web” to worry about vulnerabilities like that. From the beginning Mastodon has made it clear that a “flat network” where everything is visible to everybody else is NOT one of their goals as a project. Real-time global search and indexing is a tool for harassment, pile-ons, and spam. Users find other users through their existing social networks, not through a “flattened context”. ActivityPub and Mastodon are, in many ways, the anti-context collapse social networks.

nightpool · September 28, 2024, 3:28pm

So why did you need anonymous access, if the user is logged in through OAuth? Your story doesn’t add up.

Zero · September 28, 2024, 4:05pm

My game server made the connection, not the client, because the world was shared between everyone on your server connecting to me. I definitely can’t trust the client to tell me what the server’s saying, and requesting another OAuth token on the client side and then exfiltrating it to my server would be very bad.

I tried to make clear that I understand the important tradeoffs behind why this decision was made even if it affected me adversely; I apologize for being unclear. But of course you can still get the public timeline data through the HTTP API, just not in real time.
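
For comparison, a rough sketch of the polling approach mentioned above. It assumes the `GET /api/v1/timelines/public` endpoint with `limit` and `since_id` query parameters and a newest-first response; some servers may require authentication for this endpoint too. The host and helper names are hypothetical, and the HTTP request itself is left out.

```python
from urllib.parse import urlencode


def public_timeline_url(host, since_id=None, limit=20):
    """Build the URL for one polling request against the public timeline."""
    params = {"limit": limit}
    if since_id is not None:
        # Ask only for statuses newer than the last one already seen.
        params["since_id"] = since_id
    return f"https://{host}/api/v1/timelines/public?{urlencode(params)}"


def next_since_id(statuses, previous=None):
    """Pick the marker for the next poll.

    Assumption: the response is ordered newest-first, so the first item
    carries the most recent ID. An empty page keeps the old marker.
    """
    return statuses[0]["id"] if statuses else previous


# One simulated polling round with a hand-written response page.
page = [{"id": "112"}, {"id": "111"}]
marker = next_since_id(page, previous="110")
print(public_timeline_url("example.social", since_id=marker))
```

Each round trip is subject to rate limits and adds latency, which is the tradeoff against the streaming endpoint.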

thisismissem · September 28, 2024, 4:47pm

For what it’s worth, the change was made by Claire in “Disable anonymous access to the streaming API” by ClearlyClaire (Pull Request #23989, mastodon/mastodon on GitHub), and I am fully supportive of it (I help maintain the streaming server in Mastodon).

The streaming API isn’t designed for use-cases that involve slurping up a huge amount of data without consent; it’s designed for clients to get posts & notifications slightly more efficiently than polling.

However, over time it began being used for scraping and other unintended purposes, hence closing off access to be in line with all other Mastodon APIs that require a user’s OAuth token.

Streaming also isn’t good for cases where reliably processing activities is desired, since the WebSocket-based protocol is lossy (you only get messages whilst continuously connected).

thisismissem · September 28, 2024, 4:51pm

Oh, and I did investigate, and have working code for, potentially enabling access via OAuth Client Credentials, but given the existing issues around moderating and managing OAuth applications accessing a Mastodon server, this would have been a security, privacy and safety nightmare, so I stopped work on it, and recommended that we don’t proceed down that path.

trwnh · September 28, 2024, 5:00pm

nitpick: i don’t think the mastodon streaming API is in any way “web”, so this is probably more “open data” than “open web”

nightpool · September 28, 2024, 5:27pm

Huh? What does this mean? You’re saying that… authorizing the user to see their messages would be very bad? I don’t understand your reasoning. This is how OAuth is supposed to work.

thisismissem · September 28, 2024, 5:51pm

In general, for now, transferring an access token from one “client” to another (the server) would potentially cause unexpected security issues; employing DPoP would arguably improve the situation.

There is also some need to have a new scope for streaming access, along with a scope for offline_access (which is currently the default today due to access tokens never expiring).

nightpool · September 28, 2024, 9:50pm

What does this even mean? What kind of security issues could result from sending your access token over HTTPS to a server you control? How would DPoP improve this situation except meaning that you’d need to send your private key as well? There’s no difference between sending your access token to the server over HTTPS and sending your access token + a client’s private key.

Or just doing the entire OAuth exchange on the server side in the first place! There’s no reason that unauthenticated access is required here except sloppy coding, sorry.

thisismissem · September 29, 2024, 12:09am

Should probably clarify that DPoP would help prevent such token transfers without the client really opting into it. But yeah, if you need the access token server side, it’d be best to only have it server-side.
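
As an illustration of keeping the token server-side only, here is a minimal sketch of the two requests a game server would prepare for a standard OAuth authorization-code flow. It assumes the usual `/oauth/authorize` and `/oauth/token` endpoints; the host, client credentials, redirect URI and helper names are all hypothetical placeholders, and nothing is actually sent.

```python
from urllib.parse import urlencode


def authorize_url(host, client_id, redirect_uri, scope="read"):
    """URL the user's browser is redirected to in order to grant access."""
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
    }
    return f"https://{host}/oauth/authorize?{urlencode(params)}"


def token_request(host, client_id, client_secret, redirect_uri, code):
    """URL and form fields the *server* would POST to redeem the code.

    The client secret and the resulting access token never reach the
    browser; only the short-lived authorization code passes through it.
    """
    form = {
        "grant_type": "authorization_code",
        "code": code,
        "client_id": client_id,
        "client_secret": client_secret,
        "redirect_uri": redirect_uri,
    }
    return f"https://{host}/oauth/token", form


print(authorize_url("example.social", "abc", "https://game.example/callback"))
```

The server would then use the returned token for its own streaming or HTTP requests on that user's behalf.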

Zero · October 1, 2024, 7:26am

Give me a programmatic way to introduce my application to a Mastodon instance and request a streaming OAuth token for it, so I don’t have to beg for one from the server admin of any of thousands of servers any time someone wants to use my application. Similar to how someone can request a user account if registration is gated.

thisismissem · October 2, 2024, 9:39pm

As mentioned, Mastodon doesn’t offer streaming access for applications, only for the clients that end-users use. Building off the Mastodon Streaming API isn’t the way to build what you want; you’d arguably be better off setting up an ActivityPub actor and following the users of your application in order to receive their activities. This guarantees to some degree that you’ll receive all their activities (since delivery is retried for ActivityPub) and has better consent management for end-users.
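
A minimal sketch of the first step of that approach: the `Follow` activity an application's actor would send to a user's inbox. The shape follows the ActivityStreams vocabulary; every URL and the helper name are hypothetical, and delivery (a signed HTTP POST to the target's inbox) plus handling of the resulting `Accept` are not shown.

```python
import json

AS_CONTEXT = "https://www.w3.org/ns/activitystreams"


def follow_activity(actor_id, target_id, activity_id):
    """Build a Follow activity from the application's actor to a user."""
    return {
        "@context": AS_CONTEXT,
        "id": activity_id,      # unique URL minted by the application
        "type": "Follow",
        "actor": actor_id,      # the application's own actor
        "object": target_id,    # the user whose activities are wanted
    }


activity = follow_activity(
    "https://game.example/actor",
    "https://example.social/users/alice",
    "https://game.example/activities/1",
)
print(json.dumps(activity, indent=2))
```

The user (or their server) can accept or reject the follow, which is where the consent management mentioned above comes in.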