Proposal: The browser as a client with a new web standard - cross-instance interactions on the web

The problem

One of the problems of the Fediverse is the user experience of cross-instance interactions on the web (e.g. you watched a video on PeerTube and would like to comment on it with one of your Mastodon accounts). This is already detailed in the post “a custom URL scheme handler as a streamlined UX for cross-instance interactions on the web”.

The main scenarios I’m aiming to address here are:

  1. Follow/subscribe to an account on a different instance
  2. Reply to a post on a different instance
  3. Favorite/like (Like) or boost/repost (Announce) a post on a different instance

This is especially an issue for new users, who are not yet used to this hurdle. I’ll quote an anonymous peer:

I am flabbergasted that I have to copy/paste the URL over to the mastodon instance that I am on to “boost”/retweet it. That doesn’t seem to scale well.

The solution proposal

Inspired by WebLN, I imagined we could have a Web Standard that turns the browser into an ActivityPub client and exposes this to web applications through a JavaScript API.

The browser would abstract the authentication and also handle permissions (e.g. when a web app wants to behave as a client using this JavaScript API, the browser has to prompt the user for consent).

This could enable a seamless experience once the user has configured their account(s) in the browser.

A simplified happy path would look like this:

  1. The user goes to the browser configuration and sets up their ActivityPub accounts
  2. The user visits a website where ActivityPub Objects are shown (e.g. Statuses on a Mastodon instance, a Video on a PeerTube instance) - this is not an instance where the user has an account
  3. The user interacts with the object (e.g. Likes the Mastodon status) directly from the website they are visiting.

Under the hood, the front-end application of the website visited by the user consumes a Web API to post an Activity to the user’s outbox on the instance where the user has an account.

Here is a sketch of what the code in the front-end app could look like:

// This code runs when the user clicks the Like button of a Note on a website where they don't have an account.
// WebAP is a global object exposed by the browser (part of this proposal, not an existing API).
// Each client is a configured account.
const activityPubClients = await WebAP.getClients();

// Ideally you would have some logic to prompt the user to select the proper client (account) when there is more than one, or to tell the user to configure one if there are none.
const client = activityPubClients[0];

await client.postToOutbox({
  "@context": "https://www.w3.org/ns/activitystreams",
  "type": "Like",
  "object": objectId, // id of the note to be liked
  "to": "https://www.w3.org/ns/activitystreams#Public"
});
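The account-selection logic mentioned in the comment above could be sketched roughly as follows. This is only an illustration: `selectClient`, `buildLike` and the `choose` callback are hypothetical names, and `WebAP` itself is the proposed API, not something browsers ship today.

```javascript
// Hypothetical helper: decide which configured account performs the action.
// `clients` is whatever the proposed WebAP.getClients() resolved to.
// `choose` is a callback supplied by the web app (e.g. an account picker).
function selectClient(clients, choose) {
  if (!Array.isArray(clients) || clients.length === 0) {
    return null; // nothing configured: ask the user to set up an account
  }
  if (clients.length === 1) {
    return clients[0];
  }
  return choose(clients) ?? null; // several accounts: let the user pick
}

// Builds the same Like activity as the sketch above.
function buildLike(objectId) {
  return {
    "@context": "https://www.w3.org/ns/activitystreams",
    type: "Like",
    object: objectId,
    to: "https://www.w3.org/ns/activitystreams#Public",
  };
}
```

In the click handler this would become something like `const client = selectClient(await WebAP.getClients(), showAccountPicker)`, followed by `client.postToOutbox(buildLike(noteId))` when `client` is not `null`. `showAccountPicker` is again a made-up name.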

How could this get started?

Assuming this is a good idea, we would need some level of browser support to get this started. I cannot imagine big-name browsers introducing something like this any time soon, so we would need an alternative.

Again I draw inspiration from WebLN. There is a browser extension called Joule that somehow exposes the API to front-end web applications. I don’t know exactly how it is done, but in theory the same approach could be used to get started and mitigate the browser adoption problem.
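One common pattern for extensions that expose a page-level API (not verified against Joule's source) is to inject a small script that defines the global object and relays each call via `postMessage` to the extension's content script, which forwards it to the extension itself. Below is a minimal sketch of the page-side half only. Every name and message shape here is invented for illustration.

```javascript
// Page-side half of a hypothetical bridge. In an extension, this would run in
// an injected script as `window.WebAP = createWebAPProvider(window)`, and a
// content script would answer the messages. `win` is passed in so the logic
// can be exercised outside a browser.
function createWebAPProvider(win) {
  let nextId = 0;
  const pending = new Map();

  // Responses are messages on the same window, tagged with the request id.
  win.addEventListener("message", (event) => {
    if (event.source !== win) return; // ignore messages from other frames
    const msg = event.data;
    if (!msg || msg.channel !== "webap-extension") return;
    const entry = pending.get(msg.id);
    if (!entry) return;
    pending.delete(msg.id);
    if (msg.error) entry.reject(new Error(msg.error));
    else entry.resolve(msg.result);
  });

  function request(method, params) {
    return new Promise((resolve, reject) => {
      const id = nextId++;
      pending.set(id, { resolve, reject });
      win.postMessage({ channel: "webap-page", id, method, params }, win.location.origin);
    });
  }

  return {
    getClients: () => request("getClients"),
    postToOutbox: (clientId, activity) => request("postToOutbox", { clientId, activity }),
  };
}
```

Keeping the credentials on the extension side of this bridge means the visited website only ever sees the results of its calls, never the tokens.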

From there, we would need some early adopters (imagine, for instance, if Pleroma and Mastodon considered consuming this API from their front ends to enable seamless cross-instance interactions).

Since Mastodon has a big user base and does not support C2S, I wouldn’t rule out a provisional translation layer from AP C2S to the Mastodon API in the extension, to maximize adoption.
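To give an idea of what such a translation layer might involve, here is a rough sketch that maps a few activity types to Mastodon REST API calls. The endpoint paths are the ones documented for the Mastodon API. The function names and the returned shape are hypothetical, and replies (Create) are left out.

```javascript
// ActivityPub ids are URLs, while the Mastodon API works with ids local to the
// user's own server. The remote object therefore has to be resolved first,
// which Mastodon's search endpoint can do for authenticated users.
function resolvePath(objectUrl) {
  return `/api/v2/search?resolve=true&q=${encodeURIComponent(objectUrl)}`;
}

// Sketch of a C2S -> Mastodon translation for activities that need no body.
// `localId` is the id found through the lookup above.
function toMastodonRequest(activity, localId) {
  switch (activity.type) {
    case "Like":
      return { method: "POST", path: `/api/v1/statuses/${localId}/favourite` };
    case "Announce":
      return { method: "POST", path: `/api/v1/statuses/${localId}/reblog` };
    case "Follow":
      return { method: "POST", path: `/api/v1/accounts/${localId}/follow` };
    default:
      throw new Error(`No Mastodon equivalent sketched for ${activity.type}`);
  }
}
```

Each activity costs an extra round trip for the lookup, which is one reason to treat this layer as provisional.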

Closing remarks

This is just a rough idea, with many rough edges to be polished. Executing it in a future-proof way would be challenging. It should also be said that it inherits all the existing challenges of implementing an AP client.

I’m eager to hear the opinions of the community about this crazy idea.

If I have enough time and energy, I might do some prototyping.

4 Likes

Interesting use case, but couldn’t this just be a browser plugin? ActivityPub already allows different websites to interact with each other. If a user isn’t able to comment on a PeerTube video from their Mastodon account, that sounds like PeerTube just needs to add the functionality.

1 Like

Good idea.
By the way, an extension that facilitates remote interactions on Mastodon instances already exists (but it’s not a client): FediAct

4 Likes

If a user isn’t able to comment on a PeerTube video from their Mastodon account, that sounds like PeerTube just needs to add the functionality.

This is already possible, but the experience isn’t great: it involves either copying and pasting a URL or some cross-site navigation. The idea here is to enable these interactions without such navigation.

but couldn’t this just be a browser plugin?

Like, for instance, FediAct, as mentioned by @silverpill? It could, but then the plug-in would have to “hack” every Fediverse front end to add this functionality, and that is not really optimal.

1 Like

Thanks! I was not aware of this extension, I’ll check how that works.

You’d need some sort of cross-navigation for this because you’d have to log in/authenticate the user in some secure way. For instance, in order for PeerTube to allow a Mastodon user to post comments, the user would need to log in securely on the Mastodon site first. So I don’t think you can get away without having to cross-navigate the user. You can try to do it via embeds or some hacky way, but that would just be hiding the login implementation from the user, which most users probably wouldn’t trust.

So I don’t think you can get away without having to cross-navigate the user. You can try to do it via embeds or some hacky way, but that would just be hiding the login implementation from the user, which most users probably wouldn’t trust.

The essence of this idea is that the users would configure their authentication towards the instances where they have accounts directly in the browser (or browser extension). Then the browser would offer an API through which websites can make AP C2S requests reusing this central authentication. Of course, that puts on the browser the responsibility of preventing misuse and abuse (e.g. prompting the users to check if the website is allowed to perform an action on their behalf).
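The consent bookkeeping on the browser (or extension) side could be as simple as the sketch below. `ConsentStore` is a hypothetical name. The three states borrow the vocabulary of the existing Permissions API ("granted", "denied", "prompt"), but nothing here is part of any current standard.

```javascript
// Hypothetical per-origin consent store kept by the browser or extension.
class ConsentStore {
  constructor() {
    this.decisions = new Map();
  }

  key(origin, activityType) {
    return `${origin} ${activityType}`;
  }

  // What should happen when `origin` asks to post an activity of this type?
  // "prompt" means the user has not decided yet and must be asked.
  query(origin, activityType) {
    return this.decisions.get(this.key(origin, activityType)) ?? "prompt";
  }

  // Remember the user's answer to the consent prompt.
  record(origin, activityType, allowed) {
    this.decisions.set(this.key(origin, activityType), allowed ? "granted" : "denied");
  }
}
```

Keying the decision on both the origin and the activity type means a site that was allowed to send a Like still triggers a prompt before it can send a Follow.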

This reasoning is spot on. However, if you’re watching how browser vendors have come to make the APIs available in the browser today, I think you’d quickly understand why having browser vendors own such an API isn’t such a great idea.

To implement APIs in the browser, vendors have to sort of agree. But if they don’t agree, browsers will just force their hand at what they want. Like Google’s Chrome forcing their own custom-version of APIs that haven’t been agreed to by other vendors. Or stubborn browser vendors like Safari that have literally decided not to implement APIs (without reason) even though other browsers have agreed. And these larger vendors like Chrome and Safari always get their way because they have the most market share. Browsers like Firefox are trying to play nice, but they still get overruled by the larger vendors like Chrome.

I like the idea, but I’d try to get it implemented without it being a browser API. Standardizing this in browsers is essentially centralizing it to be controlled by these large, predatory browser vendors and potentially be manipulated, abused and/or worse.

3 Likes

I think this tackles a very real problem but one which I think is probably better tackled as an identity issue specifically.

Like I’m using mastodon and lemmy mostly through their native apps. I like apps to be accessible through the web and I use the web for the desktop but native apps seem to be a nicer ux for me and they also have this problem. Idk if this gets around that issue. There seems to be a deeper issue with how identity is dealt with.

I know I’ve brought up the Solid protocol before and other people have discussed it on this forum. I don’t know if that solves this issue. But it at least seems adjacent to what I think the core issue is. We need an SSO identity solution for the fediverse so that users can go from server to server easily. The experience you describe does seem good but it’s limited to just a browser and it requires a browser to support it. Is there a solution that can be addressed at a broader level?

2 Likes

This sounds very similar to remote follow which is already supported by some implementations.

1 Like

You bring some interesting perspectives about the downsides of relying on Web APIs.

I like the idea, but I’d try to get it implemented without it being a browser API.

If the instance where I have my account allows for Cross-Origin Resource Sharing (CORS), any other instance’s front end could be a client, but it would need to handle the authentication flow.
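As a rough sketch of that CORS route, the visited front end would build a request like the one below and pass it to `fetch`. It assumes that the home instance implements ActivityPub C2S, allows the calling origin via CORS, and accepts an OAuth 2.0 bearer token (the spec does not mandate one authentication mechanism). `buildOutboxRequest` is a made-up helper name.

```javascript
// Builds the arguments for fetch() to deliver an activity to the user's outbox.
// Obtaining `accessToken` is the authentication flow the front end would have
// to handle itself; it is out of scope for this sketch.
function buildOutboxRequest(outboxUrl, accessToken, activity) {
  return {
    url: outboxUrl,
    init: {
      method: "POST",
      mode: "cors",
      headers: {
        // Media type required by ActivityPub for client-to-server POSTs.
        "Content-Type": 'application/ld+json; profile="https://www.w3.org/ns/activitystreams"',
        Authorization: `Bearer ${accessToken}`,
      },
      body: JSON.stringify(activity),
    },
  };
}
```

Usage would be `const { url, init } = buildOutboxRequest(outbox, token, like)` followed by `await fetch(url, init)`, where a 201 Created response means the activity was accepted. Because of the `Authorization` header and the content type, the browser sends a preflight request first, so the home instance has to answer that as well.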

About native apps, the problem I describe here will usually start at the browser, as it assumes the user reached a post (i.e. an ActivityPub Object, likely a Note) in another instance on the web. If you reach this content from your native app, then the problem described here does not exist.

In this case, if you find content on the web and want to interact with it on a native app, then a custom URL scheme handler as a streamlined UX for cross-instance interactions on the web seems more suitable.

Ignoring for a moment the criticism of using Web APIs, I think both solutions could coexist. If you use a native app, you would not have configured the browser as a client, or you would have explicitly configured that you do NOT want to use it as a client. In this case, if the web app can identify this intent using the Web API, it could fall back to the custom URL scheme, which would then trigger an interaction through your native app.
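A sketch of how a web app could decide between the two paths is shown below. The `web+activitypub:` scheme is only a placeholder (the real one would come from the URL scheme proposal), and `chooseInteraction` is a hypothetical name.

```javascript
// `clients` is the result of the proposed WebAP.getClients(), or undefined
// when the page found no WebAP object at all. An empty list is treated as
// "the user does not want the browser to act as a client".
function chooseInteraction(clients, objectUrl) {
  if (Array.isArray(clients) && clients.length > 0) {
    return { via: "webap", client: clients[0] };
  }
  // Fall back to a link that a registered native app (or web app) can handle.
  return {
    via: "url-scheme",
    href: `web+activitypub:${encodeURIComponent(objectUrl)}`,
  };
}
```

The web app would call this with `typeof WebAP === "undefined" ? undefined : await WebAP.getClients()` and then either post through the returned client or navigate to `href`.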

I can’t comment at the moment on Solid or SSO alternatives, as I don’t have sufficient knowledge or experience with this topic.

I started a PoC, all open source here: Web Activity Pub (WebAP) API · GitHub
As an attempt to gather interest and drive adoption, the first step will be to create a browser extension (webap-browser-extension) which will expose an experimental implementation of the WebAP API to web applications.
One of the many challenges is the lack of support for C2S on Mastodon, so I’ll be using Pleroma for testing.
My guinea pig as a consuming web application is likely going to be https://podcastindex.org/, as they are “running with scissors” anyway. I’d have to implement support for even showing episode comments, but that could be fun.
There is one blocker for that though: the necessary data is not yet exposed by the API used by the website, and that one is, to the best of my knowledge, not open source yet.
Let’s see where it goes :grinning:.

3 Likes

Hubzilla uses OpenWebAuth to handle this issue. You can use your handle to remotely authenticate on any other Hubzilla instance, and can read, follow, and reply directly on their website. You don’t have to go back to your own instance to interact with them… unless you want to.

1 Like

All right, so the first working version of the browser extension is out, and I already posted my first message using a Pleroma account!
I have also deleted that message, because it had some typos :smiley: , but here is the second ever message sent with it: Pleroma
Here is the browser extension code and instructions (written in a rush) on how to try it out: GitHub - webap-api/webap-browser-extension: WebAP Browser Extension

2 Likes