FEP-8967: Generating link previews for attached links

Summary

A common feature in social applications is to show users a rich preview of a link included in the content of a message or post, before the user clicks the link. Currently, applications like Mastodon generate link previews for the first link found in the content, without considering the publisher’s possible intent. This FEP allows publishers to explicitly signal which links are intended for special processing, using the existing attachment model. Optionally, publishers can include their own link preview information so that trusting consumers can skip generating their own previews.

Link in attachment

The attachment model can be used to indicate that something performs a particular function as an extension of the main content. AS2-VOCAB defines attachment as “semantically similar to attachments in email” for something that “potentially requires special handling”.

Publishers MAY include a Link in attachment to signal that this link should be processed semantically as an attached link preview card, appearing similarly to attached images, attached videos, or attached audio. The link attachment MUST have an href, indicating that it is a Link.

{
	"@context": "https://www.w3.org/ns/activitystreams",
	"attachment": {
		"href": "https://foo.example/"
	}
}

Upon encountering an attachment that is a Link, consumers SHOULD show this link as “attached” to the object. At minimum, the href can be rendered directly, perhaps alongside an icon representing a link.
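
As a rough illustration of the consumer side, the following Python sketch collects the attachments that carry an href. It assumes the object has already been parsed from JSON into a dict, and it treats any attachment with a string href as a Link. The helper names as_list and attached_links are hypothetical and not part of this FEP.

```python
from typing import Any


def as_list(value: Any) -> list:
    """Normalize a property that may be absent, a single value, or an array."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def attached_links(obj: dict) -> list[dict]:
    """Return the attachments that look like a Link.

    Assumption for this sketch: a dict with a string href counts as a Link.
    """
    return [
        item
        for item in as_list(obj.get("attachment"))
        if isinstance(item, dict) and isinstance(item.get("href"), str)
    ]


post = {
    "@context": "https://www.w3.org/ns/activitystreams",
    "attachment": {"href": "https://foo.example/"},
}

# At minimum, a consumer can render each href directly.
hrefs = [link["href"] for link in attached_links(post)]
```

Normalizing attachment to a list first means the same code handles a single attached Link, several attachments of mixed types, or no attachment at all.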

image

Link preview

Publishers MAY include link preview information using the preview property:

{
	"@context": [
		"https://www.w3.org/ns/activitystreams",
		{
			"sensitive": "as:sensitive"
		}
	],
	"attachment": {
		"href": "https://foo.example/",
		"preview": {
			"type": "Article",
			"name": "Example Essay",
			"summary": "In which some information is provided...",
			"image": {
				"sensitive": true,
				"url": {
					"href": "https://cover-image.example/file.jpg",
					"mediaType": "image/jpeg",
					"width": 1200,
					"height": 630
				}
			},
			"attributedTo": {
				"name": "The Author",
				"icon": {
					"url": {
						"href": "https://avatar.example/file.png",
						"mediaType": "image/png",
						"width": 48,
						"height": 48
					}
				},
				"url": {
					"href": "https://author.example/"
				}
			}
		}
	}
}

The exact form of the preview and its processing model is out of scope of this FEP (as each consumer is free to render information according to their own design language and understanding), but some properties may be useful as equivalents of OpenGraph properties which are widely used for link previews:

  • name – similar to og:title, indicates the preview card’s primary text.
  • summary – similar to og:description, indicates the preview card’s secondary text.
  • image – similar to og:image, indicates the preview card’s image.
  • type – similar to og:type, indicates the type of the target resource. This can be used to select an appropriate icon representing the resource.
  • attributedTo – loosely similar to article:author, music:musician, music:creator, book:author, and other such properties, indicates the preview card’s attribution.
    • name – the name that should be attributed
    • icon – the icon that should be displayed alongside the attributed name
    • url – the link that should wrap the attribution
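
The property mapping above could be sketched in Python as follows. The card field names (title, description, and so on) and the helpers first, as_dict, href_of, and preview_card are illustrative assumptions, not part of this FEP. The sketch also assumes the nesting used in the example, where image, icon, and url are embedded objects.

```python
from typing import Any, Optional


def first(value: Any) -> Any:
    """Return the first element of an array-valued property, or the value itself."""
    if isinstance(value, list):
        return value[0] if value else None
    return value


def as_dict(value: Any) -> dict:
    """Return an embedded object as a dict, or {} for anything else (e.g. a bare URI)."""
    value = first(value)
    return value if isinstance(value, dict) else {}


def href_of(value: Any) -> Optional[str]:
    """Resolve a url value (a string, or a Link dict with href) to a string."""
    value = first(value)
    if isinstance(value, str):
        return value
    if isinstance(value, dict):
        return value.get("href")
    return None


def preview_card(link: dict) -> dict:
    """Flatten a Link's preview into hypothetical, OpenGraph-like card fields."""
    preview = as_dict(link.get("preview"))
    image = as_dict(preview.get("image"))
    author = as_dict(preview.get("attributedTo"))
    icon = as_dict(author.get("icon"))
    return {
        "href": link.get("href"),
        "title": preview.get("name"),            # like og:title
        "description": preview.get("summary"),   # like og:description
        "kind": preview.get("type"),             # like og:type
        "image_url": href_of(image.get("url")),  # like og:image
        "image_sensitive": image.get("sensitive") is True,
        "author_name": author.get("name"),
        "author_icon_url": href_of(icon.get("url")),
        "author_url": href_of(author.get("url")),
    }


link = {
    "href": "https://foo.example/",
    "preview": {
        "type": "Article",
        "name": "Example Essay",
        "summary": "In which some information is provided...",
        "image": {
            "sensitive": True,
            "url": {"href": "https://cover-image.example/file.jpg"},
        },
        "attributedTo": {
            "name": "The Author",
            "icon": {"url": {"href": "https://avatar.example/file.png"}},
            "url": {"href": "https://author.example/"},
        },
    },
}
card = preview_card(link)
```

Every field is optional in this sketch, so a Link with no preview still yields a card containing only its href.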

Consumers that do not trust the publisher’s provided preview information MAY generate their own preview through whichever means they find appropriate, for example by fetching the link target and extracting OpenGraph information or HTML tags such as <title> or <meta>.
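
One possible shape for such a fallback, using only Python's standard html.parser module, is sketched below. Fetching the link target is deliberately left out, and the PreviewParser class, the extract_preview function, and the returned field names are hypothetical. A real consumer would also need to deal with character encodings, redirects, and response size limits.

```python
from html.parser import HTMLParser


class PreviewParser(HTMLParser):
    """Collect og:* <meta> properties and the <title> text from an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.og: dict[str, str] = {}
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            prop = attr.get("property") or ""
            content = attr.get("content")
            if prop.startswith("og:") and content is not None:
                # Keep only the first occurrence of each property.
                self.og.setdefault(prop, content)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_preview(html: str) -> dict:
    """Prefer OpenGraph values and fall back to <title> for the card title."""
    parser = PreviewParser()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.og.get("og:title") or parser.title.strip() or None,
        "description": parser.og.get("og:description"),
        "image": parser.og.get("og:image"),
        "type": parser.og.get("og:type"),
    }


sample = """<html><head><title>Fallback title</title>
<meta property="og:title" content="Example Essay">
<meta property="og:description" content="In which some information is provided...">
</head><body></body></html>"""
generated = extract_preview(sample)
```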

Alternative approaches

(This section is non-normative.)

Attaching objects directly

Rather than attaching a Link with an optional preview, resources can be attached directly by their id along with arbitrary optional claims.

{
	"@context": "https://www.w3.org/ns/activitystreams",
	"attachment": {
		"id": "https://foo.example/",
		"type": "Article",
		"name": "Example Essay",
		// ...
	}
}

The difference between attaching an Object versus attaching a Link is that attaching an Object creates a direct relation between the current object and the attached object, whereas attaching a Link does not create this relation between two objects. For example, consider the difference between attaching an Image versus attaching a Link that targets an image. In most cases, directly attaching the Image is probably more appropriate. However, if the publisher attaches a Link instead, it might be because the link is present in the content and the publisher wishes to indicate this for special processing; for whatever reason, the publisher does not want to directly attach the Image. This depends on the specific details of the processing model, which is out of scope for this FEP.

Implementations

  • Mastodon 4.5: Intent to publish Link in attachment, per Mastodon-PR. Publishing link preview information is not currently planned; Mastodon will instead consume attached links by their href as a signal to generate link previews with its existing OpenGraph logic. In the future, Mastodon plans to stop automatically extracting the first link in content, but for now the first link is used as a fallback in case no attachment is present.
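
The consumer behaviour described above could look roughly like the following Python sketch. It is not Mastodon's code: the function links_to_preview and the class FirstAnchor are hypothetical, and the sketch reads "no attachment is present" literally, so it falls back to the first link in content only when the object has no attachment at all.

```python
from html.parser import HTMLParser
from typing import Optional


class FirstAnchor(HTMLParser):
    """Remember the href of the first <a> element in an HTML fragment."""

    def __init__(self) -> None:
        super().__init__()
        self.href: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag == "a" and self.href is None:
            href = dict(attrs).get("href")
            if href:
                self.href = href


def links_to_preview(obj: dict) -> list[str]:
    """Pick the links to preview: attached Links first, else the first link in content."""
    attachment = obj.get("attachment")
    if attachment is None:
        items = []
    elif isinstance(attachment, list):
        items = attachment
    else:
        items = [attachment]

    if items:
        # Assumption: any attachment with a string href is treated as a Link.
        return [
            item["href"]
            for item in items
            if isinstance(item, dict) and isinstance(item.get("href"), str)
        ]

    # Fallback: no attachment at all, so scan the content markup.
    parser = FirstAnchor()
    parser.feed(obj.get("content") or "")
    parser.close()
    return [parser.href] if parser.href else []


with_attachment = {
    "content": '<p>See <a href="https://other.example/">this</a></p>',
    "attachment": {"href": "https://foo.example/"},
}
without_attachment = {
    "content": '<p><a href="https://a.example/">a</a> <a href="https://b.example/">b</a></p>',
}
chosen = links_to_preview(with_attachment)
fallback = links_to_preview(without_attachment)
```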

References

Copyright

CC0 1.0 Universal (CC0 1.0) Public Domain Dedication

To the extent possible under law, the authors of this Fediverse Enhancement Proposal have waived all copyright and related or neighboring rights to this work.

It may be wise to split the preview information out into a separate FEP, which defines how to use a preview property on Link. This FEP could just leave it at:

In the future, publishers MAY include a preview of the link as supported by the preview property on the Link type. The semantics of this are yet to be defined.

Which would reduce the likelihood of confusion or endless discussion about how a link preview actually works — yes, I know you said “The exact form of the preview and its processing model is out of scope of this FEP” but I think it’s still going to trip people up a bit.

Perhaps add in:

Consumers MAY display richer previews for the link by requesting the document at the URL via GET request and extracting metadata from the response html, e.g., via opengraph or oembed properties.

This is kind of already a given, but explicitly stating it probably doesn’t hurt.

It may be worth noting here that this version does not include user control over which Links are attached to published posts; instead, it uses the first-link-in-content logic once again.

Perhaps move the notes about OpenGraph into a separate ‘Applications’ subsection. Another application is around Schema.org and Google Knowledge Graph and knowledge panels, which allow for much richer metadata. Separating these into ‘Applications’ makes it clear that they are example uses, and avoids confusion.

PS. Here are some rich structured data examples one might bump into on the web.

I think it’s important for the FEP to talk about both “Link attachment” and “Link preview”, as the latter is the motivation and direct application of the former.

There are two normative requirements in the “Link preview” section:

  • Publishers MAY include link preview information using the preview property
  • Consumers that do not trust the publisher’s provided preview information MAY generate their own preview through whichever means they find appropriate

I feel like the mention of useful properties is not out-of-place where it currently is. I suppose I could split out the example and the useful properties into an appendix, but putting this in an entirely different FEP feels unnecessary right now, unless it really does cause the confusion you predict. If I get an indication that people are confused, or if it ends up resulting in conflicts, then I can split it out into something like “A processing model for link preview information” (tentative title, subject to change).

I don’t think this makes sense to add under “Link attachment” because this is the purpose of the immediately-following “Link preview” section. I also don’t want to go into too much detail about how to generate link previews on your own.

This is probably not relevant to the FEP, since the behavior we are interested in is what Mastodon does as a publisher or consumer, not how/if Mastodon allows user control over what it publishes. The “first link” logic only matters to non-Mastodon applications when Mastodon is a consumer, since Mastodon will currently apply this logic regardless of what other publishers generate. When Mastodon is a publisher, the user input that generated the published resource is fully opaque to other applications.

Indeed. I want to follow up on this. While @trwnh has carefully formulated this FEP to outline “link previews for attached links”, that in itself tells nothing new in comparison to what ActivityStreams already defines: both Link and Object can have a preview property, and attachment can be of type Link or Object, and thus have previews. So then I suppose where the FEP wants to add guidance, going from the title, is in “generating” previews consistently. Yet it leaves the processing model out of scope, other than suggesting OpenGraph property mapping and pointing to the example of Mastodon parsing the content property for URLs in the content type’s text.

What is a preview? Well, it can be anything, any Link or Object type or descendant thereof. It is described as “Identifies an entity that provides a preview of this object”.

Where I feel uncomfortable with so many FEPs is that the assumed domain is Microblogging, the dominant use case for ActivityPub, and now with long-form text support sprinkled in. And all these FEPs are further specializations of this assumed domain. Yet ActivityPub is “addressable actors exchanging semantically meaningful activities with an object payload”.

Maybe the FEP should be titled: “OpenGraph support for Content publishing”

Maybe? I’m not sure if schema.org and OpenGraph are equivalent in this regard. Schema.org seems to be mostly for structured data, while OpenGraph is the de facto standard for link previews, at least in social applications. As far as I’m aware, schema.org is mainly consumed by search engines like Google. Pretty much everything else uses some combination of OpenGraph and Twitter Card (particularly twitter:image:large to switch between small preview cards and big preview cards). I don’t want to cover too much here, or make too many comparisons. A brief mention of OpenGraph and also HTML title/meta is adequate imo.

The FEP is mainly about which previews – “for attached links” specifically. How applications generate their own previews is out of scope (called out twice), but the core of the proposal is that “Mastodon wants to allow publishers to specify which links get previewed instead of guessing.” They are doing so using the attachment model; similarly to attached images or attached video or attached audio, you might have attached links. For Mastodon’s use case and processing model, these will all be rendered as appended below the status content.

Yes, it can be anything. I’m not sure why this is being pointed out, though.

I don’t think this is limited to microblogging or long-form text or whatever. It makes no assumptions about what you’re doing with the object itself; it only covers the attachment of a Link with a possible preview. This attachment can go on any object, including activities. The semantics of activities vs objects is a separate issue.

No, because the FEP is not about OpenGraph. OpenGraph is just used as an example in passing. The FEP is about attaching a link so that it might be rendered as a preview card. The converse of “Generating link previews for attached links” is “Generating link previews for the first link in content”.

Do I formulate it correctly then, if I say that what this FEP adds to what ActivityStreams already describes, is hints about the UI that should be used when encountering attachments with previews on-the-wire? And perhaps - reading the last line - the (domain-specific) business logic to apply when encountering a content property?

You still use domain-specific terminology, by referring to “rich preview of a link included in the content of a message or post” and subsequently “publishers”, and the “existing attachment model” which I assume refers to parsing of content? Your UI sketch indicates a card layout typically seen in content publishing.

So the core premise is, rather than parsing content and extracting out whatever the first link is, and then using that for a rich link preview, you should instead use the attachments that are of type Link to find the links to use for rich link previews. That gives the publisher more control over link previews, whilst also simplifying things for consumers (you don’t need to parse the HTML, nor worry about all the fancy syntaxes that get applied to links).

Motivation traces back to the issue that @snarfed had with BridgyFed embedding a link preview in a bluesky post to a person’s fediverse profile accidentally, which that person subsequently objected to.

My unease remains that we are really referring to domain-specific current reality and refinements thereof (i.e. use particular business logic), OR using this FEP to say “don’t assume domain-specific business logic, and go back to ActivityStreams previews as-is” (i.e. there should not be an implied relation to content property).

If this FEP is domain-specific, then that is fine, but it should be mentioned.

No, this is about putting a Link in attachment at all, which ActivityStreams does not describe. ActivityStreams gives you the vocabulary and basic data format to be able to formulate arbitrary statements, with little-to-no guidance on structure beyond that.

Also no. The FEP has nothing to do with content and in fact separates out the concern of link extraction.

This isn’t domain-specific unless your domain is as broadly general as “describing resources”. The “existing attachment model” is the one described in AS2-Vocab’s definition of attachment, i.e. “similar to attachments in email” and “potentially requires special handling”. The summary describes a “common feature” across multiple social domains, including publishing and messaging and anywhere a link may be used. Adopting this FEP means that you avoid needing to parse content. The UI sketch is just an example and is equally applicable to messaging apps/etc.

Even if this was domain-specific, not every software has to adopt every FEP. I understand your unease, but this is a generic use of the attachment property that mirrors how you would attach other things. There is not an “implied relation to content” except that publishers can use structured data (attachment) in parallel with their markup in content, and consumers can rely on that structured data (attachment) instead of the markup in content. In other words, you can also put a Link in attachment that is not in the content, in the same way that you can include a tag that is also not in the content (as a hashtag or similar).

So then in summary “This FEP advises Objects MAY use Links with a preview, defined in attachment property, for link preview UI”?

Love this!

I haven’t seen it discussed much here, or in the FEP, but if this is enabled by default, it could largely fix the thundering herd problem when a post is federated and all receiving instances fetch the link at the same time: see “Mastodon can be used as a DDOS tool” (mastodon/mastodon issue #4486) and “Reduce load of preview fetching on third-party servers” (mastodon/mastodon issue #23662). @trwnh mentioned this in “Add support for `Link` objects in `attachment`” by Gargron (mastodon/mastodon pull request #36104) too. Thank you!

Unfortunately it will not solve this problem if the link to the image preview still leads to the origin server (server the link points to) because requesting this image is loading the server, not requesting the link itself.

Sure it will somewhat reduce the load, but not enough to deal with the problem.

To solve this particular issue we need to use some semi-centralized server which caches the link meta info and the image. This server may be implemented using a custom fasp protocol; in fact, we are running such a server internally now and plan to open it soon.

I would expect the preview image to be the URL of the one cached by the Mastodon server, not the original image URL. That’s also how AT Proto / Bluesky handles the link preview images — they’re stored as blobs in your PDS and distributed via Bluesky’s CDN.

  1. In summary “This FEP advises putting links that might need special processing in attachment so that they don’t have to be extracted out of the content.” The primary aim is for Mastodon to stop plucking the first link found in the content.

  2. In most cases the “special processing” for an attached link will be to render that link as an attachment. This is why attachment was selected as the property. (It’s much the same as an attached image, video, audio, …)

  3. The preview is optional and can be ignored, but if present and trusted you can maybe use it to render a preview card without fetching the link target yourself. (There’s a kind of “RDF in RDF” going on here that’s inherent to the Link being an indirection, but we can’t do anything about that right now.)

the link target and its image preview don’t have to be on the same server.

as for the load itself, the image probably requires more bandwidth but otherwise a request is a request. the thundering herd problem isn’t caused by images vs text, it’s caused by up to 30,000 requests being made within the same 60-second window, and mostly concurrent at the beginning of that window (because not every software adds a random delay). as an example, the default nginx configuration uses 1 worker process that allows 512 worker connections per process. common advice is to use 1 process per CPU core, and maybe raise the connection limit to 1024 per process. so for a typical low-spec web server or VPS, you’re looking at maybe 500 - 4000 connections before nginx starts dropping them. not a problem for static assets, because even for relatively large image previews, not much work has to be done, and requests can be fulfilled quite quickly. it’s more of a problem for resources that have to be dynamically generated, because that’s work that the server has to repeat for each request. a cache can greatly reduce this workload for the server.
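
to illustrate the random delay mentioned above, here is a rough python sketch. the function names, the uniform distribution, and the one-second buckets are assumptions for illustration, not what any particular software does. spreading 30,000 fetches uniformly over a 60-second window averages 500 per second, instead of one burst at the start of the window.

```python
import random
from typing import Optional


def jittered_delay(window_seconds: float = 60.0,
                   rng: Optional[random.Random] = None) -> float:
    """Pick a uniformly random delay within the window before fetching the link."""
    rng = rng or random.Random()
    return rng.uniform(0.0, window_seconds)


def peak_requests_per_second(delays: list[float]) -> int:
    """Count the busiest one-second bucket among the scheduled fetches."""
    buckets: dict[int, int] = {}
    for delay in delays:
        second = int(delay)
        buckets[second] = buckets.get(second, 0) + 1
    return max(buckets.values(), default=0)


# Seeded only so that the illustration is repeatable.
rng = random.Random(0)

# Without jitter, every receiving server fetches immediately.
burst_peak = peak_requests_per_second([0.0] * 30_000)

# With jitter, each receiving server waits a random amount first.
delays = [jittered_delay(60.0, rng) for _ in range(30_000)]
jittered_peak = peak_requests_per_second(delays)
```

the jittered peak ends up far below the burst peak, but it does not reduce the total number of requests, which is where a cache or publisher-provided preview information helps.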

the “semi-centralized server” can be the origin server. you don’t strictly need an intermediary, but if you trust an intermediary, it can reduce load on the origin server by effectively using the intermediary as a proxy.

also, instead of using a custom protocol, you can use a standard one.

the publisher of the AS2 resource gets to decide, so yes they can do this if they wish.

I think the best option is to include an Object derived type, like Page or Article, at the top level. Link has limited metadata. I don’t think attachment → Link → preview → Article is the shortest path to getting metadata shared to the client.