When media such as images and video are uploaded and shared without captions, it excludes the visually impaired from knowing what the content is.
To address this, some user interfaces highlight such media so that people can be more easily made aware of this deficiency in what they may share. People can then reply to ask for a captioned repost, or decide whether they want to share or otherwise engage with such content.
The question is, would formalization of captioning make sense as a FEP? This area seems to be one that could use another look.
I am not sure if I understand it correctly.
The question is whether it adds anything new (people can already reply to ask for a captioned repost, and I hope they do so …).
Otherwise:
At yesterday's meeting we thought about creating topic-based "Best Practices" spaces here, and
the same as exists for Groups could be done for media captioning.
What is frustrating here is that I basically have all of my professional documentaries, films and photos captioned already in the files.
In XMP and DC/IPTC/NewsML …
In the other window I am working on a metadata parser (util.ts, but EXIF and IPTC parsers also exist in other repos) because I think it should be the task of any posting UI to highlight the need for captions.
In redaktor, when you upload media, you will then be asked if you want to use the default metadata as a template.
Which should in the end include (a sketch follows this list):
• for all [e.g. Page, Place, Event]: meta tags and DC, og and twitter, JSON-LD schema.org by means of schema:mainEntityOfPage, mf2
• for Image: additionally XMP/IPTC/EXIF
• for Video: additionally XMP
• for Audio: additionally ID3
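A minimal sketch of how a posting UI could consult these sources per media type; the type names and the captionCandidates helper are only illustrative, not the actual util.ts API:

```typescript
// Hypothetical sketch: which metadata sources a posting UI could consult
// per media type before suggesting a caption template (names are illustrative).
type MediaKind = 'page' | 'image' | 'video' | 'audio';

const metadataSources: Record<MediaKind, string[]> = {
  // common to all objects (e.g. Page, Place, Event)
  page:  ['meta', 'dc', 'og', 'twitter', 'json-ld', 'mf2'],
  image: ['meta', 'dc', 'og', 'twitter', 'json-ld', 'mf2', 'xmp', 'iptc', 'exif'],
  video: ['meta', 'dc', 'og', 'twitter', 'json-ld', 'mf2', 'xmp'],
  audio: ['meta', 'dc', 'og', 'twitter', 'json-ld', 'mf2', 'id3'],
};

// The posting UI could then ask each parsed source for a caption candidate
// and offer the results to the Actor as a template.
function captionCandidates(kind: MediaKind, parsed: Record<string, any>): string[] {
  return metadataSources[kind]
    .map((src) => parsed[src]?.description ?? parsed[src]?.caption)
    .filter((v): v is string => typeof v === 'string');
}
```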
The parser can be tested simply with util.js: add any URL at the end, cd to the directory, and run node ./util.
An example output for the last URL used is in /example.json, which is a SPIEGEL news article.
About the attachment property: for now, only as: elements or videos are copied over here, but this page has none.
Any additional elements are in the ld property (for now), so any implementation can choose.
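For illustration only, the output shape could look roughly like this; everything beyond the standard ActivityStreams properties (in particular the ld property and the example values) is assumed from the description above, not taken from /example.json:

```typescript
// Illustrative shape only: attachment carries copied as: elements (images/videos),
// while everything else the parser found stays in a non-standard `ld` property
// that implementations may read or ignore.
const parsedArticle = {
  '@context': 'https://www.w3.org/ns/activitystreams',
  type: 'Article',
  name: 'Example headline',
  url: 'https://example.org/some-article',   // the URL passed to util.js
  attachment: [
    { type: 'Image', url: 'https://example.org/teaser.jpg', name: 'Alt text, if found' },
  ],
  // additional parsed vocabularies (schema.org, og, DC, …) live here for now
  ld: {
    'og:site_name': 'Example Site',
    'dc:creator': 'Example Author',
  },
};
```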
Yes, I would like to have tags like "protocol", "experimental" and "best practice" to know directly what the FEP is about.
What kind of metadata should be stripped (e.g. location or overly granular time information)?
Well, I think the advantage is that we can let the user decide.
Let me explain:
The default in redaktor is that all metadata is stripped if the user did not opt in.
All requests in redaktor go through a proxy server.
For content-type image/* (by default only jpg, png, webp, avif, gif, heif) it uses sharp under the hood [the proxy can also resize etc.], and the withMetadata option is only active if the requesting user wants it.
In the posting window, the user can then choose what metadata to use or not.
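A minimal sketch of that opt-in as a plain Buffer-in/Buffer-out helper; only the sharp calls (resize, withMetadata, toBuffer) are real API, the surrounding handler shape is illustrative:

```typescript
import sharp from 'sharp';

// Sketch of the proxy behaviour described above: metadata is stripped by default
// and only kept when the requesting user opted in.
async function proxyImage(input: Buffer, opts: { width?: number; keepMetadata: boolean }): Promise<Buffer> {
  let pipeline = sharp(input);

  if (opts.width) pipeline = pipeline.resize(opts.width);   // proxy can also resize etc.

  if (opts.keepMetadata) {
    // opt-in only: carry EXIF/XMP/IPTC and ICC profile over to the output image
    pipeline = pipeline.withMetadata();
  }
  // without withMetadata(), sharp drops embedded metadata from the output by default

  return pipeline.toBuffer();
}
```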
Personally I do not post many photos on Mastodon because of the square crop, but this would make publishing my stories much easier.
Most important might be the message before users post, like "Your alt text is empty. You fail."
Please also note that the redaktor client also uses ActivityPub's support for multiple natural languages (language-tagged values) for i18n. But i18n is probably out of scope.
Something else comes to mind.
We need to specify what "caption" means.
There are basically two kinds of captions:
1. Speakable: the alt attribute of e.g. an <img>, or a text below, depending on the "alt decision tree"
2. Visible: a caption below the image, as often seen in news media
Both have totally different meanings.
Let's assume the Actor is a journalist or media organisation: then 2. is essential.
Let's assume the Actor is an artist posting an abstract piece: this actor probably wants to avoid 2., otherwise they would have become a writer …
We could somehow define what should be 1. and what should be 2. …
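One possible way to keep the two kinds apart in the markup; using name for the speakable alt text follows common practice for image attachments, while carrying the visible caption in summary is only an assumption for illustration (deciding this is exactly what a FEP would need to do):

```typescript
// Illustrative only: distinguishing the two caption kinds on an attached image.
const attachment = {
  type: 'Image',
  url: 'https://example.org/photo.jpg',
  // 1. Speakable: read by screen readers, rendered as the <img> alt attribute
  name: 'A harbour at dusk, two fishing boats moored at a wooden pier',
  // 2. Visible: editorial caption shown below the image, as in news media
  //    (summary is just one candidate property for this)
  summary: 'The old harbour of Exampletown, photographed in March',
};
```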
We've merged this CSS into two themes that are available by default in Ecko and are looking at making it something like a checkbox instead of a special theme. I believe there's a compose-time warning in Glitch as well. I suppose a FEP could be a UX recommendation at compose and display time.
Rather than dividing the cases by kind, I would go back to the intent of captioning, which is to map the content into linear text so that it can be consumed by screen readers or, in the case of transcripts for the hearing impaired, by reading.
Many videos have captions but many do not. Almost every image on the web lacks a caption, and in my experience maybe 5% of podcasts have transcripts.
Much of today's information is a mystery to the blind and hearing impaired. The fediverse, being newer, provides an opportunity to make accessibility a core part of our systems. It's clearer to me now that this is a valid problem and a FEP can't hurt, so I'll start working on a draft.
So there is a lot of code already written in between which could then help.
A main problem when developing the @redaktor widgets was support for the hearing impaired, because the major browsers do not support WebVTT for the audio element, which is sad.
So, for the as:Audio type widget, we have an overlay of a video element handling WebVTT if there are any tracks.
There is a topic with screenshots (early stage), "Seeking opinions on time-based content", but for time-based "extra content" the easiest approach is to supply WebVTT metadata as ActivityPub markup to add time-based ActivityPub content.
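A rough sketch of that workaround, assuming the widget receives an audio URL plus a list of WebVTT tracks; since the major browsers ignore <track> children of <audio>, an audio-only <video> element is used instead:

```typescript
// Sketch of the audio-with-captions workaround: browsers do not render WebVTT
// tracks on <audio>, so an (audio-only) <video> element is used instead.
function audioWithCaptions(
  audioUrl: string,
  tracks: { src: string; lang: string; label: string }[],
): HTMLVideoElement {
  const video = document.createElement('video');
  video.src = audioUrl;          // audio-only media plays fine in a <video> element
  video.controls = true;
  video.style.height = '3em';    // keep it visually small, like an audio player

  for (const t of tracks) {
    const track = document.createElement('track');
    track.kind = 'captions';
    track.src = t.src;           // WebVTT file
    track.srclang = t.lang;
    track.label = t.label;
    video.appendChild(track);
  }
  return video;
}
```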
[edit/addendum:]
Forgot to mention about the proxy (I can send the gist link via DM): it is currently handling all namespaces and vocabulary in https://developer.adobe.com/xmp/docs/XMPNamespaces/
and converting them to ActivityStreams vocabulary.
You can simply chain multiple image operations from sharp and withMetadata.
Then, via content negotiation on the Accept header, it outputs either the image with metadata embedded, or plain json or ld+json …
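A hedged sketch of that content negotiation with an Express-style route; fetchImage and xmpToActivityStreams are hypothetical stubs standing in for the gist, not its actual API:

```typescript
import express from 'express';
import sharp from 'sharp';

// Hypothetical helpers, sketched as stubs: fetching the upstream image and
// mapping XMP/EXIF buffers to ActivityStreams vocabulary.
declare function fetchImage(url: string): Promise<Buffer>;
declare function xmpToActivityStreams(meta: { exif?: Buffer; xmp?: Buffer }): object;

const app = express();

// Sketch only: pick the response representation from the Accept header.
app.get('/proxy', async (req, res) => {
  const input = await fetchImage(String(req.query.url));
  const accept = req.headers.accept ?? '';

  if (accept.includes('json')) {
    // plain json or ld+json: answer with the parsed metadata instead of pixels
    const { exif, xmp } = await sharp(input).metadata();
    res
      .type(accept.includes('ld+json') ? 'application/ld+json' : 'application/json')
      .send(JSON.stringify(xmpToActivityStreams({ exif, xmp })));
  } else {
    // image output: chain sharp operations and keep the metadata embedded
    const img = await sharp(input).resize(1280).withMetadata().jpeg().toBuffer();
    res.type('image/jpeg').send(img);
  }
});
```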
Privacy is important to me.
The thing should never be used by bots automatically creating ActivityPub objects.
It should be used by the Actor as an opt-in template for posting.
In any case, the posting interface should proactively notify the Actor about missing captions, the minimal a11y needs, and further guidelines.
In general, metadata should fall into 3 categories (a small sketch follows this list):
• Critical metadata – what should always be deleted …
EXIF Maker Notes
They often contain encrypted, very private and forensic data, like the temperature of the camera or encrypted email/account data, where "encrypted" is relative: the aforementioned examples (temperature/email) can be read by a beginner after two days of playing around, and other hacks are known to search engines …
• Metadata which was not produced by the author
EXIF
TIFF
In my code, users can opt in, but the data is filtered and lands in the instrument property of ActivityStreams.
• Metadata which was produced by the author
IPTC
DC
photoshop
[which is just a namespace to bridge IPTC/DC and Photoshop- or other app-specific values (e.g. iView)]
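A small sketch of such a three-way split; the category names and the routing of opt-in camera data into instrument follow the description above, but the concrete field checks are only examples:

```typescript
// Illustrative three-way classification of parsed metadata fields.
type MetaCategory = 'always-strip' | 'opt-in' | 'author';

function classify(namespace: string, field: string): MetaCategory {
  // 1. Critical: always deleted (e.g. EXIF MakerNote and similar vendor blobs)
  if (namespace === 'exif' && /makernote/i.test(field)) return 'always-strip';

  // 2. Not produced by the author (EXIF/TIFF camera data): opt-in only,
  //    and if opted in, filtered and mapped to the AS `instrument` property
  if (namespace === 'exif' || namespace === 'tiff') return 'opt-in';

  // 3. Produced by the author (IPTC, DC, photoshop namespace): usable as template
  return 'author';
}
```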
There is a super nice workshop by EBU Tech, Wikidata and IPTC about what I describe above.
In 6 days, on March 10 …
The invitation by the European Broadcasting Union reads:
"Wikidata has become one of the largest collections of open data on the web.
Join our workshop (10 Mar), held with IPTC, on how broadcasters and media organizations can use Wikidata to tag, enrich and enhance their content! #opendata with wikidata and IPTC"
The European Broadcasting Union is the place where all the public broadcasters of Europe work together …
Please note: if we specify any ActivityPub "flows" and some software does not support them, we could simply send the icon which Erik mentions and use it as a fallback.
To follow up on that toot a bit: the point made was that for accessibility there must be captions, but people can be unable (e.g. because of different disabilities) to write them, as Erik describes in:
Disabled people know that access needs can clash. I benefit from described images, but I know some people struggle to write them because of their own disabilities.
And he mentioned workarounds that currently exist, like adding a custom emoji for people to indicate they want their image inclusion to be captioned. For this you can also mention the Gup.pe group @imagecaptionspls, and if both the emoji and the group mention are missing, someone can reply with a mention of the @PleaseCaption bot.
(I always caption my images, and also always forget these account names, but that's why we need a more thorough built-in handling of captions.)