As I am reading this very interesting report about content moderation on the fediverse, I thought it might be useful material to store here.
This study goes through existing practices and software for detecting and reporting child abuse material on social networks. It clearly explains how the existing tooling is made for centralised software, and the challenges posed by decentralised networks.
Some challenges are technical, due to the necessity of deleting images across the federation; some are human, due to the vulnerability of moderators who are untrained, lack tooling (grayscaling, color shifting, or blurring), and are often volunteers.
I understand that some of the propositions made here are problematic and lead to some form of centralisation, but discussing them is still necessary, as we do need to grasp the scope of an existing issue.
BTW I am posting this here because I found no cross-cutting category that deals with moderation issues. Maybe one is needed: moderation is a large part of the work for a sustainable fediverse.
Thank you very much for this. I would very much support a Moderation channel here, although maybe for clarity (since this is both a technical and a community/admin forum) Moderation and Moderation-tooling might be two different channels with very different audiences. This report might be relevant to both, but it is tooling for moderation that strikes me as most urgent to design, specify, and implement together across architectures.
I also feel like you should not have to apologize for the centralization-favoring recommendations of the authors, it is on this community to propose alternatives with feature-parity and outcome-parity if we do not want to be centrally-moderated. CSAM in particular has a tendency to trigger governments hitting software with blunt objects if outcomes dip below the number of “9s” specified in their SLAs with their customers (i.e. their constituents). Implementers and instance operators may have to work together to avoid this ugly fate.
While I’m in an alarmist mood, let me also point out this worrisome detail:
This strikes me as an issue for not just Moderation tooling but a core interoperability issue with catastrophic legal implications for instance operators across implementations. I.e., if:
A.) Moderators don’t have or know how to use interfaces for “nuking” toxic accounts, or if
B.) Nuked accounts aren’t transmitting Delete activities to all followers on other instances (regardless of server), or if
C.) The servers from which followers subscribed to toxic content aren’t executing those Deletes by Tombstoning those activities and any attachments/content,
then operators of other instances could be at second-hand risk through no wrongdoing of their own or of the implementers of their server software.
Regarding the delete: My understanding is that
- If CSAM is discovered, the Account is deleted.
- A “Delete Actor” is sent out without individual Delete activities for posts.
- “Delete Actor” activities do not appear in the Streaming API.
- It is up to other servers to delete the content of deleted Actors.
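The last point in the list above can be sketched roughly as follows. This is only an illustration under stated assumptions: the in-memory `cache`, the helper names, and the matching on `attributedTo` are hypothetical and do not describe Mastodon's actual implementation. A Delete whose object is the sending actor itself is treated as a request to drop every cached object attributed to that actor.

```python
from typing import Dict

# Hypothetical local cache of remote objects, keyed by object id.
Cache = Dict[str, dict]


def is_actor_delete(activity: dict) -> bool:
    """Return True when a Delete targets the sending actor itself."""
    if activity.get("type") != "Delete":
        return False
    obj = activity.get("object")
    # The object may be a bare id string or an embedded object with an id.
    obj_id = obj.get("id") if isinstance(obj, dict) else obj
    return obj_id is not None and obj_id == activity.get("actor")


def purge_actor(cache: Cache, actor_id: str) -> int:
    """Remove every cached object attributed to actor_id; return the count."""
    doomed = [oid for oid, o in cache.items() if o.get("attributedTo") == actor_id]
    for oid in doomed:
        del cache[oid]
    return len(doomed)


def handle_activity(cache: Cache, activity: dict) -> int:
    """Purge cached content when a "Delete Actor" arrives; otherwise do nothing."""
    if is_actor_delete(activity):
        return purge_actor(cache, activity["actor"])
    return 0
```

A real server would also have to remove media attachments from storage and any CDN, which this sketch leaves out.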
It would be nice if the Mastodon team could release a statement on whether this is the designed behavior. A second discussion should be whether this is the behavior we want.
I think the centralization or not discussion should start after understanding the threat vector. My understanding from what I’ve read so far is: This is about new accounts posting CSAM.
So something like: Let’s run the first X pictures somebody posts through PhotoDNA, would solve parts of the problem. It also wouldn’t affect normal users at all, or use up too many resources.
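A minimal sketch of that policy, assuming a hypothetical `matcher` callable that stands in for a hash-matching service (PhotoDNA's real API is neither shown nor implied here) and a hypothetical per-account upload counter. In this sketch a match is held for a human moderator rather than acted on automatically.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List, Tuple


class NewAccountScanner:
    """Scan only the first `limit` images each account uploads."""

    def __init__(self, matcher: Callable[[bytes], bool], limit: int = 10) -> None:
        self.matcher = matcher  # hypothetical stand-in for a hash-matching service
        self.limit = limit      # the "X" in "first X pictures"
        self.uploads: DefaultDict[str, int] = defaultdict(int)
        self.review_queue: List[Tuple[str, bytes]] = []

    def accept(self, account: str, image: bytes) -> bool:
        """Return True if the image may be published, False if held for review."""
        self.uploads[account] += 1
        if self.uploads[account] > self.limit:
            return True  # account is past its first X uploads: skip the scan
        if self.matcher(image):
            self.review_queue.append((account, image))
            return False  # hold for a human moderator
        return True
```

The `limit` trades coverage against cost: anything uploaded after the first X images is not scanned at all, so this only addresses the "new accounts" threat vector.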
Some time ago I read a discussion on Hacker News about another threat vector, whereby a malicious actor can manipulate an innocent-seeming picture such that it triggers CSAM detection software, and pass that along to unsuspecting victims. I don’t know if PhotoDNA is vulnerable to this; if it is, one should be very careful with automated procedures, and e.g. add manual review + reporting.
Continuing my thoughts on “Delete by moderators”. It should probably be reflected in the Delete Activity, e.g.
"summary": "Bad Actor was booted from our server for violating the law",
shouldPropagate indicates here that all content by the user MUST be deleted due to legal reasons. Not sure if this is actually a good format, but I feel something like this will be necessary. At least, if the bad content ever federates out… not sure if the study addressed this.
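For illustration, a moderator-initiated Delete along these lines might be assembled as below. Only `summary` and `shouldPropagate` come from the discussion above; the helper name, the example URL, and the choice of using the deleted account as both `actor` and `object` are assumptions. `shouldPropagate` is not part of ActivityPub or ActivityStreams today, so a real proposal would also need to define it in an extension context.

```python
import json

AS_CONTEXT = "https://www.w3.org/ns/activitystreams"


def moderator_delete(account_id: str, summary: str) -> dict:
    """Build a "Delete Actor" activity carrying the proposed propagation hint."""
    return {
        "@context": AS_CONTEXT,
        "type": "Delete",
        "actor": account_id,      # assumption: sent on behalf of the deleted account
        "object": account_id,     # deleting the actor itself, not a single post
        "summary": summary,
        "shouldPropagate": True,  # hypothetical property from this thread
    }


if __name__ == "__main__":
    activity = moderator_delete(
        "https://example.social/users/bad",
        "Bad Actor was booted from our server for violating the law",
    )
    print(json.dumps(activity, indent=2))
```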
The report itself doesn’t really explain, but I assume that section is just summarizing or synthesizing this thread by the primary researcher:
It’s not very clear which kind of Delete Activity is being described, what version of Mastodon the server was running, etc.
Evidence-based recommendations on how to improve child safety (at least in the EU) are available from EDRi.
From the executive summary:
This includes societal measures such as increasing access to welfare, mental health and other support services, as well as reforming judicial institutions and law enforcement authorities. Crucially, it also includes empowering children and teenagers to make sensible and informed decisions about how they act online by educating and empowering them.
National and EU institutions, services and authorities must enable this by ensuring that children and young people are supported and believed when reporting abuse, and that cases are pursued swiftly and with sensitivity to the young person’s needs, which are currently barriers to justice for survivors.
There are also many measures in existing legislation, particularly the 2011 CSA Directive, its upcoming revision, and the 2022 Digital Services Act, which will positively contribute to tackling CSAM, but which have not been (fully) implemented yet. The EU should also reinforce the network of national hotlines already leading the way in the fight against CSA, by ensuring that they have a legal basis for their work and more resources to carry it out.
Low-tech measures, such as ensuring that internet users can easily report abuse, can further help in the fight against online CSA. Implementing evidence-based prevention strategies will ensure that the EU’s approach tackles the roots of CSA, not just the symptoms. And by bringing all the right stakeholders to the table – children’s rights groups, digital rights groups, experts in tackling CSA, other human rights groups, and survivors – the EU will be able to develop sustainable measures that can protect fundamental rights, including children’s rights, and ensure a safe internet for all.
This subject is both a real issue and a right-wing scare tactic; it is good to be clear that we balance this mess rather than undermine the openweb.
I think our current moderation, from the perspective of the #4opens, works quite well. Let’s look at a few social issues we can build netiquette around:
The issue we maybe need to think about is the mega instances that are too big to defederate from; this is currently an issue.
Our tools are #FOSS and our protocols are open, so anyone can run any code they like as long as it supports #activitypub. Any hard tech encoding is therefore trivial to avoid, which makes a #techfix an obviously bad path to go down. We should be VERY wary of #mainstreaming agendas that push us down this path.
In general, as our openweb subculture is eaten by the #mainstreaming influx, there will be more and more “normal common sense” that is hard to ignore. We need to build stronger consensus to resist this pressure and push it back when it is obviously implemented anyway.
As we get more mainstream, trolling and bad-faith instances will become more of an issue. I think our current moderation can deal with this; what do people think?
OK, what can we actually do about the real issue of child safety? This is also a real question.
I think DSA is going to make it a real question for instances with reporting requirements relating to moderation decisions in the EU as outlined by Daphne Keller here. Even if an instance isn’t held to those requirements, if they want to federate with those that are (whether commercial/mega or noncommercial/cozy), they’ll have to basically honor them by proxy or somehow filter out the half of their content that poses a risk to their federation counterparties in the EU. As Thiel’s research linked above points out, instance operators are responsible for CSAM “stored on their systems” even if that’s just by following a user on another instance with less attention to these issues. To some degree, the laws against this stuff have always applied as much to the fediverse as to the non-commercial web, just a little selective/under-enforced as a courtesy. That might not be as possible now that it’s being automated and API-based rather than being discretionarily enforced…
I am the first to admit that child safety scare tactics (particularly in the US and UK) are currently being weaponized rampantly by the right, but as Federico’s link above points out, there are still evidence-based ways of scoping the problem to actual problems and not stupid culture wars. I think the dissemination of CSAM is an actual problem for any open system-- it incentivizes everyone to federate less and then we’ll have a few different fediverses: the non-commercial instances that can trust (and audit) each other’s moderation systems, partially federated to the commercial mega-instances that also have tooling to reciprocally enforce moderation minimums and reporting requirements, and a third federation that those first two can’t afford to risk federating with. I’m not counting the Japanese porn-iverse or the naziverse, so I guess there are 5?
When I read @aschrijver above about hacking PhotoDNA to propagate fake alerts, I thought that relying on a technical tool will always expose us to attackers that have an interest in making the tool less effective.
It seems to me that some Fediverse Enhancement Proposals are in order regarding the deletion mechanism, so that fast propagation of delete events is properly implemented in such cases, in a way that would not backfire as a censorship tool.
I want to highlight @kaniini’s topic about The Delete Activity And It's Misconceptions. Considering the ActivityPub specification mentions in the S2S Delete activity section:
7.4 Delete Activity
The side effect of receiving this is that (assuming the object is owned by the sending actor / server) the server receiving the delete activity SHOULD remove its representation of the object with the same id, and MAY replace that representation with a Tombstone object.
(Note that after an activity has been transmitted from an origin server to a remote server, there is nothing in the ActivityPub protocol that can enforce remote deletion of an object’s representation).
and in the Spam security consideration:
Spam is a problem in any network, perhaps especially so in federated networks. While no specific mechanism for combating spam is provided in ActivityPub, it is recommended that servers filter incoming content both by local untrusted users and any remote users through some sort of spam filter.
I think a FEP addressing CSAM and/or a shouldPropagate property, i.e., federated moderation, SHOULD mention an upgrade of the Tombstone requirement for deleted objects from MAY to SHOULD, since we cannot enforce a MUST. But beware, this can easily backfire with various use-cases:
- over-reacting servers (which may simply be hosted in weird legal jurisdictions where, e.g., breastfeeding images would be forbidden)
- attempts at censorship (which may be distinct from the previous case by the source and intent of the order)
It should be clear that addressing such concerns does require cool-headed and thorough discussion.
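To make the receiving side of section 7.4 concrete, here is a rough sketch under simplifying assumptions: objects live in a hypothetical in-memory `store`, and "owned by the sending actor / server" is approximated by comparing the scheme and host of the actor id with those of the object id. Signature verification, which a real server would need before trusting the activity, is omitted.

```python
from datetime import datetime, timezone
from typing import Dict
from urllib.parse import urlparse


def same_origin(a: str, b: str) -> bool:
    """Approximate ownership: both ids share scheme and host."""
    pa, pb = urlparse(a), urlparse(b)
    return (pa.scheme, pa.netloc) == (pb.scheme, pb.netloc)


def apply_delete(store: Dict[str, dict], activity: dict) -> bool:
    """Replace the stored object with a Tombstone; return True if it changed."""
    obj = activity.get("object")
    obj_id = obj.get("id") if isinstance(obj, dict) else obj
    actor = activity.get("actor")
    if activity.get("type") != "Delete":
        return False
    if not isinstance(obj_id, str) or not isinstance(actor, str):
        return False
    # Ignore deletes for objects we do not hold or that the sender does not own.
    if obj_id not in store or not same_origin(actor, obj_id):
        return False
    store[obj_id] = {
        "id": obj_id,
        "type": "Tombstone",
        "formerType": store[obj_id].get("type"),
        "deleted": datetime.now(timezone.utc).isoformat(),
    }
    return True
```

The ownership check is what keeps a third-party server from deleting content it does not host, which is one small guard against the censorship scenarios listed above.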
This is usually where it gets complicated-- there are MANY weird jurisdictions, and CSAM (according to one particular definition) is the closest we have to an almost-universal “legal reason” to delete content (perhaps the only legal reason for which people rarely ask for details or want to re-litigate the issue), with relatively few corner-cases (like the Japanese definition excluding computer-generated material, for example).
Every other “legal reason” (hate speech, abuse, revenge porn, pornography, blasphemy, unpatriotic content, race science, etc.) varies WILDLY by jurisdiction-- the most common example is WWII iconography being “illegal” (to host or distribute in any way) in Germany, which is actually a little bit overstated but still workable as a conversational example; the “breastfeeding is porn” or “showing ankle is porn” countries would make a less complicated one.
So while cases where content was tombstoned and its deletion propagated for being declared CSAM are relatively straightforward to know how to parse as the server being requested to propagate, it is a lot less straightforward when a server in Saudi Arabia says they’ve deleted someone’s post as “pornography.” In such cases, a server receiving a “please propagate” deletion might want to know the jurisdiction and the category of moderation, whether it was automatic or appealed/manual, etc. The receiving server might even appreciate, while we’re dreaming big, a link to a machine-readable description of that server’s moderation playbook/standards. With such a per-action object, you might be MORE or LESS willing to “re-moderate” rather than trust another moderation system’s conclusions. This is why I keep referring to “moderation receipts”, which are what DSA is asking commercial platforms to expose an [as-yet unspecified ] API for fetching!
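As a thought experiment, such a receipt could be modelled as below. Every field name here is hypothetical and is not taken from the DSA or any existing API, and the decision rule is just one possible policy. The sketch only shows how a receiving server might combine category, jurisdiction, and automation status when deciding whether to apply a propagated deletion directly or to re-moderate.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class ModerationReceipt:
    """Hypothetical per-action record attached to a propagated deletion."""
    object_id: str
    category: str                     # e.g. "csam", "pornography", "hate-speech"
    jurisdiction: str                 # e.g. a two-letter country code
    automated: bool                   # True if no human reviewed the decision
    policy_url: Optional[str] = None  # link to the sender's moderation standards


def decide(
    receipt: ModerationReceipt,
    trusted_categories: FrozenSet[str] = frozenset({"csam"}),
) -> str:
    """Return "apply" to honour the deletion as-is, else "re-moderate"."""
    if receipt.category in trusted_categories and not receipt.automated:
        return "apply"
    return "re-moderate"
```

Under this example policy, a manually reviewed CSAM deletion is applied immediately, while an automated decision or a jurisdiction-specific category goes to the local moderators instead.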
DSA is asking commercial platforms to expose an [as-yet unspecified ] API
Api Documentation - DSA Transparency Database is the current spec, RFC closed a week or two ago so I imagine we’ll see changes
I know! I sadly did not have time to respond to the RFC-- did anyone you know throw some input into the mulcher?
Just for reference, IFTAS is working with Thorn to make CSAM monitoring and reporting available to any server that wants to use the service. Early days, but we plan on making CSAM detection available by end of year. Anyone using a CDN should also check with their provider as there may be a CSAM hash and scan option available. Here’s Cloudflare’s: https://blog.cloudflare.com/the-csam-scanning-tool/
Yes to sharing resources and yes to platform coöps!!!
For people interested in the Cloudflare route, I would also mention that an instance operator on the fediverse shared with me their install notes for taking advantage of CSAM-detection capabilities built into Cloudflare’s R2 storage module. I’ll be testing it out and writing it up as a more step-by-step tutorial or blog post next week, if I can get it running for myself:
Fantastic, I’ll keep an eye out for your blog post!
Thanks for supplying the link.
I notice that there are two fields concerning automation (which are yes/no):
It’s a pity that there do not appear to be supplementary fields which (optionally) contextualise the hardware or software infrastructure. As such, it’s hard to be accountable with respect to how a decision was made, which could be pertinent for identifying instances where the software is making incorrect decisions (software reproducibility is an important and much under-prioritised domain of information systems).
For example, in the USA a woman who was 8.5 months pregnant (Porcha Woodruff) was arrested and detained following an inaccurate facial recognition match. Understandably, the victims of this technology would want procedures that leave an accountability trail (to remove flaws and ensure that vendors are not able to continue propagating inaccurate technology stacks).