Overlapping taxonomies and the audience property

In cases of taxonomic overlap this is what I think should happen, and what currently happens (or at least should) in the Discourse plugin (bear in mind this is a very new problem for the Discourse plugin too):

  1. Actors following multiple taxonomies associated with the post get a single activity.

  2. Actors following one taxonomy associated with the post get a single activity.

  3. The “collection” referenced in the Object context will be the Topic the post is in (it’s only in one, even if there are multiple taxonomies).

  4. The “audience” of both the announce and the activity it’s wrapping should include all relevant audiences the activity was published to, i.e. the followers of the various associated taxonomies (it can be a list).
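
As a rough illustration of points 1, 2 and 4 together, here is a minimal Python sketch. It is not the Discourse plugin's code: the function name `fan_out`, the example URLs and the `<actor id>/followers` URL scheme are all assumptions made up for the example.

```python
from typing import Dict, List, Tuple


def fan_out(
    taxonomy_actors: List[str],
    followers_of: Dict[str, List[str]],
) -> Tuple[List[str], List[str]]:
    """Return (audience, recipients) for one post.

    audience   -> one followers collection per associated taxonomy (point 4)
    recipients -> each follower once, however many of the post's
                  taxonomies they follow (points 1 and 2)
    """
    audience: List[str] = []
    recipients: List[str] = []
    seen = set()
    for taxonomy in taxonomy_actors:
        # Hypothetical URL scheme: "<actor id>/followers".
        audience.append(taxonomy + "/followers")
        for follower in followers_of.get(taxonomy, []):
            if follower not in seen:
                seen.add(follower)
                recipients.append(follower)
    return audience, recipients


# Made-up actors: one category and one tag attached to the same post.
category = "https://forum.example/ap/actor/category-1"
tag = "https://forum.example/ap/actor/tag-fediverse"
followers_of = {
    category: ["https://a.example/users/alice", "https://b.example/users/bob"],
    tag: ["https://b.example/users/bob"],
}
audience, recipients = fan_out([category, tag], followers_of)
```

Here bob follows both the category and the tag but appears once in `recipients`, while `audience` still lists both followers collections.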

Now that I say this, I’m not sure the Discourse plugin is doing 4 properly yet, but that seems to me to be the right approach. Curious on others’ thoughts.

*edit, yeah 4 is not currently the case in the Discourse plugin. We just select the first taxonomic actor from the list of applicable taxonomic actors (preferencing tags over categories), but I don’t think that’s the “right” approach. Curious on others’ thoughts on 4.
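
For comparison, the "first taxonomic actor, tags before categories" selection described in the edit above could look roughly like this. It is a sketch only: the `kind` field and the function name are assumptions, not the plugin's actual data model.

```python
from typing import Dict, List, Optional


def pick_first_taxonomic_actor(actors: List[Dict[str, str]]) -> Optional[str]:
    """Pick a single actor id, preferring tags over categories.

    Each entry is assumed to look like {"id": ..., "kind": "tag" | "category"}.
    sorted() is stable, so the original order is kept within each kind.
    """
    ordered = sorted(actors, key=lambda a: 0 if a["kind"] == "tag" else 1)
    return ordered[0]["id"] if ordered else None


actors = [
    {"id": "https://forum.example/ap/actor/category-1", "kind": "category"},
    {"id": "https://forum.example/ap/actor/tag-fediverse", "kind": "tag"},
]
primary = pick_first_taxonomic_actor(actors)
```

With this selection the category's followers collection never reaches the audience, which is the gap point 4 is meant to close.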

yeah, that’s correct. and technically the webfinger is optional (well, we’re getting there, anyway).

there would indeed be a username conflict. you have a couple of options:

  • pick a unique username at time of creation (e.g. tag.hashtagname instead of hashtagname)
  • use the actual id for tags and don’t use webfinger (in which case, they will not be followable via mastodon at present time, but may be in the future?)

or…

this is actually not part of webfinger. what lemmy does is actually they allow conflicts. when you resolve acct:name@domain.tld you get multiple self-links and they lead to different actor ids. lemmy differentiates them with a weird hack by specifying in properties a property of https://www.w3.org/ns/activitystreams#type and a value of either Person or Group: https://lemmy.world/.well-known/webfinger?resource=acct:technology@lemmy.world
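
To make the mechanism concrete, here is a small Python sketch of how a client might pick between several self-links using that property. The JRD below is hand-written to match the description above (it is not copied from a live Lemmy response), and `select_actor` is a hypothetical helper.

```python
from typing import Optional

AS_TYPE = "https://www.w3.org/ns/activitystreams#type"


def select_actor(jrd: dict, wanted_type: str) -> Optional[str]:
    """Return the href of the first rel=self link whose properties
    declare the wanted type, or None if there is no such link."""
    for link in jrd.get("links", []):
        if link.get("rel") != "self":
            continue
        properties = link.get("properties") or {}
        if properties.get(AS_TYPE) == wanted_type:
            return link.get("href")
    return None


# Hand-written example document with a username conflict.
jrd = {
    "subject": "acct:technology@lemmy.example",
    "links": [
        {
            "rel": "self",
            "type": "application/activity+json",
            "href": "https://lemmy.example/u/technology",
            "properties": {AS_TYPE: "Person"},
        },
        {
            "rel": "self",
            "type": "application/activity+json",
            "href": "https://lemmy.example/c/technology",
            "properties": {AS_TYPE: "Group"},
        },
    ],
}
group_id = select_actor(jrd, "Group")
person_id = select_actor(jrd, "Person")
```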

i can’t recommend doing the same, as this is at best a very confusing way of doing things. there is no property as:type, as type is just an alias for @type. also, if anything should be a uri, it should be the values of https://www.w3.org/ns/activitystreams#Person and https://www.w3.org/ns/activitystreams#Group, not Person and Group. and furthermore, there is nothing inherent to the type that implies the handling that they receive – that’s not what “Person” and “Group” mean. one alternative is to use a prefix property instead, and have its value be @ or !. this is somewhat better but still not ideal, because it relies on an otherwise-undefined convention that all relevant projects have to agree upon (and support for “bangtags” is not universal). but at least it doesn’t rely on abusing the type declaration, i guess?

there is a tangential point here in that it is hard to reify exactly what is meant by @, !, # as these are just microsyntaxes and in fact have no semantic meaning – they’re all just signals that some link should be created when parsing plain text input. you could just as easily use + instead of @ (as google+ did), or @ instead of #, or so on. you could have entirely different interpretations where @ is a “public mention” and ! is a “private mention”. or that some symbol is interpreted as to, and some other symbol is interpreted as cc.

this sounds like it won’t work nicely with 1b12 as 1b12 seems to assume a single Group actor in audience. so putting multiple values in there might not violate the letter of the FEP, but it seemingly violates the spirit of it. my personal opinion is that there is a deeper categorical error being made by assuming that some actor must be a Group, but that’s beside the point for now. either you generate one activity per 1b12 actor (contrary to your point 1), or you generate one activity period (which gets forwarded to other, “non-primary” actors who don’t own the thread per se, they just Announce the thread or whatever – there’s wiggle room for determining the exact protocol here by which these “downstream actors” make their followers aware of new threads that aren’t managed by them. the debate is largely between Add and Announce, of course.)

Hm, just an initial take, but not sure I agree. Again, this is preliminary, but I’ll start by listing a few things I think suggest having multiple actors in the audience property may make some sense, and be in the spirit of 1b12:

  1. Firstly, from a purposive perspective 1b12 “reintroduces” the audience property as it were, noting that it is underused and could suit this type of federation (i.e. a group federating content on behalf of other actors). In that spirit I feel it would be a bit over-determined to limit the ways in which the property could be used in this vein. To put it another way, I don’t think @nutomic was attempting to enumerate all the possible ways this style of federation could work, rather he was attempting to describe a way it can work, based on the way it does work in some existing implementations. I think there is scope for multiple audiences in that spirit.

  2. Secondly, from a normative perspective, I think it’s useful to think about the ways in which the activity of an actor “belongs to” one or more Group actors in the sense discussed in 1b12 and in the sense implied by the audience property itself. In the case of overlapping taxonomies, e.g. categories and tags in Discourse or Wordpress, it doesn’t really make sense to say that the activity of an actor with multiple taxonomies “belongs to” any one of those taxonomies. It belongs to each of them equally. To force it to belong to one of those taxonomies as the “primary” feels a bit artificial in the context of a platform like Discourse (or Wordpress for that matter).

  3. Thirdly, given that 1b12 is itself relatively “new” (in a standards sense) I feel that there is still scope to “flesh out” the way the audience property can work in this approach, particularly as we start to apply 1b12 to new cases like those of NodeBB, Discourse and Wordpress.

  4. Lastly, I should clarify that part of why I think listing multiple follower lists in the audience may make some sense is because the audience affects inbox forwarding. If other platforms are to forward the activity correctly they should know what its full audience is. In the case of multiple taxonomies that audience includes the followers of each taxonomy.

@trwnh@mastodon.social additionally the limitation of a context to a single audience is mostly artificial (and not a technical one) depending on software. At least with ours, it is mainly to satisfy existing user expectations. A context could easily be a member of multiple audiences should the need arise.

I recall in conversation around the time that 1b12 was first being written that the use of audience was motivated by not wanting to iterate through to/cc to find a Group actor. so the point very much seems to be to indicate the Lemmy community that you posted the activity to, but not using to. (i don’t particularly agree with this usage, but that’s how it was presented.)

well, audience has nothing to do with “belongs to”, really. Activity Vocabulary indicates that activities can be “scoped to a particular audience using the audience property” – the definition of which is “one or more entities that represent the total population of entities for which the object can considered to be relevant”, as given in Activity Vocabulary specifically. now, that’s a poorly-worded definition (grammatically speaking), but “can [be] considered to be relevant” is the operative part of it. in activitypub, due to the way that audience works for delivery, it makes for a useful mechanism for “keep these people in the loop” if you copy the audience over to your own activity. and if audience is a collection, you can make use of inbox forwarding to especially keep bto and bcc recipients in the loop.
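
For reference, the delivery behaviour mentioned here can be sketched in a few lines of Python. ActivityPub takes delivery targets from to, bto, cc, bcc and audience, and bto/bcc are removed before the activity is sent out. The helper names are made up for the example, and real delivery also involves resolving collections to inboxes, which is left out.

```python
ADDRESSING_FIELDS = ("to", "bto", "cc", "bcc", "audience")


def as_list(value):
    """Normalise a missing, single or multi-valued property to a list."""
    if value is None:
        return []
    return list(value) if isinstance(value, (list, tuple)) else [value]


def delivery_targets(activity: dict) -> list:
    """Collect every addressed id once, in field order."""
    targets = []
    for field in ADDRESSING_FIELDS:
        for target in as_list(activity.get(field)):
            if target not in targets:
                targets.append(target)
    return targets


def strip_blind_fields(activity: dict) -> dict:
    """Copy of the activity without bto/bcc, as it would go on the wire."""
    return {k: v for k, v in activity.items() if k not in ("bto", "bcc")}


# Placeholder ids standing in for real actor and collection URLs.
activity = {
    "type": "Create",
    "to": "A",
    "cc": ["B"],
    "bto": "C",
    "bcc": "D",
    "audience": "some-audience",
}
targets = delivery_targets(activity)
outgoing = strip_blind_fields(activity)
```

Because audience survives on the outgoing copy while bto and bcc do not, a collection in audience is the only addressing field a receiver can still copy or forward to in order to reach the blind recipients.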

given all this, the use of audience that makes sense to me is something like this:

id: <some-activity>
actor: <someone>
type: Create
object: <some-note>
context: <some-context>
to: <A>
cc: <B>
bto: <C>
bcc: <D>
audience: <some-audience>

where <some-audience> is a (private) Collection that includes C and D. another way you can look at it is through the lens of FEP-7888. i don’t know if i worded this fully clearly (and i might have to go back and revise the wording), but the intent of audience in that FEP is to “scope to a relevant audience” by “keeping the audience in the loop” (copying context.audience to your own activity). the following example is a bit contrived but technically valid:

id: <some-context>
type: OrderedCollection
inbox: <some-context/inbox>
followers: <some-context/followers>
audience: [<1>,<2>,<3>,...,<100>]
attributedTo: <some-context>

id: <some-activity>
actor: <you>
type: Create
object: <some-object>
to: [<some-context>, <some-context/followers>]
audience: [<1>,<2>,<3>,...,<100>]

in general you wouldn’t want to do this, however. it’s easier to have a collection representing the sum total audience, rather than just relying on a JSON-LD set/array.

the problem with that is that 1b12 is already FINAL status and can’t be updated. with that said, i am wondering if the use of audience deserves its own FEP or whether it’s enough to be “part of” other FEPs like 1b12 and 7888.

i agree with you here, and that is also the intent of how 7888 uses audience. the current general “template” for interaction is like so:

id: <some-activity>
actor: <you>
type: Create
object:
  - id: <your-object>
    context: $(context)
to: $(context.attributedTo) + $(context.followers if context has followers)
audience: $(context.audience)

there’s some flexibility on whether it is good practice to instead rely on the audience-copying behavior and just stuff the context.followers into audience, and there’s also some debate about whether cc would be more appropriate. i haven’t really fully thought about that yet.
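
The template above can be read as a small function. The following Python sketch is one interpretation of it, not normative FEP-7888 behaviour; `address_reply` and the example ids are assumptions.

```python
def as_list(value):
    """Normalise a missing, single or multi-valued property to a list."""
    if value is None:
        return []
    return list(value) if isinstance(value, (list, tuple)) else [value]


def address_reply(context: dict, actor: str, object_id: str) -> dict:
    """Build a Create whose addressing is derived from the context."""
    # to: context.attributedTo, plus context.followers if it exists
    to = as_list(context.get("attributedTo"))
    if context.get("followers"):
        to.append(context["followers"])
    activity = {
        "actor": actor,
        "type": "Create",
        "object": {"id": object_id, "context": context["id"]},
        "to": to,
    }
    # audience: copied over from the context, if it declares one
    if context.get("audience") is not None:
        activity["audience"] = context["audience"]
    return activity


# Made-up context object for illustration.
context = {
    "id": "https://forum.example/topic/1",
    "attributedTo": "https://forum.example/category/1",
    "followers": "https://forum.example/topic/1/followers",
    "audience": "https://forum.example/category/1/followers",
}
reply = address_reply(context, "https://a.example/users/alice",
                      "https://a.example/notes/42")
```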

more or less, yeah.

I would firstly direct you to the text of 1b12, which is the primary thing we’re talking about here.

Audience property

In order to render content in a forum, it is necessary to know which particular forum the content belongs to.

Emphasis is mine. But this is really by-the-by, the question is what we do now. 1b12 is not a bible. It is a good description of the approach we’ve decided to take to deal with group federation, but it was never going to be the last word on the subject, particularly as it was written prior to platforms like Discourse, NodeBB, Wordpress etc starting to federate in this way.

Perhaps. My thinking is we should flesh out the thinking here a bit more and see if we can land in a place that warrants clear definition. I think the goal here is an approach that works for these various platforms that make significant use of multiple, and overlapping, audiences.

hm, right now, I’m not sure I agree. I think it is better to represent reality as it is. In the case of overlapping taxonomies from a normative perspective the “audience” is not a single collection. It is inherently pluralistic. Moreover the audience property can equally be an array of collections. Right now I don’t see the utility of artificially forcing it into a single collection.

The other thing to keep in mind here is that we’re not really contending with widespread existing usage here. As 1b12 also observes:

Currently there are different approaches to specify which group a given object or activity belongs to

To simplify this process, we propose to specify the group identifier in the audience property

In other words, there isn’t a widespread existing convention with respect to the audience property. We need not be constrained by those kinds of considerations here, as we sometimes are in other respects. We can operate from first principles here more than we sometimes do.

well, if we’re talking primarily about 1b12 then sure, that is what it says. it’s an assertion i don’t particularly agree with, and it’s an assertion that doesn’t particularly feel like it fully applies, either. namely:

the main point of contention / topic of discussion as i see it is still the audience property and when/why it should be used, and especially as opposed to using to/cc. at least for lemmy’s use-case, they decided to adopt audience as a “shortcut” to point to the community actor, so that they didn’t have to iterate through multiple entries in to/cc. (this is the part where i say that this isn’t really in keeping with the AS2 definition or the AP delivery mechanism.) but again:

the definition is indeed not clear to me either, at least not how 1b12 uses it (“belongs to”?) or intends it. i can only point to what came up in the earlier discussions when this stuff was being written. i specifically suggested at one point that audience could include as:Public but this was rejected as a suggestion because it would defeat the purpose of avoiding iterating over the property.

the thing that i keep coming back to is that it is not entirely clear what the difference between to/cc and audience is or should be, in an ideal sense of “represent[ing] reality as it is”, as you put it. from an AP perspective, all three will trigger delivery. from an AS2 perspective, it’s got something vaguely to do with “scoping” and “relevant”. this doesn’t sound the same as the “belongs to” language that 1b12 uses. conventionally, you could just as easily copy the to/cc just as you could copy the audience.

i’ll do some archaeology on the subject so we can see if there’s any historical evidence towards any particular usage or not.

So here’s some interesting things i found…

what is an “audience” generally

wrt “primary and secondary audience” as established by to/cc/bto/bcc:

elf-pavlik: what we consider an audience? other people, groups, circles, lists of contacts …

jasnell: Any of these. This is intentionally left open.

audience used to be scope

The scope indicates that the audience for the note is only members of the Organization.
The to indicates specific people who should be actively notified.
The context indicates a larger context within which the note exists.

so at the very least we can surmise the following points:

  • “to” should actively notify specific actors
    • implying that “cc” should passively notify specific actors? or not notify them, just deliver to them?
  • “context” is as we already understand it, a purposeful grouping within which the object exists
  • “scope” (later “audience”) is some kind of indication of something

To be clear: scope is not access control… it is closely related to to/bto/cc/bcc in that a consuming implementation can use it to determine who it ought to display the content to. So, for instance, given the note example above, a consuming implementation may include the note on the activity timeline of anyone associated with the ‘My Employer’ organization, but it would only activity notify two individuals listed by the to property. The context property, on the other hand, has absolutely nothing to do with audience targeting. The above note is essentially saying, “This is a note that was created in relation to A Project. Make the note available to anyone in the My Employer organization but specifically notify John and Sally”

scope is advisory as to the publishers intent of whose attention they want to draw to the object. A consuming application may use the scope/to/bto/cc/bcc to determine it’s access control policy if it wishes, but is not required to do so.

In the example, a consuming provider could still choose to allow anyone to see the note, but only actively include the note on the activity streams of people in the company.

scope could be renamed to audience

In the AS2 vocabulary, there is a scope property that is used generally
to identify the audience. The targeting properties to, bto, cc and
bcc indicate the audience subsets within that identified scope. The
context is really intended to allow objects and activities to be
logically grouped. For instance, in an enterprise setting, the context may
group activities by project while the scope would identify one or more
teams for which the activity is considered relevant, while the to/cc fields
are used to indicate specific individuals to notify.

gonna emphasize this bit here:

The targeting properties to, bto, cc and
bcc indicate the audience subsets

and also

advisory as to the publishers intent of whose attention they want to draw to the object

a consuming implementation may include the note on the activity timeline of anyone associated with the ‘My Employer’ organization, but it would only activity notify two individuals listed by the to property.

only actively include the note on the activity streams of people in the company.

Very useful! Thanks for doing that. This generally chimes with my current understanding of the role of the audience property. It is making me think we may need an “Addressing” FEP to describe how to use audience AND to/bto/cc/bcc. I’m curious to get others’ takes on what we’ve already laid out, and what you’ve just dug up.

cc @devnull @pfefferle @nutomic @eprodrom

i will catch up on the discussion, but I am not sure I can be helpful here… I am relatively new to the Groups idea and am trying to figure out how to support it as best as possible and to be as compatible as possible…

synthesis time

i think the use of audience is basically “here’s everybody that can/should see this in a feed”. whereas to and cc are for notification policies (with to being intended to actively generate a notification). delivery happens for all of them, but after delivery, the consuming application needs to reconstruct possible intentions.

BUT

this doesn’t chime with the current usage by mastodon et al. with mastodon, audience is generally ignored, only to and cc are considered, and notifications are instead determined by the presence of a Mention in the tag array. (which imo shouldn’t generate a notification at all? you should be able to mention someone passively if desired.)

so we have the intended ideal of audience being used for delivering to, well, audience… and to/cc for generating active/passive notifications. but instead, we have a situation where to/cc are used for delivering to an audience, and Mention tags are used for generating notifications (and audience is ignored).

If we want to have a compatible migration path forward, then for now:

  • to should include anyone you want to notify
  • cc should include everyone else you want to deliver to
  • audience should include the sum total of both of these? (but if you want to support bto/bcc, then it realistically needs to be a private Collection)
    • alternatively, audience should include any actors whose feeds / activity-streams should include this activity/object. (this still works out in practice to anyone mentioned, plus your followers, plus probably as:Public – which is likely the union of to and cc anyway)
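
The first reading of these rules (audience as the sum of to and cc) can be sketched in Python as follows. This is an illustration of the proposal only, not an existing convention, and the function name and placeholder ids are assumptions.

```python
def migration_addressing(notify: list, deliver_only: list) -> dict:
    """to = who to notify, cc = everyone else to deliver to,
    audience = the union of both, in that order."""
    to = list(notify)
    # Anyone already in "to" does not need repeating in "cc".
    cc = [target for target in deliver_only if target not in to]
    return {"to": to, "cc": cc, "audience": to + cc}


# Placeholder ids for the people and collections involved.
fields = migration_addressing(
    notify=["john"],
    deliver_only=["as:Public", "my-followers"],
)
```

bto and bcc are deliberately left out here: as noted in the list, covering them would require audience to be a private Collection rather than a plain array.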

sample object/activity targeting for a basic (read: microblogging) use case:

content: "So here's what I have to say..."
inReplyTo:
  - attributedTo: <john>
    content: "I said something."
to: [<john>] # this should be interpreted as notifying john
cc: [as:Public, <your followers>] # this is interpreted as an "unlisted" post in mastodon parlance, and is necessary for delivery to mastodon currently
audience: [<john>, as:Public, <your followers>] # this is interpreted as showing in john's home timeline, as well as your followers' home timelines, and also being accessible to anyone without authentication

so maybe the use of audience can address mastodon’s fears of showing “group” posts in home timelines? i know this is a big concern that mastodon brought up with their current groups PR, they very explicitly do not want group posts showing up in home timelines… perhaps some heuristic can be designed so that newer versions of mastodon can filter out posts from home timelines if someone is not included in audience?

sample object/activity targeting for a more complex (read: forum/discussion) use case:

content: "synthesis time. i think the use of audience..."
context: <this thread>
cc: [as:Public, <my followers>]
audience: [as:Public, <socialhub/threadiverse>, <socialhub/threadiverse/followers>, <this thread/context's followers/audience>]

in this case because <my followers> are not in audience, an updated/newer mastodon et al can know not to display this post in my followers’ home feeds. there’s still the sticking issue for mastodon in how they can get older, non-updated mastodon servers to drop the post, although i really think this is foolish, personally… you could still do it by using only audience and then older mastodon will drop the activity because it doesn’t understand the recipients.
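
A possible shape for that heuristic, sketched in Python. This is hypothetical: it does not describe anything Mastodon currently does, the function and parameter names are invented, and treating a missing audience as "show it" is an extra assumption made here so that existing posts keep their current behaviour.

```python
def as_list(value):
    """Normalise a missing, single or multi-valued property to a list."""
    if value is None:
        return []
    return list(value) if isinstance(value, (list, tuple)) else [value]


def show_in_home_timeline(activity: dict, viewer: str,
                          author_followers: str) -> bool:
    """Decide whether a follower's home timeline should show the post."""
    audience = as_list(activity.get("audience"))
    if not audience:
        # Assumption: no audience at all means "behave as before".
        return True
    # Show only if the viewer, or the author's followers collection
    # (which the viewer belongs to), is named in audience.
    return viewer in audience or author_followers in audience


microblog = {"audience": ["john", "as:Public", "my-followers"]}
forum = {"audience": ["as:Public", "socialhub/threadiverse",
                      "socialhub/threadiverse/followers"]}

in_home_microblog = show_in_home_timeline(microblog, "carol", "my-followers")
in_home_forum = show_in_home_timeline(forum, "carol", "my-followers")
```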

An “Addressing” FEP would be very helpful.
Public and private groups, followers-only and other limited visibility modes, circles and lists - implementers need a unified framework that will guide the development of these features.

In Lemmy each post belongs to a single Group, which is specified in audience. Having multiple audience values would be valid according to ActivityStreams, but there is no way that Lemmy could support it.

If I understand right, you are discussing categories or tags here (where a single post can have more than one). Wouldn't it make more sense to use a different field for that case, such as tag, which is used for hashtags?

I think we're at a bit of a fortunate point in time where audience is not used widely, so there are limited unintentional side effects to using it.

To expand on what I said earlier, while technically there is nothing stopping a context in NodeBB from being a part of multiple audiences, our UI is largely built around them only being a part of one.

To give additional context (ha!), private groups on NodeBB do not have their own space to talk (e.g. Facebook groups), they are literally a grouping of users. If you wanted to have a private space to discuss within group members, you'd create a category and limit its access to that group. At least for me, it makes it much easier to think of when there are fewer moving parts.

So to that end, if we were to support federated private group discussions, the audience would just be the category (with access restrictions) those objects and contexts are posted to, exactly how 1b12 expects.

The idea of whether or not to put the public collection in as:audience is semi-related. When @trwnh@mastodon.social mentioned it in-thread, I thought it made a lot of sense.

We are already iterating through to and cc for collections, but the public address always seemed like a one-off exception that needed special handling. I dislike special handling.
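
As a rough Python sketch of what that looks like without special handling: the public collection is treated as one more entry in whichever addressing fields are scanned. The three spellings are the ones ActivityPub warns plain-JSON consumers about; the function name and the choice of fields are assumptions for the example, not NodeBB's implementation.

```python
# The Public collection can show up in any of these forms when an
# implementation handles activities as plain JSON.
PUBLIC_IDS = {
    "https://www.w3.org/ns/activitystreams#Public",
    "as:Public",
    "Public",
}


def as_list(value):
    """Normalise a missing, single or multi-valued property to a list."""
    if value is None:
        return []
    return list(value) if isinstance(value, (list, tuple)) else [value]


def is_public(activity: dict, fields=("to", "cc", "audience")) -> bool:
    """True if any scanned addressing field names the Public collection."""
    return any(
        target in PUBLIC_IDS
        for field in fields
        for target in as_list(activity.get(field))
    )


via_audience = is_public({"audience": ["as:Public", "https://forum.example/category/1"]})
via_cc = is_public({"cc": "https://www.w3.org/ns/activitystreams#Public"})
not_public = is_public({"to": ["https://a.example/users/alice"]})
```

Passing `fields=("audience",)` gives the single-property lookup, at the cost of missing senders that only put the public collection in to or cc.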

@julian On this topic, I wholeheartedly vote that everyone be the change they want to see in the (de facto) spec.

@jenniferplusplus @julian how do you decide on notifications? to/cc/bcc or tag -> Mention?

@thisismissem@hachyderm.io @jenniferplusplus@hachyderm.io typically two ways:

  • If mentioned in an object
  • New object encountered in a context that you "watch"

The latter is implementation-specific and is not available on Mastodon due to the lack of context.
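
In Python, those two triggers might be sketched like this. It is illustrative only, not NodeBB's code: the function name, the return values and the idea of passing watched contexts as a set are all assumptions.

```python
from typing import Optional, Set


def as_list(value):
    """Normalise a missing, single or multi-valued property to a list."""
    if value is None:
        return []
    return list(value) if isinstance(value, (list, tuple)) else [value]


def notification_reason(obj: dict, user_id: str,
                        watched_contexts: Set[str]) -> Optional[str]:
    """Return why the user should be notified about obj, or None."""
    # 1. The user is mentioned in the object (tag -> Mention).
    for tag in as_list(obj.get("tag")):
        if (isinstance(tag, dict) and tag.get("type") == "Mention"
                and tag.get("href") == user_id):
            return "mention"
    # 2. The object appears in a context the user watches.
    if obj.get("context") in watched_contexts:
        return "watched-context"
    return None


note = {
    "type": "Note",
    "context": "https://forum.example/topic/1",
    "tag": [{"type": "Mention", "href": "https://a.example/users/alice"}],
}
alice = notification_reason(note, "https://a.example/users/alice", set())
bob = notification_reason(note, "https://b.example/users/bob",
                          {"https://forum.example/topic/1"})
carol = notification_reason(note, "https://c.example/users/carol", set())
```

Note that to and cc play no part in the decision here, which matches the observation in the replies that follow.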

@julian yeah, I don't think anyone uses to/cc/bcc as a mechanism for notifications — I think everyone's settled on tag->Mention as being the “way”

@thisismissem@hachyderm.io if coming from outside ActivityPub, the whole concept of addressing is foreign. It's only really used in the context of email... so trying to bolt on newer concepts like visibility and private collections may be challenging.