I’m interested in participating in the Fediverse and I’m wondering how people are thinking about search at the moment.
I’ve looked at the docs for Mastodon, Lemmy, and Pixelfed, and nobody seems to describe how search works for these federated systems! Does anyone know how these platforms or others implement search, and whether there is any support for searching beyond the local instance?
Is anyone working on a protocol for federated search? It seems like there are a bunch of possibilities beyond ‘giant central index’ which could be really interesting. For example, if there is a common protocol for search, a server could forward requests on to some other servers. Or there could be a search platform which, instead of indexing, just fires off a request to the search endpoints of a bunch of servers.
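To make the "fire off a request to a bunch of servers" idea concrete, here is a rough sketch of what such a fan-out searcher might look like. Nothing here is an existing Fediverse API: the `/search?q=` endpoint, the `orderedItems` response shape, and the function names `http_fetch` and `fan_out_search` are all assumptions made up for illustration.

```python
import json
import urllib.parse
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def http_fetch(instance, query, timeout=5):
    """Query one instance's (hypothetical) /search endpoint and return its items."""
    url = f"https://{instance}/search?" + urllib.parse.urlencode({"q": query})
    request = urllib.request.Request(
        url, headers={"Accept": "application/activity+json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        collection = json.load(response)
    # Assumed response shape: a collection with an "orderedItems" list.
    return collection.get("orderedItems", [])


def fan_out_search(query, instances, fetch=http_fetch):
    """Send the same query to every instance and merge the answers."""

    def safe_fetch(instance):
        try:
            return fetch(instance, query)
        except Exception:
            # An unreachable or non-supporting instance contributes nothing.
            return []

    with ThreadPoolExecutor(max_workers=8) as pool:
        per_instance = list(pool.map(safe_fetch, instances))

    merged, seen = [], set()
    for items in per_instance:
        for item in items:
            key = item.get("id")
            if key is not None and key in seen:
                continue  # the same object was reported by an earlier instance
            if key is not None:
                seen.add(key)
            merged.append(item)
    return merged
```

The `fetch` parameter is injectable so the merging logic can be tried out without touching the network. Ranking, paging, and deciding which instances to trust are all left out of this sketch.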
What’s the current thinking around search on/in the Fediverse?
Nothing exists that I know of. But it sounds like an interesting idea. You should build it!
For text search, we just do a Postgres query with ILIKE %query% on the local database. There is no federated search, except that you can fetch remote objects by searching something like
Thanks for your replies.
I’m wondering if all this needs is a standardised GET /search endpoint, which could accept either free text or some Boolean expression. Supporting instances would then be responsible for determining which local objects match the query and returning the collection of results.
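As a rough sketch of what the instance side of that could look like, the function below takes a free-text query and returns the matches wrapped in an ActivityStreams `OrderedCollection`. The function name `search_local`, the choice of the `name` and `content` fields, and the plain substring matching are assumptions for illustration only; a real instance would be free to search however it likes.

```python
def search_local(objects, query):
    """Return an OrderedCollection of local objects whose text contains the query."""
    needle = query.casefold()
    # Which fields get searched is an instance-level choice; these two are assumed.
    fields = ("name", "content")
    matches = [
        obj
        for obj in objects
        if any(needle in str(obj.get(field, "")).casefold() for field in fields)
    ]
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "OrderedCollection",
        "totalItems": len(matches),
        "orderedItems": matches,
    }
```

A handler for the hypothetical GET /search?q=... route would then only need to serialise this dictionary as JSON.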
Since the Activity vocabulary is specified as JSON, it seems reasonable to me to want to support some kind of ‘JSON query language’, so that someone could, for example, search for Question type activities with particular options, or only oneOf Questions. But I’m not sure if there is a standard for this or what the popular solutions are.
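One very simple way to picture such a query is "query by example": the query is itself a partial JSON object, and an activity matches if it contains everything in the query. The sketch below is not an existing standard, and the `matches` function and its matching rules are assumptions of mine. It also ignores the fact that ActivityStreams allows a property to be either a single value or an array.

```python
def matches(value, pattern):
    """Return True if `value` contains everything described by `pattern`."""
    if isinstance(pattern, dict):
        # Every key in the pattern must exist in the value and match recursively.
        return isinstance(value, dict) and all(
            key in value and matches(value[key], sub) for key, sub in pattern.items()
        )
    if isinstance(pattern, list):
        # Every pattern element must be matched by at least one element of the value.
        return isinstance(value, list) and all(
            any(matches(item, sub) for item in value) for sub in pattern
        )
    return value == pattern


poll = {
    "type": "Question",
    "name": "Tea or coffee?",
    "oneOf": [{"type": "Note", "name": "Tea"}, {"type": "Note", "name": "Coffee"}],
}

# Only oneOf Questions: the empty list just requires a "oneOf" list to be present.
only_one_of = {"type": "Question", "oneOf": []}
# Questions that offer a particular option.
has_tea_option = {"type": "Question", "oneOf": [{"name": "Tea"}]}

print(matches(poll, only_one_of), matches(poll, has_tea_option))
```

Something like this could be sent as the body or a parameter of a search request, with free text remaining the fallback for instances that do not support structured queries.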
With free text search, it would be entirely up to the instance to decide how that was implemented, what fields are searched, etc. So that might be a really easy way to get an MVP spec off the ground.
I remember @schmittlauch was working on federated tag search as part of a university assignment. Any news on this?
Unfortunately not. After the theoretical work on an architecture, I started implementing a prototype, in the meantime evolving the DHT used and specifying a preliminary protocol format. See Hash2Pub.
Unfortunately, that work never got to a usable state. While I do still plan to resume that work if necessary at some point, it has not been a high priority for me so far. So feel free to ask me about details if you’re working on this yourself.
Regarding full text search, I’m sure that an architecture for that would make significantly different design decisions. There is work on combining multiple queries already on their path back, but these always looked challenging from a security point of view (risk of censorship or faked replies).
Thank you for the update @schmittlauch! I’m sorry you could not complete that software.
Maybe then this is something to discuss with the people from searx.
Oh, hey, great to hear from you! I really enjoyed your presentation at APConf, in 2019, on this!