Linked Data: Undersold, Overpromised?

Thank you Melvin!

So part of the challenge is crossing the chasm between expert-level and ‘mainstream’ development. I’ve expressed this on the Solid forum before, i.e. that I’ve felt there was a sort of fascination with ‘reinventing’ or ‘rewriting the web’ from scratch, suggesting all kinds of end-user apps as ‘killer apps’ for the technology. Instead, there should be more focus on providing tools & libraries in the developer toolbox that encapsulate the complexity in better ways, so the technology comes within grasp of more people. See:

Yes, this is a deeply inspiring notion to me, and while we are still a long way away from that, as you say, it is also what the SocialHub slogan of “Social Networking Reimagined” is about. And I feel there is a huge opportunity right now to make a stand against Big Tech with our own definition of how “social” ought to be.

:+1:

I received a great response on the Solid forum where I first posted my reaction, and where the author also mentioned:

EDIT: I guess what I’m saying is there need to be more product champions.

EDIT EDIT: Sorry, the coffee is kicking in. To me, the question becomes, for any developer sitting down at their keyboard when starting a project: why should I use Solid and develop my app to use Solid principles over what I already know? Using the MEAN stack, or SAFE stack, or any other stack really? In my view, if you can convince the general conversation that “I really should be writing my app in a ‘Solid’ manner” most of the time, then we’ve reached our goal. (And by Solid, I’m abstracting here to really mean: leveraging socially aware protocols, such as the Semantic Web, etc.)

(Btw, dunno if ‘product champion’ is the term I’d use. Too corporate-feeling, but I get the idea.)

We can ask a similar question for the Fediverse in general: why should a fedi dev take the extra effort to deep-dive into walking the Linked Data path, rather than just slapping a @context in place?
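
For the record, ‘slapping a @context in place’ looks roughly like this hypothetical Note: perfectly valid ActivityStreams JSON that a dev can treat as plain JSON, never touching the Linked Data machinery underneath.

```json
{
  "@context": "https://www.w3.org/ns/activitystreams",
  "id": "https://example.org/notes/1",
  "type": "Note",
  "content": "Interoperates as plain JSON; the Linked Data layer stays opt-in."
}
```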

1 Like

Personally I think there is a middle ground, where you have a lite version of Linked Data with a full upgrade path for those who want it.

EDIT: what would a lite version look like? Something like this (sketched after the list):

  • compatible with plain old JSON
  • allowing syntax for links/URIs in JSON
  • additional ability to add @id
  • ability to add @type
  • path to full RDF compatibility
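
A minimal sketch of what such a document might look like (hypothetical example.org ids; schema.org as the vocabulary). It is plain old JSON; links and @id/@type are ordinary keys; and the one-line @context is the upgrade path to full RDF:

```json
{
  "@context": { "@vocab": "https://schema.org/" },
  "@id": "https://example.org/events/42",
  "@type": "Event",
  "name": "Fediverse meetup",
  "url": { "@id": "https://example.org/events/42/page" }
}
```

Drop the @context line and it still works as plain JSON; keep it and any JSON-LD processor can expand the document to full RDF.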
1 Like

That is a good middle ground, yes. What I am also very interested in is all the ways to ease the creation of vocab extensions that facilitate this “path to full RDF compatibility” optimally. In other words, the whole developer experience around “methodology & process” that on the one hand ensures developers can scratch the itch to quickly iterate on their own app, while on the other hand staying firmly on the “interoperability & standards” track and ramping up to a more powerful semantic “social fabric” in the future.

I feel there are very few tools and (recent) best practices to help them in this regard. Though there are a couple of ActivityPub ‘frameworks’ that are well positioned to offer a proper support platform, like @cjs’ #software:go-fed, @ivan and @mayel’s #software:bonfire, @pukkamustard’s #software:openengiadina, @naturzukunft’s #software:rdf-pub, and SemApps.

It would be lovely if we could come to a more appealing proposition together, and attract more people to the ecosystem with that.

1 Like

Just added a comment to the Solid topic highlighting some RDF criticism I just bumped into on HN at: The Block Protocol | Hacker News

1 Like

A post was split to a new topic: Atomic Data: Easy way to create, share and model linked data

Interestingly, alongside this discussion we have in quick succession found five protocols already that aim to be lite versions of Linked Data:

They all look like great initiatives to me, with a focus on practical application and on reducing linked data complexity. Parts of them are complementary, while other parts overlap or take different approaches. They all relate to the underlying (W3C) linked data standards in different ways. Which brings me to the most-quoted XKCD cartoon:


XKCD on Standards

But maybe this is not applicable at all, if each of these protocols is sufficiently compliant with linked data standards, and they all just serve to give developers more choice in how they design their apps, without entering competing ecosystems by doing so.

At least I think each of these projects would be best served by keeping a watchful eye on providing that compliance and retaining a good level of interoperability.


Probably I should start collecting these protocol projects in the delightful project, and any PRs / issues are most welcome:


Update: Interesting follow-ups are being posted on the Solid companion thread…

1 Like

Hi Arnold and everyone, creator of m-ld here. Thanks for the shout.

I’m coming at this topic from the point of view of a couple of decades in scientific data management – where sophisticated knowledge management, of the kind that may be possible with Linked Data, could provide a huge boost to scientific productivity. We developed a data-linking approach in parallel(ish) with the early days of RDF, which gave us an edge in the market when it came to search, workflows, and reporting. In the end we never transitioned to RDF (though our system was within a hair’s breadth of a 1:1 conceptual mapping), despite many years of championing on my part.

Why?

To cut a long story short, it’s because we didn’t realise we needed it until we had already entrenched our own approach.

I submit to the panel that this is a common problem. I want to build a (social) app to meet a pressing customer need. I take a platform off the shelf and hack together a prototype. I probably have JSON as both a serialisation between distributed components and a readable way to communicate between humans. My prototype is well-received and I get a hundred stories to take me to MVP. Along the way, some suspiciously tricky requirements arise: like internal and external cross-links, a faceted search UI, custom fields. Each of these is addressed with increasingly complex (and, I will stress, fun to invent) solutions involving metadata and query APIs. At no point in this path (or indeed, in the decades to follow) is there any breathing room to take a hatchet to all these custom solutions.

Today I find myself building a software library that will help developers solve another, parallel, hard problem: sharing live state information among multiple concurrent editors. RDF data structures are not easy to make live-sharable (article), but I’ve based m-ld on RDF for its natural extensibility (link to paper). I also know that when m-ld is used in real apps, linked data principles are going to be needed, and they’ll give m-ld an edge in an increasingly competitive space.


PS

That’s the plan :100:. I’m not actually trying to be a lite version of LD. I use LD as my base data representation.

The main place where I may appear to be inventing standard 15 is with json-rql, which is conceptually a mid-point between GraphQL and SPARQL. However, in reality it’s just a serialisation of SPARQL with a lower barrier to entry. Happy to talk more about that, of course.
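
For a flavour: a SPARQL query like `SELECT ?name WHERE { ?person foaf:name ?name }` might serialise in json-rql roughly as below. This is a hedged sketch from a quick reading of the json-rql docs; check them for the exact keywords.

```json
{
  "@context": { "foaf": "http://xmlns.com/foaf/0.1/" },
  "@select": "?name",
  "@where": { "@id": "?person", "foaf:name": "?name" }
}
```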

1 Like

I started working with Linked Data and RDF about a year ago.
I’ve been an IT freelancer for > 20 years, but I didn’t study and don’t have a diploma, master’s degree or anything like that.
I admit, at first I had a hard time understanding triples / linked data. But that was not because of their complexity; it was because of my rigidity in the old way of thinking!
And for me the hard part was definitely JSON-LD. I find it mega confusing! I would recommend everyone learn Turtle first!

Triples are very simple, and everyone already works with them: each object has an id, properties, and values. In RDF there is only id, property, and value - nothing more. This brings some complexity into play, but it is also where you find the great advantages.
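
For instance, a minimal sketch in Turtle (hypothetical example.org ids; foaf is the real ‘friend of a friend’ vocabulary):

```turtle
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix ex:   <https://example.org/> .

ex:alice  foaf:name   "Alice" .   # id, property, literal value
ex:alice  foaf:knows  ex:bob .    # id, property, link to another id
```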
It gets really complicated when it comes to the definition of ontologies, SHACL, and reasoning. I still have a lot to read, understand, and experience.
But not everyone has to understand and be able to apply these complex topics.

I had looked at Solid, but at the time it didn’t quite meet my requirements, and it was still too much in development. Another hurdle for me: it’s more of a solution used in the browser with JS, and JS isn’t my language at all.

With my rdf-pub implementation, I want to create a C2S interface that makes it easier for clients to participate in the fediverse. Posting JSON-LD should be easy, but the response may be JSON-LD that the client finds frightening. You can then work with adapters/translators if necessary.
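
For example (a purely hypothetical exchange, not rdf-pub’s actual output): the client POSTs the friendly compacted form:

```json
{
  "@context": "https://www.w3.org/ns/activitystreams",
  "type": "Note",
  "content": "Hello, fediverse!"
}
```

…while the canonical response may come back expanded, with every term spelled out as a full IRI:

```json
[
  {
    "@type": ["https://www.w3.org/ns/activitystreams#Note"],
    "https://www.w3.org/ns/activitystreams#content": [
      { "@value": "Hello, fediverse!" }
    ]
  }
]
```

Both are the same JSON-LD document; an adapter can mechanically compact the response back into the client’s preferred shape.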
I think we need more adapters/translators. We are talking about “Activity”Pub, and for me an activity is a synonym for an event; therefore ActivityPub / the fediverse is an event-driven architecture, and it is quite normal for such an architecture to have adapters/translators!

Btw, I have paid work again and will have less time to take care of linkedopenactors and rdf-pub, so rdf-pub will not learn S2S for now.

1 Like

I’ll pop this on here for now

It’s something that came out of a discussion with a friend. Still a one-pager that is an early work in progress.

And not really a use case or a library, yet. Manu Sporny of JSON-LD did, however, take a quick look at it and said positive things.

But hopefully some food for thought on how a simpler path to JSON-LD could look.

1 Like

I just wrote a very long post to the Solid companion thread (but I added a TL;DR too :wink: ). What you say, @gsvarovsky, rings very true. The kind of interoperability levels that we want to achieve are ultra-hard, and it is all too easy, out of purely practical, short-term needs, to end up with something much less versatile than initially intended.


Rephrasing my summary given to Solid to apply here on the SocialHub and the Fediverse, imho we can only hope to tap into the full potential of ActivityPub et al if we:

  1. Recognize the importance of the social aspects of software development. The levels of collaboration that are needed to create highly interoperable software.
  2. Focus on the processes that are needed to do so and continue to streamline them, instead of diving directly into code and deeply technical matters.
  3. Make it as easy as possible for people to tag along to our processes, to interact socially and contribute their bit, even when they lack the technical expertise to help with the low-level stuff.

Here’s my summary regarding how I feel about Solid project and some recommendations:

[Solid Project] Summary

  • The Solid project seems most focused on a Grand Vision, whose success hinges on tackling enormous complexity and on widespread adoption of its standards and ecosystem projects.

  • There’s no gradual path to adoption, with stable, well-productized milestones all along the roadmap and options to choose from. There is no way for people to get acquainted with Solid without going all-in and facing the brunt of the complexity.

  • There’s little focus on the process of software development, how Solid fits, and what benefits it brings in the short term (i.e. before the Grand Vision is realized).

  • While there is a deep technical focus, there seems to be almost a business myopia regarding all the process and design best practices that are also needed to create interoperable apps that satisfy people’s needs. Social aspects of development are neglected.

Recommendations:

  • Focus on all the practical things that help average developers leverage Solid technology right now in their apps: tools, documentation, support for different languages, etc. And with these people now invested in Solid, entice them towards deeper community involvement.

  • Ensure that not just tech is covered, but that Process is as well. How do I design my app with reasonable expectations for interoperability? How can and should I collaborate with others, and what organization structure and tools can we offer to help with that?

It is noteworthy that, unlike Solid, we in the Fediverse don’t seem to have a Grand Vision. We are satisfied to bolt additional features on top of existing microblogging concepts and to look in each other’s codebases to see how we might integrate a bit with another app.


Interconnectivity

Lastly I want to mention that the Grand Vision of broad-scale, seamless semantic interoperability that Solid wants to achieve has a high risk of never coming to fruition, with the high complexity being the key factor in that.

So I was delighted when I heard the term “interconnectivity” for the first time. It is a perfect companion to interoperability. It was @steffen who introduced me to it, with a toot announcing:

See also: Interconnective networks: open development starts today!

2 Likes

Nice post

If we’d done away with RDF and RDF/XML and just standardized on JSON w/ URLs we probably would have saved ourselves 10 years

1 Like

I want to copy part of a discussion I had on Matrix, so as to ‘archive’ it. I guess my overall point is this:

Without well-defined processes and the community organization and tools to facilitate the social interaction to keep them going, an ecosystem will only evolve by Ad-hoc Interoperability and will suffer the downsides - stalling or exponential complexity - from that in the long term.

openEngiadina matrix chat …

On the openEngiadina matrix room I asked the following question:

There is an, I think, interesting discussion on the Solid forum, ‘Is RDF “hard”?’. It boils down to this: no matter how you turn it, interoperability is really hard, especially ‘semantic interoperability’ (expressing universal semantic meaning). For the Solid community I recommend highlighting more of the Process side of software development and its Social aspects, besides the deep technical focus.

In this regard I am curious to hear how openEngiadina envisions the semantic network being built up over time once it is in production. I guess the focus on local knowledge already solves a lot of the problem, and with UI widgets tailored to RDF constructs you can make it easier to crowdsource content aggregation. But still, there’s a lot of different knowledge to be modeled and potentially reused across platforms.

To which @pukkamustard gave this answer:

I completely agree. Interoperability is hard, especially semantics.

Agreeing on semantics is the same as agreeing on a certain world view and classification of concepts. It is an intrinsically social endeavor that requires understanding how people think and conceptualize things. Then you need to find a basis that everybody agrees on that is simple enough to be formalized. It’s not easy. In fact, I think trying to find a universal semantics that works for everybody is futile. Luckily, we don’t need universal semantics.

In my opinion, the most important thing in RDF is the open-world assumption - the fact that you never have complete knowledge of anything. If you hold a piece of RDF data and think that it represents a box, you cannot assume that other people also think that the piece of data represents a box - to them it might represent a musical instrument. This might be because other people have different data on the thing, or because they have a different semantic understanding of the thing.

Even if two parties have two completely different understandings of something, the thing might still be described with properties that both parties understand. For example, the thing might be annotated with a geographic position using some semantics (vocabulary) that both parties understand (e.g. W3C Semantic Web Interest Group: Basic Geo (WGS84 lat/long) Vocabulary). So even if two parties cannot agree whether the thing is a box or a musical instrument, they might still be able to agree that the thing is located at a certain place.

A contrived example, but I hope it shows that RDF does not require us to agree on universal semantics. If we share partial semantics we can already go very far. Luckily there is already an established and rich collection of specialized semantics/vocabularies that we can re-use (e.g. DC Terms, Geo, ActivityStreams, The Music Ontology). These can be used to make the common basis of understanding larger.
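
In Turtle, the contrived example might look like this (hypothetical example.org ids; geo:lat and geo:long are from the real WGS84 vocabulary):

```turtle
@prefix geo: <http://www.w3.org/2003/01/geo/wgs84_pos#> .
@prefix ex:  <https://example.org/> .

ex:thing a ex:Box .                 # party A's understanding
ex:thing a ex:MusicalInstrument .   # party B's understanding

# Both parties still share the geo vocabulary:
ex:thing geo:lat "46.797" ;
         geo:long "10.298" .
```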

In terms of openEngiadina and local knowledge: local knowledge is not only the representation of physical things that are bound to some locality. It is also a conceptualization of things - a semantics - that is shared in a locality. The way I envision things growing: start with your own small local semantics and grow by finding larger common semantics.

And my response:

Thank you for this elaboration pukkamustard :pray:
Some time ago I came upon some articles by Kevin Feeney of TerminusDB.com. In one article he advocates (at least starting with) closed-world interpretations of semantic content:

However, if I have a RDF graph of my own and I want to control and reason about its contents and structure in isolation from whatever else is out there, this is a decidedly closed world problem

And in the next part he elaborates on some big issues with Linked Data:

the big problem is that the well-known ontologies and vocabularies such as foaf and dublin-core that have been reused, cannot really be used as libraries in such a manner. They lack precise and correct definitions and they are full of errors and mutual inconsistencies [1] and they themselves use terms from other ontologies — creating huge and unwieldy dependency trees.

Just now I am discussing this with @melvincarvalho. I feel that when we develop apps we naturally start from closed-world vocabularies, and that the incentive to keep these interoperable enough with other apps out there should be taken care of by good Processes and Social interaction.

Not taking these into account will mean we either go towards - what I call - Ad-hoc Interoperability (what the fedi has: learn from code and create your own flavour of interop from that), or strive for Universal Interoperability with all the complexity that comes with it.

@pukkamustard’s reply:

However, if I have a RDF graph of my own and I want to control and reason about its contents and structure in isolation from whatever else is out there, this is a decidedly closed world problem

I agree with the premise. I don’t agree that it means we need to throw out the open-world assumption. I think a solution is that we need to be able to decide precisely which data is used for reasoning and which is not.

I feel that when we develop apps we naturally go from closed world vocabularies

I agree and really think this is what we need to change. We should start thinking open-world by default.

@how’s follow-up to that:

I guess there’s a question of perspective: on the one hand, the ‘closed world’ perspective makes sense when you’re manipulating your own data. But when others manipulate your data, and when you describe your vocabulary, and when you conceive data usage, you should always leave space for other things to happen, i.e., consider the open world assumption.

And finally my response…

I once saw a toot by @dansup saying something like (paraphrasing): “Oh, it is so nice to be working on new federated features that do not yet exist, as I can just ‘invent’ what I need on-the-fly”. A completely understandable notion: focus on your own app’s needs, without having to draft a spec or negotiate with others. There are no others yet. Most fedi devs are open to making changes later if that increases interop opportunities, unless they have become too deeply invested in a certain way of doing things and are unwilling to change. If such a project is dominant in its domain, that will become an issue.

I agree and really think this is what we need to change. We should start thinking open-world by default.

Yes, I think so too. But the reality is that this takes significant extra effort, and initially it rests with the dev who first needs a new AP extension but is mostly interested in delivering a good MVP app at the time. The potential win-win of going the extra mile lies somewhere in the future, when others want to interoperate.

We should streamline the process of ‘offering an extension’ to others as much as possible, lowering the barrier, and include the steps by which it can be iterated on and matured for more open-world application.

Right now, practice shows there’s too little incentive to engage in the process, and ad-hoc interoperability is the only way forward.

I am currently testing my rdf-pub implementation, in which I would like to implement an initial import from “Karte von Morgen” (KVM, “map of tomorrow”) and a one-way sync (KVM->rdf-pub).
And here the “open world principle” is blocking me at the moment, so I am losing some motivation. Adapting a closed world to a linked open one is no fun. At the moment I am taking a break :wink:

1 Like

So you are seeking common vocabulary terms to replace an internal entity-relationship model or property graph?

No, that’s already done: Specification of Linked Open Actors (LOA)

There are a lot of other questions regarding versioning, linking…

I assume that in KVM a place has an address as an attribute and does not point to an address. If the attributes of the place change but its address does not, a copy of the address is created.
At least this is how it looks when you inspect the REST API.
An address also seems to be duplicated if there are two organisations in one place. These considerations and mapping possibilities are currently giving me sleepless nights.
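
One way out of the duplication might be to give the address its own @id, so that several organisations point at the same address node instead of copying it. A hedged sketch with schema.org terms and hypothetical example.org ids (not the actual LOA mapping):

```json
{
  "@context": {
    "schema": "https://schema.org/",
    "ex": "https://example.org/"
  },
  "@graph": [
    {
      "@id": "ex:address/1",
      "@type": "schema:PostalAddress",
      "schema:streetAddress": "Hauptstrasse 1",
      "schema:addressLocality": "Scuol"
    },
    { "@id": "ex:org/a", "schema:address": { "@id": "ex:address/1" } },
    { "@id": "ex:org/b", "schema:address": { "@id": "ex:address/1" } }
  ]
}
```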

1 Like

Agree with this!

Open World Assumption means “anyone can say anything about anything”

And “no one knows everything about anything”

What it means is that data is always partial, and can be mashed together to form knowledge bases

It’s like the difference between “opinions” and “facts”

As you can imagine it’s a very useful tool in some use cases, but adds complexity in others

This is more about the data that you write, than about open vs closed vocabs. I think all our vocabs are open, really

Open vs Closed is different to local vs global

Let’s take Facebook as an example. That’s a closed world: from Facebook, I can’t make a friend on Mastodon.

Mastodon is kind of half open / half closed. It lets you do some things and not others. I can make a friend on the fediverse (or follow them). But I can’t make a friend on Facebook.

In Solid you have the open world assumption, so I can link from my profile to a Facebook account and pull in data. I’ve done this in the past. So it embraces the open world assumption.

I can also say that my Facebook id is linked to my profile, even if Facebook doesn’t let me; that claim could be sourced on my home page. So the software now has to collect all the different claims and work out fact from opinion. Valuable in the right places, complex in others.
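
In Turtle, such a self-sourced claim might look like this (hypothetical URLs; foaf:account is a real FOAF property):

```turtle
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

# Published on my own home page - my claim, not Facebook's:
<https://example.org/me#i>
    foaf:account <https://www.facebook.com/example.user> .
```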

2 Likes

The stupid thing is that in my case the data should really be available as open linked data. That is my goal; only the path is a bit hard at the moment. I lack experience, and I have the feeling that only very, very few people have this experience.
However, I see LOD as a valuable goal that we should all strive for.

1 Like

Yes, that is an issue I also have. We should try to improve that situation and get some real Linked Data interactions going on our parts of the web. The Solid community is also de facto the Linked Data community, but the real experts are loath to mingle on the forum. The Discord channels (and maybe all those boards they have) may be a way to get some help.

Also I’d like to mention once more the delightful-linked-data curated list I maintain. If you have any resources to add, I’d be glad to hear of them. The better the list, the more I can promote LD on the fedi and beyond. So whenever you bump into something… leave it in an issue (there’s one open for just that).

I don’t think there are actually any real linked data ‘experts’. And anyone that calls themselves one probably isn’t.

The Solid Discourse area was set up by a commercial entity. From my experience, they are reluctant to help out or support people using Solid. The vibe in this forum is better IMHO.

Fundamentally, consider a programming language where every variable you use MUST be a URL, and SHOULD link to another, quite complicated page of metadata for that URL. And where the only data structure you are allowed to use is a Set. Arrays are an afterthought, shoe-horned in, that no one understands. This programming language does not allow things like addition without specialist servers with atomic updates, which still have not been built.

That’s the state of linked data
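
To make the ‘arrays are an afterthought’ point concrete: the only standard way to get an ordered list into RDF is a chain of cons-cell triples, which Turtle hides behind parenthesis sugar (hypothetical example.org ids):

```turtle
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix ex:  <https://example.org/> .

# The sugar:
ex:playlist ex:tracks ( ex:track1 ex:track2 ex:track3 ) .

# ...is really this underneath:
# _:b0 rdf:first ex:track1 ; rdf:rest _:b1 .
# _:b1 rdf:first ex:track2 ; rdf:rest _:b2 .
# _:b2 rdf:first ex:track3 ; rdf:rest rdf:nil .
```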

It’s certainly useful in some situations. But to say it’s useful in all situations is wrong. That’s the mistake that Linked Data “experts” make. They make promises they don’t understand and, for the most part, don’t even use. Then when stuff breaks, there’s often no one to help.

LD should be used to solve a narrow set of problems, such as merging data from different websites, or in scenarios where links are underused and high-value.

At the time of making ActivityPub, the idea was that if everyone used this ‘standard’ it would be possible to create a rich network effect through interop, even with the limitations. Well, understandably, developers struggled with the limitations, and some rejected them. Rather than keep pushing linked data where the earth has been salted, a better way is to accept its usefulness in some situations, explain it, and also accept the limitations.

Linked data should be viewed as a variable scope. One higher than global variables. Then programmers have a range of tools to achieve their goals.

Edit: A possible solution.

  1. Recognize JSON-LD as a form of linked data, which has a syntax for representing hyperlinks, things, types, and some common properties
  2. Recognize JSON as a superset of JSON-LD, with all its features plus more on top, such as lists or arrays of typed things
  3. Match slow-changing vocabs to slow-changing software, and allow new types of innovation and interop through JSON (see the sketch below)
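
A sketch of points 1 and 2, with a hypothetical “reactions” field: the @context plus id/type/content form the slow-changing JSON-LD core, while plain-JSON consumers also get the extra fields, which a JSON-LD processor would simply drop on expansion:

```json
{
  "@context": "https://www.w3.org/ns/activitystreams",
  "id": "https://example.org/notes/1",
  "type": "Note",
  "content": "The interoperable Linked Data core",
  "reactions": [
    { "emoji": "👍", "count": 3 }
  ]
}
```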
2 Likes

https://linkedopenactors.org/#introduction-to-the-concepts

What do you mean by “the forum”?