Can JSON and RDF be friends?

Is it possible to turn RDF into an idiomatic tree-based format? The history of RDF notations, especially the experience with RDF/XML and JSON-LD, suggests the opposite. Ian Davis summed it up nicely:

The main problem I see with the “idiomatic JSON” use case is that although it’s much more usable by the average web author, it’s always going to butt up against various mismatches in model: graphs vs trees, URIs vs shortnames, literals/languages/datatypes vs strings, repeated properties vs simple values, blank nodes, lists/collections vs arrays/dictionaries.

The blunt truth is all of those things make RDF an unfriendly model to web authors and I think it will be very hard, or impossible, to develop an idiomatic JSON serialisation that web authors will care about.

In this post I’ll try to tackle this problem by trying to answer the following questions:

  1. What are the challenges of converting a random JSON document to triples?
  2. What compromises need to be made on both (JSON and RDF) sides to make this possible?
  3. How complex would the JSON -> RDF (and vice versa) parser be?

1. What are the challenges of converting a random JSON document to triples?

I remember a meeting in which a partner mentioned they had data needed for the project. When asked if the data was in RDF format, he replied with great relief that it was “just some simple key-value pairs”. In my mind, key-value pairs and triples are just two sides of the same coin, so I felt kind of sad that people perceived the two as entirely different things.

What is it that makes a bunch of key-value pairs appear so different from a set of triples? Add the subject to key-value pairs and you get triples. Or vice versa: group the triples around the common subject and you get the key-value pairs. Well, in practice, it’s a bit more complicated.

Today, JSON is the de facto standard for describing data. Despite its simplicity, JSON enables flexibility and expressiveness. Still, there is a limited number of patterns used, which opens the opportunity to identify them and understand how they would map to corresponding triples.

As a simple experiment, I am going to pick a random JSON document and try to naively convert it to triples, ignoring the URIs, datatypes and other RDF features. Let’s use the description of the Underscore JavaScript library on npm.

Let’s start with the first four key-value pairs.

  "_id": "underscore",
  "_rev": "251-c05c3d825f5bc6b691649b4f90a3c894",
  "name": "underscore",
  "description": "JavaScript's functional programming helper library.",

I will use underscore as the subject for simplicity. The keys and values will become predicates and objects respectively.

underscore _id "underscore"
underscore _rev "251-c05c3d825f5bc6b691649b4f90a3c894"
underscore name "underscore"
underscore description "JavaScript's functional programming helper library."
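A minimal sketch of this naive mapping in Python (my choice of language for illustration; the function name flat_to_triples is made up for this example). It pairs a fixed subject with each key and value, and ignores URIs and datatypes just like the experiment does:

```python
import json

def flat_to_triples(subject, obj):
    """Turn each key/value pair of a flat JSON object into a
    (subject, predicate, object) tuple."""
    return [(subject, key, value) for key, value in obj.items()]

doc = json.loads("""
{
  "_id": "underscore",
  "_rev": "251-c05c3d825f5bc6b691649b4f90a3c894",
  "name": "underscore",
  "description": "JavaScript's functional programming helper library."
}
""")

# Print one triple per line, with the object shown as a quoted literal.
for s, p, o in flat_to_triples("underscore", doc):
    print(s, p, json.dumps(o))
```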

Nested objects

The conversion of flat JSON is unsurprisingly straightforward. What about nested objects?

  "repository": {
    "type": "git",
    "url": "git://"

The nested object could be treated as a blank node, but it’s better to use the node name we already have thanks to dot notation. The property repository, used as a predicate, can simply be interpreted as ‘has repository’.

underscore repository underscore.repository
underscore.repository type "git"
underscore.repository url "git://"
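The dot-notation naming can be sketched as a small recursive function. This is only an illustration under the assumption that every nested value is a plain object; the name nested_to_triples is hypothetical:

```python
def nested_to_triples(subject, obj):
    """Walk a JSON object; a nested object becomes a node named
    '<subject>.<key>' instead of a blank node."""
    triples = []
    for key, value in obj.items():
        if isinstance(value, dict):
            child = f"{subject}.{key}"
            # Link the parent to the named child node, then describe the child.
            triples.append((subject, key, child))
            triples.extend(nested_to_triples(child, value))
        else:
            triples.append((subject, key, value))
    return triples

doc = {"repository": {"type": "git", "url": "git://"}}
for triple in nested_to_triples("underscore", doc):
    print(*triple)
```

Note that in this simple tuple form a node name and a string literal look the same; a real converter would have to keep them apart.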

Let’s take a look at another example with two nested objects:

  "versions": {
    "1.0.3": {
      "name": "underscore",
      "description": "Functional programming aid for JavaScript. Works well with jQuery.",
      "url": "",

Following the same method as in the previous example, we would get:

underscore versions underscore.versions
underscore.versions 1.0.3 underscore.versions["1.0.3"]

This looks strange. While the property versions can be interpreted as ‘has versions’, 1.0.3 as a predicate doesn’t make much sense.

In order to interpret the structure of nested objects, we must understand the author’s intent. The problem is that associative arrays and objects have different semantics, but in JSON (i.e. JavaScript) they are written using the same syntax. An associative array is an array where descriptive keys are used instead of integer indices; in an object the keys are the names of its properties.

In the above example the value of versions is an associative array, but the value of 1.0.3 is an object. In other words, 1.0.3 was never meant to be a property of an object, but simply a key of an array.

Therefore, we must choose another strategy. We are going to connect underscore directly to the array item, and consider 1.0.3 as a value of a special “key” property, while versions means ‘has version’:

underscore versions underscore.versions["1.0.3"]
underscore.versions["1.0.3"] key "1.0.3"

The conversion of the rest of the key-value pairs is straightforward:

underscore.versions["1.0.3"] name "underscore"
underscore.versions["1.0.3"] description "Functional programming aid for JavaScript. Works well with jQuery."
underscore.versions["1.0.3"] url ""
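Here is a rough sketch of that strategy. It assumes the caller already knows that the value is an associative array (plain JSON cannot tell us), and the function name assoc_to_triples is made up for illustration:

```python
def assoc_to_triples(subject, prop, mapping):
    """Treat `mapping` as an associative array: link the subject directly
    to each item and record the item's key with a special 'key' property."""
    triples = []
    for key, item in mapping.items():
        node = f'{subject}.{prop}["{key}"]'
        triples.append((subject, prop, node))
        triples.append((node, "key", key))
        # The item itself is an ordinary object with flat properties.
        for name, value in item.items():
            triples.append((node, name, value))
    return triples

versions = {"1.0.3": {"name": "underscore", "url": ""}}
for triple in assoc_to_triples("underscore", "versions", versions):
    print(*triple)
```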


Arrays can be parsed in the same way as associative arrays. Let’s use this snippet from the JSON file:

  "versions": {
    "1.1.3": {
      "maintainers": [
          "name": "documentcloud",
          "email": ""
          "name": "jashkenas",
          "email": ""

The resulting triples would look like this:

underscore.versions["1.1.3"] maintainers underscore.versions["1.1.3"].maintainers[0]
underscore.versions["1.1.3"].maintainers[0] name "documentcloud"
underscore.versions["1.1.3"].maintainers[0] email ""
underscore.versions["1.1.3"] maintainers underscore.versions["1.1.3"].maintainers[1]
underscore.versions["1.1.3"].maintainers[1] name "jashkenas"
underscore.versions["1.1.3"].maintainers[1] email ""

If the array values are primitives and the order is not important, like in this example…

  "keywords": ["util", "functional", "server", "client", "browser"],

… the subject could perhaps be connected directly to the values:

underscore keywords "util"
underscore keywords "functional"
underscore keywords "server"
underscore keywords "client"
underscore keywords "browser"

… or parsed in the same way as objects, using a “special” value property (rdf:value?):

underscore keywords underscore.keywords[0]
underscore.keywords[0] value "util"
underscore.keywords[0] index 0
underscore keywords underscore.keywords[1]
underscore.keywords[1] value "functional"
underscore.keywords[1] index 1
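Both options for arrays of primitives can be sketched side by side. The function names are hypothetical, and the property names value and index simply follow the triples above:

```python
def array_direct(subject, prop, values):
    """Order is not important: connect the subject straight to each value."""
    return [(subject, prop, value) for value in values]

def array_indexed(subject, prop, values):
    """Order matters: give every item its own node with a value and an index."""
    triples = []
    for i, value in enumerate(values):
        node = f"{subject}.{prop}[{i}]"
        triples.append((subject, prop, node))
        triples.append((node, "value", value))
        triples.append((node, "index", i))
    return triples

keywords = ["util", "functional", "server", "client", "browser"]
print(array_direct("underscore", "keywords", keywords)[:2])
print(array_indexed("underscore", "keywords", keywords)[:3])
```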

2. What compromises need to be made on both sides to make this possible?

The examples above cover different patterns used in the underscore JSON. In my experience with writing and reading different JSON documents I have seen these patterns repeating endlessly. However, JSON is flexible and it’s used in many ways, and in some cases it can’t be converted easily to RDF. Take for instance these two examples:

  "prop": [["foo", "bar"], ["fooValue", "barValue"]]

  "properties": [
      "name": "foo",
      "value": "fooValue"
      "name": "bar",
      "value": "barValue"

In the first example nested arrays are used to represent a table; in the second, key/value pairs are encoded explicitly. Are these examples idiomatic? I would say no, but the question is where the border lies that separates idiomatic JSON from non-idiomatic JSON. Should users be deliberately constrained to simpler patterns, or should some of these patterns still be allowed as some kind of special, higher-abstraction syntax?

Distinguishing between keys and properties

In any case, the experiment has shown that one of the biggest problems is the inability of JSON to distinguish between objects and associative arrays. In order to solve this, we don’t need to extend JSON syntax, we can just use a different syntax for keys and properties.

In RDF, properties are identified by URIs, so it’s common to use the compact URI (CURIE) syntax in prefix:name form. The presence of a special CURIE delimiter (: ) can be used as an indication of whether it’s a key or a property. For instance, look at the following example from the FOAF specification, converted from RDF/XML to JSON:

  "foaf:account": {
      "freenode": {
          "href:": ""

Here we know that the keys containing : are the properties, while the others (freenode) are the keys of the associative array.
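As a small illustration of this rule, the following sketch walks a JSON object and labels every name as a property or a key, based only on the presence of the : delimiter. The function name classify_names is made up for this example:

```python
import json

def classify_names(obj):
    """Label each name in a nested JSON object: names containing ':' are
    properties (CURIEs), all other names are associative-array keys."""
    result = []
    for name, value in obj.items():
        result.append((name, "property" if ":" in name else "key"))
        if isinstance(value, dict):
            result.extend(classify_names(value))
    return result

doc = json.loads('{"foaf:account": {"freenode": {"href:": ""}}}')
for name, kind in classify_names(doc):
    print(name, "->", kind)
```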

Naming nodes with URIs

We need a way to deal with URIs as node names in JSON. In the previous section, we used implied names for nested objects thanks to dot notation. These names (i.e. paths) could be easily translated to URIs by using / instead of dots.

That means that for the nodes that are existentially dependent on other nodes, instead of blank nodes or random URIs we would have to use this strict pattern. Following the previous example, the resulting triples would be:



Such URIs can be “packed” into the tree, and written in JSON without any friction. The lack of order and structure in the graph and the unfriendliness of URIs are now hidden. The same data is much easier to read and write when organized in a tree. That’s exactly why people use tree-based syntaxes, and now we have RDF represented this way! And as an extra bonus, there is no need for blank nodes any longer.

Another challenge is encoding object URIs in JSON. There are a number of different proposals for how to do it. Here I followed the paradigm already used on the Web. When you write e.g. a blog post and mention something, you put a hyperlink that usually points to some representative webpage describing that concept (e.g. a Wikipedia page, GitHub or Twitter profile, blog post etc.). Therefore, whenever you want to “break” the tree and point to a node on the Web of data, the URI is modelled as the value of the href: property. href: looks like a reserved word, but it’s also a CURIE, only without the “name” part.

3. How complex would the parser be?

The basic algorithm for parsing triple-friendly JSON is extremely simple thanks to the standard way of “packing” triples into a tree structure. In general, the hierarchy of keys and properties directs how the tree is going to be parsed into RDF. Other than that, we would need a few “special” properties, like href: described above. Also, we need a way to map CURIEs to full URIs, but this doesn’t really require a special syntax (like @prefix in Turtle or xmlns: in XML); it could be described consistently using triples as well. Finally, if used, primitive JSON types could be mapped to corresponding XSD datatypes.
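The last point, mapping JSON primitives to XSD datatypes, could be sketched as follows. The particular choice of xsd:boolean, xsd:integer, xsd:double and xsd:string is my assumption about a sensible default, not something prescribed by the approach:

```python
import json

def xsd_datatype(value):
    """Return an XSD datatype CURIE for a parsed JSON primitive."""
    # bool must be tested before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return "xsd:boolean"
    if isinstance(value, int):
        return "xsd:integer"
    if isinstance(value, float):
        return "xsd:double"
    if isinstance(value, str):
        return "xsd:string"
    raise TypeError(f"not a JSON primitive with an obvious datatype: {value!r}")

for value in json.loads('[true, 1, 1.5, "a"]'):
    print(repr(value), "->", xsd_datatype(value))
```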

One of the problems of RDF notations is that it’s possible to write an RDF graph in many different ways. The benefit of the approach described here is that it forces people to write JSON in a very predictable and consistent way. A small number of variations allows for a simple parser and easier understanding of data. JSON made this way can serve as a basis for a canonical RDF JSON format.

Finally, thanks to the fact that this approach is not really made for JSON specifically, but for a general tree-based format, the same idea can be easily applied to e.g. XML or YAML, and the parser would essentially use the same algorithm.

Making data a first-class Web citizen

Working on my startup, Faviki, I have realized how hard it is to get even basic data about a webpage. Faviki is a bookmarking app that lets users connect webpages with structured data from DBpedia.

I was trying hard to figure out how to take it to the next level — to get more data from web pages, connect it with the rest of the graph and help users organize bookmarks better using not just tags (“strings”), but “things” and their relations.

However, it didn’t take long before I realized that despite all the promising semantic standards that enable doing some cool and powerful things with data, in reality, getting even the most basic data from an average webpage can be incredibly hard.

Take the title of the page for instance. In the spirit of working on the “things” level, I tried to get the “real” title, or the name of the thing the webpage is about (i.e. the primary topic).

Getting the value of the <title> tag is easy, but the problem is that it typically contains additional text like the name of the website and SEO keywords noise.

To get the actual title, one can search for <h1> tag in the source, but, as with the <title> tag, it’s often abused or not used properly.

Or maybe we can compare <h1> with <title> hoping that the <h1> is the subset of it? I even thought about downloading a few other pages and trying to figure out the general pattern behind the <title> tag.
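A rough sketch of that comparison, using only Python’s standard html.parser module. The function name guess_title and the sample markup are made up for illustration, and the heuristic is exactly as fragile as described above:

```python
from html.parser import HTMLParser

class TitleAndH1(HTMLParser):
    """Collect the text of the first <title> and the first <h1>."""
    def __init__(self):
        super().__init__()
        self.found = {}
        self._tag = None
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "h1") and tag not in self.found:
            self._tag, self._buf = tag, []

    def handle_data(self, data):
        if self._tag:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == self._tag:
            # Collapse whitespace so the two strings can be compared.
            self.found[tag] = " ".join("".join(self._buf).split())
            self._tag = None

def guess_title(html):
    """Prefer the <h1> text when it is contained in the <title> text."""
    parser = TitleAndH1()
    parser.feed(html)
    parser.close()
    title = parser.found.get("title", "")
    h1 = parser.found.get("h1", "")
    if h1 and h1.lower() in title.lower():
        return h1
    return title or h1

page = ("<html><head><title>Some Article | Example Site</title></head>"
        "<body><h1>Some Article</h1></body></html>")
print(guess_title(page))
```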

The “right” way

You may argue that these are just dirty hacks and that HTML is not suited for this. It’s a document format, and data should be described using RDF, Microdata, OData, CSV or other syntaxes, provided either in a separate document or embedded in HTML.

The trouble is that in a bookmarking application, you deal with random webpages, most of them not publishing any data whatsoever. But let’s say we want to make use of the ones that do, in order to provide richer data and a better experience for end-users.

In order to get the page title, we need to search the webpage’s HTML code for embedded data or <link> tags (rel=”alternate”, rel=”primarytopic”, etc.), pointing to external resources that might lead us to the data we need. There are a number of options, even if we limit ourselves to the RDF model: RDFa, Turtle, RDF/XML, JSON-LD…

Now this diversity may sound like a good thing, allowing you to use the syntax that best fits your needs and taste. In a perfect world, this may be the case. But in reality, if my preferred syntax is not published, I must use whatever is available. So ultimately both publishers and consumers must cover as many alternatives as possible, which is a big burden.

What about SPARQL?

If one only needs a single atomic piece of data such as the title, isn’t the most appropriate solution sending a simple query to the SPARQL endpoint?

The problem is that the number of websites providing a SPARQL endpoint on one hand, and of developers familiar with SPARQL syntax on the other, is still small.

Another problem is that even if a website provides a SPARQL endpoint, how do you find the endpoint’s URI given only a random webpage URL? How many websites that provide a SPARQL endpoint publish a voiD file containing the description of the dataset (where one should be able to find the SPARQL endpoint URL)?

Finally, are the relations between standard web pages and data stored in the dataset and accessible via SPARQL at all?

Take DBpedia for instance. has this tag encoded in its HTML source.

<link rel="foaf:primarytopic" href=""/>

… suggesting the triple:

<> foaf:primarytopic <> .

But if you try to get this triple using SPARQL:

    <> foaf:primarytopic ?o .

… you will get nothing.

Therefore, first I need to figure out where the SPARQL endpoint is by parsing the voiD file, then download and parse the web page in order to get the “primarytopic” resource, and then use that resource directly in the SPARQL query.
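The middle and last steps can be sketched without any networking. The example below only extracts the primarytopic link from already downloaded HTML and builds a query string; the function names and the example.org URI are hypothetical, and finding the endpoint through voiD is left out entirely:

```python
from html.parser import HTMLParser

class PrimaryTopicFinder(HTMLParser):
    """Remember the href of the first <link rel="foaf:primarytopic">."""
    def __init__(self):
        super().__init__()
        self.topic = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if (tag == "link" and self.topic is None
                and attributes.get("rel") == "foaf:primarytopic"):
            self.topic = attributes.get("href")

def primary_topic(html):
    parser = PrimaryTopicFinder()
    parser.feed(html)
    parser.close()
    return parser.topic

def describe_query(resource_uri):
    """Build a SPARQL query asking for everything stated about the resource."""
    return f"SELECT ?p ?o WHERE {{ <{resource_uri}> ?p ?o }}"

page = '<head><link rel="foaf:primarytopic" href="http://example.org/resource/Thing"/></head>'
print(describe_query(primary_topic(page)))
```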

The frustration

The more I thought about the problem of data access, the more frustrated I got. It seemed that most options were already considered and that there was hardly any room for innovation.

Historically, every new solution tried to solve some other solution’s problems, ending up as a balance between different constraints.

For instance, the Turtle syntax is much simpler than RDF/XML but requires a special parser, and you cannot use the XML stack. RDFa doesn’t require a separate document, but it is mixed with HTML content, is hard to read and makes the original file bigger. Microdata is a simpler alternative to RDFa, but at the cost of being far less expressive.

It was hard to imagine some new, simpler and more elegant syntax than Turtle. But it’s not just about syntax. I was always annoyed by the fact that when I stumble upon a Turtle file on the Web, I either need to download it and read it outside of the browser, or, if it opens in the browser, I can’t click on the damn links (URIs)!

And I don’t buy the story about its human-friendliness either. Look at this for instance, it’s just painful to read.

Finally, it wasn’t about the simplicity and elegance either. Take HTML for example. It’s definitely not the most elegant syntax in the world, but still enormously successful.

It was hard to imagine some original and radically different approach. Still, I had a strong feeling that something was wrong and that there must be a better way.

The idea

One day, a strange idea struck me. I was looking at some news page on the Guardian and thought: what if the “title” segment is just added to the URL? For instance take the URL:

If you need to get the title, you simply append the “title” segment to the URL. The result is:

When you look up this new URL, you get (HTTP 200 OK) response with the body:

Will the Samsung Galaxy S4 eclipse the iPhone?

That is, you get raw data, and the syntax is not just easier to parse: there is practically no syntax at all! Similarly, you can look up the description and author on the following URLs, respectively:

Relation to RDF

How does this fit with RDF and triples? Here is what the RDF might look like:

    dc:title "Will the Samsung Galaxy S4 eclipse the iPhone?" .

Now, if you append the predicate to the subject URI, you will get:

By looking up this new URI, we get the literal value of the title (the object of the triple). But what does that new URI identify? It’s simply the URI of the page’s title. Therefore, the title is not just a “string” any more, it became a “thing” – a separate resource identified by URI.

    dc:title <> .

The result is that the title and its value are separated, which makes sense. The title != the value of the title.

    rdf:value "Will the Samsung Galaxy S4 eclipse the iPhone?" .
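The separation of the title from its value can be sketched in a few lines. The function name split_property and the example.org URL are assumptions made for illustration; the segment is simply the prefixed property name appended to the subject URI:

```python
def split_property(subject_uri, curie, value):
    """Return the two triples that replace <subject> <curie> "value":
    one pointing to the intermediary resource, one holding its value."""
    node = subject_uri.rstrip("/") + "/" + curie
    return [
        (subject_uri, curie, node),
        (node, "rdf:value", value),
    ]

for triple in split_property("http://example.org/article", "dc:title",
                             "Will the Samsung Galaxy S4 eclipse the iPhone?"):
    print(*triple)
```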

By using the property names with prefixes (“dc:title”) as segments, we limited the properties to ones defined in vocabularies, making them unambiguous and predictable.

(Note that the : character is used for clarity, although it is a reserved character according to the URI spec. On the rest of the blog I have used underscores instead.)

Using HTML

In some cases, however, we must use a syntax. When dealing with RDF links, as in the case of stating the author, we need a way to say that the value is not a literal string, but a thing identified by URI.

Should we invent a new syntax? Of course not, there is already “the Web way” of writing URIs — the HTML <a> tag.

Therefore, a lookup of

…will return the hyperlink:

<a href="">Juliette Garside</a>

This corresponds to the following triple:

    foaf:maker <> .

(I am using here the web page URL as the identifier for the person for simplicity.)

If there are several values for the same property, e.g. descriptions in different languages, we can append segments that have the role of local “keys” in the rdfs:comment collection:

If one looks up the following resource:

… we need a way to write down this collection. Again, no need to reinvent the wheel — we can use the HTML list, either <ul>

    <li><a href="en">english</a></li>
    <li><a href="fr">french</a></li>

… or <dl>, that also encodes the values:

    <dt><a href="en">english</a></dt><dd>The Galaxy S4 will be unveiled in New York this week...</dd>
    <dt><a href="fr">french</a></dt><dd>The Galaxy S4 sera dévoilée à New York cette semaine...</dd>
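On the consumer side, the three kinds of response described so far (a raw literal, a single hyperlink, a list of links) can be told apart with very little code. This is a sketch using Python’s standard html.parser; the name parse_response and the tuple-based return format are my own choices, and the example.org URL is hypothetical:

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect (href, label) pairs and note whether a list element was seen."""
    def __init__(self):
        super().__init__()
        self.links = []
        self.is_list = False
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in ("ul", "ol", "dl"):
            self.is_list = True
        elif tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None

def parse_response(body):
    """Classify a response body as a literal, a link, or a list of links."""
    parser = LinkCollector()
    parser.feed(body)
    parser.close()
    if parser.is_list:
        return ("list", parser.links)
    if parser.links:
        return ("link", parser.links[0])
    return ("literal", body.strip())

print(parse_response("Will the Samsung Galaxy S4 eclipse the iPhone?"))
print(parse_response('<a href="http://example.org/jg">Juliette Garside</a>'))
print(parse_response('<ul><li><a href="en">english</a></li>'
                     '<li><a href="fr">french</a></li></ul>'))
```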

Click here for more details on the HTML syntax used.

Data as a first-class citizen

It is nothing new that there has been a large gap between the current Web and the Web of data (Semantic Web) vision. Linked Data showed up in 2006 as an attempt to close this gap by introducing the Web principles to data publishing.

The problem is that the “Web” expected data to adjust to it by using HTTP URIs but it didn’t really return the favor. It has still remained the same Web of documents, with data described in its building blocks: the same old, boring, flat web documents.

In reality, sending a few HTTP requests and getting simple, raw data is much easier than parsing the documents and dealing with the current syntax mess. It is especially useful for data discovery, in which one asks simple questions to predictable URLs, and obtains short answers giving him the clue (links) about the state of the dataset.

On the publisher side, the implementation is perhaps not as easy as uploading e.g. a Turtle file, but it’s not too hard either. The real benefit is that it is easy to understand what is going on — essentially no need for learning a new syntax.

Remember the last time somebody started explaining how to get to some information on the Web? “Search for this, then click there, then… ” provoking you to interrupt him and say “just give me the goddamn URL!”

This is a good metaphor of what we need to do with data as well. The approach I have described results in each piece of data getting its own (HTTP) URL, meaning that one can easily share atomic data the same way we share the ordinary web pages now.

To put it in fancy words, this way data becomes a first-class citizen of the Web.

But data is identified using URIs, that’s the whole point of RDF and Semantic Web! Isn’t data already a first class-citizen? No, it isn’t, because by dereferencing these URIs you don’t get plain data, but documents.

By adding an intermediary resource that acts as a “variable” whose value is then returned after dereferencing it, we will finally allow (atomic) data to have “equal rights” as documents, making the necessary step for the Web of data to arise.

Solving Linked Data problems with Hypernotation (DBpedia example)

In this post I will present a real use case showing what Hypernotation looks like in practice. I have chosen one of the most popular datasets in Linked Data – DBpedia, and published its dataset using Hypernotation principles. It’s published on – the homepage contains some friendly examples, so I invite you to visit it as well.

I’m going to explain the basics of Hypernotation using some real-world examples. I’ll go through the four problems of Linked Data I discussed some time ago on this blog, and show how I attempted to solve them with Hypernotation.

1. Identity

The simple question of “What exactly is Linked Data?” is not easy to answer. The main concern here is whether or not RDF is required.

In Hypernotation, the data model is a directed, labeled graph whose nodes and links are identified by HTTP URIs. It can be regarded as a simplified and more consistent version of the RDF model.

For instance, let’s take a look at the hyperNode (a node in the Hypernotation graph) representing the area (in km²) of Iceland:
>> 103001

The URI of this hyperNode is the path (or pathname) between the two nodes. Below the URI is the response returned by looking it up, containing its value.

In the RDF model this is expressed using a single triple:

    "103001"^^<> .

In Hypernotation this is expressed with three triples:

    <> .

      rdf:value "103001" ;
      datatype: <> .

The concept of Iceland’s area is now a distinct resource identified by a URI. Compared to classical RDF, in Hypernotation every node has a global address, becoming a first-class citizen on the Web. That is, each piece of data is easily linkable, shareable and bookmarkable.

2. Concept

Linked Data is highly determined by the level of data granularity. The trouble is that blank nodes cause this level to be too high. The basic building units are not triples, as one could logically assume, but ‘rdf molecules’, a concept not so easy to understand and deal with in practice. By setting the model on such a high level of granularity, a good deal of flexibility is lost.

In Hypernotation, the basic unit is a triple, and there are no blank nodes. However, the ‘spirit’ of blank nodes is preserved thanks to identifying nodes with paths encoded in URIs. Blank nodes enable indirect referencing, one that is generally more natural and closer to how people express themselves.

Using paths ensures that each node has a unique global address, so a sort of compromise is made between humans and ‘machines’. Paths are what enables the chaining of triples, connecting these basic units into more complex, meaningful ‘sentences’.

For instance, check out the resource representing Iceland calling code:

>> 354

Again, a resource in Hypernotation (called a hyperNode) is determined by two dimensions: its path and the response it returns. Its path is a URL that is optimized to be triple-friendly.

Here, several triples are chained in the path that can be understood as a question, or a query:

    <> .

        <> .

            ?value .

The data returned when the resource is looked up (in this case the literal “354”) is the answer to the question.

How will a ‘machine’ know what dbpedia2:callingCode means? To figure that out, it needs the namespace URI mapped to dbpedia2. The mapping is described on the path, where the prefix: (CURIE) is the predicate of the triple:

    <> .

        ?value .

Looking up this path returns the namespace URI. Therefore, the full URI of the property is the namespace URI followed by callingCode; by looking it up, one can find out more about its meaning.
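Once the namespace URI for a prefix is known, expanding a CURIE is a one-step concatenation. The sketch below assumes the prefix mappings have already been fetched into a dictionary; the function name expand_curie and the example.org namespace are placeholders, not the real DBpedia values:

```python
def expand_curie(curie, prefixes):
    """Expand 'prefix:name' into a full URI using a prefix-to-namespace map."""
    prefix, separator, local_name = curie.partition(":")
    if not separator or prefix not in prefixes:
        raise ValueError(f"cannot expand {curie!r}: unknown or missing prefix")
    return prefixes[prefix] + local_name

# Hypothetical mapping, standing in for what a lookup of the prefix path returns.
prefixes = {"dbpedia2": "http://example.org/property/"}
print(expand_curie("dbpedia2:callingCode", prefixes))
```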

3. Publishing

A lot has been said about the difficulties of publishing Linked Data. It seems that the main problem with Linked Data is that many requirements just don’t seem to be worth the effort. They are not justified well, at least from the point of view of an average developer.

Hypernotation, on the other hand, is similar to REST and follows the hypermedia ideas. It is based on the HTML(5) format, namely the (semantic) <a> and <li> HTML elements. The information about data structure is encoded in URLs, while data is formatted using HTML. A URL pattern and a few HTML elements are all that is needed for putting an RDF graph on the Web.

The publisher is guided and has less freedom than in Linked Data. He knows where to put the data and what identifiers to use. Only a few naming conventions are used to ensure interoperability. Take the following hyperNode:

Here, there are two conventions: data__, as a default starting point for data, and en, as a segment that implies the language in which the text is written. The rdfs__comment CURIE is also determined by the common prefix and defined local name. Therefore, in this case a publisher needs to create only two segments: dbpedia, which is the same for all the resources belonging to that group, and Iceland, a unique key identifying the resource.

Practically, Iceland is the only segment a publisher has to create, meaning that the focus shifts from thinking about URIs to names, or IDs, that are friendlier to people.

Regarding the “sensitive bits” of Linked Data (information vs. non-information resources, HTTP-range, content-negotiation, dealing with different syntaxes etc.), Hypernotation offers solutions for them, but in a way that doesn’t slap a prospective publisher in the face. These subjects will be discussed in detail in future posts.

4. Consuming

The consuming aspect of Linked Data is also problematic. When it comes to getting RDF data, there are two extremes – a primitive one vs. a highly sophisticated one. The former is about the idea of resource lookups and graph traversal, while the latter, of course, refers to SPARQL endpoints.

But where is the middle point? Linked Data alone doesn’t provide a way for meaningful compromise. It is inflexible and unable to evolve due to the inappropriate underlying model and the rigidity caused by the wrong level of granularity.

In Hypernotation, one doesn’t need to be prepared to parse a bunch of different formats when consuming data. Instead, one just looks up the data URL and gets the value. When needed, a minimum of (familiar) HTML syntax is used that can be easily parsed.

It is important that data is not just easily machine-parsable, but also easy for humans to consume. Hypernotation enables you to share the URL of the exact chunk of data you are interested in. The receiver will get the readable results in his browser, together with the context of the data (encoded in the URL path) and the ability to interact with it, simply by following the links.

Another benefit Hypernotation provides for the consumer is using predictable data locations, making data easier to find. Earlier in this post I described how the publisher is limited when it comes to minting URIs for data. The existing URIs (CURIEs) determine the contents of new URIs, while conventions are used when interoperability is needed.

However, the real benefit of this approach is on the consumer side. Imagine you deal with a lot of different data published on different websites. With Linked Data or a REST API you have no idea where the data is. You must go to each website separately, browsing through it and finding the location of data. Each one does it differently and the process can’t be automated. In Hypernotation, if you know the website’s URL, you know where the data is.

For example, if you know the website’s homepage:

… where to look for the published data? The answer:

What vocabularies are used for describing it?

Hypernotation also encourages the idea of ‘URL hacking’, i.e. guessing the URL based on the other URLs. Using familiar URL path segments, you can construct new logical URLs for the data you are looking for.

For instance, given that you know the following URI:

… what is the URI of Iceland’s rdfs label in French? You guessed it right:

And what is the homepage of Paris?

Given that dbont:birthPlace is the property, give me the list of people born in Manchester.

As you can see, the idea of default data locations in Hypernotation is very important. The cool part is that the locations of data can be guessed not just by humans, but by ‘machines’ as well, thanks to the fact that the meaning of the relations is well-defined.

Finally, it is important to acknowledge that the opaque axiom doesn’t work in the context of data, and that we need a ‘transparency axiom’ (more on that soon). Structured data forms a graph, and a graph is nearly useless without the ability to use paths. In order to use paths, we encode them in URIs, so they must be transparent by definition.

I hope you find the DBpedia example interesting. Any questions and suggestions are welcome!

An example of Hypernotation

Let’s publish some data using Hypernotation. I am going to use the same (Chuck Norris) RDF example I have used on the blog so far. You can see the published data on and the prefix mappings on The domain is different for obvious reasons – I used my domain instead of, but the structure is the same as in the below diagram.


Every circle represents one web page. The contents of a circle is (rendered) HTML that you see when you visit that page in the browser. If you click on a circle, the web page it represents will open up, so you can see what it looks like in the wild.

As you can see from the image, the web pages are organized hierarchically, with the homepage as the root of the tree. Hypernotation allows you to browse data in a way similar to how you navigate through folders on a hard disk.

The arrows connecting the nodes (circles) are links between the web pages, pointing from a parent web page to its (immediate) child in a hierarchy. For each node in the tree there is a unique path leading to it, that can be seen by placing the mouse over it. Therefore, the URLs are paths comprised of a number of segments, i.e. the labels of all the links connecting the intermediate nodes between the root and that particular web page.

The labels next to the links are relative paths from the parent to the child page. Therefore, the web address can be figured out by adding that label to its parent’s address. For instance, the relative path from to is data_, which is also the last segment of the child’s URL (

From web pages to hyperNodes

So far, I used the word “web page” for what the circles represent. However, although written using HTML tags, these web resources are not ordinary web pages. Hypernotation is expressed using just a tiny subset of HTML—it needs a way to say “link” and “list”, and (semantic) HTML tags <a>, <ul> mean just that, so there is no need to invent a new syntax.

If using HTML in such a way confuses you, you can think of it as XML that happens to have tags with the same names as in HTML (and is shown in a browser in a friendly and interactive way).

In the Web of documents, a web page, i.e. an HTML document, is a tree of (nested) HTML tags. In Hypernotation, the whole website is seen as a tree, in which the “web pages” are closer to the idea of a single HTML tag. In Hypernotation, data is realized at the most granular level possible, meaning that each node, i.e. a web resource represents a single, indivisible, atomic piece of data.


The image above compares an HTML document and a Hypernotation tree. While an HTML document is serialized in a “flat” file using HTML syntax, Hypernotation has a “real” structure, thanks to using HTTP URLs as the names for its elements. Another difference is that in an HTML document an HTML tag can contain several elements with the same name (like the two <script> tags in <head>), while in Hypernotation all nodes with a common parent must have unique names (like in a file system).

Another essential difference is that Hypernotation is based on the RDF model. All web addresses follow the “RDF graph” URI pattern that turns URIs into fully transparent paths encoding the relations between resources in the URI itself. In other words, they encode a set of chained RDF triples where the subjects and objects are nodes in the hierarchy. Therefore, the paths can be decomposed into triples that describe the meaning of the resources.
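To make the idea of decomposing a path more concrete, here is a minimal sketch in JavaScript. It rests on one assumption of mine: a segment of the form prefix_localName (such as foaf_knows) is a typed link, and any other segment is a key. The function name pathToTriples and the regular expression are only illustrative; they are not part of Hypernotation.

```javascript
// Assumption: a typed-link segment looks like prefix_localName (e.g. foaf_knows).
// Segments such as "chuck", "steven" or "data_" do not match and are treated as keys.
const TYPED_SEGMENT = /^([a-z][a-z0-9]*)_([A-Za-z][A-Za-z0-9_]*)$/;

// Decompose a path into chained triples whose subjects and objects are paths.
function pathToTriples(path) {
  const segments = path.split("/").filter(Boolean);
  const triples = [];
  for (let i = 0; i < segments.length; i++) {
    const match = TYPED_SEGMENT.exec(segments[i]);
    if (!match) continue; // a key segment carries no predicate of its own
    const subject = "/" + segments.slice(0, i).join("/");
    // If the next segment is a key, the object is that element of the array.
    const next = segments[i + 1];
    const end = next !== undefined && !TYPED_SEGMENT.test(next) ? i + 2 : i + 1;
    const object = "/" + segments.slice(0, end).join("/");
    triples.push({ subject, predicate: match[1] + ":" + match[2], object });
  }
  return triples;
}

console.log(pathToTriples("/data_/chuck/foaf_based_near/geo_lat"));
```

For /data_/chuck/foaf_based_near/geo_lat this yields two chained triples: the first links /data_/chuck to its foaf:based_near node, and the second links that node to its geo:lat node.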

Take for instance /data_/chuck/foaf_based_near. What does this resource represent? What’s its meaning? You can find out either by a direct answer, or figure it out indirectly, from the context in which the resource exists.

Hypernotation’s way to ask a direct question is to look up its (rdf:)type property. In this case, it explicitly states that the resource is a geo:Point, i.e., as described by its rdfs:comment property, “a point, typically described using a coordinate system relative to Earth, such as WGS84”.

What about /data_/chuck? Here we don’t have the rdf:type property, but there are others that can help. For instance, foaf:knows has the rdfs:domain foaf:Person, so we can indirectly figure out the nature of /data_/chuck.

What about /data_/chuck/foaf_knows/steven? From its address you can read that it’s something with a local name steven that has the relation foaf_knows with /data_/chuck. Here, the rdfs:range of the foaf:knows property tells us that we’re dealing again with a foaf:Person.
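The direct and indirect ways of finding out what a resource is can be sketched roughly as follows. The schema table below is a hand-written excerpt that I assume for illustration (a real client would obtain rdfs:domain and rdfs:range by following links), and the function name inferType is hypothetical.

```javascript
// Hypothetical, hand-written excerpt of vocabulary data (an assumption of this sketch).
const schema = {
  "foaf:knows": { domain: "foaf:Person", range: "foaf:Person" },
};

// types: direct rdf:type statements keyed by node path.
// triples: { subject, predicate, object } records describing the context.
function inferType(node, types, triples) {
  if (types[node]) return types[node]; // the direct answer
  for (const t of triples) {
    const def = schema[t.predicate];
    if (!def) continue;
    if (t.subject === node && def.domain) return def.domain; // via rdfs:domain
    if (t.object === node && def.range) return def.range;    // via rdfs:range
  }
  return undefined; // nothing can be said from this context
}

const types = { "/data_/chuck/foaf_based_near": "geo:Point" };
const triples = [
  {
    subject: "/data_/chuck",
    predicate: "foaf:knows",
    object: "/data_/chuck/foaf_knows/steven",
  },
];

console.log(inferType("/data_/chuck/foaf_based_near", types, triples));
console.log(inferType("/data_/chuck", types, triples));
console.log(inferType("/data_/chuck/foaf_knows/steven", types, triples));
```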

As you can see, web resources in the Hypernotation context have some special features. Their addresses use the HTTP scheme and follow the URL pattern which makes the URLs (URIs) and links machine-readable. Finally, these resources are interconnected nodes in a graph. To emphasize these features (constraints) on one side, and to highlight the relation to similar concepts (that often have the word “hyper” in their names), I will refer to them as “hyperNodes”.

From hyperNodes to hyperObjects, hyperArrays and hyperLiterals

HyperNodes (circles) and links (arrows) are the basic elements of Hypernotation. However, as you can see from the first diagram, not all circles are the same. Some return a list of links, and some show a plain text or a single link in the rectangle.

Also, if you look at the links carefully, you’ll notice that some have prefixed labels denoting properties, while others look more like some kind of identifiers. Finally, there are links without a label altogether.

In the previous two posts, I discussed different types of links and hyperNodes in detail. Hopefully the following image will clarify those ideas and show that Hypernotation is mostly based on well-known, existing concepts combined in a new way.

The above diagram is the same as the first one, except for the added colors distinguishing different types of elements.

The red hyperNodes represent objects described with properties, while the pink colored ones are arrays pointing to their elements. Such objects and arrays are in the Hypernotation context called hyperObjects and hyperArrays.

The links directed from both hyperObjects and hyperArrays are called tree links, meaning that they always connect parent nodes with their direct children in a hierarchy. However, there is an important difference between them. The red links are typed links with labels (prefixed names, i.e. CURIEs) that are mapped to URIs. Typed links always point from hyperObjects, so the two share the same red color, together representing a logical unit.

The pink links, on the other hand, always point from hyperArrays; their role is to provide unique keys (IDs) for the hyperArray’s elements. These keys are not globally unique URIs like typed links, but are unique only in the local context (namespace) of a hyperArray. Since key links always start at hyperArrays, the two share the same pink color.


The image above should help you better understand the relation between hyperObjects and hyperArrays. Here, together they build up what can be considered a table, in which the hyperArray provides unique keys for the “rows” (objects), while the object’s properties are in the “columns”. Of course, Hypernotation enables greater flexibility, where not all objects in the same “table” (i.e. belonging to the same class) must share the same set of properties.

The black circles, called hyperAttributes, are hyperObjects that can be represented by a single value. In other words, a hyperAttribute is a hyperObject with the rdf:value property. Instead of returning the list with the single property, it directly returns the value (in rectangle). This is just like English: you don’t say “the value of his name is Carlos”, but rather “his name is Carlos”, equating the name with its value for practical reasons. But “Carlos” != the name; it’s the value of the name, and the value is just one of the name’s possible properties.

The blue arrows in the diagram are hyperlinks. On the Web, hyperlinks don’t have any particular meaning, but in Hypernotation they connect two different hyperNodes that represent the same thing. That is, they represent the owl:sameAs relation. If a hyperObject has just one owl:sameAs property, it becomes a hyperReference. Instead of showing the list with a single property, a hyperReference directly returns the value of owl:sameAs, i.e. the URL of the target hyperNode. Therefore, like hyperAttributes, hyperReferences are “syntactic sugar” enabling a shorter notation.
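The two shortcuts can be pictured as a small rendering rule. The following is only a sketch under my own assumptions: a hyperObject is modelled as a plain JavaScript object keyed by link segments, the segment itself is used as the link label, and no HTML escaping is done. The function name renderHyperObject is made up for the example.

```javascript
// props maps typed-link segments (e.g. "foaf_name") to child values.
function renderHyperObject(props) {
  const keys = Object.keys(props);
  if (keys.length === 1 && keys[0] === "rdf_value") {
    // hyperAttribute: return the value itself instead of a one-item list
    return String(props.rdf_value);
  }
  if (keys.length === 1 && keys[0] === "owl_sameAs") {
    // hyperReference: return a single hyperlink to the target hyperNode
    const target = props.owl_sameAs;
    return '<a href="' + target + '">' + target + "</a>";
  }
  // ordinary hyperObject: a list of typed links
  const items = keys.map((k) => '<li><a href="' + k + '">' + k + "</a></li>");
  return "<ul>" + items.join("") + "</ul>";
}

console.log(renderHyperObject({ rdf_value: "Carlos Ray Norris" }));
console.log(renderHyperObject({ owl_sameAs: "/data_/bruce" }));
console.log(renderHyperObject({ foaf_name: {}, foaf_knows: {} }));
```

As soon as a second property appears next to rdf_value or owl_sameAs, this sketch falls back to the full list.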


So far, we have dealt with the two types of hyperNodes—hyperObjects and hyperArrays. Besides the two “primitive” subtypes of hyperObject (hyperAttributes and hyperReferences), the image above shows another type of hyperNode—a hyperLiteral. Simply put, it’s a hyperNode representing a literal value. This value can be in the form of a plain text (black circle on the top right) or a hyperlink (blue circle on the bottom right).

A hyperLiteral is the only type of a hyperNode that actually holds data. HyperObjects and hyperArrays return the list of links to their direct children, i.e. hyperNodes they are directly connected to. This list is not “their data”—they don’t hold any data by definition. The list is just a representation of a structure that exists around a particular hyperNode, telling a client about the available links to follow to go on with traversal. A hyperLiteral, on the other hand, doesn’t have any children by definition and returns the data it holds.


As the above image suggests, all the types of hyperNodes can be divided into the three categories: structural, peripheral and terminal. The classification is based on the place a node holds in a tree structure. The structural hyperNodes build up the structure of the tree, while the terminal are the end nodes holding data. Peripheral are, simply put, hyperObjects directly connected to a terminal hyperNode.

Alternative methods of publishing hyperObjects

What if a hyperObject described with rdf:value has additional properties? If the other properties are as important as rdf:value, it can be realized as an ordinary hyperObject. But if rdf:value is dominant, the hyperObject can still directly return that value, and provide the list of all properties on the hyperObjectURI/_ path. The image below shows both situations.

The left diagram shows a hyperObject that has two properties: rdf:value and ex:units. The two properties are shown in the list as equally important. The example on the right shows the exact same hyperObject published in a different way. It returns the value of rdf:value, like a hyperAttribute, but offers an alternative “hidden” link “_” to a hyperNode that is realized as a hyperObject (as in the left example), but with links to the parent’s properties (../rdf_value and ../ex_units).
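A rough sketch of the second approach might look like this. The sample values "42" and "kg" are invented for illustration, and lookup is a hypothetical helper, not a prescribed API.

```javascript
// A hyperObject with a dominant rdf:value and one additional property.
// The values are made up for this example.
const measurement = { rdf_value: "42", ex_units: "kg" };

// suffix is what follows the hyperObject URI: "" for the node itself,
// "_" for the "hidden" listing of all properties.
function lookup(props, suffix) {
  if (suffix === "_") {
    // The listing links back up to the parent's properties.
    return Object.keys(props).map((name) => "../" + name);
  }
  // Default look-up: rdf:value dominates, so it is returned directly.
  return props.rdf_value;
}

console.log(lookup(measurement, ""));  // the value itself
console.log(lookup(measurement, "_")); // links to ../rdf_value and ../ex_units
```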

The homepage hyperNode (the first diagram) is another example of the second approach. Here, the rdf:value of the homepage is a HTML document that is returned after look up. The list of the properties (data_ and prefix_) could be obtained on the path (although it shouldn’t be mandatory in this case, because the two properties are so common).

In fact, every web page can be regarded as a hyperObject, realized as a hyperAttribute, whose content is the value of the rdf:value property. Put in this perspective, the Web of documents is just a special case of the Web of data.

What about hyperReferences? As explained above, a hyperReference is a hyperObject described with the owl:sameAs property, which tells that it has the same meaning as some other hyperObject. But they can also have other properties besides owl:sameAs.

For instance, we can add the property foaf:name to the /data_/chuck/foaf_knows/bruce hyperReference (from the first diagram). In that case, owl:sameAs is treated equally to other properties and returned in the <ul> list (as shown in the image on the right).


In this post I discussed an example of data published according to Hypernotation principles. Although the example covers many important aspects of Hypernotation, many questions are left open. For instance, how to quickly get the complete descriptions of hyperObjects? Sending HTTP requests to every hyperNode is obviously not an efficient solution, so a way to “export” serialized parts of RDF graph must be provided. In addition, what is the relation of Hypernotation and other RDF technologies and concepts, like SPARQL, named graphs, reification, data provenance, RDF notations? Is it possible to navigate through and filter large amounts of data easily (without using SPARQL)? Finally, is Hypernotation read-only or does it allow modifying data as well?

These are some of the topics that will be covered in the next posts. This post’s example shows what a backbone of Hypernotation looks like. The backbone contains all the information required for a “machine” to understand the data. It can be enriched and facilitated for a client, depending on the needs of publishers and consumers, the size of data and its purpose. Hypernotation is flexible in this regard, enabling many levels of sophistication. What’s even more important, it provides a consistent framework in which more “advanced” features can (often surprisingly) be realized elegantly using the same fundamental principles.

Hypernotation: Classification of hyperNodes

In the previous post I discussed how the RDF and object-oriented models can happily live together. In this post, I am going to talk about various types of nodes in the Web of data graph and different ways to classify them.

In the OO model, variables can be assigned objects, arrays, primitive data types (numbers, string, boolean…) and so on. In order to make things simple, let’s (again) use a JavaScript (literal) object representation of our chuck object.

var chuck = {
        foaf_name : "Carlos Ray Norris",
        foaf_based_near : {
                rdf_type : geo_Point,   // reference to another object
                geo_lat : 45.45385,
                geo_long : 10.19273
        },
        foaf_knows : [steven, bruce]
};

Now let’s observe the various types of variables that describe the chuck object. There are primitive data types, like foaf_name, geo_lat, geo_long that have string and float value types respectively. There is the nested object that is assigned to the property foaf_based_near of the main chuck object. rdf_type is a reference to another object, geo_Point. Finally, we have the array foaf_knows whose elements point to the objects steven and bruce.
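If you want to check these kinds for yourself, a small self-contained sketch follows. The empty placeholder objects for geo_Point, steven and bruce and the helper kindOf are my own additions for the sake of a runnable example.

```javascript
// Placeholder objects so that the example is self-contained (assumed, not real data).
const geo_Point = {};
const steven = {};
const bruce = {};

const chuck = {
  foaf_name: "Carlos Ray Norris",
  foaf_based_near: {
    rdf_type: geo_Point,
    geo_lat: 45.45385,
    geo_long: 10.19273,
  },
  foaf_knows: [steven, bruce],
};

// Report the kind of a value: "array", "object", or the primitive type name.
function kindOf(value) {
  if (Array.isArray(value)) return "array";
  if (value !== null && typeof value === "object") return "object";
  return typeof value;
}

console.log(kindOf(chuck.foaf_name));                // primitive: string
console.log(kindOf(chuck.foaf_based_near.geo_lat));  // primitive: number
console.log(kindOf(chuck.foaf_based_near));          // nested object
console.log(kindOf(chuck.foaf_based_near.rdf_type)); // reference, also an object
console.log(kindOf(chuck.foaf_knows));               // array
```

Note that JavaScript has a single number type rather than a separate float, and that it cannot tell a nested object from a reference to an object defined elsewhere; both are reported as objects.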

An object in the OO model can be represented as a graph, where each variable becomes a node. In the Web of data graph published using Hypernotation, this node is called a hyperNode. It is identified by HTTP URI and may or may not hold data. A hyperNode is the most general concept representing a parent class from which all other node types inherit.

All the different types used to describe the object chuck exist in Hypernotation, keeping the same names plus the added prefix “hyper”. Therefore, we have hyperObjects, hyperArrays, hyperReferences and “primitive” hyperNodes corresponding to primitive data types. Here, the prefix “hyper” is added to make a distinction from “classical” concepts and to emphasize the new context in which they exist, together with the fact that they are, without exception, identified by an HTTP URI.

So far, the parallel between the OO model and Hypernotation should be pretty clear. What can be potentially confusing is that in Hypernotation the value itself is a distinct node, called a hyperLiteral. For instance, “Carlos Ray Norris” is a value of foaf_name, meaning that the two are distinct concepts (the former is the value of the latter), and thus represented as separate nodes in a graph.

Of course, literals as separate “nodes” in RDF are well known, but what is new here is that they are explicitly described as values (using the rdf:value property). In addition, they are, like all the other types of nodes, identified by a URI. Hypernotation is consistent in this regard – a hyperLiteral is just an instance of the hyperNode “class”, thus inheriting its properties. Using a URI as an identifier is one of the defining properties of a hyperNode.

One can make a comparison with another model based on a tree structure – the Document Object Model (DOM). In DOM, there is a top class (constructor) Node and all the more specific node types inherit from it. There are conceptual differences between elements, attributes, text nodes, but everything, even a text, is a node. The result is that each text node, as any other node, has common Node properties like e.g. nodeName, nodeType and nodeValue.

Compared to the classical RDF model that is a labeled, directed graph, Hypernotation is based on a predominantly tree model. Therefore, if you have trouble understanding why literals in Hypernotation have URIs, blame the simplicity and consistency of a tree as a data structure. In a tree, everything is a node, and every node has a unique name (in the form of a path, i.e. a URI).


There are several ways to classify nodes of the Web of data graph.

1. Based on the place a node holds in the tree structure, hyperNodes can be divided into two basic types: those that participate in forming the structure of the tree (structural), and those placed at the ends of branches, holding some data (terminal). HyperLiterals are terminal nodes, while all the other types of hyperNodes are structural.

Structural nodes don’t have a value, unless they are directly connected to terminal nodes. In that case, they act like primitive data types that can be represented by single values. These “primitive” hyperNodes connect structural and terminal nodes and can be regarded as a third class of nodes (peripheral). They can have a literal value, like a primitive data type, or can be a reference to another hyperObject, in both cases using a terminal node to express themselves.

2. When projecting an (adjusted) RDF graph onto the Web, additional nodes that are not originally contained in the RDF graph are added. In that sense, another criterion of classification is whether a node is part of the original RDF graph. HyperObjects and hyperLiterals represent nodes of the original RDF graph (URI references and literals, respectively) and can be called basic hyperNodes, while the other hyperNodes (hyperArrays, hyperReferences) that make the projection possible could be called auxiliary hyperNodes. What happened to blank nodes? Nodes without names don’t exist in Hypernotation, but there are nested hyperObjects that vaguely reflect the idea of chaining blank nodes.

3. Another way to classify hyperNodes is by data types, and corresponding HTML tags used to represent them. There are four possibilities: objects, arrays, references and plain text. They are realized using semantic HTML tags <ul> (and <li>), <a> and no tag at all. HyperObjects and hyperArrays are encoded as lists, while the value of a hyperReference is a single hyperlink. A hyperLiteral is realized as a plain text (or a hyperlink).
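From a client’s point of view, this third classification boils down to looking at the first tag of the response body. Here is a deliberately naive sketch; the function name classifyBody and the returned labels are my own, and a plain text that happens to start with “<ul” or “<a” would fool it.

```javascript
// Classify a response body by the HTML tag it starts with.
function classifyBody(body) {
  const text = body.trim();
  if (/^<ul[\s>\/]/i.test(text)) return "list";   // hyperObject or hyperArray
  if (/^<a[\s>]/i.test(text)) return "hyperlink"; // value of a hyperReference
  return "text";                                  // plain-text hyperLiteral
}

console.log(classifyBody('<ul><li><a href="foaf_name">name</a></li></ul>'));
console.log(classifyBody('<a href="/data_/steven">steven</a>'));
console.log(classifyBody("Carlos Ray Norris"));
```

Telling a hyperObject from a hyperArray takes one more step, namely checking whether the links in the list are typed (prefixed) or plain keys.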

The first method of classification seems like the most natural one, so it will be used for the purpose of this post. By this criterion, nodes can be:

Structural hyperNodes


A hyperObject is basically a URI reference realized on the Web of data. It can be regarded as an object (in the object-oriented model sense) realized using fundamental Web standards.

After being looked up, a hyperObject returns a collection of typed links that it establishes with adjacent hyperNodes. This list is written using a <ul> HTML tag, while typed links are realized with <a> tags. A hyperObject can be connected to all other types of hyperNodes (including other hyperObjects), but these must reside on the same website as the hyperObject, and be its child nodes in the website tree.

Such a realization of a hyperNode resembles the view of a folder in a file system, which shows its “inner structure”, containing a list of its subfolders that enables further navigation through the directory tree.

A hyperObject is a property of its parent object in a tree. If a  property is multi-valued, it will be implemented as an element of a hyperArray. The standard „root“ hyperArray is, so the root hyperObjects have the URIs in the form of You can think of hyperObjects as of REST item resources, while keySegments are their IDs.


A hyperArray is a structural hyperNode that is used when a hyperObject is described with a multi-valued property, i.e. when there are several RDF triples with the same subject and predicate. In that case, there is a need for an “internode” which is a kind of a local “namespace” connecting the parent hyperObject with different values of the same property. Here, the equivalent REST concept would be a collection (or a list) resource.

A hyperArray returns a list of key segments, tree links to the child hyperNodes realized using HTML <ul> and <a> tags. While a hyperObject acts as the subject of an RDF triple containing the list of predicates, a hyperArray gets the role of the “predicate” connecting the subject to a list of objects (objects in the RDF sense).

A hyperArray represents a class that is a subclass of the range of the property used as a predicate, while its elements are instances of these classes. For example, the hyperArray represents the class with the meaning “a person Chuck Norris knows” that reduces the class of all people (foaf:Person), which is the range of foaf:knows, to the set of people known by Chuck. Compared to a hyperObject that returns a list of typed links, a hyperArray is a list of untyped tree links (key segments) that have an implicit meaning of “has an instance”, being the inverse properties of rdf:type.

Peripheral hyperNodes

Primitive hyperNode

If a hyperObject is described by the rdf:value property (i.e. connected to a terminal node representing its main value), the hyperObject’s nature changes. The fact that it can be represented by a single value turns this hyperObject into what can be considered a primitive data type rather than an object.

In different object-oriented languages, primitive types and their relationship to objects are implemented in different ways. In Java, there are eight primitive data types built into the language. However, Java provides a wrapper class for each of them so that they can also be used as objects.

Similarly, in JavaScript objects can be instantiated behind the scenes, which allows treating primitives as objects. For example, "foo".length will instantiate a temporary String object and return the value of its length property (3). On the other hand, every object has the method valueOf(), which returns a single-value representation of the object. For example, new String("foo").valueOf() will return "foo".
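The two directions (primitive treated as an object, and object reduced to a primitive) can be checked with a few lines of JavaScript. The variable names are mine.

```javascript
const primitive = "foo";           // a primitive string
const wrapped = new String("foo"); // an explicit String object

const report = {
  primitiveType: typeof primitive,             // "string"
  wrappedType: typeof wrapped,                 // "object"
  length: primitive.length,                    // property access on a primitive
  unwrapped: wrapped.valueOf(),                // back to the primitive value
  sameValue: wrapped.valueOf() === primitive,  // the two values are identical
};

console.log(report);
```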

Hypernotation is designed with a similar kind of flexibility in mind. That is, a hyperObject that has a value can be realized in two ways: as a primitive data type or an object, depending on the situation. If there are several equally important properties where rdf:value is one of them, the hyperObject can be realized as a regular (structural) hyperObject. That is, it will return the list of its child nodes with rdf:value realized as a separate node on the hyperObjectURI/rdf_value path.


On the other hand, if rdf:value is the dominant (or, as is often the case, the only) property, the hyperObject can directly return that value instead of the list. This way, the value can be obtained by sending an HTTP request directly to the hyperObject URI, which is often more intuitive and practical.

If realized this way, a "primitive" hyperObject can be called a hyperAttribute. Here, the downside is that the list of other potential properties is hidden. The solution is to provide an alternative standard path for showing the hyperObject's properties - "hyperObjectURI/_" (which will be discussed in more detail in future posts).

If a hyperObject has no properties whatsoever, one should distinguish this kind of hyperObject from one with the value "" (an empty string) realized as a hyperAttribute. A hyperObject without any branches should return an empty list, i.e. "<ul />", which in Hypernotation is equivalent to the "null" value.


If a hyperObject has just one property, owl:sameAs, it becomes a hyperReference. As in the case of “primitive” hyperObjects, instead of returning a list with just one property (a hyperLiteral hyperObjectURI/owl_sameAs), it can return the value of owl:sameAs. It’s realized with an HTML <a> element, a hyperlink that can point to any resource on the Web. Therefore, a hyperReference can be considered the special case of a hyperObject that has just a single owl:sameAs property.

However, a hyperNode that starts as a hyperReference can eventually evolve into a hyperObject, described by many additional properties besides the initial owl:sameAs. Alternatively, the reason for creating a hyperReference may be to describe an external resource in the first place (which is the only way to do that in Hypernotation); then owl:sameAs has the role of stating that the described hyperObject is the same as the target hyperObject. In this situation, all properties are important and the list of properties (with owl:sameAs as one of them) should be returned the same way as with an ordinary hyperObject.

Terminal hyperNodes

A hyperLiteral is a plain literal realized in the Hypernotation context. It’s a node that represents raw data itself and is used when a hyperObject has a single-value representation that is expressed with the rdf:value property. The URI of a hyperLiteral, therefore, always has the form of hyperObjectURI/rdf_value and it doesn’t branch further. Using its own URI is one way of publishing a hyperLiteral. The other way is to implement it as the value of a hyperAttribute, which was discussed earlier in this post. This type of literal returns a plain unformatted text (atomic data), and can be called a “raw data” literal. Dealing with different data types is discussed in the post Getting rid of typed literals.

A hyperlink literal, on the other hand, is another type of a terminal hyperNode. Simply put, it’s the value of a hyperReference. A link from a hyperReference to its target object is another property, so it must exist as a separate hyperNode as well. Its URI has the form of hyperReferenceURI/owl_sameAs, and it returns an <a> tag with the referred resource's URI in the href attribute. As with the “raw data” literal, it can be published in two ways, using its own URI or returning the value via its parent hyperReference.

Classification of hyperNodes

The above image shows the three basic types of hyperNodes and their mutual relationships. HyperNodes that strictly participate in forming a structure and are not connected to terminal nodes are in the yellow column. P1, P2, etc. are different properties (typed links). A hyperObject that is described with rdf:value or owl:sameAs is considered a peripheral hyperNode (green column). The two ways in which it can be realized are shown in the image: one in which it returns the value of the hyperLiteral, and the other in which it's realized as a regular hyperObject and the only way to get to the hyperLiteral is through its own URI (blue column). The additional hyperNodes it is connected to can be both structural and peripheral, so they are placed on the border between the yellow and green columns.
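The three categories can be summed up in a few lines of code. This is only a sketch that assumes hyperNodes are modelled as plain JavaScript values (objects and arrays for structure, strings and numbers for literals); the function name category is hypothetical.

```javascript
// Decide which column of the diagram a node would fall into.
function category(node) {
  // Anything that is not an object or array holds data itself: a hyperLiteral.
  if (typeof node !== "object" || node === null) return "terminal";
  // A hyperObject described with rdf:value or owl:sameAs touches a terminal node.
  if ("rdf_value" in node || "owl_sameAs" in node) return "peripheral";
  // Everything else only builds up the structure of the tree.
  return "structural";
}

console.log(category("Carlos Ray Norris"));                // terminal
console.log(category({ rdf_value: "Carlos Ray Norris" })); // peripheral
console.log(category({ owl_sameAs: "/data_/steven" }));    // peripheral
console.log(category({ foaf_name: {}, foaf_knows: [] }));  // structural
console.log(category([]));                                 // structural
```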

Bringing together the RDF and OO models in the Semantic Web

The RDF model has many similarities to the object-oriented model. These are described in A Semantic Web Primer for Object-Oriented Software Developers:

Domain models consist of classes, properties and instances (individuals). Classes can be arranged in a subclass hierarchy with inheritance. Properties can take objects or primitive values (literals) as values.

The same document notes some of the differences: RDF is theoretically based on logic, and a property in the RDF model is a “first-class citizen”, while in the OO (object-oriented) model it’s defined in the context of a class. The RDF model does not have methods and, unlike in the OO model, all parts of the RDF graph are public.

Despite the differences, the basic ideas behind the OO and RDF models are similar. Objects (resources) are described by properties and relations to other objects and form a graph. One can draw an analogy between an object and a URI reference, and between a primitive data type and a URI reference linked to a (literal) value. This comparison helps in understanding not just Hypernotation but the RDF model in general.

Hypernotation is most easily understood as an OO model applied to the Web. The Web can be seen as a global root object, while all other objects (resources) are properties of the Web or its children’s objects. These resources are represented by hypernodes, i.e. special types of web resources. Hypernotation sets the rules for publishing hypernodes and linking them into a consistent system of the Web of data.

Dot notation is popular in object-oriented programming languages as an intuitive way of representing and manipulating variables. Dot notation enables unique names/paths representing a location in a hierarchy. Hypernotation applies the idea of dot notation to the global context, using URIs as paths for Web “variables” (where “/” (slash) is used as the delimiter instead of “.” (dot)).

On the other hand, it’s based on the RDF model, so the paths actually encode all kinds of relations between the variables, and every data structure can be decomposed to triples.
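The correspondence between a URI path and dot notation can be demonstrated with a tiny resolver. In this sketch the site object stands in for a single website and holds only one made-up branch; resolve is a hypothetical helper.

```javascript
// A stand-in for one website's tree of "variables" (sample data only).
const site = {
  data_: {
    chuck: {
      foaf_name: "Carlos Ray Norris",
    },
  },
};

// Walk the tree segment by segment, just as dot notation would.
function resolve(root, path) {
  return path
    .split("/")
    .filter(Boolean)
    .reduce((node, segment) => (node == null ? undefined : node[segment]), root);
}

// Both expressions name the same location in the hierarchy.
console.log(resolve(site, "/data_/chuck/foaf_name"));
console.log(site.data_.chuck.foaf_name);
```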

In this context, hypernodes are just variables, with a difference that the value of a variable is also a hypernode. The types of hypernodes (hyperobjects, hyperarrays and hyperreferences) reflect similar ideas from programming languages. Let’s remind ourselves of the example that I have used in this blog so far.

Here a person (hyperobject) called Chuck Norris is described with a few properties/relations. You can see the structure that is basically a tree where the nodes represent his name, location and friends. The terminology used is “defined” in the prefix_ hyperarray, whose elements point to different vocabularies.

The names of nodes are URIs, obtained by concatenating all preceding segments in one branch, similarly to dot notation. For instance, the URI of the hypernode chuck is This hypernode, together with its child hypernodes describing it, constitutes a hyperobject that can be naturally represented as an object in the OO sense. For instance, in JavaScript, it would look like this:

var chuck = {
        foaf_name : "Carlos Ray Norris",
        foaf_based_near : {
                rdf_type : geo_Point,
                geo_lat : 45.45385,
                geo_long : 10.19273
        },
        foaf_knows : {
                steven : steven,
                bruce : bruce
        }
};

Here, the variable chuck represents the hyperobject The variables geo_Point, steven and bruce are also objects.

If you try to evaluate the object chuck in JavaScript, you will get "[object Object]" string back. Not very useful. The equivalent of “evaluating” in Hypernotation is looking up (de-referencing) a hypernode. So what will you get in return if you send the HTTP GET request to You’ll get the answer HTTP/1.1 200 OK with the following body:

<ul>
    <li><a href="foaf_name">name</a></li>
    <li><a href="foaf_knows">knows</a></li>
    <li><a href="foaf_based_near">based_near</a></li>
</ul>

As you see, Hypernotation is more verbose than JavaScript in this regard. It follows the REST HATEOAS principle, where links are used in the representation of a resource enabling a user (or an application) to follow for more information, i.e. to “move the application from one state to the next“.
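To illustrate how a client could follow these links, here is a minimal sketch that pulls the links out of such a body and resolves them against the address of the hypernode. The base URL http://example.org/data_/chuck/ is a placeholder I made up, the regular expression is far from a real HTML parser, and the function names are hypothetical.

```javascript
// Extract { href, label } pairs from simple <a href="...">label</a> links.
function extractLinks(html) {
  const pattern = /<a\s+href="([^"]*)"[^>]*>([^<]*)<\/a>/g;
  return [...html.matchAll(pattern)].map((m) => ({ href: m[1], label: m[2] }));
}

// Resolve every link against the hypernode's own address.
// Assumption: the base ends with "/", otherwise relative links would
// replace the last segment instead of being appended to it.
function resolveLinks(html, base) {
  return extractLinks(html).map((link) => new URL(link.href, base).href);
}

const body = [
  "<ul>",
  '    <li><a href="foaf_name">name</a></li>',
  '    <li><a href="foaf_knows">knows</a></li>',
  '    <li><a href="foaf_based_near">based_near</a></li>',
  "</ul>",
].join("\n");

console.log(extractLinks(body));
console.log(resolveLinks(body, "http://example.org/data_/chuck/"));
```

Each resolved address is the next “state” the client can move to.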

Hypernotation is based on the RDF model, so describing resources is standardized thanks to the self-describing nature of RDF. In other words, you don’t need documentation that describes properties such as “foaf_knows”, because it is itself the URI of a hypernode, which is described according to the same universal principles. Also, Hypernotation uses semantic HTML elements. A link is always <a> and a list is always <ul>. As simple as that.

Let’s go back to the JavaScript example. I simply named the variable chuck, but the problem is that we are dealing with the global scope here, and by global I don’t mean the property of the global window variable, but global in the context of the whole Web. Therefore, it’s important for variables to have globally unique names.

The problem of name collisions in JavaScript is solved by using namespaces, where all objects are nested in other objects forming a big object (tree) assigned to a single global variable, reducing the chance of name collisions. But if we imagine that instead of the browser window context we suddenly end up in the Web context with a huge number of global variables, solving the problem by picking a random name hoping that it’s unique is no longer a solution.
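For readers who haven’t met the namespace pattern, it looks roughly like this. All names and values in the sketch (myApp, people, films) are made up for illustration.

```javascript
// One global name; everything else hangs off it.
const myApp = {};

// Two sub-namespaces can each have their own "chuck" without colliding.
myApp.people = { chuck: { foaf_name: "Carlos Ray Norris" } };
myApp.films = { chuck: { kind: "placeholder" } };

console.log(myApp.people.chuck.foaf_name);
console.log(myApp.people.chuck === myApp.films.chuck); // two distinct objects
```

The pattern only pushes the problem one level up: the single global name (here myApp) still has to be unique.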

Earlier in this post, I said that the Web can be seen as a global root object, while all the other objects (resources) are properties of the Web or its children’s objects. How to represent the variable in the context of its relation to the global “Web variable” then? Let’s divide the path using the standard Hypernotation “/” delimiter.

http:/ / / data_ / chuck

We ended up with four nodes where is a kind of property, meaning "has domain". This fact can be represented with a strange-looking triple, where the property is a special in a sense that it's neither CURIE nor the standard URI, but it's still unique (which is important):

<http:/> <> .
<> data: <>.

The subject http:/ is the "root" segment common to all variables (hypernodes) names. It's a bit awkward looking, so perhaps it should be renamed to, say, a friendlier web. Also, when representing this path in JavaScript, we must put in square brackets because it is not a valid variable name (it contains ".").

Therefore, can be represented in JavaScript as web[''].data_.chuck. The names of other objects can be written in the similar fashion. In that sense, e.g. the foaf_knows hyperarray would have the following value:

web[''].data_.chuck.foaf_knows = {
	steven : web[''].data_.steven,
	bruce : web[''].data_.bruce
};

Don't be confused by the fact the hyperarray is represented as an object in JavaScript. It is a hash (associative array / dictionary), and in JavaScript a hash and an object are the same thing. You can also think of it as an array:

web[''].data_.chuck.foaf_knows = [web[''].data_.steven, web[''].data_.bruce];

In this case, the URIs would use indexes instead of keys. For instance, instead of, the URI would be

The key difference between hyperarrays and hyperobjects is that in hyperobjects the keys (properties) are URIs, while in hyperarrays they are strings. These keys (segments) represent different types of tree links: typed links and key links. In the previous post I described this in more detail.
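To make the JavaScript fragments above concrete, here is a minimal runnable sketch. The domain key "example.org" is a hypothetical placeholder of mine (the snippets above leave the key empty), and the helper toPath is not part of Hypernotation; it only shows how the access path of a variable lines up with a slash-delimited hypernode path.

```javascript
// Hypothetical domain; it stands in for the empty key in the snippets above.
const DOMAIN = "example.org";

// The global "Web" root object with one site and its data_ namespace.
const web = { [DOMAIN]: { data_: { chuck: {}, steven: {}, bruce: {} } } };
const data = web[DOMAIN].data_;

// Hyperarray written as a hash: the keys become path segments.
data.chuck.foaf_knows = { steven: data.steven, bruce: data.bruce };

// Join path segments with the "/" delimiter (illustrative helper).
function toPath(...segments) {
  return segments.join("/");
}

// Paths of the items when the hyperarray is a hash.
const byKey = Object.keys(data.chuck.foaf_knows).map((key) =>
  toPath(DOMAIN, "data_", "chuck", "foaf_knows", key)
);

// The same hyperarray written as an array: indexes replace the keys.
const asArray = [data.steven, data.bruce];
const byIndex = asArray.map((_, index) =>
  toPath(DOMAIN, "data_", "chuck", "foaf_knows", String(index))
);

console.log(byKey);
console.log(byIndex);
```

In both forms the items are references to the same objects under data_, so only the last path segment (a key or an index) changes.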

In the REST context, the hyperarray would be a collection or a list resource, while and would be item resources belonging to this collection (with steven and bruce as IDs). In REST, the name of a collection typically ends with a plural noun, while in Hypernotation a hyperarray always ends with a verb (relation).

In Hypernotation, the representation of this hyperarray would look as follows:

<ul>
    <li><a href="bruce">bruce</a></li>
    <li><a href="steven">steven</a></li>
</ul>

What about assignment? In the previous example we had the (hyper)array containing two variables: bruce and steven that are references to the corresponding objects. For instance,

web[''].data_.chuck.foaf_knows.steven = web[''].data_.steven;

The hyperreference would return the following HTML:

<a href="">Steven Seagal</a>

Describing the object's property as a reference to another object is represented using triples as follows:

    foaf:knows <> .

                   owl:sameAs <> .

Now let's take a look at a property that is a primitive data type. The property foaf_name is one example. In JavaScript it looks like this:

web[''].data_.chuck.foaf_name = "Chuck Norris";

Analogously, in Hypernotation the hypernode will return a string:

Chuck Norris

Finally, the object chuck from the beginning of the post can be fully decomposed to triples in the following way.

    foaf:name <> ;
    foaf:based_near <>;
        <> .

    rdf:value "Chuck Norris" .

    rdf:type <> ;
    geo:lat <> ;
    geo:long <> .

    owl:sameAs <> .
    rdf:value "45.45385" .
    rdf:value "10.19273" .

    owl:sameAs <> .
    owl:sameAs <> .

Using extended CURIE, a syntactic sugar inspired by the flexible chained blank nodes syntax, this can be written in a much shorter form:

    foaf:name [ rdf:value "Chuck Norris" ] ;
    foaf:based_near [
        rdf:type [ owl:sameAs <> ] ;
        geo:lat [ rdf:value "45.45385" ] ;
        geo:long [ rdf:value "10.19273" ]
    ] ;
    foaf:knows [
        steven [ owl:sameAs <> ],
        bruce [ owl:sameAs <> ]
    ] .

The property rdf:value has a special role, i.e. it's always used for connecting hyperobjects to literal values. If the object is a literal, the property must be rdf:value, so this doesn't need to be written explicitly. Although theoretically owl:sameAs can be removed as well, we are not going to go there right now. For now, let's just write owl:sameAs using a friendlier CURIE is:. Therefore, the example can be even shorter/friendlier:

    foaf:name "Chuck Norris" ;
    foaf:based_near [
        rdf:type [ is: <> ] ;
        geo:lat "45.45385" ;
        geo:long "10.19273"
    ] ;
    foaf:knows [
        steven [ is: <> ],
        bruce [ is: <> ]
    ] .

Now the RDF syntax has almost the same structure as the example using object literal notation written in JavaScript. Finally, let's write the full example, using the global root variable and prefix definitions as well:

<http:/> [
        prefix: [
                is [ is: <> ],
                rdf [ is: <> ],
                foaf [ is: <> ],
                geo [ is: <> ]
        data::chuck [
            foaf:name "Chuck Norris" ;
            foaf:based_near [
                rdf:type <> ;
                geo:lat "45.45385";
                geo:long "10.19273"
            foaf:knows [
               steven [ is: <> ],
               bruce [ is: <> ]

Voila! The Web has a domain that defines a concept chuck that has a name with a value "Chuck Norris" and a bunch of other properties, whose prefixes are all nicely mapped in the prefix_ array. In other words, the object is fully comprised of triples.

Of course, you can say that the price is too high: one must use a lot more triples, together with dealing with rdf:value and owl:sameAs all the time. The elegance of RDF is lost! That argument doesn't sound unreasonable. However, I think it's just the result of a false perception.

The real nature of triples is that they are building blocks. They are the lowest level of abstraction for connecting (linking) data. Put in this perspective, complaining that triples are not elegant to work with is similar to complaining that assembly language is not elegant for programming.

Or imagine a human language where everything is decomposed into a large number of the most basic sentences. Would that be elegant? Just as we use more complex constructs in a language, we find "higher-level" data structures more intuitive and practical than triples.

Links in Hypernotation

In one of the previous posts I discussed the idea of two types of links on the Web: tree links and graph links. The Web can be seen as a collection of trees with hyperlinks connecting random nodes of these trees. These hyperlinks are what cause the trees to become graphs, thus the name “graph” links, while links between the nodes in hierarchy are called “tree links”.

An interesting thing about tree links, compared to hyperlinks, is their ability to encode information. This is possible due to the hierarchical paths that dictate the names of resources (nodes). The information is encoded in the paths (i.e. pathnames) and is based on the difference between the paths of two adjacent nodes, which denotes a relation between those nodes.

Typically, they are just human-readable strings denoting relations in a hierarchy. However, in Hypernotation, these strings can also become CURIEs, i.e. shortened URIs that identify any type of relation.

The classification of links in Hypernotation looks as follows:

In the Hypernotation context, an RDF graph is projected onto the Web. In this RDF graph all links (properties or relations) have defined meaning. In Hypernotation, only typed links have the ability to “mean” different things, while hyperlinks and key segments always hold the same meaning.

Hyperlinks, or graph links, are not able to encode any information other than an (implied) direction. Therefore, their meaning is always the same. In the Web of documents, this meaning is vague – a hyperlink to another webpage can be interpreted such that the webpage “points to”, “mentions”, or “finds valuable” another webpage.

In the Hypernotation context, an act of hyperlinking can also be interpreted as “adding value to the target”, but its meaning is different. Here, a hyperlink has the meaning “is same as”. In Hypernotation, a hyperlink always comes from a special type of hypernode called a hyperreference.

A hyperreference defines the same concept in a new context. In the “Chuck Norris” example used on this blog so far, Bruce Lee becomes a person Chuck Norris knows. This new context requires a new web resource (hypernode), together with a connection (hyperlink) from that hypernode to the original one. This hyperlink would be written as follows:

<a href="">Bruce Lee</a>

The new hypernode is identified by a URI whose pathname contains the information about the new context of the concept: it defines Bruce as an acquaintance of Chuck. This path can be decomposed into the following triples:

    data: <> .

              foaf:knows <> .

                             owl:sameAs <> .

In Hypernotation there are two subtypes of tree links: typed links and key segments.

A typed link is a link that, in addition to having a direction, represents a particular relation between two resources. In Hypernotation, the subject and object of a triple are nodes in a tree identified by URIs. These URIs have the form of paths. The object (child) URI is created by adding a new path segment to the subject (parent) URI. The path segment, i.e. the difference between the two paths, encodes the type of the link.

subjectURI + segment = subjectURI/segment = objectURI

Here, the segment encodes the CURIE/shortened form of the URI identifying the relation. The CURIE is a segment of another URI, meaning that multiple URIs can be combined in a new, “compound”, URI. This ability to use multiple URIs in a single “main” URI is one of the key features Hypernotation brings to the table.

Typed links are those “inner” URIs, encoded as CURIEs (prefix:name), whose prefix is defined on For instance, the subject node is described with a triple where the predicate is foaf:name and the object is

    foaf:name <> .

The difference between the subject and object URIs is the segment foaf_name. It’s the CURIE (foaf:name) whose prefix is defined on a hyperreference. That hyperreference returns the hyperlink pointing to the namespace, so that the property URI can be obtained by an HTTP GET request.

A typed link is realized with an <a> element on the subject hypernode. Therefore, when an HTTP request is sent to the subject hypernode, this typed link will be one of the typed links returned in the list.

<ul>
    <li><a href="foaf_name">name</a></li>
</ul>

The typed link contains not only information about the direction to a resource, but also information about the type of the relation, identified by a URI. However, the value and the name of a typed link need not be the same. For example, take this triple:

        foaf:knows <> .

The difference between the subject and the object written as a relative path looks as follows:

<a href = "foaf_knows/steven">

In this case, we have two segments: the typed link “foaf_knows” and the key segment “steven”. Here, the type of the link “foaf_knows” is obtained after removing the key segment “steven” and differs from the value of the link, i.e. the relative link “foaf_knows/steven”.

Relative path has been used to help understand the idea of typed links. Of course, the path can also be absolute:

<a href="">

… instead of relative

<a href="foaf_knows/steven">

In this case, the value of the “href” attribute is no different from the URI of the object in the RDF triple. What’s important is the difference between the two URIs. Thus, a typed link can never be an external one, i.e. directed to a Web resource in a different domain (web site).

A typed link is defined by several constraints:

  • A typed link connects two hypernodes on the same web site, where the first node is the subject, and the second is the object of an RDF triple.
  • The URI of the subject is contained in the URI of the object. The type of a typed link equals the difference of the two hypernodes it connects, minus the optional key segment.
  • Its type is in the form of a CURIE – a shortened URI using “_” as the delimiter (prefix_name).
  • The namespace to which “prefix1” points in “prefix1_name” must be defined on the path “”.
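The constraints above can be checked mechanically. The following is only a sketch under my own assumptions: the URIs use the hypothetical domain example.org, a segment counts as a CURIE when it has the prefix_name shape, and the fourth constraint (the prefix being defined on the site) is not checked because it would require an HTTP request.

```javascript
// A segment is treated as a CURIE if it has the prefix_name shape (assumption).
const CURIE_SEGMENT = /^([A-Za-z][A-Za-z0-9]*)_(.*)$/;

// Returns the type of the typed link between two hypernodes as "prefix:name",
// or null if the pair violates one of the constraints listed above.
function typedLinkType(subjectUri, objectUri) {
  const subject = new URL(subjectUri);
  const object = new URL(objectUri);

  // Constraint 1: both hypernodes are on the same web site.
  if (subject.origin !== object.origin) return null;

  // Constraint 2: the subject URI is contained in the object URI.
  const subjectPath = subject.pathname.replace(/\/$/, "");
  if (!object.pathname.startsWith(subjectPath + "/")) return null;

  // The difference of the two paths: one typed segment plus an optional key.
  const difference = object.pathname
    .slice(subjectPath.length + 1)
    .split("/")
    .filter((segment) => segment.length > 0);
  if (difference.length < 1 || difference.length > 2) return null;

  // Constraint 3: the first segment must be a CURIE in prefix_name form.
  const match = CURIE_SEGMENT.exec(difference[0]);
  if (!match) return null;

  // A second segment is allowed only if it is a key, not another CURIE.
  if (difference.length === 2 && CURIE_SEGMENT.test(difference[1])) return null;

  return `${match[1]}:${match[2]}`;
}

const chuckUri = "http://example.org/data_/chuck";
console.log(typedLinkType(chuckUri, chuckUri + "/foaf_knows/steven"));
console.log(typedLinkType(chuckUri, "http://other.example/foaf_knows/steven"));
```

With these inputs the first call yields foaf:knows (the key segment steven is removed), while the second yields null because the object is on a different site.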

Thanks to typed links, the URI of a hypernode is not just a random identifier, but adds a whole other dimension. A hypernode’s URI path tells us what the hypernode means, in terms of its relations to other hypernodes.

Key segments

In the previous example, we touched on the idea of key segments. Key segments are also tree links, but they don’t encode the type of a relation. A key segment does not encode a CURIE, but a local ID, or the key in the context (namespace) defined by a typed link.

Key segments are used with multi-valued properties. For instance, foaf:knows can have many values – people that a person knows. The corresponding hypernode is of a type called hyperarray, which contains items that must be differentiated by indexes or keys (a number or a string).

A hyperarray represents a class, which is a subclass of the property’s range. The relation (key segment) between a hyperarray and its items (hypernodes whose URIs end with key segments) always has the same meaning, just as hyperlinks do: “has an instance”. The hyperarray in our example is, therefore, a class representing “a person Chuck Norris knows”, and it is a subclass of foaf:Person, which is the range of the property foaf:knows. Its items are the instances of this class, representing concrete people.

"bruce and steven are foaf:Person chuck foaf:knows".

Here the key segments are in bold. The elements of a hyperarray can be hyperreferences or hyperobjects. In this case, bruce and steven are hyperreferences, while chuck is a hyperobject and an item of the hyperarray.

Every path can be expressed using just typed links and key segments. You can think of them as verbs and nouns. They can be combined in different ways to form “sentences”. In a human language, a sentence can usually be decomposed into a number of smaller, simpler sentences. The same holds true in Hypernotation: the URI of a hypernode is a “sentence” that can be broken into a number of smaller sentences (triples) that cannot be decomposed further.

The rules of decomposing URIs are simple–the number of triples in a URI equals the number of typed links (CURIE segments). If a key segment comes after the CURIE segment, the URI of the object is the URI of the subject + “CURIE/key”. If two CURIE segments are concatenated, the object doesn’t contain the key segment, and a CURIE segment is just added to the subject’s URI.
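The decomposition rules can be sketched in a few lines of JavaScript. This is an illustration under assumptions of mine, not a reference parser: a segment counts as a CURIE when it contains "_", key segments never contain "_", and the root URI http://example.org is a hypothetical placeholder. The special triple for the domain itself is left out.

```javascript
// Assumption: CURIE segments contain "_" (prefix_name), key segments do not.
function isCurieSegment(segment) {
  return segment.includes("_");
}

// Break a hypernode path into triples, one per CURIE segment.
function decomposePath(rootUri, path) {
  const segments = path.split("/").filter((s) => s.length > 0);
  const triples = [];
  let subject = rootUri;
  let i = 0;

  while (i < segments.length) {
    const segment = segments[i];
    if (!isCurieSegment(segment)) {
      throw new Error(`Expected a CURIE segment, got "${segment}"`);
    }
    // Only the first "_" is the prefix delimiter: foaf_based_near -> foaf:based_near
    const predicate = segment.replace("_", ":");
    let object = `${subject}/${segment}`;
    i += 1;

    // If a key segment follows, the object URI is subject + "CURIE/key".
    if (i < segments.length && !isCurieSegment(segments[i])) {
      object += `/${segments[i]}`;
      i += 1;
    }

    triples.push({ subject, predicate, object });
    subject = object; // the object becomes the subject of the next triple
  }
  return triples;
}

for (const t of decomposePath("http://example.org", "data_/chuck/foaf_knows/steven")) {
  console.log(`<${t.subject}> ${t.predicate} <${t.object}> .`);
}
```

The path data_/chuck/foaf_knows/steven has two CURIE segments and therefore yields two triples, while data_/chuck/foaf_name/rdf_value has three CURIE segments (two of them concatenated without a key) and yields three.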

Introducing Hypernotation, an alternative to Linked Data

URL, URI, IRI, URIref, CURIE, QName, slash URIs, hash URIs, bnodes, information resources, non-information resources, dereferencability, HTTP 303, redirection, content-negotiation, RDF model, RDF syntax, RDFa core, RDFa lite, Microdata, Turtle, N3, RDF/XML, JSON-LD, RDF/JSON…

Want to publish some data? Well, these are some of the things you will have to learn and understand to do so. Is the concept of data really so hard that you can’t publish it without understanding the concepts of information and non-information resources? Do you really need to deal with the HTTP 303 redirection and a number of different syntaxes? It’s just data, damn it!

Really, how have we got to this?

I did a detailed analysis on the problems of Linked Data, but it seems that I missed the most important thing. It’s not about the Web technologies but about economics. The key Linked Data problem is that it holds a monopoly in the market. One can’t compare it to anything else, and thus one can’t be objective about it. There is no competition, and without competition, there is no real progress. Without competition, it’s possible for many odd ideas to survive, such as requiring people to implement HTTP 303 redirection.

Of course, one can argue that there is a diversity of syntaxes that can describe structured/linked/meta- data, but Linked Data is more than just a syntax. It’s a set of rules and values defining a framework based on the Web technologies. It attempts to apply the principles and technologies of the Web to the data world. It tries to project a (RDF) data graph to the Web graph using HTTP URIs as identifiers that enable resolvability.

This is what I mean when I refer to the lack of real alternative. What other method of publishing data follows the Linked Data principles/values? What other approach is there for building the Web of data using the same principles of the original Web of documents?

In my opinion this idea is huge, and Linked Data has clearly opened a new research space. But Linked Data is just one approach. Although based on the right starting ideas, it has gone the wrong direction.

Put in a wider perspective, that is kind of natural because Linked Data has been the first player in the game. It’s not reasonable to expect a perfect solution for all the problems in a period of just five years. Linked Data did some things right, and some things wrong. It has shown where the real problems are and what needs to be changed. It has paved the way for the evolution of new, alternative approaches.

A new, alternative approach

Here I am going to introduce an alternative to Linked Data called Hypernotation. Hypernotation is, like Linked Data, a method of publishing data on the Web.

In the last post I described the notion of projecting data in the form of a labeled, directed (RDF) graph to the Web graph. Hypernotation is a sort of a framework that enables the projection to happen in practice on the global scale. It sets a small number of universal rules and conventions that result in a consistent system–the Web of data.

The main conceptual difference between Hypernotation and Linked Data is in the level of granularity. In Linked Data there is a concept of “RDF molecule” as an element corresponding to the lowest level of granularity, that loosely refers to a set of triples describing a resource.

Hypernotation is focused on a finer level of granularity, dealing with atomic data that is then composed into more complex structures. Atomic data is represented via nodes of the Web of data graph. This node, called hypernode, can represent a thing (resource) or a literal value. Each hypernode is identified by an HTTP URI and is the object of the RDF triple whose subject and predicate are encoded in that HTTP URI. This makes the node inseparable from the triple that contains it.

Triples are basic building blocks for creating various data structures–objects, arrays, primitive types, references, literals. Those are further organized within collections of related objects, being part of graphs located on websites. Finally, all interconnected graphs form the global Web of data.

Hypernotation connects two related models: RDF and the object-oriented model. The result of this relationship is the merging of the ideas of URI reference and object into a new concept called hyperobject. Its realization is inspired by a third model, the hierarchical file system. By using these two well-known paradigms, the RDF model becomes much closer to web developers and ordinary people.

The idea of folders is something that most people using computers understand. If you know how to make a folder and navigate through a file system, and if you understand the idea of hyperlinks, you know everything you need to understand how Hypernotation works. After all, the Web is just a bunch of trees plus shortcuts.

The object-oriented model is, on the other hand, the model most programmers understand. A data model in which every node has a URI (that is a path) makes it possible to elegantly create objects out of triples and vice versa. This means much easier manipulation of triples and graphs in a programming environment.

Unlike the Web of documents, where the opacity axiom defines the concept of a URI by telling us that the content of the URI itself is irrelevant, the HTTP URI used in Hypernotation is a machine-readable path using the RDF graph URI pattern. It is fully transparent and contains the information that unambiguously defines the relationship between the resources represented by nodes.

The idea of dot notation is applied to the HTTP URI path, where slashes (/) are used instead of dots (.). This way a system of namespaced variables is created that can be accessed by simple HTTP requests. Hypernotation is primarily represented using HTTP URIs on the Web, but it is a flexible model that can be encoded in different formats and easily processed programmatically. Thanks to the fact it’s optimized for the hierarchical structure, Hypernotation can be elegantly written in both JSON and XML formats.
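The correspondence between dot notation and URI paths can be illustrated with a small sketch. The function names and the sample object are made up for this illustration; the only idea taken from the text is that "." in a variable name maps to "/" in a path, so a nested object flattens into a set of paths.

```javascript
// Convert a JavaScript-style dotted name into a slash-delimited URI path.
// Only meant for the path part: a domain name contains dots of its own.
function dotToSlash(dotted) {
  return dotted.split(".").join("/");
}

// Walk a nested object and list every leaf value together with its path.
function toPaths(node, path = "") {
  if (node === null || typeof node !== "object") {
    return [{ path, value: String(node) }];
  }
  return Object.entries(node).flatMap(([key, child]) =>
    toPaths(child, path === "" ? key : `${path}/${key}`)
  );
}

// Illustrative data only; property names follow the prefix_name convention.
const tree = {
  data_: {
    chuck: {
      foaf_name: "Chuck Norris",
      foaf_based_near: { geo_lat: "45.45385", geo_long: "10.19273" },
    },
  },
};

console.log(dotToSlash("data_.chuck.foaf_name"));
for (const { path, value } of toPaths(tree)) {
  console.log(`/${path} -> ${value}`);
}
```

Because Object.entries also works on arrays, an array value would flatten into paths ending with numeric indexes, which matches the idea of array-like nodes addressed by index.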

Hypernotation does to data what the Web has done to documents. With the emergence of the Web, every web document suddenly got a unique global address. Just try to imagine explaining to people how to get to some content every time instead of just sending them the URL. Now apply that to the data context: how radically will the world (of data) change if every piece of data suddenly obtains a unique global address? Sharing URIs is perhaps even more relevant in the context of “machines” communicating with each other.

Hypernotation builds upon the basic Linked Data ideas and attempts to correct its mistakes. Hypernotation implements those ideas consistently while respecting the true nature of the Web, allowing us to finally use the full potential of the Web technologies. The result is a framework based on a small set of rules and conventions that enables a great level of simplicity and flexibility on one side, and a tremendous power on the other.

Hypernotation is based on the improved RDF model–one in which all nodes are identified by URIs. Like Linked Data, Hypernotation relies on the fundamental Web technologies HTTP and URI. However, it uses the third core Web technology as well–HTML.


I have mentioned two new concepts: hypernode and hyperobject. The Web of data is a graph consisting of nodes and links between them. All the nodes are identified by HTTP URIs and are called hypernodes. A hypernode is an abstract element that can take different roles. It can become an object, an array, a literal, a reference… You can think of them as variables, just like in a programming language. In the context of Hypernotation (or the Web of data), they get the prefix (hyper-). Therefore, an object becomes a hyperobject, an array a hyperarray and so on.

In order to describe a typical hyperobject, we’ll need an example. Let’s use the same RDF graph example we used in the previous posts. The following image shows three versions of the RDF graph: the first based on the current RDF model, the second based on the improved RDF model (every node is identified by a URI) and the third, which is further extended by the extra nodes needed to project the whole graph to the Web.

The green bold ellipses depict the difference between the adjacent graphs. All web resources displayed as ellipses in the third graph represent hypernodes in the Web of data.

Take a moment to look at these three graphs. The first is human-friendly – easy to read but contains special cases – blank nodes and literals. The second one is more consistent – each node has a URI, which comes at the price of having additional nodes. The last graph adds a few extra nodes that make the graph Web-friendly, allowing the complete projection to the Web and the full traversal through the graph.

The node is a typical hyperobject. Like a URI reference in Linked Data, it is identified by an HTTP URI and returns useful information when looked up. It represents some object (a person) that is described with properties, like an object in the object-oriented sense.

If you are running over this graph and get to the node, you can continue the journey using three roads – three branches directed towards the adjacent nodes: foaf_name, foaf_based_near and foaf_knows.

Therefore, if you look up the node, this is what you’ll get: a list of three links (predicates) describing the node and representing new paths you can follow to continue traversal.

What makes these three links interesting is that they are typed links. For instance, the first link has a type as well: foaf_name (foaf:name) is just a shorter way of writing the full property URI. This means that the following triple can be extracted:

    <> <> .

Hypernotation is based on the idea that one shouldn’t reinvent elements on the Web. There are already Web elements for an unordered list and a hyperlink that evolved on the Web as semantic HTML tags <ul> and <a>. These are actually the only two elements that are needed for publishing data using Hypernotation.

Therefore, the list representing the node’s “point of view”, containing the links directed to other hypernodes, is encoded using HTML. After the lookup, the server will return “200 OK” and the content will contain this syntax:

<ul>
    <li><a href="foaf_name">name</a></li>
    <li><a href="foaf_knows">knows</a></li>
    <li><a href="foaf_based_near">based_near</a></li>
</ul>

This approach brings several advantages. First, the Web of data can be browsed (traversed) using a regular Web browser. No need for special RDF or Linked Data browsers.

Second, if accessed programmatically, the parsing is easy. It doesn’t require special RDF libraries – a simple XML or HTML (DOM) parser will do the job. Parsing can even be done using regular expressions. After all, <ul>, <li> and <a> are the only syntax elements you can get.

Third, HTML code can be used for describing data for humans and „machines“ at the same time–no need for two versions. In this example, the href attributes contain the CURIEs (links) for machines to „understand“ the data, while the part between <a> and </a>, which is visible in the browser and is not a part of the RDF graph, contains friendlier prefix-free names intended for people. This part is not parsed and can contain any information or HTML tag (such as <img>) that can help with describing the resource.

Finally, this approach is easy to understand. The hyperobject can be understood as the folder chuck located in the folder data_. Its URI is the path telling you where you are, the same way a folder or file path contains the location on a hard disk. Furthermore, the properties foaf_name, foaf_knows and foaf_based_near are just the subfolders of the chuck folder. If we open one of them, we’ll get their subfolders and so on.
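As for the second advantage, here is a rough sketch of what regular-expression parsing could look like. The function extractLinks and its pattern are my own illustration and assume the markup really is limited to <ul>, <li> and <a> with double-quoted href attributes; for anything less constrained, a DOM parser would be the safer choice.

```javascript
// Extract { href, label } pairs from a hypernode's HTML list.
// Assumes double-quoted href attributes and no nested <a> elements.
function extractLinks(html) {
  const pattern = /<a\s+href\s*=\s*"([^"]*)"\s*>([\s\S]*?)<\/a>/g;
  return Array.from(html.matchAll(pattern), (match) => ({
    href: match[1],  // machine-readable part (a CURIE or a key)
    label: match[2], // human-readable part, not part of the RDF graph
  }));
}

const response = `
<ul>
    <li><a href="foaf_name">name</a></li>
    <li><a href="foaf_knows">knows</a></li>
    <li><a href="foaf_based_near">based_near</a></li>
</ul>`;

for (const link of extractLinks(response)) {
  console.log(`${link.href} (${link.label})`);
}
```

Only the href values matter for building triples; the labels are kept here just to show that the human-friendly part can be read from the same markup.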

As a matter of fact, you can publish data by literally creating folders. This is obviously not the most elegant way to do it, but it’s a completely legitimate way of using Hypernotation. The result is a universal interface, and a person accessing data doesn’t have a clue if you’ve created a bunch of folders or used some powerful engine on the backend.


In the previous posts I’ve described how URIs are assigned to all nodes of an RDF graph. For example, the URI of a property node is formed by concatenating the property URI (predicate) in the CURIE form to the URI of the resource (subject). Therefore, the resulting hypernode, besides identifying a node in an RDF graph, encodes the triple:

    foaf:name <> .

This Web resource’s URI is thus comprised of three URIs: the subject, the predicate foaf_name (the CURIE for the property URI) and the object. An HTTP request sent to this URI, as in the example above, returns a list of typed links directed from this node to its adjacent nodes. In this case the list will contain only one link:

<ul>
    <li><a href="rdf_value">value</a></li>
</ul>

This is the link to the URI of a hyperliteral, a hypernode that differs in that it represents a literal value. This node doesn’t branch further, and an HTTP request sent to it returns a response “200 OK” and content:

Carlos Ray Norris

Hypernodes described so far (hyperobjects and hyperliterals) correspond to the RDF concepts of URI references and literals, representing RDF graph nodes that are easily projected to the Web. However, there are cases where the implementation of RDF graph on the Web requires additional nodes that don’t exist in the original RDF graph.


In the above image, there are two triples:

<> foaf:knows <> .
<> foaf:knows <> .

When a property is multi-valued, simply concatenating a subject URI and a predicate CURIE (in this case foaf_knows) is not enough. It is therefore necessary to add another part to the URI–a key which is arbitrary but must be unique, relative to other keys of the same level. Therefore, the resulting node acts like an array with keys bruce and steven and is thus called a hyperarray. After lookup, the list containing the keys is returned:

<ul>
    <li><a href="bruce">bruce</a></li>
    <li><a href="steven">steven</a></li>
</ul>

The difference between this list and the list returned by looking up the hyperobject is that bruce and steven are not CURIEs. This way, the parser will know they are not predicates but keys used to distinguish between the different objects of the triples containing the subject and the multi-valued property foaf:knows.
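A parser could tell predicates from keys with a check as simple as the one below. This is a sketch under an assumption that the text does not state explicitly: a link value is taken to be a CURIE if it contains "_", so keys must not contain that character. The subject URI uses the hypothetical domain example.org.

```javascript
// Split the href values of a list into predicates (CURIEs) and keys.
// Assumption: CURIEs contain "_" (prefix_name), keys do not.
function classifyLinks(hrefs) {
  const predicates = [];
  const keys = [];
  for (const href of hrefs) {
    if (href.includes("_")) {
      predicates.push(href.replace("_", ":")); // foaf_knows -> foaf:knows
    } else {
      keys.push(href);
    }
  }
  return { predicates, keys };
}

// Build one triple per key of a multi-valued property.
function hyperarrayTriples(subjectUri, curieSegment, keys) {
  const predicate = curieSegment.replace("_", ":");
  return keys.map((key) => ({
    subject: subjectUri,
    predicate,
    object: `${subjectUri}/${curieSegment}/${key}`,
  }));
}

const { keys } = classifyLinks(["bruce", "steven"]);
const subject = "http://example.org/data_/chuck"; // hypothetical URI
for (const t of hyperarrayTriples(subject, "foaf_knows", keys)) {
  console.log(`<${t.subject}> ${t.predicate} <${t.object}> .`);
}
```

Each key produces its own object URI, so the two foaf:knows triples stay distinct even though they share the subject and the predicate.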


What will you GET when looking up one of these items? A simple hyperlink:

<a href="">Bruce Lee</a>

Again, the part between <a> and </a> is human-friendly. What you’ll put into it is up to you, although some conventions could be useful here–e.g. using rdfs:label, foaf:name, dc:title or similar properties based on the type of the resource. In this case foaf:name is used because the resource is of type foaf:Person.

This type of hypernode is called a hyperreference, because its purpose is to hold a reference to another hypernode.


A hypernode is a basic unit of the Web of data–an element that makes up its structure and is connected with other hypernodes. It is flexible in that it can take on different roles. In general, it returns a description of a resource the node represents. This description can be the list of adjacent nodes it is connected to, the data held by a hyperliteral, or the link of a hyperreference.
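The roles described in this post could be recognized from the response body alone. The function below is only a guess at how a client might do that, based on the three kinds of responses shown above (a <ul> list, a single <a> element, or plain text); the rule for telling a hyperobject from a hyperarray (all links containing "_") is my assumption, and a literal that happens to start with markup would be misclassified.

```javascript
// Guess the role of a hypernode from the body of its "200 OK" response.
function hypernodeRole(body) {
  const text = body.trim();

  // A list of links: either typed links (hyperobject) or keys (hyperarray).
  if (/^<ul[\s>]/i.test(text)) {
    const hrefs = Array.from(
      text.matchAll(/href\s*=\s*"([^"]*)"/g),
      (match) => match[1]
    );
    // Assumption: CURIE links contain "_", key links do not.
    return hrefs.every((href) => href.includes("_"))
      ? "hyperobject"
      : "hyperarray";
  }

  // A single hyperlink pointing to another hypernode.
  if (/^<a\s/i.test(text)) return "hyperreference";

  // Anything else is treated as the literal value itself.
  return "hyperliteral";
}

console.log(hypernodeRole('<ul><li><a href="foaf_name">name</a></li></ul>'));
console.log(hypernodeRole('<ul><li><a href="bruce">bruce</a></li></ul>'));
console.log(hypernodeRole('<a href="http://example.org/data_/bruce">Bruce Lee</a>'));
console.log(hypernodeRole("Carlos Ray Norris"));
```

The four calls correspond to the four kinds of hypernodes met so far: hyperobject, hyperarray, hyperreference and hyperliteral. The URI in the third call is a hypothetical placeholder.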

This is the first post in the Hypernotation series, where I’ve described the principles of Hypernotation and the basic structure of the Web of data. I hope I managed to explain the key ideas behind Hypernotation. Don’t worry if you didn’t understand everything. In future posts I am going to describe all aspects of publishing and consuming data in the context of Hypernotation in more detail.

The Challenge of Building the Semantic Web

The Semantic Web is often described as an extension of the current Web. The idea of what extending the Web should look like can be seen in Linked Data.

In order to better understand the importance of Linked Data, one has to understand the context in which it emerged, i.e. the problem it has been trying to solve.

In Linked Data – design issues in 2006 Tim Berners-Lee wrote:

 Many research and evaluation projects in the few years of the Semantic Web technologies produced ontologies, and significant data stores, but the data, if available at all, is buried in a zip archive somewhere, rather than being accessible on the web as linked data.

Put in this perspective, Linked Data did an important thing – it required that data is actually put on the Web, and demanded that resolvable (HTTP) URIs are used as identifiers. It set the rules of how to use the existing Web technologies to publish and connect structured data on the Web. The data that extends the old Web is interconnected and itself forms a new Web, often referred to as the Semantic Web or the Web of data.

Therefore, one can say that Linked Data paved the way for structured data to evolve into what really can be considered as some sort of a web. This web, the Web of data, is perhaps not so magnificent as once seen in the Semantic Web vision, but, for the first time, the „Web“ part of the „Semantic Web“ has started to take off.

Different data models

The Web is a type of graph where all nodes (web resources) are identified by URIs and edges (links) have a direction, but not a name. This kind of a graph is called a directed graph.

On the other hand, the RDF model is a graph as well, albeit a different kind – a labeled, directed graph. It differs from the Web graph in that its edges (links) besides a direction have a label. Some nodes in an RDF graph can be identified by URIs (URI references), but in addition there are nodes that are not identified at all (blank nodes), and ones that are identified by their value and not by a URI (literals).

      | RDF graph                      | Web graph
Type  | Directed labeled graph         | Directed graph
Edges | Have direction and labels      | Have direction
Nodes | Not all are identified by URIs | Identified by URIs

The above table shows the comparison between the two graphs. It’s clear that there are significant differences, but there are also a couple of things common to both models:

  • nodes identified by URIs
  • all edges have a direction

Linked Data has been utilizing these similarities in order to „project“ an RDF graph to the Web graph. RDF nodes identified by URIs become web resources and the part of a graph describing them is retrieved by dereferencing those URIs. Other nodes in the RDF graph that are not identified by URIs are obviously not assigned a web resource; they are simply encoded in triples using RDF notations. Labeled (or typed) links are in a similar way realized on the syntax level.
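The projection described above can be pictured with a tiny data structure. This is an illustrative sketch only: the triples, the URIs under example.org and the function project are invented for the example, and the node kinds ("uri", "blank", "literal") are simply the three kinds of RDF nodes mentioned earlier.

```javascript
// Node kinds in the current RDF model: "uri", "blank" or "literal".
const chuckNode = { kind: "uri", value: "http://example.org/chuck" };
const sampleTriples = [
  { subject: chuckNode, predicate: "foaf:name",
    object: { kind: "literal", value: "Chuck Norris" } },
  { subject: chuckNode, predicate: "foaf:based_near",
    object: { kind: "blank", value: "_:place" } },
  { subject: chuckNode, predicate: "foaf:knows",
    object: { kind: "uri", value: "http://example.org/bruce" } },
];

// Split the nodes into those that become web resources and those that
// exist only inside the RDF notation returned for a resource.
function project(triples) {
  const webResources = new Set();
  const syntaxOnly = [];
  for (const { subject, object } of triples) {
    for (const node of [subject, object]) {
      if (node.kind === "uri") webResources.add(node.value);
      else syntaxOnly.push(node.value);
    }
  }
  return { webResources: [...webResources], syntaxOnly };
}

console.log(project(sampleTriples));
```

Only the two URI nodes end up as web resources; the literal and the blank node have no address of their own and live only in the serialized triples.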

Considering all the limitations it faced, Linked Data has offered perhaps the only reasonable solution. Of course, one can argue that there are many unnecessarily complicated aspects of it, partly caused by the same limitations and partly because of a number of problematic decisions.

But the essential idea of Linked Data seems right. After all, having all the differences between the two kinds of graphs, what is an alternative? Is there really any better way to project the RDF graph to the Web?

The problem

The key problem is not in Linked Data itself. Given all the circumstances, it does a pretty good job in “adjusting the data” for the Web. But the problem is the mere idea of „adjusting“ data, which implies that data is not modeled for the Web in the first place.

That is the wrong way of solving the problem. It is not the data that needs to be adapted, but the model the data is based on.

One can argue that the RDF model is a good, even a perfect model that implements the ideas of description logic. But from the evolutionary perspective, perfection only ever means perfect adaptation to an environment; it is not measured against some absolute criteria.

The Web is based on simple and pretty clear rules. One of them is that all nodes (resources) are identified by URIs. One could ask how it is possible at all that a model with the ambition to live on such a Web has nodes that don’t follow this obvious requirement of the environment.

On the most basic, data model level, the directed graph has to somehow hold all the information of the directed, labeled graph. In other words, every element of the RDF graph has to be projected to some element of the Web graph.

A logical assumption is that for every node of an RDF graph, there has to be a corresponding node of the Web graph. But if we try to do that, soon we will face serious problems.

Let’s use the same RDF graph example we used in the previous posts.

We can see from the above image that although URI references can be projected to the nodes on the Web identified with the same URIs, projecting blank nodes and literals is impossible.

Another challenge is projecting edges (links) from the RDF graph to the Web graph. Even if we find a way to project all the nodes, how do we project the links’ labels onto directed edges (hyperlinks)?

We can compare this kind of projection with projecting a 3D object onto a 2D plane, where the challenge is to map the information from the 3D space to the space with one dimension less.

Following this metaphor, we can refer to the label of a link as that third dimension. The hyperlinks in the (directed) Web graph just don’t have that dimension.

Projecting nodes

The good thing is that we have one clear requirement: to project an RDF graph to the Web graph, all nodes in the RDF graph must be identified by URIs. There is simply no other way to name nodes on the Web.

Therefore, the only way is to change the RDF model, so that every node gets a name – not just any name, but a URI. Blank nodes are not acceptable any more and the method of identification of literals must be changed, so that they are identified by URIs and not by values they represent.

Changing the RDF model may sound like a ridiculous idea at this time. After all, it has been used for all these years and has proved to work in various contexts. The problem is that it doesn’t work properly in the Web context. If we want to build the Semantic Web, the RDF model must adapt to the Web environment.

On this blog I’ve already described how each node in the RDF model can be assigned a URI using a very simple method, so I won’t go into details here. In short, the only way to do so is by utilising paths, i.e. using names for nodes that correspond to graph traversal. In this way we can assign URIs to what were previously considered blank nodes. The same principle can be applied to literals as well. Therefore, with relatively small changes to the model, all nodes can be assigned a URI.
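
As a rough illustration of naming by path, the sketch below walks a list of triples and replaces every blank node with a URI built from the URI of the subject that leads to it. It is only a sketch under assumptions: the function name `name_blank_nodes`, the example.org URI and the placeholder literal are made up, blank nodes are written as strings starting with `_:`, and the segment is the predicate CURIE with the colon turned into an underscore (as in foaf_based_near). Repeated predicates on the same subject are not handled here.

```python
def name_blank_nodes(triples):
    """Replace blank-node labels ('_:...') with path-based URIs.

    A blank node reached from subject S via predicate p:q is named
    S + '/' + 'p_q'. If a blank node is reachable by several paths,
    the first path found wins.
    """
    names = {}
    changed = True
    while changed:  # repeat until no new blank node can be named
        changed = False
        for s, p, o in triples:
            s_uri = names.get(s, s)
            if s_uri.startswith("_:"):
                continue  # the subject itself has no URI yet
            if o.startswith("_:") and o not in names:
                names[o] = s_uri + "/" + p.replace(":", "_", 1)
                changed = True
    return [(names.get(s, s), p, names.get(o, o)) for s, p, o in triples]


# Hypothetical input; the literal is a placeholder, not real data.
triples = [
    ("_:b0", "geo:long", "(some longitude)"),
    ("http://example.org/data_/chuck", "foaf:based_near", "_:b0"),
]

named = name_blank_nodes(triples)
for triple in named:
    print(triple)
```

After the call, `_:b0` no longer appears anywhere. It has become the node at the end of the path chuck / foaf_based_near, and literals are left untouched by this particular sketch.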

In the context of projecting nodes, special attention must be given to literals. Literal nodes are special in that they hold two pieces of information: a name (URI) and the data they represent. Therefore, they can be referenced by name and by value.

In the Web context, nodes are always referenced by their names (URIs). But in the context of RDF notations, it makes more sense to reference a literal by its value. Therefore, one must have a way to come up with a literal URI based on the context.

Take an example where two literals are referenced by values:

<> foaf:nick "Chuck Norris" .
<> foaf:nick "Fatality" .

What are the URIs of the literals? We can mint the <>, but there is simply no additional information to differentiate the two. This can be solved with an additional node that takes place between the URI reference and the literal:

    foaf:nick <> .

    rdf:value "Chuck Norris" .

    foaf:nick <> .

    rdf:value "Fatality" .

Now, we can easily come up with the URIs for both literals:

<> and <> respectively.

Here, rdf_value is used as a convention: it is the segment with which every literal’s URI ends. In future posts, we’ll see that this approach to literals adds other important benefits, especially when it comes to bringing the RDF model and the OO model closer together.
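
The intermediate-node trick can be sketched in a few lines of Python. The exact URIs of the intermediate nodes are not visible in the snippets above, so the naming used here (a numeric key appended after the predicate segment) is an assumption, as are the function name `literal_triples` and the example.org subject. Only the rdf:value triple and the trailing rdf_value segment come from the text.

```python
def literal_triples(subject_uri, predicate, values):
    """Give each literal value its own intermediate node and URI.

    Returns (triples, literal_uris). The numeric key used to tell the
    intermediate nodes apart is an assumption made for this sketch.
    """
    segment = predicate.replace(":", "_", 1)  # foaf:nick -> foaf_nick
    triples = []
    literal_uris = []
    for key, value in enumerate(values, start=1):
        node = f"{subject_uri}/{segment}/{key}"
        triples.append((subject_uri, predicate, node))
        triples.append((node, "rdf:value", value))
        # By convention, the literal's own URI ends with rdf_value.
        literal_uris.append(node + "/rdf_value")
    return triples, literal_uris


triples, uris = literal_triples(
    "http://example.org/chuck", "foaf:nick", ["Chuck Norris", "Fatality"]
)
for uri in uris:
    print(uri)
```

The two nicknames now end up with two different URIs, even though the subject and the predicate are the same for both.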

In any case, we made the first important step in the challenge of projecting the RDF graph to the Web graph the proper way. Now we have the RDF model in which all nodes are identified by URIs.

As seen in the above image, every node in the RDF graph can be projected to a node in the Web graph.

Projecting edges

We’ve figured out how to assign a URI to all RDF nodes, but what about the inherently nameless links in the Web graph? How do we project the labeled edges of the RDF model onto the nameless edges of the Web graph?

In the first problem we had to change the RDF model. Does it mean that we must change the Web to come up with named links? After all, there is no way for directed edges in a directed graph to hold any other information than a direction.

Fortunately, no.

The thing is, the Web is not just a random directed graph. It contains websites which form hierarchical structures. These trees are comprised of nodes from a single website, organized in a hierarchical order.

In a tree, the URI of every child node is the URI of its parent + „something“. That „something“ is what links these two nodes, representing the information that a „tree“ link holds, i.e. its name. These links can have a name, being quite different from the typical nameless „graph“ (hyper)links.

The ability for „tree“ links to hold a name is what we are looking for. Now here comes the exciting part. Nodes linked with tree links form paths, the same kind of paths we used for naming all the nodes in the RDF model! In fact, the named (or typed) links appear almost by themselves when using the concept of paths to assign all nodes names.

In the RDF context, in a tree path, the parent + „something“ equals the subject URI + predicate CURIE, and the URI created this way becomes the URI of the object (child).

      <> .

The predicate URI is represented in the form of a CURIE (prefix:name), which needs a definition of the prefix. Given that a website can be seen as the namespace of all of its web pages, the prefix is defined for every website separately.

For now, let’s see what projecting the labeled edges will look like in practice:

In the above image, one can see that CURIEs represent the differences between the URIs of adjacent nodes in the hierarchy. Or, from the RDF point of view, the differences between the URIs of the subject and the object in a triple.
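
Reading a label back from two adjacent URIs is then a matter of taking their difference. The sketch below assumes the segment convention seen in this post (prefix, underscore, name, as in foaf_based_near) and that a prefix never contains an underscore itself. The function name `edge_label` and the example.org URIs are hypothetical.

```python
def edge_label(parent_uri, child_uri):
    """Return the CURIE that names the tree link parent -> child,
    or None if the two nodes are not directly linked by one."""
    base = parent_uri.rstrip("/") + "/"
    if not child_uri.startswith(base):
        return None  # not below the parent, so not a tree link
    segment = child_uri[len(base):]
    if "/" in segment or "_" not in segment:
        return None  # more than one level apart, or not a prefix_name segment
    prefix, name = segment.split("_", 1)
    return f"{prefix}:{name}"


chuck = "http://example.org/data_/chuck"
print(edge_label(chuck, chuck + "/foaf_based_near"))
print(edge_label(chuck + "/foaf_based_near", chuck + "/foaf_based_near/geo_long"))
print(edge_label(chuck, "http://other.example/bruce"))
```

The last call returns None: a node on another website is not a child in the hierarchy, so there is no URI difference that could carry the label.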

However, what about the edges targeting nodes on external websites, which are not part of the hierarchy? In the image, these edges are marked with red question marks: another problem we have to solve.

Projecting references

In the RDF graph example, there are triples in which the object is not the child in a hierarchy. For instance:

<> foaf:knows <> .

In the Turtle notation, this triple expresses all the information we need to know. However, there are actually two distinct concepts here: the reference and the target node. The reference can be understood as a variable, or a node that points to the target. If we want to project this relation, we must separate these two concepts. In other words, we will need an additional node that will act like the reference.

If we follow the same principles as in the case where the object is a child node, we will end up with a URI like this:


This way, we encoded the information about the link as the difference between the nodes <> and <> .

In the Web context, "pointing to" means "linking to". Thus, we will use hyperlinks to connect references with targets. Therefore, the URI <> identifies the reference that holds a hyperlink to <>.

It says that the first node is the same as the second, meaning the hyperlink represents the „same as“ relation. Because hyperlinks can’t hold a name, they all have to share the same meaning. In the RDF context, this meaning can be expressed with the property owl:sameAs.
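
A reference node can be sketched the same way as a child node, plus one hyperlink. In the code below, the reference URI is built by appending the predicate segment to the subject URI, which is my reading of "the same principles" and therefore an assumption. The function names and the example URIs are hypothetical as well.

```python
def project_reference(subject_uri, predicate, target_uri):
    """Split one triple with an external object into a named tree link
    to a reference node and a nameless hyperlink to the target."""
    reference_uri = subject_uri + "/" + predicate.replace(":", "_", 1)
    return [
        (subject_uri, predicate, reference_uri),     # tree link, named by the CURIE
        (reference_uri, "owl:sameAs", target_uri),   # hyperlink, always "same as"
    ]


def resolve(triples, node):
    """Follow owl:sameAs hyperlinks from a reference node to its target."""
    same_as = {s: o for s, p, o in triples if p == "owl:sameAs"}
    seen = set()
    while node in same_as and node not in seen:  # guard against cycles
        seen.add(node)
        node = same_as[node]
    return node


projected = project_reference(
    "http://example.org/chuck", "foaf:knows", "http://other.example/bruce"
)
reference = projected[0][2]
print(reference)
print(resolve(projected, reference))
```

Because every hyperlink means the same thing, resolving a reference never needs to look at a label; it just follows the link.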

In the image, three hyperlinks are depicted in the blue color. Also, one can notice the intermediate node taking place between and the two child reference nodes.

The tree links between this node and its child nodes are not named by a CURIE, and thus not defined explicitly. Their names are the "keys" that distinguish the members of the "array" the node represents. The implied property between the members and node is rdf:type, because is a subclass of the range of the foaf:knows property (i.e. foaf:Person). This will be discussed in more detail in future posts.

Prefix definitions

We are almost finished. The final thing we have to do is to provide a "dictionary": the definitions for all the prefixes used in CURIEs.

The question is whether it’s better to place all data together with prefix "subtree" on a single tree branch in a website or to use several branches. The latter approach allows binding the prefix_ segment directly to the website address which results in somewhat nicer URIs.

In practice it will look like this: one branch of the website tree is used for projecting the RDF graph, and another is used for prefix definitions, similar to how it’s done in RDF notations.

In this case, the prefixes rdf, foaf and geo that are used in CURIEs are the child nodes of the standard prefix_ node. These nodes are reference nodes to relevant namespace URIs. For instance, the foaf prefix will be defined as follows:

<> prefix: <> .
<> owl:sameAs <> .
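
Once the prefixes are known, any CURIE segment in a path can be expanded to the full predicate URI. In the sketch below a hard-coded dictionary stands in for the prefix_ branch of a website; on the Web the values would come from following the reference nodes. The namespace URIs are the usual ones for RDF, FOAF and the W3C Basic Geo vocabulary, and assuming that geo means the latter is my guess. The function name `expand_segment` is hypothetical.

```python
# Stand-in for the website's prefix_ branch (assumed namespace URIs).
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "geo": "http://www.w3.org/2003/01/geo/wgs84_pos#",
}


def expand_segment(segment, prefixes=PREFIXES):
    """Expand a path segment such as 'foaf_based_near' to a full URI."""
    prefix, _, name = segment.partition("_")
    if not name or prefix not in prefixes:
        raise KeyError(f"no prefix definition for segment {segment!r}")
    return prefixes[prefix] + name


for segment in ("foaf_based_near", "geo_long", "rdf_value"):
    print(segment, "->", expand_segment(segment))
```

Since the dictionary belongs to the website, two websites are free to bind the same prefix to different namespaces without getting in each other's way.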

In the image below, the namespaces are shown as single nodes for simplicity. They are of course nodes in the websites forming similar structures as the other depicted websites.

The final image shows the projected RDF graph in a simpler and friendlier way. As opposed to the previous image, where full URIs are written together with redundant typed links, here the nodes' names are CURIE segments. The name of a node represents the predicate between it and its parent node. The full URI of any node can be obtained by connecting all the nodes in the hierarchy.

Literals are terminal nodes and are represented as leaves, while their CURIEs are not shown explicitly. A literal's URI must always end with the rdf_value segment, so that segment is implied. For instance, the URI of the rightmost leaf can be obtained in the following way:

   + '/' + 'data_'
      + '/' + 'chuck'
         + '/' + 'foaf_based_near'
            + '/' + 'geo_long'
               + '/' + 'rdf_value' =


Reference nodes have a blue border and provide the connection to a target node using a hyperlink (blue arrow). They can also be considered terminal nodes in the website context because, like literals, they can’t branch further.

The final image shows that the projection of an RDF graph onto the Web is possible and can be done in an elegant way.

What the Web looks like

How do you imagine the Web (or do you at all)?

I imagine it like this:

First, let’s imagine a small website. The website typically has a hierarchical structure, i.e. a tree:


Here, the circles are web pages – the big one is a homepage, smaller circles are the next level of hierarchy and so on. The web pages of the same level are drawn at the same distance from their parent web page.

The web pages are connected with tree links. “Tree” links are links that can connect just a parent with a child in a hierarchy, hence the name.

As we know, trees can be organized in a number of different ways. Also, websites can vary from a few web pages to millions of them, causing a huge variety of possible arrangements and shapes.


But the Web with just tree links would be a boring place. Websites would be isolated islands not aware of each other. Also, the navigation between the web pages on a single website would be limited by the hierarchical order of web pages. The only way to get to a desired destination would be to find the right branch, and then, one node at a time, travel to your target web page. Just like you do every day when browsing through the folders in your file system.

Fortunately, there is another kind of link: a hyperlink. This link is not restricted by a hierarchical order and can magically connect any two web pages, not just in a single website, but on the whole Web. No matter how low in a hierarchy, when it comes to a hyperlink, every web page is equal.

Therefore, hyperlinks “break” the hierarchical order of a website and cause the tree to become a graph, so we can also call them “graph” links. An analog in a file system is a shortcut that lets you “teleport” to any place on the disk.

With added hyperlinks, our four websites will look something like this:


We have added the third dimension. That way we ensure that the hyperlinks don’t overlap.

Now, let’s try to imagine that our little “Web” is growing. New domains are registered and new websites are emerging. New hyperlinks are added pointing from the new web pages to the existing ones and vice versa. As we see in the next image, everything is pretty much the same, just on a larger scale.


With a fair amount of certainty we can conclude that if we add more and more web pages and finally reach the real size of the Web, the same basic structure will remain.

However, there is a problem. In this kind of a structure, not all websites are equal. The distance to an average website is much shorter if you’re near the center, meaning the centrally positioned websites are more privileged than the ones placed on the edge. This arrangement doesn’t reflect the democracy of the Web. Also, this kind of spatial structure is not elegant, in the sense that it results in very long hyperlinks connecting distant parts.

Maybe we could give up the “flat” arrangement of the websites and allow them to float in 3D space, perhaps something like stars in the Universe. However, this doesn’t solve the problem. There are still central parts and peripheral parts, as in the previous model.

We need a structure in which there won’t be any “edges”. An elegant solution would be a 3D geometrical object that provides the desired equality. We can imagine that the flat surface in the previous image is actually the surface of a big sphere that just looks flat when viewed from up close, just like the Earth. A sphere ensures that all web pages are equal and limits the maximum length of a hyperlink. The most distant web page is no farther away than the diameter of the sphere.


The interior of the sphere is white because of the vast amount of (white) hyperlinks. If we grab a website and pull it from the sphere, this white matter of hyperlinks becomes clearly visible.


Although this model looks pretty elegant, it still has a problem. The Web is growing constantly, meaning that the sphere is getting bigger and bigger. In a purely theoretical model, this doesn’t make much difference. But imagine the Web realized in the physical world. It would eventually become so big that the cost of material for hyperlinks would be extremely high.

In addition, the maximum size of the Web would have to be restricted in some way, not just due to cost but for practical reasons. Perhaps it would be placed in some kind of container, which would hold it and possibly protect it from the environment. So how would the Web grow in such limiting conditions?

Well, the “cortex” of the Web can be folded, allowing a much larger surface for websites to grow on, while retaining the basic idea of equality and keeping the hyperlinks relatively short. Unfortunately, I’ve not provided an image for this. But I have no doubt that your brain is capable of creating one.