URL, URI, IRI, URIref, CURIE, QName, slash URIs, hash URIs, bnodes, information resources, non-information resources, dereferencability, HTTP 303, redirection, content-negotiation, RDF model, RDF syntax, RDFa core, RDFa lite, Microdata, Turtle, N3, RDF/XML, JSON-LD, RDF/JSON…
Want to publish some data? Well, these are some of the things you will have to learn and understand to do so. Is the concept of data really so hard that you can’t publish it without understanding the concepts of information and non-information resources? Do you really need to deal with the HTTP 303 redirection and a number of different syntaxes? It’s just data, damn it!
Really, how have we got to this?
I did a detailed analysis on the problems of Linked Data, but it seems that I missed the most important thing. It’s not about the Web technologies but about economics. The key Linked Data problem is that it holds a monopoly in the market. One can’t compare it to anything else, and thus one can’t be objective about it. There is no competition, and without competition, there is no real progress. Without competition, it’s possible for many odd ideas to survive, such as requiring people to implement HTTP 303 redirection.
Of course, one can argue that there is a diversity of syntaxes that can describe structured/linked/meta- data, but Linked Data is more than just a syntax. It’s a set of rules and values defining a framework based on the Web technologies. It attempts to apply the principles and technologies of the Web to the data world. It tries to project a (RDF) data graph to the Web graph using HTTP URIs as identifiers that enable resolvability.
This is what I mean when I refer to the lack of a real alternative. What other method of publishing data follows the Linked Data principles and values? What other approach is there for building the Web of data using the same principles as the original Web of documents?
In my opinion this idea is huge, and Linked Data has clearly opened a new research space. But Linked Data is just one approach. Although based on the right starting ideas, it has gone in the wrong direction.
Put in a wider perspective, that is kind of natural because Linked Data has been the first player in the game. It’s not reasonable to expect the perfect solution for all the problems in the period of just five years. Linked Data did some things right, and some things wrong. It has shown where the real problems are and what needs to be changed. It has paved the way for the evolution of new, alternative approaches.
A new, alternative approach
Here I am going to introduce an alternative to Linked Data called Hypernotation. Hypernotation is, like Linked Data, a method of publishing data on the Web.
In the last post I described the notion of projecting data in the form of a labeled, directed (RDF) graph to the Web graph. Hypernotation is a sort of a framework that enables the projection to happen in practice on the global scale. It sets a small number of universal rules and conventions that result in a consistent system–the Web of data.
The main conceptual difference between Hypernotation and Linked Data is in the level of granularity. In Linked Data there is a concept of “RDF molecule” as an element corresponding to the lowest level of granularity, that loosely refers to a set of triples describing a resource.
Hypernotation is focused on a finer level of granularity, dealing with atomic data that is then composed into more complex structures. Atomic data is represented via nodes of the Web of data graph. Each such node, called a hypernode, can represent a thing (resource) or a literal value. Each hypernode is identified by an HTTP URI and is the object of the RDF triple whose subject and predicate are encoded in that HTTP URI. This makes the node inseparable from the triple that contains it.
Triples are basic building blocks for creating various data structures–objects, arrays, primitive types, references, literals. Those are further organized within collections of related objects, being part of graphs located on websites. Finally, all interconnected graphs form the global Web of data.
Hypernotation connects two related models–RDF and the object-oriented model. The result of this relationship is merging the ideas of URI reference and object into a new concept called the hyperobject. Its realization is inspired by a third model–the hierarchical file system. By using these two well-known paradigms, the RDF model becomes much more approachable for web developers and ordinary people.
The idea of folders is something that most people using computers understand. If you know how to make a folder and navigate through a file system, and if you understand the idea of hyperlinks, you know everything you need to understand how Hypernotation works. After all, the Web is just a bunch of trees plus shortcuts.
The object-oriented model is, on the other hand, the model most programmers understand. A data model in which every node has a URI (that is, a path) makes it straightforward to create objects out of triples and vice versa. This means triples and graphs become much easier to manipulate in a programming environment.
Unlike the Web of documents, where the opacity axiom defines the concept of a URI telling us that the content of the URIs itself is irrelevant, the HTTP URI used in Hypernotation is a machine-readable path using the RDF graph URI pattern. It is fully transparent and contains the information that unambiguously defines the relationship between the resources represented by nodes.
The idea of dot notation is applied to the HTTP URI path, where slashes (/) are used instead of dots (.). This way a system of namespaced variables is created that can be accessed by simple HTTP requests. Hypernotation is primarily represented using HTTP URIs on the Web, but it is a flexible model that can be encoded in different formats and easily processed programmatically. Because it is optimized for hierarchical structure, Hypernotation can be elegantly written in both JSON and XML formats.
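As a rough illustration of the slash-for-dot idea, here is a small Python sketch that flattens a nested dictionary into URI paths. The dictionary layout and the function name to_paths are assumptions made for this example only; the post does not define a JSON encoding at this point.

```python
def to_paths(node, base):
    """Yield (uri, value) pairs for every leaf of a nested dict.

    Each dictionary key becomes one path segment, the way a dot would
    separate members in dot notation (hypothetical layout).
    """
    if isinstance(node, dict):
        for key, child in node.items():
            yield from to_paths(child, f"{base}/{key}")
    else:
        yield base, node


# Hypothetical nested representation of one literal value.
data = {"chuck": {"foaf_name": {"rdf_value": "Carlos Ray Norris"}}}
paths = dict(to_paths(data, "http://chucknorris.com/data_"))
```

In a programming language one might write chuck.foaf_name.rdf_value; here the same chain of names becomes the path of an HTTP URI.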
Hypernotation does to data what the Web has done to documents. With the emergence of the Web, every web document suddenly got a unique global address. Just try to imagine explaining to people how to get to some content every time instead of just sending them the URL. Now apply that to the data context: how radically will the world (of data) change if every piece of data suddenly obtains a unique global address? Sharing URIs is perhaps even more relevant in the context of “machines” communicating with each other.
Hypernotation builds upon the basic Linked Data ideas and attempts to correct its mistakes. Hypernotation implements those ideas consistently while respecting the true nature of the Web, allowing us to finally use the full potential of the Web technologies. The result is a framework based on a small set of rules and conventions that enables a great level of simplicity and flexibility on one side, and a tremendous power on the other.
Hypernotation is based on the improved RDF model–one in which all nodes are identified by URIs. Like Linked Data, Hypernotation relies on the fundamental Web technologies HTTP and URI. However, it uses the third core Web technology as well–HTML.
I have mentioned the two new concepts: hypernode and hyperobject. The Web of data is a graph consisting of nodes and links between them. All the nodes are identified by HTTP URIs and are called hypernodes. A hypernode is an abstract element that can take different roles. It can become an object, an array, a literal, a reference… You can think of them as variables, just like in a programming language. In the context of Hypernotation (or the Web of data), they get the prefix hyper-. Therefore, an object becomes a hyperobject, an array a hyperarray and so on.
In order to describe a typical hyperobject, we’ll need an example. Let’s use the same RDF graph example we used in the previous posts. The following image shows three versions of the RDF graph: the first based on the current RDF model, the second based on the improved RDF model (every node is identified by a URI), and the third, which is further extended by the extra nodes needed to project the whole graph to the Web.
The green bold ellipses depict the difference between the adjacent graphs. All web resources displayed as ellipses in the third graph represent hypernodes in the Web of data.
Take a moment to look at these three graphs. The first is human-friendly: easy to read, but it contains special cases, namely blank nodes and literals. The second one is more consistent: each node has a URI, which comes at the price of additional nodes. The last graph adds a few extra nodes that make the graph Web-friendly, allowing the complete projection to the Web and full traversal of the graph.
http://chucknorris.com/data_/chuck is a typical hyperobject. Like a URI reference in Linked Data, it is identified by an HTTP URI and returns useful information when looked up. It represents some object (a person) that is described with properties, like an object in the object-oriented sense.
If you are running over this graph and get to the node http://chucknorris.com/data_/chuck, you can continue the journey using three roads, three branches directed towards the adjacent nodes: foaf_name, foaf_based_near and foaf_knows.
Therefore, if you look up the node, this is what you’ll get: a list of three links (predicates) describing the node and representing new paths you can follow to continue traversal.
What makes these three links interesting is that they are typed links. For instance, the link that directs to http://chucknorris.com/data_/chuck/foaf_name has a type as well: foaf_name. This CURIE (foaf:name in the usual notation) is just a shorter way of writing the URI http://xmlns.com/foaf/0.1/name. This means that the following triple can be extracted:
<http://chucknorris.com/data_/chuck> <http://xmlns.com/foaf/0.1/name> <http://chucknorris.com/data_/chuck/foaf_name> .
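The extraction of a triple from a typed link can be sketched in a few lines of Python. This is only an illustration: the prefix table and the function names are assumptions, and the rule that the first underscore separates the prefix from the local name is inferred from the examples in this post.

```python
# Hypothetical prefix table; a real one would hold every vocabulary in use.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}


def expand_curie(curie):
    """Expand an underscore-style CURIE such as 'foaf_name' to a full URI."""
    prefix, _, local = curie.partition("_")  # split at the first underscore
    return PREFIXES[prefix] + local


def triple_from_link(subject_uri, href):
    """Build the triple encoded by a typed link found at subject_uri."""
    predicate = expand_curie(href)
    object_uri = f"{subject_uri}/{href}"  # the object is the linked hypernode
    return f"<{subject_uri}> <{predicate}> <{object_uri}> ."


triple = triple_from_link("http://chucknorris.com/data_/chuck", "foaf_name")
```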
Hypernotation is based on the idea that one shouldn’t reinvent elements on the Web. There are already Web elements for an unordered list and a hyperlink that evolved on the Web as semantic HTML tags <ul> and <a>. These are actually the only two elements that are needed for publishing data using Hypernotation.
Therefore, the list representing the node’s “point of view”, containing the links directed to other hypernodes, is encoded using HTML. After the lookup, the server will return “200 OK” and the content will contain this syntax:
<ul> <li><a href="foaf_name">name</a></li> <li><a href="foaf_knows">knows</a></li> <li><a href="foaf_based_near">based_near</a></li> </ul>
This approach brings several advantages. First, the Web of data can be browsed (traversed) using a regular Web browser. No need for special RDF or Linked Data browsers.
Second, if accessed programmatically, the parsing is easy. It doesn’t require special RDF libraries – a simple XML or HTML (DOM) parser will do the job. Parsing can even be done using regular expressions. After all, <ul>, <li> and <a> are the only syntax elements you can get.
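To give an idea of how little is needed, here is a minimal sketch that pulls the link targets out of such a list using only the HTML parser from Python’s standard library. The class and function names are made up for this example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> element, in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.hrefs.append(href)


def extract_links(html):
    """Return the list of link targets found in an HTML fragment."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.hrefs


body = (
    '<ul> <li><a href="foaf_name">name</a></li> '
    '<li><a href="foaf_knows">knows</a></li> '
    '<li><a href="foaf_based_near">based_near</a></li> </ul>'
)
links = extract_links(body)
```

The text between <a> and </a> is simply ignored by this parser; only the href values are kept.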
Third, HTML code can be used for describing data for humans and “machines” at the same time–no need for two versions. In this example, the href attributes contain the CURIEs (links) for machines to “understand” the data, while the part between <a> and </a>, which is visible in the browser and is not a part of the RDF graph, contains friendlier prefix-free names intended for people. This part is not parsed and can contain any information or HTML tag (such as <img>) that can help with describing the resource.
Finally, this approach is easy to understand. The hyperobject http://chucknorris.com/data_/chuck can be understood as the folder chuck located in the folder data_. The URI http://chucknorris.com/data_/chuck is the path telling where you are, the same way a folder or file path contains the location on a hard disk. Furthermore, the properties foaf_name, foaf_knows and foaf_based_near are just the subfolders of the chuck folder. If we open one of them, we’ll get their subfolders and so on.
As a matter of fact, you can publish data literally creating folders. This is obviously not the most elegant way to do it, but it’s a completely legitimate way of using Hypernotation. The result is a universal interface, and a person accessing data doesn’t have a clue if you’ve created a bunch of folders or used some powerful engine on the backend.
In the previous posts I’ve described how URIs are assigned to all nodes of an RDF graph. For example, the URI http://chucknorris.com/data_/chuck/foaf_name is formed by appending the property URI (predicate) in the CURIE form to the URI of the resource (subject). Therefore, http://chucknorris.com/data_/chuck/foaf_name is a hypernode that, besides identifying a node in an RDF graph, encodes the triple:
<http://chucknorris.com/data_/chuck> foaf:name <http://chucknorris.com/data_/chuck/foaf_name> .
This Web resource’s URI thus comprises three URIs: the subject http://chucknorris.com/data_/chuck, the predicate foaf_name (the CURIE for http://xmlns.com/foaf/0.1/name) and the object http://chucknorris.com/data_/chuck/foaf_name. An HTTP request sent to this URI, as in the example above, returns a list of typed links directed from this node to its adjacent nodes. In this case the list will contain only one link:
<ul> <li><a href="rdf_value">value</a></li> </ul>
This is the link to the URI of a hyperliteral, a hypernode that differs in that it represents a literal value. This node doesn’t branch further, and an HTTP request sent to http://chucknorris.com/data_/chuck/foaf_name/rdf_value returns a response “200 OK” and content:
Carlos Ray Norris
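The two lookups described above (first the property node, then its rdf_value) can be sketched as follows. To keep the example runnable without a network, the responses are held in a dictionary that stands in for the server; the fetch function, the literal_value function and the crude substring check are all assumptions made for illustration.

```python
# Canned responses standing in for real HTTP lookups (hypothetical).
RESPONSES = {
    "http://chucknorris.com/data_/chuck/foaf_name":
        '<ul> <li><a href="rdf_value">value</a></li> </ul>',
    "http://chucknorris.com/data_/chuck/foaf_name/rdf_value":
        "Carlos Ray Norris",
}


def fetch(uri):
    """Pretend to send an HTTP GET and return the response body."""
    return RESPONSES[uri]


def literal_value(property_uri, fetch=fetch):
    """Follow the rdf_value link of a property node and return the literal."""
    listing = fetch(property_uri)
    # Crude check, good enough for a sketch: the node must link to rdf_value.
    if 'href="rdf_value"' not in listing:
        raise ValueError(f"{property_uri} does not link to a literal")
    return fetch(f"{property_uri}/rdf_value")


name = literal_value("http://chucknorris.com/data_/chuck/foaf_name")
```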
Hypernodes described so far (hyperobjects and hyperliterals) correspond to the RDF concepts of URI references and literals, representing RDF graph nodes that are easily projected to the Web. However, there are cases where the implementation of RDF graph on the Web requires additional nodes that don’t exist in the original RDF graph.
In the above image, there are two triples:
<http://chucknorris.com/data_/chuck> foaf:knows <http://brucelee.com/data_/bruce> . <http://chucknorris.com/data_/chuck> foaf:knows <http://stevenseagal.com/data_/steven> .
When a property is multi-valued, simply concatenating a subject URI and a predicate CURIE (in this case http://chucknorris.com/data_/chuck/foaf_knows) is not enough. It is therefore necessary to add another part to the URI: a key which is arbitrary but must be unique relative to other keys of the same level. Therefore, http://chucknorris.com/data_/chuck/foaf_knows acts like an array with keys bruce and steven and is thus called a hyperarray. After lookup, the list containing the keys is returned:
<ul> <li><a href="bruce">bruce</a></li> <li><a href="steven">steven</a></li> </ul>
The difference between this list and the list returned by looking up the hyperobject http://chucknorris.com/data_/chuck is that bruce and steven are not CURIEs. This way, the parser will know they are not predicates but keys used to distinguish between the different objects of the triples containing the subject http://chucknorris.com/data_/chuck and the multi-valued property foaf_knows.
What will you GET when looking up e.g. http://chucknorris.com/data_/chuck/foaf_knows/bruce? A simple hyperlink:
<a href="http://brucelee.com/data_/bruce">Bruce Lee</a>
Again, the part between <a> and </a> is human-friendly. What you’ll put into it is up to you, although some conventions could be useful here–e.g. using dc:title or similar properties based on the type of the resource. In this case foaf:name is used because the resource is a person.
This type of hypernode is called a hyperreference, because its purpose is to hold a reference to another hypernode.
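Putting the hyperarray and the hyperreference together, the two foaf:knows triples can be recovered by following each key to its reference. The sketch below uses regular expressions and canned responses in place of real HTTP requests; the response for steven and its link text are assumed by analogy with the one shown for bruce, and all function names are hypothetical.

```python
import re

BASE = "http://chucknorris.com/data_/chuck"

# Canned responses standing in for real lookups; the steven entry is assumed.
RESPONSES = {
    f"{BASE}/foaf_knows":
        '<ul> <li><a href="bruce">bruce</a></li> '
        '<li><a href="steven">steven</a></li> </ul>',
    f"{BASE}/foaf_knows/bruce":
        '<a href="http://brucelee.com/data_/bruce">Bruce Lee</a>',
    f"{BASE}/foaf_knows/steven":
        '<a href="http://stevenseagal.com/data_/steven">Steven Seagal</a>',
}

HREF = re.compile(r'<a\s+href="([^"]*)"')


def hrefs(body):
    """Return every link target in an HTML fragment, in order."""
    return HREF.findall(body)


def knows_triples(subject, fetch):
    """Resolve each key of the foaf_knows hyperarray to its referenced URI."""
    array_uri = f"{subject}/foaf_knows"
    triples = []
    for key in hrefs(fetch(array_uri)):
        target = hrefs(fetch(f"{array_uri}/{key}"))[0]  # the hyperreference
        triples.append((subject, "http://xmlns.com/foaf/0.1/knows", target))
    return triples


triples = knows_triples(BASE, RESPONSES.__getitem__)
```

Note that the keys themselves never appear in the resulting triples; they only serve to tell the values apart.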
A hypernode is a basic unit of the Web of data–an element that makes up its structure and is connected with other hypernodes. It is flexible in that it can take on different roles. In general, it returns a description of a resource the node represents. This description can be the list of adjacent nodes it is connected to, the data held by a hyperliteral, or the link of a hyperreference.
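As a rough summary of these roles, a client could guess the role of a hypernode from the body it gets back. The following sketch is a heuristic only: the list of known prefixes, the function names, and the decision to look at the start of the body are assumptions for illustration, not rules defined by Hypernotation.

```python
import re

# Hypothetical set of prefixes the client recognises as CURIE prefixes.
KNOWN_PREFIXES = ("foaf", "rdf", "dc")
HREF = re.compile(r'<a\s+href="([^"]*)"')


def is_curie(href):
    """True if href looks like an underscore-style CURIE, e.g. 'foaf_name'."""
    prefix, sep, local = href.partition("_")
    return bool(sep) and bool(local) and prefix in KNOWN_PREFIXES


def classify(body):
    """Guess the role of a hypernode from its response body."""
    text = body.strip()
    if text.startswith("<ul"):
        links = HREF.findall(text)
        # A list of predicates describes an object; a list of keys, an array.
        if all(is_curie(h) for h in links):
            return "hyperobject"
        return "hyperarray"
    if text.startswith("<a"):
        return "hyperreference"
    return "hyperliteral"


roles = [
    classify('<ul> <li><a href="foaf_name">name</a></li> '
             '<li><a href="foaf_knows">knows</a></li> </ul>'),
    classify('<ul> <li><a href="bruce">bruce</a></li> '
             '<li><a href="steven">steven</a></li> </ul>'),
    classify('<a href="http://brucelee.com/data_/bruce">Bruce Lee</a>'),
    classify("Carlos Ray Norris"),
]
```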
This is the first post in the Hypernotation series, where I’ve described the principles of Hypernotation and the basic structure of the Web of data. I hope I managed to explain the key ideas behind Hypernotation. Don’t worry if you didn’t understand everything. In future posts I am going to describe all aspects of publishing and consuming data in the context of Hypernotation in more detail.