Tuesday, September 13, 2011

Navigating resource representations with Atom documents

I've been browsing the web for some time looking for good documentation on Linked Data and the Semantic Web. I've done some research into the Atom and RDF formats, but I often can't see how to get from the resource itself to its metadata representation.

I need some way of navigating my data. RDF, Atom and triples are good stuff, but each document only describes a single resource and the links to other resources it relates to. For this case RDF seems like a bad solution, since an RDF document doesn't have to reference other RDF documents. And basically, an RDF document doesn't need to be hosted on a web server at all.

Atom with its link element seems like a much better solution, since I can link to anything I like. Both Atom and RDF support linking to things, even things not present on the web, like a book. An Atom document, on the other hand, can also reference other Atom documents, or even RDF documents.
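To make the linking idea concrete, here is a minimal sketch of an Atom entry about a book that links to another Atom document and to an RDF document, and of pulling those links back out. The entry, the placeholder ISBN and the example.org URLs are all made up for illustration. Only the Atom namespace and the rule that a missing rel means "alternate" come from the Atom format itself.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"

# Hypothetical entry: the ISBN is a placeholder and the example.org URLs are made up.
ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom">
  <id>urn:isbn:0000000000</id>
  <title>Some book</title>
  <updated>2011-09-13T00:00:00Z</updated>
  <author><name>Example Author</name></author>
  <link rel="related" type="application/atom+xml"
        href="http://example.org/authors/42.atom"/>
  <link rel="alternate" type="application/rdf+xml"
        href="http://example.org/books/1.rdf"/>
</entry>"""


def extract_links(atom_xml):
    """Return a (rel, type, href) tuple for every atom:link in the document."""
    root = ET.fromstring(atom_xml)
    return [
        # Atom treats a link without a rel attribute as rel="alternate".
        (link.get("rel", "alternate"), link.get("type"), link.get("href"))
        for link in root.iter(ATOM_NS + "link")
    ]


links = extract_links(ENTRY)
for rel, media_type, href in links:
    print(rel, media_type, href)
```

A client that wants to navigate from this entry only has to follow the href of the link whose rel and type it understands.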

Cool URIs are cool, but still not the silver bullet...

Linking with Atom documents looked like the nicest solution for me... until I came across the Cool URIs for the Semantic Web article over at w3.org. It's about accessing resources and the metadata about them: RDF is used to describe something like a company or a person, with HTML for the human-readable presentation.

I found the article fascinating, but I didn't care much for the "you must use HTTP for identifying your resources" part. According to the article, the way to access a resource is to request its URI, which must be an HTTP(S) URL, with an Accept header of "application/rdf+xml" or similar to ask for the representation you want. If you want the HTML representation, you ask the resource for "text/html". This is called content negotiation. The article walks through some scenarios using URI fragments, Content-Location headers and 303 redirects, with the possibility of making bookmarkable URLs for each representation.
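Content negotiation with a 303 redirect can be sketched roughly like this. It is a simplified illustration, not a full implementation of HTTP's rules: it handles q-values and "*/*", but ignores partial wildcards such as "text/*" and the specificity ordering a real server would apply. The function names and the example.org URLs are my own assumptions.

```python
# Hypothetical representations of one resource; the URLs are made up.
REPRESENTATIONS = {
    "application/rdf+xml": "http://example.org/data/alice",
    "text/html": "http://example.org/people/alice",
}


def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # a missing q parameter means full preference
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((media_type, q))
    # sorted() is stable, so equal q-values keep their header order.
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def negotiate(accept_header, available):
    """Pick the best media type in `available`, or None if nothing matches."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type in available:
            return media_type
        if media_type == "*/*" and available:
            return next(iter(available))  # fall back to the server's first choice
    return None


def respond(accept_header):
    """Return (status, headers) for a request to the resource URI."""
    chosen = negotiate(accept_header, REPRESENTATIONS)
    if chosen is None:
        return 406, {}
    # 303 See Other sends the client to the URL of the chosen representation.
    return 303, {"Location": REPRESENTATIONS[chosen], "Vary": "Accept"}


print(respond("application/rdf+xml"))
print(respond("text/html,application/xhtml+xml;q=0.9,*/*;q=0.8"))
```

The redirect targets are ordinary URLs, which is what makes each representation bookmarkable on its own.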

This content negotiation looks like a reasonably good solution, but it gets harder if you want to store the metadata about the resource on a different system than the resource itself, for example on a different domain. The article presents a solution for this too, in the case of HTML: the page can point back to the RDF metadata using an HTML link element with a relation type of "alternate" and the MIME type set to "application/rdf+xml".
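As a rough sketch of how a client could discover that back-reference, the snippet below scans an HTML page for link elements with rel "alternate" and type "application/rdf+xml". The page content, the class name and the metadata.example.net URL are invented for the example. Only the rel and type values come from the article's approach.

```python
from html.parser import HTMLParser

# Hypothetical page whose RDF metadata lives on a different domain.
PAGE = """<html><head><title>Alice</title>
<link rel="alternate" type="application/rdf+xml"
      href="http://metadata.example.net/data/alice">
<link rel="stylesheet" type="text/css" href="/style.css">
</head><body><p>Alice's home page</p></body></html>"""


class AlternateLinkFinder(HTMLParser):
    """Collect hrefs of <link rel="alternate"> elements with a given type."""

    def __init__(self, wanted_type):
        super().__init__()
        self.wanted_type = wanted_type
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # rel is a space-separated list of relation types.
        rels = (attributes.get("rel") or "").lower().split()
        media_type = (attributes.get("type") or "").lower()
        href = attributes.get("href")
        if "alternate" in rels and media_type == self.wanted_type and href:
            self.hrefs.append(href)


finder = AlternateLinkFinder("application/rdf+xml")
finder.feed(PAGE)
print(finder.hrefs)
```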

But aren't those circular references?

OK, so with HTML we can reference back to the metadata that references the HTML representation. Those are circular references, and I find that a bad practice. You might consider the resource URI as a third reference here. If you define that as the master of all data for that resource you can get away with it, since it governs both the RDF and HTML representations and knows about them both.

My biggest problem with the back-referencing solution presented in the Cool URIs article is that I don't use HTML for presenting my resources, but some other types of documents that are not suited to providing the linking feature that both Atom and HTML have.

A plausible solution anyway?

So basically, to solve my problem using one of the possibilities presented by the Cool URIs article, my web server has to handle both the document representation and the metadata representation. In my case, Atom is the preferred format, with a sane abstraction level.

For the moment I can live with the same system serving both representations.

The Web Linking specification (RFC 5988) defines an HTTP Link header that provides much the same features as HTML and Atom links do. On a request for a resource, regardless of the requested representation, I can supply the response with Link headers pointing to all available representations of that resource.
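A minimal sketch of building such a header value, assuming a simple mapping from media type to URL. The helper name and the example.org URLs are hypothetical, and the URLs are assumed to need no escaping.

```python
# Hypothetical representations of one resource; the URLs are made up.
REPRESENTATIONS = {
    "application/atom+xml": "http://example.org/books/1.atom",
    "application/rdf+xml": "http://example.org/books/1.rdf",
}


def link_header(representations):
    """Build one Link header value listing every representation as an alternate."""
    return ", ".join(
        '<{}>; rel="alternate"; type="{}"'.format(url, media_type)
        for media_type, url in representations.items()
    )


header = link_header(REPRESENTATIONS)
print("Link: " + header)
```

Because the links travel in the response headers, this works even when the body is a document type that has no link element of its own.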

Later it might also be possible to maintain a registry of external systems providing representations of the same resource. If we also find a way of retrieving representations of things identified by non-HTTP URIs without content negotiation, now that would be a silver bullet, eh?
