I have been thinking about RDFa recently. With the announcement from Google and continued support from Yahoo's SearchMonkey, there is increased buzz around RDFa. So, why RDFa and what is it good for?

TopBraid has had support for RDFa for as long as I can remember – at least two years now. A user can point to a page with RDFa markup and TopBraid will import it. I remember getting excited about this and wanting to mark up all our web pages with RDF. This did not happen, at least partially because RDFa’s interaction with HTML formatting tags is pretty funky – the pages become harder to maintain. Then there was also the persistent question of why do it at all. If one wants to provide data in RDF, why not do exactly that?

Each web page on a site could have a corresponding N3 page. There is a standard tag in HTML (the link element) that can be used to refer to related information. It can be used to point to the N3 page, and/or the naming convention could be the same as for the given HTML page, but with the N3 extension. In TopQuadrant’s case this would be the only alternative, since the information on our web site is not in a database (at least not yet; this is changing). If it were in a database, then the way to go would be to provide a SPARQL endpoint.
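To make the naming convention concrete, here is a minimal Python sketch. The function names, the example.org URLs, the "index" fallback for directory-style URLs, and the choice of rel="alternate" with the text/n3 media type are all my assumptions for illustration, not something our site actually does.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit
import posixpath


def n3_url_for(html_url: str) -> str:
    """Return the companion N3 URL: same path, extension swapped for .n3."""
    parts = urlsplit(html_url)
    path = parts.path or "/"
    if path.endswith("/"):
        # Assumption: a directory-style URL maps to an "index" document.
        path += "index"
    root, _ext = posixpath.splitext(path)
    # Drop query and fragment; they belong to the HTML view, not the data file.
    return urlunsplit(parts._replace(path=root + ".n3", query="", fragment=""))


def link_tag_for(html_url: str) -> str:
    """Build a <link> element that advertises the N3 companion page."""
    href = escape(n3_url_for(html_url), quote=True)
    return f'<link rel="alternate" type="text/n3" href="{href}">'
```

A page generator could call link_tag_for with the page's own URL and drop the result into the page head, so a crawler can find the data either by following the link or by guessing the extension.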

I looked at the RDFa presentation by Mark Birbeck at the Semantic Technologies conference. I did not get a chance to attend (7:30 AM is way too early for me), but I browsed through the slides. Here is an example of RDFa markup (from the presentation):

This says that there is a dc:creator relationship between the header “RDFa: Now everyone can have an API” and a string “Mark Birbeck”.

Good, but we have not given a URI to the thing we are talking about – a presentation entitled “RDFa: Now everyone can have an API”.

The absence of a URI makes it somewhat hard to talk about the presentation. Any RDFa crawler/importer would have to generate some kind of URI for it. If we had used a URI to begin with, we could have simply put the triple {:RDFa_presentation dc:creator "Mark Birbeck"} into an RDF file.
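The difference an explicit URI makes can be sketched with a toy collector in Python. This is deliberately not a conforming RDFa processor: it only looks for an explicit about attribute on an enclosing element and ignores the rest of RDFa's subject-resolution rules. Both markup snippets are my own hypothetical stand-ins, not the markup from the slides, and the class and function names are made up for illustration.

```python
from html.parser import HTMLParser


class PropertyCollector(HTMLParser):
    """Collect (subject, predicate, literal) for elements with a `property` attribute.

    Simplification: assumes the element carrying `property` has no child elements.
    """

    def __init__(self):
        super().__init__()
        self.triples = []
        self._about = []       # inherited subject, one entry per open element
        self._property = None  # predicate of the element being read, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        inherited = self._about[-1] if self._about else None
        self._about.append(attrs.get("about", inherited))
        if "property" in attrs:
            self._property = attrs["property"]
            self._text = []

    def handle_data(self, data):
        if self._property is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._property is not None:
            literal = "".join(self._text).strip()
            self.triples.append((self._about[-1], self._property, literal))
            self._property = None
        if self._about:
            self._about.pop()


def collect(markup: str):
    """Parse a markup fragment and return the triples found in it."""
    parser = PropertyCollector()
    parser.feed(markup)
    parser.close()
    return parser.triples


# Hypothetical markup: no subject is named anywhere.
WITHOUT_ABOUT = """
<div>
  <h2>RDFa: Now everyone can have an API</h2>
  <span property="dc:creator">Mark Birbeck</span>
</div>
"""

# Same markup with an explicit (made-up) identifier for the presentation.
WITH_ABOUT = """
<div about="#RDFa_presentation">
  <h2>RDFa: Now everyone can have an API</h2>
  <span property="dc:creator">Mark Birbeck</span>
</div>
"""
```

Calling collect(WITHOUT_ABOUT) yields a triple whose subject is None, which is the point where an importer has to come up with an identifier on its own; collect(WITH_ABOUT) carries "#RDFa_presentation" through as the subject.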

One issue may be the maintenance – having two files to maintain. But embedding RDFa into HTML arguably creates even worse maintenance problems. And if the RDFa markup is automatically generated (most serious publishing happens by generation, not hand crafting), then the maintenance issue is not there – it is easier to generate an RDF file in addition to the HTML file than it is to generate and insert markup. Not to mention that automatic generation means there is a database that could be exposed through SPARQL.
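As a rough illustration of the "generate both" argument, here is a Python sketch that renders an HTML fragment and an N3 file from the same record. The record layout, the example.org namespace, the function names, and the use of dc:title alongside dc:creator are assumptions on my part; a real publishing pipeline would look different.

```python
from html import escape

# Hypothetical record, standing in for a row from a publishing database.
PRESENTATION = {
    "id": "RDFa_presentation",
    "title": "RDFa: Now everyone can have an API",
    "creator": "Mark Birbeck",
}


def n3_literal(text: str) -> str:
    """Quote a plain string literal, escaping backslashes and double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def render_html(record: dict) -> str:
    """Render the human-readable view, with no RDFa attributes mixed in."""
    return (
        f"<h2>{escape(record['title'])}</h2>\n"
        f"<p>{escape(record['creator'])}</p>\n"
    )


def render_n3(record: dict) -> str:
    """Render the same record as a small N3 document."""
    lines = [
        "@prefix dc: <http://purl.org/dc/elements/1.1/> .",
        "@prefix : <http://example.org/talks#> .",  # made-up namespace
        "",
        f":{record['id']} dc:title {n3_literal(record['title'])} ;",
        f"    dc:creator {n3_literal(record['creator'])} .",
    ]
    return "\n".join(lines) + "\n"
```

Both renderers read the same record, so there is one source to maintain and the HTML template never has to know about RDF attributes.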

There must be something I am missing here. While I could not attend Mark Birbeck’s presentation, I just discovered he is giving a webinar on July 12th: http://skillsmatter.com/event/ajax-ria/the-possibilities-of-rdfa-and-the-semantic-web/ng-94 . I think I will sign up and see if some of my questions get answered.

I’ll report what I learn here, so stay tuned.