Jana Herwig

Linked Data @ TRIPLE-I: Measuring the size of a fact, not of a fiction

The TRIPLE-I 2008 conference ended three days ago, yet there are a couple of loose ends I’d still like to tie up. First of all: Linked Data. Tom Heath was invited to give a keynote on “Humans and the Web of Data” – and there are a variety of roles in which people may come across Tom and his LOD-related work:

He administers the site LinkedData.org (on behalf of the Linked Data community); he is the creator of Revyu.com (“Review anything!”), which won him the 1st prize in the Semantic Web Challenge 2007; he was a co-organizer of the Linked Data on the Web Workshop at this year’s World Wide Web conference in Beijing; and he was an interviewee in my 12 seconds definitions mission @ TRIPLE-I – see his micro definition of Linked Data in the vid below. (To learn more about Tom and the different roles he fulfils, look here).


Tom Heath explains Linked Data TRIPLE-I 2008 on 12seconds.tv

His keynote was not so much an introduction to Linked Data (I should expect that a conference like TRIPLE-I/I-Semantics would typically attract people who at least have an idea of what Linked Data is about), but rather a confirmation that the Web of Data is no longer a fiction, but a fact. One of the often cited proofs is the growth of the LOD dataset cloud over the last year, as shown in the image below (clicky for biggy, visualization created by Richard Cyganiak).

At the same time – and this was accordingly acknowledged by a later presentation given by Wolfgang Halb which had been prepared collaboratively by Tom, Wolfgang, Michael Hausenblas and Yves Raimond – it’s not just the sheer number of triples on the web that counts. Over the course of one year, the efforts of the Linked Data community (who seek to populate the web with open data, data in RDF) generated 4 billion triples – but only 3 million interlinks.

Their paper was an attempt to measure the size of the Semantic Web based on interlinks. A brief excerpt from the conclusion:

We have identified two different types of datasets, namely single-point-of-access datasets (such as DBpedia), and distributed datasets (e.g. the FOAF-o-sphere). At least for the single-point-of-access datasets it seems that automatic interlinking yields a high number of semantic links, however of rather shallow quality. Our finding was that not only the number of triples is relevant, but also how the datasets both internally and externally are interlinked. Based on this observation we will further research into other types of Semantic Web data and propose a metric for gauging it, based on the quality and quantity of the semantic links. We expect similar mechanisms (for example regarding automatic interlinking) to take place on the Semantic Web.
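To make the difference between counting triples and counting interlinks a bit more tangible, here is a minimal Python sketch. It rests on an assumption of mine, not on the paper's method: a triple counts as an interlink when its subject and object are both HTTP URIs hosted on different domains. The function names and the `example` hosts are invented for illustration.

```python
from urllib.parse import urlparse


def host(term):
    """Return the host of an http(s) URI, or None for literals and other terms."""
    parsed = urlparse(term)
    return parsed.netloc if parsed.scheme in ("http", "https") else None


def count_interlinks(triples):
    """Count triples whose subject and object are URIs on different hosts.

    This is an assumed working definition of "interlink", not the paper's metric.
    """
    count = 0
    for subject, _predicate, obj in triples:
        subject_host, object_host = host(subject), host(obj)
        if subject_host and object_host and subject_host != object_host:
            count += 1
    return count


# Made-up sample data: one cross-dataset link, one internal link, one literal.
triples = [
    ("http://dataset-a.example/resource/Berlin",
     "http://www.w3.org/2002/07/owl#sameAs",
     "http://dataset-b.example/place/42"),
    ("http://dataset-a.example/resource/Berlin",
     "http://dataset-a.example/ontology/country",
     "http://dataset-a.example/resource/Germany"),
    ("http://dataset-a.example/resource/Berlin",
     "http://www.w3.org/2000/01/rdf-schema#label",
     "Berlin"),
]

total_triples = len(triples)
interlinks = count_interlinks(triples)
print(f"{interlinks} interlink(s) among {total_triples} triples")
```

Only the first triple crosses a dataset boundary, which is the kind of gap between raw triple counts and interlink counts that the presentation pointed out. A host comparison says nothing about the quality of a link, which the authors name as the other half of their proposed metric.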

Another point Tom raised in his keynote was the issue of trust. According to his research, five parameters influence whether we trust a source or recommendation on the web: experience, expertise, impartiality (we don’t trust a travel agent, because we can’t help but believe that she is mainly going to recommend the offers of her ‘favourite’ clients), affinity, and track record, with experience, expertise and affinity being the most important. Hoonoh.com (currently in alpha), a semantic people search engine Tom presented, thus lets users weight search results according to these three criteria.
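How weighting by these three criteria could work in principle is sketched below in Python. This is purely my own illustration and not Hoonoh.com's actual algorithm: the weighted-average formula, the function names, and the sample people and scores are all hypothetical.

```python
def trust_score(scores, weights):
    """Weighted average of per-criterion scores (all values assumed in 0..1)."""
    total_weight = sum(weights.values())
    if total_weight == 0:
        raise ValueError("at least one weight must be non-zero")
    weighted = sum(weights[name] * scores.get(name, 0.0) for name in weights)
    return weighted / total_weight


def rank(candidates, weights):
    """Sort candidates by descending trust score under the given weights."""
    return sorted(
        candidates,
        key=lambda candidate: trust_score(candidate["scores"], weights),
        reverse=True,
    )


# Hypothetical people and scores, invented for this example.
candidates = [
    {"name": "alice",
     "scores": {"experience": 0.9, "expertise": 0.4, "affinity": 0.2}},
    {"name": "bob",
     "scores": {"experience": 0.3, "expertise": 0.8, "affinity": 0.7}},
]

# The same candidates, ranked under two different user preferences.
by_expertise = rank(candidates, {"experience": 1, "expertise": 3, "affinity": 1})
by_experience = rank(candidates, {"experience": 3, "expertise": 1, "affinity": 1})

print([c["name"] for c in by_expertise])
print([c["name"] for c in by_experience])
```

The point of the sketch is simply that the same set of results can come out in a different order depending on which trust criterion the person searching cares about most.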

Tom’s concluding statement emphasized that Linking Data makes sense not for its own sake, but for the sake of being at the service of humans: “A web of machine-readable data is even more interesting from a human than from a machine perspective,” for instance in search engines like Hoonoh.com.

Andreas Blumauer

The social hub @ LinkedData Planet 2008

Eric Hoffer

The LinkedData Planet conference is over now. I had a great time here and met a lot of great and inspiring people. The exhibition area especially turned out to be THE meeting point of the conference. People from media companies, major IT-companies like IBM or from governmental and non-governmental organizations were there, meeting up with some of the most prestigious software providers and experts of the semantic web world.

Mike Bergman in SWC gear

And that says a lot about the semantic web both as a technology and a movement: The semantic future is being made not behind closed doors or in some ivory tower, not thought up by some secluded genius, but by people, companies and research institutions that are as close to the heart of the web as one can be.

I learned a lot about the upcoming new release of UMBEL (Upper Mapping and Binding Exchange Layer) thanks to Mike Bergman (who you can see in the picture below, sporting Semantic Web Company gear). UMBEL (in the words of the project itself) has two purposes:

1) to provide a lightweight structure of subject concepts as a reference to what Web content or data “is about”;
and 2) to define a variety of binding protocols for different Web data formats to map to this “backbone.”

You might want to have a look at the UMBEL subject concepts explorer provided by Mike’s Zitgist: Start exploring here, with a preset concept search for ‘Manager’.

I also learned more about the huge variety of possible applications which can be built on top of the Talis platform – thanks to Ian Davis. One example is the Lancashire Lantern WiCI – WiCI because it is a service providing Community Information.

And finally I met Richard Cyganiak in person who gave me a thorough overview of the Semantic Web index Sindice – try a search for Richard Cyganiak to see how it works (and to learn more about him, of course).

I ended up discussing possible applications using linked data with Tom Heath, Mike Bergman, Gregory Williams, Eric Hoffer (picture on top, see also his blogpost where he features the SWC “Escape from the Data Silo” logo) and Marco Neumann, both from Semantic Web Meetup NYC. It was a great evening!

Thank you, folks!

Read also pt. 1 of our conference report: LinkedData Planet in New York: A great community event for all things semantic

Jana Herwig

Linked Data pave the way to a meaningful web

You will probably know the name of Tom Heath: He won last year’s Semantic Web Challenge with his web application revyu.com, which “lets you review and rate absolutely anything you can name”, and has in the meantime joined Talis. This week, Andreas Blumauer did an email interview with Tom Heath – here is just a small excerpt, a bit of Tom’s response to the question whether Linked Data is just another Semantic Web alias:

Far from being a cynical marketing exercise, use of terms such as ‘Linked Data’ and ‘Web of Data’ simply represent a clarification of the intentions behind the Semantic Web vision. The label ‘Semantic Web’ has itself been a victim of semantics, which has not aided adoption of the underlying ideas.

How the Semantic Web develops over the next ten years will remain to be seen, but in the meantime it’s essential that we use terms that speak to people in clear terms, and convey more of the key features that can lead to a more meaningful Web. ‘Web of Data’ does just that, and ‘Linked Data’ is the means by which we are reaching that goal.

You can read the whole interview here.

The screen below – a visualization of the currently available Linked Open Data – is taken from a talk Tom Heath gave in February 2008 in Amsterdam, on the occasion of the CATCH Programme and E-Culture Project Meeting on Metadata Interoperability. This snapshot of the LOD cloud documents its size in May 2007:

Now compare this to its size one year later (snapshot provided by Richard Cyganiak on May 8, 2008):
LOD Cloud May 2008