Pascal Hitzler

Reasonable Minutes from ISWC2010

I find it clearly noticeable that ontology reasoning is slowly making its way into the mainstream. I am beginning to see more and more applications – and industry investigations – picking up ontology reasoning in a matter-of-fact way. It seems that the bickering among scientists over whether ontology reasoning is needed and/or useful is simply ignored when it comes to applications. And I very much welcome this. The “why” question is no longer important. In fact, even the “how” question isn’t. It’s being used – although sometimes perhaps not in an entirely conscious way, or not in the way in which traditional reasoning applications would have been set up. And I very much welcome this as well.

I’m not talking about the fact that 2 out of 3 shortlisted papers for the best paper award at ISWC2010 are reasoning papers (which continues an established trend) – the winner has not been announced yet, there’s one day of the conference still ahead. Rather, I found it noticeable that reasoning prominently popped up in the first in-use-track session (and that wasn’t artificially arranged – in fact the first session was on life sciences applications). Another, less obvious case in point was the excellent keynote given by Evan Sandhaus on how nytimes.com utilizes semantic technologies. Among other things, they used GeoNames for inferring that news from Rome is also news from Italy, and they used Freebase for equating different identifiers for entities (in this case, politicians). Neither of these was explicitly executed or identified as a reasoning step, but this is only a matter of algorithmization. Conceptually, this is ontology reasoning at its simple best: The derivation of implicit knowledge by automated deductive means is reasoning, whether you are aware of it or not.
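To make the point concrete, here is a minimal sketch of the two inference steps described above. It is not how nytimes.com implements them; the place hierarchy and the identifiers are made-up toy data standing in for GeoNames and Freebase, and the function names are purely illustrative.

```python
# Toy facts standing in for GeoNames-style containment and Freebase-style
# identifier equivalences. All identifiers below are invented for illustration.
PART_OF = {"Rome": "Lazio", "Lazio": "Italy", "Italy": "Europe"}
SAME_AS = [
    ("nyt:person/42", "freebase:some_politician"),
    ("freebase:some_politician", "other:politician_0042"),
]


def containing_places(place):
    """Follow part-of links upward: the place itself plus everything containing it."""
    result = [place]
    while place in PART_OF:
        place = PART_OF[place]
        result.append(place)
    return result


def identity_classes(pairs):
    """Group identifiers into equivalence classes (symmetric and transitive closure)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    for a, b in pairs:
        parent[find(a)] = find(b)

    classes = {}
    for x in list(parent):
        classes.setdefault(find(x), set()).add(x)
    return list(classes.values())


# News tagged "Rome" is implicitly also news about Italy.
print("Italy" in containing_places("Rome"))
# All three identifiers end up denoting the same entity.
print(identity_classes(SAME_AS))
```

Both functions derive facts that were never stated explicitly (Rome is in Europe; the first and third identifier denote the same entity), which is exactly the kind of simple deduction meant here.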

Talking about applications – Tania Tudorache from Stanford Biomedical presented the ongoing work on ICD-11, which centrally utilizes WebProtege and OWL. I think that this work is completely underappreciated by the Semantic Web community, perhaps because its members are not aware of its impact. The ICD classification of diseases is the world-wide manual for medical diagnostics, which means that Semantic Technologies – in a rather invisible manner, as it should be – result in something which will be used by millions of physicians world-wide in their everyday work life. That’s what I call dissemination into practice!

By the way – in the context of such trends, it strikes me as oddly outdated to hear panel comments like “OWL still needs to show its worth – what can it do what you cannot do with rules?” It’s about time we stop bickering and pushing our pet paradigms and simply make things work and improve. (And no, I didn’t bother to comment during the panel. A discussion like this is futile, and I think more and more people are realizing this now anyway.)

Another keynote, by mc schraefel, also put applications very nicely into perspective and highlighted some of the shortcomings of the currently hyped Linked Data. (Don’t get me wrong – Linked Data is extremely necessary for the Semantic Web on several accounts, but there are indeed a lot of issues with it which we need to face.) Interestingly, the reactions I heard were mixed – but in an unexpected way. On the one hand, there was wide positive reaction that this was an excellent keynote with a very important message (which is also my take). On the other hand, I heard voices saying that we already know these and other problems with Linked Data, so there wasn’t really any useful content in the talk. I’m rather happy, though, that I didn’t hear anybody disagree with the general message.

Another very notable presentation, as part of the Semantic Web Challenge, was by Deborah McGuinness, on the data.gov work at RPI. The scope of dissemination is simply impressive, and another milestone in the making of the Semantic Web.

What else? The Semantic Web journal’s first Editorial Board meeting took place at the conference (the first issue will be out shortly). My showcase volume of our book was not stolen this time. And there were a considerable number of very interesting-looking papers in the reasoning sessions – all of which I regretfully missed because I was tied up in parallel events. I’m looking forward to reading the papers, though.

On the culinary side, I have to say that I was a bit disappointed. Actually, the food was very good, but at previous conferences I’ve visited in China, it was much more exotic (from a European perspective, anyway) – perhaps the reason for this was that these other events I’ve been to were mainly Chinese, with only a few international guests. And, certainly, the cuisine was not at all as bad as the internet connection at the conference center. But we’ve already become very accustomed to having Semantic Web conferences with too little bandwidth, so it’s kind of expected anyway.

[Author: Pascal Hitzler]

Pascal Hitzler

Reasoning Problems?

I’m not going to explicitly comment on the panel discussion at ISWC08, entitled An OWL 2 Far? Let’s simply say it was controversial. I don’t mind controversial panels. In fact, I think that few things are more boring than a panel where all panelists more or less agree. But at the same time, at the ISWC08 panel, I think, an important message got lost, namely that we really need reasoning for the Semantic Web, and that we need diversity in reasoning. (Admittedly, some people said so, but I think the message didn’t really get through.)

So, instead, let me give you some web search problems. They all came up in my real life, so they are not artificially created. It seems to me that the Semantic Web should make answering them easier, but with the existing web resources, they are really difficult.

  • Find all papers having received best paper awards at ISWC conferences. I did that today, and it took me more than 30 minutes. And I’m not sure if I got all of them – indeed I would have missed one of them if I hadn’t known beforehand about that specific paper having received the award. Isn’t this a typical Semantic Web problem? (The results of my search are further below.)
  • There’s an owl-like bird in southern German woods, and in colloquial German it’s called Käuzchen. Try to find out the English name for this bird. I actually failed, though I think I got close to the answer when I merged web search with an external knowledge base (in the form of a biologist I happen to know). And actually, simply going to Wikipedia and clicking on the English link is not enough, since I’m not looking for the Strix genus of owls, but rather for a particular bird …
  • Who is this researcher with the Russian-looking name who worked on resolution-based methods for the description logic EL? This also looks like a typical Semantic Search problem, which shouldn’t be too difficult if you have the corresponding knowledge (and background knowledge) available. I admit I failed on this one using traditional methods (unless you consider it a traditional method to ask Franz Baader by email about it).
  • Are lobsters spiders? I.e. are lobsters classified as spiders by biologists? This one is actually tougher than you would think using traditional methods. Should be easy using Semantic Web knowledge bases and some simple reasoning, shouldn’t it?

For all these tasks (and many others), it seems to be apparent that Semantic Web Reasoning – and the availability of corresponding knowledge bases – would make the finding of answers much easier. The current reality of the Semantic Web is still quite a bit away from this. But we’re working on it.
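As an illustration of the lobster question, here is a minimal sketch of the "simple reasoning" involved, assuming a knowledge base that stores child-to-parent taxonomy links. The table below is a hand-written and deliberately simplified fragment (intermediate ranks such as families are omitted, and "Lobster" and "Spider" are treated as plain class names); the function names are hypothetical.

```python
# Simplified fragment of the arthropod taxonomy, written as child -> parent.
# In practice these links would come from a curated knowledge base.
PARENT = {
    "Lobster": "Decapoda",
    "Decapoda": "Malacostraca",
    "Malacostraca": "Crustacea",
    "Crustacea": "Arthropoda",
    "Spider": "Araneae",
    "Araneae": "Arachnida",
    "Arachnida": "Chelicerata",
    "Chelicerata": "Arthropoda",
}


def ancestors(taxon):
    """All taxa above the given one, nearest first."""
    chain = []
    while taxon in PARENT:
        taxon = PARENT[taxon]
        chain.append(taxon)
    return chain


def is_a(taxon, candidate):
    """True if `taxon` equals `candidate` or lies somewhere below it."""
    return taxon == candidate or candidate in ancestors(taxon)


def closest_common_ancestor(a, b):
    """First taxon on b's upward chain that also lies on a's chain, or None."""
    seen = {a, *ancestors(a)}
    for taxon in [b, *ancestors(b)]:
        if taxon in seen:
            return taxon
    return None


print(is_a("Lobster", "Spider"))
print(closest_common_ancestor("Lobster", "Spider"))
```

With these links, lobsters are not classified as spiders; the two lineages only meet at the arthropods. The hard part in reality is not the few lines of reasoning but having the knowledge base available in the first place.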

Finally, as promised, the results of my inquiry about the ISWC best paper awards:

So why did I dig these awards out? Because I noticed that among these 6 papers there are 3 which are explicitly concerned with OWL. And the 2007 paper involves RDF inferencing. Talk about the importance of reasoning for the Semantic Web …

Author: Pascal Hitzler, AIFB, University of Karlsruhe (TH), Germany

Jana Herwig

The Day after Freebase went RDF

So what’s been happening in the blogosphere after John Giannandrea’s keynote at ISWC and the revelation that Freebase now produces Linked Data from an RDF service?

Tetherless World sums up the Freebase facts (e.g. 156,000,000 assertions made; 1370 published types; 75 domains; graph model, identity, web based) and further points out that ontology creation “is a social process, and both freebase and semantic wiki are tools that enable users to create ontological vocabulary without worrying too much on building a comprehensive ontology.”

Inkdroid notes that the RDF service release “is important news because Freebase is an active community of content creators, creating rich data-centric descriptions with a wiki style interface, fancy data loaders, and useful machine APIs.” This is followed by a quick and handy tutorial on how to get machine-readable data back from Freebase using a URI. Conclusion:

So why is this important? Because following your nose in HTML is what enabled companies like Lycos, AltaVista, Yahoo and Google to be born. It allowed for agents to be able to crawl the web of documents and build indexes of the data to allow people to find what they want (hopefully). Being able to link data in this way allows us to harvest data assets across organizational boundaries and merge them together. It’s early days still, but seeing an organization like Freebase get it is pretty exciting.

Yves Raimond was the first to wonder on the public W3C LOD mailing list: “now, to see whether it links to other datasets :-) ” – the idea of having linked data without the linkage would indeed seem like love’s labour lost. Semantic Focus / James Simmons seconds: “One downside is the data doesn’t appear to link to external resources, in a sense walling itself in. It should be trivial to link the topics that came from Wikipedia back to Wikipedia as well as DBpedia (which would be killer, by the way).” This is followed up in a later post, where James expresses concerns regarding the relationship between DBpedia and Freebase: “Freebase may see a drop in userbase growth and participation if it becomes a mirror of DBpedia (or vice-versa) and the popularity once garnered by one project may shift towards the other, or away entirely.”

More News / Andrew Newman puts the Freebase RDF service release in context with Cathrin Weiss’ “250 million triples on your iphone” submission, iMoCo, to the Billion Triples Challenge, as well as with DBpedia and SemaPlorer, the latter developed at the University of Koblenz:

DBPedia stood out because it was the only one that allowed you to write data to the Semantic Web rather than just read the carefully prepared triples. For a similar reason I though SemaPlorer was good because they tried to do more than just the standard triples but went that extra bit further by making it more generic like integrating flickr. But they were all excellent, all of them showing what you get with a billion or more triples and inferencing.

That combined with the guys at Freebase making all of their data available as RDF and it was a big day for the Semantic Web.

ARQtick / AndyS plays a bit with the Blade Runner example cited by Freebase, e.g. takes a look at the graph, looks for interesting properties and extracts author names.

N.B. If you want to follow ARQtick’s example: use the Linked Data browser plugin Tabulator or go to the Marbles site to view the RDF – without a data browser you’ll be redirected to the HTML page. You will also need it to make sense of rdf.freebase.com.
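The redirect mentioned above is content negotiation at work: a client that asks for RDF gets RDF, while an ordinary browser gets HTML. As a rough sketch of what a data browser does on your behalf, the snippet below only builds such a request and does not send it. The exact topic URI is an assumption made for illustration, and `rdf_request` is a hypothetical helper name.

```python
from urllib.request import Request


def rdf_request(uri):
    """Build (but do not send) an HTTP request that asks for RDF/XML instead of HTML."""
    return Request(uri, headers={"Accept": "application/rdf+xml"})


# Assumed URI for the Blade Runner topic; adjust to whatever the service actually exposes.
req = rdf_request("http://rdf.freebase.com/ns/en.blade_runner")
print(req.full_url)
print(req.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would perform the actual fetch, provided the service is reachable and honours the `Accept` header.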

Pascal Hitzler

Improving OWL

I am sure you are aware that the Web Ontology Language OWL is currently undergoing a revision by the W3C OWL Working Group. The revised version will be known as OWL 2, and is going to feature some enhancements of expressivity, and will also designate tractable sublanguages (called profiles).

What is less well known is that the current effort to revise OWL was in part driven by the workshop series OWL – Experiences and Directions (OWLED), where the initial proposal for the revision was discussed as early as 2005. The OWLED series has become a major forum for the OWL community, where practitioners in industry and academia, tool developers and others interested in OWL can describe real and potential applications, share experience and discuss requirements for language extensions/modifications.

The next installment of the OWLED workshop, OWLED2008, has just released its call for papers. So if you have an opinion on how OWL could or should be improved, consider writing a note and participating in the meeting and the discussions. The general chair of OWLED2008, Alan Ruttenberg, is actually also co-chair of the W3C OWL Working Group, which obviously isn’t just coincidence.

OWLED2008 is going to take place at the end of October 2008 in Karlsruhe, Germany, and is co-located with the International Semantic Web Conference (ISWC2008) and the conference on Web Reasoning and Rule Systems. So it’s going to be a good place to get up-to-date news on what’s going on in and around the Semantic Web.