Andreas Blumauer

Open Intranet

The following blog post was used by Andreas Blumauer as a basis for a talk at TEDxVienna on Monday, November 29, 2010:



Open Data, Open Government, Open Source, Open Innovation – “Open” everywhere. Today I want to talk about another “Open something”: The “Open Intranet”. This might sound a bit radical but it will also help to reflect a little bit on the term “open” in general.

“Open Intranet” – isn’t this a contradiction by definition? What is understood by “Intranet”? It means a network of computers and users “within” some organisational boundaries. But boundaries don’t necessarily have to be closed, as nature teaches us: organisms aren’t closed systems. A watch would be an example of a closed system, but living organisms tend to be open – to survive. Of course they aren’t totally open; in systems theory we speak of systems that are structurally coupled with their medium when we refer to this special kind of openness. As an example, an immune system that has learned to recognise a class of virus will remain sensitive to that and similar viruses in future. In contrast, imagine a fly walking over a painting by Rembrandt: since the fly isn’t structurally coupled to the cultural space of human aesthetics, it is not “open” to the beauty of Rembrandt’s work.

When we think of today’s intranets, we can see that they tend to be isolated from the World Wide Web; they don’t seem to perceive the internet as their medium. From a user perspective, the two systems aren’t connected to each other. Typically, when working on the intranet, we switch from time to time into “internet mode” and Google something, copy it, jump back and paste it into the intranet. The user is the only part of the whole system connecting the internet with the intranet. Isn’t this exhausting for us?

And now for the good news: intranets all over the world are starting to open up, slowly – but they are. It seems the “pressure” from “the outside” has simply become too great. At first it is not the data and the information that “break” in, but rather the “cool functions” which web apps offer and which we (as digital natives) would like to have in our intranets too. We want:

  • better search,
  • more possibilities to interact with information,
  • integrated views instead of jumping around,
  • and we want more possibilities to self-serve our extensive hunger for more and well structured information.

On the information level intranets are still rather conservative: Typical pieces of information already “injected” from the web into an average intranet would be:

  • weather forecasts,
  • stock exchange rates,
  • time zones and
  • jokes.

How could companies use the web to inspire their employees (without opening up totally)? How could the web “inject” the right amount of information into an intranet to make an enterprise portal as vivid as the web is perceived to be by today’s typical end-user? How could this tremendous amount of data and knowledge on the web be “structurally” coupled with intranet repositories and workflows? What advantages could a company gain from publishing (at least some) data on the web?

Let me give you a few examples of intranet apps which have started to consume information other than jokes from the web:

  • Enterprise Mashups: Combine CRM systems with social networks like LinkedIn
  • Open innovation: Let’s bring the knowledge of consumers and producers together to improve certain products and services. As an example, just recently, after BP’s oil spill, more than 40,000 people came up with ideas on how to clean up the oil, and more than two dozen of those ideas were deployed to help with the clean-up.
  • Content Augmentation: Enrich content which is being edited, let´s say in an enterprise wiki, automatically with some background knowledge from Wikipedia or with news from a news company

Finally, I will also give you two examples of use cases where companies expose and publish internal data on the web (without violating privacy) and benefit from it.

  • Wisdom of the crowd: The Canadian gold mining group Goldcorp made 400 megabytes of geological survey data available to the public over the Internet. They offered over $500,000 to anyone who could analyze the data and suggest places where gold could be found. The company claims that the contest produced 110 targets, 8 million ounces of gold, worth more than $3 billion.
  • Prize economics: Netflix, a movie rental service in the US, published data for a contest to improve its recommender engine. After nearly 3 years, one team out of 50,000 contestants improved the existing recommender engine by more than 10% and won 1 million dollars.

To conclude: what Tim Berners-Lee demanded in one of his famous TED talks was “raw data now!”. It has started to become reality. Just think of all the “Open Government Data Initiatives” around the globe which have been initiated since then. Now companies with a “Web DNA” have started to understand the value of open data and to contribute their “5 cents” to the global “open data cloud”. I think this will not only be of value for many companies but will also tremendously increase the chances of resolving some global problems in the near future.

Thomas Thurner

data.reegle.info – Linked Open Data on Clean Energy

Following the worldwide trend of Open Government Data as well as Linked (Open) Data, the reegle.info team has decided to launch a reegle data portal in November 2010: data.reegle.info.

The idea of providing raw data (first mentioned by Sir Tim Berners-Lee in the course of the W3C Linked (Open) Data movement) for free and unrestricted re-use matches the objectives of the reegle.info information system as the single point of access for worldwide clean energy data (renewable energy as well as energy efficiency).

From the first day of the launch in November 2010, data.reegle.info offers data on stakeholders in the clean energy area as well as country (energy) profiles. Later on, the reegle.info team will open up its renewable energy and energy efficiency thesaurus (SKOS format) for public re-use and will continuously provide more and more clean energy data on data.reegle.info. The license used for data.reegle.info is the Open Government Data License for public sector information. data.reegle.info follows W3C standards and recommendations for Linked Open Data as well as Open Government Data.

For developers, the reegle.info team has created a comprehensive developer guide as well as a SPARQL endpoint as the central API to the reegle.info data. The reegle.info consortium hopes that data.reegle.info will inspire many new (data) mashups as well as innovative apps built on its data.
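To give a rough idea of what using a SPARQL endpoint as an API can look like, here is a minimal Python sketch. It only builds the request URL, using the `query` parameter defined by the standard SPARQL protocol. The endpoint address below is a placeholder and the query is a generic one; neither is taken from the reegle.info developer guide, which is where the real address and vocabulary would be found.

```python
from urllib.parse import urlencode

# Placeholder address (an assumption): the real endpoint URL is documented
# in the data.reegle.info developer guide, not in this post.
ENDPOINT = "http://example.org/sparql"

# Generic query returning any ten triples; it assumes nothing about the
# vocabulary actually used by the dataset.
QUERY = """
SELECT ?s ?p ?o
WHERE { ?s ?p ?o }
LIMIT 10
"""


def build_query_url(endpoint: str, query: str) -> str:
    """Return a GET URL that carries the query in the `query` parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})


if __name__ == "__main__":
    # Fetching this URL with any HTTP client would return the result set.
    print(build_query_url(ENDPOINT, QUERY))
```

A mashup would then fetch that URL, typically asking for JSON results via an `Accept` header, and combine the rows with its own data.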

Andreas Blumauer

Les Kneebone: “Semantic web technologies are one solution to linking education data in Australia”

Les Kneebone is Project Manager at Education Services Australia Ltd.
Among other projects he is responsible for Schools Online Thesaurus (ScOT).

The PoolParty team asked Les a couple of questions about thesaurus management, linked data and the semantic web. Here is a short summary of the interview:

Why did you choose thesauri to organize your information? What kind of problems are you able to solve with this approach?

A thesaurus approach was chosen rather than a subject headings approach because we assumed (and continue to assume) that post-coordinate indexing will drive vocabulary-assisted discovery.

Which role does SKOS and/or Linked Data play in order to achieve your goals?

ScOT concepts are now published as URIs. This approach solves the problem of different ScOT versions in disparate systems.

What are the most important values you generate for your stakeholders? What kind of applications can be built or have been built on top of your thesauri?

The Achievement Standards Network (ASN) provides a model for profiling curriculum statements and linking those statements to education resources using various RDF vocabularies. By linking profiled curriculum statements to learning resources, more precise matching is achieved.

What are the most important arguments to use Semantic Web standards and linked data, especially in education?

The Australian education sector is characterized by many disparate systems in different education jurisdictions. Semantic web technologies are one solution to linking education data in Australia.

Why did you choose PoolParty to manage your thesauri?

We had already identified SKOS as an important standard for ScOT, so it was natural to select PoolParty as our new thesaurus management tool.

What are your future plans and next steps? How do you manage to get your thesauri used, how are you going to build an “eco-system” around your work? (Do you plan to publish ScOT on the LOD cloud? Under which licenses?)

Our vocabularies are currently for non-commercial use and we don’t anticipate any change to the license at this stage. The ScOT license requires attribution, permits derivatives that must be shared, and is for non-commercial use.

Read the full interview here.

Pascal Hitzler

Reasonable Minutes from ISWC2010

I find it quite clearly noticeable that ontology reasoning is slowly making its way into mainstream. I begin seeing more and more applications – and industry investigations – picking up ontology reasoning in a matter-of-fact way. It seems that the bickering between scientists whether ontology reasoning is needed and/or useful is simply ignored when it’s about applications. And I very much welcome this. The “why” question is no longer important. In fact, even the “how” question isn’t. It’s being used – although sometimes perhaps not in an entirely conscious way, or in a way in which traditional reasoning applications would have been set up. And I very much welcome this as well.

I’m not talking about the fact that 2 out of 3 shortlisted papers for the best paper award at ISWC2010 are reasoning papers (which continues an established trend) – the winner has not been announced yet, there’s one day of the conference still ahead. Rather, I found it noticeable that reasoning prominently popped up in the first in-use-track session (and that wasn’t artificially arranged – in fact the first session was on life sciences applications). Another, less obvious case in point was the excellent keynote given by Evan Sandhaus on how nytimes.com utilizes semantic technologies. Among other things, they used GeoNames for inferring that news from Rome is also news from Italy, and they used Freebase for equating different identifiers for entities (in this case, politicians). Neither of these was explicitly executed or identified as a reasoning step, but this is only a matter of algorithmization. Conceptually, this is ontology reasoning at its simple best: the derivation of implicit knowledge by automated deductive means is reasoning, whether you are aware of it or not.
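To make concrete why these two steps count as reasoning, here is a small Python sketch. It is purely illustrative: the data is invented for the example and the identifiers are hypothetical, not actual GeoNames or Freebase records, and nothing here reflects how nytimes.com implemented it. The first function derives the implicit “also news from” places by following containment links; the second groups identifiers that have been declared equivalent.

```python
# Toy data, invented for illustration only.
PART_OF = {"Rome": "Lazio", "Lazio": "Italy", "Italy": "Europe"}
SAME_AS = [
    ("nyt:person/1", "fb:person/a"),
    ("fb:person/a", "other:person/x"),
]


def containing_places(place, part_of):
    """Follow part-of links upward; news about `place` is news about each result."""
    result, seen = [], {place}
    while place in part_of:
        place = part_of[place]
        if place in seen:  # guard against cyclic data
            break
        seen.add(place)
        result.append(place)
    return result


def canonical_ids(pairs):
    """Map every identifier to one representative of its equivalence class."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # shorten the path as we go
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Deterministic choice: the smaller string becomes the root.
            parent[max(root_a, root_b)] = min(root_a, root_b)
    return {x: find(x) for x in parent}


if __name__ == "__main__":
    print(containing_places("Rome", PART_OF))
    print(canonical_ids(SAME_AS))
```

Both results are facts that were never stated explicitly in the input, which is exactly the “derivation of implicit knowledge” described above.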

Talking about applications – Tania Tudorache from Stanford Biomedical presented the ongoing work on ICD-11, which centrally utilizes WebProtege and OWL. I think that this work is completely underappreciated by the Semantic Web community, perhaps because they are not aware of the impact of this. The ICD classification of diseases is the world-wide manual for medical diagnostics, which means that Semantic Technologies – in a rather invisible manner, as it should be – result in something which will be used by millions of physicians world-wide in their everyday work life. That’s what I call dissemination into practice!

By the way – in the context of such trends, it strikes me as oddly outdated to hear panel comments like “OWL still needs to show its worth – what can it do what you cannot do with rules?” It’s about time we stop bickering and pushing our pet paradigms and simply make things work and improve. (And no, I didn’t bother to comment during the panel. A discussion like this is futile, and I think more and more people are realizing this now anyway.)

Another keynote, by mc schraefel, also very nicely put applications into perspective and highlighted some of the shortcomings of the currently hyped Linked Data. (Don’t get me wrong – Linked Data is extremely necessary for the Semantic Web on several accounts, but there are indeed a lot of issues with it which we need to face.) Interestingly, the reactions I heard were mixed – but in an unexpected way. On the one hand, there was wide positive reaction that this was an excellent keynote with a very important message (which is also my take). On the other hand, I heard voices saying that we already know these and other problems with Linked Data, so there wasn’t really any useful content in the talk. I’m rather happy, though, that I didn’t hear anybody disagree with the general message.

Another very notable presentation, as part of the Semantic Web Challenge, was by Deborah McGuinness, on the data.gov work at RPI. The scope of dissemination is simply impressive, and another milestone in the making of the Semantic Web.

What else? The Semantic Web journal’s first Editorial Board meeting took place at the conference (the first issue will be out shortly). My showcase volume of our book was not stolen this time. And there were a considerable number of very interesting-looking papers in the reasoning sessions – all of which I regretfully missed because I was tied up in parallel events. I’m looking forward to reading the papers, though.

On the culinary side, I have to say that I was a bit disappointed. Actually, the food was very good, but at previous conferences I’ve visited in China, it was much more exotic (from a European perspective, anyway) – perhaps the reason for this was that these other events I’ve been to were mainly Chinese, with only a few international guests. And, certainly, the cuisine was not at all as bad as the internet connection at the conference center. But we’ve already become very accustomed to having Semantic Web conferences with too little bandwidth, so it’s kind of expected anyway.

[Author: Pascal Hitzler]