Andreas Blumauer

The LOD cloud is dead, long live the trusted LOD cloud

The ongoing debate over whether ‘there is money in linked data or not’ was recently sharpened by Prateek Jain (one of the authors of the original article): he asks ‘why linked open data hasn’t been used that much so far besides for research projects?‘.

I believe there are two reasons (among others) for the low uptake of LOD in non-academic settings that have not yet been discussed in detail:

1. The LOD cloud covers mainly ‘general knowledge‘ in contrast to ‘domain knowledge‘

Most organizations run on their internal knowledge, which they combine intelligently with very specific (and most often publicly available) knowledge and data. They would therefore benefit from LOD only if their specific domains were covered. A frequently cited ‘best practice’ for LOD is the portion of datasets available at Bio2RDF. This part of the LOD cloud has been used again and again by the life sciences industry because of its specific information and its highly active maintainers.

We need more ‘micro LOD clouds’ like this.

Other examples are the German Library Linked Open Data Cloud (thanks to Adrian Pohl for this pointer!) and the Clean Energy Linked Open Data Cloud:

[Image: the Clean Energy LOD cloud (reegle)]

I believe that the first generation of the LOD cloud has done a great job. It visualised the general principles of linked data and communicated the idea behind it. It even helped, at least in its very first versions, to identify potentially interesting datasets. And most of all, it showed how fast the cloud was growing and attracted a lot of attention.

But now it’s time to clean up:

A first step should be a clear distinction between the sections of the LOD cloud that are open and those that are not. Datasets without licenses should be marked explicitly: they, not the non-open ones, are the most problematic for commercial use.

A second improvement could be made by making some quality criteria clearly visible. I believe the most important one concerns maintenance and authorship: who takes responsibility for the quality and trustworthiness of the data? Who exactly is the maintainer?
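The two quality criteria above (an explicit license and a named maintainer) can be sketched as a simple check over datahub.io-style dataset records. This is purely illustrative: the record structure, field names and example entries below are assumptions, not real datahub.io metadata.

```python
# Sketch: flag LOD-cloud datasets that lack the metadata argued for above.
# All records and field names here are hypothetical.
datasets = [
    {"name": "example-geo-data", "license": "CC-BY 3.0", "maintainer": "National Mapping Agency"},
    {"name": "example-music-data", "license": None, "maintainer": "J. Doe (private)"},
    {"name": "example-bio-data", "license": "CC0", "maintainer": None},
]

def trust_issues(ds):
    """Return a list of reasons why a dataset is problematic for commercial use."""
    issues = []
    if not ds.get("license"):
        issues.append("no explicit license")
    if not ds.get("maintainer"):
        issues.append("no named maintainer")
    return issues

for ds in datasets:
    problems = trust_issues(ds)
    if problems:
        print(f"{ds['name']}: {', '.join(problems)}")
```

A real implementation would of course read this metadata from the dataset catalogue itself (e.g. via vocabularies such as DCAT or VoID) rather than from hand-written records.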

This brings me to the second and most important reason for the low uptake of LOD in commercial applications:

2. Most datasets of the LOD cloud are maintained by a single person or by nobody at all (at least as stated on datahub.io)

Would you integrate a web service provided by a single, perhaps private, person into a (core) application of your company? Wouldn’t you prefer to work with data and services provided by a legal entity with a high reputation, at least in its own knowledge domain? We all know that data has very little value if it is not maintained in a professional manner. An example of ‘good practice’ is the Integrated Authority File provided by the German National Library. This is a trustworthy source, isn’t it? And we can expect that it will be maintained in the future.

It is not only the data that is linked in a LOD cloud; above all, it is the people and organizations ‘behind the datasets’ who are linked and who will co-operate and communicate on the basis of their datasets. On top of their joint data infrastructure they will create efficient collaboration platforms, like the one in the area of clean energy, the ‘Trusted Clean Energy LOD Cloud‘:

[Image: reegle.info trusted links]

REEEP and its reegle-LD platform have become a central hub in the clean energy community, not only data-wise but also as an important cooperation partner in a network of NGOs and other stakeholders promoting clean energy globally.

Linked Data has become the basis for more effective communication in that sector.

To sum up: to publish LOD that is interesting for use beyond research projects, datasets should be specific and trustworthy (another example is the German labour law thesaurus by Wolters Kluwer). I am not saying that datasets like DBpedia are dispensable; they serve as important hubs in the LOD cloud. But for non-academic projects based on LOD we need an additional layer of linked open datasets: the Trusted LOD cloud.

 

Martin Kaltenböck

GBPN Knowledge Platform using Semantic Technologies and Linked Open Data launched

The brand new web-based GBPN Knowledge Platform was launched on 21 February 2013. It helps the building sector effectively reduce its impact on climate change!

It has been designed as a participative knowledge and data hub for harvesting, sharing and curating best practice policies in building energy performance globally. Available in English and soon in Mandarin, this new web-based tool of the Global Buildings Performance Network (GBPN) aims to stimulate collective research and analysis from experts worldwide to promote better decision-making and help the building sector effectively reduce its impact on climate change. To sustain and accelerate change in the building sector, the GBPN encourages open and transparent access to good-quality and verifiable data. The data can be used and re-used in HTML, PDF and machine-readable raw data (CSV) formats, provided under a Creative Commons Attribution (CC-BY 3.0 FR) license.

The GBPN Knowledge Platform is built on the Drupal CMS and seamlessly connected with the PoolParty Semantic Information Management Platform of the Semantic Web Company. The platform thus makes use of semantic technologies and Linked Open Data (LOD) principles and techniques under the hood. Much of the available data of the various GBPN tools is provided as (linked) open data under a Creative Commons Attribution license. The Semantic Web Company is responsible for the conceptual design and technical implementation of the GBPN Knowledge Platform.

Below is an overview and description of the most important features, tools and services of the information management system.


Thomas Thurner

Quality energy data released: buildingsdata.eu celebrates International Open Data Day

In advance of the 3rd International Open Data Day (http://opendataday.org/) on Saturday 23rd February 2013, BPIE has now made its online knowledge assets “open data ready” by enabling downloads in raw data CSV format, as well as in PDF form.

The comprehensive open data portal presents facts and figures collected in the context of BPIE’s ‘Europe’s Buildings under the Microscope’ study, released at the end of 2011 (see http://bpie.eu/eu_buildings_under_microscope.html). It includes a wide variety of technical data never before collected EU-wide.

The raw data export now covers:

  • 29 European countries
  • 10 building types
  • up to 18 climatic zones per country
  • a total building stock floor area nearly equivalent to the size of Belgium
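A CSV export like the one described above can be consumed with standard tooling. The following is a minimal sketch; the column names and figures are hypothetical, not taken from the actual buildingsdata.eu export.

```python
import csv
import io

# Illustrative sample in the spirit of a building-stock CSV export;
# column names and numbers are made up for this sketch.
raw = """country,building_type,climatic_zone,floor_area_m2
AT,single-family,alpine,120000000
AT,office,alpine,45000000
BE,single-family,temperate,210000000
"""

# Aggregate total floor area per country from the raw rows.
reader = csv.DictReader(io.StringIO(raw))
floor_area_by_country = {}
for row in reader:
    floor_area_by_country[row["country"]] = (
        floor_area_by_country.get(row["country"], 0) + int(row["floor_area_m2"])
    )

print(floor_area_by_country)
# → {'AT': 165000000, 'BE': 210000000}
```

In practice one would read the downloaded CSV file directly (e.g. `open("export.csv")`) instead of an in-memory string.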

The Open Data Portal provides data and statistics on:

  • Building stock performance (energy consumption, envelope performance, energy sources);
  • Building stock inventories reflecting floor area, construction year, ownership profile;
  • National policies and regulation;
  • Financial schemes (333 in total).

In addition, the user can access country fact sheets and definitions.

The data will be improved on an ongoing basis, and over time the hub will be enriched with additional topics and information generated through data-exchange projects and research partnerships.

BPIE now invites other organisations to add their data to the portal and grow www.buildingsdata.eu into the comprehensive knowledge hub on the energy performance of Europe’s building stock.


Thomas Thurner

Revealing Trends and Insights in Online Hiring Market Using Linking Open Data Cloud

What a business application that exploits open data can look like was presented at the Semantic Web Challenge 2012 by Amar-Djalil Mezaour, Julien Law-To, Robert Isele, Thomas Schandl (SWC) and Gerd Zechmeister (SWC). The paper describes a prototypical linked data application for the online hiring market.

”Active Hiring” is a search-based application providing analytics on online job posts. The application uses services from the LOD cloud to disambiguate, geotag and interlink data entities acquired from online job board websites, and provides a demonstration of the usefulness of linked open data in a business setting.

from ‘Active Hiring: a Use Case Study’ (paper, 2012)

The search-based application combines semantic technologies and services to produce Human Resources (HR) analytics and highlight major trends in the online hiring market. Active Hiring thus demonstrates the benefit of combining open datasets and services with semantic tools as a support technology for increasing the accuracy of business applications. The Active Hiring demonstrator was developed within the activities of the European project LOD2.
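The geotagging and interlinking step the paper describes can be sketched, in very simplified form, as looking up location mentions in job posts against a table of LOD identifiers. The URIs below follow the real http://dbpedia.org/resource/ naming scheme, but the job posts and the naive substring matching are purely illustrative, not the approach used by Active Hiring.

```python
# Very simplified sketch of geotagging job posts with LOD identifiers.
# The matching strategy (case-insensitive substring search) is illustrative only;
# a real system would use proper entity disambiguation services.
CITY_URIS = {
    "vienna": "http://dbpedia.org/resource/Vienna",
    "paris": "http://dbpedia.org/resource/Paris",
    "berlin": "http://dbpedia.org/resource/Berlin",
}

def geotag(job_post):
    """Return (city, URI) pairs for known city names mentioned in a job post."""
    text = job_post.lower()
    return [(city, uri) for city, uri in CITY_URIS.items() if city in text]

print(geotag("Senior data engineer wanted in Vienna or Berlin"))
# → [('vienna', 'http://dbpedia.org/resource/Vienna'), ('berlin', 'http://dbpedia.org/resource/Berlin')]
```

Once entities carry such URIs, they can be interlinked with other LOD datasets (population figures, regional statistics, and so on) to produce the kind of analytics the paper describes.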

Full paper: Revealing Trends and Insights in Online Hiring Market Using Linking Open Data Cloud: Active Hiring a Use Case Study (PDF)
