Andreas Blumauer

There’s Money in Linked Data

I believe that the ongoing debate over whether there ‘is money in linked (open) data or not’ is a bit misleading. ‘Linked (open) data’ is not only the data itself. It is much more, and more than just another technology stack. Linked data is above all a set of principles for organizing information in agile organizations that are embedded in fast-moving, dynamic environments. From this perspective there is a huge amount of money in it, but let me refine that a bit later.

Crying out loud in 2013 that ‘there is no money in linked data’ is an important step in the right direction, because it points out that data publishers should be more precise about data licensing. Quite flexible licensing models already exist; it is the people (and probably other legal entities) who forget to publish their data together with statements about its ‘openness’. As a result, the data remains closed to commercial users. This was not properly noticed in the early days of the linked open data cloud because commercial users were not around at all (in contrast to academic institutions, which considered the LOD cloud a wonderful playground). It is the same with linked data as a technology and as a set of standards: the standards and the technology stack are mature now (just think of Virtuoso’s brilliant SPARQL performance, for example), but most IT people still would not think of URIs, RDF and SPARQL when they look for powerful data integration methods.

Why is that?

I believe that, so far, people outside the linked data core community have perceived ‘linked data’ only as a new way to organize data on the web, and have therefore assumed that the technologies are not yet mature enough for enterprises.

But the truth is that linked data has at least a threefold nature. Linked data is

  1. a method to organize information in general, not only on the web but also in enterprises
  2. a set of standards which is flexible and expressive enough to link data across boundaries (organizational, political, philosophical), cultures and languages
  3. a way of using IT and information that is quite intuitive, very close to the patterns by which human beings tend to construct their realities, and thus comprehensible for non-techies as well.

I think that technologists have done a brilliant job so far in creating the linked data technology stack, its underlying standards, triple stores and quad stores, reasoners etc., and for specialists it is absolutely clear why these kinds of technologies will outperform traditional databases, BI tools, search engines etc. by far.

But the crucial point now is that enterprises have to adopt linked data technologies inside their corporate boundaries (and not only for SEO purposes or the like). The key question is not whether there is enough LOD out there for app makers. High-quality LOD will be produced very quickly as soon as there are commercial consumers such as large enterprises. I am not talking about use cases for linked data in the fields of data publishing or SEO.

The main driver of further linked data development will be enterprises that embrace LD technologies for their internal information management.

It’s true that there are already some large companies (like Daimler; meet them at this year’s I-SEMANTICS in Graz!) dealing with that question, but to be honest, there is not the same hype around ‘linked data’ as we can see with ‘big data’. IBM, Microsoft & Co. are of course not that interested in linked data, because it is a platform in itself and does not offer any kind of lucrative lock-in effects. Internet companies like Google and Facebook make use of linked data quite hesitantly. Although Facebook’s Graph Search and Google’s Knowledge Graph contain large portions of this kind of technology, Google would never say ‘oh, we are a semantic web company now, we make heavy use of linked data, and of course we will also contribute to the LOD cloud.’

Why is that? Simply put, because through the lens of Google, Facebook & Co. the internet is a huge machine that produces data for them, not the other way around.

But shouldn’t the enterprise customers themselves be interested in a cost-effective way of managing information? They are, but as stated before, they have not yet perceived linked data as such a way, although it clearly is one.

To develop technologies, we need critical questions, and of course the most critical ones always come from inside a community or movement. But the time has come to spread the good news to the ‘outside’.

  • Yes, databases which rely on linked data standards have become mature and performant enough that, for many query types, they outperform even ‘traditional’ relational databases
  • Yes, issues which are critical for enterprise usage, like privacy and security, have also been solved by most linked data technology vendors
  • Yes, there is a critical mass of available LOD sources (for example UK Ordnance Survey) and also of high-quality thesauri and ontologies (for example Wolters Kluwer’s labor law thesaurus) to be reused in corporate settings
  • Yes, the pool of developers and consultants on the labor market (in the U.S. as well as in the E.U.) is large enough to execute large linked data projects
  • Yes, there are tons of business cases that can benefit from linked data. Linked data and semantic web technologies should be considered as core technologies for any information architecture, at least in larger corporations
  • Yes, the SPARQL query language is not just a second SQL; it comes with some brilliant features such as transitive queries, which save a lot of time when developing applications such as business intelligence reporting and analysis
  • Yes, Linked Data has the potential to become the basis for a large variety of tools which help decision-makers (not only in enterprises but also in politics) to become true ‘digerati’ instead of being degraded to masters of the ‘bullshit bingo’.
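
The transitive queries mentioned in the list above can be illustrated with a small sketch. SPARQL 1.1 property paths let a query follow a relation such as skos:broader over any number of steps (written skos:broader+). The Python below only mimics that behaviour on a handful of in-memory triples; the ex: concepts and the tiny hierarchy are invented for illustration, and a real application would send the query string to a SPARQL endpoint instead.

```python
from collections import deque

# Hypothetical example data: (subject, predicate, object) triples.
# The "ex:" concepts are invented for illustration only.
TRIPLES = [
    ("ex:Dismissal", "skos:broader", "ex:LaborContract"),
    ("ex:LaborContract", "skos:broader", "ex:LaborLaw"),
    ("ex:LaborLaw", "skos:broader", "ex:Law"),
    ("ex:CoDetermination", "skos:broader", "ex:LaborLaw"),
]

# The equivalent SPARQL 1.1 query, shown for comparison only (not executed here).
SPARQL_EQUIVALENT = """
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX ex:   <http://example.org/>
SELECT ?ancestor WHERE { ex:Dismissal skos:broader+ ?ancestor }
"""


def transitive_objects(triples, start, predicate):
    """Return all nodes reachable from `start` via one or more `predicate` edges."""
    found = []          # reachable nodes in breadth-first order
    seen = set()        # guards against cycles and duplicates
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for s, p, o in triples:
            if s == node and p == predicate and o not in seen:
                seen.add(o)
                found.append(o)
                queue.append(o)
    return found


ancestors = transitive_objects(TRIPLES, "ex:Dismissal", "skos:broader")
print(ancestors)
```

In SQL the same question typically needs a recursive common table expression or repeated self-joins, which is the kind of saving the bullet point alludes to.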

Yes, this list can be further extended and it is a core element for the further expansion of the LOD cloud. It’s the enterprises that will drive the next level of maturity of the linked data landscape. Because at the end of the day it’s only them who will pay or have already paid the bill for open (government) data.

Thomas Thurner

Free Webinar: Linked Data for the Environmental Sector – Use Cases and Opportunities

Organizations working in the environmental sector most often act as intermediaries between politics, the economy and citizens. They are growing out of their role as plain content providers. To serve the demands of their stakeholders, they also have to act as data and tool providers for their respective communities.

On June 13, this webinar introduces several good-practice examples of achieving data governance with the linked open data paradigm. Together with a basic overview of the possibilities of linked open data, you get an appealing picture of the new opportunities these principles and technologies provide, also for your organisation!

Register Now!

Learn more about three organizations and their linked data projects

Global Buildings Performance Network (GBPN)

GBPN established the “Policy comparative tool on building stock data” together with a domain-specific thesaurus that is used for a domain-specific news aggregator.

Renewable Energy and Efficiency Partnership (REEEP)

As one of the pioneers in the sector, REEEP has an extensive focus on the use of linked data for renewable energy and energy efficiency, applying it in various services such as an automatic annotation service, aggregated country data presented as fact sheets, a domain-specific search engine, etc.

Austrian Geological Survey (GBA)

The main driving factor for institutions like the GBA to invest in thesaurus and taxonomy projects is the increasing need for a uniform description of their data. The idea is that this enhances the value and re-usability of their products for their stakeholders. Especially in the geospatial sector, the INSPIRE directive of the European Parliament and Council gave a push in that direction. As a public authority, the GBA was legally required to implement the directive for its domain.

Presenters in this Webinar

  • Martin Kaltenböck (SWC)
    CFO and Project Lead at Semantic Web Company for Data Portal Solutions
  • Florian Bauer (REEEP)
    Operations and IT Director of REEEP as well as of the clean energy information portal www.reegle.info
  • Andreas Blumauer (SWC)
    CEO and Evangelist for Linked Data and SKOS based Thesaurus Management


 

Andreas Blumauer

It’s All about Finding the Needle in the Big Data Haystack

Wolters Kluwer Deutschland GmbH and the Viennese Semantic Web Company agree to cooperate on the development of innovative and highly efficient products for data, information and metadata management.

Cologne/Vienna (February 05, 2013) – Wolters Kluwer Deutschland (WKD), a knowledge and information service provider located in Cologne, and the Austrian Semantic Web Company (SWC) are joining forces. The aim of this cooperation is to offer the sustainable creation and targeted usage of domain-specific thesauri and enterprise taxonomies based on Linked Data technologies as a market-ready, ready-to-use product.

While WKD, with its core competences in law, business, tax, finance and health, covers the domain and methodological dimension of the cooperation, SWC contributes the technological know-how. The main goal is to fulfill the concrete needs and requirements of customers in a highly efficient and practical way.
“Our offering addresses large national institutions such as ministries, federal agencies and social insurance institutions, as well as banks. It also includes larger NGOs and administrations with an international focus,” says WKD content architect Christian Dirschl, explaining the direction of the cooperation. In the spirit of finding the proverbial ‘needle in the data haystack’, the offering also addresses specialists such as large law firms or smaller units in large enterprises “who are specifically working on legal matters, making with their work an important contribution to the success of the company as a whole.” Knowledge domains such as law, industry, tax and finance in particular are becoming more and more opaque on a global scale, Dirschl explains, “so that semantic technologies and Linked Data methods gain importance”.
Among others, the following services and products are offered on the basis of this cooperation:
  • Metadata management and enterprise thesaurus management
  • Semantic search and data integration
  • Text mining and knowledge extraction
  • Creation of knowledge networks and knowledge management systems
  • Supporting the creation of Linked Data and Open Data infrastructures
“Over the last 10 years we have observed how search and the linking of information have gained importance in certain domains and what competitive advantages can evolve from that,” stresses Andreas Blumauer, managing director of SWC. “Our customer base profits from this cooperation. We immediately guarantee state-of-the-art technologies paired with professional domain assistance, e.g. in the creation of domain taxonomies and thesauri, so that information resources can be used more efficiently,” Blumauer says.

About Wolters Kluwer Germany
Wolters Kluwer Germany is an information services company specializing in the legal, business and tax sectors. Wolters Kluwer provides pertinent information to professionals in the form of literature, software and services. Headquartered in Cologne, it has over 1,200 employees located at over 20 offices throughout Germany and has been conducting business on the German market for over 25 years.
Wolters Kluwer Germany is part of the leading international information services company Wolters Kluwer n.v., located in Alphen aan den Rijn (The Netherlands). The core market segments, targeting an audience of professional users, are legal, business, tax, accounting, corporate and finance services, and healthcare. Its shares are quoted on Euronext Amsterdam (WKL) and are included in the AEX and Euronext 100 indices. Wolters Kluwer has annual sales of 3.4 billion Euros (2011), employs approx. 19,000 people worldwide and has over 40 offices located throughout Europe, North America, Asia Pacific and Latin America.
Thomas Thurner

Wolters Kluwer Deutschland is publishing 2 legal thesauri as Linked Open Data

Wolters Kluwer Deutschland GmbH (WKD) publishes two legal thesauri as Linked Open Data for free re-use by public administrations, industry and the Open Data community

(Munich, 12.07.2012, WKD) As of today, two thesauri (controlled vocabularies) covering legal topics are available for free re-use as Linked Open Data: one thesaurus covers topics around labor law in German, while the other describes German and European courts. Both vocabularies can be accessed at: vocabulary.wolterskluwer.de/.

The labor law thesaurus covers all main areas of labor law, such as the roles of employee and employer, legal aspects of labor contracts and dismissal, as well as co-determination and industrial action. This thesaurus is therefore interesting and relevant for all parties dealing with labor law, from professionals such as specialized lawyers to employees looking for definitions of legal terms. Links to thematically similar thesauri (following the Linked Open Data paradigm) have already been created and are available as well.

The courts thesaurus structures German and European courts hierarchically and includes, for example, address information. This thesaurus is aimed not only at parties interested in legal matters, but also at developers building geodata applications. Information about courts and their roles and responsibilities can become an interesting aspect of many applications in the future.

The publication of these data sets as Open Data is motivated by many reasons. Two major ones should be mentioned here: the first is to help our customers cope with information overload, and the second is to support activities in the OGD (Open Government Data) community.

Creating legal vocabularies is far from trivial, and hardly any such resources are available in German. By making these thesauri publicly available, we especially want to support administrations in classifying and structuring their internal data, so that this data can afterwards be connected easily to relevant WKD legal resources (interoperability of data). The community, on the other hand, is very active in some domains but unfortunately very reluctant when it comes to legal topics. Our aim here is to give initial support and to create awareness that highly interesting and relevant applications can be built with this data as well. In the end, all interested parties have to work together collaboratively to bring transparency to the diversity and sheer amount of legal information; this is not possible with siloed applications and isolated approaches.
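
To make the idea of classifying internal data against a thesaurus more concrete, here is a minimal sketch of label-based tagging. It is not the method used by WKD or PoolParty; the concept URIs and English labels are invented for illustration (the published thesauri are in German), and real text mining would handle inflection, synonyms and disambiguation far more carefully.

```python
import re

# Hypothetical thesaurus excerpt: concept URI -> labels (preferred label first).
# URIs and labels are invented for this example.
THESAURUS = {
    "http://example.org/laborlaw/dismissal": ["dismissal", "termination of employment"],
    "http://example.org/laborlaw/co-determination": ["co-determination"],
    "http://example.org/courts/labor-court": ["labor court"],
}


def tag_document(text, thesaurus):
    """Return the sorted concept URIs whose labels occur in `text` as whole phrases."""
    lowered = text.lower()
    hits = set()
    for uri, labels in thesaurus.items():
        for label in labels:
            # \b keeps a label from matching inside a longer word.
            pattern = r"\b" + re.escape(label.lower()) + r"\b"
            if re.search(pattern, lowered):
                hits.add(uri)
                break  # one matching label is enough for this concept
    return sorted(hits)


doc = "The labor court ruled that the termination of employment was invalid."
print(tag_document(doc, THESAURUS))
```

Because each document ends up annotated with concept URIs rather than free-text keywords, it can later be joined with any other resource that uses the same URIs, which is the interoperability argument made above.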

With this effort, Wolters Kluwer Deutschland GmbH is becoming part of the global Open Data movement, which is also heavily promoted by the European Commission, in order to strengthen Europe as an industrial location.

The license models used here (such as Creative Commons CC BY 3.0 for the contents) are as open as possible, in order to provide a real basis for further collaborative development.

This commitment also implies next steps: both thesauri will be communicated to different target groups and the resulting discussions will hopefully generate many new requirements and concrete models for collaboration.

Facts and Figures

Licenses of WKD thesauri

  • Data is licensed using ‘Creative Commons Namensnennung 3.0 Deutschland (CC BY 3.0)’ License.
  • Data model is licensed using ‘ODbL’ License.
  • Links to external sources are licensed using a ‘CC0 1.0 Universal (CC0 1.0) Public Domain Dedication’ License.
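
License statements like these are most useful to commercial re-users when they travel with the data in machine-readable form. The sketch below shows one common way to do that, using the Dublin Core property dcterms:license and N-Triples syntax. The three resource URIs are hypothetical placeholders (the real dataset identifiers are not given here), and version 1.0 of the ODbL is an assumption, since the list above names no version.

```python
# License URIs for the three licenses named above.
CC_BY_3_DE = "https://creativecommons.org/licenses/by/3.0/de/"
ODBL_1_0 = "http://opendatacommons.org/licenses/odbl/1.0/"  # version 1.0 assumed
CC0_1_0 = "https://creativecommons.org/publicdomain/zero/1.0/"

# Dublin Core Terms property for attaching a license to a resource.
DCTERMS_LICENSE = "http://purl.org/dc/terms/license"

# Hypothetical resource URIs standing in for data, data model and external links.
LICENSED_PARTS = {
    "http://example.org/thesaurus/data": CC_BY_3_DE,
    "http://example.org/thesaurus/model": ODBL_1_0,
    "http://example.org/thesaurus/links": CC0_1_0,
}


def license_triples(parts):
    """Serialize resource -> license pairs as N-Triples lines."""
    return [f"<{resource}> <{DCTERMS_LICENSE}> <{lic}> ." for resource, lic in parts.items()]


for line in license_triples(LICENSED_PARTS):
    print(line)
```

A consumer can then check for the presence of a dcterms:license triple before re-using a dataset, instead of searching a website for the terms of use.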

Published as Linked Open Data (LOD)

WKD Thesauri are linked with

Programming interfaces (API / SPARQL endpoints) are available at:

Used software tool

PoolParty Thesaurus Management Suite (www.poolparty.biz)

Both thesauri are described in ADMS format

coming from the European Commission, in order to be easily re-used in e-government services: http://joinup.ec.europa.eu/asset/adms/description

This project was implemented in a partnership between

Wolters Kluwer Deutschland GmbH (http://www.wolterskluwer.de), Semantic Web Company Wien (http://www.semantic-web.at) and the FP7 Project LOD2 (http://lod2.eu)

for more information you may contact

Christian Dirschl
Wolters Kluwer Deutschland GmbH (WKD)
Freisinger Strasse 3
D-85716 Unterschleißheim
cdirschl@wolterskluwer.de