Andreas Blumauer

Do you like Google’s Knowledge Graph?

Semantic Enterprise Search enters the second phase.

The Knowledge Graph has finally arrived in Europe: what has been available on google.com for the US market since May 2012 is now also available in most European countries. Search results are no longer just a list of documents (and advertisements) but also a mashup of facts, points of interest, events etc. related to the search phrase.

For example, if the user searches for ‘Wiener Philharmoniker’ (‘Vienna Philharmonic Orchestra’), a factbox including related searches is provided.

Do you like this rather new way of knowledge discovery? We do, except for the fact that Google hasn’t properly explained to its audience which technology is behind the Knowledge Graph: the Web of Linked Data, aka the Semantic Web. (Do you want to know more about the relationship between the Knowledge Graph and Linked Data? Click here.)

But anyway, here are some benefits we see when search technologies make use of a ‘knowledge graph’, a ‘knowledge model’, a ‘thesaurus’ or, generally speaking, Linked Data:

  • Facts around an object (or an entity) can be found nicely packed up into a dossier
  • Serendipity can be stimulated by ‘related searches’, which means: users can discover the formerly ‘unknown’ in a more comfortable way
  • Data from various sources can be pulled together to a mashup (e.g. ‘upcoming events’ could come from a different database than the basic facts of Vienna Philharmonic Orchestra)
  • Search phrases are well understood by the engine since they are based on concepts and no longer on literals, e.g. if the user searches for ‘Red Bull Stratos’, results for ‘Felix Baumgartner’ will be delivered as well
  • Search can be refined, e.g. if one searches for ‘Vienna’, a list of POIs will be displayed to refine the actual place the user is looking for
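
The concept-based matching described in the list above can be illustrated with a minimal Python sketch. It is not how Google or any particular product implements this; the tiny `CONCEPTS` model, its keys and the `expand_query` helper are assumptions made up for illustration. A phrase is first resolved to a concept through its labels, and the query is then expanded with the labels of related concepts.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Concept:
    """A thesaurus concept: one preferred label, synonyms, and related concepts."""
    pref_label: str
    alt_labels: Set[str] = field(default_factory=set)
    related: Set[str] = field(default_factory=set)  # keys of related concepts


# Hypothetical mini knowledge model; keys and labels are illustrative only.
CONCEPTS: Dict[str, Concept] = {
    "red_bull_stratos": Concept(
        "Red Bull Stratos", {"Stratos jump"}, {"felix_baumgartner"}
    ),
    "felix_baumgartner": Concept(
        "Felix Baumgartner", {"Baumgartner"}, {"red_bull_stratos"}
    ),
}


def find_concept(phrase: str) -> Optional[str]:
    """Map a search phrase to a concept key by comparing labels case-insensitively."""
    needle = phrase.strip().lower()
    for key, concept in CONCEPTS.items():
        labels = {concept.pref_label} | concept.alt_labels
        if needle in {label.lower() for label in labels}:
            return key
    return None


def expand_query(phrase: str) -> List[str]:
    """Return the concept's own labels plus the preferred labels of related concepts."""
    key = find_concept(phrase)
    if key is None:
        return [phrase]  # no concept matched: fall back to a literal search
    concept = CONCEPTS[key]
    terms = {concept.pref_label} | concept.alt_labels
    terms |= {CONCEPTS[r].pref_label for r in concept.related if r in CONCEPTS}
    return sorted(terms)


print(expand_query("red bull stratos"))
# → ['Felix Baumgartner', 'Red Bull Stratos', 'Stratos jump']
```

A phrase that matches no label is passed through unchanged, so the engine can still fall back to a plain literal search.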

Now imagine you had a search engine in your company’s intranet based on a knowledge graph about the enterprise you work for.

Such an advanced search application would look like this:

  • Data streams and all kinds of content from internal sources are mashed up with information from the web (e.g. from Twitter, YouTube etc.)
  • Search assistants help users refine their information needs and make them more specific
  • Entities and their sub-concepts (e.g. subsidiaries of large companies or regions of countries) are packed together into one dossier

The key question now is: how do you set up a customised knowledge graph for a particular company?

Corporate Semantic Web based applications can be realised on top of software platforms like PoolParty. They all have a customised knowledge graph at their core, which is the basis for concept-based indexing of specialised content from a corporate intranet. The basic standard for this is SKOS, which can be used together with query languages like SPARQL. Such graphs can be used for semantic indexing but also to query relations like ‘is point-of-interest in’, ‘is event of’, ‘is related search for’ etc. This is next-generation semantic search, which helps decision-makers, information professionals and all kinds of knowledge workers to improve their work significantly.
One convenient way to create customised knowledge graphs is to make use of Linked Data sources like Freebase (as Google does) or DBpedia. Want more details? Take a look at the PoolParty approach to efficient knowledge modeling.
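
To make the idea of asking a graph for relations more concrete, here is a small Python sketch of triple pattern matching, the basic operation behind query languages like SPARQL. It is a toy in-memory stand-in, not a SPARQL engine and not PoolParty's API; the `GRAPH` contents, the predicate names and the `match` helper are assumptions chosen for illustration.

```python
from typing import Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Illustrative triples; the predicate names are made up for this sketch.
GRAPH: List[Triple] = [
    ("Musikverein", "is_point_of_interest_in", "Vienna"),
    ("New Year's Concert", "is_event_of", "Vienna Philharmonic Orchestra"),
    ("Vienna State Opera", "is_related_search_for", "Vienna Philharmonic Orchestra"),
]


def match(
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield every triple that agrees with the given pattern; None is a wildcard."""
    for s, p, o in GRAPH:
        if subject in (None, s) and predicate in (None, p) and obj in (None, o):
            yield (s, p, o)


# "Which points of interest are in Vienna?"
pois = [s for s, _, _ in match(predicate="is_point_of_interest_in", obj="Vienna")]
print(pois)  # → ['Musikverein']
```

Leaving a position as `None` plays the role of a query variable: the call above roughly corresponds to a pattern with a variable subject, a fixed predicate and a fixed object.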
Thomas Thurner

Wolters Kluwer Deutschland is publishing 2 legal thesauri as Linked Open Data

Wolters Kluwer Deutschland GmbH (WKD) publishes two legal thesauri as Linked Open Data for free re-use by public administrations, industry and the Open Data community

(Munich, 12.07.2012, WKD) As of today, two thesauri (controlled vocabularies) covering legal topics are available for free re-use as Linked Open Data: one thesaurus covers labor law topics in German, while the other describes German and European courts. Both vocabularies can be accessed at: vocabulary.wolterskluwer.de/.

The labor law thesaurus covers all main areas of labor law, such as the roles of employee and employer, legal aspects of labor contracts and dismissal, as well as co-determination and industrial action. This makes the thesaurus relevant for everyone dealing with labor law, from professionals like specialized lawyers to employees looking for definitions of legal terms. Links to thematically similar thesauri (following the Linked Open Data paradigm) have already been created and are available as well.

The courts thesaurus structures German and European courts hierarchically and includes, for example, address information. This thesaurus is aimed not only at parties interested in legal matters but also at developers building geodata applications. Information about courts and their roles and responsibilities may become an interesting aspect of many applications in the future.

The publication of these data sets as Open Data has many motivations. Two in particular should be mentioned here: the first is to help our customers cope with information overload, and the second is to support activities in the OGD (Open Government Data) community.

Creating legal vocabularies is far from trivial, and hardly any resources are available in German. By making these thesauri publicly available, we especially want to support administrations in classifying and structuring their internal data so that they can easily connect this data to relevant WKD legal resources afterwards (interoperability of data). The community, on the other hand, is very active in some domains but unfortunately very reluctant when it comes to legal topics. Our aim here is to give initial support and raise awareness that highly interesting and relevant applications can be built with this data as well. In the end, all interested parties have to work together collaboratively to bring transparency to the diversity and sheer amount of legal information; this is not possible within application silos and isolated approaches.

With this effort, Wolters Kluwer Deutschland GmbH is becoming part of the global Open Data movement, which is also heavily promoted by the European Commission, in order to strengthen Europe as an industrial location.

The license models used here (such as Creative Commons CC BY 3.0 for the contents) are as open as possible, in order to provide a real basis for further collaborative development.

This commitment also implies next steps: both thesauri will be presented to different target groups, and the resulting discussions will hopefully generate many new requirements and concrete models for collaboration.

Facts and Figures

Licenses of WKD thesauri

  • Data is licensed using ‘Creative Commons Namensnennung 3.0 Deutschland (CC BY 3.0)’ License.
  • Data model is licensed using ‘ODBL’ License.
  • Links to external sources are licensed using a ‘CC0 1.0 Universal (CC0 1.0) Public Domain Dedication’ License.

Published as Linked Open Data (LOD)

WKD Thesauri are linked with

Programming interfaces as API / SPARQL endpoints available at:

Used software tool

PoolParty Thesaurus Management Suite (www.poolparty.biz)

Both thesauri are described in ADMS format

coming from the European Commission, in order to be easily re-used in e-government services: http://joinup.ec.europa.eu/asset/adms/description

This project was implemented in a partnership between

Wolters Kluwer Deutschland GmbH (http://www.wolterskluwer.de), Semantic Web Company Wien (http://www.semantic-web.at) and the FP7 Project LOD2 (http://lod2.eu)

For more information, you may contact

Christian Dirschl
Wolters Kluwer Deutschland GmbH (WKD)
Freisinger Strasse 3
D-85716 Unterschleißheim
cdirschl@wolterskluwer.de

 

Andreas Blumauer

PoolParty Thesaurus Manager 3.1 with auto-population feature was presented at SemTechBiz 2012 in San Francisco

A new PoolParty Thesaurus Manager (PPT) release was presented at this year’s Semantic Technology & Business Conference in San Francisco: Version 3.1.0 is a major release offering many new functionalities and improvements, including auto-population of thesauri and linked data knowledge models.

The main new features are:

  • Autopopulation of Thesauri from DBpedia
    The Skossy functionality has been integrated into PPT. You can assign DBpedia categories to concepts and then autopopulate your thesaurus based on data from DBpedia.

  • Linked Data Based Synonym and Translation Service
    You can add labels (pref, alt, hidden) to the concepts of your thesaurus based on suggestions for synonyms and translations provided by data from DBpedia.
  • ADMS Description for Projects
    Metadata for PoolParty projects can now be published according to the Asset Description Metadata Schema (ADMS) developed by the joinup project of the European Union.

  • Windows Theme
    A new theme has been added based on the Windows GUI guidelines.
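
As a rough illustration of what auto-population from a category can mean, the following Python sketch adds the members of a category as narrower concepts and attaches suggested labels. It does not show PoolParty's actual implementation and does not call DBpedia: `CATEGORY_MEMBERS` and `SUGGESTED_LABELS` are small in-memory stand-ins with illustrative content, and `autopopulate` is a hypothetical helper.

```python
from typing import Dict, List

# Stand-ins for data that would come from DBpedia; contents are illustrative.
CATEGORY_MEMBERS: Dict[str, List[str]] = {
    "Category:Austrian_orchestras": ["Vienna Philharmonic", "Vienna Symphony"],
}
SUGGESTED_LABELS: Dict[str, List[str]] = {
    "Vienna Philharmonic": ["Wiener Philharmoniker"],
    "Vienna Symphony": ["Wiener Symphoniker"],
}

Thesaurus = Dict[str, Dict[str, List[str]]]


def autopopulate(thesaurus: Thesaurus, concept: str, category: str) -> List[str]:
    """Add the category's members as narrower concepts of `concept`.

    Returns the members that were newly added, so a repeated run adds nothing.
    """
    entry = thesaurus.setdefault(concept, {"narrower": [], "alt_labels": []})
    added: List[str] = []
    for member in CATEGORY_MEMBERS.get(category, []):
        if member in entry["narrower"]:
            continue  # already populated earlier
        entry["narrower"].append(member)
        # Create the member concept with suggested synonyms/translations,
        # unless a concept with that name already exists in the thesaurus.
        thesaurus.setdefault(
            member,
            {"narrower": [], "alt_labels": list(SUGGESTED_LABELS.get(member, []))},
        )
        added.append(member)
    return added


thesaurus: Thesaurus = {}
print(autopopulate(thesaurus, "Orchestras", "Category:Austrian_orchestras"))
# → ['Vienna Philharmonic', 'Vienna Symphony']
```

Returning only the newly added concepts keeps the operation repeatable: running it again for the same category leaves the thesaurus unchanged.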

Andreas Koller from Semantic Web Company: “SemTechBiz 2012 was a great success for us, we had a lot of talks with people from various industries at our booth. Demonstrating how building knowledge models on top of linked data sources can improve text mining for example, attracted wide interest. We enjoyed the whole conference, the location and the support from the organization team.”

To get an overview of all changes made in Release 3.1.0, take a look at the Release Notes.

Andreas Blumauer

Re-vamped PoolParty Knowledge Discoverer has been released

The PoolParty team has released a brand-new version of its Knowledge Discoverer to showcase the power of knowledge models in combination with linked data and text mining.

First of all: PoolParty Knowledge Discoverer is less a search engine in the ‘classical’ sense than a tool for collecting context information about documents that deal with domain-specific ‘things’ like persons, places, companies etc.

PoolParty Knowledge Discoverer

Don’t expect to find a pizzeria in your neighbourhood with this kind of tool. If you want to build a similar tool, take a look at the PoolParty product family.

How does it work?

Provide some text either by

  • typing your topic or
  • retrieving text from a URL or
  • entering text directly into the editor

PoolParty will analyse your text.

Now you will get smart recommendations and context information:

  • related content from Wikipedia
  • categories related to the text
  • images related to the text
  • tags relevant for the text
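
The tagging step can be sketched in a few lines of Python. This is only a simplified illustration of thesaurus-based tagging, not the text mining that PoolParty actually performs; the `THESAURUS` content and the `extract_tags` helper are assumptions made up for this example. Each concept's labels are searched in the text, and concepts are ranked by how often they occur.

```python
import re
from typing import Dict, List

# Hypothetical thesaurus: preferred label -> lower-case labels to look for.
THESAURUS: Dict[str, List[str]] = {
    "Open Data": ["open data", "open government data"],
    "Linked Data": ["linked data", "lod"],
    "European Commission": ["european commission"],
}


def extract_tags(text: str) -> List[str]:
    """Return preferred labels found in the text, most frequent first."""
    lowered = text.lower()
    counts: Dict[str, int] = {}
    for pref_label, labels in THESAURUS.items():
        # Count whole-word occurrences of every label of the concept.
        hits = sum(
            len(re.findall(r"\b" + re.escape(label) + r"\b", lowered))
            for label in labels
        )
        if hits:
            counts[pref_label] = hits
    # Sort by descending frequency, then alphabetically for a stable order.
    return sorted(counts, key=lambda tag: (-counts[tag], tag))


sample = (
    "The European Commission promotes open data. "
    "Open data portals increasingly publish linked data."
)
print(extract_tags(sample))
# → ['Open Data', 'European Commission', 'Linked Data']
```

Because tags are preferred labels of concepts rather than raw words, a synonym such as ‘LOD’ in the text still produces the tag ‘Linked Data’.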

For example: if you want to get a quick overview of an interesting article in ‘The Guardian’ about open data, just click on the bookmarklet, which can be installed to use the Knowledge Discoverer instantly, and you will be redirected to the following page.

The tool is a blueprint for many use cases in different sectors. Here are some examples:

  • find relations between open positions and applicants in your recruiting database
  • find those pieces of your technical documentation which are related to a concrete description of a customer’s problem
  • save time when analysing new markets by collecting and linking information about your target market from different databases

Interested? Wanna see how this could work in established platforms like Confluence? Come to Atlassian Summit or SemTechBiz (both to be held in San Francisco) next week and visit us at the PoolParty booth!