Whether you are an open data enthusiast or simply want to learn how to publish data on the Web and distribute it through the Semantic Web, you will face the question: “How do I describe the dataset that I want to publish?” The same question is asked by people who apply to the European Commission for a publicly funded project and need a data management plan. In the following, we discuss options that help describe the dataset to be published.
The SEMANTiCS conference celebrated its 10th anniversary this September in Leipzig, and this year’s edition opened a new era for the Semantic Web in Europe: a marketplace for the next generation of semantic technologies was born.
As Phil Archer stated in his keynote, the Semantic Web is now mature, and academia and industry can be proud of the achievements so far. That fact set the theme for the conference: real-world use cases demonstrated by industry representatives, new and ongoing applied projects presented by the leading consortia in the field, and a lively academic community showing the next ideas and developments. This year’s SEMANTiCS conference thus brought together the European Semantic Web technology community, both from academia and industry.
- Papers and Presentations: 45 (50% of them industry talks)
- Posters: 10 (out of 22)
- A marketplace with 11 permanent booths
- Presented Vocabularies at the 1st Vocabulary Carnival: 24
- Attendance: 225
- Geographic Coverage: 21 countries
This year’s SEMANTiCS was co-located and connected with several related events, such as the German ISKO, the Multilingual Linked Open Data for Enterprises workshop (MLODE 2014) and the 2nd DBpedia Community Meeting 2014. These well-connected gatherings brought people together and enabled transdisciplinary exchange.
To sum up, this SEMANTiCS opened up new perspectives on semantic technologies when it comes to
- industry use
- problem solving capacity
- next generation development
- knowledge about top companies, institutes and people in the sector
- Save the date for SEMANTiCS 2015: 15th – 17th of September 2015, Vienna
- SEMANTiCS 2014 – picture gallery: Flickr
Imagine you could generate a thesaurus of fairly good quality for nearly any knowledge domain you can think of. Sounds impossible? Does it remind you of all the promises made by text mining software that generates “semantic nets” from scratch?
Let me introduce you to SKOSsy. I will explain what this web service can do for you:
SKOSsy generates SKOS-based thesauri in German or in English for a domain you are interested in. Not any domain, but nearly any: SKOSsy extracts data from DBpedia, so it can cover anything that is in DBpedia. Thus, SKOSsy works well whenever a first seed thesaurus should be generated for a certain organisation or project. If you load the automatically generated thesaurus into an editor like PoolParty Thesaurus Manager (PPT), you can start to enrich the knowledge model with additional concepts, relations and links to other LOD sources. The point is that you don't have to start your thesaurus project from scratch.
Let me give you an example: Imagine you are working for an international plant builder and you would like to index several thousand documents the “semantic way”. You have to walk through the following steps:
- Identify suitable categories in Wikipedia/DBpedia which best describe what your business or your domain is all about. Those categories should contain pages / resources which are related to the documents you would like to index. For example: http://dbpedia.org/resource/Category:Metalworking or http://dbpedia.org/resource/Category:Industrial_automation
- After you have selected suitable categories, SKOSsy will traverse DBpedia for you, collect all resources, their hierarchical and non-hierarchical relations, alternative labels, definitions and other properties, and put them together as a valid SKOS thesaurus; this step takes a couple of minutes. (Find the resulting vocabulary here)
- Load the resulting thesaurus into PPT, explore it, improve it and enrich it with additional facts.
- After you're done, you can generate a tailor-made text extractor by using PoolParty Extractor (PPX), which is the second component of the PoolParty product family.
- With PPX and its extraction model, curated specifically for your use case, you can extract named entities from your documents automatically and index your documents in a meaningful way.
- After a few seconds your semantic search engine is ready to be used. PoolParty Semantic Search (PPS), which is the third PoolParty component, offers some nice facilities like categorized auto-complete, faceted search, content recommendation (similarity search) and smart search suggestions to ease your life as a knowledge worker.
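To make the traversal step more tangible, here is a minimal sketch in Python. It is not how SKOSsy is implemented (its internals are not described here); it only assumes that DBpedia links resources to categories via dct:subject and categories to parent categories via skos:broader. The function names, the row format and the example.org scheme URI are hypothetical. The first function builds a SPARQL query for a seed category, the second turns result rows into SKOS statements in N-Triples syntax.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def build_category_query(category_uri: str, lang: str = "en", limit: int = 500) -> str:
    """Build a SPARQL query for resources in a category or its sub-categories."""
    if any(ch in category_uri for ch in '<>" \n'):
        raise ValueError("category_uri must be a plain IRI")
    return f"""PREFIX dct: <http://purl.org/dc/terms/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT DISTINCT ?resource ?label WHERE {{
  ?category skos:broader* <{category_uri}> .
  ?resource dct:subject ?category ;
            rdfs:label ?label .
  FILTER (lang(?label) = "{lang}")
}}
LIMIT {limit}"""


def _literal(text: str, lang: str) -> str:
    """Escape a string as an N-Triples language-tagged literal."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"@{lang}'


def rows_to_skos_ntriples(rows, scheme_uri: str, lang: str = "en") -> list:
    """Turn rows (dicts with 'resource', 'label', optional 'broader') into SKOS triples."""
    triples = []
    for row in rows:
        subject = f"<{row['resource']}>"
        triples.append(f"{subject} <{RDF_TYPE}> <{SKOS}Concept> .")
        triples.append(f"{subject} <{SKOS}inScheme> <{scheme_uri}> .")
        triples.append(f"{subject} <{SKOS}prefLabel> {_literal(row['label'], lang)} .")
        if row.get("broader"):
            triples.append(f"{subject} <{SKOS}broader> <{row['broader']}> .")
    return triples


if __name__ == "__main__":
    print(build_category_query("http://dbpedia.org/resource/Category:Metalworking"))
    sample = [{"resource": "http://dbpedia.org/resource/Welding", "label": "Welding"}]
    print("\n".join(rows_to_skos_ntriples(sample, "http://example.org/scheme/plant")))
```

The query string could be sent to a SPARQL endpoint of your choice; the sketch deliberately stops short of any network call, and a real seed thesaurus would also need alternative labels, definitions and non-hierarchical relations.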
Over the last few years we have repeatedly discussed how thesauri and other knowledge models can improve search. Many people understood straight away why thesaurus-based search is most often much better than search algorithms based purely on statistics. Of course, the main counter-argument has always been that the costs of establishing a “good-enough” thesaurus, let alone a “high-quality” one, are too high.
With SKOSsy in place those kinds of arguments become weaker and weaker. To sum up,
- SKOSsy makes heavy use of Linked Data sources, especially DBpedia
- SKOSsy can generate SKOS thesauri for virtually any domain within a few minutes
- Such thesauri can be improved, curated and extended to one's individual needs, but they usually serve as “good-enough” knowledge models for any semantic search application you like
- SKOSsy-based semantic search usually outperforms search algorithms based purely on statistics, since the underlying thesauri contain high-quality information about relations, labels and disambiguation
- SKOSsy works perfectly together with the PoolParty product family
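Why labels matter for search can be illustrated with a toy sketch in Python. It is not PPX or PPS; the mini-thesaurus, its example.org URIs and the function names are made up for illustration. The idea is simply that preferred and alternative labels all point to the same concept, so a document mentioning “PLC” and one mentioning “programmable logic controller” end up indexed under the same concept.

```python
import re

# Hypothetical mini-thesaurus: concept URI -> (preferred label, alternative labels).
THESAURUS = {
    "http://example.org/concept/welding": ("Welding", ["weld", "welded joint"]),
    "http://example.org/concept/plc": ("Programmable logic controller", ["PLC"]),
}


def build_label_index(thesaurus: dict) -> dict:
    """Map every lower-cased label (preferred or alternative) to its concept URI."""
    index = {}
    for uri, (pref_label, alt_labels) in thesaurus.items():
        for label in [pref_label, *alt_labels]:
            index[label.lower()] = uri
    return index


def tag_document(text: str, index: dict) -> set:
    """Return the URIs of all concepts whose labels occur as whole words in the text."""
    lowered = text.lower()
    found = set()
    for label, uri in index.items():
        if re.search(r"\b" + re.escape(label) + r"\b", lowered):
            found.add(uri)
    return found


if __name__ == "__main__":
    index = build_label_index(THESAURUS)
    print(tag_document("The PLC controls each weld on the line.", index))
```

A production extractor would add stemming, disambiguation and multilingual labels; the sketch only shows the label-to-concept lookup.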
If you are interested in the results produced by SKOSsy, just send us a short note about your domain or your project, and we will send you an invitation to become a beta tester or prepare a demo for you.