The release of the ISO standard for thesauri, “ISO 25964 Part 1: Thesauri for information retrieval”, in 2011 was a huge step, as it replaced standards that dated back to 1986 (ISO 2788) and 1985 (ISO 5964). With it, methodologies from a pre-Web era, when thesauri were mainly developed to be published on paper, have been brought up to date. The new standard also brought a shift from a term-based model to a concept-based model, stating: “Each term included in a thesaurus should represent a single concept (or unit of thought)” (ISO 25964 Part 1, page 15). That brings it close to Semantic Web based data models like SKOS and also shows that formerly disconnected communities are now working together.
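To make the shift from terms to concepts a bit more tangible, here is a minimal sketch in Python. It is not taken from the standard or from any tool; the class name `Concept`, the function `to_turtle` and the example.org URI are assumptions made up for illustration. The point is only that the concept is the primary object and the terms are labels attached to it, which is also how SKOS models it.

```python
from dataclasses import dataclass, field

SKOS_PREFIX = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .\n\n"


@dataclass
class Concept:
    """A unit of thought; terms are only labels attached to it (hypothetical class)."""
    uri: str
    pref_labels: dict[str, str] = field(default_factory=dict)        # language -> preferred term
    alt_labels: dict[str, list[str]] = field(default_factory=dict)   # language -> non-preferred terms
    broader: list[str] = field(default_factory=list)                 # URIs of broader concepts


def _escape(label: str) -> str:
    """Escape backslashes and double quotes for a Turtle string literal."""
    return label.replace("\\", "\\\\").replace('"', '\\"')


def to_turtle(concept: Concept) -> str:
    """Serialize one concept as a SKOS description in Turtle (without the prefix line)."""
    lines = [f"<{concept.uri}> a skos:Concept"]
    for lang, label in sorted(concept.pref_labels.items()):
        lines.append(f'    skos:prefLabel "{_escape(label)}"@{lang}')
    for lang, labels in sorted(concept.alt_labels.items()):
        for label in labels:
            lines.append(f'    skos:altLabel "{_escape(label)}"@{lang}')
    for target in concept.broader:
        lines.append(f"    skos:broader <{target}>")
    return " ;\n".join(lines) + " ."


if __name__ == "__main__":
    # Illustrative concept with one preferred term per language.
    cat = Concept(
        uri="http://example.org/concept/1",
        pref_labels={"en": "cat", "de": "Katze"},
        alt_labels={"en": ["feline"]},
    )
    print(SKOS_PREFIX + to_turtle(cat))
```

In a term-based model, “cat” and “Katze” would be separate entries linked by an equivalence relation; here they are simply two labels of the same concept.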
Promising title. After two and a half days (well, for almost all of us) we entered the final phase of the plenary. That meant two and a half days of intense and interesting discussions, catching up with all that has been done so far and planning what should happen in the next half year. But there were still two sessions in front of us.
The afternoon started with the discussion of WP9, the “Open Government Data” use case. First, Uroš Milošević from Institut Mihajlo Pupin (IMP) reported on the Serbian CKAN project, which already holds some data from the Statistical Office of Serbia. Tools from the LOD2 stack have been and will be used for this project. Sounds great!
Then Irina Bolychevsky of OKFN continued the session, announcing that a better integration between CKAN and the LOD2 Stack should be made to get more RDF into publicdata.eu. Good idea! We collected ideas for integration and talked about, e.g., a wizard for generating RDF from .csv files (ULEI is working on something like that). An integration of Google Refine was also discussed. The consortium decided to hold an extraction sprint transforming a (to be defined) number of interesting data sets from CKAN to RDF.
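To give an idea of what the core of such a CSV-to-RDF step could look like, here is a minimal sketch using only the Python standard library. It is not the tool ULEI is working on; the function names, the `row/` and `prop/` URI patterns and the sample data are all assumptions for illustration. Every row becomes a subject, every column a predicate, and every non-empty cell a plain literal in N-Triples.

```python
import csv
import io
from urllib.parse import quote


def escape_literal(value: str) -> str:
    """Escape the characters that are not allowed raw in an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def csv_to_ntriples(csv_text: str, base_uri: str, key_column: str) -> list[str]:
    """Turn each CSV row into N-Triples lines (hypothetical URI scheme).

    The value in key_column identifies the row; all other non-empty
    cells become plain literals on a predicate named after the column.
    """
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{base_uri}row/{quote(row[key_column], safe='')}>"
        for column, value in row.items():
            # Skip the key itself, surplus cells without a header, and empty cells.
            if column is None or column == key_column or not value:
                continue
            predicate = f"<{base_uri}prop/{quote(column, safe='')}>"
            triples.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return triples


if __name__ == "__main__":
    # Made-up sample data, not taken from any real CKAN data set.
    sample = "id,label,value\na1,Example A,10\na2,Example B,\n"
    for line in csv_to_ntriples(sample, "http://example.org/", "id"):
        print(line)
```

A real wizard would add the interesting parts on top of this: mapping columns to existing vocabularies, typing the literals and linking values to other data sets.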
Finally we had a discussion about whether linked data is a (the) solution for CKAN to find data, find related data, etc. Well, I think the people in the consortium are pretty sure it is (not so sure whether people from OKFN are). Irina and Mark from OKFN invited everyone to provide input to the use case.
This session ended with a presentation about WP9a from Jindřich Mynarz from UEP and Martin Nečaský from CU. They are developing a distributed marketplace for public contracts. An ontology for public contracts has been developed and is open for review on Google Code. The next step here will be a web application for filing/creating public contracts in RDF as linked data using tools from the stack. So all in all pretty good progress in WP9.
The third day and the plenary ended with Martin Kaltenböck from SWC and Sören Auer, our project lead from ULEI, presenting WP10-11-12: Dissemination, Exploitation and Project Management. First we voted for our next plenary to be in Cambridge (hosted by OKFN). Past dissemination activities had already been presented on day one, so Martin reminded us all to write blog posts about all the great things we are doing in LOD2. The next big dissemination activity, and also a good opportunity to meet people from the consortium, will be the European Data Forum on June 6-7 in Copenhagen.
And that was pretty much it. I, as I hope all the others did, enjoyed three days with a bunch of great people from all over Europe working on a great project. As always it was intense, but it was also fun. Hope everyone had a safe trip home!
Reviewing the interview we did with Les Kneebone (project manager of the vocabulary projects at Education Services Australia) in November 2010, we can see that ESA has been one of the early adopters of SKOS as a standard for thesaurus development. Les said then: “We had already identified SKOS as an important standard for ScOT so it was natural to select PoolParty as our new thesaurus management tool”. Around a year later, ESA's vocabulary site went online with PoolParty as its basis.
We asked Les to comment on his statement from last year and he confirmed that SKOS continues to be central to the ESA vocabulary business model and that it has also been important for ESA that PoolParty has been flexible enough to support continued publication of non-RDF formats, especially IMS VDEX.
In the course of this project it became more and more obvious that SKOS can be used not only as yet another format for publishing thesauri but also as a unified model for building thesauri in general. This approach made several improvements to ESA's vocabulary development model and maintenance process possible. Since all data is stored as RDF in a triple store, and SKOS and RDF are flexible formats supporting interoperability and interchangeability of data, many manual transformations that had to be done before are no longer needed, and all other systems using the vocabularies are dynamically fed by PoolParty, which offers the data in the formats they need (see image below).
Les states that while some manual processes still exist to support legacy systems, PoolParty ensures the integrity and richness of ESA data. Support and customizations for legacy systems can be achieved in the confidence that the linked-data capabilities are centrally managed and stored in the PoolParty triple store.
From the publishing perspective, the previous vocabulary publishing site has been replaced by the PoolParty Linked Data Frontend (LD-Frontend), which has been customized especially for this project to offer more flexibility in the display and layout of the data. Similar to the frontend for the Austrian Geological Survey mentioned in a previous blog post, the LD-Frontend has been adapted to the ESA styleguide, and the display of the data in the HTML view of the frontend has been made more user-friendly (see screenshot below).
From ESA’s perspective, Les commented that for the vocabulary manager, edits to the frontend styles and templates are intuitive and can be tested in staging environments. He also stated that support is important for publishing, and that SWC was very responsive.
Of course we asked Les to give a preview of the next steps for ESA. He stated that they include language translation projects so that its vocabularies, especially Schools Online Thesaurus (ScOT), can be accessed by wider markets and by students of other languages. He also stated that PoolParty handles multi-lingual thesauri very well.
We here at SWC are glad to see PoolParty used in more and more applications and usage scenarios. We are looking forward to the next steps that will be done in this project and also to see how the data offered by the ESA vocabulary site is used in other applications.
Thanks to Les Kneebone from ESA for his contribution to this blog post.
Throughout the last year the Semantic Web Company team has supported the Geological Survey of Austria (GBA) in setting up their thesaurus project. It started with a workshop in summer 2010 where we discussed use cases for using semantic web technologies as a means to fulfill the INSPIRE directive. Now, in fall 2011, GBA has published their first thesauri as Linked Data using PoolParty’s new Linked Data front-end.
The Thesaurus Project of the GBA aims to create controlled vocabularies for the semantic harmonization of map-based geodata. The content-related realization of this project is governed by the Thesaurus Editorial Team, which consists of domain experts from the Geological Survey of Austria. With the development of semantically and technically interoperable geo-data the Geological Survey of Austria implements its legal obligation defined by the EU-Directive 2007/2/EC INSPIRE and the national “Geodateninfrastrukturgesetz” (GeoDIG), respectively.
The thesauri were built using the PoolParty Thesaurus Manager, so they are all based on SKOS and fully compliant with the Linked Data principles. Apart from the standard implementation of SKOS, some additions were made to the data model: Dublin Core terms for extra metadata, and custom sub-properties of skos:related to add semantic constraints to related properties. In practice, a big effort was put into the integration of bibliographic references for every concept in the data set using dcterms:source. This addresses the requirements of reuse by the scientific community and incorporation into domain-specific data sets. In addition, rdfs:subPropertyOf was used to express how international geologic time scales map to regional concepts.
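Why define a custom sub-property instead of a completely new property? A small sketch in Python may help. It does not use the actual GBA data model: the names `ex:regionalStageOf`, `ex:RegionalUnitA` and `ex:InternationalUnitB` and the source string are made up for illustration. The sketch applies the RDFS entailment rule for rdfs:subPropertyOf (rule rdfs7) to a tiny set of triples, showing that a consumer who only understands skos:related still sees the link.

```python
SKOS_RELATED = "skos:related"
SUBPROPERTY_OF = "rdfs:subPropertyOf"

# Hypothetical triples (subject, predicate, object); none of these names
# are taken from the published GBA thesauri.
triples = {
    ("ex:regionalStageOf", SUBPROPERTY_OF, SKOS_RELATED),
    ("ex:RegionalUnitA", "ex:regionalStageOf", "ex:InternationalUnitB"),
    ("ex:RegionalUnitA", "dcterms:source", "Hypothetical bibliographic reference"),
}


def infer_superproperties(graph):
    """Apply RDFS rule rdfs7 until nothing new is found.

    rdfs7: (p rdfs:subPropertyOf q) and (s p o) entail (s q o).
    """
    inferred = set(graph)
    changed = True
    while changed:
        changed = False
        sub_props = {(s, o) for s, p, o in inferred if p == SUBPROPERTY_OF}
        for sub, sup in sub_props:
            for s, p, o in list(inferred):
                if p == sub and (s, sup, o) not in inferred:
                    inferred.add((s, sup, o))
                    changed = True
    return inferred


if __name__ == "__main__":
    for triple in sorted(infer_superproperties(triples) - triples):
        print("inferred:", triple)
```

So the more specific property carries the extra meaning for those who know it, while generic SKOS tools can fall back to the plain skos:related relation.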
With the new PoolParty release (3.0), the Linked Data front-end has been redesigned and is now highly customizable and extendable. In the GBA Thesaurus Project it is used as a publishing interface for the controlled vocabularies, offering both the machine-readable RDF version and a custom HTML version for comfortable browsing and searching.
All in all, it’s satisfying to see a project we’ve supported and worked on for some time come to life, and we are looking forward to the next steps in this project.
P.S.: Thanks to Marcus Ebner from GBA for his contribution to this blog post.