Martin Kaltenböck

Linked (Open) Data has reached the European Publishing Industry, but is it ‘Real Linked Data’? A short review of the Publishers’ Forum 2013

Invited by Helmut von Berg, Director at Klopotek & Partner (Klopotek is THE European vendor for publishing production software), I had the chance to participate and speak at this year’s Publishers’ Forum at the Concorde Hotel in Berlin on 22 and 23 April 2013.

Coming from the semantic web / linked (open) data community to this publishing industry event, with about 320 participants (mainly decision makers) from small to huge publishers all across Europe, I was really curious in the run-up to the Forum: what would be the most important issues for innovative publishing processes, and what would be the hypes and hopes of a sector that is in the middle of a big change, moving from paper publishing straight into today’s data economy?

And then in Berlin, on Monday morning, the big surprise: the opening keynotes by David Worlock, Outsell, UK (title of talk: The Atomization of Everything) and Dan Pollock, Nature Publishing Group, UK (title of talk: Networked Publishing is Open for Business) already mentioned topics such as the Semantic Web, Linked (Open) Data and even RDF and triple stores. Last but not least, they pointed out that publishers’ content needs to be atomized down to the ‘data level’ and can then be used successfully for new and innovative business models to serve existing and future customers.

David Worlock ‘singing my song’ at the Publishers’ Forum 2013

As I had participated in the European Data Forum 2013 (EDF2013) just a few days before the Publishers’ Forum, my first thought was: WOW, publishers have arrived in the modern data economy (and are already following the data value chain)! I enjoyed talking to David Worlock in the coffee break, telling him my thoughts and that I would be running a workshop on ‘Enterprise Terminology as a basis for powerful semantic services for publishers’ that afternoon (see slides on slideshare). His answer was: ‘Yes Martin, it seems that I was singing your song’.

The following 1.5 days of the Publishers’ Forum 2013 were full of presentations, workshops and discussions about innovative publishing processes, new business models for publishers and innovative approaches and services. They were full of terms that are very familiar to me, such as metadata management, semantics, contextualisation and, very often, Big Data and Linked (Open) Data. I listened very carefully to all of this, and at some point it became clear that this discussion needs to be evaluated more carefully: many of the talks and presentations used these terms, principles and technologies only as marketing buzzwords, and a deeper look showed that there is no semantic web technology in place?!

Hey, Linked Data does NOT mean establishing something like a relation or link between ‘an author and a publication’ inside a repository or database. Linked (Open) Data is a well-established and well-specified methodology based on W3C semantic web standards:

Tim Berners-Lee outlined four principles of linked data in his Design Issues note on Linked Data, as follows:

  • Use URIs to denote things.
  • Use HTTP URIs so that these things can be referred to and looked up (“dereferenced”) by people and user agents.
  • Provide useful information about the thing when its URI is dereferenced, leveraging standards such as RDF*, SPARQL.
  • Include links to other related things (using their URIs) when publishing data on the Web.
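To make the difference from a plain database relation concrete, here is a minimal sketch of the four principles using only the Python standard library. All `example.org` and `other.example.net` identifiers are hypothetical and chosen purely for illustration; the predicate URIs (Dublin Core `creator`, `rdfs:label`, `owl:sameAs`) are existing vocabulary terms. It does not dereference anything over the network, it only shows what the data would have to look like.

```python
from urllib.parse import urlparse

# ASSUMPTION: hypothetical identifiers, used only for illustration.
BOOK = "http://example.org/publication/42"
AUTHOR = "http://example.org/author/jane-doe"
EXTERNAL_AUTHOR = "http://other.example.net/people/jane-doe"

# Existing vocabulary terms used as predicates.
DCT_CREATOR = "http://purl.org/dc/terms/creator"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def is_http_uri(value):
    """Principles 1 and 2: things are named with HTTP(S) URIs."""
    parts = urlparse(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def literal(text):
    """Mark a plain string as an RDF literal rather than a URI."""
    return ("literal", text)


triples = [
    (BOOK, DCT_CREATOR, AUTHOR),
    (BOOK, RDFS_LABEL, literal("An Example Publication")),
    (AUTHOR, RDFS_LABEL, literal("Jane Doe")),
    (AUTHOR, OWL_SAME_AS, EXTERNAL_AUTHOR),
]


def to_ntriples(rows):
    """Principle 3: describe things in RDF (serialised here as N-Triples)."""
    lines = []
    for subject, predicate, obj in rows:
        if isinstance(obj, tuple):
            # Literal: quote it and escape backslashes and double quotes.
            escaped = obj[1].replace("\\", "\\\\").replace('"', '\\"')
            obj_text = '"' + escaped + '"'
        else:
            obj_text = "<" + obj + ">"
        lines.append("<%s> <%s> %s ." % (subject, predicate, obj_text))
    return "\n".join(lines)


def external_links(rows, own_host):
    """Principle 4: collect object URIs that point outside our own host."""
    return [obj for _, _, obj in rows
            if isinstance(obj, str) and urlparse(obj).netloc != own_host]


print(to_ntriples(triples))
print(external_links(triples, "example.org"))
```

The author-publication relation is still there, but both ends are HTTP URIs, the description is expressed in RDF, and the author is linked to an identifier in someone else's dataset. A foreign key inside a closed repository offers none of this.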

Please read in more detail here:

As something of an evangelist for Linked (Open) Data, I think such hype can be very dangerous for the publishing industry. I see a very strong need for these companies to adopt innovative content and data management approaches very quickly, to ensure competitiveness today as well as competitive advantage tomorrow. But not using the respective standards (that is, only having the packaging and marketing brochures branded with them) cannot fulfil these hopes in the mid and long term!

I would therefore like to point out that ‘Linked Data’ is not always ‘Linked Data’, and I strongly recommend taking a look at the well-proven standards. When selecting IT consultants and IT vendors, make sure that these partners really have worked, and are continuously working, with these standards and mechanisms! (I say ‘partners’ deliberately: another very interesting message I took home from the Forum is that publishers and IT vendors should co-operate more closely in future, in the form of sustainable partnerships.)

Christian Dirschl (Wolters Kluwer) presenting the WKD Use Case on Enterprise Terminologies

By the way, I had a great workshop on Monday afternoon together with Christian Dirschl from Wolters Kluwer Germany (WKD), discussing applications on top of enterprise terminologies (controlled vocabularies using real linked (open) data principles). And the Semantic Web Company (SWC) is already a partner of the publisher WKD. This partnership is becoming more fruitful and sustainable every day, using real linked (open) data…

Andreas Blumauer

Free Webinar on Enterprise Semantics

The PoolParty team gave a webinar on November 28, 2012. We talked about scenarios and applications using semantics in enterprises. Some of the use cases we discussed were:

  • Semi-automatic tagging of content (SharePoint, Confluence, …)
  • Semantic enterprise search (Mindbreeze, FAST, Exalead, …)
  • Linked Enterprise Vocabularies
  • Enterprise linked data integration (queries across Oracle databases and unstructured text)

WATCH the VIDEO.

We showed the latest developments of the PoolParty platform and gave insights into how structured data from relational databases can be mashed up with unstructured text using linked data alignment. We also showcased how we mashed up a large text corpus with statistical financial data on top of PoolParty and UltraWrap.

Andreas Blumauer

Survey on “Perception and Relevance of Controlled Vocabulary Quality Issues”

The University of Vienna (Research Group Multimedia Information Systems) and the Semantic Web Company are conducting a survey on “Perception and Relevance of Controlled Vocabulary Quality Issues”.

The survey is aimed at practitioners who are using or who are planning to use controlled vocabularies in their organisation. We’d be happy if you take the time to fill in the questionnaire here.

The goal of this study is to find out how developers and users of controlled vocabularies deal with quality aspects of these vocabularies. More specifically, we want to answer these questions:

  • What does vocabulary quality mean for taxonomists?
  • Given a number of possible quality issues, what is their relevance in practical settings?
  • What vocabulary usage scenarios are affected by the quality issues?

The questionnaire can be answered anonymously. As with last year’s survey (Do Controlled Vocabularies Matter?), we will publish the results as a scientific contribution so that the community can gain a better understanding of how to create and use controlled vocabularies.

Andreas Blumauer

Education Services Australia announces new release of Schools Online Thesaurus

Education Services Australia has recently announced the release of Schools Online Thesaurus (ScOT) v6.7.
ScOT and agreed standards for digital resources, technical infrastructure, metadata and rights management support a national operating environment for the digital resources and infrastructure.
As part of this infrastructure, the National Digital Learning Resources Network contains over 12,000 digital resources that are free for use in all Australian schools. The resources are made available to teachers through State and Territory portals or Scootle and to pre-service teachers through eContent.
Version 6.7 has been made possible by a recent addition to the ScOT project team. Ben Chadwick has undertaken a Vocabulary Support role since January 2012, and his work with web services, data mining and thesaurus editing has contributed to the delivery of a substantial body of work.

Significant steps have been taken in the area of non-English labels, especially the addition of Chinese, Indonesian, Japanese and Korean term translations. Other preliminary work, including development of language and character encoding support, facilitates translations in Arabic, Māori and other languages. A sample concept can be found at http://vocabulary.curriculum.edu.au/scot/976

This represents a substantial opportunity for ScOT to support users who are learning or who have a background in languages other than English. Online environments can be designed or adapted to take advantage of standardised language encoding and character support.
A number of new features and improvements have been developed in the ScOT website:
  • User generated reports – lists new, modified and deprecated terms
  • Cataloguing tool – quickly identifies hierarchical relationships within a group of terms
  • Linked Data API for querying ScOT database: SPARQL Endpoint
  • Tips and code examples for developing search tools and managing term changes
  • Revised license and simplified registration process
  • Auto-complete feature for searching ScOT
  • Web Content Accessibility Guidelines (WCAG 2.0) – range of issues identified and fixed
The associated report can be accessed from the ScOT website releases page http://scot.curriculum.edu.au/releases.asp.
ScOT is based on PoolParty technologies.
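
As a rough illustration of what the SPARQL endpoint and the new non-English labels make possible, the sketch below builds a query for the preferred labels of the sample concept in several languages and encodes it as a SPARQL protocol GET request. Several things here are assumptions rather than facts from the release notes: the endpoint address is a placeholder (the real one is linked from the ScOT website), the labels are assumed to be modelled as `skos:prefLabel`, and the language tags `zh`, `id`, `ja` and `ko` are assumed to match the tags actually used in the data. No request is sent; the code only constructs the URL.

```python
from urllib.parse import urlencode

# ASSUMPTION: placeholder address, not the real ScOT SPARQL endpoint.
SPARQL_ENDPOINT = "http://example.org/scot/sparql"

# Sample concept URI taken from the announcement.
CONCEPT_URI = "http://vocabulary.curriculum.edu.au/scot/976"


def build_label_query(concept_uri, languages):
    """Build a SPARQL query for the concept's labels in the given languages.

    ASSUMPTION: labels are stored as skos:prefLabel with plain language
    tags such as "ja"; the real data may use different tags.
    """
    lang_list = ", ".join('"%s"' % lang for lang in languages)
    return (
        "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>\n"
        "SELECT ?label (LANG(?label) AS ?lang)\n"
        "WHERE {\n"
        "  <%s> skos:prefLabel ?label .\n"
        "  FILTER (LANG(?label) IN (%s))\n"
        "}" % (concept_uri, lang_list)
    )


def build_request_url(endpoint, query):
    """Encode the query as the 'query' parameter of a SPARQL protocol GET."""
    return endpoint + "?" + urlencode({"query": query})


query = build_label_query(CONCEPT_URI, ["en", "zh", "id", "ja", "ko"])
url = build_request_url(SPARQL_ENDPOINT, query)
print(query)
print(url)
```

With the real endpoint address substituted, the resulting URL could be fetched with any HTTP client. The result format depends on the endpoint and on the `Accept` header sent with the request.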