<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>The Semantic Puzzle &#187; Linked Data &amp; Open Data</title>
	<atom:link href="http://blog.semantic-web.at/category/linked-data-open-data/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.semantic-web.at</link>
	<description>Open World Assumptions</description>
	<lastBuildDate>Tue, 31 Aug 2010 04:44:19 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>I-Semantics 2010: Relevance of semantic technologies for industry increases fast</title>
		<link>http://blog.semantic-web.at/2010/07/01/i-semantics-2010-relevance-of-semantic-technologies-for-industry-increases-fast/</link>
		<comments>http://blog.semantic-web.at/2010/07/01/i-semantics-2010-relevance-of-semantic-technologies-for-industry-increases-fast/#comments</comments>
		<pubDate>Thu, 01 Jul 2010 08:50:39 +0000</pubDate>
		<dc:creator>Andreas Blumauer</dc:creator>
				<category><![CDATA[Calls & Competitions]]></category>
		<category><![CDATA[Conferences & Events]]></category>
		<category><![CDATA[Corporate Semantic Web]]></category>
		<category><![CDATA[Linked Data & Open Data]]></category>
		<category><![CDATA[I-SEMANTICS]]></category>

		<guid isPermaLink="false">http://blog.semantic-web.at/?p=1650</guid>
		<description><![CDATA[
I-Semantics will take place for the 6th time this September and will again be co-located with I-Know in Graz/Austria. This year&#8217;s programme shows that the Semantic Web and semantic technologies in general are increasingly relevant for all kinds of industries:

Biomedicine
Public administration &#38; Public transport
Information technology
Libraries
Media &#38; Content Industry
E-commerce
Education etc.


I-Semantics &#8220;Industry Track&#8221; with its [...]]]></description>
			<content:encoded><![CDATA[<!-- sphereit start --><p><img class="alignnone" src="http://i-semantics.tugraz.at/wp-content/themes/i-know/images/logo_i-semantics.png" alt="I-Semantics 2010" width="219" height="39" /></p>
<p><a href="http://i-semantics.tugraz.at/" target="_blank">I-Semantics</a> will take place for the 6th time this September and will again be co-located with <a href="http://i-know.tugraz.at/">I-Know</a> in <a href="http://www.geonames.org/2778058/graz-stadt.html" target="_blank">Graz/Austria</a>. This year&#8217;s programme shows that the Semantic Web and semantic technologies in general are increasingly relevant for all kinds of industries:</p>
<ul>
<li>Biomedicine</li>
<li>Public administration &amp; Public transport</li>
<li>Information technology</li>
<li>Libraries</li>
<li>Media &amp; Content Industry</li>
<li>E-commerce</li>
<li>Education etc.</li>
</ul>
<p><img class="alignnone" src="http://i-semantics.tugraz.at/wp-content/uploads/2009/11/i_know_pictures_small.jpg" alt="450 people in 2009" width="600" height="111" /></p>
<p>The I-Semantics &#8220;<a href="http://i-semantics.tugraz.at/industry-track" target="_blank">Industry Track</a>&#8221;, with its three-day programme full of demos, is one of the highlights of the congress. With 28 submissions, this year&#8217;s <a href="http://i-semantics.tugraz.at/triplification-challenge" target="_blank">Triplification Challenge</a> says a lot about the significance of Linked Data in areas like librarianship, public administration or GIS &amp; environmental planning. Take a look at the <a href="http://i-semantics.tugraz.at/triplification-challenge/nominated-papers" target="_blank">15 nominees</a> &#8211; and if you are considering coming to I-Semantics 2010, follow the link for <a href="http://i-semantics.tugraz.at/registration" target="_blank">registration</a>.</p>
<!-- sphereit end --><span style="margin-bottom:40px; border-bottom:none;"><a class="iconsphere" title="Sphere: Related Content" onclick="return Sphere.Widget.search('http://blog.semantic-web.at/2010/07/01/i-semantics-2010-relevance-of-semantic-technologies-for-industry-increases-fast/')" href="http://www.sphere.com/search?q=sphereit:http://blog.semantic-web.at/2010/07/01/i-semantics-2010-relevance-of-semantic-technologies-for-industry-increases-fast/">Sphere: Related Content</a></span><br/><br/>]]></content:encoded>
			<wfw:commentRss>http://blog.semantic-web.at/2010/07/01/i-semantics-2010-relevance-of-semantic-technologies-for-industry-increases-fast/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Stella Dextre Clarke &amp; Alan Gilchrist about the &#8220;Future of Knowledge Organization on the Web&#8221;</title>
		<link>http://blog.semantic-web.at/2010/06/21/stella-dextre-clarke-alan-gilchrist-about-the-future-of-knowledge-organization-on-the-web/</link>
		<comments>http://blog.semantic-web.at/2010/06/21/stella-dextre-clarke-alan-gilchrist-about-the-future-of-knowledge-organization-on-the-web/#comments</comments>
		<pubDate>Mon, 21 Jun 2010 06:00:10 +0000</pubDate>
		<dc:creator>Andreas Blumauer</dc:creator>
				<category><![CDATA[Linked Data & Open Data]]></category>
		<category><![CDATA[Tools & Software]]></category>
		<category><![CDATA[Vocabularies & Languages]]></category>
		<category><![CDATA[ISO]]></category>
		<category><![CDATA[Knowledge organization]]></category>
		<category><![CDATA[SKOS]]></category>

		<guid isPermaLink="false">http://blog.semantic-web.at/?p=1625</guid>
		<description><![CDATA[Semantic Web Company (SWC) had the pleasure and the opportunity to talk with two internationally recognised experts in the fields of information management and knowledge organization: Alan Gilchrist and Stella Dextre Clarke. SWC asked some questions about the &#8220;Future of Knowledge Organization on the Web &#38; Linked Data&#8221; on the occasion of an event of [...]]]></description>
			<content:encoded><![CDATA[<!-- sphereit start --><p>Semantic Web Company (SWC) had the pleasure and the opportunity to talk with two internationally recognised experts in the fields of information management and knowledge organization: <a href="http://www.metataxis.com/exponent-0.96.5-GA/themes/metataxistheme/AlanGilchristCV.pdf" target="_blank">Alan Gilchrist</a> and <a href="http://uk.linkedin.com/pub/stella-dextre-clarke/18/a55/609" target="_blank">Stella Dextre Clarke</a>. SWC asked some questions about the <strong>&#8220;Future of Knowledge Organization on the Web &amp; Linked Data&#8221;</strong> on the occasion of an event of the same name organised by <a href="http://www.iskouk.org/">ISKO UK</a> which will take place on <a href="http://www.iskouk.org/events/linked_data_sep2010.htm" target="_blank">September 14, 2010 in London</a>.</p>
<p><img class="alignnone" title="ISKO UK - Linked Data" src="http://www.iskouk.org/events/images/linked_data_titleimage.jpg" alt="" width="560" height="133" /></p>
<p><em>1. Alan, you are one of the leading experts in the field of thesaurus construction. Organising knowledge in a (worldwide) Semantic Web is a rather young discipline compared to your domain. What do you think the Semantic Web community can learn from &#8220;traditional&#8221; thesaurus management and vice versa?</em></p>
<p>You put inverted commas round the word traditional, but it might be more appropriate to put them round the word thesaurus! So long as words are used in information retrieval and in information sharing, different forms of structured vocabularies will be required, and many of the fundamental principles of thesaurus construction are still valid for their construction. Of course, the “traditional” thesaurus has mutated since the days when it was used only for controlled indexing and retrieval; and now, with the many enrichments possible it can be viewed as an ontology (in one of the definitions of this word). What remains a difficulty is to create a generalisable typology of associative relationships, though this is, of course, possible in relatively closed systems. In short, structured vocabularies with broadly thesaurus formats will be a necessary component in the web stack.</p>
<p><em>2. Stella, as a consultant you specialize in the design and implementation of knowledge structures for information retrieval applications. In the last few months we have seen that SKOS can serve as a significant building block to link &#8220;traditional&#8221; thesaurus management to knowledge structures from the semantic web. Do you see this development as market-driven? Is there significant growth in demand for solutions built around SKOS?</em></p>
<p>This question sounds surprisingly sceptical about the growth of SKOS. I guess the dizzying speed of phenomena like Facebook and Twitter has fuelled expectations of tools springing up overnight like mushrooms, fully formed and ready to eat. But actually it takes time, not just for the tools to be fashioned, but for the potential market to develop an understanding of what they can do and what will happen next when they are used.</p>
<p>Applications for SKOS are springing up all the time, as fast as people can grow the skills and vision to deploy them. At the moment the market, or shall we say the power-base, seems to be with the academic sector and allied not-for-profit organisations. This will spread progressively through the public to the private sector, as enterprises find ways of adapting their business models. The main hurdles to overcome could be intellectual property rights and the need for compilers of databases to keep earning their living.</p>
<p><em>3.  Alan, constructing thesauri for the semantic web also means that one  has to make the &#8220;open world assumption&#8221;. In which sense does this  change the way to manage thesauri, keep them growing and assure quality? Can  you see new, upcoming methodologies to do that?</em></p>
<p>Everything changes with the “open world assumption”! Following on from my answer to the previous question, it seems clear that one manifestation of the thesaurus will be found in those systems that support interoperability, such as federated searching or metadata registries. Even with simple thesaurus management software, it is possible to construct a “master vocabulary” or “word bank” to support different applications within an enterprise; thereby promoting interoperability. More sophisticated software is already available (though not very widely); more will be needed and, doubtless, will be created.</p>
<p>A more formal answer to both questions will be found in a new standard – ISO 25964, currently being prepared on the basis of <a href="http://schemas.bs8723.org/" target="_blank">BS 8723</a>. The two fundamental features of these two standards are (1) the thesaurus as a theoretical and practical basis for the construction of structured vocabularies for information retrieval and (2) the growing and vital need for interoperability between systems and the intelligent mapping of the vocabularies used by those systems.</p>
<p><em>4. Stella, just recently  at ESWC 2010, Sean Bechhofer was asked during his keynote why there are so few SKOS tools on the  market. What do you  think are the reasons for this? Are there still shortcomings of the  SKOS specification compared to other existing thesaurus standards? (see  also: <a href="http://www.eswc2010.org/program-menu/keynote-speakers/155-sean-bechhofer" target="_blank">http://www.eswc2010.org/program-menu/keynote-speakers/155-sean-bechhofer</a> &amp;<a href="http://www.slideshare.net/seanb/skos-past-present-and-future" target="_blank"> http://www.slideshare.net/seanb/skos-past-present-and-future</a> )</em></p>
<p>Regarding the speed of development, see my reply above. As to shortcomings, did you note in one of Bechhofer&#8217;s slides: &#8220;Standardisation is necessarily a compromise: Everyone equally unhappy = success!&#8221; The SKOS development team took a conscious decision to keep the schema sufficiently simple that it could be applicable to as many different types of KOS as possible.  On the downside, this means SKOS is unsatisfactory for conveying sophisticated features of some thesauri and classification schemes. But by keeping the entry barrier low, more widespread use has been encouraged.</p>
<p>By way of illustration, compare SKOS with the data model and XML schema of BS 8723. This schema is comparatively specialized, with the aim of enabling exchange of any thesaurus carrying any or all of the features recommended in the standard. And incidentally, this data model and schema will have some further capabilities added when published in the forthcoming standard ISO 25964. SKOS does not provide for a number of features in these standards (such as compound equivalence). But the schemas in BS 8723 and ISO 25964 are designed for thesaurus developers to share their work, rather than for easy publication on the Web, and will never have so many users or associated tools as SKOS.</p>
<p>So I believe that SKOS has done well to accept compromises that encourage generalisation although they might not suit some specialists. That said, I do regret one of its weaknesses in the context of mapping. Compound equivalence mappings (that is to say, where Concept A in one vocabulary maps to a combination of Concepts  B and C in another) are very commonly needed when extending a search across multiple databases, and the SKOS mapping properties do not currently allow for them. Perhaps there will be some provision in future?</p>
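<p>To make compound equivalence concrete, here is a minimal sketch in Python. It is only an illustration: the concept labels, the <code>CompoundMapping</code> class and the <code>expand_query</code> helper are hypothetical and are not part of SKOS, BS 8723 or ISO 25964. It assumes that a compound mapping means all target concepts must apply together.</p>

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class CompoundMapping:
    """One source concept mapped to a combination of target concepts."""
    source: str
    targets: tuple[str, ...]  # assumption: all targets apply together (AND)


# Hypothetical example data: "Coal mining" in one vocabulary corresponds
# to the combination "Coal" + "Mining" in another vocabulary.
MAPPINGS = [
    CompoundMapping("Coal mining", ("Coal", "Mining")),
    CompoundMapping("Libraries", ("Libraries",)),
]


def expand_query(concept: str, mappings: list[CompoundMapping]) -> str:
    """Translate a source concept into a boolean query for the target database."""
    for mapping in mappings:
        if mapping.source == concept:
            return " AND ".join(f'"{target}"' for target in mapping.targets)
    # No mapping known: fall back to the original label unchanged.
    return f'"{concept}"'


print(expand_query("Coal mining", MAPPINGS))  # "Coal" AND "Mining"
```

<p>A one-to-one mapping property can only express the second entry; the first needs a structure that holds several targets at once, which is the gap described in the answer above.</p>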
<p><em>5. Stella, Alan, in September ISKO UK will organise an event  on &#8220;The Future of Knowledge Organisation on the Web&#8221;. &#8220;Linked Data&#8221;  seems to be a promising approach to organise knowledge in large scale  environments.<br />
Could you imagine that SKOS as a small subset of  semantic web specifications will play a central role in this environment since  it is quite intuitively comprehensible by virtually any knowledge  worker or do you  rather think SKOS is too simple (or too complex)? (see also: <a href="http://poolparty.punkt.at/using-skos-as-an-interface-to-the-linked-data-cloud" target="_blank">http://poolparty.punkt.at/using-skos-as-an-interface-to-the-linked-data-cloud</a> )</em></p>
<p>Stella: Of course SKOS will have a central role (whether or not every knowledge worker finds it as intuitive as you suppose). &#8220;Linked Data&#8221; will find even wider applicability. ISKO-UK (the organiser of the meeting in London on 14 September) has a mission not just to spread the word about both these technologies, but to build bridges between the several communities who must share their expertise and data to build more exciting applications. We&#8217;re expecting an audience of over 100 at this low-cost event.</p>
<p>Alan: Yes, of course, just as all the tools in the web stack will be necessary if semantic web technologies are to be effective. But it is obvious that we are dealing with complexities of a higher order than ever before. Any structured vocabulary is an “artificial language” which, while acknowledging many aspects of theoretical linguistics is forced to be pragmatic in its construction. Consequently, it would not be surprising if SKOS is seen to be “catching up”, and this became apparent in the work of BS 8723 when thesaurus models using UML were being constructed. There remains much work to be done on all fronts.</p>
<p><strong>Stella Dextre Clarke</strong> is an independent consultant specializing in the design and implementation of thesauri and other knowledge organization structures. She currently leads ISO NP 25964, the project to update and revise the international standards for thesauri. Previously she was the Convenor of the Working Group which developed BS 8723. In 2006 she won the Tony Kent Strix Award for outstanding achievement in information retrieval, in recognition for her development work on IPSV (Integrated Public Sector Vocabulary), as well as on the vocabulary standards. She is a Fellow of the Chartered Institute of Library and Information Professionals.</p>
<p><strong>Alan Gilchrist</strong> has been a consultant for many years in the fields of information management and information architecture, specialising in the vocabulary aspects of information retrieval. He is co-author, with Jean Aitchison and David Bawden, of <em><a href="http://www.amazon.de/Thesaurus-Construction-Use-Practical-Manual/dp/0851424465/" target="_blank">Thesaurus Construction and Use</a>, </em>now in its fourth edition. In 1979 he founded and edited the <em>Journal of Information Science, </em>and is now Editor Emeritus. He has an Honorary Degree (D. Litt.) from the University of Brighton and is an Honorary Fellow of the Chartered Institute of Library and Information Professionals.</p>
<!-- sphereit end --><span style="margin-bottom:40px; border-bottom:none;"><a class="iconsphere" title="Sphere: Related Content" onclick="return Sphere.Widget.search('http://blog.semantic-web.at/2010/06/21/stella-dextre-clarke-alan-gilchrist-about-the-future-of-knowledge-organization-on-the-web/')" href="http://www.sphere.com/search?q=sphereit:http://blog.semantic-web.at/2010/06/21/stella-dextre-clarke-alan-gilchrist-about-the-future-of-knowledge-organization-on-the-web/">Sphere: Related Content</a></span><br/><br/>]]></content:encoded>
			<wfw:commentRss>http://blog.semantic-web.at/2010/06/21/stella-dextre-clarke-alan-gilchrist-about-the-future-of-knowledge-organization-on-the-web/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Kingsley Idehen: &#8220;By declaring its context, Linked Data can be made more easily reusable by others&#8221;</title>
		<link>http://blog.semantic-web.at/2010/06/16/kingsley-idehen-i-only-think-in-terms-of-a-web-of-linked-data/</link>
		<comments>http://blog.semantic-web.at/2010/06/16/kingsley-idehen-i-only-think-in-terms-of-a-web-of-linked-data/#comments</comments>
		<pubDate>Wed, 16 Jun 2010 13:23:18 +0000</pubDate>
		<dc:creator>Andreas Blumauer</dc:creator>
				<category><![CDATA[Corporate Semantic Web]]></category>
		<category><![CDATA[Enterprise 2.0]]></category>
		<category><![CDATA[Linked Data & Open Data]]></category>
		<category><![CDATA[Tools & Software]]></category>
		<category><![CDATA[Kingsley Idehen]]></category>
		<category><![CDATA[Linked Data]]></category>
		<category><![CDATA[OpenLink Software]]></category>
		<category><![CDATA[OpenLink Virtuoso]]></category>
		<category><![CDATA[SPARQL]]></category>

		<guid isPermaLink="false">http://blog.semantic-web.at/?p=1607</guid>
		<description><![CDATA[


Semantic Web Company talked about &#8220;Linked Data&#8221; with Kingsley Idehen, CEO of OpenLink Software and probably one of the most knowledgeable experts on data integration issues.
The interview covers questions like:

How can Linked Data help to make companies more productive?
Do you think that the Linked Data Initiative can build upon a stable  architecture [...]]]></description>
			<content:encoded><![CDATA[<!-- sphereit start --><table>
<tr>
<td valign="top"><a href="http://blog.semantic-web.at/wp-content/uploads/2010/06/Bild-1.png"><img style="margin: 5px;" title="Kingsley Idehen" src="http://blog.semantic-web.at/wp-content/uploads/2010/06/Bild-1-150x150.png" alt="" width="100" align="left"/></a></td>
<td valign="top"><p>Semantic Web Company talked about &#8220;Linked Data&#8221; with <a href="http://twitter.com/kidehen" target="_blank">Kingsley Idehen</a>, CEO of <a href="http://www.openlinksw.com/">OpenLink Software</a> and probably one of the most knowledgeable experts on data integration issues.</p>
<p>The interview covers questions like:</p>
<ul>
<li>How can Linked Data help to make companies more productive?</li>
<li>Do you think that the Linked Data Initiative can build upon a stable  architecture or will it face more and more problems the bigger the  &#8220;cloud&#8221; will grow?</li>
<li>What&#8217;s the ultimate argument for an Enterprise Architect to use languages like SPARQL at least in addition to SQL?</li>
<li>How will a &#8220;Real Time Semantic Web&#8221; change the whole game?</li>
<li>What will the &#8220;Semantic Web&#8221; be called in 10 years? Will there still be a &#8220;Semantic Web&#8221;?</li>
</ul>
<p>Read the full version of the interview <a href="http://www.semantic-web.at/1.36.resource.308.7-questions-to-kingsley-idehen-x22-by-declaring-its-context-linked-data-can-be-made-more-e.htm" target="_blank">here</a>.</p></td>
</tr>
</table>
<!-- sphereit end --><span style="margin-bottom:40px; border-bottom:none;"><a class="iconsphere" title="Sphere: Related Content" onclick="return Sphere.Widget.search('http://blog.semantic-web.at/2010/06/16/kingsley-idehen-i-only-think-in-terms-of-a-web-of-linked-data/')" href="http://www.sphere.com/search?q=sphereit:http://blog.semantic-web.at/2010/06/16/kingsley-idehen-i-only-think-in-terms-of-a-web-of-linked-data/">Sphere: Related Content</a></span><br/><br/>]]></content:encoded>
			<wfw:commentRss>http://blog.semantic-web.at/2010/06/16/kingsley-idehen-i-only-think-in-terms-of-a-web-of-linked-data/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Adrian Pohl: &#8220;We believe the Semantic Web plays an important role for the future of libraries.&#8221;</title>
		<link>http://blog.semantic-web.at/2010/05/20/adrian-pohl-we-believe-the-semantic-web-plays-an-important-role-for-the-future-of-libraries/</link>
		<comments>http://blog.semantic-web.at/2010/05/20/adrian-pohl-we-believe-the-semantic-web-plays-an-important-role-for-the-future-of-libraries/#comments</comments>
		<pubDate>Thu, 20 May 2010 08:54:31 +0000</pubDate>
		<dc:creator>Tassilo Pellegrini</dc:creator>
				<category><![CDATA[Companies & Institutions]]></category>
		<category><![CDATA[Linked Data & Open Data]]></category>
		<category><![CDATA[libraries]]></category>
		<category><![CDATA[Semantic Web]]></category>

		<guid isPermaLink="false">http://blog.semantic-web.at/?p=1596</guid>
		<description><![CDATA[A group of Cologne-based  libraries has taken a big step towards open data. In an concerted action  they have relased their catalogue data for reuse on the web. Project  manager Adrian Pohl comments on the initiative and what role the  Semantic Web will play for libraries in the future.
In March 2010 [...]]]></description>
			<content:encoded><![CDATA[<!-- sphereit start --><p><a href="http://blog.semantic-web.at/wp-content/uploads/2010/05/pohl.jpg"><img class="alignnone size-full wp-image-1597" title="pohl" src="http://blog.semantic-web.at/wp-content/uploads/2010/05/pohl.jpg" alt="" width="75" height="89" /></a>A group of Cologne-based  libraries has taken a big step towards open data. In an concerted action  they have relased their catalogue data for reuse on the web. Project  manager Adrian Pohl comments on the initiative and what role the  Semantic Web will play for libraries in the future.</p>
<h3>In March 2010 several Cologne-based libraries opened their catalogue data under a CC0 license following Tim Berners-Lee&#8217;s call for &#8220;Raw Data Now!&#8221;. What was the motivation behind this step?</h3>
<p>The <a href="http://www.hbz-nrw.de/" target="_blank">hbz</a> (&#8220;Hochschulbibliothekszentrum des Landes Nordrhein-Westfalen&#8221;, English: &#8220;North Rhine-Westphalian Library Service Centre&#8221;) has come to the conclusion that libraries need to participate in the development of the Semantic Web. The opening of catalog data followed as a necessary first step. Our intention is to show with this first legal-political step how important the legal/licensing dimension is when you publish data on the web, be it Linked Data or not. So for us at the hbz, the Open Data initiative is primarily the first step towards eventually publishing Linked Open Data, just as Tim Berners-Lee called for.</p>
<p>Other participants in the <a href="http://www.hbz-nrw.de/dokumentencenter/presse/pm/datenfreigabe_engl" target="_blank">Cologne Open Data initiative</a>, like the Cologne University and City Library, focus more on the direct advantages that releasing raw bibliographic data brings: with other libraries and consortia following this example, it will be easy to enrich existing catalog or other bibliographic services with subject headings, classification numbers, tags etc. Also, published raw data is integrated into other web services like Wikipedia, which point back to libraries&#8217; services. Indeed, Open Data is an end in itself which should be pursued by more organizations in the library world and beyond it.</p>
<h3>The provided data is currently available in a proprietary but open format. Can you give us a technical description of the published data? Do you plan to provide more structured datasets in the future?</h3>
<p>&#8220;Opaque but open&#8221; would be the better description of the underlying format because it isn&#8217;t proprietary at all. Actually, alongside the <a href="http://opendata.hbz-nrw.de/projects/data-publishing/wiki/Download-en" target="_blank">data from the hbz union catalog</a> there is data stemming from libraries&#8217; local databases (see <a href="http://opendata.ub.uni-koeln.de/" target="_blank">http://opendata.ub.uni-koeln.de/</a> and <a href="http://opendata.zbsport.de/" target="_blank">http://opendata.zbsport.de/</a>). We are using different internal formats. Generally, all the formats are based on the MAB format (an acronym for &#8220;Maschinelles Austauschformat für Bibliotheken&#8221;, which means &#8220;Automatic Interchange Format for Libraries&#8221;). MAB is used only in the German and Austrian library world for data interchange between libraries, similar to the better-known MARC format (Machine-Readable Cataloging) of the Library of Congress. It was developed in the 1970s for storing data on magnetic tape. The format documentation can be viewed <a href="http://www.d-nb.de/standardisierung/formate/mab.htm" target="_blank">on the German National Library&#8217;s webpages</a>.</p>
<p>As the format is nearly 40 years old, processing MAB data is very cumbersome on modern computers. Therefore, the hbz provides an encapsulation method called &#8220;generic format&#8221;, in which the historic data records of the library catalogs are unwrapped into a more common, user-friendly scheme. Each record is placed into a Unicode UTF-8 encoded file containing all the MAB fields, each separated by a line feed, and the whole record set of a library forms a &#8220;tar&#8221; archive, which is compressed afterwards to save space. These archives can be unpacked with any common unpacking tool; such software is available on all known Windows/Linux/Unix platforms. Alternatively, you can use a simple Perl helper script provided by hbz. More tools and scripts, also in other programming languages, are being prepared for publication.</p>
<p>The opaqueness and the age of the standards used in the library world (the English-language standard MARC, which is used worldwide, doesn&#8217;t differ from MAB in these respects) make it necessary to change to a more open and widely adopted standard. That&#8217;s where Linked Data comes into play, which is based on the accepted and widespread standards HTTP and URIs. Constructing RDF out of the raw library catalog data is a very sophisticated design task. Our plans are to convert the existing data to RDF using proper vocabularies that enable us to lose as little information as possible, and to give access to the data by providing a SPARQL endpoint.</p>
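<p>As a rough illustration of the &#8220;generic format&#8221; described above, the following Python sketch reads a compressed tar archive with one UTF-8 file per record and splits each record into its line-separated fields. It is not the hbz helper script: the function names, the demo file name and the demo field contents are made up, the compression type is assumed to be one that Python&#8217;s <code>tarfile</code> module can detect, and the internal syntax of a MAB field line is deliberately left uninterpreted.</p>

```python
import io
import tarfile


def iter_records(archive):
    """Yield (file_name, field_lines) for each record file in a tar archive.

    `archive` is a binary file object positioned at the start of the archive.
    """
    # "r:*" lets tarfile detect the compression (gzip, bz2, xz) on its own.
    with tarfile.open(fileobj=archive, mode="r:*") as tar:
        for member in tar:
            if not member.isfile():
                continue  # skip directories and other non-file entries
            text = tar.extractfile(member).read().decode("utf-8")
            # One field per line; drop empty lines such as a trailing newline.
            fields = [line for line in text.split("\n") if line]
            yield member.name, fields


def build_demo_archive():
    """Create a tiny gzip-compressed tar archive in memory (made-up content)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        data = "first field\nsecond field\n".encode("utf-8")
        info = tarfile.TarInfo(name="record-0001.txt")  # hypothetical name
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    buffer.seek(0)
    return buffer


for name, fields in iter_records(build_demo_archive()):
    print(name, fields)
```

<p>For a downloaded archive, the same function could be called with a file opened in binary mode, for example <code>open(path, "rb")</code>, where <code>path</code> is wherever the archive was saved.</p>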
<h3>Currently the data you provide is open but not yet linked. What are your plans when it comes to contributing to the Linked Data Cloud?</h3>
<p>I have to go into greater detail to answer this question properly. Viewed simply, the data of library institutions can be divided into two broad types: authority data and bibliographic data. Authority data splits up into data about people, about corporate entities and about subject headings. In Germany, authority data is maintained centrally by the German National Library in cooperation with the six German library consortia. Bibliographic databases consist of records about books or rather editions of books. Authority data and bibliographic data are already heavily linked; for instance, a bibliographic record contains the author&#8217;s or editor&#8217;s authority number, which links to the corresponding authority record.</p>
<p>The German National Library is also working on migrating library data, especially authority data, into the Semantic Web. They recently made their <a href="https://wiki.d-nb.de/display/LDS/Dokumentation+des+Linked+Data+Service+Prototyps+der+DNB" target="_blank">Linked Data prototype for authority data</a> publicly available. We have already taken first steps to cooperate and coordinate our efforts. As they take care of authority data, we focus on bibliographic data. At the moment we are exploring the technology and vocabularies for publishing bibliographic data as Linked Data.</p>
<p>That&#8217;s a demanding task: known vocabularies like Dublin Core or the Bibliographic Ontology (Bibo) don&#8217;t fully map to the density and structure of the information in the catalogs. In addition, there has been several years&#8217; work on the new comprehensive cataloging standard <a href="http://www.rdaonline.org/" target="_blank">RDA</a> (Resource Description and Access), for which an <a href="http://metadataregistry.org/rdabrowse.htm" target="_blank">RDF representation</a> has been developed. However, RDA in RDF needs to be modified a lot before it can be applied to our bibliographic data. We are currently working on a vocabulary for the union catalog&#8217;s data based on existing vocabularies like Bibo and RDA.</p>
<p>Of course, as soon as we have published bibliographic data as linked data, we will start linking to hubs in the Linked Data Cloud like DBpedia or GeoNames.</p>
<h3>Publishing data to the LOD Cloud is one thing. Consuming data is another. Do you have plans to integrate data from the LOD Cloud into your systems? Do you have policies for quality assurance?</h3>
<p>Of course the possibility to incorporate data from other  sources  easily is one major reason for us to publish Linked Data  besides the  goal of making libraries&#8217; data an integral part of the web.  Enriching our data with other data and providing new   services through and with mashups would be a main reason to link to   other data. We are, however, not working on such projects yet, because   we first need to convert our legacy data to RDF.</p>
<h3>What role will the Semantic Web play for libraries in the future?</h3>
<p>We believe the Semantic Web plays an important role for the future of libraries. Discussions about &#8220;Next Generation Catalogs&#8221; have been a recurring theme in the library world since the 1990s. It is time to finally act and move our data, imprisoned in opaque formats, to a new level by improving its structure and underlying technology and by migrating to formats that can be easily consumed by others who are not part of the library world. Joining the Linked Open Data community seems to us the best way to go. Also, the production, publication and dissemination of academic literature is subject to ongoing and fundamental changes which have far-reaching implications for the work of academic libraries and their role in research and education. We believe that semantic markup and interlinking will play an important role in the development of knowledge production and thus indirectly will have great impact on libraries. Clearly, the Semantic Web cannot be left out of the future of libraries.</p>
<p>Moreover, turning your question around, libraries could play an   important role for the future of the Semantic Web. Libraries are trusted   institutions and deeply grounded in our culture. As indicated above   libraries have produced linked data (again: lower case) since the time   of card catalogs. We undoubtly have some practice in producing and   curating linked data which should be worth a lot to the Semantic Web   community. We thus think libraries are predestinated for helping to   coninuously order the messy place the Semantic Web always will be and   ensuring its trustworthiness and stability.</p>
<h3>About Adrian Pohl</h3>
<p>Adrian Pohl is working at the Cologne-based North  Rhine-Westphalian  Library Service Center on Open Data, Linked Data and  its conceptual,  theoretical and legal implications. He regularly writes  at <a href="http://www.uebertext.org/" target="_blank">Übertext: Blog</a> about the  internet, libraries  and metadata, Linked Open Data, communication,  epistemology and the  like. He has studied communication science and  philosophy in Aachen and  is currently studying Library and Information  Science at the Cologne  University of Applied Science. You can follow  him on Twitter: <a href="http://twitter.com/acka47" target="_blank">http://twitter.com/acka47</a>.</p>
<!-- sphereit end -->]]></content:encoded>
			<wfw:commentRss>http://blog.semantic-web.at/2010/05/20/adrian-pohl-we-believe-the-semantic-web-plays-an-important-role-for-the-future-of-libraries/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A Dynamic Web Of Data</title>
		<link>http://blog.semantic-web.at/2010/04/26/a-dynamic-web-of-data/</link>
		<comments>http://blog.semantic-web.at/2010/04/26/a-dynamic-web-of-data/#comments</comments>
		<pubDate>Mon, 26 Apr 2010 13:45:27 +0000</pubDate>
		<dc:creator>Michael Hausenblas</dc:creator>
				<category><![CDATA[Linked Data & Open Data]]></category>
		<category><![CDATA[Semantic Web Applications]]></category>
		<category><![CDATA[caching]]></category>
		<category><![CDATA[dataset dynamics]]></category>
		<category><![CDATA[Pubsubhubbub]]></category>
		<category><![CDATA[scaling]]></category>
		<category><![CDATA[sparqlpush]]></category>

		<guid isPermaLink="false">http://blog.semantic-web.at/?p=1571</guid>
		<description><![CDATA[As a matter of fact things change &#8211; the Web of Data is no  exception in that respect. While some sources, such as  Twitter, are intrinsically dynamic, others change every now and  then, potentially in unforeseeable intervals. In the recent Talis  Nodalities Magazine, we made a case for Keeping up with a [...]]]></description>
			<content:encoded><![CDATA[<!-- sphereit start --><p>As a matter of fact things change &#8211; the Web of Data is no  exception in that respect. While some sources, such as  Twitter, are intrinsically dynamic, others change every now and  then, potentially in unforeseeable intervals. In the recent Talis  Nodalities Magazine, we made a case for <a id="nsyj" title="Keeping up with a LOD of changes" href="http://www.talis.com/nodalities/pdf/nodalities_issue9.pdf">Keeping up with a LOD  of changes</a>; here I&#8217;m going to elaborate a bit more on the current  state of <a id="vp42" title="Dataset Dynamics" href="http://esw.w3.org/DatasetDynamics">Dataset Dynamics</a> and its challenges.</p>
<p>Let us first step back a bit and look at what Dataset Dynamics is and why it is important. In the <a id="s1e-" title="Web of Linked Data" href="http://linkeddata.org/">Web of Linked Data</a> we typically deal with <a id="ef7z" title="datasets" href="http://rdfs.org/ns/void-guide#sec_1_Describing_Datasets">datasets</a>, for example, from the <a id="tei0" title="biomedical  domain" href="http://linkedlifedata.com/sources">biomedical domain</a> or the <a id="yjmf" title="media industry" href="http://data.nytimes.com/">media industry</a> on the one hand, and entities, such as a <a id="xt5_" title="certain protein" href="http://linkedlifedata.com/resource/entrezgene/id/7157">certain protein</a> or <a id="qhre" title="people" href="http://data.nytimes.com/38832438934068808203">people</a> on the other. For the entity-level case, established HTTP caching mechanisms can be leveraged (see the <a id="f9s2" title="Caching  Tutorial" href="http://www.mnot.net/cache_docs/">Caching Tutorial</a> and <a id="k_kt" title="Things Caches Do" href="http://tomayko.com/writings/things-caches-do">Things Caches Do</a>). Further, with Memento, an <a id="qqi:" title="HTTP-based versioning mechanisms" href="http://events.linkeddata.org/ldow2010/papers/ldow2010_paper13.pdf">HTTP-based versioning mechanism</a> has been proposed as well as <a id="c-r6" title="implemented" href="http://www.mementoweb.org/">implemented</a>, adding a &#8220;time dimension&#8221; to HTTP (see Fig. 1).</p>
<div id="attachment_1573" class="wp-caption aligncenter" style="width: 510px"><a href="http://blog.semantic-web.at/wp-content/uploads/2010/04/memento.jpg"><img class="size-full wp-image-1573" title="memento" src="http://blog.semantic-web.at/wp-content/uploads/2010/04/memento.jpg" alt="" width="500" height="260" /></a><p class="wp-caption-text">Fig. 1 Memento Framework (Source: &quot;An HTTP-Based Versioning Mechanism for Linked Data&quot; Herbert Van de Sompel, Robert Sanderson, Michael Nelson, Lyudmila Balakireva, Harihar Shankar, Scott Ainsworth, LDOW 2010)</p></div>
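<p>To make the two entity-level mechanisms a little more concrete, here is a minimal Python sketch. It only builds the requests and never sends them. The entity URI and the ETag value are placeholders, and the <code>Accept-Datetime</code> header name is an assumption taken from the later Memento specification (RFC 7089) rather than from this post.</p>

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from urllib.request import Request


def conditional_request(url, etag):
    """Entity-level caching: ask for a body only if the entity has changed."""
    # If the ETag still matches, the server can answer 304 Not Modified.
    return Request(url, headers={"If-None-Match": etag})


def memento_request(url, moment):
    """Time dimension: ask for the state of the entity as of a given moment."""
    # HTTP dates are expressed in GMT, so normalise to UTC first.
    http_date = format_datetime(moment.astimezone(timezone.utc), usegmt=True)
    return Request(url, headers={"Accept-Datetime": http_date})


# Built but not sent, so the sketch runs without network access.
entity = "http://example.org/resource/some-entity"  # placeholder URI
cached = conditional_request(entity, '"abc123"')     # placeholder ETag
past = memento_request(entity, datetime(2010, 4, 26, 13, 45, 27, tzinfo=timezone.utc))

# urllib normalises header names to "Xxx-yyy" capitalisation internally.
print(cached.get_header("If-none-match"))
print(past.get_header("Accept-datetime"))
```

<p>Both requests address a single entity, which is exactly why this style becomes unwieldy once a dataset holds millions of them.</p>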
<p style="text-align: center;">
<h2>Dataset-level changes</h2>
<p>However, tackling <em>dataset-level changes</em> is a rather new field with no agreed-upon, let alone standardised, solution at hand. The main problem is that a dataset typically talks about many thousands to millions of distinct entities, which makes it impractical to apply entity-level solutions for a range of <a id="rw18" title="use cases" href="http://groups.google.com/group/dataset-dynamics/web/use-cases">use cases</a>, such as link maintenance or replication (see also Fig. 2).</p>
<div id="attachment_1574" class="wp-caption aligncenter" style="width: 510px"><a href="http://blog.semantic-web.at/wp-content/uploads/2010/04/change_frequency.jpg"><img class="size-full wp-image-1574" title="change_frequency" src="http://blog.semantic-web.at/wp-content/uploads/2010/04/change_frequency.jpg" alt="" width="500" height="332" /></a><p class="wp-caption-text">Fig. 2 Change frequency vs. change volume</p></div>
<p>I often hear these days: &#8220;it seems there is no solution for handling of dataset-level changes&#8221;; nevertheless, I think quite the opposite is true. There are plenty of proposed solutions from both academia and practitioners, targeting different challenges in the areas of:</p>
<ul>
<li><strong>Change discovery</strong> &#8211; how do I find out about dataset changes?</li>
<li><strong>Propagating changes</strong> &#8211; if there is a change, how is the change communicated to a consumer?</li>
<li><strong>Change semantics</strong> &#8211; how do I learn what has changed (has been added, removed, etc.)?</li>
</ul>
<p>Some proposals on the table are integrated approaches (such as <a id="rnm6" title="DSNotify" href="http://dsnotify.org/">DSNotify</a>, <a id="bfuu" title="SemanticPingback" href="http://aksw.org/Projects/SemanticPingback">SemanticPingback</a>, Talis <a id="v_7s" title="Changeset" href="http://n2.talis.com/wiki/Changeset_Protocol">Changeset</a>) while others focus on certain aspects (like the <a id="tqaa" title="dady  vocabulary" href="http://vocab.deri.ie/dady">dady vocabulary</a> for discovery or the <a id="pliz" title="Graph Update  Ontology" href="http://webr3.org/specs/guo/">Graph Update Ontology</a> for change semantics) or deal with concrete environments, for example <a id="yh2b" title="sparqlPuSH" href="http://apassant.net/blog/2010/04/18/sparql-pubsubhubbub-sparqlpush">sparqlPuSH</a> for SPARQL endpoints.</p>
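<p>The &#8220;change semantics&#8221; question can be illustrated with a small Python sketch. The <code>ChangeSet</code> class and the prefixed names are hypothetical and do not reproduce the Talis Changeset vocabulary or the Graph Update Ontology; the sketch only shows the general idea of describing a change as triples added and triples removed.</p>

```python
from dataclasses import dataclass, field


@dataclass
class ChangeSet:
    """A hypothetical, minimal description of one dataset-level change."""
    additions: set = field(default_factory=set)  # triples that were added
    removals: set = field(default_factory=set)   # triples that were removed


def diff(old, new):
    """Derive the change semantics by comparing two snapshots of a dataset."""
    return ChangeSet(additions=new - old, removals=old - new)


def apply_change(graph, change):
    """Return a new set of triples with the change applied."""
    return (graph - change.removals) | change.additions


# Two snapshots of a tiny dataset, each triple a (subject, predicate, object) tuple.
old = {("ex:alice", "foaf:knows", "ex:bob")}
new = {("ex:alice", "foaf:knows", "ex:carol")}

change = diff(old, new)
print(change)
print(apply_change(old, change) == new)
```

<p>A consumer that receives only the change set can bring its copy up to date without re-fetching the whole dataset, which is the point of propagating changes rather than snapshots.</p>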
<h2>A Dataset Dynamics Manifesto</h2>
<p>No matter which (set of) solutions the community eventually agrees on to address the handling of dataset-level changes, it should adhere to the following principles:</p>
<ul>
<li><strong>light-weight</strong></li>
<li><strong>distributed and scalable</strong></li>
<li><strong>standards-based</strong></li>
</ul>
<p>Obviously, a <strong>light-weight</strong> (and ideally RESTful) approach lowers the barriers to adoption and enables a quick uptake. When I say light-weight, I mean it both in terms of protocol and code. It should be easy to integrate in RDF stores and libraries and available in all common Web programming languages including, but not limited to, Java, PHP and the .NET family.</p>
<p>Just as the Web of Data is a globally <strong>distributed</strong> dataspace, handling of changes should be done in a distributed fashion. There will be many different publishers and consumers (<span style="font-size: small;">such as agents, indexers, consolidator platforms, etc.</span>) of datasets with different requirements and capabilities. A distributed approach can cope with this challenge in a cost- and performance-efficient way. Tightly connected to this: it has to <strong>scale</strong>. Today, we&#8217;re dealing with some hundreds of LOD datasets. In the next couple of years, this will likely explode into the millions and hence one needs to be able to deal with such growth. The same, just sooner, is true for the number of consumers of the changes.</p>
<p>Last but not least, the Dataset Dynamics solution should be based on <strong>standards</strong>. It doesn&#8217;t necessarily need to be RDF for all of the challenges outlined above. For example, <a id="s:mu" title="Atom" href="http://tools.ietf.org/html/rfc4287">Atom</a> offers a standardised, extensible and widely accepted format to propagate changes; to take this further, <a id="g3p4" title="Pubsubhubbub" href="http://code.google.com/p/pubsubhubbub/">Pubsubhubbub</a> can be utilised to enable a standardised, distributed publisher-subscriber scheme (Fig. 3).</p>
<div id="attachment_1575" class="wp-caption aligncenter" style="width: 510px"><strong><a href="http://blog.semantic-web.at/wp-content/uploads/2010/04/hubs.jpg"><img class="size-full wp-image-1575" title="hubs" src="http://blog.semantic-web.at/wp-content/uploads/2010/04/hubs.jpg" alt="" width="500" height="396" /></a></strong><p class="wp-caption-text">Fig. 3  Pubsubhubbub - a standard-based, distributed publisher-subscriber-hub system (Source: http://docs.google.com/present/view?id=ajd8t6gk4mh2_34dvbpchfs)</p></div>
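<p>As a rough illustration of the Atom plus Pubsubhubbub combination, the following Python sketch builds a change feed that points to a hub and the form parameters a subscriber would POST to that hub. All URLs are placeholders, the feed is deliberately incomplete (a valid Atom feed also needs elements such as <code>id</code> and <code>title</code>), and nothing is sent over the network.</p>

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

ATOM = "http://www.w3.org/2005/Atom"


def change_feed(topic_url, hub_url, entry_title, updated):
    """Publisher side: an Atom feed that announces a change and names its hub."""
    feed = ET.Element(f"{{{ATOM}}}feed")
    ET.SubElement(feed, f"{{{ATOM}}}link", rel="self", href=topic_url)
    ET.SubElement(feed, f"{{{ATOM}}}link", rel="hub", href=hub_url)
    entry = ET.SubElement(feed, f"{{{ATOM}}}entry")
    ET.SubElement(entry, f"{{{ATOM}}}title").text = entry_title
    ET.SubElement(entry, f"{{{ATOM}}}updated").text = updated
    return ET.tostring(feed, encoding="unicode")


def subscription_body(topic_url, callback_url):
    """Subscriber side: form parameters to POST to the hub found in the feed."""
    return urlencode({
        "hub.mode": "subscribe",
        "hub.topic": topic_url,
        "hub.callback": callback_url,
    })


topic = "http://example.org/dataset/changes.atom"      # placeholder feed URL
feed_xml = change_feed(topic, "http://hub.example.org/", "3 triples removed",
                       "2010-04-26T13:45:27Z")

# A consumer discovers the hub from the feed, then subscribes to the topic.
links = ET.fromstring(feed_xml).iter(f"{{{ATOM}}}link")
hub = [link.get("href") for link in links if link.get("rel") == "hub"][0]
body = subscription_body(topic, "http://consumer.example.com/callback")
print(hub)
print(body)
```

<p>The publisher never needs to know its consumers: it pings the hub, and the hub fans the update out to every registered callback.</p>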
<p style="text-align: center;">
<p>As I&#8217;ve outlined above, it might still be too early for a conclusion on how to deal with dataset-level changes. However, people interested in this area have already gathered in the <a id="tdz." title="Dataset Dynamics group" href="http://groups.google.com/group/dataset-dynamics">Dataset Dynamics group</a>, where solutions are discussed and implemented, potentially leading to W3C standardisation work.</p>
<p><strong><em>As an aside: in case you&#8217;re at the <a id="b1qr" title="WWW2010" href="http://www2010.org/www/">WWW2010</a> in Raleigh (NC, USA) these days, you may want to join the <a id="ro6g" title="break-out meeting on Dataset Dynamics" href="http://esw.w3.org/Camps:LODCampW3CTrack#breakout">break-out meeting on Dataset Dynamics</a> during the W3C Linked Open Data track on 29 April 2010.</em></strong></p>
<p>(This blog post was written by <a href="http://sw-app.org/mic.xhtml" target="_blank">Michael Hausenblas</a>)<strong><em><br />
</em></strong></p>
<!-- sphereit end -->]]></content:encoded>
			<wfw:commentRss>http://blog.semantic-web.at/2010/04/26/a-dynamic-web-of-data/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Sören Auer: &#8220;Establishing a network effect around linked data is the most important R&amp;D goal for the near future.&#8221;</title>
		<link>http://blog.semantic-web.at/2010/04/15/soren-auer-establishing-a-network-effect-around-linked-data-is-the-most-important-rd-goal-for-the-near-future/</link>
		<comments>http://blog.semantic-web.at/2010/04/15/soren-auer-establishing-a-network-effect-around-linked-data-is-the-most-important-rd-goal-for-the-near-future/#comments</comments>
		<pubDate>Thu, 15 Apr 2010 11:34:13 +0000</pubDate>
		<dc:creator>Tassilo Pellegrini</dc:creator>
				<category><![CDATA[Conferences & Events]]></category>
		<category><![CDATA[Linked Data & Open Data]]></category>
		<category><![CDATA[Politics]]></category>
		<category><![CDATA[Privacy & Information Ethics]]></category>
		<category><![CDATA[Leipzig]]></category>
		<category><![CDATA[Linked Data]]></category>
		<category><![CDATA[Semantic Web Day]]></category>

		<guid isPermaLink="false">http://blog.semantic-web.at/?p=1558</guid>
		<description><![CDATA[Leipzig is one of Germany&#8217;s  Semantic Web hotspots. From May 5-6, 2010 the annual Semantic Web Day  provides the opportunity to catch up with latest developments especially  in the domain of Linked Data and the foundation of the German chapter  of the Open Knowledge Foundation. Organizer Sören Auer gave us some [...]]]></description>
			<content:encoded><![CDATA[<!-- sphereit start --><p><a href="http://blog.semantic-web.at/wp-content/uploads/2010/04/soeren.jpg"><img class="alignnone size-full wp-image-1560" title="soeren" src="http://blog.semantic-web.at/wp-content/uploads/2010/04/soeren.jpg" alt="" width="75" height="96" /></a>Leipzig is one of Germany&#8217;s  Semantic Web hotspots. From May 5-6, 2010 the annual Semantic Web Day  provides the opportunity to catch up with latest developments especially  in the domain of Linked Data and the foundation of the German chapter  of the Open Knowledge Foundation. Organizer Sören Auer gave us some  background information.</p>
<h3>From May 5 &#8211; 6, 2010 the 3rd Semantic Web Day in Leipzig will take  place. What will be this year&#8217;s topics? Who should attend?</h3>
<p>The <a href="http://aksw.org/Events/2010/LeipzigerSemanticWebDay">Semantic Web Day</a> is targeting IT people, software developers, decision makers and users interested in learning about the potential of semantic technologies. The language during the event is German, so primarily Austrians, Swiss and Germans will attend. Besides semantic technologies, a particular focus of this year&#8217;s event is open data in governments, public administrations and science. Although the programme is not yet finalized, we have already compiled an interesting number of talks and presentations, including talks about the open biodiversity database Fishbase, the European Digital Library Europeana, a Linked Data project of the German Umweltbundesamt, use case presentations in the pharma, publishing and telecommunication industries and many more (cf. <a href="http://aksw.org/LSWT">http://aksw.org/LSWT</a>). Also, in addition to AKSW, the Topic Maps Lab and the Web Data Integration Labs from Universität Leipzig will be present at LSWT.</p>
<h3>One of the highlights of this year&#8217;s Semantic Web Day is the official institutionalization of the German Chapter of the Open Knowledge Foundation. How did this come about? What does this mean for the OKF as a whole?</h3>
<p><a href="http://www.okfn.org/">OKFN</a> started to work in 2006 and has since managed to successfully complete a number of projects facilitating open knowledge. In particular, the <a href="http://www.ckan.net/">Comprehensive Knowledge Archive Network (CKAN)</a>, the <a href="http://www.okfn.org/okcon/">OKCon conference series</a>, the open knowledge definition and recently OKFN&#8217;s involvement in the launch of <a href="http://data.gov.uk/">data.gov.uk</a> are prominent examples of OKFN&#8217;s successful work. However, many of the OKFN activities were primarily driven by an active group of volunteers in the UK. With the official launch of the German OKFN branch we will strengthen the international dimension of OKFN&#8217;s work. Especially for Germany, where data privacy and security are perceived to be most important, raising awareness for enabling open, standards-compliant access to public information will be an important target of OKFN&#8217;s activities.</p>
<h3>The InFAI has become one of the hotspots in Semantic Web  development in Germany over the past few years. What are you working on  at the moment? What are the most interesting research and development  aspects for the near future?</h3>
<p>From our point of view, establishing a network effect around the publishing and use of linked data is the most important research and development goal for the near future. We just completed a first draft and implementations of a semantically enabled pingback method (<a href="http://aksw.org/Projects/SemanticPingBack">http://aksw.org/Projects/SemanticPingBack</a>), which applies to linked data endpoints a peer notification mechanism similar to the one widely deployed in the blogosphere. Other important research issues we are tackling with our partners are closing the performance gap between RDF and relational data management, increasing the coherence and quality of linked data and the provisioning of adaptive user interfaces for authoring and maintaining information on the data web.</p>
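<p>For readers unfamiliar with the blog mechanism referred to here: classic Pingback is an XML-RPC call named <code>pingback.ping</code> that carries a source URI and a target URI. The Python sketch below serialises such a call without sending it. The two URIs are placeholders, and whether Semantic Pingback uses exactly this wire format is an assumption, not something stated in the interview.</p>

```python
import xmlrpc.client


def pingback_request(source_uri, target_uri):
    """Serialise a classic Pingback call; nothing is sent over the network."""
    return xmlrpc.client.dumps((source_uri, target_uri), methodname="pingback.ping")


body = pingback_request(
    "http://example.org/data/alice",  # resource that contains the new link
    "http://example.com/data/bob",    # resource that is being linked to
)

# Round-trip the payload to show what a receiving endpoint would see.
params, method = xmlrpc.client.loads(body)
print(method, params)
```

<p>The receiving side can then check that the source really links to the target and, if so, record a back-link, which is how the notification helps links to accumulate in both directions.</p>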
<h3>About Sören Auer</h3>
<p>Dr. Sören Auer leads the research group <a href="http://aksw.org/About">Agile Knowledge Engineering and Semantic Web (AKSW)</a> at University of Leipzig. His research interests include Semantic Web technologies, knowledge representation, engineering and management, agile methodologies as well as databases and information systems. Sören is founder (or co-founder) of several high-impact research and community projects such as the Wikipedia semantification project DBpedia, the open-source innovation platform <a href="http://cofundos.org/">Cofundos.org</a> or the social Semantic Web toolkit <a href="http://ontowiki.net/Projects/OntoWiki">OntoWiki</a>. Sören is author of over 50 peer-reviewed scientific publications, co-organiser of several workshops, chair of the Social Semantic Web conference 2007 and I-Semantics 2008, serves as an expert for industry, the European Commission and the W3C, and is a member of the advisory board of the Open Knowledge Foundation.</p>
<!-- sphereit end -->]]></content:encoded>
			<wfw:commentRss>http://blog.semantic-web.at/2010/04/15/soren-auer-establishing-a-network-effect-around-linked-data-is-the-most-important-rd-goal-for-the-near-future/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Interview with Juan Sequeda: &#8220;I believe Linked Data will enable new killer apps that are only possible thanks to Linked Data.&#8221;</title>
		<link>http://blog.semantic-web.at/2010/04/14/interview-with-juan-sequeda-i-believe-linked-data-will-enable-new-killer-apps-that-are-only-possible-thanks-to-linked-data/</link>
		<comments>http://blog.semantic-web.at/2010/04/14/interview-with-juan-sequeda-i-believe-linked-data-will-enable-new-killer-apps-that-are-only-possible-thanks-to-linked-data/#comments</comments>
		<pubDate>Wed, 14 Apr 2010 15:30:40 +0000</pubDate>
		<dc:creator>Tassilo Pellegrini</dc:creator>
				<category><![CDATA[Calls & Competitions]]></category>
		<category><![CDATA[Linked Data & Open Data]]></category>
		<category><![CDATA[Semantic Web Applications]]></category>
		<category><![CDATA[Linked Data]]></category>
		<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[Web of Data]]></category>

		<guid isPermaLink="false">http://blog.semantic-web.at/?p=1548</guid>
		<description><![CDATA[Juan Sequeda, co-chair of the Triplification Challenge 2010 and one of the core figures in the Linked  Data movement, gives us his view how the Semantic Web might evolve. His  central message: &#8220;Once there is an incentive to create quality links,  these links will start to show up. And then users will [...]]]></description>
			<content:encoded><![CDATA[<!-- sphereit start --><p><a href="http://blog.semantic-web.at/wp-content/uploads/2010/04/juan.jpg"><img class="size-full wp-image-1549 alignnone" title="juan" src="http://blog.semantic-web.at/wp-content/uploads/2010/04/juan.jpg" alt="" width="75" height="93" /></a>Juan Sequeda, co-chair of the<a href="http://i-semantics.tugraz.at/triplification-challenge"> Triplification Challenge 2010</a> and one of the core figures in the Linked  Data movement, gives us his view how the Semantic Web might evolve. His  central message: &#8220;Once there is an incentive to create quality links,  these links will start to show up. And then users will start linking to  the data hubs of their interest.&#8221;</p>
<h3>Linked Data itself has grabbed a lot of attention inside the  Semantic Web community recently. But what about the outside perspective?  Could linked data be called the killer app for the Semantic Web?</h3>
<p>I foresee two things happening with Linked Data. One is from the web development perspective (the so-called Web 2.0 developers) and the other is from the enterprise perspective. The web development community will sooner rather than later realize that Linked Data will enable easy integration of data and therefore will ease the pain of consuming data from different data sources. Thanks to big organizations such as BBC, New York Times, Reuters, Best Buy, etc. web developers will start paying attention to this &#8220;new thing&#8221; called Linked Data.</p>
<p>What we need is for the Semantic Web community itself to start creating applications on top of current Linked Data, so that when the outside web development community starts to pay attention, they have something to chew on. We (the semantic web community) need to start speaking the web development language. There is still a big gap. I have had personal experiences with people in the web development community who think that RDF is XML and, because they hate XML, they will never consider it. This is false and this is something that we need to change.</p>
<p>From the enterprise perspective, Linked Data is another data integration  solution. Data integration has been a problem since day one of  relational databases. I believe enterprises will be open to consider new  solutions with new technologies. I&#8217;m hoping to see new startups  tackling the enterprise domain. Imagine being able to query &#8220;get all my  clients from cities whose population is greater than 1 million&#8221; even  though I don&#8217;t have the data about population of cities in my database.</p>
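<p>The query in this answer can be sketched in a few lines of Python. The client records, city URIs and population figures below are invented placeholders; in practice the population data would come from an external Linked Data source, and the query would more likely be written in SPARQL.</p>

```python
# Local, in-house data: clients keyed by the URI of the city they are based in.
clients = [
    {"name": "Client A", "city": "http://example.org/city/X"},
    {"name": "Client B", "city": "http://example.org/city/Y"},
    {"name": "Client C", "city": "http://example.org/city/Z"},
]

# External Linked Data about cities (hypothetical values, not real figures).
population = {
    "http://example.org/city/X": 2_500_000,
    "http://example.org/city/Y": 300_000,
}


def clients_in_big_cities(clients, population, threshold=1_000_000):
    """Join local records with external facts through the shared city URI."""
    return [
        client["name"]
        for client in clients
        # Cities without a known population are treated as not matching.
        if population.get(client["city"], 0) > threshold
    ]


print(clients_in_big_cities(clients, population))
```

<p>The join works only because both sides identify the city by the same URI, which is the integration benefit being described.</p>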
<p>Is Linked Data the killer app for the Semantic Web? Before I answer  that, I would like to ask, what was the killer app of the Web? Was it  the browser? Was it e-commerce? Was it search? Was it Amazon or Ebay or  Google? I believe Linked Data will enable new killer apps, apps that are  only possible thanks to Linked Data. The browser was only possible  because of HTML. So let&#8217;s ask ourselves what is possible because of  Linked Data, and there we will find our killer app.</p>
<h3>One of the core deficiencies of the young open data cloud is the small number of interlinks between datasets. Is it just a matter of time to overcome this or are there other measures needed to turn the existing datasets into a true giant global graph?</h3>
<p>I like to remind myself that this new wave of semantic web technologies is an extension of the current web. Therefore we should analyze how the web evolved in the beginning. Initially, everything was a bunch of documents on the web, in which people manually created links to other documents. When Google started, it created an incentive to offer quality links between documents. This also created data hubs. If you write a blog post about a book, most probably you will link to the web document of that book on Amazon and/or Wikipedia. I believe that this will happen with Linked Data. Once there is an incentive to create quality links, these links will start to show up. And then users will start linking to the data hubs of their interest.</p>
<h3>Open Governmental Data is a big issue at the moment. The US and UK governments have started to apply Linked Data principles to turn this vision into reality. Lots of other countries are following. What do you expect from this trend?</h3>
<p>I believe that Linked Data will take off thanks to the initiative of governments. We always talk about the chicken and egg problem of the semantic web. Once we have organizations that don&#8217;t even think about it and are just interested in putting their data on the web, the semantic web will start to grow. If Bookstore ABC puts its data on the web, it may not be so meaningful. But if the US and UK governments put their data on the web, following the Linked Data principles, then people can wake up and say &#8220;ok, so this is for real. Let me start paying attention to this&#8221;.</p>
<h3>You are one of the chairs of the Triplification Challenge 2010. Can  you give us a brief insight what to expect from this year&#8217;s challenge?  What are the conditions to participate?</h3>
<p>The <a href="http://i-semantics.tugraz.at/triplification-challenge/call-for-submissions">Triplification  Challenge</a> this year has grown and is very exciting. For the first  time, it is offering two different tracks.</p>
<p>The first track, the Open Track, will accept submissions in three areas: 1) new datasets that are published following the Linked Data principles and that show potential benefit, 2) generic methods, mechanisms and approaches of creating Linked Data from legacy datasets and 3) applications that make use of Linked Data.</p>
<p>The second track is the New York Times track, which will accept submissions of applications that make use of the New York Times Linked Data and one or more government datasets. The objective is to create an application powered by Linked Data that would be of interest to any constituent of that government.</p>
<p>I personally believe that the year 2010 is the year of creating Linked  Data applications and the Triplification Challenge is the way to be part  of it.</p>
<!-- sphereit end -->]]></content:encoded>
			<wfw:commentRss>http://blog.semantic-web.at/2010/04/14/interview-with-juan-sequeda-i-believe-linked-data-will-enable-new-killer-apps-that-are-only-possible-thanks-to-linked-data/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Interview with Georgi Kobilarov: &#8220;I believe that data publishing must happen in a distributed style.&#8221;</title>
		<link>http://blog.semantic-web.at/2010/03/26/interview-with-georgi-kobilarov-i-believe-that-data-publishing-must-happen-in-a-distributed-style/</link>
		<comments>http://blog.semantic-web.at/2010/03/26/interview-with-georgi-kobilarov-i-believe-that-data-publishing-must-happen-in-a-distributed-style/#comments</comments>
		<pubDate>Fri, 26 Mar 2010 11:09:45 +0000</pubDate>
		<dc:creator>Tassilo Pellegrini</dc:creator>
				<category><![CDATA[Linked Data & Open Data]]></category>
		<category><![CDATA[Mashups & Web services]]></category>
		<category><![CDATA[Semantic Web Applications]]></category>
		<category><![CDATA[Tools & Software]]></category>
		<category><![CDATA[Linked Data]]></category>
		<category><![CDATA[Web of Data]]></category>

		<guid isPermaLink="false">http://blog.semantic-web.at/?p=1509</guid>
		<description><![CDATA[Uberblic.org connects  structured data from the web. The Berlin-based inventor Georgi Kobilarov  gives a brief insight into the mashup service and talks about the  challenges when it comes to build applications upon linked data.
You have recently published the service uberblic.org, a Linked Data  mashup editor. What was your motivation to develop [...]]]></description>
			<content:encoded><![CDATA[<!-- sphereit start --><p><a href="http://uberblic.org"></a><a href="http://blog.semantic-web.at/wp-content/uploads/2010/03/georgi.jpg"><img style="margin-left: 5px; margin-right: 5px;" title="georgi" src="http://blog.semantic-web.at/wp-content/uploads/2010/03/georgi.jpg" alt="" width="75" height="80" /></a>Uberblic.org connects structured data from the web. The Berlin-based inventor Georgi Kobilarov gives a brief insight into the mashup service and talks about the challenges of building applications on linked data.</p>
<h3>You have recently published the service uberblic.org, a Linked Data  mashup editor. What was your motivation to develop this tool?</h3>
<p><a href="http://uberblic.org/">Uberblic.org</a> provides an integrated view of web data. Our goal is to integrate all the structured data on the web, and give web developers a single point of access to that reconciled data. More than that, we will open up the tools we use to manage the data sources to the community, so that people can help us curate that repository of free data. We re-publish all the data we import as Linked Data, under the licenses of the original data publishers.</p>
<p>Some of the data sources we import are available in the Linked Open Data cloud as well, but many are not. Linked Data is an elegant way to publish data in a distributed way on the web, but consuming it from that distributed cloud is &#8211; at least &#8211; impractical. In every real-world application using linked data from the web I&#8217;ve seen, organizations built up internal copies of the cloud, and often even reconciled linked data sources. They build their own Linked Data proxies. <a href="http://www.uberblic.org/">Uberblic.org</a> helps those users by providing one public proxy for data from the web. Many of our sources get monitored for data changes, and the corresponding data in uberblic is updated in real-time.</p>
<p><img title="uberblic" src="http://www.semantic-web.at/file_upload/1_tmpphpmM4pWv.jpg" alt="uberblic" /></p>
<h3>Can you give us a brief insight into how the tool works? What technology is it built on?</h3>
<p>My company, Uberblic Labs, has developed a data integration  platform that we use to power uberblic.org. We call it the Uberblic  Platform (the name uberblic is derived from the German &#8220;Überblick&#8221; &#8211;  English &#8220;overview&#8221;). This platform enables us to do the full process of  &#8220;data fusion&#8221;: Importing and converting external data sources, mapping  the data schemas to a central ontology, filtering out data errors,  automatically suggesting duplicates to the user, and merging data from  different sources into a single, reconciled representation.</p>
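<p>The &#8220;data fusion&#8221; steps listed here (schema mapping, error filtering, duplicate detection, merging) can be sketched in miniature. The Python below is not the Uberblic Platform, which the interview says is written in Scala; the source names, field names and matching rule are hypothetical, the sample values are invented, and real duplicate detection would suggest candidates to a user rather than merge on a normalised name.</p>

```python
# Hypothetical mapping from source-specific field names to a central schema.
SCHEMA_MAP = {
    "source_a": {"title": "name", "pop": "population"},
    "source_b": {"label": "name", "inhabitants": "population"},
}


def to_central(source, record):
    """Schema mapping: rename known fields, drop the ones we cannot map."""
    mapping = SCHEMA_MAP[source]
    return {mapping[k]: v for k, v in record.items() if k in mapping}


def is_valid(record):
    """Error filtering: a deliberately crude check on one field."""
    pop = record.get("population")
    return pop is None or (isinstance(pop, int) and pop >= 0)


def fuse(records):
    """Duplicate detection and merging, keyed on a normalised name."""
    merged = {}
    for record in filter(is_valid, records):
        target = merged.setdefault(record["name"].strip().lower(), {})
        for key, value in record.items():
            target.setdefault(key, value)  # the first source seen wins on conflicts
    return list(merged.values())


imported = [
    to_central("source_a", {"title": "Springfield"}),
    to_central("source_b", {"label": "springfield ", "inhabitants": 30000}),
    to_central("source_a", {"title": "Shelbyville", "pop": -1}),  # filtered out
]
print(fuse(imported))
```

<p>Keeping each step a separate function mirrors the staged process described in the answer and makes it easy to swap in a stricter validator or a better duplicate matcher.</p>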
<p>Structured and semi-structured data from the web is an excellent use  case for our software platform, since there we come across all the  interesting cases of real-world data heterogeneity. But what I think is  especially powerful and yet missing in other Linked Data projects I  know, is the ability to subscribe to update-feeds. We do that  extensively, fetching updates in real-time from Wikipedia and the like.</p>
<p>Our platform is built in Scala and runs on a cluster of machines, with workers communicating through a messaging system. We developed an RDF storage layer on top of a distributed key-value store for storing all provenance information used in the extraction process, currently around 100 million named graphs for uberblic.org. That storage layer does not directly provide SPARQL access, so we push all the output data into a SPARQL endpoint hosted by Talis as well.</p>
<h3>What have been the biggest challenges in tackling the integration  issues of dispersed data?</h3>
<p>It was quite a steep learning curve to do Linked Data not only in an academic environment, but in a reliable, industrial-strength set-up. In academia, there was always the excuse that things are just research prototypes. Now that excuse is gone. That&#8217;s also where it becomes necessary to manually clean up data. And there are two ways to do that: Either you enable the users to change facts directly in your repository after you have imported the external data (that is what Freebase does), or you facilitate clean-up cycles in the original data source and fetch these updates in real-time. That is what we do.</p>
<p>I believe that data publishing must happen in a distributed style, because then each data source gets taken care of by a specialized group of people using specialized tools. And it&#8217;s what you see not only on the web, but also inside organizations and enterprises. But consuming data through centralized APIs is more than just convenient. We all use Google or another search engine as a central access point to web pages which are published in a distributed way all over the web, don&#8217;t we? Can you imagine today researching a topic on the web without the centralization power of search engines, just by following links across web sites, like in the old days?</p>
<p>When we built the Uberblic Platform, some of the things I imagined to be large headaches, like schema mapping, turned out to work really well. Those pathological cases you often see in academic &#8220;challenges&#8221; are &#8211; well &#8211; pathological. It&#8217;s not necessary to solve them fully automatically through super-intelligent algorithms. Much more important than the sophistication of your algorithms are well-designed workflows, so that the user becomes a part of the solution. And that&#8217;s not about crowd-sourcing or swarm intelligence: the editorial curating of schema mappings and object reconciliation can be done by just a small team of people, if they have the right set of tools.</p>
<h3>What are the next plans with uberblic.org? Where will the journey  go?</h3>
<p>Uberblic.org will continue to integrate more interesting and  useful data sources from the web, and we will start making more APIs  available to web developers to build their applications on top. We are  also looking for partners who are interested in developing applications  and have been struggling in the past to get the cross-source data from  the web they need.</p>
<p>The work on improving uberblic.org will also benefit our Uberblic  Platform, and hence our clients who use that same software for  integrating organizational data sources with each other and with the web  of data.</p>
<h3>About Georgi Kobilarov</h3>
<p>Georgi is founder and managing director of <a href="http://uberblic.com/">Uberblic Labs</a>, a company based in Berlin specializing in Linked Data integration. He worked as a research associate in the Web-based Systems Group at Freie Universität Berlin and as a visiting researcher at Hewlett Packard Labs Bristol. As co-founder and lead developer of DBpedia, he was also a day-one contributor to the Linking Open Data project. Georgi is consulting with the BBC on several Linked Data related projects. He organizes the Web of Data Meetup London, a bi-yearly gathering of the UK Linked Data community. Georgi graduated with a Diplom in business administration from Freie Universität Berlin and has many years of work experience as a software developer. Visit his blog: <a href="http://blog.georgikobilarov.com/" target="_blank">http://blog.georgikobilarov.com</a></p>
<!-- sphereit end --><span style="margin-bottom:40px; border-bottom:none;"><a class="iconsphere" title="Sphere: Related Content" onclick="return Sphere.Widget.search('http://blog.semantic-web.at/2010/03/26/interview-with-georgi-kobilarov-i-believe-that-data-publishing-must-happen-in-a-distributed-style/')" href="http://www.sphere.com/search?q=sphereit:http://blog.semantic-web.at/2010/03/26/interview-with-georgi-kobilarov-i-believe-that-data-publishing-must-happen-in-a-distributed-style/">Sphere: Related Content</a></span><br/><br/>]]></content:encoded>
			<wfw:commentRss>http://blog.semantic-web.at/2010/03/26/interview-with-georgi-kobilarov-i-believe-that-data-publishing-must-happen-in-a-distributed-style/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Interview with Marco Neumann: &#8220;It&#8217;s definitely an exciting time to be on the Semantic Web!&#8221;</title>
		<link>http://blog.semantic-web.at/2010/03/25/interview-with-marco-neumann-its-definitely-an-exciting-time-to-be-on-the-semantic-web/</link>
		<comments>http://blog.semantic-web.at/2010/03/25/interview-with-marco-neumann-its-definitely-an-exciting-time-to-be-on-the-semantic-web/#comments</comments>
		<pubDate>Thu, 25 Mar 2010 10:39:26 +0000</pubDate>
		<dc:creator>Tassilo Pellegrini</dc:creator>
				<category><![CDATA[Linked Data & Open Data]]></category>
		<category><![CDATA[Miscellaneous]]></category>
		<category><![CDATA[Semantic Web Applications]]></category>
		<category><![CDATA[Software Development]]></category>
		<category><![CDATA[html 5]]></category>
		<category><![CDATA[semantic web standards]]></category>

		<guid isPermaLink="false">http://blog.semantic-web.at/?p=1493</guid>
		<description><![CDATA[Marco Neumann is an Information Scientist and CEO of KONA a consulting  and technology service company based in New York City. The Semantic Web  activist is an invited expert to the W3C HTML 5 working group. He  recently started a discussion on the challenges and difficulties in  bringing the Semantic Web [...]]]></description>
			<content:encoded><![CDATA[<!-- sphereit start --><p><a href="http://blog.semantic-web.at/wp-content/uploads/2010/03/marco1.jpg"><img style="margin-left: 5px; margin-right: 5px;" title="marco" src="http://blog.semantic-web.at/wp-content/uploads/2010/03/marco1.jpg" alt="" width="70" height="89" /></a>Marco Neumann is an Information Scientist and CEO of KONA a consulting  and technology service company based in New York City. The Semantic Web  activist is an invited expert to the W3C HTML 5 working group. He  recently started a discussion on the challenges and difficulties in  bringing the Semantic Web into business. SWC asked him for some  additional comments.</p>
<h3>Marco, you recently initiated a discussion in a Google Group on the difficulty of changing Semantic Web standards. What was the background of the discussion? Where do you perceive a need for action?</h3>
<p>It&#8217;s not so much about changing these existing standards as about the challenge of bringing them into the world of practitioners and standards developers. The language used in W3C recommendations quite frequently requires advanced topic knowledge and familiarity with the jargon of the discussion about the respective technologies. I recently discussed this with a senior standards maven at the W3C and got the answer that the recommendations can&#8217;t be changed retrospectively and that they are intended to be used primarily by vendors for implementation purposes.</p>
<p>Well, this might be the case, but I also got the impression that Tim Berners-Lee&#8217;s objective for the W3C is primarily to meet the needs of a larger community. And the W3C took this into account for most of the Semantic Web recommendations in the past. Something I still find amazing is the fact that the work process at the W3C is partially, and the recommendations are entirely, publicly accessible. We definitely still need more and better tools to work with semantic web data, higher-quality documentation and, last but not least, more user adoption on the web.</p>
<h3>Critics of the Semantic Web often refer to the slow uptake of  Semantic Web standards by industry. Is standards adoption actually a  valid and sufficient metric to evaluate the maturity of a standard? What  would be needed to accelerate the uptake?</h3>
<p>I think we might see a scenario similar to the uptake of HTML in the early 90s: a relatively small number of technology mavens will pave the way towards making the Semantic Web more attractive as a technology solution for a wide range of applications and will successfully publish open data before we see business application developers make use of Semantic Web standards.</p>
<h3>The availability of trustworthy and quality-approved RDF data is crucial for the success of the Semantic Web. Given the fact that the aggregation business on the WWW is highly concentrated, the corresponding formula is simple: If Google just consumes but does not give back RDF, the Semantic Web won&#8217;t scale. Do you agree?</h3>
<p>Yes and no. Yes, we need better and more semantic data on the Web, but we will also need better ways to deal with trust in a lightweight and web-friendly fashion. I currently see a number of semi-automated approaches emerging that could scale on the web. Examples are distributed user-based recommendation systems to validate authenticity, open Wikipedia-style community evaluation and content curation à la Freebase. Increased public accountability for data producers might be an interesting avenue as well. With regard to Google, I&#8217;d say web search engines will go where the web goes. A problem I might see arising is that web search engines will initially develop their own standards to deal with the emerging Semantic Web and confuse users on the web, or might pursue a time-consuming power play with the W3C. I see a little bit of that in the current discussion in the HTML 5 working group.</p>
<h3>As we know from the social sciences, technological standards are necessary but always incomplete and unsatisfactory. From a standards design and outreach perspective: What would it take to make the Semantic Web flourish?</h3>
<p>I&#8217;m not sure if we really know all that much about the laws of innovation and the evolution of technology standards at this point. If we draw from the short experience with the World Wide Web, I would come to the conclusion that innovation takes place in small to medium-sized teams that pursue an independent vision of how services should be delivered and how the technology should be designed. In addition, Tim Berners-Lee encourages the production of lots and lots of data to bootstrap the Semantic Web and create a pull for services in the industry. And indeed we really see some traction, for example with the Linked Open Data and Open Government initiatives. It&#8217;s definitely an exciting time to be on the Semantic Web!</p>
<h3>About Marco Neumann</h3>
<p>Marco Neumann is an Information Scientist and CEO of KONA, a consulting and technology service company based in New York City. KONA provides semantic technologies to business solutions and adds value to products and services in a highly networked economy. In addition, Marco currently acts as an Invited Expert to the W3C on the HTML 5 working group and is the director of the global semantic social network <a href="http://www.lotico.com./">lotico.com</a>.</p>
<!-- sphereit end --><span style="margin-bottom:40px; border-bottom:none;"><a class="iconsphere" title="Sphere: Related Content" onclick="return Sphere.Widget.search('http://blog.semantic-web.at/2010/03/25/interview-with-marco-neumann-its-definitely-an-exciting-time-to-be-on-the-semantic-web/')" href="http://www.sphere.com/search?q=sphereit:http://blog.semantic-web.at/2010/03/25/interview-with-marco-neumann-its-definitely-an-exciting-time-to-be-on-the-semantic-web/">Sphere: Related Content</a></span><br/><br/>]]></content:encoded>
			<wfw:commentRss>http://blog.semantic-web.at/2010/03/25/interview-with-marco-neumann-its-definitely-an-exciting-time-to-be-on-the-semantic-web/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Linking Open Data to Thesaurus Management</title>
		<link>http://blog.semantic-web.at/2010/02/16/linking-open-data-to-thesaurus-management/</link>
		<comments>http://blog.semantic-web.at/2010/02/16/linking-open-data-to-thesaurus-management/#comments</comments>
		<pubDate>Tue, 16 Feb 2010 16:23:11 +0000</pubDate>
		<dc:creator>Tassilo Pellegrini</dc:creator>
				<category><![CDATA[Corporate Semantic Web]]></category>
		<category><![CDATA[Knowledge Management]]></category>
		<category><![CDATA[Linked Data & Open Data]]></category>
		<category><![CDATA[Search Engines]]></category>
		<category><![CDATA[Semantic Web Applications]]></category>
		<category><![CDATA[Software Development]]></category>
		<category><![CDATA[dbpedia]]></category>
		<category><![CDATA[KIWI]]></category>
		<category><![CDATA[kiwiknows]]></category>
		<category><![CDATA[Linked Data]]></category>
		<category><![CDATA[PoolParty]]></category>
		<category><![CDATA[RDFa]]></category>
		<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[Simple Knowledge Organization System]]></category>
		<category><![CDATA[SKOS]]></category>

		<guid isPermaLink="false">http://blog.semantic-web.at/?p=1430</guid>
		<description><![CDATA[The Vienna-based company punkt. netServices is just about to release a demo version of their PoolParty service, a SKOS-based thesaurus management tool with linked data capabilities. I had the chance to pre-read a white paper and test their service. Here is a brief overview. You can also try a demo.
Purpose
PoolParty was conceived to facilitate various [...]]]></description>
			<content:encoded><![CDATA[<!-- sphereit start --><p><a href="http://blog.semantic-web.at/wp-content/uploads/2010/02/poolparty-logo.jpg"><img class="alignleft size-full wp-image-1466" title="poolparty-logo" src="http://blog.semantic-web.at/wp-content/uploads/2010/02/poolparty-logo-e1266070425356.jpg" alt="" width="261" height="95" /></a>The Vienna-based company <a href="http://www.punkt.at" target="_blank">punkt. netServices</a> is just about to release a demo version of their PoolParty service, a SKOS-based thesaurus management tool with linked data capabilities. I had the chance to pre-read a white paper and test their service. Here is a brief overview. You can also try a <a href="http://poolparty.punkt.at/PoolParty/" target="_blank">demo</a>.</p>
<p><strong>Purpose</strong></p>
<p>PoolParty was conceived to facilitate various applications like</p>
<ul>
<li> Semantic search engines</li>
<li> Recommender systems (similarity search)</li>
<li> Corporate bookmarking</li>
<li> Annotation- &amp; tag recommender systems</li>
<li> Autocomplete services and faceted browsing.</li>
</ul>
<p>These use cases can be achieved either by using PoolParty stand-alone or by integrating it with existing Enterprise Search Engines, Document Management Systems or Enterprise Wikis.</p>
<p><strong>Thesaurus Management</strong></p>
<p>PoolParty aims to be easy to use for people without a strong Semantic Web background or special technical skills. The GUI is entirely web-based and utilizes AJAX, so the user can e.g. quickly merge two concepts via drag &amp; drop. An overview of the thesaurus can be gained with a tree or a graph view of the concepts.</p>
<p><a href="http://blog.semantic-web.at/wp-content/uploads/2010/02/poolparty-blueskin.jpg"><img title="poolparty-blueskin" src="http://blog.semantic-web.at/wp-content/uploads/2010/02/poolparty-blueskin.jpg" alt="poolparty-blueskin" width="504" height="263" /></a></p>
<p>PoolParty also helps to semi-automatically add concepts to a thesaurus as it can be used to analyse documents (e.g. web pages or PDF files) relevant to a thesaurus&#8217; domain in order to glean candidate terms. This is done by the key-phrase extractor of <a href="http://www.nzdl.org/Kea/index.html">KEA</a>. The extracted terms can be selected by the user, thereby becoming &#8220;free concepts&#8221; which later can be integrated into the thesaurus, turning them into &#8220;approved concepts&#8221;.</p>
<p>Documents can be searched in various ways – either by keyword search in the full text, by searching for their tags or by semantic search and similarity search. Semantic search takes into account not only a concept&#8217;s preferred label, but also its synonyms and the labels of its related concepts. The user can manually remove query terms used in semantic search and adjust the boost values for the various relations it considers. The recommendation mechanism for document similarity works the same way.</p>
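<p>The query expansion described above can be pictured with a small Python sketch. It is not PoolParty&#8217;s actual implementation: the <code>Concept</code> class, the <code>expand_query</code> function and the boost values are assumptions made purely for illustration.</p>
<pre>
```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    pref_label: str
    alt_labels: list = field(default_factory=list)  # synonyms
    related: list = field(default_factory=list)     # other Concept objects


# Hypothetical boost values; the article only says that they are adjustable.
DEFAULT_BOOSTS = {"pref": 1.0, "alt": 0.8, "related": 0.5}


def expand_query(concept, boosts=DEFAULT_BOOSTS, removed=()):
    """Return {term: boost} for a concept, skipping terms the user removed."""
    candidates = [(concept.pref_label, boosts["pref"])]
    candidates += [(label, boosts["alt"]) for label in concept.alt_labels]
    candidates += [(rel.pref_label, boosts["related"]) for rel in concept.related]

    terms = {}
    for term, boost in candidates:
        if term in removed:
            continue
        # Keep the highest boost if a term shows up more than once.
        terms[term] = max(boost, terms.get(term, 0.0))
    return terms


rum = Concept("Rum")
mojito = Concept("Mojito", alt_labels=["Mojito cocktail"], related=[rum])
query_terms = expand_query(mojito, removed={"Mojito cocktail"})
```
</pre>
<p>The resulting term-to-boost mapping could then be handed to whatever full-text engine indexes the documents.</p>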
<p>PoolParty by default also publishes a Semantic Wiki version of its thesauri, which provides an alternative way to browse and edit concepts. Through this feature anyone can get read access to a thesaurus, and optionally also edit, add or delete labels of concepts. Search and autocomplete functions are available here as well. The Wiki’s XHTML source is also enriched with RDFa, thereby exposing all RDF metadata associated with a concept to be picked up by RDF search engines and crawlers. (See two examples: <a href="http://poolparty.punkt.at/PoolParty/HTMLFrontEnd/urn:uuid:1D64A764-CBCE-0001-6148-DA20F637144F/" target="_blank">Cocktail thesaurus</a> &amp;  <a href="http://poolparty.punkt.at/PoolParty/HTMLFrontEnd/urn:uuid:1D649E15-C6CC-0001-C311-60702F00C880/?URI=http%3A%2F%2Fzbw.eu%2Fstw" target="_blank">Standard Thesaurus for Economics</a>)</p>
<p style="text-align: center;"><a href="http://blog.semantic-web.at/wp-content/uploads/2010/02/PoolParty-Wiki-Frontend.png"><img class="aligncenter size-full wp-image-1468" title="PoolParty Wiki Frontend" src="http://blog.semantic-web.at/wp-content/uploads/2010/02/PoolParty-Wiki-Frontend.png" alt=""  /></a></p>
<p>PoolParty also supports the import of thesauri in SKOS (including several consistency checks) or <a href="http://zthes.z3950.org/" target="_blank">Zthes</a> format. Those functionalities can also be consumed as stand-alone web services via <a href="http://demo.semantic-web.at:8080/SkosServices/index" target="_blank">PoolParty SKOS Services</a>. Additionally, lists of concepts and their labels can be imported via CSV files.</p>
<p><strong>Linked (Open) Data</strong></p>
<p>PoolParty not only publishes its thesauri as Linked Open Data (in addition to a SPARQL endpoint), but it also consumes LOD in order to expand thesauri with information from LOD sources.</p>
<p>Concepts in the thesaurus can be linked to e.g. DBpedia  via a service like <a href="http://www.georgikobilarov.com/">Georgi Kobilarov</a>&#8217;s <a href="http://lookup.dbpedia.org/" target="_blank">DBpedia lookup service</a>, which takes the label of a concept and returns possible matching candidates. The system suggests relevant resources from DBpedia and the user can select the one that matches the concept from his thesaurus, thereby creating a skos:exactMatch relation between the concept URI in PoolParty and the DBpedia URI. The same approach can be used to link to other SKOS thesauri available as Linked Data.</p>
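<p>As a rough Python sketch of that linking step: the candidate list below is hand-written and merely stands in for what a lookup service might return, and <code>link_concept</code> is a hypothetical helper, not part of PoolParty. Only the <code>skos:exactMatch</code> property URI is taken from the SKOS vocabulary.</p>
<pre>
```python
SKOS_EXACT_MATCH = "http://www.w3.org/2004/02/skos/core#exactMatch"


def link_concept(concept_uri, candidates, chosen_index):
    """Return the (subject, predicate, object) triple for the candidate the user picked."""
    target_uri = candidates[chosen_index]["uri"]
    return (concept_uri, SKOS_EXACT_MATCH, target_uri)


# Hand-written stand-in for lookup results (assumed shape: label plus URI).
candidates = [
    {"label": "Mojito", "uri": "http://dbpedia.org/resource/Mojito"},
    {"label": "Mojito (disambiguation)", "uri": "http://dbpedia.org/resource/Mojito_(disambiguation)"},
]

# The user selects the first suggestion for a concept in an example thesaurus.
triple = link_concept("http://example.org/thesaurus/mojito", candidates, 0)
```
</pre>
<p>Keeping the human selection as an explicit argument reflects the workflow in the text: the system only suggests, and the user confirms the match.</p>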
<p><a href="http://blog.semantic-web.at/wp-content/uploads/2010/02/poolparty-lod.jpg"><img title="poolparty-lod" src="http://blog.semantic-web.at/wp-content/uploads/2010/02/poolparty-lod.jpg" alt="poolparty-lod" width="630" height="265" /></a></p>
<p>Other triples can also be retrieved from the target data source, e.g. the DBpedia abstract can become a skos:definition and geographical coordinates can be imported and be used to display the location of a concept on the map, where appropriate. The DBpedia category information may also be used to retrieve additional concepts of that category as siblings of the concept in focus, in order to populate the thesaurus.</p>
<p>PoolParty is capable of importing a SKOS thesaurus from a Linked Data server, and may also receive updates to thesauri imported this way. This feature has been implemented in the course of the <a href="http://www.kiwi-project.eu/" target="_blank">KiWi  project</a> funded by the European Commission. KiWi also contains SKOS thesauri and exposes them as LOD. Both systems can read a thesaurus via the other’s LOD interfaces and may write it to their own store. This is facilitated by special Linked Data URIs that return e.g. all the top-concepts of a thesaurus, with pointers to the URIs of their narrower concepts, which allow other systems to retrieve a complete thesaurus through iterative dereferencing of concept URIs.</p>
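<p>The iterative dereferencing mentioned above amounts to a graph traversal starting at the top concepts. The following Python sketch assumes a caller-supplied <code>dereference</code> function that returns the narrower concept URIs for a given URI; here an in-memory dictionary with made-up URIs replaces the real HTTP lookups.</p>
<pre>
```python
from collections import deque


def harvest_thesaurus(top_concepts, dereference):
    """Collect every concept URI reachable from the top concepts.

    dereference(uri) must return the list of narrower concept URIs
    found when looking up uri.
    """
    seen = set()
    order = []
    queue = deque(top_concepts)
    while queue:
        uri = queue.popleft()
        if uri in seen:
            continue  # a concept may be reachable via several broader concepts
        seen.add(uri)
        order.append(uri)
        queue.extend(dereference(uri))
    return order


# In-memory stand-in for a Linked Data server (hypothetical URIs).
narrower = {
    "ex:drinks": ["ex:cocktails", "ex:spirits"],
    "ex:cocktails": ["ex:mojito"],
    "ex:spirits": ["ex:rum"],
    "ex:mojito": [],
    "ex:rum": [],
}

concepts = harvest_thesaurus(["ex:drinks"], lambda uri: narrower.get(uri, []))
```
</pre>
<p>Passing the lookup in as a function keeps the traversal independent of how a particular server exposes its narrower-concept pointers.</p>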
<p>Additionally, KiWi and PoolParty publish lists of concepts created, modified, merged or deleted within user-specified time-frames. With this information the systems can learn about updates to one of their thesauri in an external system. They can then compare the versions of concepts in both stores and write the corresponding updates to their own store.</p>
<p>This means each system decides autonomously which data it accepts, and there is no risk of a system pushing data that might lead to inconsistencies into an external store. Data transfer and communication are achieved using REST/HTTP; no other protocols or middleware are necessary. Also, no rights management for external systems is needed, which otherwise would have to be configured separately for each source.</p>
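<p>A minimal Python sketch of such a pull-based update, under stated assumptions: the change-list format (dictionaries with <code>uri</code>, <code>action</code> and <code>label</code>), the <code>apply_changes</code> function and the <code>accept</code> policy are all hypothetical, and merged concepts are left out for brevity.</p>
<pre>
```python
def apply_changes(local, changes, accept=lambda change: True):
    """Apply a remote change list to a local {uri: label} store.

    Returns the URIs that were actually written or removed locally.
    """
    applied = []
    for change in changes:
        if not accept(change):
            continue  # the local system stays in control of what it writes
        uri, action = change["uri"], change["action"]
        if action in ("created", "modified"):
            if local.get(uri) == change["label"]:
                continue  # local copy is already up to date
            local[uri] = change["label"]
        elif action == "deleted":
            if uri not in local:
                continue
            del local[uri]
        else:
            continue  # other actions (e.g. "merged") are not modelled here
        applied.append(uri)
    return applied


store = {"ex:mojito": "Mojito", "ex:rum": "Rum"}
remote_changes = [
    {"uri": "ex:mojito", "action": "modified", "label": "Mojito (cocktail)"},
    {"uri": "ex:gin", "action": "created", "label": "Gin"},
    {"uri": "ex:rum", "action": "deleted", "label": None},
]

# Example policy: this system refuses deletions coming from outside.
written = apply_changes(store, remote_changes, accept=lambda c: c["action"] != "deleted")
```
</pre>
<p>Because the receiving side fetches the list and filters it with its own policy, nothing is ever pushed into a store against its will.</p>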
<p><strong>Technology</strong></p>
<p>The software is written in Java and utilizes the <a href="http://www.openrdf.org/doc/sesame2/system/ch05.html" target="_blank">SAIL API</a>, so it can be used with various triple stores. The thesaurus management itself (viewing, creating and editing SKOS concepts and their relationships) can be done in an AJAX Frontend based on <a href="http://developer.yahoo.com/yui/" target="_blank">Yahoo User Interface (YUI)</a>. Editing of labels can alternatively be done in a Wiki style HTML frontend. For key-phrase extraction from documents PoolParty uses a modified version of the <a href="http://www.nzdl.org/Kea/" target="_blank">KEA</a> 5 API, which is extended for the use of controlled vocabularies stored in a SAIL Repository (this module is available under GNU GPL). The analysed documents can be stored and indexed in <a href="http://en.wikipedia.org/wiki/Lucene" target="_blank">Lucene</a>/<a href="http://en.wikipedia.org/wiki/Solr" target="_blank">Solr</a> or any other (enterprise) search system along with extracted and semantically related concepts.</p>
<!-- sphereit end --><span style="margin-bottom:40px; border-bottom:none;"><a class="iconsphere" title="Sphere: Related Content" onclick="return Sphere.Widget.search('http://blog.semantic-web.at/2010/02/16/linking-open-data-to-thesaurus-management/')" href="http://www.sphere.com/search?q=sphereit:http://blog.semantic-web.at/2010/02/16/linking-open-data-to-thesaurus-management/">Sphere: Related Content</a></span><br/><br/>]]></content:encoded>
			<wfw:commentRss>http://blog.semantic-web.at/2010/02/16/linking-open-data-to-thesaurus-management/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>
