<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>The Semantic Puzzle&#187; Controlled Vocabulary</title>
	<atom:link href="http://blog.semantic-web.at/tag/controlled-vocabulary/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.semantic-web.at</link>
	<description>Open World Assumptions</description>
	<lastBuildDate>Thu, 02 Feb 2012 14:26:25 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>&#8220;Thesaurus based search engines will become main stream in the near future&#8221;</title>
		<link>http://blog.semantic-web.at/2011/06/26/thesaurus-based-search-engines-will-become-main-stream-in-the-near-future/</link>
		<comments>http://blog.semantic-web.at/2011/06/26/thesaurus-based-search-engines-will-become-main-stream-in-the-near-future/#comments</comments>
		<pubDate>Sun, 26 Jun 2011 08:19:52 +0000</pubDate>
		<dc:creator>Andreas Blumauer</dc:creator>
				<category><![CDATA[Linked Data & Open Data]]></category>
		<category><![CDATA[Search Engines]]></category>
		<category><![CDATA[Semantics & Philosophy]]></category>
		<category><![CDATA[Controlled Vocabulary]]></category>
		<category><![CDATA[SKOS]]></category>
		<category><![CDATA[survey]]></category>

		<guid isPermaLink="false">http://blog.semantic-web.at/?p=2186</guid>
		<description><![CDATA[The results of the survey titled &#8220;Do controlled vocabularies matter?&#8221; which was conducted by Semantic Web Company from May until June 2011 are public now. Over 150 participants from 27 countries draw a picture of the current and future usage &#8230; <a href="http://blog.semantic-web.at/2011/06/26/thesaurus-based-search-engines-will-become-main-stream-in-the-near-future/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>The results of the survey titled &#8220;Do controlled vocabularies matter?&#8221; which was conducted by Semantic Web Company from May until June 2011 are public now. Over 150 participants from 27 countries draw a picture of the current and future usage behaviour in the realm of controlled vocabularies.</p>
<p>Here are three of the most interesting outcomes of this questionnaire &#8211; the <a href="http://issuu.com/andreas_blumauer/docs/survey_do_controlled_vocabularies_matter_2011_june" target="_blank">whole report can be found and downloaded on issuu</a>:</p>
<blockquote><p><strong>Do you think enterprises and other organizations can significantly benefit from using Linked Data?</strong></p></blockquote>
<p><a href="http://blog.semantic-web.at/wp-content/uploads/2011/06/linked_data_benefit.jpg"><img class="alignleft size-medium wp-image-2187" title="linked data benefit" src="http://blog.semantic-web.at/wp-content/uploads/2011/06/linked_data_benefit-300x117.jpg" alt="" width="300" height="117" /></a>The answer is a clear<strong> YES. </strong>A subsequent question also reveals that organisations of all sizes hold about the same opinion on linked data. Only a few people think that linked data is a &#8220;niche thing&#8221;. In general, over <strong>90% of the participants</strong> think that <strong>most or at least some organisations can benefit from using linked data.</strong></p>
<blockquote><p><strong>Do you think that search engines which utilize thesauri to improve results will become main-stream?</strong></p></blockquote>
<p><a href="http://blog.semantic-web.at/wp-content/uploads/2011/06/thesaurus_based_search.jpg"><img class="alignleft size-medium wp-image-2193" title="thesaurus_based_search" src="http://blog.semantic-web.at/wp-content/uploads/2011/06/thesaurus_based_search-300x112.jpg" alt="" width="300" height="112" /></a>The results of this question are striking: <strong>Two thirds</strong> of the participants think that <strong>thesaurus based search</strong> is already main-stream or will become so in the near future. Scepticism towards this development seems to be low &#8211; at the very least, a clear majority thinks that <strong>thesaurus based search engines will become main stream in the near future.</strong></p>
<p>&nbsp;</p>
<blockquote><p><strong>How important is the usage of standards like SKOS for controlled vocabularies?</strong></p></blockquote>
<p><a href="http://blog.semantic-web.at/wp-content/uploads/2011/06/importance-of-skos.jpg"><img class="alignleft size-medium wp-image-2200" title="importance of skos" src="http://blog.semantic-web.at/wp-content/uploads/2011/06/importance-of-skos-300x111.jpg" alt="" width="300" height="111" /></a>The results speak for themselves. The majority of the participants are convinced that standards like SKOS are important for their daily work. In August 2009 W3C announced the new SKOS standard – now, nearly two years later, it looks like this standard has become well established. <strong>48.7% stated that standards like SKOS are very important and 29.1% voted for “relevant”</strong>.</p>
<p>&nbsp;</p>
<p>As an overall result of the survey it can be stated: <em>The Semantic Web community has done a great job of convincing controlled vocabulary practitioners of the benefits of SKOS and linked data &#8211; on the other hand, only 3-5% are aware of SPARQL as a valuable resource for building standard APIs around controlled vocabularies, which would lower the cost of implementing such knowledge organization systems.</em></p>
<p>Many thanks to all participants of this survey!</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.semantic-web.at/2011/06/26/thesaurus-based-search-engines-will-become-main-stream-in-the-near-future/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Which kind of controlled vocabularies matter?</title>
		<link>http://blog.semantic-web.at/2011/05/11/which-kind-of-controlled-vocabularies-matter/</link>
		<comments>http://blog.semantic-web.at/2011/05/11/which-kind-of-controlled-vocabularies-matter/#comments</comments>
		<pubDate>Wed, 11 May 2011 14:50:07 +0000</pubDate>
		<dc:creator>Thomas Schandl</dc:creator>
				<category><![CDATA[Calls & Competitions]]></category>
		<category><![CDATA[Knowledge Management]]></category>
		<category><![CDATA[Semantic Web Applications]]></category>
		<category><![CDATA[Controlled Vocabulary]]></category>
		<category><![CDATA[glossary]]></category>
		<category><![CDATA[ontology]]></category>
		<category><![CDATA[SKOS]]></category>
		<category><![CDATA[survey]]></category>
		<category><![CDATA[Thesaurus]]></category>

		<guid isPermaLink="false">http://blog.semantic-web.at/?p=2113</guid>
		<description><![CDATA[Looking at intermediate results of the Controlled Vocabularies Survey an interesting finding concerns the question which types of knowledge models are currently best fit for actual use in applications. So far 143 people whose organizations already make use of controlled &#8230; <a href="http://blog.semantic-web.at/2011/05/11/which-kind-of-controlled-vocabularies-matter/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Looking at intermediate results of the <a href="http://www.surveygizmo.com/s3/480834/Controlled-Vocabularies-Survey">Controlled Vocabularies Survey</a> an interesting finding concerns the question which types of knowledge models are currently best fit for actual use in applications.</p>
<p>So far 143 people whose organizations already make use of controlled vocabularies answered the question <strong>&#8220;Which kind of controlled vocabulary do you use or plan to use in your applications?&#8221;</strong>.<br />
The results so far show that lightweight models like taxonomies and thesauri are somewhat preferred over ontologies: </p>
<p><a href="http://blog.semantic-web.at/wp-content/uploads/2011/05/survey-question.jpg"><img src="http://blog.semantic-web.at/wp-content/uploads/2011/05/survey-question.jpg" alt="" title="survey question regarding types of knowledge models" width="435" height="221" class="aligncenter size-full wp-image-2114" /></a></p>
<p>Taxonomies are the favorite, as 73.6% of participants use or plan to use them, followed by thesauri (62%) and ontologies (61.2%), while simple glossaries lag considerably behind with a usage of 31.4%.</p>
<p>This survey will close in about a week, so please take this chance to make your opinions on this topic count! You can find the questions <a href="http://www.surveygizmo.com/s3/480834/Controlled-Vocabularies-Survey">here</a>; answering them takes 5-10 minutes.</p>
<p>All participants will gain access to a report with the results within the following month. The most interesting results will be made public on this blog.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.semantic-web.at/2011/05/11/which-kind-of-controlled-vocabularies-matter/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Controlled vocabularies: &#8220;Data integration is king&#8221;</title>
		<link>http://blog.semantic-web.at/2011/04/11/controlled-vocabularies-data-integration-is-king/</link>
		<comments>http://blog.semantic-web.at/2011/04/11/controlled-vocabularies-data-integration-is-king/#comments</comments>
		<pubDate>Mon, 11 Apr 2011 08:52:56 +0000</pubDate>
		<dc:creator>Andreas Blumauer</dc:creator>
				<category><![CDATA[Calls & Competitions]]></category>
		<category><![CDATA[Vocabularies & Languages]]></category>
		<category><![CDATA[Controlled Vocabulary]]></category>
		<category><![CDATA[data integrat]]></category>
		<category><![CDATA[SKOS]]></category>
		<category><![CDATA[survey]]></category>

		<guid isPermaLink="false">http://blog.semantic-web.at/?p=2069</guid>
		<description><![CDATA[Just recently a survey about &#8220;Controlled vocabularies&#8221; and their significance for enterprise information management has started. Until today 143 participants have responded and completed the survey at least partially. To give a first example what was found out, I would &#8230; <a href="http://blog.semantic-web.at/2011/04/11/controlled-vocabularies-data-integration-is-king/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Just recently a survey about &#8220;Controlled vocabularies&#8221; and their significance for enterprise information management has started. So far 143 participants have responded and completed the survey at least partially. To give a first example of the findings, I would like to take a closer look at the question: <strong>What are the main application areas of controlled vocabularies from your perspective?</strong></p>
<p>Somewhat surprisingly, the intermediate result is that neither &#8220;Semantic Search&#8221; nor &#8220;Support of multilingual applications&#8221; was considered the most important application. Instead, &#8220;Data Integration&#8221; turned out to be king:</p>
<p><a href="http://blog.semantic-web.at/wp-content/uploads/2011/04/Main_applications1.jpg"><img class="aligncenter size-medium wp-image-2072" title="Main_applications" src="http://blog.semantic-web.at/wp-content/uploads/2011/04/Main_applications1-300x184.jpg" alt="" width="300" height="184" /></a><br />
The bar graph shows the weighted value of each application candidate (1.0 would mean 100% agreement that this is an important application area of controlled vocabularies). Regarding the top candidate &#8220;data integration&#8221;:</p>
<ul>
<li>57.4% said &#8220;very important&#8221;</li>
<li>29.8% &#8220;relevant&#8221;</li>
<li>7.4% &#8220;somewhat relevant&#8221;</li>
<li>2.1% &#8220;not relevant&#8221;</li>
<li>3.2% &#8220;Don&#8217;t know&#8221;</li>
</ul>
<p>If you don&#8217;t think this should be the final result, please help us get a better overview of what&#8217;s going on in the controlled vocabulary community. <a href="http://www.surveygizmo.com/s3/480834/Controlled-Vocabularies-Survey">The survey</a> is open until May 18th, 2011 &#8211; all participants will gain access to a report with the results within the following month. The most interesting results will be made public on this blog.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.semantic-web.at/2011/04/11/controlled-vocabularies-data-integration-is-king/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Combining Closed and Open Data Classification Mechanisms in an Extended Thesaurus</title>
		<link>http://blog.semantic-web.at/2008/06/26/combining-closed-and-open-data-classification-mechanisms-in-an-extended-thesaurus/</link>
		<comments>http://blog.semantic-web.at/2008/06/26/combining-closed-and-open-data-classification-mechanisms-in-an-extended-thesaurus/#comments</comments>
		<pubDate>Thu, 26 Jun 2008 14:36:00 +0000</pubDate>
		<dc:creator>Jana Herwig</dc:creator>
				<category><![CDATA[Ontology Engineering]]></category>
		<category><![CDATA[Social Software]]></category>
		<category><![CDATA[Controlled Vocabulary]]></category>
		<category><![CDATA[kiwiknows]]></category>
		<category><![CDATA[Social Tagging]]></category>
		<category><![CDATA[Tagging]]></category>
		<category><![CDATA[Thesaurus]]></category>

		<guid isPermaLink="false">http://blog.semantic-web.at/?p=173</guid>
		<description><![CDATA[In the next session, Rolf Sint gave us insights into his approach to the combination of closed and open data classification mechanisms, which is informed by his findings in his master&#8217;s thesis. The probably most widely used retrieval method for &#8230; <a href="http://blog.semantic-web.at/2008/06/26/combining-closed-and-open-data-classification-mechanisms-in-an-extended-thesaurus/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><img src="http://blog.semantic-web.at/wp-content/uploads/2008/06/portrsint_112x129.jpg" alt="Rolf Sint" title="Rolf Sint" align="right" height="129" width="112">In the next session, <a href="http://www.salzburgresearch.at/contact/team_detail.php?person=142">Rolf Sint</a> gave us insights into his approach to combining closed and open data classification mechanisms, which is informed by the findings of his master&#8217;s thesis. Probably the most widely used retrieval method for digital content is <strong>full-text search</strong>; Google and Yahoo&#8217;s indexing methods, for instance, rely on full-text search. For this method to work, the search terms must be contained within the content, leading to obvious problems with synonyms, ambiguities or the different lexical inventory of different languages. Advantages are that full-text search is easy to use, and that no maintenance is required as this responsibility rests with the content providers.</p>
<p>On the other end of the spectrum, within open data classification mechanisms, we have <strong>social tagging</strong>. Tagging (in general) means that a user assigns labels to content items. The advantage here is that content is immediately classified; as such, tagging is an easy way to provide metadata for content, in particular as the user does not have to think about (arbitrary, system-dictated) structures. However, this leads to problems if singulars and plurals are used simultaneously, if synonyms are used, if spelling mistakes occur, etc. With tags, the exact same spelling has to be used if items are to be assigned to the same group. But if done collectively (and that is what social tagging is about), the wisdom of crowds can improve the signal to noise ratio significantly &#8211; see the miracle of the <a href="http://en.wikipedia.org/wiki/Tag_cloud">tag cloud</a>.</p>
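<p>The exact-spelling problem can be illustrated with a minimal sketch. It is not taken from Rolf&#8217;s thesis: the sample tags, the function names and the deliberately naive normalization rule (trim, lowercase, drop a trailing "s") are all assumptions made for illustration.</p>

```python
from collections import defaultdict


def group_exact(tags):
    """Group tags by exact spelling, as a plain tagging system would."""
    groups = defaultdict(list)
    for tag in tags:
        groups[tag].append(tag)
    return dict(groups)


def normalize(tag):
    """Hypothetical normalizer: trim, lowercase, drop a trailing 's'.

    This is a naive rule for illustration only; it does not handle
    irregular plurals, synonyms or spelling mistakes.
    """
    cleaned = tag.strip().lower()
    if len(cleaned) > 3 and cleaned.endswith("s") and not cleaned.endswith("ss"):
        cleaned = cleaned[:-1]
    return cleaned


def group_normalized(tags):
    """Group tags by their normalized form instead of their raw spelling."""
    groups = defaultdict(list)
    for tag in tags:
        groups[normalize(tag)].append(tag)
    return dict(groups)


# Assumed sample data: five raw tags that a human reads as two topics.
tags = ["tag", "tags", "Tag", "ontology", "Ontology "]

print(len(group_exact(tags)))       # number of groups with exact matching
print(len(group_normalized(tags)))  # number of groups after normalization
```

<p>With exact matching every spelling variant ends up in its own group, while even this crude normalization merges the variants. Synonyms and misspellings would still need a controlled vocabulary.</p>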
<p>What Rolf proposed in his thesis was to combine the two approaches. In his design, he used an extended thesaurus as an instrument to achieve vocabulary control &#8211; we&#8217;re looking at an <strong>extended thesaurus</strong> here, because it&#8217;s not simply built around a taxonomy, but expanded by tags that were assigned by users and integrated using a vocabulary management tool.<br />
<img src="http://blog.semantic-web.at/wp-content/uploads/2008/06/screen_rsintextthes_500w.gif" alt="Extended Thesaurus" title="Extended Thesaurus" class="alignnone size-full wp-image-176" height="293" width="500"></p>
<p>This extended thesaurus can be applied in multiple ways. <span id="more-173"></span> During a tag event, for instance, the user can be assisted by questions like &#8220;Did you mean&#8230;&#8221; if a term is ambiguous: </p>
<p><img src="http://blog.semantic-web.at/wp-content/uploads/2008/06/screen_rsintass_500wide.gif" alt="Tag Assistant" title="Tag Assistant" height="324" width="500"></p>
<p>Search can be improved, too: when a user submits a search query, related terms can be suggested, drawing on the thesaurus. E.g., the term &#8216;jaguar&#8217; would call up similar terms, allowing the user to refine the query and clarify that he (or she) is looking for a predatory animal (i.e. not the car).</p>
<p><img src="http://blog.semantic-web.at/wp-content/uploads/2008/06/chart_rsintpres_385x155.gif" alt="Screen Related Query" title="Screen Related Query" class="alignnone size-full wp-image-174" height="155" width="385"></p>
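<p>A rough sketch of how such a lookup might work is given below. The tiny thesaurus, its structure (alternative labels plus one broader term per concept) and the function names are hypothetical and only meant to illustrate the idea, not the tool shown in the screenshots.</p>

```python
# Hypothetical mini-thesaurus: each concept has alternative labels
# (synonyms or user tags) and exactly one broader term.
THESAURUS = {
    "jaguar (animal)": {"alt": ["jaguar", "panthera onca"], "broader": "big cats"},
    "jaguar (car)": {"alt": ["jaguar", "jag"], "broader": "car brands"},
    "leopard": {"alt": ["panthera pardus"], "broader": "big cats"},
}


def candidates(term):
    """Return all concepts whose preferred or alternative label matches the term.

    More than one result means the term is ambiguous, which is the
    situation where a "Did you mean..." prompt would be shown.
    """
    needle = term.strip().lower()
    return sorted(
        concept
        for concept, entry in THESAURUS.items()
        if needle == concept or needle in entry["alt"]
    )


def related(concept):
    """Return other concepts that share the same broader term."""
    broader = THESAURUS[concept]["broader"]
    return sorted(
        other
        for other, entry in THESAURUS.items()
        if other != concept and entry["broader"] == broader
    )


print(candidates("Jaguar"))        # ambiguous: animal and car
print(related("jaguar (animal)"))  # suggestions within "big cats"
```

<p>Once the user picks one of the candidates, the query can be restricted to that concept, and the related concepts can be offered as suggestions.</p>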
<p>In the long term, using an extended thesaurus as a light-weight ontology can reduce the amount of work needed to maintain a vocabulary. What&#8217;s special in Rolf&#8217;s proposal is that the controlled vocabulary also contains the terminology of the community. The user is thus able to navigate within the communal information space and, as a result, problems with homonyms, synonyms and different languages would be reduced.</p>
<p>A paper in which Rolf and two of his colleagues explain this approach in more detail is currently being prepared for publication: Güntner, G., Sint, R., Westenthaler, R. (2008): &#8220;Ein Ansatz zur Unterstützung traditioneller Klassifikation durch Social Tagging&#8221;. Tagungsband des ExpertInnenworkshops &#8220;Social Tagging in der Wissensorganisation – Perspektiven und Potenziale&#8221;, 2008 (in press). Further details about the publication can be <a href="http://www.salzburgresearch.at/contact/team_detail.php?person=142">obtained from Rolf</a>.<br />
]]></content:encoded>
			<wfw:commentRss>http://blog.semantic-web.at/2008/06/26/combining-closed-and-open-data-classification-mechanisms-in-an-extended-thesaurus/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

