Semantic Web Company

The Semantic Puzzle

Open World Assumptions

Linked Data is not owl:sameAs Semantic Web

March 30, 2009 By: Andreas Blumauer Category: Linked Data & Open Data, Search Engines

While some people work heavily on extending the semantic web infrastructure, like Talis Connected Commons or OpenLink´s Amazon EC2 instantiation, others have started to bring the semantic web closer to developers and therefore to a much broader audience: they offer search facilities or Linked Data navigators like OpenLink´s Entity Finder or DERI´s VisiNav.

These kinds of applications should not be confused with “semantic web” end-user applications like Google´s Wonderwheel or INTSPEI´s Cloudlet. Adding some semantics to existing user interfaces can be helpful, and users are obviously ready for such experiments. Of course this is NOT the innovation the semantic web will bring, but it is a very important step to take in parallel with the linked data initiative.

Let´s take a look at Cloudlet: This tool is an easy-to-use free Firefox extension that adds context-sensitive tag clouds to the most popular search engines and helps people more efficiently navigate through their search results. The previous version of Search Cloudlet worked with Google and Yahoo; the new version also works with Twitter. It adds Tag Clouds, Author Clouds, Recipient Clouds and Hashtag Clouds to Twitter search, Twitter user profiles and home pages. See some reviews on this popular tool.

Cloudlet is a child of the Web. INTSPEI has learned all the lessons of Web 2.0, especially how to promote ideas through the blogosphere and how to identify market trends as early as possible, and the tool delivers obvious added value for its users. Sure, it doesn´t make use of linked data yet, but as a typical representative of the fast-growing “semantic search evolution” it reminds me of Chris Welty´s famous insight: “In the Semantic Web, it is not the Semantic which is new, it is the Web which is new.”

Web 1.0 was the WWW without tons of network effects. Web 2.0 changed that a lot.

Linked Data is not the Semantic Web, it´s the foundation for it. From a software developer´s or an IT architect´s perspective it might seem as though the two concepts were the same. But this community represents a very small percentage of all web users.

So where is the User´s Web in the Linked Data architecture? If you look at TimBL´s Linked Data principles, you can clearly see that this is a “Web” for developers.

But things evolve. And some Web companies will jump on the bandwagon and will, for instance, improve their tagclouds, their semantic search, their recommender systems (Twine?) or their similarity search a lot by making use of linked data.

Just as semantic search is becoming mainstream right now (or call it “semantic search 2.0”), linked data will become part of a lot of mainstream applications (in about three years, I guess). Linked data will generate tons of new network effects, maybe even new business models, and it won´t be avant-garde anymore. It will be part of the Semantic Web.

Pimp your Google

February 04, 2009 By: Andreas Blumauer Category: Mashups & Web services, Search Engines

Sure, that´s not the end of the line – but “a little semantics goes a long way” (Jim Hendler): With two Firefox add-ons, you can pimp your Google and get (1) a better overview of the search results, (2) a kind of moderated search and (3) information from Wikipedia alongside the results.

Install Cloudlet and Googlepedia (Don´t forget to donate!) and you will see something like this:

[Image: pimp_your_google]

Sure, both “mashups” are not based on RDF, and the “TagCloud” is not as accurate as we would wish, but let us be patient again. At least this picture makes end users yearn for a bit more semantics (which goes a long way…) on top of the usual lists of search results.

The Wild vs The Orderly: Folksonomies and Semantics (TRIPLE-I 2008)

September 04, 2008 By: Jana Herwig Category: Collective Intelligence, Search Engines, Social Software, Vocabularies & Languages

This second day of TRIPLE-I 2008 was my personal folksonomy day, even though the theme was already set yesterday, with Andreas Hotho’s invited talk about “Extracting Semantics from Folksonomies” which was the opening lecture of the workshop “Knowledge acquisition from the Social Web.”

Andreas Hotho directs the Bibsonomy project at Kassel University’s Knowledge and Data Engineering research group. Bibsonomy is a social bookmark and publication sharing system that caters especially to researchers who, besides bookmarking, also wish to manage publications. Among other interesting things, Bibsonomy supports the import of bookmarks from del.icio.us, Firefox bookmarks and local BibTeX files. Being a project led by a university’s computer science department, Bibsonomy is at the same time the result, the object and a stimulus for research in the area of tagging and folksonomies. Andreas describes this double appeal of folksonomies to both ordinary people and researchers in a 12-second vlog post:


Andreas Hotho’s statement about folksonomies and research (see www.bibsonomy.org) on 12seconds.tv

One of the outcomes of the research into folksonomies is FolkRank, a search algorithm that exploits the structure of folksonomies; the name reveals that it was inspired by PageRank, but as the graph of folksonomy structures does not correspond to the web graph, some adaptations had to be made. The specifics of these adaptations can be found in an online article by Andreas and his colleagues: “FolkRank: A Ranking Algorithm for Folksonomies” (PDF, 268 KB).

Andreas Hotho’s talk more specifically addressed the search for methods to identify tags which describe the same concept (or a more specific / a more general concept respectively) within a folksonomy. He suggested two approaches:

  1. Applying measures directly to folksonomy statistics, which allows tags to be described as vectors; e.g. co-occurrence frequency and FolkRank could serve as similarity measures (both tend to favour high-frequency tags), as could a cosine measure (which is more likely to produce “siblings”)
  2. Looking up tags in an external thesaurus/vocabulary (for instance achieving semantic grounding by mapping a tag and its most similar tags to WordNet synsets)
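
The first approach can be illustrated with a minimal Python sketch. It is not the method from the talk or from Bibsonomy: the toy posts, the function names and the way the vectors are built (plain co-occurrence counts) are all assumptions made for illustration. It shows why raw co-occurrence and cosine similarity behave differently: two tags that never appear together can still have near-identical context vectors, which is what makes them look like “siblings”.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical toy data: each post is the set of tags one user gave one resource.
posts = [
    {"python", "programming", "tutorial"},
    {"python", "programming", "code"},
    {"java", "programming", "tutorial"},
    {"java", "programming", "code"},
    {"recipes", "cooking"},
]

def cooccurrence_vectors(posts):
    """Map each tag to a Counter of how often it appears together with every other tag."""
    vectors = defaultdict(Counter)
    for tags in posts:
        for tag in tags:
            for other in tags:
                if other != tag:
                    vectors[tag][other] += 1
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * v[key] for key, count in u.items())
    norm_u = sqrt(sum(c * c for c in u.values()))
    norm_v = sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

vectors = cooccurrence_vectors(posts)

# Raw co-occurrence pairs "python" with the frequent tag "programming" ...
print(vectors["python"]["programming"])
# ... while "python" and "java" never co-occur, yet share the same context.
print(vectors["python"]["java"])
print(round(cosine(vectors["python"], vectors["java"]), 3))
```

In this toy data, co-occurrence pulls "python" towards the high-frequency tag "programming", whereas the cosine measure rates "python" and "java" as almost identical although they are never used on the same post.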

Future areas of interest within folksonomy research Andreas proposed were trend detection, tag recommendation, detecting spam (a major challenge!), logsonomies (i.e. the structure of search engine query log files) and learning synsets, hierarchies, and structures of folksonomies. Andreas Hotho can be contacted via his homepage, if you have any further questions regarding Bibsonomy, FolkRank or this present piece of research.

Another presentation dedicated to folksonomies – and the presentation that won my personal presentation design award – was “Seeding, Weeding, Fertilizing – Different Tag Gardening Activities for Folksonomy Maintenance and Enrichment” by Katrin Weller and Isabella Peters, both from the Dept. of Information Science at Heinrich Heine University in Düsseldorf. The entire presentation was designed to match the corporate identity (CI) of TagCare, a tag gardening tool that will hopefully go online soon.

The term “Tag Gardening” was borrowed from James Governor who wrote in a 2006 blogpost:

“Like plants or animals, tags evolve in an emergent fashion, open to hybridisation. Stewardship can help grow and put roots down.

Helping the darwinian process is tag gardening.

Tag gardening is about taking tags in the wild and tending to them, or identifying a wild tag that will do well in your south facing IT garden. I am talking about domestication here.

Just like there are professional bloggers i am pretty sure some parties will emerge that get paid for their abilities.”

I seriously hope that the latter is going to come true, even though I have the feeling that most providers will continue to consider user input and effort pro bono work!

Katrin Weller’s intro (Isabella Peters had excused herself) focused on the well-known problems with tags and folksonomies, e.g.:

  • spelling variants, synonyms, abbreviations, different natural languages
  • ad hoc or personal functions of tags other than content description (e.g. “toread”, “@Henry”, “nicepic”)
  • flatness of tag clouds which allows for browsing by popularity, but not by semantic interrelations

She further distinguished three levels where tag or tag cloud improvement becomes relevant:

  • single document vs document collection level
  • single user vs collaborative level
  • intra- and cross-platform level (e.g. different tagging conventions, tag separation with comma or blank space, etc.)

To push the gardening metaphor even further, Katrin presented their ideas of weeding, seeding, fertilizing etc.:

Weeding
The weeds in this case are “bad” tags like spam or misspelled tags (weed: any plant that crowds out cultivated plants)
Aim: enhancing recall and achieving a consistent indexing vocabulary
Achieved by: type-ahead functionality, editing functionalities, natural language processing, user guidelines for indexing and retrieval, nomination of authorized users as gardeners
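
To make the weeding idea a bit more concrete, here is a minimal Python sketch. It is not how TagCare works (the tool was not online at the time of writing); it simply assumes that rare tags which look very similar to a frequent tag are probably misspellings, and uses the standard library's `difflib` string matching as a stand-in for the “natural language processing” mentioned above. The function name `weed`, the thresholds and the sample counts are all hypothetical.

```python
from collections import Counter
from difflib import get_close_matches

def weed(tag_counts, min_count=5, cutoff=0.8):
    """Suggest merging rare tags into similar-looking frequent tags.

    Returns a dict {rare_tag: suggested_frequent_tag}. The result is only a
    list of suggestions for a human gardener to confirm, not an automatic fix.
    """
    # Fold case and surrounding whitespace first, so "Semantic" and "semantic" count as one tag.
    normalized = Counter()
    for tag, count in tag_counts.items():
        normalized[tag.strip().lower()] += count

    frequent = [tag for tag, count in normalized.items() if count >= min_count]
    suggestions = {}
    for tag, count in normalized.items():
        if count >= min_count:
            continue
        match = get_close_matches(tag, frequent, n=1, cutoff=cutoff)
        if match:
            suggestions[tag] = match[0]
    return suggestions

# Hypothetical tag usage counts.
tag_counts = {
    "semantic": 40, "Semantic": 3, "semantc": 1,
    "folksonomy": 25, "folksonmy": 2, "toread": 1,
}
suggestions = weed(tag_counts)
print(suggestions)
```

A personal tag like "toread" is left alone because it resembles none of the frequent tags; whether such tags count as weeds is exactly the kind of decision that would stay with the gardener.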

Seeding
Seeding in folksonomies means supplementing frequently used tags with more specific tags (called “baby tags” or “seedlings” by Katrin Weller; seedling: young plant or tree grown from a seed)

Landscaping
The idea of landscaping here means to create “flower beds” through identifying species of tags, e.g. by similarity.
Aim: enhancing precision and expressiveness

Fertilizing
Fertilizing in this context means to combine folksonomies with other knowledge organization systems (KOS): thesauri, controlled vocabularies, ontologies, etc. (fertilizer: any substance such as manure or a mixture of nitrates used to make soil more fertile). Fertilizing might work both ways, Katrin suggested: a folksonomy might be fertilized with the semantic structure of a KOS, or a KOS enhanced by terms from a folksonomy.
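
The two directions of fertilizing can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the miniature thesaurus, its terms and the function names are made up, and a real KOS would be far richer than a dictionary. Tags that match a thesaurus entry inherit its preferred term and broader concept (folksonomy fertilized by the KOS); tags that match nothing are collected as candidate terms for the KOS (KOS fertilized by the folksonomy).

```python
# Hypothetical miniature KOS: preferred term -> synonyms and broader term.
THESAURUS = {
    "folksonomy": {
        "synonyms": {"folksonomies", "social tagging"},
        "broader": "knowledge organization",
    },
    "ontology": {
        "synonyms": {"ontologies"},
        "broader": "knowledge organization",
    },
}

def build_index(thesaurus):
    """Map every preferred term and synonym to its preferred term."""
    index = {}
    for preferred, entry in thesaurus.items():
        index[preferred] = preferred
        for synonym in entry["synonyms"]:
            index[synonym] = preferred
    return index

def fertilize(tags, thesaurus):
    """Split tags into those grounded in the thesaurus and candidate new terms."""
    index = build_index(thesaurus)
    grounded, candidates = {}, []
    for tag in tags:
        key = tag.strip().lower()
        if key in index:
            preferred = index[key]
            grounded[tag] = (preferred, thesaurus[preferred]["broader"])
        else:
            candidates.append(tag)
    return grounded, candidates

grounded, candidates = fertilize(["Folksonomies", "ontology", "tag gardening"], THESAURUS)
print(grounded)    # tags mapped to (preferred term, broader term)
print(candidates)  # tags the thesaurus does not know yet
```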

And finally TagCare: The ambitious plan is to have a system that lets users import tag clouds from Flickr, del.icio.us and Bibsonomy, clean up inconsistencies between tags, add hierarchical structure to the tag clouds and view tag statistics. It will probably also have community features, such as calibrating one’s tags with those of the chief gardener or activating collaborative spam elimination. It is going to be a free service, and if you want to be notified when it goes live, you might want to send an email to Katrin.

This full-service proposal for tag gardening does of course sound brilliant – yet is it going to be feasible, on a technical level? In the post-presentation discussion, somebody mentioned Faviki, which relies on DBpedia concepts to solidify the tag cloud. It didn’t exactly seem as though the TagCare team had already thought along these (semantic web) lines, even though this perfectly corresponded to their ‘Fertilizing’ idea. But if TagCare solely relies on good human gardeners, how long will it take until they have gained a big enough community to stimulate someone’s altruism? The idea of tag gardening of course is beautiful, and I am curious to learn more about the technology it is going to use.

Other folksonomy and tag related presentations that I was unable to attend or am unable to describe now, after the 10th hour of my 2nd day at TRIPLE-I, with a band performing folklore music involving yodeling and probably Schuhplattler right outside of this room:

  • Quality Metrics for Tags of Broad Folksonomies (Celine Van Damme, Martin Hepp, Tanguy Coenen, University of Brussels, Universität der Bundeswehr München)
  • Providing Multi Source Tag Recommendations in a Social Resource Sharing Platform (Martin Memmel, Michael Kockler, Rafael Schirru, German Research Centre for Artificial Intelligence DFKI)
  • Semantic Tagging and Inference in Online Communities (Ahmet Yildirim, Suzan Üsküdarli, Boğaziçi University)
  • Using Visual Features to Improve Tag Suggestions in Image Sharing Sites (Mathias Lux, Oge Marques, Arthur Pitman, Klagenfurt University)
  • Harnessing Wikipedia for Smart Tags Clustering (Maria Grineva, Maxim Grinev, Denis Turdakov, Pavel Velikhov, Russian Academy of Sciences)

Please leave a comment if you think that any of the above needs correction.

EDIT: I got the chance to record another 12 seconds definition (and am thinking of setting up a video glossary for the Semantic Web now): Rolf Sint from Salzburg Research explains what folksonomies are and why folksonomies and ontologies go together well in 12 seconds! Rolf is also involved in the KiWi project, which aims to develop a wiki-based knowledge management system boosted by semantic technologies.


Rolf Sint explains folksonomies and their relation to ontologies on 12seconds.tv

Tagclouds 2.0

August 25, 2008 By: Andreas Blumauer Category: Miscellaneous

Just recently, when Michael Hausenblas tagged it on del.icio.us, I stumbled upon Wordle and I continue to be fascinated by the results it can produce for the RSS feed of our blog:

Word Cloud for \

This confirms that:

  1. Tagclouds are still evolving.
  2. It’s all about the web, data, semantics AND people!
  3. Tagclouds can be VERY pretty (on T-Shirts etc.).

Great work, Mr. Feinberg!
