Semantic Web Company

The Semantic Puzzle

Open World Assumptions

Why Faviki is able to suggest tags in 13 languages

September 26, 2008 By: Jana Herwig Category: Linked Data & Open Data, Mashups & Web services, Tools & Software

I recently got in touch with Vuk Miličić from Faviki: Faviki has been selected as a featured project on Google Code, and in that context Vuk describes in a little more detail how Faviki retrieves its suggestions. It’s really interesting! It also sheds more light on how DBpedia is used in Faviki: not directly for retrieving tags, but for translating them – long live the smartness of linked data!

  1. Faviki fetches a web page and extracts a core text (without HTML and non-relevant content).
  2. Then it tries to figure out whether the content is in English. If it isn’t, it is sent to the Google language API, which detects the original language automatically, translates the content into English and returns the translation.
  3. The content is then sent to the Zemanta API, which analyzes it and finds relevant links. Faviki uses the links to English Wikipedia; the article titles are used as semantic tags.
  4. If the user’s language is not English, we must translate the tags. Using the DBpedia datasets “Links to Wikipedia Article”, we can find the corresponding Wikipedia titles in one of 13 languages. These datasets contain the connections between English Wikipedia articles and articles from Wikipedia in other languages.
  5. Finally, suggested tags are offered to a user.
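
To make the five steps a bit more tangible, here is a minimal Python sketch of the flow. It is not Faviki's actual code: every function name is hypothetical, the translation and tagging services (Google and Zemanta in the original) are passed in as plain callables instead of being called for real, and SAMPLE_LINKS is a tiny hand-written stand-in for the DBpedia interlanguage-link data. Falling back to the English title when no translation is known is also an assumption of this sketch.

```python
import re
from typing import Callable, Dict, List

# Tiny hand-written sample standing in for the DBpedia dataset (assumption):
# English Wikipedia title -> {language code: title in that language's Wikipedia}.
SAMPLE_LINKS: Dict[str, Dict[str, str]] = {
    "Vienna": {"de": "Wien", "fr": "Vienne"},
    "Austria": {"de": "Österreich", "fr": "Autriche"},
}


def extract_core_text(html: str) -> str:
    """Step 1 (crude stand-in): drop HTML tags and collapse whitespace."""
    text = re.sub(r"<[^>]+>", " ", html)
    return re.sub(r"\s+", " ", text).strip()


def translate_tags(english_titles: List[str], user_lang: str,
                   links: Dict[str, Dict[str, str]]) -> List[str]:
    """Step 4: map English titles to the user's language.

    Titles without a known translation are kept in English (assumption).
    """
    if user_lang == "en":
        return list(english_titles)
    return [links.get(title, {}).get(user_lang, title) for title in english_titles]


def suggest_tags(html: str, user_lang: str,
                 to_english: Callable[[str], str],
                 find_titles: Callable[[str], List[str]],
                 links: Dict[str, Dict[str, str]] = SAMPLE_LINKS) -> List[str]:
    """Run steps 1 to 5 with the external services injected as callables."""
    text = extract_core_text(html)      # step 1: core text
    english = to_english(text)          # step 2: detect language, translate
    titles = find_titles(english)       # step 3: English Wikipedia titles
    return translate_tags(titles, user_lang, links)  # steps 4 and 5


# Stubs replacing the real translation and tagging services (hypothetical).
def fake_to_english(text: str) -> str:
    return text  # pretend the page is already in English


def fake_find_titles(text: str) -> List[str]:
    return [title for title in SAMPLE_LINKS if title in text]


print(suggest_tags("<p>Vienna is the capital of Austria.</p>", "de",
                   fake_to_english, fake_find_titles))
# prints ['Wien', 'Österreich']
```

The point the sketch tries to capture is that tags are always found in English first, and the interlanguage links are only consulted afterwards, as a lookup table.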

Read the whole blog post on Vuk’s Faviki blog


Jury Award for Semantic Wikis in eGovernment, and: Semantic MediaWiki for Wikipedia?

September 24, 2008 By: Jana Herwig Category: Collective Intelligence, Internet & Media

An implementation of Semantic MediaWiki in public administration received a jury award yesterday at the final ceremony of the highly coveted multimedia state award (Staatspreis Multimedia) 2008 in Vienna: the platform for the cooperation of administrations (Plattform Verwaltungskooperation) in Austria, Germany, Italy and Switzerland, run by the Centre for Public Administration KDZ, received praise for its use of open, semantic technologies in its effort to further the collaboration between administrations and administrative staff. Those of you who can read German: read the response from Bernhard Krabina, KDZ, here or contact him here, if you’d like to learn more. The top state award itself went to HPC Dual, a combination of electronic and physical mail delivery.

Also published yesterday was an interview with Matthias Schindler, a former board member of Wikimedia Germany, on the occasion of the publication of a physical Wikipedia, i.e. a one-volume encyclopedia in print (publisher: Wissen Media, a Bertelsmann division). According to the English Wikipedia, “the volume is planned to include abbreviated entries for the 50,000 most commonly used search terms of the prior two years. The book is to be priced at 19.95 euros, with one euro from every sale going to the German chapter of the Wikimedia Foundation.”

The interviewers also asked Schindler for his “encyclopedic Wikipedia dream” – I hope his response will catch on in the Wikimedia chapters worldwide:

I would one day like to see a large edition of Wikipedia (including a German language edition) which makes use of the Semantic MediaWiki extension. The dream in a nutshell, without consideration of the current state of research and development: a Wikipedia that can be read not only by humans, but also by computers, a Wikipedia that can offer concrete answers to concrete questions and that creates content individually for users, something that they can make use of; great if Wikipedia played the role of the first mainstream Semantic Web application. While this is still in the process of coming together, there are enough other things for us to do.

(btw, my translation).

Concrete answers to concrete questions, a personalized Wikipedia – I am not even aiming that high at the moment.

Just consider the absurd number of lists in Wikipedia, all of which are maintained manually. Take for instance the list of hardcore punk bands, the list of fictional countries (to be distinguished from the list of European fictional countries) or the list of military operations.

How often do you think these need an update? And if a new hardcore punk band is added – will the creators of the new article think about adding it to the list? What about articles which refer to or mention things that are or should be on a particular list?

As a list inherently claims to be complete, creating and maintaining it shouldn’t be left to humans – leave that to the machines! Vote Semantic MediaWiki for Wikipedia!
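
To illustrate what “leave that to the machines” could mean, here is a small Python sketch in which a list is not a page of its own but the result of a query over annotated articles. This is not Semantic MediaWiki's query syntax; the Article class, the derive_list function, the "genre" property and the band names are all made up for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Article:
    """An article plus its semantic annotations (hypothetical model)."""
    title: str
    properties: Dict[str, str] = field(default_factory=dict)


def derive_list(articles: List[Article], prop: str, value: str) -> List[str]:
    """Return the sorted titles of all articles whose `prop` equals `value`."""
    return sorted(a.title for a in articles if a.properties.get(prop) == value)


# Made-up placeholder articles, not real Wikipedia data.
articles = [
    Article("Band A", {"genre": "hardcore punk"}),
    Article("Band B", {"genre": "jazz"}),
]
print(derive_list(articles, "genre", "hardcore punk"))  # ['Band A']

# A newly written article shows up in the list without anyone editing the list.
articles.append(Article("Band C", {"genre": "hardcore punk"}))
print(derive_list(articles, "genre", "hardcore punk"))  # ['Band A', 'Band C']
```

Nobody has to remember to update the list: whoever annotates the new article has already done it.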

Author: Jana Herwig
