The Semantic Puzzle

Jana Herwig

Why Faviki is able to suggest tags in 13 languages

Just got in touch with Vuk Miličić from Faviki recently – Faviki has been selected as a featured project on Google code, and in that context, Vuk describes the process of how Faviki retrieves its suggestions in a little more detail. It’s really interesting! It also sheds more light on the way that DBpediaDBpedia is a project aiming to extract structured information from the information created as part of the Wikipedia project. This structured information is then made available on the World Wide Web. DBpedia allows users to query relationships and properties associated with Wikipedia resources, ... is used in Faviki: Not immediately for the retrieval of tags, but for the translation of tags – long live the smartness of linked data!

  1. Faviki fetches a web page and extracts a core text (without HTML and non-relevant content).
  2. Then it tries to figure out if a content is in English. If it isn’t, it is sent to GoogleGoogle Inc. is a multinational public corporation invested in Internet search, cloud computing, and advertising technologies. Google hosts and develops a number of Internet-based services and products, and generates profit primarily from advertising through its AdWords program. The company was ... language APIAn application programming interface (API) is an interface implemented by a software program to enable interaction with other software, similar to the way a user interface facilitates interaction between humans and computers. APIs are implemented by applications, libraries and operating systems ..., which detects the original language automatically, translates it into English and returns the translation.
  3. The content is then sent to and analyzed by ZemantaZemanta is a content suggestion engine for bloggers and other content creators. API, which then finds relevant links. Faviki uses links from English Wikipedia – titles are used as semantic tags.
  4. If users language is not English, we must translate them. Using DBpedia datasets “Links to Wikipedia Article” , we can find names of Wikipedia’s titles in one of 13 languages. These datasets actually contain the connections between English Wikipedia articles and articles from Wikipedia in other languages.
  5. Finally, suggested tags are offered to a user.

Read the whole blog post on Vuk’s Faviki blog

Reblog this post [with Zemanta]

Leave a Reply

Your email address will not be published. Required fields are marked *

*

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>