The Semantic Puzzle

Andreas Blumauer

Do you like Google’s Knowledge Graph?

Semantic Enterprise Search enters the second phase.

Finally the Knowledge Graph has arrived in EuropeEurope is, by convention, one of the world's seven continents. Comprising the westernmost peninsula of Eurasia, Europe is generally divided from Asia to its east by the water divide of the Ural Mountains, the Ural River, the Caspian Sea, the Caucasus Mountains, and the Black Sea to the ...: What has been provided on googleGoogle Inc. is a multinational public corporation invested in Internet search, cloud computing, and advertising technologies. Google hosts and develops a number of Internet-based services and products, and generates profit primarily from advertising through its AdWords program. The company was ....com for the US-Market since May 2012, is now available also for most European countries. Search results are no longer only a list of documents (and advertisements) but also a mashup of facts, points of interest, events etc. referring to the search phrase.

For example, if the user is searching for ‘Wiener Philharmoniker’ (‘ViennaVienna (/viːˈɛnə/; German: About this sound Wien [viːn], Austro-Bavarian: Wean) is the capital and largest city of Austria, and one of the nine states of Austria. Vienna is Austria's primary city, with a population of about 1.757 million (2.4 million within the metropolitan area, more than ... Philharmonic Orchestra’) a factbox including related searches is provided:

Do you like this rather new way of knowledge discovery? We do, except the fact that Google hasn´t properly explained to the audience which technology is behind the Knowledge Graph which is the Web of Linked Data aka the Semantic Web (Do you want to know more about the relationship between the Knowledge Graph and Linked Data? Click here).

But anyway, here are some benefits we can see, if search technologies make use of a ‘knowledge graph’, a ‘knowledge model’, a ‘thesaurusA thesaurus is a book that lists words grouped together according to similarity of meaning, in contrast to a dictionary, which contains definitions and pronunciations. The largest thesaurus in the world is the Historical Thesaurus of the Oxford English Dictionary, which contains more than ...’ or generally spoken: Linked Data.

  • Facts around an object (or an entity) can be found nicely packed up to a dossier
  • Serendipity can be stimulated by ‘related searches’ which means: Users can discover the formely ‘unknown’ in a more comfortable way
  • Data from various sources can be pulled together to a mashup (e.g. ‘upcoming events’ could come from a different database than the basic facts of Vienna Philharmonic Orchestra)
  • Search phrases are well understood by the engine since they are based on concepts and not anymore on literals, e.g. if the user searches for ‘Red Bull Stratos’, also results for ‘Felix Baumgartner’ will be delivered
  • Search can be refined, e.g. if one searches for ‘Vienna’, a list of POIs will be displayed to refine the actual place the user is looking for

Now imagine you would have a search engine in your company’s intranet based on a knowledge graph which is about the enterprise you are working for.

Such an advanced search application would look like this:

  • Data streams and all kind of content from internal sources are nicely mashed with information from the web (e.g. from Twitter, Youtube etc.)
  • Search assistants are provided to help users to refine their information needs to make them more specific
  • Entities and their sub-concepts (e.g. subsidiaries of large companiesA company is a form of business organization. In the United States, a company is a corporation—or, less commonly, an association, partnership, or union—that carries on an industrial enterprise. " Generally, a company may be a "corporation, partnership, association, joint-stock ... or regions of countries) are nicely packed together to one dossier

The key question now is: “how to set up a customised knowledge graph for a certain company?”.

Corporate Semantic Web based applications can be realised on top of software platforms like PoolParty. They all have a customised knowledge graph in their core. This is always the basis for concept-based indexing of specialised content from a corporate intranet. The basic standard for this is SKOSSimple Knowledge Organization System (SKOS) is a family of formal languages designed for representation of thesauri, classification schemes, taxonomies, subject-heading systems, or any other type of structured controlled vocabulary. SKOS is built upon RDF and RDFS, and its main objective is to ... which can be used together with advanced query languages like SPARQL. Such graphs can be used for semantic indexing but also to ask for relations like ‘is point-of-interest in’, ‘is event of’, ‘is related search for’ etc. This is the next-generation semantic search which help decision-makers, information professionals and all kind of knowledge workers to improve their work significantly.
One comfortable way to create customised knowledge graphs is to make use of Linked Data sources like FreebaseFreebase is a large collaborative knowledge base. It is an online collection of structured data harvested from many sources, including individual 'wiki' contribution. Freebase aims to create a global resource which allows people (and machines) to access common information more effectively. It is ... (like Google does) or DBpediaDBpedia is a project aiming to extract structured information from the information created as part of the Wikipedia project. This structured information is then made available on the World Wide Web. DBpedia allows users to query relationships and properties associated with Wikipedia resources, .... More details wanted? Take a look at the PoolParty approach for efficient knowledge modeling.