Tassilo Pellegrini

Thoughts on KOS (Part 2): Classifying Knowledge Organisation Systems

Traditional KOSs include a broad range of system types from term lists to classification systems and thesauri. These organization systems vary in functional purpose and semantic expressivity. Most of these traditional KOSs were developed in a print and library environment. They have been used to control the vocabulary used when indexing and searching a specific product, such as a bibliographic database, or when organizing a physical collection such as a library (Hodge et al. 2000).

Martin Kaltenböck

José Manuel Alonso: “If you want to scale up, you should consider LOD”

José Manuel Alonso has been working for W3C and CTIC in many open data projects. At the Web Foundation he promotes and supports (linked) open data in developing countries. Martin Kaltenböck from SWC talked with José about ongoing activities in the area of Open Government Data.

Open Data is a powerful worldwide movement these days. Comparing open data projects in developing countries with those in highly industrialised countries (Europe, the US, Australia and others), where do you see the main differences in terms of organisational, cultural and technical issues?

We conducted feasibility studies in Ghana and Chile several months ago, are supporting the government of Ghana in the development of its national initiative, and have visited and engaged in Open Data discussions with many other countries in Africa, Latin America and Asia.
The situations are quite diverse and can vary significantly from country to country. It is always difficult to generalize, but I think there are a few important differences that can be highlighted (in no particular order):

  • The amount of information available in digital form is generally much lower
  • The IT infrastructure is yet to be fully developed or under development
  • The capacities on the government and civil society side have to be improved
  • The mobile phone is the main device to access information but data connectivity is still scarce, only available in the big cities and not at all in the rural areas
  • Digital literacy related issues have to be seriously considered and addressed
  • Multilingualism is an important factor, as there are dozens of dialects being spoken in many countries

Having said all of the above, I would say that there are also quite a number of commonalities, such as privacy and security concerns and resistance to change, but also the existence of champions within government and the interest and willingness in civil society, which is already producing a number of interesting applications.

You are also very familiar with the concept of Linked Open Data (LOD). Where do you see the main benefits of using LOD, and what do you think are the main challenges and obstacles?

Having managed a few projects achieving 5-star open data, I’ve learned a thing or two about the pros and cons. I’ve been saying consistently that there are a few important issues:

  • There is still little knowledge about LOD out there and it is perceived as too complex
  • The demand for LOD is, hence, very low
  • The tooling is not powerful enough yet, especially when compared to XML tooling and others
  • The modeling part is very tough

People are used to working with XML and Web Services and believe that anything along this line, such as REST+JSON, fulfils most expectations and needs. But this is not fully true. In my opinion, the power of LOD resides in the linking part more than anything else. Combining data from disparate sources using RESTful techniques is much more difficult, while it is a natural fit for LOD.
My experience tells me that for dealing with a few simple datasets, investing in LOD is not really needed, but if you want to scale up and, especially, if you want to link and integrate, then you should consider LOD. It is generally a bigger investment, but it pays off when interlinking big volumes of information, facilitates re-use in multiple formats, and can get very powerful when using SPARQL appropriately, as it allows access to the whole underlying knowledge base.
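To make the linking argument concrete, here is a minimal sketch in plain Python rather than a real RDF toolkit. It assumes two independently published datasets that happen to use the same URI for the same city; every URI, property name and value below is hypothetical and invented for illustration. Because both sources identify the subject with the same URI, combining them is a simple union of statements, with no per-source join logic.

```python
# All URIs and values are hypothetical, made up for illustration only.
CITY = "http://example.org/id/city/accra"
OTHER_CITY = "http://example.org/id/city/kumasi"

# Two datasets from different publishers, each a set of
# (subject, predicate, object) statements.
budget_data = {
    (CITY, "http://example.org/def/budget", "1200000"),
}
health_data = {
    (CITY, "http://example.org/def/clinics", "42"),
    (OTHER_CITY, "http://example.org/def/clinics", "17"),
}


def merge(*graphs):
    """Combine graphs by set union; shared URIs do the linking."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def describe(graph, subject):
    """Return all (predicate, object) pairs known about one subject."""
    return sorted((p, o) for s, p, o in graph if s == subject)


graph = merge(budget_data, health_data)
facts = describe(graph, CITY)
for predicate, value in facts:
    print(predicate, value)
```

In an actual LOD deployment the statements would live in an RDF store and the lookup in `describe` would be expressed as a SPARQL query; the sketch only shows why shared identifiers make the combination step trivial.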

Where do you see the main differences regarding effort of publishing and benefit in re-use (or the re-use itself) between Open Data and Linked Open Data?

I would say that the main difference here is between using the Web as an archive for files and using the full potential of the Web. Someone who publishes hundreds of spreadsheets on the Web using an open format and license is already doing Open Data, but rather than really using the Web, they are going back to the FTP days. And that is not too different from giving away a USB stick with the files. We can do much better nowadays.

Tim Berners-Lee’s often-cited 5-star scale is a good reference here. The higher you get on that scale, the more of the Web’s power you are using and the more you are facilitating reuse.

In your opinion, are there differences in the use of LOD principles and technologies between developing countries and industrialised countries? For example: does it make sense to start an Open Data initiative in a developing country using Linked Open Data from scratch?

All the issues with LOD I mentioned above apply, and are even more pronounced, in the developing world. I think we should take a step-by-step approach: go from no data to some-star data in the very near term, lower the barriers one by one, and start building capacities in government and civil society, but always with Web architecture principles in mind.
We will have to address the specificities of the developing world. For example, given that the LOD community is relying more and more on cloud-based options and on centralized data stores that require stable high-speed internet, how would one deploy a LOD solution in a country where clients (computers/mobile phones) have limited resources (disk, CPU) and where connectivity is unstable and low-bandwidth? We’re participating in a workshop to explore these issues.

This does not mean that LOD is completely ruled out from the beginning. As I pointed out before, there are cases in which it can be extremely useful and powerful, and in those we intend to accelerate adoption, likely piloting and building capacities as a first step.

Could you please tell us a few words about the Web Foundation?

The Web Foundation was launched by the inventor of the Web, Sir Tim Berners-Lee, in 2009 to address global challenges by connecting humanity and empowering individuals through an increasingly inclusive and powerful Web. More on the vision of the Web Foundation at:

José, many thanks for this interview. It seems that open data is progressing quickly in developing countries, and that different requirements have to be taken into account there compared with open data projects in Australia, the US or Europe. The potential of Linked Open Data also seems an interesting point for these countries!
We are looking forward to staying in touch with you on this and wish you all the best for your future work in this area!

Tassilo Pellegrini

Topic Maps and the Semantic Web

From November 11 to 13, 2009, this will be one of the big issues at the 5th International Conference on Topic Maps taking place in Leipzig, Germany. When asked about the relationship between TM and SemWeb, conference organizer Lutz Maicher says:

With the vision of the web of data, Topic Maps and the Semantic Web move closer over time. Wherever URIs represent subjects, structured statements are gathered around them. In this context I see subj3ct.com as an interesting venture. This recently launched service provides URIs for 15 million subjects to be used in structured data. Naturally, linked data hubs like dbpedia or geonames.org are part of it. The crowd is invited to contribute to this collection, and the Topic Maps Lab also provides several feeds to register new URIs. Subj3ct.com turns out to be an infrastructure technology for Web 3.0 applications, regardless of whether they are based on Topic Maps or other Semantic Web technologies.

Through this convergence the uniqueness of each technology sharpens. Reasoning is the strong point of the Semantic Web. But the strengths of Topic Maps are semantic portals and the global federation of facts around subjects. Bringing together all information about each subject, even contradictory information, rather than building reasoning-ready consistent models of the world, is built into the genes of Topic Maps.

Read the full interview here.

