Martin Kaltenböck

José Manuel Alonso: “If you want to scale up, you should consider LOD”

José Manuel Alonso has worked for W3C and CTIC on many open data projects. At the Web Foundation he promotes and supports (linked) open data in developing countries. Martin Kaltenböck from SWC talked with José about ongoing activities in the area of Open Government Data.

Open Data is a powerful worldwide movement these days. Comparing open data projects in developing countries with those in highly industrialised countries (Europe, the US, Australia and others), where do you see the main differences in organisational, cultural and technical terms?

We conducted feasibility studies in Ghana and Chile several months ago, we are supporting the government of Ghana in developing its national initiative, and we have visited and engaged in Open Data discussions with many other countries in Africa, Latin America and Asia.
The situations are quite diverse and can vary significantly from country to country. It is always difficult to generalize, but I think there are a few important differences that can be highlighted (in no particular order):

  • The amount of information available in digital form is generally much lower
  • The IT infrastructure is still under development and not yet fully in place
  • The capacities on the government and civil society side have to be improved
  • The mobile phone is the main device for accessing information, but data connectivity is still scarce: it is available only in the big cities and not at all in rural areas
  • Digital literacy related issues have to be seriously considered and addressed
  • Multilingualism is an important factor, as there are dozens of dialects being spoken in many countries

Having said all of the above, I would say that there are also quite a number of commonalities, such as privacy and security concerns and resistance to change, but also the existence of champions within government and the interest and willingness of civil society, which is already producing a number of interesting applications.

You are also very familiar with the concept of Linked Open Data (LOD). Where do you see the main benefits of using LOD, and where do you see the main challenges and obstacles?

Having managed a few projects achieving 5-star open data, I’ve learned a thing or two about the pros and cons. I’ve been saying consistently that there are a few important issues:

  • There is still little knowledge about LOD out there and it is perceived as too complex
  • The demand for LOD is, hence, very low
  • The tooling is not powerful enough yet, especially when compared to XML tooling and others
  • The modeling part is very tough

People are used to working with XML and Web Services and believe that anything along these lines, such as REST+JSON, fulfils most expectations and needs. But this is not fully true. In my opinion, the power of LOD resides in the linking part more than anything else. Combining data from disparate sources using RESTful techniques is much more difficult, while it’s a natural fit for LOD.
My experience tells me that for dealing with a few simple datasets, investing in LOD is not really needed, but if you want to scale up and, especially, if you want to link and integrate, then you should consider LOD. It is generally a bigger investment, but it pays off when interlinking big volumes of information, facilitates re-use in multiple formats, and can get very powerful when SPARQL is used appropriately, as it allows access to the whole underlying knowledge base.
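To illustrate the linking point, here is a minimal sketch in plain Python rather than a real RDF toolkit. It assumes two independently published datasets that happen to use the same URIs for the same regions. All URIs, property names and figures are invented for illustration, and the `match` and `describe` helpers are hypothetical stand-ins for what a SPARQL engine would do.

```python
from typing import Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

EX = "http://example.org/"  # placeholder namespace, not a real dataset

# Dataset A: a hypothetical health ministry publishes clinic counts.
health_data: Set[Triple] = {
    (EX + "region/north", EX + "clinics", "12"),
    (EX + "region/south", EX + "clinics", "30"),
}

# Dataset B: a hypothetical education ministry publishes school counts.
education_data: Set[Triple] = {
    (EX + "region/north", EX + "schools", "45"),
    (EX + "region/south", EX + "schools", "80"),
}


def match(graph: Set[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [t for t in graph
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


def describe(graph: Set[Triple], subject: str) -> Dict[str, str]:
    """Collect everything known about one URI as a predicate -> value dict."""
    return {p: o for (_, p, o) in match(graph, s=subject)}


# Because both sources name the region with the same URI, integration
# is just a set union; no per-source join logic is needed.
merged = health_data | education_data
north = describe(merged, EX + "region/north")
print(north)
```

The design point is that the shared identifier does the integration work: with REST+JSON, each pair of sources would typically need its own mapping code before the two records could be combined.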

Where do you see the main differences regarding effort of publishing and benefit in re-use (or the re-use itself) between Open Data and Linked Open Data?

I would say that the main difference here is between using the Web as an archive for files and using the full potential of the Web. If one publishes hundreds of spreadsheets on the Web using an open format and license, he is already doing Open Data, but more than using the Web, he is going back to the FTP days. And that is not too different from giving away a USB stick with the files. We can do much better nowadays.

Tim Berners-Lee’s often-cited 5-star scale is a good reference here. The higher you get on that scale, the more of the Web’s power you are using and the more you are facilitating reuse.

Are there differences regarding the use of LOD principles and technologies between developing countries and industrialised countries in your opinion? For example: does it make sense to start an Open Data Initiative in a developing country using Linked Open Data from scratch?

All the issues with LOD I mentioned above apply, and are even more pronounced, in the developing world. I think we should take a step-by-step approach: go from no data to some-star data in the very near term, lower the barriers one by one, and start building capacities in government and civil society, but always with Web architecture principles in mind.
We will have to address the specificities of the developing world. For example, given that the LOD community is relying more and more on cloud-based options and on centralized data stores that require stable high-speed internet, how would one deploy a LOD solution in a country where clients (computers/mobile phones) have limited resources (disk, CPU) and where connectivity is unstable and low-bandwidth? We’re participating in a workshop to explore these issues.

This does not mean that LOD is completely ruled out from the beginning. As I pointed out before, there are cases in which it can be extremely useful and powerful, and in those we intend to accelerate adoption, likely with piloting and capacity building as a first step.

Could you please tell us a few words about the Web Foundation?

The Web Foundation was launched by the inventor of the Web, Sir Tim Berners-Lee, in 2009 to address global challenges by connecting humanity and empowering individuals through an increasingly inclusive and powerful Web. More on the vision of the Web Foundation at:
http://www.webfoundation.org/vision/

José, many thanks for this interview. It seems that open data is progressing quickly in developing countries, and that there are different requirements to be taken into account there compared with open data projects in Australia, the US or Europe. The potential of Linked Open Data also seems an interesting point for these countries.
We are looking forward to staying in touch with you on this and wish you all the best for your future work in this area!

Andreas Blumauer

Open Intranet

The following blog post was used by Andreas Blumauer as a basis for a talk at TEDxVienna on Monday, November 29, 2010:



Open Data, Open Government, Open Source, Open Innovation – “Open” everywhere. Today I want to talk about another “Open something”: The “Open Intranet”. This might sound a bit radical but it will also help to reflect a little bit on the term “open” in general.

“Open Intranet”: isn’t this a contradiction by definition? What is understood by “Intranet”? It means a network of computers and users “within” some organisational boundaries. But boundaries don’t necessarily have to be closed, as nature teaches us: organisms aren’t closed systems. A watch would be an example of a closed system, but living organisms tend to be open in order to survive. Of course they aren’t totally open; in systems theory, when we refer to this special kind of openness, we are talking about systems which are structurally coupled with their medium. As an example, an immune system that has learned to recognise a class of virus will remain sensitive to that and similar viruses in future. In contrast to this, imagine a fly walking over a painting by Rembrandt: since the fly isn’t structurally coupled to the cultural space of human aesthetics, it is not “open” to the beauty of Rembrandt’s work.

When we think of today’s intranets, we can see that they tend to be isolated from the world wide web; they don’t seem to perceive the internet as their medium. From a user perspective, those two systems aren’t connected to each other. Typically, when working on the intranet, we jump from time to time into “internet mode” and start to Google something, copy it, jump back and paste it into the intranet. The user is the only part of the whole system connecting the internet with the intranet. Isn’t this exhausting for us?

And now for the good news: intranets all over the world are starting to open up. Slowly, but they are. It seems that the “pressure” from “the outside” just became too great. At first it seems that it’s not the data and the information which will “break” in, but rather the “cool functions” which web apps offer and which we (as digital natives) would like to have in our intranets too. We want:

  • better search,
  • more possibilities to interact with information,
  • integrated views instead of jumping around,
  • and we want more possibilities to self-serve our extensive hunger for more and well structured information.

On the information level intranets are still rather conservative: Typical pieces of information already “injected” from the web into an average intranet would be:

  • weather forecasts,
  • stock exchange rates,
  • time zones and
  • jokes.

How could companies use the web to inspire their employees (without opening up totally)? How could the web “inject” the right amount of information into an intranet to make an enterprise portal as vivid as today’s typical end user perceives the web to be? How could this tremendous amount of data and knowledge on the web be “structurally” coupled with intranet repositories and workflows? What advantages could a company gain from publishing (at least some) data on the web?

Let me give you a few examples of intranet apps which have started to consume information other than jokes from the web:

  • Enterprise Mashups: Combine CRM systems with social networks like LinkedIn
  • Open innovation: Let’s bring the knowledge of consumers and producers together and improve certain products and services. As an example, just recently after BP’s oil spill, more than 40,000 people came up with ideas on how to clean up the oil, and more than two dozen of those ideas were deployed to help with the clean-up
  • Content Augmentation: Enrich content which is being edited, let´s say in an enterprise wiki, automatically with some background knowledge from Wikipedia or with news from a news company

Finally, I will also give you two examples of use cases where companies expose and publish internal data on the web (without violating privacy) and benefit from it.

  • Wisdom of the crowd: The Canadian gold mining group Goldcorp made 400 megabytes of geological survey data available to the public over the Internet. They offered over $500,000 to anyone who could analyze the data and suggest places where gold could be found. The company claims that the contest produced 110 targets, 8 million ounces of gold, worth more than $3 billion.
  • Prize economics: Netflix, a movie rental service in the US, published data for a contest to improve its recommender engine. After nearly 3 years, one team out of 50,000 contestants improved the existing recommender engine by more than 10% and won 1 million dollars

To conclude: in one of his famous TED talks, Tim Berners-Lee demanded “raw data now!”. This has started to become reality. Just think of all the “Open Government Data Initiatives” around the globe which have been initiated since then. Now companies with a “Web DNA” have started to understand the value of open data and to contribute their “5 cents” to the global “open data cloud”. I think this will not only be of value to many companies but will also tremendously increase the chances of resolving some global problems in the near future.

Thomas Thurner

data.reegle.info – Linked Open Data on Clean Energy

Following the worldwide trend of Open Government Data as well as Linked (Open) Data, the reegle.info team has decided to launch a reegle data portal in November 2010: data.reegle.info.

Providing raw data for free and unrestricted re-use (an idea first raised by Sir Tim Berners-Lee in the course of the W3C Linked (Open) Data movement) fits the objectives of the reegle.info information system as the single point of access for worldwide clean energy data (renewable energy as well as energy efficiency).

On data.reegle.info you can find data on stakeholders in the clean energy area as well as country (energy) profiles from the first day of the launch in November 2010. Later on, the reegle.info team will open up its renewable energy and energy efficiency thesaurus (SKOS format) for public re-use and will continuously provide more and more clean energy data on data.reegle.info. The license used for data.reegle.info is the Open Government Data License for public sector information. data.reegle.info follows W3C standards and recommendations for Linked Open Data as well as Open Government Data.

For developers, the reegle.info team has created a comprehensive developer guide as well as a SPARQL endpoint as the central API to the reegle.info data. The reegle.info consortium hopes that data.reegle.info will inspire many new (data) mashups as well as innovative apps using data.reegle.info.
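As a rough idea of what talking to a SPARQL endpoint involves, the sketch below builds a GET request URL in the way the W3C SPARQL Protocol describes, with the query text passed in a `query` parameter. The endpoint address is a placeholder, since the real address is given in the developer guide and not in this post. The query is a generic "show me ten triples" probe and the helper names are hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Placeholder only: substitute the endpoint address from the developer guide.
ENDPOINT = "http://example.org/sparql"

# A generic exploratory query that makes no assumption about the data model.
QUERY = """
SELECT ?s ?p ?o
WHERE { ?s ?p ?o }
LIMIT 10
"""


def build_sparql_get_url(endpoint: str, query: str) -> str:
    """Return a GET URL carrying the URL-encoded query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})


def extract_query(url: str) -> str:
    """Decode the SPARQL query text back out of a request URL (for checking)."""
    return parse_qs(urlsplit(url).query)["query"][0]


url = build_sparql_get_url(ENDPOINT, QUERY)
print(url)
```

The resulting URL could then be fetched with any HTTP client. Which result formats an endpoint returns depends on its configuration, so that part is left out here.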

Thomas Thurner

The Open Government Data Meetup in Vienna

Show what is possible! As Martin Kaltenböck, one of the organizers of the recently held Semantic Web Meetup on an Austrian Open Government Data Initiative, said, there is a lot of enthusiasm and energy to inform the public and engage politicians about the impact that an initiative similar to those in the US and UK may have for Austria. And the kick-off was promising. Inspiring talks by Rufus Pollock (UK) and Stefano Bertolo (EU) gave an insight into what is possible in the specific field of Open Government Data, as well as what the start of such an initiative can look like.

As ePSI-Platform wrote in their blog:
“The Austrian Open Data initiative is online and at work.”

The event was very well attended and brought together stakeholders from science, industry and government as well as citizen activists: a promising mélange of people who may carry the project forward to very concrete use cases and trials in the very near future. As the initiative is meant to be carried by a broad group of proponents, the follow-up to the meeting will be a round table of those who are willing to contribute to upcoming lighthouse projects and to open concrete sets of government data for them.

The next meeting of the Austrian Open Data Initiative
takes place on the 12th May at 9.30 a.m. in
Room D, quartier 21 of the Vienna Museum Quarter.

Find documentation of the Meetup on Zukunftsweb, browse the picture album or read the conclusions at ePSI-Platform.

More resources