Semantic Web Company

The Semantic Puzzle

Open World Assumptions


Keep the Semantic Web trusty

March 13, 2009 By: Thomas Thurner Category: Corporate Semantic Web, Mashups & Web services, Politics, Privacy & Information Ethics, Text Mining 1 Comment →


In recent days we have had a lot of discussions here at Semantic Web Company about how the Semantic Web (call it Web 3.0 if you like) will develop. Several stakeholders already see that the technical realisation of Web 3.0 brings a potential danger with it: it is already possible to build applications and mashups with semantic technologies that seriously undermine privacy and information ethics. Without a fundamental discussion about the ethical framework for technologies like linked data, text mining, biometric systems and geo-systems in combination with the web of data, the whole domain is in danger of being discredited the way genetic engineering was some years ago.

For public opinion on the Semantic Web, it is crucial to address the inherent risks regarding privacy and ethics. This is also the context in which I read Tim Berners-Lee’s statement yesterday: “W3C wants to help make sure data use is appropriate,” he said. Berners-Lee, who is director of W3C, said in an interview on Wednesday that the teams working on the Semantic Web project are making sure that privacy principles are included in its architecture: “The Semantic Web project is developing systems which will answer where data came from and where it’s going to – the system will be architectured for a set of appropriate uses.”

It may be an important step towards keeping the further development of the Semantic Web trustworthy in the eyes of the public that the W3C has privacy and information ethics on its agenda and that people like Berners-Lee stake their reputation on it. But it is just as crucial to build this awareness on the corporate side. Only if everyone within the domain shares a common ethical understanding will public opinion focus on the future potential of the Semantic Web rather than fear it.


First Make.tv cast about the Social Semantic Web

November 19, 2008 By: Jana Herwig Category: Videos & Tutorials No Comments →

Time for a bit of over-the-top web 2.0 adulation… at yesterday’s Digitalks event (once again wonderfully organized by Meral Akin-Hecke), Luca Hammer filmed throughout the presentations and discussions, using two cameras at a time AND live-editing and live-streaming it on Make.tv. What is Make.tv? The most incredible web 2.0 application I’ve seen so far – it’s a TV studio in your browser! And it’s free! (Although I doubt it will stay free forever.)

You can live-edit the input from several cameras – this can also be achieved by logging in on different computers at a time, thus using the input from several built-in webcams at a time. You can drag and drop the video input channels into your scene, make the embedded videos smaller to achieve a screen-in-screen effect, create your own TV design and virtual studio from graphics…. wow, wow, wow.

I played with it today, though not as adventurously as Luca: I used only one camera (see what he achieved yesterday with multiple screens) and did not interrupt and restart the recording (which I could have). Even so, I find the visual result, i.e. the ‘studio’ I built from the book cover, impressive enough.

So here it is: my introduction of the Social Semantic Web publication (which is in German, which is why the audio is in German, too, but you don’t need to understand what I am saying to be impressed by Make.tv). Jump to the minute between 3:30 and 4:30 to see how you can switch between different screens while doing the webcast.

P.S. That’s an image below – you can embed the video, but you cannot (yet) deactivate that it starts automatically if you embed it, so I’ve decided to use an image on the blog instead. Click here, or the image, to launch the webcast on the Make.tv website.

Social Semantic Web - Webcast

Btw, I am not sure whether I said XML or XHTML in the webcast, but of course I meant XHTML when talking about the benefits of RDFa.


The Future, Quantum Encryption, Privacy on the Social Semantic Web

October 28, 2008 By: Jana Herwig Category: Semantics & Philosophy, Social Software No Comments →

Just two memos: There is a talk tonight with Thomas Länger from the Viennese quantum encryption project (BBC article about the project), co-organized by quintessenz (an organisation devoted to civil rights in the information age) and Transforming Freedom (who are dedicated to documenting the discourse of the battle zones of digital culture; I volunteer for them). ORF wrote a German article about it, with information about the venue and start time. The key issue quintessenz want to raise with this talk is: Who is going to benefit? Will “unbreakable” quantum encryption become available to citizens, too? Quantum encryption cartridges for your PC, anyone?

Secondly: I published an “inaugural interview” Marion Fugléwicz-Bren did with two of my colleagues, Matthias Samwald and Thomas Schandl (not so inaugural for the former, as he already joined SWC in January). I’d like to extract this quote by W3C member Samwald regarding privacy on the (corporation owned) social web and the future (user-managed) social semantic web:

I also think that Semantic Web technologies will receive a lot of media attention when the first big, public breach in security / privacy happens in one of the websites that currently dominate the whole world wide web. At the moment, we all are uploading most of our private and business lives to web sites such as Google, Facebook, Flickr and others. It is just a matter of time until a big scandal happens, be it the companies themselves that misuse the vast amounts of data they have, or be it a government agency in an overzealous effort of crime prevention.

When this will happen, people will re-evaluate the trend towards massive centralisation on the web, and will search for opportunities to make the same feeling of being ‘in the network’ happen in a distributed environment, without selling ones soul to a multinational corporation. Then we will find that such an opportunity already exists — the Semantic Web.

Read the whole interview here.


Semantic Desktop, Lifting and Human Language Technology [WOD-PD]

October 22, 2008 By: Jana Herwig Category: Conferences & Events, Search Engines, Social Software 2 Comments →

The next session at WOD-PD was given by Leo Sauermann (German Research Center for Artificial Intelligence DFKI, Germany) and Brian Davis (DERI Galway, Ireland). Leo introduced the idea of the Semantic Desktop, and more specifically, the Nepomuk Social Semantic Desktop. There’s a good article about Nepomuk on Linux.com, written by Bruce Byfield on August 26, 2008, from which I quote the following enlightening passages:

Ansgar Bernardi, deputy head of the Knowledge Management Department at Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI, or the German Research Center for Artificial Intelligence) and Nepomuk’s coordinator, explains, “The basic problem that we all face nowadays is how to handle vast amounts of information at a sensible rate.” [...] “The point is, you have a vast amount of information on your desktop, hidden in files, hidden in emails, hidden in the names and structures of your folders. Nepomuk gives a standard way to handle such information.”

At a high level of generalization, Nepomuk has three main aspects, according to Bernardi. First, there is a standard framework for annotating pieces of information so that connections can be made between them. Second, there are ontologies, the sets of “documented shared understanding” or common concepts that can be defined for particular types of information, such as bio-science or computer desktop use. Finally, there are the tools for making or using the annotations and ontologies, what Bernardi calls the “workspaces that connect to other workspaces and help you in your day to day activities of collecting information, structuring it, making sense of it, and creating new information and communicating it.”

Leo has provided the relevant download links for those who “want to get their hands dirty” with Nepomuk (as he put it) on his blog. Leo Sauermann and Ansgar Bernardi also contributed an article about the Semantic Desktop to the recently published Social Semantic Web volume – a preview of the article is available here (in German – I’m sorry!).

Brian Davis’s part of the talk focused on Lifting and Human Language Technology (HLT) for the Semantic Desktop. Semantic Lifting means capturing semantics and translating them into ontologies. Human language technology (HLT), in its broadest sense, can be described as computational methods for processing and manipulating language (for instance text analysis).

One of the goals of the Semantic Desktop is speech act detection for email – speech act here as defined by John Searle. At its most basic definition, a speech act is simply an utterance, but is also often understood more specifically as an illocutionary act (which is a term introduced by John L. Austin in How to do things with words), or a ‘performative utterance’, meaning that by saying something, one actually does something. For instance, the sentence “Please have the document ready for Workshop 1.” contains an instruction: It informs the reader about the requirements for a particular event, and asks him or her to meet these requirements.

Brian also introduced Roundtrip Ontology Authoring (ROA), a process that allows non-expert users to author or amend an ontology by using a simple, easy-to-learn controlled natural language. The process is a combination of Controlled Language for Information Extraction (CLIE) and Text Generation, and is developed on top of GATE. ROA is documented on the Nepomuk website; for further information about CLIE, read this article by Valentin Tablan, Tamara Polajnar, Hamish Cunningham and Kalina Bontcheva: User-friendly ontology authoring using a controlled language (PDF, 64 KB).


Social Semantic Web – New Publication Out

October 16, 2008 By: Jana Herwig Category: Literature & Publications, Semantics & Philosophy, Social Software 5 Comments →

The “Social Semantic Web” is here – yay! The book of the same name, edited by Andreas Blumauer (right) and Tassilo Pellegrini, is now available in stores. Another contributor from SWC is Matthias Samwald (left), who, together with Holger Stenzhorn, discussed the relevance of the Semantic Web for biomedical research in their article for the book.

The publication (in German, with the exception of one article by Narayanan Kulathuramaiyer and Hermann Maurer addressing issues of Data Mining) has four sections:

  • a low-threshold introduction to Web 2.0 and social software, covering technological, cultural and social aspects,
  • an overview of core technologies and methods, covering e.g. knowledge discovery, expert finders, tag recommendation, etc,
  • an overview and discussion of existing applications and their perspectives within the Social Semantic Web, e.g. the Semantic Desktop, Bibsonomy or the perspectives for biomedical research,
  • a discussion of phenomena of the Social Semantic Web from the perspective of communication studies and social sciences, e.g. privacy on the social semantic web, or the role of user-generated content for individual empowerment.

We have also created a wiki for the book (using Semantic Media Wiki) which is available at social.semantic-web.at. You can, for instance, browse it by article, by author, or by organisation. Tom Schandl made a few changes to available templates, which he is soon going to blog about.

Social Semantic Web Happy Authors (image by leobard via Flickr)

Author copies were shipped last week – some of the contributors have already blogged about the book, for instance Leo Sauermann, who, together with Malte Kiesel, Kinga Schumacher and Ansgar Bernardi, contributed an article about the Semantic Desktop and personal knowledge management (image also provided by Leo Sauermann). Jan Schmidt a.k.a. “Schmidt with Dee Tee”, in an article he wrote together with Tassilo Pellegrini, approached the Semantic Web from the perspective of Communication Studies; Jan has posted the abstract (in German) and offered a bit of commentary on his blog. Michael Nagenborg, who authored the article about privacy on the Social Semantic Web, announced the book on his website.

Please let us know if you’ve also written a blog post about the book or have resources on Flickr, Slideshare or elsewhere; and/or tag them with “socsemweb08” so that we can find them. Of course you can also add them to the wiki yourself right away (page Resonanz).

Complete list of contributors (in order of appearance in the book): (more…)


BarCamp Proposals: Factolex, Social Enhanced Search

October 06, 2008 By: Jana Herwig Category: Conferences & Events, Social Software No Comments →

Hello Monday! I am a bit tired today as I did not really have a weekend but spent it in a rather intellectually stimulating fashion, attending BarCamp Vienna held on the premises of HP in the 12th district. My head is still buzzing from all the input!

Originally, the plan had been to have a marketing-themed BarCamp, but thanks to the bottom-up approach to scheduling that is typical for BarCamps, that didn’t quite come to pass (it is also greatly appreciated that the organizers did not enforce the theme, thank you!). Two of the sessions I attended are relevant to the Social Semantic Web:

One was held by Alexander Kirk about the latest improvements in Factolex, a collaborative, micro-content encyclopedia based on facts; I hear that Factolex will receive further semantic enhancements in the near future, so I’ll write a longer blog post about it then. One feature Alex showed and which impressed me considerably was the distributed way in which one can add further facts to Factolex now: On any webpage, highlight a word or phrase (e.g. “President of the European commission”) and then click on the bookmarklet. Factolex is automatically going to check whether it knows the term already and either creates a new one or adds a fact to an existing term. The source will be added automatically – pretty nifty!
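The "create a new term or add a fact to an existing one" behaviour described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Factolex's actual code or API: the `FactStore` class, its method names and the example URLs are all hypothetical, and matching terms by lower-cased text is an assumption.

```python
from collections import defaultdict


class FactStore:
    """Hypothetical in-memory store: one term maps to many (fact, source) pairs."""

    def __init__(self):
        # normalised term -> list of (fact, source_url) tuples
        self._facts = defaultdict(list)

    @staticmethod
    def _normalise(term):
        # Assumption: terms are matched case-insensitively, ignoring outer whitespace.
        return term.strip().lower()

    def add_fact(self, term, fact, source_url):
        """Attach a fact to a term; return True if the term had to be created."""
        key = self._normalise(term)
        is_new_term = key not in self._facts
        # The source URL is stored with the fact, mirroring the automatic sourcing.
        self._facts[key].append((fact, source_url))
        return is_new_term

    def facts_for(self, term):
        """Return a copy of all (fact, source_url) pairs known for a term."""
        return list(self._facts.get(self._normalise(term), []))


store = FactStore()
created = store.add_fact(
    "President of the European commission",
    "heads the executive body of the EU",
    "http://example.org/page-a",
)
extended = store.add_fact(
    "president of the european commission",
    "is proposed by the European Council",
    "http://example.org/page-b",
)
```

Here the first call creates the term and the second one, highlighted on a different page, only adds a fact to it.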

Another project that does not yet have a name and that is currently in stealth mode was presented by Christian Zeidler: Social Enhanced Search on del.icio.us. The project addresses a well known del.icio.us problem: You can search your bookmarks, i.e. search the tags and possibly definitions you might have added – yet all too often this only leads to the problem that your search query does not match the tags you once assigned. Being able to search the full text of the saved page would improve the scenario considerably – and this is exactly the approach Christian’s project takes.

To begin with, he built his own search index using Lucene, an open source, full-featured text search engine library written in Java. Of course it doesn’t crawl the whole web – just the pages you have added to your del.icio.us account. Instead of building a separate index for every user, Christian decided to have one large shared index, which also avoids indexing the same page twice when several users have bookmarked it. The current index, based on 800 pages, doesn’t exceed a size of 3MB, which seems rather reasonable.
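To make the single-index idea concrete, here is a minimal sketch in plain Python. It is not Christian's implementation and it does not use Lucene; a toy inverted index stands in for it, and the class name, method names and example URLs are assumptions made for illustration. Each page is tokenised and indexed once, and a separate map records which users have bookmarked it.

```python
import re
from collections import defaultdict


class SharedBookmarkIndex:
    """Toy shared full-text index: each URL is indexed once, owners are tracked separately."""

    def __init__(self):
        self._postings = defaultdict(set)  # token -> set of URLs containing it
        self._owners = defaultdict(set)    # URL -> set of users who bookmarked it

    @staticmethod
    def _tokens(text):
        # Very rough tokeniser: lower-cased runs of letters and digits.
        return set(re.findall(r"[a-z0-9]+", text.lower()))

    def add(self, user, url, page_text):
        """Register a bookmark; return True only if the page had to be indexed."""
        first_time = url not in self._owners
        if first_time:
            for token in self._tokens(page_text):
                self._postings[token].add(url)
        self._owners[url].add(user)
        return first_time

    def search(self, user, query):
        """Return the user's bookmarked URLs whose text contains every query token."""
        tokens = self._tokens(query)
        if not tokens:
            return []
        hits = set.intersection(*(self._postings.get(t, set()) for t in tokens))
        return sorted(url for url in hits if user in self._owners[url])


index = SharedBookmarkIndex()
index.add("jana", "http://example.org/lucene", "Lucene is a text search engine library")
index.add("christian", "http://example.org/lucene", "Lucene is a text search engine library")
index.add("christian", "http://example.org/rdfa", "RDFa embeds metadata in XHTML")
```

The second `add` call for the same URL only records a new owner, which is the saving a shared index offers over one index per user.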

Apart from your own bookmarks, the plan is to also allow searching the bookmarks of your friends on del.icio.us, giving your search a social perspective. How many friends do you have on Facebook, how many on del.icio.us? It’s about half a dozen on del.icio.us for me, so I guess that “friendship” here really stands for shared topics and interests – this social perspective thing might actually work for enhanced searches, I think.

What other means are there to weight and rank search results? Somebody raised the issue of customization, i.e. letting the user define how much weight to give to the results of each friend. I completely agree with Christian when he said he doesn’t believe people want customization, as the effort of conscious, user-initiated customization is often (considered) too high. Instead, the system must learn from the data, e.g. prefer the results of those friends whose results you use most often.
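One simple way to "learn from the data" in this sense is to count how often a user clicks on results that came from each friend and to scale the text-match score by that share. The sketch below is an assumption about how such a scheme could look, not a description of the prototype; `FriendWeights`, the friend names and the scores are hypothetical, and the add-one smoothing is just one possible choice.

```python
from collections import Counter


class FriendWeights:
    """Learn per-friend weights from click counts instead of manual settings."""

    def __init__(self):
        self._clicks = Counter()  # friend -> number of clicked results

    def record_click(self, friend):
        self._clicks[friend] += 1

    def weight(self, friend):
        # Share of clicks with add-one smoothing in the numerator, so friends
        # whose results were never clicked keep a small nonzero weight.
        total = sum(self._clicks.values())
        return (self._clicks[friend] + 1) / (total + 1)

    def rank(self, results):
        """Sort (url, friend, text_score) tuples by text_score times friend weight."""
        return sorted(
            results,
            key=lambda r: r[2] * self.weight(r[1]),
            reverse=True,
        )


weights = FriendWeights()
for _ in range(3):
    weights.record_click("anna")
weights.record_click("ben")

ranked = weights.rank([
    ("http://example.org/u1", "ben", 1.0),
    ("http://example.org/u2", "anna", 0.8),
    ("http://example.org/u3", "carl", 0.9),
])
```

In this example the result from "anna" moves to the top despite its lower text score, because her results have been used most often.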

Another useful feature that is already in place is that you can add any RSS feed to your search index as well – this is indeed very neat. And finally, in addition and as a point of reference, the prototype displayed the Lucene-based results in one column, and Yahoo! Search BOSS results in another column. Not surprisingly, the Search BOSS results were rather general, and the Lucene-based results rather specific – and that specificity is what you’d expect from searching your own bookmarks.
