Andreas Blumauer

Interview with David Huynh: “The user interface design must inform the back-end design”

Linked Data is evolving fast. A huge amount of RDF data is available and ready for exciting new applications. Unfortunately, the bottleneck is still the availability of Semantic Web user front-ends which demonstrate the power of linked data. To a certain degree BBC Music beta is the first commercial platform which makes heavy use of linked data. With Parallax David Huynh has shown that one of the most interesting semantic web applications can be built around browse and search applications which offer tools for doing complex search queries.

Andreas Blumauer from Semantic Web Company (SWC) talked with David Huynh, “Interaction Scientist” at Metaweb, the company which developed Freebase, an “open, shared database of the world’s knowledge”.

SWC: David, you have been working for MIT’s Simile Project and now for Metaweb Technologies – two “building blocks” of the Semantic Web. Could you tell us a bit about your ongoing work at Metaweb?

David: My official title at Metaweb is “Interaction Scientist,” and so my main focus is coming up with novel interaction designs for Metaweb’s platform and products, and prototyping them to some extent to evaluate their effectiveness. Parallax was one such prototype that has gathered much excitement within Metaweb and the Semantic Web community at large. And the Freebase query editor 2.0 shows my interaction designs at the other end of the spectrum – targeting developers rather than just end-users.
I’ve also learned that data-centric user interfaces and interaction designs can only be as good as the data allows them to be. So I am also dedicating some of my time to analyzing the data we have and improving its quality so that I can design even better interactions.

SWC: With Parallax you have introduced a new way to search and explore data: Could you explain the “set-based browsing paradigm”?

David: In the browsing paradigm of the original Web, while looking at a web page, you can only click on one hyperlink to get to one other web page. But in a lot of cases, the hyperlinks on that web page can be grouped into different groups based on what they mean to the human reader: these are the links that lead to reviews, these are the links that lead to authors, these are the links that lead to vendors, etc.
Now if the computer actually knows what these links mean, then you can tell it to follow several of those links that mean the same thing: follow all the links that lead to authors. Think of it as powered browsing: the computer does the work of following several similar browsing paths at the same time – going from a set of things (web pages or data entries) to a similarly related set of things – and making all of that information available for your perusal in one shot. It is a paradigm shift compared to how we browse the Web today. And it’s only possible when the computer is capable of telling which link is similar to which other link. And that capability, in turn, will be made possible by the Data Web.
(See this unpublished paper which goes into depth about this concept)
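
As an editorial illustration (not Parallax's or Freebase's actual code), the set-to-set step described above can be sketched in a few lines of Python. The toy statements, the identifiers such as `book:1`, and the helper names `build_index` and `follow` are all hypothetical; the sketch only assumes that each link carries a machine-readable type.

```python
from collections import defaultdict

# Hypothetical toy data: (subject, link type, object) statements.
TRIPLES = [
    ("book:1", "author", "person:alice"),
    ("book:1", "review", "review:17"),
    ("book:2", "author", "person:bob"),
    ("book:2", "author", "person:alice"),
    ("book:3", "vendor", "shop:x"),
]


def build_index(triples):
    """Map each (subject, link type) pair to the set of linked objects."""
    index = defaultdict(set)
    for subject, link_type, obj in triples:
        index[(subject, link_type)].add(obj)
    return index


def follow(index, items, link_type):
    """Follow every link of one type from a whole set of items at once."""
    result = set()
    for item in items:
        # .get avoids creating empty entries for items without such a link
        result |= index.get((item, link_type), set())
    return result


index = build_index(TRIPLES)
# One step from a set of books to the set of all their authors.
authors = follow(index, {"book:1", "book:2", "book:3"}, "author")
print(sorted(authors))
```

Classic browsing corresponds to following a single link from a single item; here one call moves from a set of things to the related set, which only works because the link types are known to the machine.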

SWC: Linked Data is evolving fast. A huge amount of RDF data is available and ready for exciting new applications. Unfortunately, one bottleneck is still the availability of Semantic Web user front-ends which demonstrate the power of linked data. Do you think that the Semantic Web is more a server technology than an end-user experience?

David: I have never thought of the Semantic Web as either a server technology or an end-user experience. I only care about usefulness, and then a matching amount of usability to make that usefulness accessible to people, especially those without Computer Science expertise.
I find that it’s so much easier to explain to people and get them excited about “immediate, personal, local benefits” of a particular technology than about “long-term, communal, global benefits” of a vision. For most people, the former must be experienced and felt often before the latter can appear vaguely appealing enough to call for action. I’m lazy – I don’t like to spend effort convincing people of visions; I only want to entice people into using the tools that I have created.
So if Parallax is considered a success, it is so not just because of its technologies and research contributions, but also because the accompanying screencast explained it in a way that people who cared nothing about the Semantic Web could understand why Parallax would be useful to them. This was achieved by pointing out limitations of existing web technologies as already experienced and understood by a lot of web users, and then illustrating concretely a possible solution enabled by data web technologies.
Perhaps I could venture further and say that the dichotomy of server technologies and end-user experience is what’s holding back Semantic Web user interface efforts. For those who don’t have expertise in design, it is a comfort to think that once the back-end technologies are solid, then it’s just a matter of putting on some polishes, a.k.a. user interfaces from their point of view, to make the whole package appealing. This approach is wrong. The user interface design must inform the back-end design. Otherwise, the user interface will almost always reflect the internal system model, and that’s usually very dissonant with how users think and behave. Recall all the Semantic Web interfaces you have seen that force users to think in terms of triples or of raw URIs. Those were made by starting from the data model, not from user needs.

SWC: Quite often I hear people saying: Where is the Semantic Web? – I still can’t “see” it! How could the linking open data community make use of user interfaces such as Exhibit, Piggy Bank or Parallax? Is the set-based browsing paradigm a universal way to browse linked data or just one possible way?

David: My research prototypes embody a number of UI ideas that are quite transferable to other platforms. Most of my code is open source, too. This, by the way, is rarer than it should be: research prototypes often fall apart as soon as, or even sooner than, the relevant research papers get presented at conferences, and research code rots rather than gets offered free for reuse. This is sad, because reusable data needs reusable code to proliferate even more widely, but there is no reward system for making research code reusable, or for keeping research prototypes running. So perhaps people can’t “see” the Semantic Web because research prototypes are not presented in appealing and comprehensible ways, and they break down and disappear too quickly.
Regarding the set-based browsing paradigm, it is most certainly not the only way to browse linked data. It is just the first good one that came to my mind, around 2005. But it was not until 2008 that I actually got around to implementing it for real. One of the factors most important to its feasibility is the quality of data in Freebase, compared to other data sources that I had access to. Even the simple fact that a lot of Freebase topics have images makes Parallax look a lot more interesting and useful. People like to see pictures rather than raw URIs. And the diversity of types of data helps illustrate the browsing paradigm of Parallax – that ability to shift focus from one set of things to another set of things, even across seemingly very unrelated domains of information, such as from politicians to their celebrity friends in the movie industry.
So, perhaps one of the main challenges in adopting Parallax ideas on any arbitrary RDF data set is curating the data sufficiently for the purpose of presenting it. In fact, if you don’t know how some data is to be presented and used, there’s no way for you to determine if that data is of sufficient quality. User needs and interface designs drive back-end implementation and data curation, not the other way around. It’s a simple idea, really, but it can be hard to adopt if one is fixated on data alone.

SWC: Do you plan new versions of Parallax? When will it become part of Freebase or of even more Linked Data Sources?

David: I’ve done a few further experiments with the ideas in Parallax, but they are not ready for public use yet. Freebase data makes my job much easier by allowing me to focus mostly on interaction designs rather than mostly on data quality, or rather, fighting the lack of data quality, for the purpose of presenting it. So I’ll start with Freebase data and we’ll see where it takes me.

SWC: What else are you working on at the moment?

David: As mentioned briefly earlier, reusable data needs reusable code to proliferate widely. That gives you a hint at an effort that I’m involved with.

SWC: Many thanks, David!

About David François Huynh

Thomas Thurner

1000-and-one pulldowns

Fortunately, semantic search techniques have now found their way into themed knowledge portals. Almost every week a new knowledge portal with built-in semantic search pops up, covering environmental issues, health care, the economy and so on. These sites are good examples of how semantic technologies foster the vision of a knowledge web. Over the next few months, such focused approaches will be great showcases for “a” semantic web (even if they are not based on “the” RDF Semantic Web), alongside general knowledge portals like Wolfram Alpha.

But the potential of these semantic theme portals is often substantially reduced by their poor usability. You get lost in categories and flags – you get puzzled by pulldowns, mouseovers and embedded hierarchies – sometimes it’s a mess of 1001 functions. You need to understand the underlying semantic concept to orient yourself within these applications – and that is not the point of the exercise. Search has to be easy.

To show the potential of semantic technologies, we need good examples that offer good usability. This is a call to everyone to provide such examples.

See my favorites:

  • NextBio, a platform that enables life science researchers to search, discover, and share knowledge locked within public and proprietary data
  • reegle, the Search Engine for Renewable Energy and Energy Efficiency
  • CultureSampo, a Finnish cultural heritage platform for institutional organizations as well as private citizens
Andreas Blumauer

BBC Music relaunch: Linked Data goes Business?

Since SWC is involved in a couple of semantic web projects in the media industry, I have been watching for the BBC Music relaunch. Now the new platform is online – and from an end user’s perspective the new system offers comfortable ways to navigate through the world of music: Bands, their members, biographies and outgoing links, for example to Wikipedia or MySpace, are retrieved from MusicBrainz and mashed up with BBC blogs, playlists or reviews.

Matthew Shorter, interactive editor for music at the BBC, told silicon.com:

We’re kind of on a journey of moving from what’s effectively a magazine/print publication-based metaphor around web publishing…to a world where we recognise that that’s not the way that people use the web.

No doubt, Linked Data is a great deal for end users, but what’s in it for the providers, in this case the BBC?

From a media company’s perspective, Shorter mentioned a handful of interesting arguments for why linked data could be useful:

  1. reusing data from MusicBrainz and Wikipedia also provides better value for the licence payer as the BBC isn’t wasting resources reproducing data already in the public domain
  2. from an SEO point of view, once we start generating a lot of meaningful links among our pages, then we’re going to improve the find-ability of our content via web search
  3. by having as open a platform as we can, then our hope at least is that people will pick up that content and do things with it and we’ll benefit from incoming links as a result

This could be summarised as follows (by adding a fourth item):

  1. re-use existing data
  2. increase find-ability
  3. extend your eco-system
  4. understand users’ interests

Linked data can help providers understand their users more profoundly, because information in the linked data world is offered at a finer granularity (the paradigm shift from pages to linked data). With that in mind, I’d like to ask a short, value-free question: Which side of the internet will drive business in the future – the visible web or the deep web? Was linked data designed only for the visible web?

Jana Herwig

Java’s Inner Sanctum: A Visit to Sun Microsystems’ Usability Lab in Prague

The walls in room 3328, the observation room at Sun Microsystems’ usability lab in Prague, are painted a subdued blue. It swallows all the light, ensuring the testing scenario is not interrupted by curious guests like us, the KiWi project team members who were granted the privilege of a tour of the inner sanctum of Sun’s developer den. Through the one-way mirror, we can see a rosy-cheeked developer, talking to himself in Czech, interrupted by little sips from a coke bottle. He does not see us. The fact that very few of us understand Czech gives the situation an even more experimental appeal.

The new usability lab at Sun Microsystems, Prague

Jakub Franc, the cognitive psychologist in charge of the design of the study, explains to us that Sun rely on the Think Aloud method and observation in most of their test cases, rather than analyzing data from biofeedback sensors or eye-tracking devices. “Eye-tracking is good for testing the usability of web sites,” he says, “but for our purposes, the think aloud method, where the test person describes what he does and thinks, has greater benefits to offer.” The authenticity of the tasks to be performed in the study is key: The developer behind the sound-proof glass wall is currently busy importing his own PHP application into NetBeans, Sun’s open source development environment, while the interaction designers and developers who created the tested module observe. A typical testing scenario lasts about 90 minutes, with the final 20 minutes consisting of an interview. “I always tell the testers that it’s not their fault if they fail to perform a task,” says Jakub. “If they fail, it’s the product’s fault. After all, that’s why we’re testing it.”

Before a software product is tested in its design or redesign phase, the ideal candidates are identified based on the results of questionnaires that are sent out to people in the tester database. The database includes users of open source software as well as of competing products, with the ideal test sample consisting of people who represent the whole spectrum of the target group, ranging from expert to newbie – and they need not necessarily be open source enthusiasts: “We offer a relatively high reward of 1000 CZK* as we want testers from all levels and backgrounds and not just the volunteering enthusiasts.”

Until Sun Microsystems moved into their new building in 2006, they collaborated with the Department of Computer Science at Czech Technical University (CTU), where they set up the very first usability lab in the Czech Republic in 2004. The deal was that Sun would supply the equipment and know-how, and CTU would supply the space and construction. Both institutions shared the facility until, after three years, all usage rights and equipment were transferred to CTU. One of the features of the new lab is the one-way mirror – the previous one relied on video observation: “From our experience, despite the fact that some participants feel less comfortable in this set-up, it makes a difference to observers”, writes Jiri Mzourek on his blog, one of the many Sun blogs, “they feel more connected to the participants”.

Jakub Franc, cognitive psychologist and usability researcher

Even though there is now an in-house usability lab at Sun, the collaboration between Sun and CTU continues, in particular in research and design projects. Students participate in projects led by Sun that focus on Sun products, learning about research methodology as well as gaining experience in project management in a real business environment. Jakub Franc also gives seminars in cognitive psychology and research methodology to CTU students, and is himself pursuing a PhD in environmental psychology – a relatively new discipline which, according to Jakub, deals with questions such as: “How should buildings be designed so that people are not getting lost in them? What recreational areas help people to recover from daily stress? What kinds of front gardens discourage burglars from invading the place?” In other words: Jakub studies the cognitive parameters of the usability of real objects.

Once the KiWi/Sun use case enters the evaluation stage, the KiWi team will again be given access to the lab – this time not as visitors, but as observers, witnessing how usable the KiWi-Wiki system really is to the interested user. We are looking forward to the experience – and thank the designers of the lab for implementing a sound-proof wall, just in case the KiWis get emotional!

*) worth about 2 monthly passes for the metro in Prague, or 40 beers in a good pub
