Thomas Thurner

American Physical Society Taxonomy – Case Study

Joseph A Busch

Taxonomy Strategies has been working with the American Physical Society (APS) to develop a new faceted classification scheme.

The proposed scheme includes several discrete sets of categories called facets whose values can be combined to express concepts such as existing Physics and Astronomy Classification Scheme (PACS) codes, as well as new concepts that have not yet emerged, or have been difficult to express with the existing PACS.

PACS codes formed a single-hierarchy classification scheme, designed to assign the “one best” category that an item will be classified under. Classification schemes come from the need to physically locate objects in one dimension, for example in a library where a book will be shelved in one and only one location, among an ordered set of other books. Traditional journal tables of contents similarly place each article in a given issue in a specific location among an ordered set of other articles, certainly a necessary constraint with paper journals and still useful online as a comfortable and familiar context for readers.

However, the real world of concepts is multi-dimensional. In collapsing to one dimension, a classification scheme makes essentially arbitrary choices that have the effect of placing some related items close together while leaving other related items in very distant bins. It also has the effect of repeating the terms associated with the last dimension in many different contexts, leading to an appearance of significant redundancy and complexity in locating terms.

A faceted taxonomy attempts to identify each stand-alone concept through the term or terms commonly associated with it, and to have it mean the same thing wherever it is used. Hierarchy in a taxonomy is useful for grouping related terms together; however, the intention is not to identify an item such as an article or book by a single concept, but rather to assign multiple concepts that together represent its meaning. In that way, related items can be closely associated along multiple dimensions corresponding to each assigned concept. Where previously a single PACS code was used to indicate the research area, now two, three, or more of the new concepts may be needed (although often a single new concept will be sufficient). Applying the new taxonomy requires a different mindset and approach from the way APS has been accustomed to working with PACS; however, it also enables significant new capabilities for publishing and working with all types of content, including articles, papers and websites.
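To make the idea of combining facet values concrete, here is a minimal sketch in Python. The facet names ("Physical system", "Technique", "Property"), the concept labels and the paper titles are invented for illustration and are not taken from the actual APS scheme; the point is only that an item carries several concepts and can be found along any of them.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Item:
    title: str
    # facet name -> set of concept labels assigned from that facet
    # (facet names and labels here are hypothetical examples)
    concepts: Dict[str, Set[str]] = field(default_factory=dict)


def matches(item: Item, query: Dict[str, Set[str]]) -> bool:
    """True if the item carries every requested concept in every requested facet."""
    return all(
        wanted <= item.concepts.get(facet, set())
        for facet, wanted in query.items()
    )


items = [
    Item("Paper A", {"Physical system": {"Graphene"},
                     "Technique": {"Density functional theory"}}),
    Item("Paper B", {"Physical system": {"Graphene"},
                     "Property": {"Superconductivity"}}),
    Item("Paper C", {"Technique": {"Density functional theory"},
                     "Property": {"Superconductivity"}}),
]

# Browse along one dimension ...
by_system = [i.title for i in items
             if matches(i, {"Physical system": {"Graphene"}})]

# ... or combine two facets to narrow the result.
by_system_and_technique = [
    i.title for i in items
    if matches(i, {"Physical system": {"Graphene"},
                   "Technique": {"Density functional theory"}})
]
```

In a single-hierarchy scheme each paper would sit in exactly one bin; here Paper A is reachable both from its physical system and from its technique.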

To build and maintain the faceted taxonomy, APS has acquired the PoolParty taxonomy management tool. PoolParty will enable APS editorial staff to create, retrieve, update and delete taxonomy term records. The tool will support the various thesaurus, knowledge organization system and ontology standards for concepts, relationships, alternate terms etc. It will also provide methods for:

  • Associating taxonomy terms with content items, and storing that association in a content index record.
  • Automated indexing to suggest taxonomy terms that should be associated with content items, and text mining to suggest terms to potentially be added to the taxonomy.
  • Integrating taxonomy term look-up, browse and navigation in a selection user interface that, for example, authors and the general public could use.
  • Implementing a feedback user interface allowing authors and the general public to suggest terms, record the source of the suggestion, and inform the user on the disposition of their suggestion.
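The first two points above can be sketched roughly as follows. This is not PoolParty's API: the `Term` class, the `suggest_terms` and `index_item` functions, the term ids and the sample abstract are all assumptions made for illustration, and plain label matching stands in for whatever automated indexing the tool really performs.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Term:
    term_id: str
    pref_label: str
    alt_labels: Tuple[str, ...] = ()


# A tiny, made-up taxonomy with preferred and alternate labels.
TAXONOMY = [
    Term("t1", "Superconductivity", ("superconducting",)),
    Term("t2", "Graphene"),
    Term("t3", "Neutron scattering", ("neutron diffraction",)),
]


def suggest_terms(text: str, taxonomy: List[Term]) -> List[str]:
    """Return ids of terms whose preferred or alternate label occurs in the text."""
    lowered = text.lower()
    hits = []
    for term in taxonomy:
        labels = (term.pref_label,) + term.alt_labels
        if any(re.search(r"\b" + re.escape(label.lower()) + r"\b", lowered)
               for label in labels):
            hits.append(term.term_id)
    return hits


# Content index: content item id -> ids of associated taxonomy terms.
content_index: Dict[str, List[str]] = {}


def index_item(item_id: str, text: str) -> List[str]:
    """Store the suggested terms for an item in the content index record."""
    content_index[item_id] = suggest_terms(text, TAXONOMY)
    return content_index[item_id]


abstract = "We report superconducting behaviour in twisted bilayer graphene."
suggested = index_item("article-001", abstract)
```

In practice an editor or author would review the suggestions before they are stored, which is where the selection and feedback interfaces from the list come in.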

Arthur Smith, project manager for the new APS taxonomy, notes: “PoolParty allows our subject matter experts to immediately visualize the layout of the taxonomy, to add new concepts, suggest alternatives, and to map out the relationships and mappings to other concept schemes that we need. While our project is still in an early stage, the software tool is already proving very useful.”

About

Taxonomy Strategies (www.taxonomystrategies.com) is an information management consultancy that specializes in applying taxonomies, metadata, automatic classification, and other information retrieval technologies to the needs of business and other organizations.

The American Physical Society (www.aps.org) is a non-profit membership organization working to advance and diffuse the knowledge of physics through its outstanding research journals, scientific meetings, and education, outreach, advocacy and international activities. APS represents over 50,000 members, including physicists in academia, national laboratories and industry in the United States and throughout the world. Society offices are located in College Park, MD (Headquarters), Ridge, NY, and Washington, DC.

Christian Mader

Online checker for SKOS vocabularies now available

Create better SKOS vocabularies
The PoolParty team is pleased to announce the availability of a new online quality checker for SKOS vocabularies. It finds over 20 kinds of potential quality problems in controlled vocabularies that are expressed using SKOS. The service is based on the open-source tool qSKOS.
The main features of the service are:
  • No need for registration – you can log in with your existing accounts at Google, Xing, LinkedIn or Twitter
  • Upload and check as many vocabularies as you like (100MB maximum size for each vocabulary)
  • Access reports of the quality checks for all uploaded vocabularies
  • Quality reports can also be sent to you by email, if you wish
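To give a feel for what a quality check on a SKOS vocabulary can look like, here is a small sketch in Python. It is not the qSKOS implementation and makes no claim about how the service works internally; the two checks (concepts without a preferred label, and concepts with no broader, narrower or related link), the function names and the example.org data are assumptions chosen for illustration. Triples are kept as plain tuples so that the snippet needs no RDF library.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# (subject, predicate, object) triples of a tiny, made-up vocabulary.
triples = [
    ("http://example.org/c/1", RDF_TYPE, SKOS + "Concept"),
    ("http://example.org/c/1", SKOS + "prefLabel", "Optics"),
    ("http://example.org/c/1", SKOS + "narrower", "http://example.org/c/2"),
    ("http://example.org/c/2", RDF_TYPE, SKOS + "Concept"),
    ("http://example.org/c/2", SKOS + "prefLabel", "Lasers"),
    # c/3 has neither a label nor any relation to other concepts.
    ("http://example.org/c/3", RDF_TYPE, SKOS + "Concept"),
]

RELATIONS = {SKOS + "broader", SKOS + "narrower", SKOS + "related"}


def concepts(data):
    """All resources typed as skos:Concept, in sorted order."""
    return sorted({s for s, p, o in data
                   if p == RDF_TYPE and o == SKOS + "Concept"})


def missing_pref_label(data):
    """Concepts that have no skos:prefLabel at all."""
    labelled = {s for s, p, _ in data if p == SKOS + "prefLabel"}
    return [c for c in concepts(data) if c not in labelled]


def unconnected(data):
    """Concepts that take part in no broader/narrower/related statement."""
    connected = set()
    for s, p, o in data:
        if p in RELATIONS:
            connected.update((s, o))
    return [c for c in concepts(data) if c not in connected]


report = {
    "missing_pref_label": missing_pref_label(triples),
    "unconnected": unconnected(triples),
}
```

A real checker would parse the uploaded file with an RDF library and run many more checks, but the pattern of "collect the concepts, then look for what is missing" stays the same.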
The PoolParty team welcomes feedback and suggestions on the service!

Either contact support@poolparty.biz or fill in our feedback form!

Thomas Thurner

I-SEMANTICS 2013: Kai Holzweissig about Daimler’s Linked Data Projects

“Product development is an information intensive process, which relies heavily on the division of labor. Thus, product development is not only of high cognitive, but also of high social complexity. Employees in product development possess different “thought worlds” of the product, its components and its development process, that is, they see the product differently. These different “thought worlds” cause communication – and consequently collaboration, which is the topmost success factor in product development – to break down.”

At this year's I-SEMANTICS, Kai Holzweissig describes the use of controlled vocabularies and product development process reference models at Daimler. Using a controlled vocabulary and the corresponding process reference model as a “discursive anchor” gives employees at Daimler a toolset to harmonize their “thought worlds”. This can result in higher efficiency and effectiveness in product development by fostering inter-departmental collaboration.

About Dr. Kai Holzweissig

He has an educational background in Informatics (PhD) and Cognitive Science (MSc). During his time at university he was awarded scholarships from SIEMENS AG and the Friedrich Naumann Foundation for Freedom. From 2007 to 2011 Kai worked at the Project Management Office at Daimler Trucks. Since 2011 he has worked for the Chief Technology Officer (CTO) at Daimler’s central IT department. Kai’s technical interests include Contextual Informatics, Interaction Design, Social Software Theory and Linked Data. Kai is a lecturer for Interactive Systems at Reutlingen University.

 

Martin Kaltenböck

Linked (Open) Data has reached the European Publishing Industry – but is it the ‘Real Linked Data’ – a short review on the Publishers’ Forum 2013

Invited by Helmut von Berg, Director at Klopotek & Partner (Klopotek is THE European vendor for publishing production software), I had the chance to participate and speak at this year's Publishers’ Forum 2013 at the Concorde Hotel in Berlin on 22 and 23 April 2013.

Coming from the semantic web / linked (open) data community to this publishing industry event, with about 320 participants (mainly decision makers) from small to huge publishers all across Europe, I was really curious in the run-up to the Forum: what would be the most important issues for innovative publishing processes, and what would be the hypes and hopes of a sector in the middle of a big change, moving from paper publishing straight into today's data economy?

And then in Berlin, Monday morning, the big surprise: the opening keynotes by David Worlock, Outsell, UK (Title of Talk: The Atomization of Everything) and Dan Pollock, Nature Publishing Group, UK (Title of Talk: Networked Publishing is Open for Business) already mentioned topics such as the Semantic Web, Linked (Open) Data and even RDF and triple stores, and, last but not least, pointed out that publishers' content needs to be atomized down to the ‘data level’ and can then be used successfully for new and innovative business models to serve existing and future customers…

David Worlock ‘singing my song’ at the Publishers’ Forum 2013

As I had participated in the European Data Forum 2013 (EDF2013) just a few days before the Publishers’ Forum, my first thought was: WOW – publishers have arrived in the modern data economy (and are already following the data value chain)! I enjoyed talking to David Worlock in the coffee break, telling him my thoughts and that I would be running a workshop on ‘Enterprise Terminology as a basis for powerful semantic services for publishers’ that afternoon (see slides on slideshare). His answer was: ‘Yes Martin, it seems that I was singing your song’.

The following 1.5 days of the Publishers’ Forum 2013 were full of presentations, workshops and discussions about innovative publishing processes, new business models for publishers and innovative approaches and services, full of terms that I know well: metadata management, semantics, contextualisation and, very very often, Big Data and Linked (Open) Data… I listened very carefully to all of this, and at some point it became clear that this discussion needs to be evaluated more carefully: many of the talks and presentations used these terms, principles and technologies only as marketing buzzwords, and a deeper look showed that there was no semantic web technology in place?!

Hey, Linked Data does NOT mean establishing something like a relation / a link between ‘an author and a publication’ inside a repository / a database – Linked (Open) Data is a well-established and specified methodology using W3C semantic web standards.

Tim Berners-Lee outlined four principles of linked data in his note “Design Issues: Linked Data” as follows:

  • Use URIs to denote things.
  • Use HTTP URIs so that these things can be referred to and looked up (“dereferenced”) by people and user agents.
  • Provide useful information about the thing when its URI is dereferenced, leveraging standards such as RDF*, SPARQL.
  • Include links to other related things (using their URIs) when publishing data on the Web.
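The difference between a link inside a database and Linked Data can be illustrated with a short Python sketch. It inspects a handful of (subject, predicate, object) statements against principles 1, 2 and 4; principle 3 would need a real HTTP lookup and is left out. The `publisher.example` host, the sample statements and the `review` function are made up for this illustration.

```python
from urllib.parse import urlparse


def is_http_uri(value: str) -> bool:
    """Principles 1 and 2: the thing is named by an HTTP(S) URI."""
    parts = urlparse(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def review(statements, own_host: str):
    """Report subjects without HTTP URIs and links that point to other hosts."""
    subjects = {s for s, _, _ in statements}
    return {
        # Things that cannot be looked up on the Web at all.
        "non_http_subjects": sorted(s for s in subjects if not is_http_uri(s)),
        # Principle 4: links to related things published elsewhere.
        "external_links": sorted(
            o for _, _, o in statements
            if is_http_uri(o) and urlparse(o).netloc != own_host
        ),
    }


statements = [
    ("http://publisher.example/book/42",
     "http://purl.org/dc/terms/title", "An Example Book"),
    ("http://publisher.example/book/42",
     "http://purl.org/dc/terms/creator", "http://publisher.example/author/7"),
    ("http://publisher.example/author/7",
     "http://www.w3.org/2002/07/owl#sameAs",
     "http://dbpedia.org/resource/Some_Author"),
    # An internal database key: a link, but not Linked Data.
    ("db-row:1234", "http://purl.org/dc/terms/title", "Internal record only"),
]

result = review(statements, own_host="publisher.example")
```

The record keyed by `db-row:1234` is exactly the ‘author and publication inside a database’ case: it relates things, but nobody outside the system can refer to it or look it up.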

Please read in more detail here:

Being something of an evangelist for Linked (Open) Data, I think such hype can be very dangerous for the publishing industry. I see a very strong need for these companies to adopt innovative content and data management approaches very quickly, to ensure competitiveness today as well as competitive advantage tomorrow. But not using the respective standards (that is, only having the packaging and marketing brochures branded with them) cannot fulfil the hopes in the mid and long term!

So I would like to point out that ‘Linked Data’ is not always ‘Linked Data’, and I strongly recommend taking a look at the well-proven standards. When selecting IT consultants and IT vendors, make sure that these partners really have worked, and are continuously working, with these standards and mechanisms! (I say ‘partners’ deliberately: another very interesting message I took home from the Forum is that publishers and IT vendors should co-operate more closely in future, in the form of sustainable partnerships.)

Christian Dirschl (Wolters Kluwer) presenting the WKD Use Case on Enterprise Terminologies

Btw., I had a great workshop on Monday afternoon together with Christian Dirschl from Wolters Kluwer Germany (WKD), discussing applications on top of enterprise terminologies (controlled vocabularies using real linked (open) data principles). And: the Semantic Web Company (SWC) is already a partner of the publisher WKD, and this partnership seems to become more fruitful and sustainable every day, using real linked (open) data…