The Semantic Puzzle

Thomas Thurner

American Physical Society Taxonomy – Case Study


Joseph A Busch

Taxonomy Strategies has been working with the American Physical Society (APS) to develop a new faceted classification scheme.

The proposed scheme includes several discrete sets of categories called facets whose values can be combined to express concepts such as existing Physics and Astronomy Classification Scheme (PACS) codes, as well as new concepts that have not yet emerged, or have been difficult to express with the existing PACS.

PACS codes formed a single-hierarchy classification scheme, designed to assign the “one best” category under which an item will be classified. Such schemes come from the need to physically locate objects in one dimension, for example in a library where a book will be shelved in one and only one location, among an ordered set of other books. Traditional journal tables of contents similarly place each article in a given issue in a specific location among an ordered set of other articles, certainly a necessary constraint with paper journals and still useful online as a comfortable and familiar context for readers.

However, the real world of concepts is multi-dimensional. In collapsing to one dimension, a classification scheme makes essentially arbitrary choices that have the effect of placing some related items close together while leaving other related items in very distant bins. It also has the effect of repeating the terms associated with the last dimension in many different contexts, leading to an appearance of significant redundancy and complexity in locating terms.

A faceted taxonomy attempts to identify each stand-alone concept through the term or terms commonly associated with it, and to have it mean the same thing whenever it is used. Hierarchy in a taxonomy is useful for grouping related terms together; however, the intention is not to identify an item such as an article or book by a single concept, but rather to assign multiple concepts to represent its meaning. In that way, related items can be closely associated along multiple dimensions corresponding to each assigned concept. Where previously a single PACS code was used to indicate the research area, now two, three, or more of the new concepts may be needed (although often a single new concept will be sufficient). Applying the new taxonomy requires a different mindset and approach from the way APS has been accustomed to working with PACS; however, it also enables significant new capabilities for publishing and working with all types of content including articles, papers and websites.
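The idea of associating items along several dimensions can be illustrated with a minimal Python sketch. The facet names, concepts and article identifiers below are invented for illustration and are not the actual APS facets; the `related` function is likewise a hypothetical helper, not part of any tool mentioned here.

```python
from collections import defaultdict

# Hypothetical facets and concepts, invented for illustration only;
# they are not the actual APS facets or terms.
articles = {
    "article-1": {"technique": {"neutron scattering"}, "material": {"superconductors"}},
    "article-2": {"technique": {"neutron scattering"}, "material": {"polymers"}},
    "article-3": {"technique": {"numerical simulation"}, "material": {"superconductors"}},
}


def related(item_id, items):
    """Return, per facet, the other items that share at least one concept."""
    result = defaultdict(set)
    for facet, concepts in items[item_id].items():
        for other_id, other in items.items():
            # Set intersection is non-empty when a concept is shared.
            if other_id != item_id and concepts & other.get(facet, set()):
                result[facet].add(other_id)
    return dict(result)


print(related("article-1", articles))
```

In a single hierarchy, article-1 would have to be filed under either its technique or its material, leaving it far from one of the two related articles. With facets, it stays close to both.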

To build and maintain the faceted taxonomy, APS has acquired the PoolParty taxonomy management tool. PoolParty will enable APS editorial staff to create, retrieve, update and delete taxonomy term records. The tool will support the various thesaurus, knowledge organization system and ontology standards for concepts, relationships, alternate terms etc. It will also provide methods for:

  • Associating taxonomy terms with content items, and storing that association in a content index record.
  • Automated indexing to suggest taxonomy terms that should be associated with content items, and text mining to suggest terms to potentially be added to the taxonomy.
  • Integrating taxonomy term look-up, browse and navigation in a selection user interface that, for example, authors and the general public could use.
  • Implementing a feedback user interface allowing authors and the general public to suggest terms, record the source of the suggestion, and inform the user on the disposition of their suggestion.
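As a rough sketch of the first two items, a content index record can be modelled as a content identifier plus a set of taxonomy term identifiers, and automated indexing can be approximated by matching term labels against the text. Everything here is an assumption made for illustration: the `Term` and `IndexRecord` classes, the `suggest_terms` function and the sample terms are hypothetical and do not reflect PoolParty's data model or API. A real indexer would do far more than substring matching.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """Hypothetical taxonomy term with a preferred label and alternate terms."""
    term_id: str
    pref_label: str
    alt_labels: list = field(default_factory=list)


@dataclass
class IndexRecord:
    """Hypothetical content index record: one content item, many terms."""
    content_id: str
    term_ids: set = field(default_factory=set)


def suggest_terms(text, terms):
    """Suggest term ids whose preferred or alternate label occurs in the text.

    Naive case-insensitive substring matching, used only to show the shape
    of the workflow.
    """
    lowered = text.lower()
    return [
        t.term_id
        for t in terms
        if any(label.lower() in lowered for label in [t.pref_label, *t.alt_labels])
    ]


# Invented sample terms and content, for illustration only.
terms = [
    Term("t1", "superconductivity", ["superconducting"]),
    Term("t2", "neutron scattering"),
]
record = IndexRecord("article-1")

# Suggestions would normally be reviewed by an editor before being stored.
record.term_ids.update(suggest_terms("We study superconducting thin films.", terms))
print(record)
```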

Arthur Smith, project manager for the new APS taxonomy, notes: “PoolParty allows our subject matter experts to immediately visualize the layout of the taxonomy, to add new concepts, suggest alternatives, and to map out the relationships and mappings to other concept schemes that we need. While our project is still in an early stage, the software tool is already proving very useful.”

About

Taxonomy Strategies (www.taxonomystrategies.com) is an information management consultancy that specializes in applying taxonomies, metadata, automatic classification, and other information retrieval technologies to the needs of business and other organizations.

The American Physical Society (www.aps.org) is a non-profit membership organization working to advance and diffuse the knowledge of physics through its outstanding research journals, scientific meetings, and education, outreach, advocacy and international activities. APS represents over 50,000 members, including physicists in academia, national laboratories and industry in the United States and throughout the world. Society offices are located in College Park, MD (Headquarters), Ridge, NY, and Washington, DC.
