Thomas Thurner

Data to Value & Semantic Web Company agree partnership to bring cutting-edge Semantic Management to Financial Services clients

The partnership aims to change the way organisations, particularly within Financial Services, manage the semantics embedded in their data landscapes. This will offer several core benefits to existing and prospective clients including locating, contextualising and understanding the meaning and content of Information faster and at a considerably lower cost. The partnership will achieve this through combining the latest Information Management and Semantic techniques including:

  • Text Mining, Tagging, Entity Definition & Extraction.
  • Business Glossary, Data Dictionary & Data Governance techniques.
  • Taxonomy, Data Model and Ontology development.
  • Linked Data & Semantic Web analyses.
  • Data Profiling, Mining & Discovery.

Target applications include improving regulatory compliance in areas such as BCBS, enabling new investment research and client reporting techniques, and general efficiency drivers such as faster integration of mergers and acquisitions. As part of the partnership, Data to Value Ltd. will offer solution services and training in PoolParty product offerings, including ontology development and data modelling services.

Nigel Higgs, Managing Director of Data to Value, notes: “This is an exciting collaboration between two firms which are pushing the boundaries in the way Data, Information and Semantics are managed by business stakeholders. We spend a great deal of time helping organisations at a grassroots level pragmatically adopt the latest Information Management techniques. We see this partnership as an excellent way for us to help organisations take realistic steps towards adopting the latest semantic techniques.”

Andreas Blumauer, CEO of Semantic Web Company adds, “The consortium of our two companies offers a unique bundle, which consists of a world-class semantic platform and a team of experts who know exactly how Semantics can help to increase the efficiency and reliability of knowledge intensive business processes in the financial industry.”

Thomas Thurner

Automatic Semantic Tagging for Drupal CMS launched

REEEP [1] and CTCN [2] have recently launched Climate Tagger, a new tool to automatically scan, label, sort and catalogue datasets and document collections. Climate Tagger now incorporates a Drupal Module for automatic annotation of Drupal content nodes. Climate Tagger addresses knowledge-driven organizations in the climate and development arenas, providing automated functionality to streamline, catalogue and link their Climate Compatible Development data and information resources.

Climate Tagger

Climate Tagger for Drupal is a simple, FREE and easy-to-use way to integrate the well-known Reegle Tagging API [3] into any website based on the Drupal Content Management System [5]. The API was originally developed in 2011 with the support of CDKN [4] and is now part of the Climate Tagger suite as the Climate Tagger API. Climate Tagger is backed by the expansive Climate Compatible Development Thesaurus, developed by experts in multiple fields and continuously updated to remain current (explore the thesaurus at http://www.reegle.info/glossary). The thesaurus is available in English, French, Spanish, German and Portuguese, and can connect content published in these languages on different portals.

Climate Tagger for Drupal can be fine-tuned to the individual (and existing) configuration of any Drupal 7 installation by:

  • determining which content types and fields will be automatically tagged
  • scheduling “batch jobs” for automatic updating, including for existing content, with the option either to re-tag all content or to tag only with new concepts added by a thesaurus expansion or update
  • automatically limiting and managing volumes of tag results based on individually chosen scoring thresholds
  • blending with manual tagging
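
The threshold and blending options above can be pictured with a small Python sketch. It is an illustration only, not the module's actual code: the Tag class, the select_tags and merge_with_manual helpers, the 0-100 score scale and the sample scores are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    label: str
    score: float  # relevance score; a 0-100 scale is assumed here


def select_tags(tags, min_score=50.0, max_tags=5):
    """Keep tags at or above min_score, best first, capped at max_tags."""
    kept = [t for t in tags if t.score >= min_score]
    kept.sort(key=lambda t: t.score, reverse=True)
    return kept[:max_tags]


def merge_with_manual(auto_tags, manual_labels):
    """Blend manual labels with automatic tags: manual first, no duplicates."""
    merged = list(manual_labels)
    seen = {label.lower() for label in manual_labels}
    for tag in auto_tags:
        if tag.label.lower() not in seen:
            merged.append(tag.label)
            seen.add(tag.label.lower())
    return merged


# Hypothetical suggestions for one content node.
suggested = [
    Tag("solar energy", 92.0),
    Tag("energy policy", 61.5),
    Tag("wind energy", 48.0),
    Tag("climate finance", 75.0),
]

selected = select_tags(suggested, min_score=60.0, max_tags=2)
print([t.label for t in selected])
print(merge_with_manual(selected, ["Energy Policy", "solar energy"]))
```

With these sample values the two highest-scoring tags above the threshold are kept ("solar energy" and "climate finance"), and the manual labels are preserved in front of any automatic tag that is not already present.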

“Climate Tagger [6] brings together the semantic power of Semantic Web Company’s PoolParty Semantic Suite [7] with the domain expertise of REEEP and CTCN, resulting in an automatic annotation module for Drupal 7 with an accuracy never seen before” states Martin Kaltenböck, Managing Partner of Semantic Web Company [8], which acts as the technology provider behind the module.

“Climate Tagger is the result of a shared commitment to breaking down the ‘information silos’ that exist in the climate compatible development community, and to providing concrete solutions that can be implemented right now, anywhere,” said REEEP Director General Martin Hiller. “Together with CTCN and SWC, we have laid the foundations for a system that can be continuously improved and expanded to bring new sectors, systems and organizations into the climate knowledge community.”

For the Open Data and Linked Open Data communities, a Climate Tagger plugin for CKAN [9] has also been published. It was developed by NREL [10] with CTCN’s support and harnesses the same taxonomy and expert-vetted thesaurus behind Climate Tagger, helping connect open data to climate compatible content through the simultaneous use of these tools.

REEEP Director General Martin Hiller and CTCN Director Jukka Uosukainen will be talking about Climate Tagger at the COP20 side event hosted by the Climate Knowledge Brokers Group in Lima [11], Peru, on Monday, December 1st at 4:45pm.

Further reading and downloads

About REEEP:

REEEP invests in clean energy markets in developing countries to lower CO2 emissions and build prosperity. Based on a strategic portfolio of high-impact projects, REEEP works to generate energy access, improve lives and economic opportunities, build sustainable markets, and combat climate change.

REEEP understands market change from a practice, policy and financial perspective. We monitor, evaluate and learn from our portfolio to understand opportunities and barriers to success within markets. These insights then influence policy, increase public and private investment, and inform our portfolio strategy to build scale within and replication across markets. REEEP is committed to open access to knowledge to support entrepreneurship, innovation and policy improvements to empower market shifts across the developing world.

About the CTCN

The Climate Technology Centre & Network facilitates the transfer of climate technologies by providing technical assistance, improving access to technology knowledge, and fostering collaboration among climate technology stakeholders. The CTCN is the operational arm of the UNFCCC Technology Mechanism and is hosted by the United Nations Environment Programme (UNEP) in collaboration with the United Nations Industrial Development Organization (UNIDO) and 11 independent, regional organizations with expertise in climate technologies.

About Semantic Web Company

Semantic Web Company (SWC, http://www.semantic-web.at) is a technology provider headquartered in Vienna (Austria). SWC supports organizations from all industrial sectors worldwide to improve their information and data management. Their products have outstanding capabilities to extract meaning from structured and unstructured data by making use of linked data technologies.

Martin Kaltenböck

GBPN Knowledge Platform using Semantic Technologies and Linked Open Data launched

The brand-new, web-based GBPN Knowledge Platform was launched on 21 February 2013. It helps the building sector effectively reduce its impact on climate change!

It has been designed as a participative knowledge and data hub for harvesting, sharing and curating best practice policies in building energy performance globally. Available in English and soon in Mandarin, this new web-based tool of the Global Buildings Performance Network (GBPN) aims to stimulate collective research and analysis from experts worldwide to promote better decision-making and help the building sector effectively reduce its impact on climate change. To sustain and accelerate change in the building sector, the GBPN encourages open and transparent access to good-quality, verifiable data. The data can be used and re-used in HTML, PDF and machine-readable raw data (CSV) formats, provided under a Creative Commons Attribution (CC-BY 3.0 FR) license.

The GBPN Knowledge Platform is built on the Drupal CMS and seamlessly connected with the PoolParty Semantic Information Management Platform of Semantic Web Company. The knowledge platform thus makes use of semantic technologies and Linked Open Data (LOD) principles and techniques under the hood. Much of the data available through the various GBPN tools is provided as (linked) open data under a Creative Commons Attribution license. Semantic Web Company is responsible for the conceptual design and technical implementation of the GBPN Knowledge Platform.

The following is an overview and description of the most important features, tools and services of the information management system.

Continue reading

Andreas Blumauer

State-of-the-art Text Mining: PoolParty Extractor 2.1.1 released

PoolParty Extractor (PPX) is part of the PoolParty product family and forms the basis for state-of-the-art text mining applications.

The idea behind PPX is to underpin automatic text mining algorithms with domain-specific knowledge from thesauri and linked data sources. This is the precondition for extracting meaning from unstructured information more precisely and with higher performance. PoolParty Extractor supports the following application scenarios:

  • automatic document categorisation
  • named entity extraction based on concepts from thesauri or other knowledge models
  • text analysis to improve semantic indexing
  • automatic transformation of unstructured text to an RDF based linked data source
  • linking and enrichment of text with structured data from databases or XML-documents
  • extended indexing by using inflected forms of words and by splitting of compound words
  • generation and continuous improvement of thesauri by text corpus analysis
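
To make the thesaurus-based extraction scenario more concrete, here is a minimal Python sketch of concept matching against preferred and alternative labels. It is not PPX code and makes no claim about how PPX works internally: the THESAURUS dictionary, the build_matcher and extract_concepts functions and the sample sentence are hypothetical and exist only for this illustration.

```python
import re

# Hypothetical mini-thesaurus: preferred label -> alternative labels.
THESAURUS = {
    "renewable energy": ["renewables", "clean energy"],
    "photovoltaics": ["solar PV", "PV"],
}


def build_matcher(thesaurus):
    """Compile one regex over all labels and map each label to its concept."""
    label_to_concept = {}
    for preferred, alternatives in thesaurus.items():
        for label in [preferred, *alternatives]:
            label_to_concept[label.lower()] = preferred
    # Longest labels first, so that "solar PV" is tried before "PV".
    labels = sorted(label_to_concept, key=len, reverse=True)
    pattern = r"\b(?:" + "|".join(re.escape(label) for label in labels) + r")\b"
    return re.compile(pattern, re.IGNORECASE), label_to_concept


def extract_concepts(text, thesaurus):
    """Return a dict mapping each concept to its number of label matches."""
    matcher, label_to_concept = build_matcher(thesaurus)
    counts = {}
    for match in matcher.finditer(text):
        concept = label_to_concept[match.group(0).lower()]
        counts[concept] = counts.get(concept, 0) + 1
    return counts


text = "Solar PV and other renewables are central to clean energy policy."
print(extract_concepts(text, THESAURUS))
```

Because every label resolves to its preferred concept, "renewables" and "clean energy" are both counted under "renewable energy", which is the kind of normalisation a thesaurus adds over plain keyword matching.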

PoolParty Extractor can be integrated smoothly with third-party systems such as CMS, DMS, communication platforms and wikis. PPX is fully based on Java and provides an HTTP API. Integrations with SharePoint, Confluence, WordPress and others already exist. Please tell us about your use case!

The latest release 2.1.1 of PPX further extends the capabilities to extract meaning from text with high precision and high performance:

  • Use of tf-idf (term frequency-inverse document frequency)
    • Create a text corpus for tf-idf
    • Use the tf-idf calculation during extraction
    • Corpus / thesaurus alignment
      • Show missing concepts
      • Show unused concepts
  • Use regular expressions to match specific patterns in texts
  • Use parts of the thesaurus as dynamic components for regular expressions
  • Calculate inflected forms (at the moment for German)
    • Word forms are added to the extraction model and used during extraction
    • The list of inflected forms can be imported into the thesaurus
  • Split compound words (at the moment for German)
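
For readers unfamiliar with tf-idf and corpus / thesaurus alignment, the following Python sketch shows the general idea on a toy corpus. It is a generic illustration, not the PPX implementation. The smoothed idf formula, the single-word labels, the min_df cut-off and the reading of "missing" (frequent corpus terms absent from the thesaurus) and "unused" (thesaurus labels absent from the corpus) are assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def document_frequencies(corpus):
    """Count in how many corpus documents each term occurs."""
    df = Counter()
    for doc in corpus:
        df.update(set(tokenize(doc)))
    return df


def tf_idf(text, df, n_docs):
    """Score each term as tf * log((1 + N) / (1 + df)), an assumed smoothing."""
    tokens = tokenize(text)
    if not tokens:
        return {}
    tf = Counter(tokens)
    return {
        term: (count / len(tokens)) * math.log((1 + n_docs) / (1 + df[term]))
        for term, count in tf.items()
    }


def align(df, thesaurus_labels, min_df=2):
    """Compare corpus vocabulary with (single-word) thesaurus labels."""
    labels = {label.lower() for label in thesaurus_labels}
    # Candidate concepts: terms found in several documents but not in the thesaurus.
    missing = sorted(t for t, n in df.items() if n >= min_df and t not in labels)
    # Thesaurus labels that never occur in the corpus.
    unused = sorted(label for label in labels if df[label] == 0)
    return missing, unused


# Toy corpus and thesaurus, invented for this example.
corpus = [
    "Solar power and wind power reduce emissions",
    "Wind turbines supply clean power",
    "Emissions trading supports climate policy",
]
df = document_frequencies(corpus)
scores = tf_idf(corpus[0], df, len(corpus))
missing, unused = align(df, ["solar", "wind", "biomass"])
print(sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:3])
print("missing:", missing, "unused:", unused)
```

In this toy corpus "solar" scores higher than "wind" for the first document because "wind" also appears in a second document. The alignment step proposes "emissions" and "power" as candidate concepts and flags "biomass" as a label that never occurs.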

PPX can be tested online as a web service. Please send us a short note describing your interest and we will provide further details.