Thomas Thurner

Energy Buildings Performance Scenarios as Linked Open Data

The reduction of greenhouse gas emissions is one of the big global challenges of the coming decades. (Linked) Open Data on this multi-domain challenge is key to addressing the issues in policy, construction, energy efficiency, production and the like. Today – on World Environment Day 2014 – a new (linked open) data initiative contributes to this effort: GBPN’s Data Endpoint for Building Energy Performance Scenarios.

GBPN (The Global Buildings Performance Network) provides the full data set of a recent global scenario analysis for saving energy in the building sector worldwide, projected from 2005 to 2050. The multidimensional dataset includes parameters like housing types, building vintages and energy uses for various climate zones and regions, and is freely available for full use and re-use as open data under the CC-BY 3.0 France license.

To make exploration easy, the Semantic Web Company has developed an interactive query and filtering tool that allows users to create graphs and tables by slicing this multidimensional data cube. Selected results can be exported as open data in the open formats RDF and CSV, and can also be queried via a provided SPARQL endpoint (a semantic-web-based data API). A built-in query builder makes using, learning and understanding SPARQL easy – for advanced users as well as for non-experts and beginners.
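
For those who want to work with the SPARQL endpoint programmatically rather than through the filtering tool, the following is a minimal sketch in Python using SPARQLWrapper. The endpoint URL is a placeholder, and the query only assumes that the cube is published with the standard RDF Data Cube vocabulary.

```python
# Minimal sketch: fetch a handful of observations from the scenario data cube.
# The endpoint URL below is a placeholder, not the actual GBPN endpoint.
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "http://example.org/gbpn/sparql"  # placeholder

sparql = SPARQLWrapper(ENDPOINT)
sparql.setReturnFormat(JSON)
sparql.setQuery("""
    PREFIX qb: <http://purl.org/linked-data/cube#>
    SELECT ?obs ?dimension ?value WHERE {
        ?obs a qb:Observation ;
             ?dimension ?value .
    }
    LIMIT 10
""")

for row in sparql.query().convert()["results"]["bindings"]:
    print(row["obs"]["value"], row["dimension"]["value"], row["value"]["value"])
```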


The LOD-based information and data system is part of the Semantic Web Company’s recent PoolParty Semantic Drupal developments. It is based on OpenLink’s Virtuoso 7 QuadStore, which holds and calculates ~235 million triples, and it makes use of the RDF ETL tool UnifiedViews as well as D2R Server for RDF conversion. The underlying GBPN ontology runs on PoolParty 4.2 and also serves a powerful domain-specific news aggregator realized with SWC’s sOnr webminer.

Together with other energy-efficiency-related Linked Open Data initiatives such as REEEP, NREL and BPIE, GBPN’s recent initiative contributes to a broader availability of data supporting action against global warming – as Dr. Peter Graham, Executive Director of GBPN, emphasized: “…data and modelling of building energy use has long been difficult or expensive to access – yet it is critical to policy development and investment in low-energy buildings. With the release of the BEPS open data model, GBPN are providing free access to the world’s best aggregated data analyses on building energy performance.”

The Linked Open Data (LOD) is modelled using the RDF Data Cube Vocabulary (a W3C recommendation), with 17 dimensions in the cube. In total, 235 million triples are available in RDF, including links to DBpedia and GeoNames that connect indicators such as years, climate zones, regions, building types and user scenarios.
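
To make the modelling more concrete, here is a sketch of what a single observation in such a data cube could look like, built with rdflib. The namespace, dimension properties and values are invented for illustration and do not reproduce the actual GBPN ontology or its 17 dimensions.

```python
# Illustrative only: one qb:Observation with a few made-up dimensions,
# including a link to DBpedia as used for regions.
from rdflib import Graph, Namespace, Literal
from rdflib.namespace import RDF, XSD

QB  = Namespace("http://purl.org/linked-data/cube#")
EX  = Namespace("http://example.org/gbpn/")           # placeholder namespace
DBR = Namespace("http://dbpedia.org/resource/")

g = Graph()
g.bind("qb", QB)

obs = EX["obs/2030-residential-europe"]
g.add((obs, RDF.type, QB.Observation))
g.add((obs, EX.year, Literal("2030", datatype=XSD.gYear)))   # time dimension
g.add((obs, EX.buildingType, EX.Residential))                # building type dimension
g.add((obs, EX.region, DBR.Europe))                          # link into DBpedia
g.add((obs, EX.finalEnergyDemand, Literal(123.4)))           # measure (made-up value)

print(g.serialize(format="turtle"))
```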

Tassilo Pellegrini

Linked Data in the Content Value Chain or Why Dynamic Semantic Publishing makes sense …

In 2012 Jem Rayfield published an insightful post about the BBC’s Linked Data strategy during the Olympic Games. In this post he coined the term “Dynamic Semantic Publishing”, referring to

“the technology strategy the BBC Future Media department is using to evolve from a relational content model and static publishing framework towards a fully dynamic semantic publishing (DSP) architecture.”

Rayfield characterizes this approach as follows:

“a technical architecture that combines a document/content store with a triple-store proves an excellent data and metadata persistence layer for the BBC Sport site and indeed future builds including BBC News mobile.”

He further describes the technological characteristics as follows (a minimal sketch of the combined triple-store/document-store pattern appears after the list):

  • A triple-store that provides a concise, accurate and clean implementation methodology for describing domain knowledge models.
  • An RDF graph approach that provides ultimate modelling expressivity, with the added advantage of deductive reasoning.
  • SPARQL to simplify domain queries, with the associated underlying RDF schema being more flexible than a corresponding SQL/RDBMS approach.
  • A document/content store that provides schema flexibility; schema independent storage; versioning, and search and query facilities across atomic content objects.
  • Combining a model expressed as RDF to reference content objects in a scalable document/content-store provides a persistence layer that uses the best of both technical approaches.
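
To make the combination of a triple-store and a document/content store a bit more tangible, here is a minimal, hypothetical sketch of the pattern in Python (the endpoint, property and content identifiers are invented; this is not the BBC’s actual implementation). The triple-store answers the question “which content items are about concept X?”, and the document store returns the content bodies for those items.

```python
# Hypothetical sketch of the "triple-store + document store" pattern.
from SPARQLWrapper import SPARQLWrapper, JSON

SPARQL_ENDPOINT = "http://example.org/sparql"    # placeholder
DOCUMENT_STORE = {                               # stand-in for a real document store
    "http://example.org/content/42": {"headline": "Match report", "body": "..."},
}

def content_about(concept_uri):
    """Return content items annotated with the given concept."""
    sparql = SPARQLWrapper(SPARQL_ENDPOINT)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(f"""
        PREFIX dct: <http://purl.org/dc/terms/>
        SELECT ?content WHERE {{ ?content dct:subject <{concept_uri}> . }}
    """)
    bindings = sparql.query().convert()["results"]["bindings"]
    ids = [b["content"]["value"] for b in bindings]
    return [DOCUMENT_STORE[i] for i in ids if i in DOCUMENT_STORE]

for doc in content_about("http://dbpedia.org/resource/2012_Summer_Olympics"):
    print(doc["headline"])
```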

So what are the actual benefits of Linked Data from a non-technical perspective?

Benefits of Linked (Meta)Data

Semantic interoperability is crucial in building cost-efficient IT systems that integrate numerous data sources. Since 2009 the Linked Data paradigm has emerged as a lightweight approach to improve data portability in federated IT systems. By building on Semantic Web standards, the Linked Data approach offers significant benefits compared to conventional data integration approaches. According to Auer [1], these are (a small illustration follows the list):

  • De-referencability. IRIs are not just used for identifying entities, but since they can be used in the same way as URLs they also enable locating and retrieving resources describing and representing these entities on the Web.
  • Coherence. When an RDF triple contains IRIs from different namespaces in subject and object position, this triple basically establishes a link between the entity identified by the subject (and described in the source dataset using namespace A) with the entity identified by the object (described in the target dataset using namespace B). Through these typed RDF links, data items are effectively interlinked.
  • Integrability. Since all Linked Data sources share the RDF data model, which is based on a single mechanism for representing information, it is very easy to attain a syntactic and simple semantic integration of different Linked Data sets. A higher-level semantic integration can be achieved by employing schema and instance matching techniques and expressing found matches again as alignments of RDF vocabularies and ontologies in terms of additional triple facts.
  • Timeliness. Publishing and updating Linked Data is relatively simple, thus facilitating timely availability. In addition, once a Linked Data source is updated, it is straightforward to access and use the updated data source, since time-consuming and error-prone extraction, transformation and loading is not required.
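
A short illustration of the de-referencability and coherence principles, assuming only that DBpedia serves RDF descriptions via content negotiation: the IRI that identifies an entity can itself be fetched to obtain machine-readable data about it, and the triples pointing into other namespaces are exactly the typed links that interlink datasets.

```python
# De-referencing a Linked Data IRI: the identifier doubles as a locator.
from rdflib import Graph

iri = "http://dbpedia.org/resource/Berlin"

g = Graph()
g.parse(iri)   # rdflib negotiates an RDF representation and follows redirects
print(len(g), "triples retrieved about", iri)

# A few of the retrieved statements; objects in foreign namespaces
# (e.g. GeoNames) are the links that give the Web of Data its coherence.
for s, p, o in list(g)[:5]:
    print(p, o)
```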

On top of these technological principles Linked Data promises to improve the reusability and richness (in terms of depth and broadness) of content thus adding significant value to the content value chain.

Linked Data in the Content Value Chain

According to Cisco, communication within electronic networks has become increasingly content-centric: for the period from 2011 to 2016, Cisco reports growth of 90% for video content, 76% for gaming content, 36% for VoIP and 36% for file sharing transmitted electronically. Hence it is legitimate to ask what role Linked Data plays in the content production process. Here we can distinguish five sequential steps: 1) content acquisition, 2) content editing, 3) content bundling, 4) content distribution and 5) content consumption. As illustrated in the figure below, Linked Data can contribute to each step by supporting the associated intrinsic production function [2].

Figure: Linked Data in the Content Value Chain

  • Content acquisition is mainly concerned with the collection, storage and integration of relevant information necessary to produce a content item. In the course of this process, information is pooled from internal or external sources for further processing.
  • The editing process entails all necessary steps that deal with the semantic adaptation, interlinking and enrichment of data. Adaptation can be understood as a process in which acquired data is provided in a form that can be re-used within editorial processes. Interlinking and enrichment are often performed via processes like annotation and/or referencing, which enrich documents either by disambiguating existing concepts or by providing background knowledge for deeper insights (a toy sketch of such an annotation step follows this list).
  • The bundling process is mainly concerned with the contextualisation and personalisation of information products. It can be used to provide customized access to information and services, e.g. by using metadata for the device-sensitive delivery of content, or to compile thematically relevant material into landing pages or dossiers, thus improving the navigability, findability and reuse of information.
  • In a Linked Data environment the process of content distribution mainly deals with the provision of machine-readable and semantically interoperable (meta-)data via Application Programming Interfaces (APIs) or SPARQL endpoints. These can be designed either to serve internal purposes, so that data can be reused within controlled environments (e.g. within or between organizational units), or external purposes, so that data can be shared with anonymous users (e.g. as open SPARQL endpoints on the Web).
  • The last step in the content value chain deals with content consumption. This entails any means that enable a human user to search for and interact with content items in a pleasant and purposeful way. According to this view, this step mainly concerns end-user applications that make use of Linked Data to provide access to content items (e.g. via search or recommendation engines) and to generate deeper insights (e.g. by providing reasonable visualizations).
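
As a toy illustration of the annotation-based enrichment mentioned in the editing step above, the following sketch maps entity mentions in a text to Linked Data IRIs and stores the links as metadata for the content item. A hand-written lookup table stands in for a real entity-extraction service, and all identifiers except the DBpedia IRIs are invented.

```python
# Toy enrichment step: annotate a content item with Linked Data IRIs.
from rdflib import Graph, Namespace, URIRef, Literal
from rdflib.namespace import DCTERMS

EX = Namespace("http://example.org/content/")   # placeholder content namespace

# Stand-in for a real entity-extraction / annotation service
GAZETTEER = {
    "Berlin": "http://dbpedia.org/resource/Berlin",
    "BBC": "http://dbpedia.org/resource/BBC",
}

def annotate(content_id, text):
    g = Graph()
    item = EX[content_id]
    g.add((item, DCTERMS.title, Literal(text[:40])))
    for mention, iri in GAZETTEER.items():
        if mention in text:
            g.add((item, DCTERMS.subject, URIRef(iri)))   # the enrichment link
    return g

print(annotate("42", "A report filed from Berlin by the BBC").serialize(format="turtle"))
```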

Conclusion

There is definitely a place for Linked Data in the content value chain, hence we can expect that Dynamic Semantic Publishing is here to stay. Linked Data can add significant value to the content production process and carries the potential to incrementally expand the business portfolio of publishers and other content-centric businesses. But the concrete added value is highly context-dependent and open to discussion. Technological feasibility is easily contradicted by strategic business considerations, a lack of cultural adaptability, legacy issues like dual licensing, technological path dependencies or simply a lack of resources. Nevertheless, Linked Data should be considered a fundamental principle of next-generation content management, as it provides a radically new environment for value creation.

More about the topic – live

Linked Data in the content value chain is also one of the topics on the agenda of this year’s SEMANTiCS 2014. Listen to keynote speaker Sofia Angeletou and others to learn more about next-generation content management.

References

[1] Auer, Sören (2011). Creating Knowledge Out of Interlinked Data. In: Proceedings of WIMS’11, May 25-27, 2011, pp. 1-8.

[2] Pellegrini, Tassilo (2012). Integrating Linked Data into the Content Value Chain: A Review of News-related Standards, Methodologies and Licensing Requirements. In: Presutti, Valentina; Pinto, Sofia S.; Sack, Harald; Pellegrini, Tassilo (eds.): Proceedings of I-SEMANTICS 2012, 8th International Conference on Semantic Systems. ACM International Conference Proceeding Series, pp. 94-102.

Andreas Blumauer

Why SKOS should be a focal point of your linked data strategy


The Simple Knowledge Organization System (SKOS) has become one of the ‘sweet spots’ in the linked data ecosystem in recent years. Especially when semantic web technologies are adapted to the requirements of enterprises or public administration, SKOS has played a central role in creating knowledge graphs.

In this webinar, key people from the Semantic Web Company will describe why controlled vocabularies based on SKOS play a central role in a linked data strategy, and how SKOS can be enriched by ontologies and linked data to further improve semantic information management.

SKOS unfolds its potential at the intersection of three disciplines and their methods:

  • library sciences: taxonomy and thesaurus management
  • information sciences: knowledge engineering and ontology management
  • computational linguistics: text mining and entity extraction

Linked Data-based IT architectures cover all three aspects and provide means for agile data, information, and knowledge management.
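
As a small, self-contained illustration of what such a SKOS-based vocabulary looks like in practice, here is a sketch built with rdflib; the concepts and namespace are invented for the example.

```python
# Two SKOS concepts with multilingual labels and a broader/narrower relation.
from rdflib import Graph, Namespace, Literal
from rdflib.namespace import RDF, SKOS

EX = Namespace("http://example.org/thesaurus/")   # placeholder vocabulary namespace

g = Graph()
g.bind("skos", SKOS)

g.add((EX.Energy, RDF.type, SKOS.Concept))
g.add((EX.Energy, SKOS.prefLabel, Literal("Energy", lang="en")))
g.add((EX.Energy, SKOS.prefLabel, Literal("Energie", lang="de")))

g.add((EX.RenewableEnergy, RDF.type, SKOS.Concept))
g.add((EX.RenewableEnergy, SKOS.prefLabel, Literal("Renewable energy", lang="en")))
g.add((EX.RenewableEnergy, SKOS.broader, EX.Energy))   # hierarchical relation

print(g.serialize(format="turtle"))
```

Because the same graph can later be enriched with additional ontologies or mapped to other vocabularies, starting with plain SKOS keeps the entry barrier low while leaving room to grow.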

In this webinar, you will learn about the following questions and topics:

  • How does SKOS build the foundation of enterprise knowledge graphs that can be enriched by additional vocabularies and ontologies?
  • How can knowledge graphs be used to build the backbone of metadata services in organisations?
  • How can text mining be used to create high-quality taxonomies and thesauri?
  • How can knowledge graphs be used for enterprise information integration?

Based on the PoolParty Semantic Suite, you will see several live demos of end-user applications built on linked data, as well as PoolParty’s latest release, which provides outstanding facilities for professional linked data management, including taxonomy, thesaurus and ontology management.

Register here: https://www4.gotomeeting.com/register/404918583

 

Christian Mader

Online checker for SKOS vocabularies now available

Create better SKOS vocabularies
The PoolParty team is pleased to announce the availability of the new online vocabulary quality checker for SKOS vocabularies. It finds over 20 kinds of potential quality problems in controlled vocabularies expressed using SKOS. The service is based on the qSKOS open-source tool.
The main features of the service are:
  • No need for registration – you can log in with your existing Google, Xing, LinkedIn or Twitter account
  • Upload and check as many vocabularies as you like (100 MB maximum size per vocabulary)
  • Access reports of the quality checks for all uploaded vocabularies
  • Quality reports can also be sent to you by email, if you wish
The PoolParty team asks for feedback and suggestions on the service!

Either contact support@poolparty.biz or fill in our feedback form!
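
To give an impression of what such a quality check looks like under the hood, here is a minimal sketch of one typical test – concepts without any skos:prefLabel – written against rdflib. It is only an illustration of the idea, not how qSKOS itself is implemented.

```python
# Illustrative SKOS quality check: find concepts lacking a skos:prefLabel.
from rdflib import Graph

g = Graph()
g.parse("my_vocabulary.ttl", format="turtle")   # path is a placeholder

query = """
    PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
    SELECT ?concept WHERE {
        ?concept a skos:Concept .
        FILTER NOT EXISTS { ?concept skos:prefLabel ?label }
    }
"""

for row in g.query(query):
    print("Concept without prefLabel:", row.concept)
```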