OWL research at the University of Manchester

Joint research by members of the Information Management Group and the Bio-Health Informatics Group.

Ontology Diffing

Project Description

Over the past few years there have been considerable advances in ontology diffing, resulting in a variety of services for syntactic or semantic change detection. These typically draw the basic distinction between additions and removals (whether syntactic or semantic), and then align axiom changes with the class names found on the left-hand side of the changed axiom. However, no further characterisation of changes is typically carried out, e.g., whether a change has a logical effect (it is effectual) or not (it is ineffectual). Indeed, semantic diffs treat all ineffectual changes as negligible, even though knowing the proportion of effectual versus ineffectual changes gives a more accurate idea of how much change (and effort) an ontology has undergone. Furthermore, these diffs lack a standard (and essential) feature of a diff: the alignment between the source and target of a change. Such data can be collected at development time via edit-based diffs, as implemented in Swoop, but where no such change records exist an exact post facto change analysis is near-impossible.

We have developed a diff notion, and an associated tool, ecco, that combines structural and semantic techniques to, firstly, distinguish which additions and removals (obtained via structural difference, based on OWL’s notion of structural equivalence) are effectual or ineffectual and, secondly, find the source of each change (where attainable). This in turn allows us to categorise changes between two ontologies and to align the source of each change with its target. The categories follow from the impact a change can have: for example, further constraining an axiom makes it “stronger”, and our categorisation makes the relation between this stronger axiom and its preceding version explicit, which the tool then presents accordingly.
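
The two steps described above can be illustrated with a minimal sketch. It is not ecco's implementation: it assumes a toy fragment in which every axiom has the form SubClassOf(A, B1 and ... and Bn) over atomic class names, so that entailment reduces to graph reachability instead of a call to a description logic reasoner. The names `axiom`, `entails` and `diff`, the report keys, and the example classes are all hypothetical. An addition is treated as effectual when the first ontology does not already entail it, a removal as effectual when the second ontology no longer entails it, and an effectual addition is aligned with a removed axiom as a strengthening when it entails that axiom but not vice versa.

```python
from collections import deque


def axiom(sub, *supers):
    """Toy axiom: SubClassOf(sub, ObjectIntersectionOf(supers)), atomic names only."""
    return (sub, frozenset(supers))


def entails(ontology, ax):
    """Toy entailment: every conjunct of the superclass is reachable from the subclass."""
    sub, supers = ax
    seen, queue = {sub}, deque([sub])
    while queue:
        current = queue.popleft()
        for s, sups in ontology:
            if s == current:
                for sup in sups - seen:  # only follow classes not visited yet
                    seen.add(sup)
                    queue.append(sup)
    return supers <= seen


def diff(o1, o2):
    """Structural difference, then an effectual/ineffectual split, then alignment."""
    additions, removals = o2 - o1, o1 - o2
    report = {
        "effectual_additions": {a for a in additions if not entails(o1, a)},
        "ineffectual_additions": {a for a in additions if entails(o1, a)},
        "effectual_removals": {r for r in removals if not entails(o2, r)},
        "ineffectual_removals": {r for r in removals if entails(o2, r)},
    }
    # Align each effectual addition with a removed axiom that it strictly strengthens.
    report["strengthenings"] = {
        (new, old)
        for new in report["effectual_additions"]
        for old in removals
        if entails({new}, old) and not entails({old}, new)
    }
    return report


# Hypothetical example: version 2 strengthens the Cat axiom and adds a redundant Dog axiom.
o1 = {axiom("Cat", "Mammal"), axiom("Mammal", "Animal"), axiom("Dog", "Mammal")}
o2 = {
    axiom("Cat", "Mammal", "Pet"),
    axiom("Mammal", "Animal"),
    axiom("Dog", "Mammal"),
    axiom("Dog", "Animal"),
}
report = diff(o1, o2)
```

In this example the new Cat axiom is an effectual addition aligned with the removed Cat axiom as its source, whereas the added Dog axiom is ineffectual because the first version already entails it.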

Tool Support

ecco is available as a standalone tool and as a Web-based application. It is implemented in Java 7 and relies on the OWL API and the HermiT and FaCT++ description logic reasoners. Several computationally challenging steps (e.g., justification finding and change categorisation) are carried out in parallel, taking advantage of concurrency features that Java 7 introduces or augments. The standalone version therefore requires Java 7.
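
The reason such steps parallelise well is that each change can be processed independently of the others. The sketch below shows that pattern only; it is written in Python rather than Java, it is not taken from ecco, and `categorise`, `categorise_all`, the worker count and the example changes are hypothetical placeholders for an expensive per-change step such as a reasoner call.

```python
from concurrent.futures import ThreadPoolExecutor


def categorise(change):
    """Hypothetical stand-in for an expensive per-change step (e.g. a reasoner call)."""
    kind, axiom = change
    label = "addition" if kind == "+" else "removal"
    return axiom, label


def categorise_all(changes, workers=4):
    """Run one task per change; map() returns results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(categorise, changes))


# Hypothetical changes, written as (kind, axiom) pairs.
changes = [("+", "SubClassOf(Cat Pet)"), ("-", "SubClassOf(Dog Pet)")]
results = categorise_all(changes)
```

For CPU-bound work in pure Python, a process pool would be the closer analogue of a multi-threaded Java implementation, since Python threads do not execute bytecode in parallel.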

In both the standalone and the Web-based versions, the output of the tool is the same: an XML diff report and (optionally, for the standalone tool) a transformation of that report into HTML. The Web-based front end contains a variety of pre-computed examples, including several diffs between versions of the NCI Thesaurus.

Papers

  1. Analysing the evolution of the NCI Thesaurus. In Proceedings of the 24th IEEE International Symposium on Computer-Based Medical Systems (CBMS), 2011. [pdf]
  2. Categorising logical differences between OWL ontologies. In Proceedings of the 20th ACM Conference on Information and Knowledge Management (CIKM), 2011. [pdf]
  3. Ecco: A Hybrid Diff Tool for OWL 2 Ontologies. In Proceedings of the 9th International Workshop on OWL: Experiences and Directions (OWLED), 2012. [pdf]