Converting an existing taxonomic data resource to employ an ontology and LSIDs.

  Data sharing is fundamental to biodiversity and taxonomic data applications; however, previous attempts at developing mechanisms to facilitate sharing within the community have had limited effect. Reasons for this include the lack of uptake of data exchange standards (uptake is now slowly happening due to the TDWG standards initiative), the absence of a common terminology or vocabulary for use within taxonomic data, and the lack of reference database systems for serving and referring to authoritative data. In an attempt to improve this situation, a Core Ontology for taxonomic data has been developed to model the entities widely used in taxonomy in an independent manner and allow their reuse for different taxonomic purposes. In addition, Life Science Identifiers (LSIDs) have been proposed by the TDWG GUID working group as the means for uniquely identifying taxonomic data objects, such as specimens, taxonomic names and taxonomic concepts. The LSIDs can make use of a Core Ontology or a Domain Ontology derived from the Core in order to define the data to be returned from resolving an LSID. These data are expressed in RDF, a language central to the semantic web. For this approach to be effective it is essential that a mechanism exists for migrating existing data to the new technologies, e.g. LSIDs and RDF using a Core Ontology. However, using LSIDs per se will not address the issue of data sharing unless repositories reuse LSIDs to cross-reference data internally and externally. It is important that taxonomists use the same LSID to refer to the same taxonomic entity rather than have multiple LSIDs identifying the same entity. If multiple LSIDs were used, we would need to decide whether two LSIDs really referred to the same thing, leaving us in a situation similar to today's, where we are trying to decide whether two taxonomic names are really the same. Generating LSIDs for any self-contained data set is trivial.
It is a challenge, however, to allocate LSIDs to data: to determine when a new LSID may be generated because the data are owned by a specific repository, and when an LSID should be acquired from an external database that serves as an authority for the data. This presentation will report on the migration of the Hexacorallians of the World to a domain ontology derived from the proposed TDWG core ontology. The ontologies are represented in RDF and the data were cross-referenced using LSIDs. The focus is on the development of a tool to aid the process of converting internal database keys to LSIDs. These LSIDs may be generated automatically for data owned by the repository or appropriated from some external LSID authority. The provision of such a tool will help domain scientists publish their data in a manner that enables better discovery, reuse and cross-referencing using LSIDs.
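
The conversion of internal database keys to LSIDs described in the abstract can be illustrated with a minimal sketch. It is not the tool reported in the presentation. It assumes the general LSID form urn:lsid:authority:namespace:object[:revision], and the authority names, table names and keys (example.org, names.example.net, "specimen", "name") are hypothetical. An externally supplied LSID is reused where one is known; otherwise a local LSID is generated from the table name and key.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# Hypothetical authority for the local repository (an assumption, not from the source).
LOCAL_AUTHORITY = "example.org"


def make_lsid(authority: str, namespace: str, object_id: str,
              revision: Optional[str] = None) -> str:
    """Build an LSID of the form urn:lsid:authority:namespace:object[:revision]."""
    parts = ["urn", "lsid", authority, namespace, object_id]
    if revision is not None:
        parts.append(revision)
    return ":".join(parts)


@dataclass
class LsidAllocator:
    """Maps (table, internal key) pairs to LSIDs, reusing external LSIDs when known."""
    # LSIDs already issued by an external authority, keyed by (table, internal key).
    external: Dict[Tuple[str, int], str] = field(default_factory=dict)
    # Every LSID handed out so far, so repeated lookups return the same identifier.
    assigned: Dict[Tuple[str, int], str] = field(default_factory=dict)

    def lsid_for(self, table: str, key: int) -> str:
        record = (table, key)
        if record not in self.assigned:
            # Prefer the external authority's LSID; otherwise mint a local one.
            self.assigned[record] = self.external.get(record) or make_lsid(
                LOCAL_AUTHORITY, table, str(key)
            )
        return self.assigned[record]


allocator = LsidAllocator(
    external={("name", 42): "urn:lsid:names.example.net:name:9001"}
)
print(allocator.lsid_for("specimen", 7))  # locally owned record
print(allocator.lsid_for("name", 42))     # record owned by an external authority
```

Caching each assignment means the same internal key always yields the same LSID, which reflects the abstract's point that one entity should not end up with multiple identifiers.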

  • Date:

    31 December 2006


Kennedy, J., Gales, R., & Kukla, R. (2006). Converting an existing taxonomic data resource to employ an ontology and LSIDs. In L. Belbin, A. Rissoné, & A. Weitzman (Eds.), Proceedings of TDWG (2006), St. Louis, MO.



Biodiversity; Taxonomy; Data sharing; Data exchange standards; Core Ontology; Life science identifiers; RDF

