New Publication from the Ontologies Community of Practice: A CGIAR Initiative for Big Data in Agrifood Systems
The Ontologies Community of Practice (CoP) has published a paper on recent developments and on how the CoP contributes to identifying innovative solutions that support quality data labeling. The article also lists key quality criteria to consider before selecting an ontology in agriculture.
Labeling data with widely used, high-quality vocabularies secures its online findability, reusability, interoperability, and reliable interpretation, as well as its processing by machine learning techniques. This paper presents the added value of the Ontologies Community of Practice (CoP) for harnessing relevant expertise in ontology development and identifying innovative solutions that support quality data labeling.
The description of the CoP’s products (Crop Ontology, Agronomy Ontology, Socio Economic Ontology, etc.) that contribute to the global semantic framework, together with recommendations for improving the use of ontological terms, demonstrates the added value of the CoP for producing interoperable multidisciplinary agrifood data. A practical example describes the experience of Dr. Berta Miro, of the International Rice Research Institute (IRRI), in selecting ontological terms to annotate public data sets on the water submergence tolerance of rice varieties. This example illustrates the use of ontology in the data labeling workflow described in the figure.
This paper notably includes the key quality criteria to consider when selecting an ontology in agriculture. The list was provided by the CoP’s experts in a webinar:
No. | Criteria Classified by the Expert Panel |
1 | Adhere to the Open Biological and Biomedical Ontology (OBO) Foundry guidelines |
2 | Represent a unique non-overlapping knowledge domain (a.k.a. orthogonality) |
3 | Willingness to express and integrate multiple, evidence-based classification systems in the chosen domain |
4 | Logically-structured with a well-defined scope |
5 | May contain relationships and dependencies to other reference ontologies |
6 | Represent accurate science supported by evidence |
7 | Open source and Creative Commons CC-BY or CC-0 license |
8 | Must be widely used in annotation and data capture |
9 | Support both inter- and intra-specific needs with species agnostic (core) and specific (extensions) resources that work together |
10 | Sustainable funding sources |
11 | Human resources to manage (i.e. curators, editors, and developers) |
12 | Established ontology management system including roles and responsibility |
13 | Must be designed to answer both the computing and community needs |
14 | Must explicitly identify the communities of reference |
15 | Centralized maintenance of the validated content, and distributed contribution and access |
16 | Ontology quality assurance by experts in the field of knowledge |
17 | Reducing reliance on internal processes and data stewardship networks |
The five top-ranking criteria selected by data managers were: (1) the domain-specific coverage of the ontology; (2) wide use of the ontology in annotation and data capture; (3) the availability of indicators used for quality assurance; (4) central maintenance of the ontology with distributed user contributions; and (5) the existence of sustainable funding to support the ontology.
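As an illustration only, the five top-ranking criteria can be treated as a simple screening checklist. The sketch below is not taken from the paper: the criterion keys, the `Candidate` class, and the function names are hypothetical, and whether an ontology meets a criterion is assumed to be judged by a human reviewer beforehand.

```python
from dataclasses import dataclass

# Short keys for the five top-ranking criteria, in ranked order.
# The key names are hypothetical labels chosen for this sketch.
TOP_CRITERIA = (
    "domain_coverage",      # (1) domain-specific coverage
    "widely_used",          # (2) wide use in annotation and data capture
    "quality_indicators",   # (3) indicators available for quality assurance
    "central_maintenance",  # (4) central maintenance, distributed contributions
    "sustainable_funding",  # (5) sustainable funding
)


@dataclass
class Candidate:
    name: str        # placeholder label for an ontology under review
    met: frozenset   # criterion keys a reviewer judged as satisfied


def unmet_criteria(candidate):
    """Return the top-five criteria the candidate does not satisfy, in ranked order."""
    return [c for c in TOP_CRITERIA if c not in candidate.met]


def rank(candidates):
    """Order candidates so that those with the fewest unmet criteria come first."""
    return sorted(candidates, key=lambda c: len(unmet_criteria(c)))
```

A reviewer could call `unmet_criteria` on each candidate to see which gaps remain, and `rank` to order a shortlist. All five criteria are weighted equally here, which is a simplifying assumption rather than a recommendation from the CoP.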
We stress the role of the CoP in regularly conveying the needs of ontology curators and data managers to the developers of ontology lookup services such as the EMBL-EBI Ontology Lookup Service, Planteome, and AgroPortal, and of annotation tools such as COPO. The CoP plans to contribute further to priority ontologies for livestock, fisheries and aquaculture, water management, food systems, and value chains, and will continue integrating concepts on agriculture and food systems into the SDG Interface Ontology (SDGiO). The CoP will also stimulate collaboration on the development of knowledge graphs in agriculture that support graph databases, a domain in which the agrifood industry has made rapid progress.
Citation:
Arnaud E. et al. (2020). The Ontologies Community of Practice: A CGIAR Initiative for Big Data in Agrifood Systems. Patterns, Vol. 1, Issue 7. DOI: https://doi.org/10.1016/j.patter.2020.100105
November 20, 2020
Elizabeth Arnaud
Ontologies Community of Practice Lead
Alliance of Bioversity International and CIAT