Update from the livestock ontologies working group
By Vanessa Meadu
How do we present livestock data in a consistent manner? This is the central question for the Livestock Ontologies working group, convened under the Livestock Data for Decisions (LD4D) community of practice. The group, which brings together people working on different aspects of livestock ontologies, recently met to catch up, share challenges and innovations, and lay the groundwork for 2021 and beyond.
Watch the meeting video
Updates from group members
Alexander Robertson from the University of Edinburgh’s Bayes Centre presented a pilot project to automatically tag content on livestockdata.org. Watch the presentation.
- It is a precursor to building something more vocabulary- or ontology-based.
- The goal is to offer users a way to explore the data and show how datasets are connected.
- Alex has extracted the full text of the site and is producing the top five descriptive key terms for each document. These can be turned into tags.
- Once tags are set, you can look for documents that have similar tags.
- The next step is to bring in a domain expert and ask them to validate the tags.
- A further step is to work on making the site search more meaningful – for example returning results with synonyms rather than exact text matches.
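The tagging idea above can be sketched in a few lines. The summary does not say which method the pilot uses, so this sketch assumes TF-IDF scoring to pick each document's top terms and Jaccard overlap to compare tag sets. The stop-word list, document names and texts are all invented for illustration.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not the pilot's list).
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "for", "on", "is", "are", "with"}


def tokenize(text):
    """Lower-case the text and keep alphabetic tokens that are not stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def top_terms(docs, k=5):
    """Return, for each document, a set of its k highest-scoring TF-IDF terms."""
    tokenized = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(tokenized)
    # Document frequency: the number of documents each term appears in.
    doc_freq = Counter(term for tokens in tokenized.values() for term in set(tokens))
    tags = {}
    for name, tokens in tokenized.items():
        counts = Counter(tokens)
        scores = {
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
        # Highest score first; ties are broken alphabetically so output is stable.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        tags[name] = set(ranked[:k])
    return tags


def jaccard(a, b):
    """Share of tags two documents have in common (0 = none, 1 = identical)."""
    return len(a & b) / len(a | b) if a | b else 0.0


# Hypothetical documents, invented for this example.
docs = {
    "poultry_feed": "Poultry feed prices and poultry production in Ethiopia",
    "poultry_health": "Poultry vaccination and poultry disease surveillance",
    "dairy_yield": "Dairy cattle milk yield and dairy herd size",
}
tags = top_terms(docs, k=5)
feed_vs_health = jaccard(tags["poultry_feed"], tags["poultry_health"])
feed_vs_dairy = jaccard(tags["poultry_feed"], tags["dairy_yield"])
```

Here the two poultry documents share a tag and so score higher than the poultry and dairy pair. A domain expert reviewing the sets in `tags` would correspond to the validation step described above.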
Marie-Christine Salaün, Sophie Aubin, Jérome Bugeon, Catherine Hurtaud, and Matthieu Reichstadt from INRAE presented an update on their work on established livestock ontologies: ATOL (Animal Trait Ontology for Livestock), EOL (Environment Ontology for Livestock) and AHOL (Animal Health Ontology for Livestock). Watch their presentation.
- In 2020 the team reviewed each of the three ontology bundles and homogenised the terms and definitions, revised hierarchies, improved synonyms and added new relevant traits and environmental parameters.
- From 2021, the team will work on EU infrastructure projects for the European cattle and pig sectors, as well as a product quality project (INTAQT).
- AHOL now has 177 diseases and the hierarchy has been enriched with new branches. Next they will work on programmes on health management of experimental herds and veterinary diagnosis decision support.
- The three ontologies are managed through a web-based application. Only curators can access the back end to add, edit and delete traits, and to version the ontologies. The front end is open access and allows users to browse, filter and export detailed ontologies and traits. The ontologies are stored in MySQL relational databases.
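To illustrate how a trait hierarchy can live in a relational database, here is a minimal sketch. The team's actual schema was not described, so the table, columns and trait labels below are hypothetical, and SQLite (via Python's standard library) stands in for MySQL so the example runs without a server.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE trait (
        id        INTEGER PRIMARY KEY,
        label     TEXT NOT NULL,
        parent_id INTEGER REFERENCES trait(id)  -- NULL for a root term
    );
    INSERT INTO trait (id, label, parent_id) VALUES
        (1, 'example root trait', NULL),
        (2, 'example growth trait', 1),
        (3, 'example body weight trait', 2),
        (4, 'example feed intake trait', 1);
    """
)


def descendants(conn, root_id):
    """Return the labels of every term below root_id, at any depth."""
    rows = conn.execute(
        """
        WITH RECURSIVE subtree(id, label) AS (
            SELECT id, label FROM trait WHERE parent_id = ?
            UNION ALL
            SELECT t.id, t.label FROM trait t JOIN subtree s ON t.parent_id = s.id
        )
        SELECT label FROM subtree ORDER BY label
        """,
        (root_id,),
    ).fetchall()
    return [label for (label,) in rows]


branch = descendants(conn, 1)
```

Storing a parent reference on each row keeps curator edits simple (moving a branch is a single update), while the recursive query supports the kind of browsing and filtering the open front end offers.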
Elizabeth Arnaud (Alliance of Bioversity and CIAT) and CGIAR Big Data Ontologies Community of Practice presented work on the development of the Small Fisheries & Aquaculture ontology, on behalf of Jacqueline Muliro (WorldFish). Watch the presentation.
- The aim of this work is to make data interoperable among the various projects, databases and repositories, and address the current and consistent use of terms across datasets.
- They are starting with concept extraction, led by Tusnuva Jahan, a Master's student. She and Jacqueline are manually extracting terms and trying to find an exact or close match in AGROVOC. They are also identifying missing terms that could be added to AGROVOC, including terms from the ASFA (Aquatic Sciences and Fisheries Abstracts) thesaurus.
- They are also doing some ontology mapping at the dataset level, extracting the keywords that scientists are using to describe their data. Tusnuva is checking whether these concepts already exist in AGROVOC and, if not, using the ontology lookup service to identify whether they are present in existing ontologies.
- The work covers domains such as fish traits, fish species, management processes, environmental factors, socioeconomic parameters and survey data concepts.
- This effort is manual because that gives a better perspective on the domain, but survey data will eventually be submitted to the concept extraction algorithm developed by the University of Sheffield.
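The "exact or close match" check described above is done by hand in this project, but its logic can be sketched as a small helper. This is only an illustration: the vocabulary is a hypothetical stand-in rather than real AGROVOC content, the function name and the 0.8 similarity cutoff are assumptions, and fuzzy string matching is no substitute for a domain expert's judgement.

```python
import difflib

# Hypothetical stand-in for a controlled vocabulary; not actual AGROVOC content.
VOCABULARY = ["aquaculture", "fish ponds", "fishing gear", "water temperature"]


def match_term(term, vocabulary=VOCABULARY, cutoff=0.8):
    """Classify a term as an 'exact', 'close' or 'missing' match in a vocabulary."""
    # Normalise case and whitespace before comparing.
    normalised = " ".join(term.lower().split())
    if normalised in vocabulary:
        return ("exact", normalised)
    # Fall back to the single most similar entry above the cutoff, if any.
    close = difflib.get_close_matches(normalised, vocabulary, n=1, cutoff=cutoff)
    if close:
        return ("close", close[0])
    # Candidates for proposing as new terms, or for checking in other ontologies.
    return ("missing", None)


results = {term: match_term(term) for term in ["Aquaculture", "fish pond", "gill net"]}
```

Terms that come back as "missing" are the ones that would go on to the next step in the workflow: checking other ontologies, or proposing an addition to the vocabulary.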
David Brodbelt, Royal Veterinary College (UK), presented developments with VeNom (Veterinary Nomenclature), a set of diagnoses for farm animal conditions. Watch the presentation.
- VeNom is principally companion animal focused but they have developed a farm practice terminology subset, relevant to a veterinary practice context.
- It is more of a terminology than an ontology, offering standardisation within electronic patient record systems.
- Terms are currently available in Excel format via their website, but they are working on an API powered by machine learning, which would allow a web-based interface to deliver the code.
Georgina Cherry, vHive / University of Surrey, shared progress in developing ontologies for the Data Innovation Hub for Animal Health (DIHAH), which is designed to promote data sharing and discovery in an animal health cluster. Watch the presentation.
- The goal is for members to be able to upload their data and combine it with existing datasets to create data visualisations and dashboards.
- They are creating a taxonomy to catalogue the data in the platform, starting with a taxonomy of content tags that users of the platform can use to tag and search datasets.
- There are some challenges around developing hierarchies and also working with datasets that have mixed species.
- They are also looking at bringing in terms from AGROVOC and mapping the simple taxonomy against them. They are discussing with the GBADS (Global Burden of Animal Diseases) team how to move this forward, and this could be something for the LD4D working group to engage with.
Itlala Gizo (OIE) presented work on developing a codification system for animal health data and its integration into the WAHIS (World Animal Health Information System) platform. Watch the presentation.
- The goal is to ensure that animal health information from OIE member states is fully interoperable, allowing better data extraction and analysis.
- They are starting with three main concepts: animal disease names, pathogens and species.
- A big challenge relates to ensuring codes can be flexible to deal with the alteration of disease names in OIE records over the years.
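One common way to handle changing names is to keep the code stable and attach a dated history of names to it. The sketch below shows that idea only; it does not describe the OIE's actual design, and the class names, code format and disease names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DiseaseCode:
    """A stable code plus the names it has carried over time."""

    code: str
    # Each entry is (name, first_year, last_year); last_year is None while current.
    names: list = field(default_factory=list)

    def name_in(self, year):
        """Return the name that was in use for this code in the given year."""
        for name, first, last in self.names:
            if first <= year and (last is None or year <= last):
                return name
        return None


class CodeRegistry:
    """Looks up codes by any past or present name, and names by code and year."""

    def __init__(self):
        self._by_code = {}
        self._by_name = {}

    def register(self, code, name, first_year):
        entry = self._by_code.setdefault(code, DiseaseCode(code))
        # Close the validity period of the previous name, if one is still open.
        if entry.names and entry.names[-1][2] is None:
            prev_name, prev_first, _ = entry.names[-1]
            entry.names[-1] = (prev_name, prev_first, first_year - 1)
        entry.names.append((name, first_year, None))
        self._by_name[name.lower()] = code

    def code_for(self, name):
        return self._by_name.get(name.lower())

    def name_for(self, code, year):
        entry = self._by_code.get(code)
        return entry.name_in(year) if entry else None


# Hypothetical code and names, invented for this example.
registry = CodeRegistry()
registry.register("D0001", "Example disease (old name)", 2000)
registry.register("D0001", "Example disease (new name)", 2010)
```

Because records point at `D0001` rather than at a name, a report filed under the old name and one filed under the new name still resolve to the same disease.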
Common challenges, potential actions
Groups discussed key ideas that emerged during the presentations and tried to identify common challenges, possible synergies and some follow-up actions for the working group to address. These include:
- Sharing experiences with machine learning.
- Developing a common business case or use case to bring us all into alignment and demonstrate the practical use of standardisation and links between ontologies.
- Sharing strategies for convincing scientists to use existing ontologies.
- Exploring how to draw linkages between the different specialist ontologies and make them interoperable. Can we provide synonyms for technical terms? Multiple ontologies often feed into the same concept, so how do you manage terms drawn from several ontologies, some of which are shared between them?
- Exploring the value of web interfaces and APIs to encourage more people to use ontologies.
- How to map ontologies at the dataset level and deal with the challenges of establishing clear definitions for the concepts.
- How best to use AGROVOC terms, which are quite general (useful for descriptions), while making them work together with existing specialist ontologies (e.g. phenotypic ones).
- Can we map out what domains we are working within and what we have in common, to form groups that are keen to work together on specific topics (disease etc.)?
- How to make the different ontologies we develop accessible and reusable (e.g. through licences) to facilitate convergence.
Reflection and next steps
To date the Working Group has been primarily focused on knowledge sharing. People are keen to continue to share knowledge at regular intervals via the group.
Moving forward, the group should define a clear workplan with deliverables. Members identified a number of key issues during the breakout group discussions, which could serve as a starting point for concrete problems to work on.
In early 2021, the LD4D secretariat will circulate a survey to working group members, in order to map out the different knowledge domains where people are working, and try to identify priority problems to work on together.
This will be followed by a Working Group planning workshop to initiate a programme of work for 2021 and beyond under LD4D phase 2.
- To join the Livestock Ontologies working group, please contact Vanessa Meadu.
Vanessa Meadu is Communications and Knowledge Exchange Specialist with SEBI-Livestock, which convenes the Livestock Data for Decisions Community of Practice.
December 18, 2020