2019 Winner

Rapid genomic detection of aquaculture pathogens



Malaysia, Bangladesh

The project will pilot a readily deployable “lab-in-a-backpack” for pond-side identification and quantitation of pathogens affecting tilapia. Equipped with a portable DNA-extraction system, a hand-held DNA sequencer (MinION), a battery-operated minicomputer (MinIT), and an intuitive purpose-built software package, users without experience in molecular biology or bioinformatics will be able to identify fish pathogens from both water samples and infected tissues remotely and in real-time, with limited electricity and internet connectivity.

More about the project

Aquaculture is the world’s fastest-growing food sector and has been recognized for its potential to alleviate poverty and hunger. However, fish diseases, and a lack of knowledge of how to identify, track, and contain them, can hinder aquaculture development. This project will pilot a transportable, low-cost diagnostic “lab-in-a-backpack” that will enable users without molecular biology experience to identify fish pathogens in real time with limited electricity and internet connectivity.

Aquaculture, the farming of aquatic organisms in both coastal and inland areas, accounts for 50 percent of the fish consumed as food worldwide today. It is practiced both by some of the poorest farmers in developing countries and by multinational companies.

However, the development of aquaculture systems is often limited by fish diseases and a lack of knowledge and tools to identify fish pathogens, track their origin, and manage their spread.

Whole-genome sequencing informs how pathogens change and move through environments, permitting the implementation of evidence-based biosecurity to minimize disease impact.

Offsite sequencing services are expensive and cause prohibitive delays. Therefore, the project proposes leveraging offline supervised machine learning associated with the MinION portable sequencing device for low-cost diagnostics of fish pathogens in remote locations, allowing real-time disease investigation and data-driven management.

These tools will enable tilapia breeding, quarantine, and biosecurity centers, as well as academics and vets, to identify causal agents of disease outbreaks in a fraction of the time and cost required for external laboratory analysis; the project’s tests give results in hours rather than weeks or months and cost roughly 40 USD as opposed to more than 100 USD.

This Inspire Challenge proposal was selected as a 2019 pilot project winner, receiving a total of US$100,000 to put its ideas into practice. Learn more about the Inspire Challenge Grant here.

Team

WorldFish
Jérôme Delamare-Deboutteville

Wilderlab
Shaun Wilkinson

The University of Queensland Australia
Andrew Barnes

Centex Shrimp (BIOTEC/Mahidol University)
Saengchan Senapin
Ha Thanh Dong

GeneSEQ
Han Ming Gan

Step by step

2021

FEB 2021

Gates Foundation letter of intent

The team sent a letter of intent to the Bill & Melinda Gates Foundation for the Global Grand Challenges on Smart Farming Innovations for Small-Scale Producers. The proposed project aims to create smart systems for antibiotic-free farmed fish in Bangladesh, in response to production challenges caused by fish diseases, which directly affect farmers’ livelihoods. If granted, the initiative will develop and implement transformational fish disease detection technology and prevention strategies for carp, tilapia and catfish, the most important crops for small-scale fish farmers.



FEB - JUNE 2021

Virtual training workshop

Using the samples collected from universities and private sector actors in Malaysia in October 2020, the team led a virtual training workshop on how to process, sequence, and upload genomic information to the cloud-based database.

MARCH 2021

Journal publication on fish diseases

The team published a paper detailing how Oxford Nanopore Technologies (ONT)-based amplicon sequencing is a promising platform to deploy in regional aquatic animal health diagnostic laboratories in low- and middle-income countries, enabling fast and accurate confirmation, phylogenetic analysis, and genotyping of emerging infectious pathogens from field samples within a single day.

Overall workflow from sample collection of diseased fish on farm to sequence results. The entire process takes less than 12 hours. DNA repair, end-preparation, multiplex native barcode and adapter ligation.

APRIL 2021

Data analysis

The team is analyzing participants’ samples from past workshops and preparing all the content material for the virtual workshop that will be held in June 2021.



JUNE 2021

Virtual workshops

The team will hold workshops on rapid diagnostics with academics and small-scale farmers from Bangkok, Bangladesh, and Malaysia.

JUNE 2021

Developing a manual and training modules for field samplers

The team will build virtual training modules for field-based samplers composed of factsheets, short video tutorials, and easy-to-follow protocols for end-user software interfaces for the data outputs.

The manual will cover the entire process from biological sample collection to performing a sequencing run for analysis.

2020

JAN 2020

Bacterial genome sequencing and expansion of the team

The team sequenced 30 bacterial genomes and welcomed a new Ph.D. student, Suvra Das, from Bangladesh, to the team. Under Associate Professor Andrew Barnes at the University of Queensland, she will research processing methods for DNA extraction and library preparation to optimize the cost and performance of field sequencing tests.

Suvra Das, a PhD student in Andrew Barnes’ aquatic animal health laboratory, performs DNA extraction from pure bacterial isolates obtained from fish.

MARCH 2020

COVID-19 adaptations

Although the project was affected by the COVID-19 pandemic, the first year’s activities were primarily focused on laboratory and computer-based activities to generate the fish pathogen sequences, and, therefore, the team experienced fewer disruptions than field-based work.

However, various travel plans and workshops were adapted or postponed. A workshop originally planned to take place in person in Bangladesh was converted to a virtual format and held in early 2021.

Watch the video below to hear from WorldFish Scientist Dr. Jerome Delamare-Deboutteville about the impacts of COVID-19 and progress throughout 2020:

JUNE 2020

Generation of aquatic pathogen genomic typing data

The team completed 50 bacterial genome sequences, generating two types of data:

- Highly accurate sequence data for all target aquatic pathogens, derived from long- and short-read sequencing and used to build the reference training database for the machine learning algorithms.
- Raw nanopore read data for model development. This data was generated at the University of Queensland, Mahidol University/BIOTEC’s CENTEX Shrimp, and WorldFish.

Nurulhuda Ahmad Fatan from WorldFish demonstrates how to load a library onto the flow cell before starting a sequencing run on the MinION connected to the MinIT.

JULY 2020

Optimisation of field data acquisition and upload methodology

The team will compare sample collection and processing methods to optimise the cost and performance of the field sequencing workflows.

Sample extraction, library preparation, and indexing methods will be compared to ensure that they can be completed in semi-remote locations.

Essential equipment needed to take single colonies from blood agar plates, extract DNA, and prepare a library for sequencing on the MinION, together with the MinIT and a computer (or tablet or mobile phone) used to visualise the dashboard and launch a sequencing run.

JULY 2020

Building a software environment for typing pathogens from fuzzy data

To address the base-call error rate (<5 percent) of the MinION sequencing technology, the team developed a new bioinformatics software package that leverages machine learning to identify fish pathogens.
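To illustrate why a per-base error rate of this size matters for pathogen typing, the short sketch below estimates how often a stretch of sequence would be read with no errors at all. It is only a back-of-the-envelope illustration: it assumes errors are independent and equally likely at every base, uses 5 percent as an upper bound taken from the figure above, and the function name is hypothetical rather than part of the team's software.

```python
def error_free_probability(per_base_error: float, length: int) -> float:
    """Probability that `length` consecutive bases are all called correctly.

    Assumes independent, identically distributed base-calling errors,
    which is a simplification of real nanopore error behaviour.
    """
    if not 0.0 <= per_base_error <= 1.0:
        raise ValueError("per_base_error must be between 0 and 1")
    if length < 0:
        raise ValueError("length must be non-negative")
    return (1.0 - per_base_error) ** length


if __name__ == "__main__":
    # 0.05 is the upper bound quoted in the text; the lengths are illustrative.
    for length in (20, 100, 1000):
        p = error_free_probability(0.05, length)
        print(f"{length:>5} bases: P(no errors) = {p:.6f}")
```

Under these assumptions, fewer than 1 in 100 stretches of 100 bases would be read perfectly, which is why exact string matching against a reference is a poor fit and error-tolerant, probabilistic scoring is attractive.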

Two approaches were compared. In the first approach, hidden Markov models (HMMs) were used to compare experimental data to a reference database of hierarchical regions of differentiation. The second approach considered that all genomic regions provide information on strain type. Therefore, a rapid alignment method can be used to bin query samples probabilistically with the correct strain or type.

These models provided a position-specific scoring system that can account for base-calling inaccuracies and were trained on sequences from isoclinal pathogens obtained using the MinION.
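The idea of a position-specific scoring system can be sketched in a few lines. The example below is a deliberately simplified stand-in, not the team's software: it builds an ungapped position-specific profile per strain (a profile HMM without insert or delete states), uses pseudocounts so that an occasional miscalled base is penalised rather than ruled out, and converts the scores into a probability for each strain. The strain names, training sequences, and function names are all made up for illustration.

```python
import math

BASES = "ACGT"


def build_profile(aligned_seqs, pseudocount=1.0):
    """Return per-position log-probabilities of each base for one strain.

    `aligned_seqs` are equal-length training sequences containing only A/C/G/T.
    The pseudocount keeps unseen bases at a small non-zero probability,
    which is how this toy model tolerates base-calling errors.
    """
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("all training sequences must have the same length")
    profile = []
    for i in range(length):
        counts = {b: pseudocount for b in BASES}
        for seq in aligned_seqs:
            counts[seq[i]] += 1.0
        total = sum(counts.values())
        profile.append({b: math.log(counts[b] / total) for b in BASES})
    return profile


def log_likelihood(read, profile):
    """Sum the position-specific log-probabilities of the observed bases."""
    if len(read) != len(profile):
        raise ValueError("read and profile must have the same length")
    return sum(column[base] for base, column in zip(read, profile))


def classify(read, profiles):
    """Return the posterior probability of each strain, assuming equal priors."""
    scores = {name: log_likelihood(read, p) for name, p in profiles.items()}
    top = max(scores.values())  # subtract the maximum for numerical stability
    weights = {name: math.exp(s - top) for name, s in scores.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Hypothetical training data: three noisy reads per strain over one marker region.
training = {
    "strain_A": ["ACGTACGTAC", "ACGTACGTAC", "ACGTTCGTAC"],
    "strain_B": ["ACCTACGGAC", "ACCTACGGAC", "ACCTACGGAT"],
}
profiles = {name: build_profile(seqs) for name, seqs in training.items()}

# A strain_A read in which one base (index 7) was miscalled as A instead of T.
query = "ACGTACGAAC"
posterior = classify(query, profiles)
best = max(posterior, key=posterior.get)
print(best, posterior)
```

Despite the miscalled base, the query is still assigned to strain_A, with the remaining probability on strain_B reflecting the uncertainty. A production tool would also need to handle insertions, deletions, and ambiguous bases, which this sketch omits.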

Example of an accurate genome sequence.

JULY 2020 - FEB 2021

Development of machine learning tools and cloud-based database

The team is creating a cloud-based database that features a large collection of fish pathogen genomes. The point-and-click user interface will be designed for public use, and the site is expected to be accessible in 2021.

OCT 2020

Sample collection

In place of in-person training workshops, the team collected samples from researchers at five leading universities and various private sector actors in Malaysia. These samples will be processed and used in an adapted virtual version of the workshop in early 2021.

A field team prepares to collect samples from fish for disease diagnostic investigation.

DEC 2020 - JUNE 2021

Engagement in joint initiative to increase aquaculture sustainability in Sub-Saharan Africa

Working through the WorldFish office in Egypt, the team is engaging 12 Master’s students (six from the College of Basic and Applied Sciences of the University of Ghana and six students from the College of Agriculture & Veterinary Sciences of the University of Nairobi) in a six-month intensive training on general aquaculture.

This effort is a part of a joint project led by WorldFish and the Norwegian Veterinary Institute to support aquatic animal health research, education, and management in Sub-Saharan Africa.

2019

OCT 2019

Project awarded US$100K Inspire Challenge grant

The project was one of four winners of the Inspire Challenge 2019 and was awarded US$100K at the Convention of the CGIAR Platform for Big Data in Agriculture, held 16-18 October 2019.



Gender & Youth Inclusion



WorldFish, in partnership with GeneSEQ, UQ, and Wilderlab, is preparing virtual training that includes sex-disaggregated data collected from the Malaysian participants involved in the project.

Partners

Project News and Resources

Lab-in-a-backpack: Rapid Genomic Detection to revolutionize control of disease outbreaks in fish farming

A winning 2019 Inspire Challenge project led by WorldFish, the University of Queensland, and Wilderlab is revolutionizing aquaculture disease control ...

VIDEO: Q&A with Inspire Challenge winner: Rapid genomic detection of aquaculture pathogens

Live Q&A with Jérôme Delamare-Deboutteville (WorldFish), Andrew Barnes (University of Queensland), and Shaun Wilkinson (Wilderlab) about their 2019 Inspire Challenge ...

Meet all the Winners

Inspire Winner 2019

Gamifying weather forecasting: “Let it rain” campaign

Inspire Winner 2019

Hungry cities: Inclusive food markets in Africa

Inspire Winner 2019

Rapid genomic detection of aquaculture pathogens

Inspire Winner 2019

Real-time East Africa live groundwater use database

Revealing informal food flows through free WiFi

Inspire Winner 2018

Machine learning for smarter seed selection

Seeing is believing – Using smartphone camera data

Inspire Winner 2018

CubicA: The new farmer advisory app

An integrated data pipeline for smallscale fisheries

MARPLE: Real time diagnostics for devastating wheat rust

Farm.ink: Analysing livestock social media data for farmer chatbot

Using commercial microwave links (CMLs) to estimate rainfalls

PlantVillage Nuru: Pest and disease monitoring using artificial intelligence

Inspire Winner 2017

Using IVR to connect farmers to market


AgroFIMS: Your new companion for easy standardization of data collection and description

The Agronomy Field Information Management System (AgroFIMS) allows users to create fieldbooks to collect agronomic data that is already tied to a metadata standard (the CG Core Metadata Schema, aligned with the standard Dublin Core), and semantic standards like the Agronomy Ontology (AgrO), generating data that is Findable, Accessible, Interoperable, and Reusable (FAIR) at collection. AgroFIMS therefore standardizes data collection and description for easy aggregation and inter-linking across disparate datasets. The fieldbooks you create can be exported to the Android-based KDSmart data collection application, and collected data imported back to AgroFIMS for statistical analysis and reports. In 2021 AgroFIMS will allow you to set up agronomic survey questionnaires, for data collection via ODK. It will also allow easy upload of your “born FAIR” data to Dataverse repository platforms with Dublin Core-compliant metadata schemas. Funding for AgroFIMS was provided by the Bill and Melinda Gates Foundation’s Open Access, Open Data Initiative, and the CGIAR Platform for Big Data in Agriculture. AgroFIMS is under GPL license. Go to AGROFIMS →

Responsible Data Management Guidelines to protect privacy

CGIAR Platform for Big Data in Agriculture advocates open data for agricultural research for development. It considers that opening up research data for scrutiny and reuse confers significant benefits to society.

However, the Platform appreciates that not all research data can be open and that a broad range of legitimate circumstances may require data to be restricted.

As an integral component of its advocacy for open data, the Platform promotes responsible data management through the entire research data lifecycle from planning, collecting, storing, disclosing or publishing, transferring, discovery and archiving.

These guidelines were created from information collected from a review of best and emerging practices across various sectors in the fast-changing landscape of privacy and ethics (130 external resources) and from privacy and ethics materials sourced from seven CGIAR centers. A first draft was circulated across CGIAR for input and feedback, which was incorporated into this edition. This is an evolving document; the next stage is to consult externally for further input.

These Guidelines are intended to assist agricultural researchers in handling privacy and personally identifiable information (PII) throughout the research project data lifecycle.

Check the guidelines →

REUSE / TRANSFER

Ensure consistency with the DMP-PII and the purpose for which prior informed consent has been obtained
Re-evaluate likelihood of (re-)identification and risk of harm, particularly if it involves a public data-set containing PII (as above)
Ensure PII is stored securely to protect privacy (as above)
Minimize use of PII and risk of disclosure through pro-privacy access controls and analytical tools (as above)

Don’t transfer data containing PII unless you have explicit consent
Don’t transfer data containing PII in the absence of a data sharing agreement (identifying aspects such as purpose and scope of use, privacy protection measures, confidentiality and any limitations)
Don’t reuse or transfer PII until any inconsistencies with the DMP-PII and/or purpose compatibility have been resolved (e.g. through updated ethics review or consent from participant)

ARCHIVING / DISCARDING

Plan for archiving or data destruction early in the process. Destroying data can be more secure, however, archiving can be beneficial if the data has ongoing evidentiary, scientific or cultural value. If archiving, identify where and how, the budget require
Ensure DMP-PII and purpose compatibility (as above)
Ensure adequate security measures to protect privacy (as above)

Don’t wait until the end of the project to assess archiving needs when time and resources may be limited
Don’t assume the longevity of a particular format; future-proof your archived data
Don’t forget to budget for archiving data; this should be done as part of your Data Management Plan

PUBLISHING AND DISCOVERY

Ensure DMP-PII and purpose compatibility (as above)
Re-evaluate likelihood of (re-)identification and risk of harm, particularly if it involves a public data-set containing PII
Indicate in metadata the availability of raw data or minimized data containing PII, if available bilaterally
Minimize use of PII and risk of disclosure through pro-privacy access controls and analytical tools

Don’t include PII in public datasets unless absolutely necessary to preserve the data’s analytic potential, scientific utility or benefit to the participant (and subject to participants’ informed consent and a rigorous risk assessment)

STORAGE AND ANALYSIS

Ensure compatibility with the DMP-PII (as above) and also the purpose for which prior informed consent has been obtained

Ensure PII is stored securely to protect privacy, through organizational or project specific safeguards to prevent unauthorized access, accidental disclosure or breach of data (physical & technical)

encryption for the storage and transmission of PII
access control measures to limit access to PII
two-factor or multifactor authentication
cloud services & back-end security

Don’t store data in unsecured locations or on unsecured devices or servers

Don’t store encrypted data and encryption keys in locations where they can be easily accessed simultaneously

Don’t underestimate the importance and value of administrative safeguards to standardize practices (i.e. organizational policies, procedures and maintenance of security measures that are designed to protect private information, data and access)

COLLECTION

Ensure compatibility with the DMP-PII
De-identify data to anonymize by default unless it will impair the data’s analytic potential, scientific utility or benefit to the participant
If you cannot anonymize, minimize the PII and pseudonymize to reduce the disclosure risk
Provide research participants sufficient information to use reasoned judgment to decide whether or not they wish to participate in the project
Ensure informed consent is designed to address the following elements:
- competence, comprehension, full disclosure, voluntariness
- legitimate scientific purpose for which the PII is collected and scope of use (e.g. stored, transferred, published and whether as anonymized, minimized or raw data)
- foreseeable risk of privacy loss and consequences
- meaningful alternatives including opt-in protection/anonymization
- safeguards to protect privacy, conditions on which PII may be shared and any limitations on reuse or third-party access and use of PII
- permission to follow up or contact the participant and for what purpose (including by third parties)
- participant’s right to withdraw and rights regarding their data (e.g. to be informed; to access; to rectify; to object; to erase)
- inclusion of physical, phone and/or electronic contact details (at least two forms of contact) that the participant can use to exercise her/his rights
- explicit consent and participant’s acknowledgement of understanding
- if written, provide the participant a copy of processed informed consent
Use plain language and adapt informed consent to meet the needs of vulnerable populations (e.g. obtain orally or in local language)

Don’t collect PII unless you have a Data Management Plan and any necessary approvals in place, including the recorded approval of the potential participant
Don’t collect PII unless you absolutely need it
Don’t assume that removal of direct identifiers is sufficient to anonymize data or that all de-identification techniques will result in anonymized data. Consider the risk of re-identification of a research participant, particularly if datasets are combined. If there is a reasonable risk of re-identification the information should be handled as PII (i.e. undertake risk analysis, evaluate stronger anonymization techniques, seek informed consent for the disclosure of data and explain its possible consequences)
Don’t include vulnerable participants or communities if their ability or capacity to provide voluntary informed consent is genuinely in question
Don’t underestimate the potential of quasi or indirect identifiers to identify an individual, particularly the inherent ability of location-based data to identify participants and their communities, and the increased risk of harm this may pose to potentially vulnerable individuals/communities
Avoid seeking overly broad consent that may call into question transparency or a research participant’s understanding regarding the use of their PII, be specific regarding the activities, purpose and limitations associated with PII so that the participant can make a genuinely informed decision and downstream users can evaluate purpose compatibility and seek fresh consent if needed

PLANNING AND APPROVAL

Develop a Data Management Plan which governs the handling of PII in the research project and beyond (DMP-PII). It should address:
- the type and nature of PII
- compliance requirements (including necessary forms for obtaining consent, and ethics clearance, if applicable)
- legitimate research objectives that will be advanced by the PII
- foreseeable risks and consequences if participants are identified from the data
- privacy protection measures (or lack thereof) for collection, storage, transfer and publishing
- process for obtaining informed consent
- timeframe or trigger for archiving or deletion of PII
Employ stricter standards for research involving vulnerable populations such as children or illiterate participants or sensitive data such as ethnicity or religious beliefs
Undertake due-diligence of datasets previously collected by you or third parties to ensure you are entitled/permitted to use for your research project
Consult the legal, IRB or ethics clearance committee or any other relevant institutional group for specific institutional, local, regional or national policies and regulatory frameworks that may apply to PII in the context of your work

Don’t leave the handling of PII and privacy protection as an afterthought; plan ahead!
Don’t forget to check local laws and donor or third-party requirements in addition to institutional policies governing research ethics and privacy protection (seek expert support if unsure!)
Don’t ignore ethical practices/standards; if your institution does not have an ethics framework or clearance process in place, self-assess!
In assessing whether information is capable of identifying someone (i.e. PII) don’t limit your focus to direct identifiers, also consider indirect/quasi identifiers. Appreciate this will depend on the context of the research project, the data in question and external data which is or may become otherwise available (i.e. there is no exhaustive list).
In assessing risk of harm don’t forget to consider potential harm to the participant’s community or groups of individuals that can otherwise be identified or associated with the participant
