Annual Report 2019

At a Glance

As the largest network of agricultural research organizations in the world, CGIAR is uniquely positioned to be a thought leader and global convener on the use of big data and information technology in agriculture.

The CGIAR Platform for Big Data in Agriculture is driving the effort to positively disrupt agricultural research, helping to generate impactful big data innovations that can revolutionize farming in developing countries.

In 2019, under its three modules (INSPIRE, CONVENE, and ORGANIZE), the Platform made significant strides to build fundamental technologies and data standards to support CGIAR’s digital strategy, develop CGIAR’s strategic digital partner networks, and foster innovative pathways that leverage public-good data to solve intractable challenges at scale.

Inspiring innovation

USD1.025 M

awarded to 8 Inspire Challenge winners

31

capacity-building activities

27

R&D innovations

A Global Partnership

15 CGIAR Centers,
81 external partners &
3.5K Community of Practice members

   

Data management

155,000 publications & 23,000 datasets

discoverable via GARDIAN

10

tools & approaches for data management

Outreach

139M

reach in media mentions including The Economist, Reuters, and Telangana Today

75K

website visits

10K

followers on social media

Foreword

Brian King
Coordinator, CGIAR Platform for Big Data in Agriculture

Digital disruption is changing all industries, and agriculture is no exception. This is due in part to a common business model: the digital platform. A digital platform is a multi-sided, technology-enabled network that facilitates the interaction of stakeholders. Digital platforms have become a key way in which organizations of all types engage with the external environment, seize opportunities to act, and develop more agility and dynamism.

The Platform for Big Data in Agriculture strives to embody such a digital platform strategy that informs how CGIAR manages its data, builds alliances, and targets digital innovation to build new, agile, adaptive collective action spanning food, farming, and ecological systems.

2020 has put our theory of change to the test in unanticipated ways.

The need for resilient food systems comes into stark relief during a crisis. Responses must be nimble, facilitating the quickest possible recovery while equipping food systems to adaptively manage or avert crises in the future. The strength of our networks matters for rapid, collective sense-making. CGIAR’s global partnerships represent a critical network for diagnosing, predicting, and informing responses to food security shocks. The BIG DATA Platform benefits from these partnerships and builds on them through open, collaborative Communities of Practice in big data research; by driving open data standards and sharing; through alignment with digital-first food system actors such as the Strike Two Summit; and through on-the-ground digital innovation projects under the CGIAR Inspire Challenge.

Trust: Humans, Machines and Ecosystems

During our 2019 BIG DATA Convention—TRUST: Humans, Machines & Ecosystems—we examined how the economy, society, and biosphere are already linked by food systems, and how, as a result, ecological systems will be increasingly intertwined with algorithmic systems.
CGIAR and our global network of partners must navigate this global crisis of trust if we wish to claim the promise of algorithmic systems for global food security.

Global challenges and One (Digital) CGIAR

There will be increasingly intense challenges in the next decade across demographic, natural resource, ecological, and climatic dimensions, and the window of opportunity for mitigating or reversing their most harmful effects is quickly closing. There will also likely be unprecedented rates of technology innovation at the intersection of digital and the life sciences in the coming years that—if harnessed and applied properly—could provide critical, cross-domain tools that can be used to respond to these challenges and facilitate a shift towards more favorable potential futures.

A unified and digital CGIAR could capitalize on several important comparative advantages in the evolving agricultural research for development landscape:

  • Leveraging global partnership networks and data-driven engagement with smallholder farmers worldwide
  • Being in a position to help measure environmental progress and move global agriculture towards agroecological intensification and carbon neutrality
  • Sustaining its long history as a trusted intermediary, managing the dissemination and adoption of technology innovations in developing economies.

CGIAR and the Platform for Big Data in Agriculture will continue to build the responsible digital innovation needed to guide global food, land, and water systems towards more favorable potential futures in 2030.

 

Inspire

Demonstrating the power of big data analytics through inspiring and innovative projects

The Platform made four start-up grants to data-driven partnerships under our Inspire Challenge innovation process, and four scale-up grants to winning projects from 2018 and 2017, awarding over USD1 million.

Its growing portfolio, totaling 14 projects in 2019, demonstrates impressive early-stage results.

For example, one advisory service, using crowd-sourced smartphone images, reached over 33,000 Indian wheat farmers, increasing crop insurance efficacy and knowledge of agricultural practices by 78%. A project that traces informal food flows by leveraging free Wi-Fi has collected data from over four million smartphones and has been adopted at a national scale in Vietnam as a tool to assess and predict food security shocks from COVID-19. A near real-time small-scale fisheries monitoring system is being scaled to seven countries in Africa and Asia.

In 2019, the Platform awarded four start-up grants of USD100,000 each to four new projects and a total of USD625,000 in scale-up funds to four winning projects from 2018 and 2017 that demonstrated exceptional results and proven viability and potential for impact.

The eight winning teams brought together nine CGIAR Centers and Research Programs and 18 diverse external partners, including start-ups, governmental bodies, universities, and private sector businesses.

The Inspire Challenge attracted USD200,000 in external funding commitments in 2019, signaling it could well become a signature digital innovation process for CGIAR.

Since its inception in 2017, the Inspire Challenge has awarded a combined total of USD2.5 million in 21 grants to 14 projects, which have helped build the evidence base for digital agriculture.

The Platform is improving its ability to source and foster innovation

  • 65% proposed to use machine learning
  • 69% were categorized as incremental and disruptive innovation
  • 80% included a gender component

Proposals targeting small producers have nearly doubled since 2017.

In 2018 and 2019, more than 65% of applicants proposed to use machine learning on large unstructured data sources, up from 10% in 2017.

In 2019, 69% of proposals were categorized as incremental and disruptive innovation, compared to 21% in 2017, promising more immediate and widespread impact in comparison to basic research.

In 2019, 80% of proponents included a gender component in their proposals, an increase of 10% from 2018.

PlantVillage Nuru: Pest and disease monitoring using AI

40 countries

The project has developed an AI-based app called Nuru, which helps users identify diseases affecting cassava plants. Now, Nuru has learnt to identify diseases in other crops.

Revealing informal food flows through free WiFi

Vietnam

The project has collected data from 5M phones at markets in Hanoi. The government has now adopted the initiative to assess food security shocks from the COVID-19 crisis.

An integrated data pipeline for small-scale fisheries

Timor-Leste and 7 new countries

By equipping fishing vessels with tracking devices, the project has put critical data in the hands of fisheries officers in Timor-Leste. Now, it’s scaling to seven additional countries.

Using commercial microwave links to estimate rainfall

Kenya

The project validated the use of commercial microwave link (CML) data as an effective method to estimate rainfall. Now it will be working with crop insurers to design better rainfall-based index insurance schemes.

“Let it Rain” Campaign: Gamifying weather forecasting

Kenya

The campaign turns weather prediction into a game to incentivize Kenyan farmers to adopt localized agro-advisories and maintain their yields despite a changing climate.

Real-time groundwater use database

East Africa

The project will turn solar pumps across East Africa into IoT devices to provide real-time data on groundwater withdrawals and inform water management efforts.

Rapid genomic detection of aquaculture pathogens

Malaysia, Bangladesh

This project will pilot a transportable, low-cost “lab-in-a-backpack” for pond-side identification and quantitation of tilapia pathogens.

Hungry cities: Inclusive food markets in Africa

Kenya

The project will analyze five years of data on the trade of 17 fruits and vegetables in Nairobi to inform policy and help reduce nutritional deficiencies in low-income populations.

Convene

Developing CGIAR’s digital partner networks to fast-track big data innovation in agriculture

In 2019, the Convene module made significant strides in developing CGIAR’s digital partner networks.

Convene culminated in the annual CGIAR Big Data in Agriculture Convention, hosted by ICRISAT, where 700 attendees from 236 public, private, and non-profit organizations assembled to examine digital models relevant for collective action in global food systems.

The Platform showcased a reference Internet of Things architecture for agronomy and breeding, ground-breaking gender research methods leveraging telecom data, and technology standards for managing sensitive data.

Our six Communities of Practice (CoPs) grew to over 3,500 CGIAR and non-CGIAR members. The CoPs released important community-driven outputs, among them an early warning system for wheat blast, draft ground data standards to facilitate machine learning analysis of satellite data, a survey of CGIAR digital extension efforts, and a harmonized household survey methodology to bring more standardization to socioeconomic research across CGIAR.

The Platform is working to enable women and youth to contribute to, and benefit from, the digital transformation of agriculture through dedicated Youth in Data Workshop and Youth in Data Connect initiatives and through mainstreaming gender equality in Inspire innovation.

The Convention 2019

16-18 October – ICRISAT, Hyderabad, India

There is a global crisis of trust.

Governments, firms, and the media are less trusted today than they were ten years ago, and in the last few years digital tools and technologies appear to be accelerating this global erosion of trust. We have seen large-scale state surveillance programs unveiled, confirmation that human biases are encoded in algorithmic systems, and massive commercialization of user data leading to breaches in consumer trust.

What does this have to do with global food security? During our 2019 BIG DATA Convention—TRUST: Humans, Machines & Ecosystems—we examined how the economy, society, and biosphere are already linked by food systems, and how, as a result, ecological systems will be increasingly intertwined with algorithmic systems.

CGIAR and our global network of partners must navigate this global crisis of trust if we wish to claim the promise of algorithmic systems for global food security. A growing body of organizational research indicates that trust is built on three pillars: competence, integrity, and benevolence.

At the Convention, approximately 700 delegates presented an array of approaches and technological solutions to help foster these three qualities in global food systems. We have summarized some of these key takeaways here.

At a Glance

700

attendees

65% external to CGIAR

236

companies and organizations

20 Million

audience reached

1 Million

USD awarded

Key Takeaways

Improving data quality and use

  • We saw how text mining of policy documents can help us monitor the three pillars of agroecology: socially equitable food systems, regenerative use of ecosystems, and free choice over production and consumption.
  • Radiant Earth, AtlasAI, and the BIG DATA Platform’s Geospatial Community of Practice led sessions to validate a ‘straw man’ of minimum quality standards for ground reference data suitable for machine learning-driven analysis of remote sensing imagery for agriculture.
  • BIG DATA’s Socio-Economic Data Community of Practice launched 100Qs, a standardized list of 100 commonly asked questions from socio-economics research in food security, enabling large-scale comparability and interoperability of socio-economic survey data.
  • Results of applying the SDG Interface Ontology to the CGIAR Strategic Results Framework were presented, demonstrating the agility, data interoperability, and precision of language that ontologies bring when applied to organizational strategy, and opening the way to develop powerful new capabilities in support of the new CGIAR Results Dashboard.
  • Partners presented case studies of the power of using non-traditional data, such as social media, mobile network metadata, and mobile money transaction data, for food security research.
  • GARDIAN came into its own as a trusted intermediary in the data ecosystem, facilitating data discovery from national agricultural research organizations and development agency partners, as well as new pipelines and data products built with pan-CGIAR data.

Partnerships and collaboration

Discovering and examining new tech

Privacy, ethics, and consent

  • A legal consultant with the Platform for BIG DATA outlined a new, dynamic approach to informed consent based on ongoing partnership and collaboration with data holders.
  • The Minnesota Supercomputing Institute presented work undertaken with the BIG DATA Platform to converge regulatory and ethical frameworks and map them to detailed international standards for data systems for managing sensitive data. We are moving beyond guidelines, which are useful but by definition very general, to architectures and standards for responsible data in our sector.
  • Experts in intellectual property, international treaties on biodiversity, and genetic resources staged a “commons debate” to help elucidate and unpack the complexity of effective stewardship of genetic resources in the age of digital sequences.
  • Experts in digital extension examined the depth of the challenge and some key learnings for building ethical, trustworthy digital advisory services.

Working towards the SDGs

  • DATA & GENDER: Gender experts and data scientists working with mobile network metadata converged on methods for trying to predict and observe changes in women’s economic empowerment at a whole-of-system scale.
  • FINANCIAL INCLUSION: Experts examined the opportunities for leveraging the convergence of the world’s largest biometric identification system (Aadhaar), mobile phones, and the world’s largest financial inclusion program (Jan Dhan Yojana) for transforming rural livelihoods.
  • DATA & GLOBAL DIETS: A cross-disciplinary panel and a member of the EAT-Lancet Commission, who co-authored the 2019 report “Food in the Anthropocene,” explored the notion of eating within planetary boundaries and looked at how data could be mobilized to examine real diets in a global context.

Innovation and the Inspire Challenge

  • Ten stellar projects competed for start-up grants from the CGIAR Inspire Challenge, and four were awarded USD100,000 each to continue their projects.
  • Seven teams that had completed start-up awards competed for scale-up funds, and four were awarded a combined total of USD625,000.
  • Last year’s Inspire Challenge scale-up winner “Seeing is Believing” demonstrated how they anonymized cellphone camera images and developed an image processing pipeline to build a living, high-quality dataset from the flow of wheat field images from some 33,000 farmers with mobile phones.
  • An innovation bazaar buzzed with thirty potential solutions in robotics, remote sensing, applications, and more.
  • Legal consultants, innovation managers, and researchers into innovation examined gaps in the agtech startup ecosystem and good practices in setting up digital accelerators for agriculture.

Communities of Practice

Data Driven Agronomy
Crop Modeling
Geospatial Data
Livestock Data
Ontologies
Socio-economic Data

Community-Driven Outputs

Geospatial Data

880 members

CGIAR Consortium for Spatial Information (CGIAR-CSI) is the Platform’s geospatial science Community of Practice (CoP) that facilitates CGIAR’s research using geospatial data and analysis. CGIAR-CSI coordinates community-wide activities to bring CGIAR’s spatial scientists together through collaborative research, capacity building, communications, developing geospatial datasets and publications and convening of various events to share learning and represent CGIAR in the geospatial domain of expertise.

Building on capacity-building activities initiated in earlier years, this CoP continued to focus on community-wide training on using R to analyze geospatial datasets. Two on-site training workshops were organized, and the materials were made public. Three technical sessions were organized during the 2019 Big Data Convention, attracting more than 250 attendees.

Through its mini-grant program, the Geospatial CoP contributed, in collaboration with the University of Twente, to “A suite of global accessibility indicators,” which was published in the Nature journal Scientific Data.

The Geospatial CoP continued to play a key coordinating role in the identification, provision, and management of key geospatial shared services. A position paper about the community’s view on the role of geospatial big data for supporting agro-ecosystems was published as a book chapter. A recipient was selected for the 2019 mini-grant to support the development of key climate projection data layers, which is due to be completed in 2020.

Activities and products from the CoP were communicated through the community website, which was visited 140K times by 54K visitors in 2019.

Data-Driven Agronomy

1,126 members

The Data-Driven Agronomy CoP works to collectively strengthen the innovation of technology and big data to tackle an array of agricultural challenges – including the closing of yield gaps – to reduce hunger and poverty and transform global agriculture.

In 2019, this CoP developed and shared a variety of content with the wider global community, building capacity for knowledge transfer and forging novel connections and partnerships. The content included webinars, newsletters, papers, blogs, and videos on topics such as agri-businesses in the developing world, artificial intelligence, and digital extension, with a particular focus on increasing understanding of the digital landscape of developing nations.

In early 2019, the CoP shared a report and a video on the key constraints agri-businesses face in the developing world and published a peer-reviewed paper, in collaboration with the Colombian government, on “A scalable scheme to implement data-driven agriculture for small-scale farmers”. The paper demonstrated how machine learning can help make farming more efficient and productive even amid climate uncertainty.

In collaboration with the World Bank, FAO, and the African Development Bank, the Community began work to create several Digital Agriculture Country Profiles (DAPs). The profiles aim to enable a better understanding of the digital agricultural landscape at a global level, painting a data-driven picture of what technologies are in use, the benefits they are providing, and where the greatest digital needs exist. In 2019, profiles were completed on Turkey, Argentina, Grenada, Vietnam, Kenya, Rwanda, Ivory Coast, and South Africa, and are expected to be launched in 2020.

Crop Modeling

931 members

During 2019, the Crop Modeling CoP continued promoting capacity-building activities related to identified needs within the community. Its webinar series covered useful platforms for crop modeling such as GARDIAN and GEMS, presented strategies to bridge the gap between modelers and researchers regarding the data required for reliable simulations, and helped the community answer pressing questions about the world’s most widely used crop modeling software (DSSAT). These webinars reached over 500 live participants, and the recordings have collectively been viewed 1,500 times.

Two technical sessions were organized during the 2019 Big Data Convention, attracting more than 50 participants. A full-day writing workshop on digital extension within CGIAR was organized in collaboration with the Data-Driven Agronomy CoP and attracted more than 20 industry experts.

The Crop Modeling CoP continued to play an important coordinating role in identifying the gaps and needs in crop modeling activities within CGIAR. Through its mini-grant program, the Community funded one project, carried out in collaboration with CIMMYT and the University of Florida, that works to identify the minimum data researchers should collect in order to perform different crop modeling and other analytical activities.

The CoP contributed to two peer-reviewed papers: one on “Different uncertainty distribution between high and low latitudes in modeling warming impacts on wheat,” which was published in Nature Food, and a second on “Adapting irrigated and rainfed wheat to climate change in semi-arid environments: Management, breeding options and land use change”. Activities and products from the CoP were communicated through regular newsletters and the CoP’s website.

Socio-Economic Data

885 members

The Socio-Economic Data CoP is dedicated to improving the landscape of socio-economic data by encouraging better use of existing data and analysis, and supporting collaboration on new and innovative data solutions.

In 2019, the CoP grew to over 650 CGIAR and non-CGIAR members, attracting interest from disciplines in academia, business, and government, all committed to improving data interoperability. The members convened during several virtual meetings throughout the year.

The CoP addressed the issue of data standardization and harmonization through the publication of the 100Q report to boost household survey data usability with 100 core questions. In a bid to improve gender awareness within agricultural research, the community collaborated with the CGIAR Platform for Gender Research to publish a report on the findability of gendered datasets. The CoP presented preliminary findings regarding an ontology-agnostic, flexible, extensible, machine-readable and human-intelligible metadata schema at several events.

During the 2019 Big Data in Agriculture Convention, the SED-CoP organized a workshop to build capacity on blockchain technologies to improve transparency in agriculture. It also published a report on the use of blockchain technology in agri-food systems focusing on the use case of biofortified maize.

In 2019, key representatives involved in research ethics across CGIAR formed an informal Community of Practice, aptly named the ‘Informal group of CGIAR IRB folks’. The group has close ties to the Socio-Economic Data CoP, as the majority of ethics concerns in CGIAR research relate to human subjects research and the related data.

Livestock Data

The Livestock Data for Decisions (LD4D) CoP aims to drive informed livestock decision making through the better use of existing data and analyses.

In 2019, LD4D authored the Livestock Fact Check journal paper, an exploration of the data behind popular livestock figures. The Livestock Fact Check project aims to help inform discussions about livestock production through a balanced examination of some commonly referenced livestock ‘facts.’ The project’s key findings are useful to anyone engaged in discussions about livestock and society.

The CoP formed a new working group, Livestock Ontologies. The group is open to anyone with an interest in exploring and building livestock vocabularies and ontologies with the aim of answering the question: “How do we present livestock and fish data in a consistent manner that enhances interoperability and re-use?” This is a collective effort aimed at cataloguing initiatives to develop standards and ontologies, mapping the expertise and interests of community members, and sharing tools and best practices for ontologies, vocabularies, terminologies, thesauri, and similar tools and methods.

Additionally, the CoP engaged in monitoring, learning, and evaluation of livestock initiatives; a number of livestock development projects are developing key performance indicators for their activities with the aim of providing more clarity to the sector.

Ontologies

563 members

The Ontologies CoP made significant advancements in the development and adoption of quality ontologies for agrifood research data across CGIAR Research Programs, allied research organizations and private partners. Through dedicated Working Groups (WG), significant progress was made in the development of the agrifood ontology framework:

  • As of 2019, Crop Ontology (CO) comprised 4,125 traits and 5,556 variables for 29 plant species to describe plant phenotypes in databases, and, with support of Planteome, a US National Science Foundation project, enables comparative genotypic and phenotypic studies as well as gene-discovery experiments. CO counts the University of British Columbia, Cornell University, PEPSICO Inc, and the NIAB, a UK research organization, as new contributors.
  • Agronomy Ontology (AgrO), which compiles field management practices, was extended to address the feedback of users like Rothamsted Research and the University of Florida.
  • The draft version of the Socio Economy Ontology (SEOnt) was developed with scientists from the Socio-Economic Data CoP using the ‘100Q’ and through machine learning techniques with the University of Sheffield, UK.
  • The Fish Ontology WG, in collaboration with WorldFish, and the Livestock Ontology WG, led by the Livestock Data for Decisions CoP, were created.

The CoP stimulated knowledge sharing and capacity building through webinars and blog articles, and members contributed to a peer-reviewed paper. Webinars explored ontology selection criteria for data annotation, the use of ontologies in machine learning techniques, and knowledge graphs, and reached approximately 1,400 viewers. Blog articles were published on the Crop Ontology, the GEMS platform, and the Breeding API.

Geospatial Data

880 members

CGIAR Consortium for Spatial Information (CGIAR-CSI) is the Platform’s geospatial science Community of Practice (CoP) that facilitates CGIAR’s research using geospatial data and analysis. CGIAR-CSI coordinates community-wide activities to bring CGIAR’s spatial scientists together through collaborative research, capacity building, communications, developing geospatial datasets and publications and convening of various events to share learning and represent CGIAR in the geospatial domain of expertise.

Building on capacity building activities initiated in the earlier years, this CoP continued to focus on the community-wide training and facilitation of using R for analyzing geospatial datasets. Two on-site training workshops were organized, and the materials were made public. Three technical sessions were organized during the 2019 Big Data Convention, attracting more than 250 attendees.

Through their mini-grant programs, the Geospatial CoP contributed — in collaboration with the University of Twente — on “A suite of global accessibility indicators,” which was published in the Nature journal Scientific Data.

The Geospatial CoP continued to play a key coordinating role in the identification, provision, and management of key geospatial shared services. A position paper about the community’s view on the role of geospatial big data for supporting agro-ecosystems was published as a book chapter. A new recipient for the 2019 mini-grant was decided to support the development of key climate projection data layers, and this will be completed in 2020.

Activities and products from the CoP were communicated through the community website, which was visited 140K times by 54K visitors in 2019.

Data-Driven Agronomy

1126 members

The Data-Driven Agronomy CoP works to collectively strengthen the innovation of technology and big data to tackle an array of agricultural challenges – including the closing of yield gaps – to reduce hunger and poverty and transform global agriculture.

In 2019, this CoP developed and shared a variety of content with the wider global community, building capacity for knowledge transfer and to build novel connections and partnerships. The content included webinars, newsletters, papers, blogs and videos on and around topics such as agri-businesses in the developing world, artificial intelligence, and digital extension with particular focus on increasing understanding of the digital landscape of developing nations.

In early 2019, they shared a report and a video on key constraints agri-businesses have in the developing world and published a peer-reviewed paper, in collaboration with the Colombian government, on “A scalable scheme to implement data-driven agriculture for small-scale farmers”. It demonstrated how machine learning can help make farming more efficient and productive even amid climate uncertainty.

In collaboration with the World Bank, FAO, and the African Development Bank, the Community began work to create several Digital Agriculture Country Profiles (DAPs). The profiles aim to enable a better understanding of the digital agricultural landscape at a global level, painting a data-driven picture of what technologies are in use, the benefits they are providing and where the greatest digital needs exist. In 2019, profiles were completed on Turkey, Argentina, Grenada, Vietnam, Kenya, Rwanda, Ivory Coast and South-Africa, and are expected to be launched in 2020.

Crop Modeling

931 members

During 2019, the Crop Modeling CoP continued promoting capacity building activities related to identified needs within the community. Its webinar series covered useful platforms for crop modeling such as GARDIAN and GEMS, explored strategies to bridge the gap between modelers and researchers regarding the data required for reliable simulations, and helped the community answer pressing questions about the world’s most widely used crop modeling software (DSSAT). These webinars reached over 500 live participants, and the recordings have been viewed 1,500 times collectively.

Two technical sessions were organized during the 2019 Big Data Convention, attracting more than 50 participants. A full-day writing workshop on digital extension within CGIAR was organized in collaboration with the Data-Driven Agronomy CoP and attracted more than 20 experts.

The Crop Modeling CoP continued to play an important coordinating role in identifying the gaps and needs in crop modeling activities within CGIAR. As a result of their mini-grant program, the Community awarded one project, done in collaboration with CIMMYT and the University of Florida, that works to identify the minimum data requirements researchers should collect in order to perform different crop modeling and other analytical activities.

The CoP participated in two peer-reviewed papers, one on “Different uncertainty distribution between high and low latitudes in modeling warming impacts on wheat,” which was published in Nature Food, and a second one on “Adapting irrigated and rainfed wheat to climate change in semi-arid environments: Management, breeding options and land use change”. Activities and products from the CoP were communicated through regular newsletters, and the CoP’s website.

Socio-Economic Data

885 members

The Socio-Economic Data CoP is dedicated to improving the landscape of livestock data by encouraging better use of existing data and analysis, and supporting collaboration on new and innovative data solutions.

In 2019, the CoP grew to over 650 CGIAR and non-CGIAR members, attracting interest from disciplines in academia, business and government, all committed to improving data interoperability. The members convened during several virtual meetings throughout the year.

The CoP addressed the issue of data standardization and harmonization through the publication of the 100Q report to boost household survey data usability with 100 core questions. In a bid to improve gender awareness within agricultural research, the community collaborated with the CGIAR Platform for Gender Research to publish a report on the findability of gendered datasets. The CoP presented preliminary findings regarding an ontology-agnostic, flexible, extensible, machine-readable and human-intelligible metadata schema at several events.

During the 2019 Big Data in Agriculture Convention, the SED-CoP organized a workshop to build capacity on blockchain technologies to improve transparency in agriculture. It also published a report on the use of blockchain technology in agri-food systems focusing on the use case of biofortified maize.

In 2019, key representatives involved in research ethics across CGIAR formed an informal Community of Practice, aptly named the ‘Informal group of CGIAR IRB folks’. The group has close ties to the Socio-Economic Data CoP, as the majority of ethics concerns in CGIAR research relate to human subjects research and the related data.

Livestock Data

The Livestock Data for Decisions (LD4D) CoP aims to drive informed livestock decision making through the better use of existing data and analyses.

In 2019, LD4D authored the Livestock Fact Check journal paper, an exploration of the data behind popular livestock figures. The Livestock Fact Check project aims to help inform discussions about livestock production through a balanced examination of some commonly referenced livestock ‘facts.’ The project’s key findings are useful to anyone engaged in discussions about livestock and society.

The CoP formed a new working group, Livestock Ontologies. The group is open to anyone with an interest in exploring and building livestock vocabularies and ontologies with the aim to answer the question: “How do we present livestock and fish data in a consistent manner that enhances interoperability and re-use?” This is a collective effort aimed at cataloguing initiatives to develop standards and ontologies, map expertise and interests of community members, and share tools and best practices for ontologies, vocabularies, terminologies, thesauri, and similar tools and methods.

Additionally, the CoP engaged in monitoring, learning, and evaluation of livestock initiatives; a number of livestock development projects are developing key performance indicators for their activities with the aim of providing more clarity to the sector.

Ontologies

563 members

The Ontologies CoP made significant advancements in the development and adoption of quality ontologies for agrifood research data across CGIAR Research Programs, allied research organizations and private partners. Through dedicated Working Groups (WG), significant progress was made in the development of the agrifood ontology framework:

  • As of 2019, Crop Ontology (CO) comprised 4,125 traits and 5,556 variables for 29 plant species to describe plant phenotypes in databases, and, with the support of Planteome, a US National Science Foundation project, enables comparative genotypic and phenotypic studies as well as gene-discovery experiments. CO counts the University of British Columbia, Cornell University, PEPSICO Inc and the NIAB, a UK research organization, as new contributors.
  • Agronomy Ontology (AgrO), which compiles field management practices, was extended to address the feedback of users like Rothamsted Research and the University of Florida.
  • The draft version of the Socio Economy Ontology (SEOnt) was developed with scientists from the Socio Economic CoP using the ‘100Q’ and through Machine Learning techniques with the University of Sheffield, UK.
  • The Fish Ontology WG, in collaboration with WorldFish, and the Livestock Ontology WG, led by the Livestock Data for Decisions CoP, were created.

The CoP stimulated knowledge sharing and capacity building through webinars and blog articles, and members contributed to a peer-reviewed paper. Webinars explored ontology selection criteria for data annotation, the use of ontologies in machine learning techniques, and knowledge graphs, and reached approximately 1,400 viewers. Blog articles were published on the Crop Ontology, the GEMS platform and the Breeding API.


Strategic Partnerships

Novel Approach to gender research

In 2019, the CGIAR Platform for Big Data in Agriculture and the Generating Evidence and New Directions for Equitable Results (GENDER) Platform spearheaded a novel approach to studying women’s economic empowerment. The partners conducted a phone-based survey of 10,000 respondents and used it to analyze billions of data points generated by the operation of mobile phone networks to predict sex and decision power among female farmers at a national scale in Uganda. The approach has demonstrated the potential of a novel method for observing changes in female farmers’ economic empowerment with greater speed and greater scale compared with solely survey-based methods.

 

Computer science and agricultural research

The Platform launched two key new strategic partnerships designed to surface emerging risks at the intersection of digital technologies and agro-ecologies, and to build CGIAR capacity to claim the benefits of these technologies in service of our mission.

  • The Platform launched collaborative research with the Partnership on AI (PAI) and the University of Cambridge Centre for the Study of Existential Risk (CSER), centered on identifying emerging priority topics for machine learning in agriculture, covering both the systemic risks and the opportunities.
  • The French Digital Sciences Institute (INRIA) and the Platform began to design a program for integrating graduate students in computer science into CGIAR agricultural research for development, on diverse themes including machine-learning-enhanced crop modeling, semantic data, digital architectures for agronomic research, and farmer decision making under uncertainty. Jointly, the Platform and INRIA will build new and much-needed linkages between the computer science and agricultural research domains.

Digital Inclusion

The Platform is working towards creating pathways and opportunities for women and youth to contribute to and benefit from the digital transformation of agriculture. Leaving no one behind is vital if we are to achieve a sustainable food future.

Youth in Data Workshop

For the second year of our annual Youth in Data Workshop, we received over 80 applications and enrolled 30 delegates to take part in the 2019 Convention. This event builds on the launch of our new BIG DATA youth initiatives, as we work to engage youth in digital agriculture, and to engage with youth already in the sector, in a meaningful way.

The 2019 Youth in Data group was composed of students from engineering and journalism universities local to Hyderabad, as well as PhD scholars from the ICRISAT campus.

The workshop introduced participants to important themes around digital agriculture and showed how big data approaches to agricultural development can accelerate progress towards food security goals. Delegates were also trained in how to use digital media as a powerful tool to communicate about development, learning the basics of social media, interviewing, blog writing, data reporting, and other media skills.

Delegates were able to apply their newly acquired skills during the Convention: they engaged on social media, interviewed leading industry experts, speakers and 2019 Inspire Challenge finalists, and wrote blogs on topics of their choice.

Youth in Data Connect

Building on its Youth in Data initiative, created in 2018 to engage with young innovators in digital agriculture, the Platform launched the new Youth in Data Connect platform in 2019.

Youth in Data Connect is a database that compiles global youth-focused or youth-driven digital agriculture initiatives. It is the Platform’s first step towards mapping out the global landscape of youth engaged in this innovation space. The objective is to connect these young innovators with industry leaders and experts, and to build informed infrastructures that will enable the support and engagement of youth in digital agriculture.

Mainstreaming Gender Equality

Photo: C. De Bode / CGIAR. Sita Kumari (right), farmer, uses mobile phone apps to enhance her yields and get access to market and labor.

Mainstreaming gender equality in digital innovation

Since 2018, the Platform has made effective progress in including a gender dimension in our Inspire Challenge using a rubric and scoring matrix to explicitly assess whether and how proposals dealt with gender issues. These efforts manifested in 2019 with 80% of participants including a gender component, up from 70% in 2018.

Working with the CGIAR Gender Platform we have identified further opportunities to build upon and use the Inspire Challenge process as a source of insight and positive action for mainstreaming gender equality. In 2019, the team decided to modify the digital innovation challenge process to include key points related to gender mainstreaming. Specifically, these included requesting a gender balance of proposal teams and detailing a gender equality mainstreaming hypothesis for proposed projects. We anticipate that this will result in positive action for mainstreaming gender equality and will highlight the role of digital innovation in achieving gender parity in the agricultural research space.

80%

of Inspire Challenge 2019 proposals included a gender component

Better metadata annotation standards for gender research data

The Platform has provided a set of recommendations to the CGIAR Metadata Working Group to update metadata annotation standards for gender research data. These will enhance discovery as well as provide users with the ability to reuse gender-disaggregated data.

This will be essential for advancing all forms of gender research across the CGIAR System and for unlocking new big data-enabled methods that can be used for researching and advancing gender equality.

As a result, the annotation, discovery, and re-use of gender disaggregated data has become a pillar of the Platform’s data strategy and will inform future collaboration with the CGIAR GENDER Platform.

Organize

Building fundamental technology and data standards to support CGIAR’s digital strategy

In 2019, the Organize module made significant progress towards building community and tools around data discovery and standards, realizing a key component of a digital platform strategy for CGIAR.

  • GARDIAN, our flagship data discovery and analysis tool, now points to over 155,000 publications and 23,000 datasets from several partners alongside all CGIAR Centers.
  • Collaborative GARDIAN Labs (CGLabs) offer a secure analytic environment for researchers to find data and collaborate on analyses within the GARDIAN ecosystem, integrating single sign-on and Globus, a service enabling secure data sharing.
  • The new Expert Finder showcases CGIAR research, enables new collaborations, and visualizes institutional partnerships and expertise by location.
  • Organize contributed strongly towards leveraging semantic standards for describing agronomic, socioeconomic, and survey data, updating the CGIAR metadata standards, and enabling digital collection of standards-compliant agronomy data through the AgroFIMS tool.
  • The Digital Food Systems Evidence Clearing House showcases credible, measured evidence of the value added by digital interventions in a food system. The Platform CoPs are leveraged as expert networks for sourcing and validating this evidence.

GARDIAN

CGIAR flagship data discovery and analytic tool opens new horizons for agricultural research

GARDIAN now points to over 155,000 publications and 23,000 datasets from several partners alongside all CGIAR Centers.

In 2019, new institutional partners linked their repositories to GARDIAN, building an important network of institutions joining CGIAR in their commitment to digitize agricultural development through FAIR data:

  • USAID
  • DFID
  • The World Bank
  • The US Department of Agriculture’s Ag Data Commons
  • The Indian Council for Agricultural Research
  • The Open Government Portal of India

In 2019, GARDIAN added new functionalities, including:

  • The ability to map and spatially query production estimates for 30+ crops
  • Visualization of a 7-terabyte climate dataset
  • An analytic workbench enabling CGIAR researchers to apply machine learning analytics
  • A service to help flag personally identifiable information in datasets before they are made open
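
As a rough illustration of what flagging personally identifiable information can involve, a first pass is often a set of simple pattern checks over a dataset's text fields. The sketch below is not GARDIAN's implementation; the function name `flag_pii`, the two patterns, and the sample records are all assumptions made for illustration.

```python
import re

# Hypothetical patterns for illustration only; a real service would
# cover many more PII types (names, IDs, GPS coordinates, and so on).
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}


def flag_pii(records):
    """Return (row_index, field, pii_type) for each text value matching a pattern."""
    flags = []
    for i, record in enumerate(records):
        for field, value in record.items():
            if not isinstance(value, str):
                continue  # only text values are scanned in this sketch
            for pii_type, pattern in PII_PATTERNS.items():
                if pattern.search(value):
                    flags.append((i, field, pii_type))
    return flags


# Made-up sample rows, not real survey data.
sample = [
    {"farmer_contact": "jane@example.org", "crop": "maize", "yield_t_ha": 2.4},
    {"farmer_contact": "+256 700 123456", "crop": "beans", "yield_t_ha": 1.1},
    {"farmer_contact": "", "crop": "rice", "yield_t_ha": 3.0},
]
flags = flag_pii(sample)
```

Flagged fields would then go to a data curator for review rather than being removed automatically, since pattern checks of this kind produce both false positives and misses.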

The expanding GARDIAN ecosystem

CGLabs and the CGIAR Expert Finder

The Platform launched the Collaborative GARDIAN Labs (CGLabs), the latest offering in the GARDIAN data ecosystem. CGLabs has a built-in collaboration platform that allows users to create either private or public virtual spaces, invite members, receive notifications, and collaborate remotely and asynchronously. Users can also find colleagues via the Find a CGIAR Expert tool to spark new collaborations.

Access is handled by a Single Sign-On via Globus, a shared service offered by the Platform to enable secure data sharing.

CGLabs offers three key modules with specific, interlinked functionalities:

  • Find data: Search, download, and save datasets from GARDIAN in CGLabs
  • Securely share data: Published, unpublished or sensitive data via Globus
  • Analyze data: Collaboratively write scripts and run analyses in Jupyter, which has been extended to support smooth data file exchange on the CGLabs Globus Server

Digital Food Systems Evidence Clearing House

Leaving no evidence behind

With the launch of the Digital Food Systems Evidence Clearing House, the Platform increased its efforts to leave no evidence behind on the relevance of digital technologies to agricultural development. The Clearing House enables active pursuit of credible, measured evidence of the value added by digital interventions in a food system. The Platform Communities of Practice are leveraged as expert networks for sourcing and validating this evidence. Both formal studies conducted by Platform awardees and wider evidence from the sector shape what we learn about our interventions and how best to target digital innovation strategy.

Working together towards better data standards

In 2019, Organize investments linked the array of CGIAR data assets in new ways to facilitate new partnerships and innovation.

A push towards semantic data at CGIAR

  • A Bioversity-based team engaged broadly to enhance agrisemantics standards and ontologies, including with the Environment Ontology, Food Ontology, SDG Interface Ontology, and Planteome Project teams.
  • WorldFish began development of a Fish Ontology.
  • Almost all CGIAR Centers implemented the CG Core Metadata Schema v.2.0 and applied the AGROVOC controlled vocabulary in annotation. Several centers also used ontologies, among them the Bioversity-CIAT Alliance, IITA, IRRI, IWMI, CIMMYT, and CIP.
  • IRRI enhanced the Rice Ontology, adapted its Farm Household Survey Database for machine-readability, developed a pilot database for UAV images and metadata, and shared UAV controlled vocabulary terms.
  • IWMI improved metadata workflows, incorporating existing ontologies.

Generating FAIR data at collection: AgroFIMS

V.1.0 of the Agronomy Field Information Management System (AgroFIMS) was released. AgroFIMS employs semantic standards to generate FAIR data at collection. Life sciences research is moving inexorably towards data annotation leveraging standard semantics and logic, and in 2019 the Platform made important contributions to these community standards by describing agronomic, socioeconomic, and survey data, and updating the CGIAR metadata standards.


Rice Functional Genomics and Breeding database v2.0

IRRI has updated the Rice Functional Genomics and Breeding (RFGB) database of 3000 rice genomes (3K-RG) to include new features, annotations, and data.

In the new RFGB v2.0, new phenotypes and haplotypes allow associations to be inferred, enabling breeders and geneticists to narrow down candidate gene targets for validation.

The new version of the database complements others that use 3K-RG data as their foundation. It draws on contributions from, and strengthens partnerships with, an array of partners including the Shenzhen Institute of Breeding for Innovation CAAS, Institute of Genetics and Developmental Biology CAS, Nanjing AU, China AU, BGI Shenzhen, and Shanghai Jiao Tong University.

Applying Machine learning to mine GARDIAN data

The Organize Module tested data mining techniques to apply machine learning to GARDIAN’s large data pool, working with the University of Florida to auto-generate harmonized datasets for input to crop models, and with the University of California, Davis to identify an approach to harmonize key data variables and apply machine learning and spatial approaches to GARDIAN data.
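
To give a sense of what harmonizing key data variables means in practice, the sketch below renames differently labelled variables to one shared name and converts their units. It does not represent the approach developed with the University of Florida or the University of California, Davis; the mapping table, variable names, and sample records are hypothetical.

```python
# Hypothetical lookup table: source column name -> (harmonized name,
# factor converting the source unit to the target unit).
HARMONIZATION = {
    "yield_kg_ha": ("grain_yield_t_ha", 0.001),
    "yld_t_ha": ("grain_yield_t_ha", 1.0),
    "rain_mm": ("rainfall_mm", 1.0),
    "precip_cm": ("rainfall_mm", 10.0),
}


def harmonize(record):
    """Rename known variables and convert their units; keep other fields unchanged."""
    out = {}
    for name, value in record.items():
        if name in HARMONIZATION:
            target, factor = HARMONIZATION[name]
            out[target] = round(value * factor, 3)
        else:
            out[name] = value
    return out


# Two made-up records that describe the same variables in different ways.
dataset_a = {"site": "A1", "yield_kg_ha": 2400, "rain_mm": 610}
dataset_b = {"site": "B7", "yld_t_ha": 3.1, "precip_cm": 45.5}
harmonized = [harmonize(dataset_a), harmonize(dataset_b)]
```

After this step both records share the same variable names and units, which is the precondition for pooling them as input to a crop model or a machine learning pipeline.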

Capacity Building

The Platform invests in learning and capacity building initiatives to accelerate data sharing and analytic capabilities across CGIAR. In 2019, the Platform backstopped the adoption of data-related outputs, standards, and tools via guidance and webinars.

Data management

  • A multi-module online course on best practices in open, FAIR, and ethical data assets is available for CGIAR researchers.
  • Several Centers including the Bioversity-CIAT Alliance, CIP, ILRI, and IITA developed domain-specific materials on data management best practices, often involving researcher data champions.
  • Through UC Davis, regional workshops were offered at CIMMYT, CIP, ICRISAT, and IFPRI on data science-related topics, including advanced R, spatial predictive modeling, and machine learning, and a data wrangling/processing guide was developed.

CGIAR Centers engaged researchers on data management through data sprints, data clinics, hackathons, “curathons”, and other modalities:

  • The Alliance Bioversity-CIAT held three trainings on data management planning across CIAT regions, while Bioversity-based staff participated in data sprints.
  • ILRI organized help-desks and data clinics, a data sprint and four trainings on basic research management.
  • IITA launched a campaign on best practices for data quality.
  • CIFOR embedded data teams into projects.
  • CIMMYT worked with stakeholders to promote good data management.
  • ICRISAT organized two hackathons and four data management trainings.

Webinars

Transitioning to an online, inclusive Convention in 2020

The need for resilient food systems comes into stark relief during a crisis like the COVID-19 pandemic. Responses must be agile and adaptive, facilitating the quickest possible recovery while equipping food systems to adaptively manage or avert crises in the future.

2020 presents a unique opportunity for us to “walk the talk” on agile, adaptive, digitally-enabled collective action. We have transitioned our annual convention to be an inclusive, accessible and fully online event.

The event theme, Digital Dynamism for Adaptive Food Systems, will examine food system resilience and highlight how digital tools and technologies can help us sense, respond and (re)build better systems in times of global food security crises.

Supporting digital innovation in response to COVID-19: Rapid Response Grants

In response to the unfolding food security issues brought about by the COVID-19 pandemic, the BIG DATA Platform has made funding available for agile, big-data enabled projects working to tackle food system challenges.

The Inspire Challenge Rapid Response Grants, totaling up to USD100,000, are available to current or previous Inspire Challenge winners—groundbreaking innovations that use big data approaches to solve agricultural development challenges across the globe.

BIG DATA and the digital transformation of One CGIAR

International agricultural science is facing a new era. Several analyses indicate that there will be increasingly intense challenges in the next decade across demographic, natural resource, ecological, and climatic dimensions, and that the window of opportunity for mitigating or reversing the most harmful effects of these challenges is quickly closing. There will also likely be unprecedented rates of technology innovation and adoption in the coming years that, if harnessed and applied properly, could provide critical, cross-domain tools for adapting to these challenges and facilitating a shift towards more favorable potential futures.

A unified, digital CGIAR could capitalize on several important comparative advantages in this evolving agricultural research for development landscape:

  • Leveraging global partnership networks and data-driven engagement with smallholder farmers worldwide.
  • Being in a position to help measure environmental progress and move global agriculture towards agroecological intensification and carbon neutrality.
  • Its long history of managing the dissemination and adoption of technology innovations in developing economies.

To claim these advantages most effectively in service of its purpose, CGIAR will need a “strategic posture” geared toward guiding global food, land, and water systems towards more favorable potential futures in 2030.

In mid-June 2019, CGIAR decided on a fundamental reform: its 15 research centers, spread over three continents, will be consolidated. The reform was announced by Marco Ferroni, Chairman of the System Management Board:

“The core of the reform is a systemic approach to all areas. This will include the creation of a superordinate management structure to organize all phases of development more effectively.”

In 2019, the CGIAR System Management Board tasked the BIG DATA Platform to develop a digital strategy in preparation for the consortium’s One CGIAR transformation.

We will be publishing more about this process in the coming months.

The Platform is carried out with support from the CGIAR Trust Fund, UKAID and through bilateral funding agreements.

Credits

Project leader: Marianne McDade
Writing & Editing: Marianne McDade, Stefanie Neno & Hannah Craig
Concept & Web Implementation: Stefanie Neno