Shared services

Enabling access to shared services in support of CGIAR research and its researchers

The Platform empowers CGIAR and its community to deliver on the potential of big data to bring results for smallholder agriculture.

A key part of the Platform’s Convene Module strategy is to develop ambitious partnerships to facilitate collaboration and ideation and build CGIAR capacity on big data approaches.

The Platform has enabled access to shared services in support of CGIAR research. The services described below are available to all CGIAR researchers. If you would like to benefit from any of these services, please contact Jawoo Koo at j.koo@cgiar.org.

Additional capacity building initiatives include our courses and regular webinars.

Help us optimize your use of these subscriptions by sharing your feedback in this short five-minute survey.

If you have found these services useful to your work, please acknowledge us following these guidelines.

Big Data Services

GARDIAN: The Global Agricultural Research Data Innovation & Acceleration Network

The primary focus in the early years of the Platform is on getting CGIAR data resources organized and FAIR (Findable, Accessible, Interoperable, Reusable). The Platform is working on establishing the infrastructures, tools, and approaches to make CGIAR data visible and usable, and has developed FAIR Metrics and downloadable Guidelines for making data assets FAIR. It is also working with all 15 Centers to build capacity across the CGIAR system to effectively manage its valuable data resources.

GARDIAN, the Global Agricultural Research Data Innovation & Acceleration Network, is the CGIAR flagship data harvester. GARDIAN enables the discovery of publications and datasets from the thirty-odd institutional publication and data repositories across all CGIAR Centers, supporting value addition and innovation through data reuse.

COLLABORATIVE DATA SCIENCE ENVIRONMENT

Collaborative GARDIAN Labs

As a part of the GARDIAN Ecosystem, the Platform will soon launch the Collaborative GARDIAN Labs (CGLabs), an open collaborative data science platform that allows researchers to work together on the same data science project using datasets securely transferred from GARDIAN and other trusted sources.

CGLabs will support the discovery, visualization, and analysis of datasets, as well as collaborative analytics, using the R and Python programming languages. CGLabs provides secure transfer and storage of program code and data files through Globus, another core Shared Service that the Platform provides.

Shared code will help accelerate the customization of analytics and avoid duplicated effort. This will ultimately contribute to advances in science by making published work reproducible and increasing efficiency.

With CGLabs, we anticipate lowering the barrier to the practical use of big data in agricultural research. Stay tuned for the exciting launch of CGLabs in mid-2020!

Subscription Services

Secure sharing and transfer of large datasets

The CGIAR Platform for Big Data in Agriculture activated a subscription to Globus, a grid computing alliance providing tools for secure management of data. The Platform is testing Globus to determine whether it can become a pan-CGIAR data infrastructure.

The subscription offers comprehensive data management capabilities for CGIAR researchers, including file sharing, easy and secure transfer of large datasets, access to cloud storage, protected data management with the setting of appropriate access permissions for sensitive data, advanced endpoint administration, and much more.

Commercial satellite imagery

CGIAR has partnered with Maxar’s DigitalGlobe to accelerate machine learning solutions for agriculture. Under the partnership, CGIAR’s geospatial scientists will mine DigitalGlobe’s 100-petabyte imagery library using machine learning and the computational power of the company’s Geospatial Big Data platform (GBDX) to create more sophisticated baseline datasets in agriculture, plan new projects, and monitor crop health, crop yield, and the environmental impacts of farming.

The CGIAR Platform for Big Data in Agriculture conducted several training sessions and provided technical assistance to help CGIAR researchers make the most of satellite imagery and the GBDX processing platform. This allows scientists to use GBDX to examine land tenure (focusing initially on India, Ethiopia, and West Africa), crop yield and crop production estimation, water resource conservation in South Asia, and pest and disease monitoring to develop early interventions (focusing initially on the fall armyworm in sub-Saharan Africa).

Gridded global weather data

Localized, accurate weather data helps CGIAR researchers analyze farming risk, advise farmers on when to plant, provide warnings and recommendations for pest and disease control, and even identify optimal harvest times to maximize profits.

The Platform has secured access for CGIAR researchers to validated high-resolution gridded weather data from multiple sources, including The Weather Company, aWhere, and the European Centre for Medium-Range Weather Forecasts (ECMWF). The Platform provides the weather data through Application Programming Interfaces (APIs) for advanced users and also facilitates the reanalysis of weather data to serve a broad range of users (e.g., weather data in GIS and crop model-compatible formats to serve the communities of geospatial scientists and crop modelers, respectively).

High-resolution population data

The Platform facilitated unrestricted access for CGIAR to the LandScan™ global gridded population dataset by securing a subscription with Oak Ridge National Laboratory.

LandScan is a community standard for global population distribution data and is widely regarded as one of the best available population datasets. At approximately 1 km (30″ × 30″) spatial resolution, it represents the ambient population distribution (averaged over 24 hours).
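
To see why a 30 arc-second grid is described as "approximately 1 km", the short Python sketch below converts the angular cell size into ground distance. It is only an illustration and assumes a spherical Earth with a mean radius of 6371.0088 km; the function name `cell_size_km` is hypothetical and not part of any LandScan tooling. Under that assumption the north-south extent stays just under 1 km everywhere, while the east-west extent shrinks with the cosine of the latitude.

```python
import math

# Assumption: spherical Earth with the mean radius below (not an ellipsoid).
EARTH_RADIUS_KM = 6371.0088


def cell_size_km(lat_deg, arcsec=30.0):
    """Return (east_west_km, north_south_km) for a grid cell at a latitude.

    Hypothetical helper for illustration only.
    """
    # Convert the angular cell size from arc-seconds to radians.
    angle_rad = math.radians(arcsec / 3600.0)
    # Arc length along a meridian is the same at every latitude.
    north_south = EARTH_RADIUS_KM * angle_rad
    # Arc length along a parallel shrinks with cos(latitude).
    east_west = north_south * math.cos(math.radians(lat_deg))
    return east_west, north_south


if __name__ == "__main__":
    for lat in (0, 30, 60):
        ew, ns = cell_size_km(lat)
        print(f"latitude {lat:2d}: {ew:.3f} km x {ns:.3f} km")
```

In practice this means a cell covers close to 1 km² only near the equator, so per-cell counts at higher latitudes refer to smaller ground areas.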