Information and Data Management

Community of Practice

 

Welcome!

The CGIAR Open Access and Data Management Policy was ratified by all CGIAR Centers in November 2013, a commitment by CGIAR to the widespread diffusion and reuse of its research and development activities as international public goods. The task of managing CGIAR research outputs in conformance with the Policy has been a key responsibility of Center information specialists and data managers. As part of the Open Access, Open Data Initiative (2014 – 2016) and the Big Data Platform’s Organize Module (2017 to present), these information and data managers organized themselves into the Data Management Task Force (DMTF) and the Open Access Working Group (OAWG). In November 2018 members of these groups voted to consolidate into the Information and Data Management Community of Practice (also known as the IDM CoP).

The CoP works through its Working Groups to develop and implement common standards, tools, and approaches, and already engages with external entities on a variety of fronts (including but not limited to Harvard University for Dataverse repository concerns; Earlham Institute for the COPO data annotation tool; FAO, GODAN, RDA, and other organizations on agrisemantic standards). The CoP works to enhance capacity and culture change around data management at Centers, and helps the CGIAR community stay current in a rapidly changing data and publications landscape.

Over the past five years, members of the IDM CoP have worked together to tackle cultural and technical challenges to enable CGIAR’s data assets to be increasingly open as well as FAIR (Findable, Accessible, Interoperable, and Reusable). We welcome those beyond CGIAR who would like to join us to enhance cross-learning on all sides, to work collaboratively in implementing common standards and approaches towards open and FAIR data, and to leverage opportunities for broader linkages and exchange.

Get in touch

CoP Lead

Medha Devare | Email

Administration

Michelle Fotsy | Email

Engage with the Community

BE INFORMED

Tune in to the latest CoP updates! Please see previous newsletters for more information and to subscribe.

ENGAGE

Interact with community members via our LinkedIn group.

CONTRIBUTE

Collaborate and co-learn with our working groups by subscribing to one or more. 

Working Groups

Metadata Working Group

The Metadata Working Group is composed of data managers and information specialists and meets about once every two months. The working group aims to enhance the interoperability of CGIAR open access repositories (both publications and data) by developing a common metadata schema and guidelines. The CG Core and its associated guidelines (https://github.com/AgriculturalSemantics/cg-core) have been openly published online and are available for all to use.

Repository Working Group

The Repository Working Group addresses issues regarding data and information repository management. Members identify solutions through internal discussion or coordination with external partners or service providers and share best practices and experiences. The group meets monthly.

Ontology Working Group

The Ontology Working Group works to ensure collaboration across ontology efforts and use within CGIAR, and links to efforts outside the System. This group acts as an advisory group for the Ontologies CoP by monitoring the progress of thematic groups like those overseeing the development of the Socioeconomic, Fish and Plant Phenotypes ontologies. Members discuss key topics and issues around ontology gap filling and ontology-supported data annotation. Relevant topics are then brought to the attention of the Ontologies CoP and other working groups within the IDM CoP.

Open Access Working Group

The Open Access Working Group serves as a forum for issues related to Open Access (OA) and scholarly publishing. Members work together to develop educational and promotional material on OA, advocate for stable OA funding, and help with reporting on the impact of OA. 

Globus Working Group

This working group seeks to bring together a community of people working on the research life cycle to use the APIs, tools and services provided by Globus for secure data sharing and more effective and efficient data management. Globus is a non-profit service managed by the University of Chicago to provide unified access to research data across all systems (high performance computing cluster, laptop, in-cloud or on-premise storage) using any existing identity. Globus allows researchers to efficiently, securely, and reliably transfer data directly between systems, be they separated by an office wall or an ocean.

 

Meeting notes and key outputs are made available on group sites maintained through the Organize Module and via GitHub. A well-attended CoP webinar series focuses on capacity enhancement; there were 8 webinars in 2019, attended by 35 people on average. There is one annual face-to-face meeting; in 2017, 2018 and 2019 this was held in conjunction with the Big Data Convention. These meetings are attended by an average of around 40 members representing all Centers. The 2019 agenda and meeting report are available, along with those of past meetings.

News

Resources

AgroFIMS

The Agronomy Field Information Management System (AgroFIMS) consists of modules that represent the typical cycle of operations in agronomic trial management, and enables the creation of data collection sheets using the same ontology-based set of variables, terminology, units and protocols, hence generating FAIR data at collection.

CG Core Agricultural metadata scheme

A list of the metadata elements that are used to describe all types of information products that are published by the different CGIAR Centres.

The Agronomy Ontology (AgrO)

AgrO provides terms from the agronomy domain that are semantically organized and can facilitate the collection, storage and use of agronomic data, enabling easy interpretation and reuse of the data by humans and machines alike.

The Crop Ontology (CO)

The CO’s current objective is to compile validated concepts and their inter-relationships on anatomy, structure and phenotype of crops, on trait measurement and methods as well as on germplasm with the multi-crop passport terms.

Responsible Data Guidelines

Guidelines intended to help agricultural researchers handle privacy and Personally Identifiable Information (PII) throughout the research project data lifecycle.

COPO: A portal to describe, store, and retrieve plant data

COPO is a web-based data brokering system that enables scientists to describe their research objects (raw or processed data, publications, samples, images, etc.) using community-sanctioned metadata sets and vocabularies, and then use public or institutional repositories to share them with the wider scientific community.

CG Labs

The Platform launched the Collaborative GARDIAN Labs (CGLabs), the latest offering in the GARDIAN data ecosystem. CGLabs has a built-in collaboration platform that allows users to create private or public virtual spaces, invite members, receive notifications, and collaborate remotely and asynchronously. Users can also find colleagues via Find a CGIAR Expert to spark new collaborations.

Globus

The CGIAR Platform for Big Data in Agriculture has activated a subscription to Globus. The subscription offers comprehensive data management capabilities for CGIAR researchers, including file sharing, easy and secure transfer of large datasets, access to cloud storage, protected data management with the setting of appropriate access permissions for sensitive data, advanced endpoint administration, and much more.

CGIAR Expert Finder

The Expert Finder showcases the breadth of research expertise at CGIAR, with nearly 10,000 profiles distributed by geographies, Centers, funding agencies, and research publications. The Expert Finder builds on the VIVO semantic web application developed at Cornell University, with content collated from public sources, including bibliographic databases, GARDIAN, and CGIAR authoritative sources (such as Active Directory).

LandScan high-resolution population data

The Platform facilitated unfettered access for CGIAR to the global gridded LandScan™ population dataset by securing a subscription with Oak Ridge National Laboratory. LandScan is a community standard for global population distribution data and is widely regarded as one of the best available population datasets. At approximately 1 km (30″ × 30″) spatial resolution, it represents an ambient population distribution (averaged over 24 hours).
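
The "approximately 1 km" figure follows from the 30 arc-second cell size. The short Python sketch below shows that arithmetic. It assumes a spherical Earth with a mean radius of about 6371 km, and the function name `cell_size_km` is made up for this illustration; it is not part of LandScan or any CGIAR tool.

```python
import math
from typing import Tuple

# Assumption for this sketch: spherical Earth with the mean radius below.
EARTH_RADIUS_KM = 6371.0088


def cell_size_km(arc_seconds: float, latitude_deg: float = 0.0) -> Tuple[float, float]:
    """Return the (east-west, north-south) size in km of a grid cell.

    arc_seconds  -- angular size of the cell edge (30 for a 30-arc-second grid)
    latitude_deg -- latitude at which the cell sits (0 = equator)
    """
    angle_rad = math.radians(arc_seconds / 3600.0)  # arc-seconds -> degrees -> radians
    north_south = EARTH_RADIUS_KM * angle_rad       # arc length along a meridian
    # Meridians converge toward the poles, so east-west width shrinks with cos(latitude).
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return east_west, north_south


if __name__ == "__main__":
    for lat in (0, 30, 60):
        ew, ns = cell_size_km(30, lat)
        print(f"latitude {lat:>2} deg: {ew:.2f} km east-west x {ns:.2f} km north-south")
```

Under these assumptions a 30″ cell is a little under 1 km on each side at the equator and narrows in the east-west direction at higher latitudes, which is why the resolution is quoted as approximate.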

Global Climate Data – WorldClim

WorldClim is a set of global climate layers (gridded climate data in GeoTIFF format) that can be used for mapping and spatial modeling. WorldClim version 2 contains average monthly climatic gridded data for the period 1970-2000 at different spatial resolutions. The older version of WorldClim (version 1.4) contains gridded data for the same variables for the period 1960-1990, as well as temperature and precipitation projections from the IPCC Fifth Assessment Report.

FAIR Data Principles

The FAIR Data Principles are a set of guiding principles to make data Findable, Accessible, Interoperable, and Reusable. However, these principles are not orthogonal and were not designed for automated machine-based evaluation. To address this, we have adopted the Netherlands Institute for Permanent Access to Digital Research Resources (DANS) metrics for FAIR compliance.

Gridded Global Weather Data

The Big Data Platform has secured access for CGIAR researchers to validated high-resolution gridded weather data from multiple sources, including The Weather Company, aWhere, and the European Centre for Medium-Range Weather Forecasts (ECMWF). The Platform provides the weather data through Application Programming Interfaces (APIs) for advanced users and also facilitates the reanalysis of weather data to serve a broad range of users (e.g., weather data in GIS and crop model-compatible formats to serve the communities of geospatial scientists and crop modelers, respectively). Contact us.

TechChange FAIR course

This course has been developed to strengthen capacity in the CGIAR System and beyond to create, manage and share research and development data assets that are not only open (i.e., discoverable and downloadable), but also easily interpretable, interoperable, and reusable.

Events

[WEBINAR] FAIR Guidelines and Assessment in GARDIAN

June 25, 2019 | Online

[WEBINAR] Semantic Annotation of Images in the FAIR Data Era

September 3, 2019 | Online

Interested in joining our community of practice?

Sign up to our mailing list for community news and updates.