Information and Data Management

Community of Practice

Welcome!

The CGIAR Open Access and Data Management Policy was ratified by all CGIAR Centers in November 2013 as a commitment by CGIAR to the widespread diffusion and reuse of its research and development activities as international public goods. The task of managing CGIAR research outputs in conformance with the Policy has been a key responsibility of Center information specialists and data managers. As part of the Open Access, Open Data Initiative (2014 – 2016) and the Big Data Platform’s Organize Module (2017 to present), these information and data managers organized themselves into the Data Management Task Force (DMTF) and the Open Access Working Group (OAWG). In November 2018, members of these groups voted to consolidate into the Information and Data Management Community of Practice (also known as the IDM CoP).

The CoP works through its Working Groups to develop and implement common standards, tools, and approaches, and already engages with external entities on a variety of fronts (including but not limited to Harvard University for Dataverse repository concerns; Earlham Institute for the COPO data annotation tool; FAO, GODAN, RDA, and other organizations on agrisemantic standards). The CoP works to enhance capacity and culture change around data management at Centers, and helps the CGIAR community stay current in a rapidly changing data and publications landscape.

Over the past five years, members of the IDM CoP have worked together to tackle cultural and technical challenges to enable CGIAR’s data assets to be increasingly open as well as FAIR (Findable, Accessible, Interoperable, and Reusable). We welcome those beyond CGIAR who would like to join us to enhance cross-learning on all sides, to work collaboratively in implementing common standards and approaches towards open and FAIR data, and to leverage opportunities for broader linkages and exchange.

Get in touch

CoP Lead

Medha Devare | Email

Administration

Celine Aubert | Email

Engage with the Community

BE INFORMED

Tune in to the latest CoP updates! Please see previous newsletters for more information and to subscribe.

ENGAGE

Interact with community members via our LinkedIn group.



CONTRIBUTE

Collaborate and co-learn with our working groups by subscribing to one or more.

See our Working Groups

Working Groups

Metadata Working Group

The Metadata Working Group is composed of data managers and information specialists and meets about once every two months. The working group aims to enhance the interoperability of CGIAR open access repositories (both publications and data) by developing a common metadata schema and guidelines. The CG Core and its associated guidelines (https://github.com/AgriculturalSemantics/cg-core) have been openly published online and are available for all to use.

Repository Working Group

The Repository Working Group addresses issues regarding data and information repository management. Members identify solutions through internal discussion or coordination with external partners or service providers and share best practices and experiences. The group meets monthly.

Ontology Working Group

The Ontology Working Group works to ensure collaboration across ontology efforts and use within CGIAR, and links to efforts outside the System. This group acts as an advisory group for the Ontologies CoP by monitoring the progress of thematic groups like those overseeing the development of the Socioeconomic, Fish and Plant Phenotypes ontologies. Members discuss key topics and issues around ontology gap filling and ontology-supported data annotation. Relevant topics are then brought to the attention of the Ontologies CoP and other working groups within the IDM CoP.

Open Access Working Group

The Open Access Working Group serves as a forum for issues related to Open Access (OA) and scholarly publishing. Members work together to develop educational and promotional material on OA, advocate for stable OA funding, and help with reporting on the impact of OA.

Globus Working Group

This working group seeks to bring together a community of people working on the research life cycle to use the APIs, tools and services provided by Globus for secure data sharing and more effective and efficient data management. Globus is a non-profit service managed by the University of Chicago to provide unified access to research data across all systems (high performance computing cluster, laptop, in-cloud or on-premise storage) using any existing identity. Globus allows researchers to efficiently, securely, and reliably transfer data directly between systems, be they separated by an office wall or an ocean.

Meeting notes and key outputs are made available on group sites maintained through the Organize Module and via GitHub. A well-attended CoP webinar series is focused on capacity enhancement; there were 8 webinars in 2019, attended by 35 people on average. There is one annual face-to-face meeting; in 2017, 2018 and 2019 this was held in conjunction with the Big Data Convention. These meetings are attended by an average of around 40 members representing all Centers. The 2019 agenda and meeting report are available, along with those of past meetings.

News

Webinar – All about GARDIAN

December 16, 2021

This webinar by the Info and Data Management Community of Practice presents GARDIAN, the Global Agricultural Research Data Innovation & ...

Webinar – OpenSAFELY for sensitive agricultural data

December 9, 2021

This webinar by the Info and Data Management Community of Practice presents OpenSAFELY, a secure analytics platform for electronic health ...

Webinar – Collaboration with the FAO on AGROVOC and AGRIS – The CGIAR exemplar

June 18, 2021

The Information and Data Management Community of Practice presents the collaboration between CGIAR and AGROVOC– FAO ...

Webinar – Challenges of managing qualitative data: Stories from the trenches

April 23, 2021

The Information and Data Management CoP hosts a webinar to discuss the challenges faced when managing qualitative data, including the ...

Webinar – Enabling Data-Driven Transformation of Agriculture

December 3, 2020

The Information and Data Management Community of Practice presented a webinar on data-driven transformation that could enable hyper-local solutions at ...

2020 Convention session – CGIAR’s management of data assets

November 20, 2020

This session on CGIAR's management of data assets aired live at the 2020 virtual CGIAR Convention on Big Data in ...

Resources

FAIR data, Ontologies

AgroFIMS

The Agronomy Field Information Management System (AgroFIMS) consists of modules that represent the typical cycle of operations in agronomic trial management, and enables the creation of data collection sheets using the same ontology-based set of variables, terminology, units and protocols, hence generating FAIR data at collection.

Metadata

CG Core Agricultural metadata scheme

A list of the metadata elements used to describe all types of information products published by the different CGIAR Centers.

Ontologies

The Agronomy Ontology (AgrO)

AgrO provides terms from the agronomy domain that are semantically organized and can facilitate the collection, storage and use of agronomic data, enabling easy interpretation and reuse of the data by humans and machines alike.

Ontologies

The Crop Ontology (CO)

The CO’s current objective is to compile validated concepts and their inter-relationships on anatomy, structure and phenotype of crops, on trait measurement and methods as well as on germplasm with the multi-crop passport terms.

Privacy & PII

Responsible Data Guidelines

Guidelines intended to help agricultural researchers handle privacy and Personally Identifiable Information (PII) in the research project data lifecycle.

Privacy & PII

PII engine: Check your data for PII

This tool identifies personally identifiable information in a dataset in order to preserve the privacy of research participants.

Webinars

2021

2020

2019

2018

Documents

2021 Work Plan – Detailed description of expected outcomes and deliverables for 2021

Newsletters

Communities

AGRONOMY

CROP MODELING

GEOSPATIAL

INFO & DATA MANAGEMENT

LIVESTOCK

ONTOLOGIES

SOCIO-ECONOMIC



Interested in joining our community of practice?


AgroFIMS: Your new companion for easy standardization of data collection and description

The Agronomy Field Information Management System (AgroFIMS) allows users to create fieldbooks to collect agronomic data that is already tied to a metadata standard (the CG Core Metadata Schema, aligned with the Dublin Core standard) and to semantic standards such as the Agronomy Ontology (AgrO), generating data that is Findable, Accessible, Interoperable, and Reusable (FAIR) at collection. AgroFIMS therefore standardizes data collection and description for easy aggregation and inter-linking across disparate datasets. The fieldbooks you create can be exported to the Android-based KDSmart data collection application, and collected data can be imported back to AgroFIMS for statistical analysis and reports. In 2021 AgroFIMS will allow you to set up agronomic survey questionnaires for data collection via ODK. It will also allow easy upload of your “born FAIR” data to Dataverse repository platforms with Dublin Core-compliant metadata schemas. Funding for AgroFIMS was provided by the Bill and Melinda Gates Foundation’s Open Access, Open Data Initiative, and the CGIAR Platform for Big Data in Agriculture. AgroFIMS is released under the GPL license. Go to AGROFIMS →

Responsible Data Management Guidelines to protect privacy

The CGIAR Platform for Big Data in Agriculture advocates open data for agricultural research for development. It considers that opening up research data for scrutiny and reuse confers significant benefits to society.

However, the Platform appreciates that not all research data can be open and that a broad range of legitimate circumstances may require data to be restricted.

As an integral component of its advocacy for open data, the Platform promotes responsible data management across the entire research data lifecycle: planning, collecting, storing, disclosing or publishing, transferring, discovery and archiving.

These guidelines were created from information collected from: a review of best and emerging practices across various sectors in the fast-changing landscape of privacy and ethics (130 external resources); privacy and ethics materials sourced from seven CGIAR Centers; and input and feedback on a first draft that was circulated across CGIAR and incorporated into this edition. It’s important to note that this is an evolving document; the next stage is to consult externally for further input.

These Guidelines are intended to help agricultural researchers handle privacy and personally identifiable information (PII) in the research project data lifecycle.

Check the guidelines →

REUSE / TRANSFER

Ensure consistency with the DMP-PII and the purpose for which prior informed consent has been obtained
Re-evaluate likelihood of (re-)identification and risk of harm, particularly if it involves a public dataset containing PII (as above)
Ensure PII is stored securely to protect privacy (as above)
Minimize use of PII and risk of disclosure through pro-privacy access controls and analytical tools (as above)

Don’t transfer data containing PII unless you have explicit consent
Don’t transfer data containing PII in the absence of a data sharing agreement (identifying aspects such as purpose and scope of use, privacy protection measures, confidentiality and any limitations)
Don’t reuse or transfer PII until any inconsistencies with the DMP-PII and/or purpose compatibility have been resolved (e.g. through updated ethics review or consent from participant)

ARCHIVING / DISCARDING

Plan for archiving or data destruction early in the process. Destroying data can be more secure; however, archiving can be beneficial if the data has ongoing evidentiary, scientific or cultural value. If archiving, identify where and how, the budget require
Ensure DMP-PII and purpose compatibility (as above)
Ensure adequate security measures to protect privacy (as above)

Don’t wait until the end of the project to assess archiving needs when time and resources may be limited
Don’t assume the longevity of a particular format; future-proof your archived data
Don’t forget to budget for archiving data; this should be done as part of your Data Management Plan

PUBLISHING AND DISCOVERY

Ensure DMP-PII and purpose compatibility (as above)
Re-evaluate likelihood of (re-)identification and risk of harm, particularly if it involves a public dataset containing PII
Indicate in metadata the availability of raw data or minimized data containing PII, if available bilaterally
Minimize use of PII and risk of disclosure through pro-privacy access controls and analytical tools

Don’t include PII in public datasets unless absolutely necessary to preserve the data’s analytic potential, scientific utility or benefit to the participant (and subject to participants’ informed consent and a rigorous risk assessment)

STORAGE AND ANALYSIS

Ensure compatibility with the DMP-PII (as above) and also the purpose for which prior informed consent has been obtained

Ensure PII is stored securely to protect privacy, through organizational or project-specific safeguards to prevent unauthorized access, accidental disclosure or breach of data (physical & technical):
- encryption for the storage and transmission of PII
- access control measures to limit access to PII
- two-factor or multifactor authentication
- cloud services & back-end security

Don’t store data in unsecured locations or on unsecured devices or servers

Don’t store encrypted data and encryption keys in locations where they can be easily accessed simultaneously

Don’t underestimate the importance and value of administrative safeguards to standardize practices (i.e. organizational policies, procedures and maintenance of security measures that are designed to protect private information, data and access)

COLLECTION

Ensure compatibility with the DMP-PII
De-identify data to anonymize by default unless it will impair the data’s analytic potential, scientific utility or benefit to the participant
If you cannot anonymize, minimize the PII and pseudonymize to reduce the disclosure risk
Provide research participants sufficient information to use reasoned judgment to decide whether or not they wish to participate in the project
Ensure informed consent is designed to address the following elements:
- competence, comprehension, full disclosure, voluntariness
- legitimate scientific purpose for which the PII is collected and scope of use (e.g. stored, transferred, published and whether as anonymized, minimized or raw data)
- foreseeable risk of privacy loss and consequences
- meaningful alternatives including opt-in protection/anonymization
- safeguards to protect privacy, conditions on which PII may be shared and any limitations on reuse or third-party access and use of PII
- permission to follow up or contact the participant and for what purpose (including by third parties)
- participant’s right to withdraw and rights regarding their data (e.g. to be informed; to access; to rectify; to object; to erase)
- inclusion of physical, phone and/or electronic contact details (at least two forms of contact) that the participant can use to exercise their rights
- explicit consent and participant’s acknowledgement of understanding
- if written, provide the participant a copy of the processed informed consent
Use plain language and adapt informed consent to meet the needs of vulnerable populations (e.g. obtain orally or in local language)
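
The advice above to minimize PII and pseudonymize when data cannot be anonymized can be sketched in Python as follows. This is a minimal illustration and not part of the Guidelines: it assumes that a keyed hash (HMAC-SHA256) is an acceptable pseudonymization technique for the project, and the function names, field names, key and sample record are all hypothetical. The secret key would need to be stored separately from the data.

```python
import hashlib
import hmac


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Map a direct identifier to a stable pseudonym with HMAC-SHA256."""
    # Normalize so that "Jane Doe" and " jane doe " map to the same pseudonym.
    normalized = value.strip().lower().encode("utf-8")
    # Truncating to 16 hex characters is an illustrative choice, not a requirement.
    return hmac.new(secret_key, normalized, hashlib.sha256).hexdigest()[:16]


def minimize_and_pseudonymize(record, keep_fields, id_fields, secret_key):
    """Keep only the fields the analysis needs and pseudonymize identifiers."""
    cleaned = {}
    for field in keep_fields:
        if field not in record:
            continue
        value = record[field]
        if field in id_fields and value is not None:
            value = pseudonymize(str(value), secret_key)
        cleaned[field] = value
    return cleaned


# Hypothetical survey record; the key is a placeholder and must be kept secret.
key = b"replace-with-a-randomly-generated-secret"
raw = {"name": "Jane Doe", "phone": "+254700000000", "village": "A", "yield_t_ha": 2.4}
safe = minimize_and_pseudonymize(
    raw,
    keep_fields=["name", "village", "yield_t_ha"],
    id_fields={"name"},
    secret_key=key,
)
print(safe)  # name is replaced by a hex pseudonym; phone is dropped
```

Pseudonymized data can still be re-identified by anyone holding the key or by combining indirect identifiers, so under these Guidelines it would still be handled as PII.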

Don’t collect PII unless you have a Data Management Plan and any necessary approvals in place, including the recorded approval of the potential participant
Don’t collect PII unless you absolutely need it
Don’t assume that removal of direct identifiers is sufficient to anonymize data or that all de-identification techniques will result in anonymized data. Consider the risk of re-identification of a research participant, particularly if datasets are combined. If there is a reasonable risk of re-identification the information should be handled as PII (i.e. undertake risk analysis, evaluate stronger anonymization techniques, seek informed consent for the disclosure of data and explain its possible consequences)
Don’t include vulnerable participants or communities if their ability or capacity to provide voluntary informed consent is genuinely in question
Don’t underestimate the potential of quasi or indirect identifiers to identify an individual, particularly the inherent ability of location-based data to identify participants and their communities, and the increased risk of harm this may pose to potentially vulnerable individuals/communities
Avoid seeking overly broad consent that may call into question transparency or a research participant’s understanding regarding the use of their PII; be specific regarding the activities, purpose and limitations associated with PII so that the participant can make a genuinely informed decision and downstream users can evaluate purpose compatibility and seek fresh consent if needed
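
The re-identification risk from quasi or indirect identifiers described above can be screened with a simple group-size check. The Python sketch below is an illustration only and is not drawn from the Guidelines: it counts how many records share each combination of quasi-identifier values (a k-anonymity style check) and flags records whose combination is shared by fewer than k records. The field names, sample rows and threshold are hypothetical.

```python
from collections import Counter


def group_sizes(records, quasi_identifiers):
    """Count records sharing each combination of quasi-identifier values."""
    return Counter(tuple(rec.get(q) for q in quasi_identifiers) for rec in records)


def records_below_k(records, quasi_identifiers, k):
    """Return records whose quasi-identifier combination occurs fewer than k times."""
    sizes = group_sizes(records, quasi_identifiers)
    return [
        rec for rec in records
        if sizes[tuple(rec.get(q) for q in quasi_identifiers)] < k
    ]


# Hypothetical survey rows with direct identifiers already removed.
rows = [
    {"village": "A", "age_band": "30-39", "crop": "maize"},
    {"village": "A", "age_band": "30-39", "crop": "rice"},
    {"village": "B", "age_band": "60-69", "crop": "maize"},
]
flagged = records_below_k(rows, ["village", "age_band"], k=2)
print(flagged)  # the village-B record is unique on these two fields
```

Passing such a check does not by itself establish that data are anonymized; it only highlights records that deserve closer risk analysis, for example before datasets are combined or published.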

PLANNING AND APPROVAL

Develop a Data Management Plan which governs the handling of PII in the research project and beyond (DMP-PII). It should address:
- the type and nature of PII
- compliance requirements (including necessary forms for obtaining consent, and ethics clearance, if applicable)
- legitimate research objectives that will be advanced by the PII
- foreseeable risks and consequences if participants are identified from the data
- privacy protection measures (or lack thereof) for collection, storage, transfer and publishing
- process for obtaining informed consent
- timeframe or trigger for archiving or deletion of PII
Employ stricter standards for research involving vulnerable populations such as children or illiterate participants or sensitive data such as ethnicity or religious beliefs
Undertake due diligence on datasets previously collected by you or third parties to ensure you are entitled/permitted to use them for your research project
Consult the legal, IRB or ethics clearance committee or any other relevant institutional group for specific institutional, local, regional or national policies and regulatory frameworks that may apply to PII in the context of your work

Don’t leave the handling of PII and privacy protection as an afterthought; plan ahead!
Don’t forget to check local laws and donor or third-party requirements in addition to institutional policies governing research ethics and privacy protection (seek expert support if unsure!)
Don’t ignore ethical practices/standards; if your institution does not have an ethics framework or clearance process in place, self-assess!
In assessing whether information is capable of identifying someone (i.e. PII) don’t limit your focus to direct identifiers; also consider indirect/quasi identifiers. Appreciate this will depend on the context of the research project, the data in question and external data which is or may become otherwise available (i.e. there is no exhaustive list).
In assessing risk of harm don’t forget to consider potential harm to the participant’s community or groups of individuals that can otherwise be identified or associated with the participant
