Organize
Support and improve data generation, access, and management in CGIAR
The Platform embraces the power of big data analytics, supporting CGIAR as it becomes a leader in generating actionable data-driven insights for stakeholders.
It builds capacity throughout CGIAR to generate and manage big data, assisting CGIAR and its partners’ efforts to comply with open access / open data principles to unlock important research and datasets. It empowers researchers to strengthen data analytical capacity, developing practical big data tools and services in a coordinated way. It also addresses critical gaps, both organizational and technical, expanding the horizon of CGIAR research.
Data quality at the source
The Data Management Strategy is based on three pillars: establishing a process, supporting compliance, and enabling a data culture in alignment with the CGIAR Open Access and Data Management (OADM) Policy. In addition, the Big Data Platform has invested in an online tool (GARDIAN) that enables users to easily search and discover open datasets and publications across databases at all CGIAR Centers, with the intention of making this a key mechanism for monitoring and measuring compliance with the CGIAR open access policy. Key guiding principles of the strategy include:
- In accordance with the CGIAR OADM Policy, the Big Data Platform is mandated to produce international public goods and ensure that these are open via FAIR principles – that is, the data are Findable, Accessible, Interoperable and Reusable. This enables the data to be used to enhance innovation, impact, and uptake.
- The Big Data Platform also provides data managers at Centers a Data Management Support Pack. This tool was designed to help the research community produce high quality, reusable, and open data from research activities. It consists of documents, templates, and videos covering a range of aspects related to data management and interoperability, ranging from overarching concepts and strategies through to day-to-day activities.
- The Big Data Platform coordinates and supports a monthly webinar series and a number of cross-Center groups and Communities of Practice. These activities are designed to support the management and “FAIRification” of information resources, and include a Community of Practice on Ontologies that helps to classify agronomic and breeding concepts and knowledge.
To monitor and accelerate CGIAR Research Centers’ progress towards making their data Findable, Accessible, Interoperable and Reusable (FAIR), the Platform developed and launched a robust prototype of the first pan-CGIAR data search tool, which lets any user run keyword searches to discover CGIAR publications and datasets.
This data harvesting tool has been named the Global Agriculture Research Data Innovation and Acceleration Network (GARDIAN).
As of August 2018, GARDIAN showcased more than 93,000 publications and 2,100 datasets; it continues to grow, with several new features planned for 2018.
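As a rough illustration of the idea of keyword search over harvested metadata, the sketch below builds a small inverted index in Python. It is not GARDIAN's implementation: the record fields, the identifiers, and the function names `tokenize`, `build_index`, and `search` are all hypothetical, and the sample records are made up for the example.

```python
import re
from collections import defaultdict

# Hypothetical harvested metadata records (illustrative only, not real GARDIAN data).
RECORDS = [
    {"id": "rec-1", "type": "dataset", "title": "Maize yield trial data"},
    {"id": "rec-2", "type": "publication", "title": "Drought tolerance in maize"},
    {"id": "rec-3", "type": "dataset", "title": "Rice soil survey"},
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(records):
    """Map each title token to the set of record ids whose title contains it."""
    index = defaultdict(set)
    for record in records:
        for token in tokenize(record["title"]):
            index[token].add(record["id"])
    return index


def search(index, query):
    """Return the sorted ids of records whose title contains every query token."""
    tokens = tokenize(query)
    if not tokens:
        return []
    hits = set.intersection(*(index.get(token, set()) for token in tokens))
    return sorted(hits)


index = build_index(RECORDS)
print(search(index, "maize"))
print(search(index, "maize data"))
```

A real harvester would index many more metadata fields (authors, keywords, abstracts) and rank results; the sketch only shows the basic lookup step.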
How GARDIAN changes the way we search and find data
Starting from machine-readable, structured data that is available under an open license in the different CGIAR Center repositories, and achieving FAIRness at the metadata level, is only the first step towards the larger goal of deriving actionable knowledge and value from shared research data. That larger goal is pursued by:
- using data mining to discover the meaning (semantics) of data,
- expressing that meaning as Semantic Web resources, and
- reusing established specifications (e.g. W3C RDF).
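To make the idea of expressing metadata as Semantic Web resources more concrete, here is a minimal sketch that writes one dataset description as RDF in the N-Triples syntax, using only the Python standard library. The record, its `https://example.org/...` URI, and the helper names are assumptions made for illustration. The vocabulary terms (`dcterms:title`, `dcat:Dataset`, `dcat:keyword`) are established W3C and Dublin Core terms, but the source does not say which vocabularies GARDIAN itself uses.

```python
# Established vocabulary IRIs (RDF, Dublin Core Terms, W3C DCAT).
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DCT_TITLE = "http://purl.org/dc/terms/title"
DCAT_DATASET = "http://www.w3.org/ns/dcat#Dataset"
DCAT_KEYWORD = "http://www.w3.org/ns/dcat#keyword"


def iri(value):
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value):
    """Quote a plain string literal, escaping backslashes and double quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def record_to_ntriples(record):
    """Serialize a metadata record (hypothetical shape) as N-Triples lines."""
    subject = iri(record["uri"])
    lines = [
        f"{subject} {iri(RDF_TYPE)} {iri(DCAT_DATASET)} .",
        f"{subject} {iri(DCT_TITLE)} {literal(record['title'])} .",
    ]
    for keyword in record.get("keywords", []):
        lines.append(f"{subject} {iri(DCAT_KEYWORD)} {literal(keyword)} .")
    return "\n".join(lines)


# Hypothetical record; the URI is a placeholder, not a real dataset.
example = {
    "uri": "https://example.org/dataset/1",
    "title": "Maize yield trial data",
    "keywords": ["maize", "yield"],
}
print(record_to_ntriples(example))
```

Reusing shared vocabulary terms rather than inventing field names is what lets independent tools interpret the same description consistently.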
Privacy & Ethics Guidelines
While enthusiasm for data sharing grows, we have been working to ensure that data sharing and use comply with ethical standards that protect those who could be vulnerable to exploitation. An ongoing predicament is how to protect private farm and farmer data while still providing farmers with valuable personalised solutions. To address this, the Platform has been developing high-level risk assessments and guidelines for the Platform and the CGIAR System as a whole.
In 2017, the Platform engaged a lawyer to survey the privacy and ethics frameworks of all Centers as well as external partners. In 2018, we completed surveys of each of the 15 Centers’ privacy and ethics standards and are in the final stages of developing a set of guidelines to help researchers navigate the evolving implications of technology, confidentiality, intellectual property, consent, access and sharing of benefits.
We will build on this work over the coming year to produce guidelines and actionable support for Centers and others in the sector.
Learning and Capacity Building
The Platform is investing in learning and capacity building initiatives to accelerate data sharing and strengthen key analytic capabilities across CGIAR. These are delivered through a combination of in-person and online channels to raise the awareness and capability of Centers to share data and to use emerging data analysis techniques.
Since allocating grants to each of our 15 partner Centers, we have seen several inspiring trends emerge as each Center implements strategies to mobilise data:
- Increased investment in developing data repositories and software infrastructure to build open data sharing and storage capabilities
- Investment in new staff roles for data curation, collection, and analysis, while providing additional training for current staff
- Reallocation of staff resources to collect, store, and unlock data.
The Platform hosts monthly webinars for CGIAR members, discussing how to engage and improve upon capacity building initiatives. Contact firstname.lastname@example.org for more details.