Women in Data Science

Featuring Elizabeth Arnaud, Data Scientist (Alliance of Bioversity and CIAT)

Elizabeth Arnaud

Data scientist (Bioversity International)

Currently, I am leading a project on the development of semantic standards for the harmonization of crop breeding and agronomic data for agriculture. The objective is to support data interoperability along the data exchange pipeline, from data capture to data publishing and analysis. This involves modeling the variables measured in the field or laboratory with electronic field books. To enable multi-scale data integration, breeding and agronomic data must be linked to the germplasm information, associated with environmental observations, and tied to a well-described experiment. I also coordinate the Bioversity International activities linked to the Platform for Big Data in Agriculture. I chair the CGIAR Ontology working group and the Ontology Community of Practice of the Big Data Platform.

How did you come to pursue a career in data science?
I am a biologist who obtained a master's degree in scientific data management after a student job at the library of my university. During this job, I developed an interest in informatics tools that could facilitate data management and access. While studying for my master's degree, I had to design a taxon management system to support the development and maintenance of taxonomies. I joined agricultural research for development and, with this background, I led several projects around the digitization of data on plant resources conserved in collections. I was involved in diversity studies of banana through the mapping of geospatial occurrence data.

Early in my CGIAR career, I was involved in the design of solutions for the digital capture of quality field data and their storage in information systems that I coordinated. These included the Musa Germplasm Information System, the CGIAR system-wide information network on genetic resources, and the collected crops sample database. I also became a mediator, or intermediary, between agronomists, breeders, and database developers.

What are the things you love about your role as a data scientist?
Solving data quality issues that hamper the publishing and re-use of data for analysis. I love facilitating the interaction between scientists and software developers so they can brainstorm together to find a solution to their data problems. I enjoy organizing training workshops that enable scientists, data managers, and developers to meet. These workshops stimulate discussion and the exchange of expertise, which facilitates the adoption of data management tools by scientists.

What has been the most exciting project you have worked on?
I consider the Crop Ontology project a success because it produced a series of outputs that contributed to a change in practice for agricultural data management. Crop Ontology is the result of years of close collaboration with the CGIAR centers and their partners, with projects from the Integrated Breeding Platform and Planteome, and now with the Platform for Big Data in Agriculture. The ontology has also been adopted by key agricultural and bioinformatics institutions. The project led to the creation of a Community of Practice, mixing public and private partners, that meets every two years and takes many opportunities for discussion in between.

Do you feel this is an exciting time to be a woman working in the data sector?
Yes, because of the challenge of extracting information and knowledge from multidisciplinary big data to support agriculture for development and provide quality services to farmers.

Why do you think there is a lack of women in the Tech/Data sectors?
The percentage of women in the data and informatics sector has increased over the last decade, and many women holding a Ph.D. degree perform data curation and data science. However, women are still under-represented, particularly in data and informatics for agriculture. This is probably because the wide gender gap in informatics research and development, created over decades, takes time to close. That said, with the development of genetic and genomic research, high-throughput data generation with sensors, and the integration of the farmer gender perspective in data collection and analysis, agricultural data science has become a more attractive option for women scientists. My perception is that data science and the courses for becoming a data scientist are not promoted enough to women interested in a scientific career.

As a woman, did you face any obstacles/challenges in your career pathway to becoming a data scientist?
It was a challenge at times to gain the trust of some of my colleagues and for them to have confidence in my skills to lead an informatics and data management team that was mainly composed of men. I must admit, however, that I received support from both male and female colleagues in reaching such a leadership role.

Why should more women get into data science?
Because there have been recent developments to address big data challenges, including the integration of new technologies that generate agricultural data and provide mobile services to farmers. Informatics technologies and mobile services are successful when they match farmers' needs and fit their daily habits, or when they can solve their daily problems and give support. Women can bring a different perspective to data collection and analysis, as well as an innovative spirit that complements the approach adopted by men on the team.

How could more women working in data, specifically for the development/agriculture sector, benefit the sector?
Women farmers express specific needs that, depending on the cultural context, can perhaps be better captured and analyzed by a woman.

What advice would you give to women wanting to get into data science?
Do not hesitate! Embrace a data scientist career, get trained early in cutting-edge technologies where you can bring a different perspective, and quickly envisage your career in a leadership position.