Can big data increase food production?

If data can be used to predict the outcome of the presidential race between Trump and Clinton, what potential could it hold for agricultural production? The answer is in our hands.

Photo by the European Space Agency

As people were tweeting about the Clinton and Trump presidential debate, I was in my study room, busy mining tweets.

Though Clinton seemed to have performed better in the debate, the tweets suggested that Trump would win the election.

I made a post on my Facebook account but my Facebook friends thought I was wrong. But it was not me talking, it was data!

If such a simple task can predict the future, what if we applied the same knowledge to agricultural production? I bet we could be the most food-secure generation that has ever existed.

Appropriate use of available data can help us visualize trends, make simple but critical decisions, predict future food security more accurately, and even provide early warning signals.

But how do we make data widely and easily available in the first place?

Every second, thousands of bytes of valuable data are generated from sources such as Twitter posts, text messages and satellite images, among others.

The collection of massive volumes of data has been made possible by the growth of social media, mobile technology, Android applications, Google and many other modern tools.

If these data sets are utilized well, there is enormous potential to stir innovation and creativity in agricultural production.

Most research organizations have already been collecting data for a long period of time. What if we encourage massive sharing of this data?

I can’t even picture the kind of predictive analytics and visualization this could bring about.

The development and sharing of R and Python libraries makes it easier to become a data scientist, as more people are able to access and analyze varied data to assess different situations and make more informed decisions.

Photo by Georgina Smith / CIAT

I wish that we aspired to share data in the same way we aspire to publish our research results in high impact journals.  

Unfortunately, our data is still on our C: drives, and the little that we are willing to share comes with restrictions or charges that users must meet before they can gain access.

I encourage organizations conducting research in agriculture to share their data freely and massively.  

As we share the data, we can clarify what the recipients may do with it.

This could be useful for those who want to make publications or for those who are looking for data to practice or kick off their training in Artificial Intelligence (AI) or Machine Learning (ML).

Data is nothing without users.

Sometimes we are too cautious in sharing our data. This may be because the data is not sufficiently accurate to be shared, because it may cost too much to clean and transform the data, or because there may be a fear of harming the image of your organization by publishing low-quality data.

However, when you share your data, comments from users will help improve the quality of your future data.

One idea is to share the code that was used to transform your raw data into cleaned data, and then share both versions of the data. We should, perhaps, be more tolerant of our data appearing on various platforms.
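As a rough illustration of what sharing the cleaning code could look like, here is a minimal Python sketch. The sample rows, the column names (district, crop, yield_t_ha) and the clean_rows function are all hypothetical, made up for this example; a real data set would need its own rules.

```python
import csv
import io

# Hypothetical raw survey export; the values and column names are illustrative only.
RAW_CSV = """district,crop,yield_t_ha
Nakuru, Maize ,2.4
Kisumu,beans,
Meru,MAIZE,n/a
Embu,Beans,0.9
"""


def clean_rows(rows):
    """Return cleaned copies of the rows, dropping those without a usable yield."""
    cleaned = []
    for row in rows:
        try:
            yield_value = float(row["yield_t_ha"])
        except (TypeError, ValueError):
            continue  # skip rows whose yield is missing or not a number
        cleaned.append({
            "district": row["district"].strip(),
            "crop": row["crop"].strip().lower(),  # normalize spelling of crop names
            "yield_t_ha": yield_value,
        })
    return cleaned


raw_rows = list(csv.DictReader(io.StringIO(RAW_CSV)))
cleaned_rows = clean_rows(raw_rows)
```

Publishing a script like this next to both the raw and the cleaned files lets users see exactly which rows were dropped and why, and suggest better rules.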

There are two possible ways of ensuring shared data gets to the right user at the right time.

First, we can make our data technically open.

By this, we mean making data available in standard machine-readable formats, so that it can be retrieved and meaningfully processed by a computer application.

Examples of file formats that can be used include JSON, XML, RDF, spreadsheets, comma-separated values (CSV), text documents, plain text, scanned images and HTML, among others.
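To make this concrete, the sketch below writes the same small table as CSV and as JSON using only the Python standard library. The records and the helper names to_csv and to_json are assumptions for illustration, not part of any real data set or tool.

```python
import csv
import io
import json

# Hypothetical records; the fields are illustrative only.
records = [
    {"district": "Nakuru", "crop": "maize", "yield_t_ha": 2.4},
    {"district": "Embu", "crop": "beans", "yield_t_ha": 0.9},
]


def to_csv(rows):
    """Serialize a list of dictionaries as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows):
    """Serialize the same rows as a JSON array of objects."""
    return json.dumps(rows, indent=2)


csv_text = to_csv(records)
json_text = to_json(records)

# Another program can load the JSON back without any manual retyping.
assert json.loads(json_text) == records
```

Either file can be opened directly by R, Python, a spreadsheet or a web application, which is the practical meaning of machine-readable.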

Being technically open also means that the data is easily accessible to its intended audience. For example, if the data is intended for programmers, it should be available through an application programming interface (API).

The second way is to make the data we share legally open.

It should be licensed in a way that permits both commercial and non-commercial use of the data without restrictions. It is important to attach a proactive license so that users don’t have to request permission to use your data, as long as you are acknowledged.

As a young data scientist, I strongly believe that if we share the data we hold on our laptops and take advantage of the data generated daily, we can have more nutritious food on our plates.

The answer is all in our hands.

Apr 11, 2018

Chris Mwungu

Research Associate for the Climate Change Adaptation and Farming Systems, International Center for Tropical Agriculture (CIAT)

Nairobi, Kenya

Chris Mwungu is a Research Associate at CIAT. He has a passion for using big data to enhance effective decision making in agricultural production.

Blog Competition Entry

This article is published as part of our publicly open big data blog competition. If you have enjoyed reading this entry, you can vote by liking, commenting or sharing.