Predictive Models of Crop Phenotyping Maximize Future Harvests

By Lina Sorg

The global population has quadrupled over the last century and continues to rise, which places a heightened strain on food production and the overall agricultural process. Demands for more feed, fiber, and fuel conflict with rapidly diminishing amounts of arable land and limited availability of water. To increase crop production, farmers across the world will have to either clear more land for agriculture or maximize the productivity of existing arable soil via changes to fertilizer use, irrigation, and overall farming methods. Unfortunately, continuing to clear land for agricultural purposes is not without consequence, and the social and ecological tradeoffs are frequently high.

Achieving yield stability is challenging due to the weather variability associated with global climate change, such as droughts, floods, and extreme temperatures, as well as pests and diseases. While the use of statistics has helped grain yields increase linearly, the research and development (R&D) costs of plant breeding are growing exponentially. “There has been an increase in the money spent on R&D and improvements, but the yield remains linear,” Patrick Schnable of Iowa State University said. “This is a case of diminishing returns.” In short, current yearly crop yields are not growing fast enough to satisfy the forecasted demand for food in coming years.

During a minisymposium presentation at the 2019 SIAM Conference on Computational Science and Engineering, which took place in Spokane, Wash., this week, Schnable presented novel predictive statistical models to forecast crop performance in diverse agronomic environments and maximize overall yield. “The yield of a crop is a function of the genetics, the environment, and the interaction of these things,” Schnable said. “If we have sufficient data, we should be able to build predictive models.”
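One common textbook way to write down the relationship that Schnable describes is an additive decomposition of yield into genotype, environment, and interaction terms. The notation below is an illustrative assumption and is not taken from his presentation.

```latex
% y_{ij}: yield of genotype i in environment j
% \mu: overall mean yield
% G_i: effect of genotype i
% E_j: effect of environment j
% (GE)_{ij}: genotype-by-environment interaction
% \varepsilon_{ij}: residual error
y_{ij} = \mu + G_i + E_j + (GE)_{ij} + \varepsilon_{ij}
```

Under this kind of formulation, a predictive model needs data from enough genotypes and environments to estimate the interaction term, which is why sufficient data is a precondition.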

Predictive models benefit the agricultural sector in the following ways:

Improve selection accuracy in plant breeding programs, thus increasing the yearly rate of genetic gain
Enhance researchers’ abilities to effectively breed crops that endure the increased weather variabilities resulting from climate change
Allow scientists to provide farmers with evidence-based recommendations regarding appropriate crop varieties to plant in certain fields under particular management practices, ultimately leading to increased yields and enhanced yield stability
Enable daily national and global yield predictions, thus facilitating early responses to food emergencies and helping to avoid market failures

Patrick Schnable employs HTP time-lapse photography (in the form of a Raspberry Pi microcomputer and a tripod) to observe plants’ development and response to environmental stimuli in the field.

Creating these predictive models requires large amounts of phenotype, genotype, and environmental practice data. “We need to do a better job of getting large amounts of phenotypic data from multiple environments,” Schnable said. His team is specifically interested in observing plants as they develop and respond to environmental stimuli. To do so, they developed and deployed a prototype for high-throughput phenotyping (HTP) via time-lapse photography, consisting of a Raspberry Pi microcomputer and a tripod. They placed their tool in fields with growing crops, took pictures at 10- to 15-minute intervals, and stitched the resulting images together. This allowed them to watch plants move through their growth and developmental processes, including recovery from a wind storm.

Before drawing any biological insights, Schnable had to convert these images to numbers. The large size of the datasets demanded a quick, painless way to extract image features as training sets for machine learning techniques. He turned to Amazon Mechanical Turk, a crowdsourcing marketplace that allows individuals and businesses to outsource their projects. Schnable created a freely accessible Qualtrics-based survey, which he linked to the crowdsourcing platform, with detailed instructions for the “turkers,” who varied significantly in geographical location. He asked these turkers to manually mark the height of the plants in thousands of photographs.
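When several workers annotate the same photograph, their answers must be combined into a single number per image. The sketch below shows one simple possibility, taking the median across workers; this is an assumption for illustration, not the aggregation that Schnable's team used. The function name, the tuple layout, and the example values are all hypothetical.

```python
from collections import defaultdict
from statistics import median


def consensus_heights(annotations):
    """Return the median annotated height per image.

    `annotations` is an iterable of (image_id, worker_id, height_px) tuples.
    The median is used because a single careless annotation shifts it far
    less than it would shift the mean.
    """
    by_image = defaultdict(list)
    for image_id, _worker_id, height_px in annotations:
        by_image[image_id].append(height_px)
    return {image_id: median(heights) for image_id, heights in by_image.items()}


# Made-up annotations for illustration only (not data from the study).
example_annotations = [
    ("img_001", "w1", 120), ("img_001", "w2", 124), ("img_001", "w3", 310),
    ("img_002", "w1", 131), ("img_002", "w2", 129),
]
consensus = consensus_heights(example_annotations)
print(consensus)
```

In this toy input, the outlying value of 310 for `img_001` does not pull the consensus away from the two annotations that agree.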

Schnable grew 100 genotypes in various four-year locations, with camera plots at each location to measure the specifics of their growth and assess how well those measurements predict yield under irrigated and non-irrigated conditions, for example. “Ultimately, what we’ve done is developed a consortium of maize breeders across the country, each of which is growing different varieties,” he said. He and his colleagues collect this data to determine whether a functional approach will predict yield across multiple environments. Schnable presented a functional mixed-effects model (that accounts for the Amazon turkers) to generate functions for each genotype. Yield trials are expensive, so his eventual goal is to instead use the measured growth rate to determine yield.

In the long run, Schnable wants the process of feature identification, which the turkers currently conduct, to occur automatically. This involves creating a labelled dataset, using that dataset to train a machine learning model, and applying the trained model to the data to gauge its performance. Selecting a diverse set of images for training helps ensure the model’s wide applicability.

The underlying purpose of Schnable’s work is to build connections between plant scientists, engineers, and computational scientists. The predictive models necessary to do so offer new means of data analysis and reveal the ways in which genotypes, phenotypes, and environmental practices interact to affect the agricultural sector. Researchers can use this information to help farmers manage their environments and maximize crop yield without having to field-test everything. “Predictive phenomics has the potential to address some of the major challenges facing agriculture,” Schnable said.

Lina Sorg is the associate editor of SIAM News.