Making Data-Driven Discoveries Possible

Carolyn Lawrence-Dill connects researchers and increases the shelf life of data. Her efforts make research information generated by scientists available to other scientists, increasing the impact of their work over time.

“My group’s work supports scientists who are doing research in the field or lab. It’s very rewarding—you can feel like you’ve saved hundreds of thousands of hours for people who are trying to solve a hard problem,” says Lawrence-Dill. “But the aspect of my job I enjoy the most is creating information management systems that allow large groups of scientists to work together.”

Lawrence-Dill is an associate professor in the Department of Genetics, Development and Cell Biology and in the Department of Agronomy. She’s a Plant Sciences Institute Scholar, charged with advancing the field of predictive phenomics, an area of biology that measures physical and biochemical traits of plants as they change in response to genetic and environmental influences. Her work involves creating systems to manage complex datasets to support the development of better predictive models for plant breeding and genetics research.

After reading a recent journal article that began by describing people who make a living off data that other people generate as “research parasites,” she and her colleagues began jokingly calling each other “parasites.” But Lawrence-Dill and the article’s authors concede that, in truth, the relationship is symbiotic.

“You can’t imagine all of the possible research questions that could be answered with access to large-scale datasets,” says Lawrence-Dill.

Earlier this year, Lawrence-Dill; Asheesh Singh, assistant professor of agronomy; and Baskar Ganapathysubramanian, associate professor of mechanical engineering, were awarded $750,000 as part of Iowa State University’s Presidential Initiative for Interdisciplinary Research, for work the university has deemed a strategic priority. Their research proposal, “Data Driven Discoveries for Agricultural Innovation,” led by Lawrence-Dill, brings together more than 20 faculty members from across campus to make significant strides in the collection, management, interpretation and use of data related to agriculture.

Joe Colletti, senior associate dean and associate director of the experiment station in the College of Agriculture and Life Sciences, says Lawrence-Dill has expertise in computational biology and bioinformatics, is very collaborative and knows how to build productive teams.

“She is able to knit together the right people, at the right time, in the right way here at Iowa State, nationally and internationally,” he says. “More importantly, she understands the connection between statistical methodologies and biological sciences; she can see the big picture. She really understands how to create tools for researchers and make data accessible through the right databases so that more value can be obtained from those data.”

Lawrence-Dill and her team members have three active projects working with scientists in various disciplines across campus. She offers a quick summary in her own words:

Genomes to Fields (G2F)

“For one of the major G2F projects we have more than 20 locations: 20 different universities and federal entities growing the same inbred and hybrid corn lines. If we can measure how the same plants grow in lots of different environments, and we get the genome sequences for those plants, we can start looking at how different components of the genome affect growth and development and other traits based on the environment. Once we get a dataset where we can see how those things relate, we hope to be able to predict how it’s going to grow in current environments and future environments as the climate changes. My work for G2F is to put together an information management platform that enables collaborators on the project to access genetics, weather and phenotype data, which includes things like plant height, ear height and yield.” (Read more about G2F on page 26.)

Enviratron

“The Enviratron involves eight different plant environments where a robot collects images and other data from plants. What I’m working on is to make sure the information collected is tagged with the right metadata (information that enables researchers to understand, discover and re-use the data). We are documenting, for other scientists, what the environments were and what the experimental design was, which includes things like how often you took measurements, what the temperature treatment was, the light treatment, and so on. What we’re putting together is a system that will enable other scientists to find and use the data collected using the Enviratron system.” (Read more about the Enviratron on page 25.)
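For readers curious what “tagging data with metadata” can look like in practice, here is a minimal sketch in Python. It is not the Enviratron team’s system. The class name, field names and sample values are all hypothetical, chosen only to mirror the items Lawrence-Dill lists (environment, experimental design, measurement frequency, temperature and light treatments).

```python
from dataclasses import dataclass, asdict, fields
import json


@dataclass
class ExperimentMetadata:
    # Hypothetical fields mirroring the items named in the quote above.
    environment_id: str
    experimental_design: str
    measurement_interval_hours: float
    temperature_treatment: str
    light_treatment: str


def missing_fields(record):
    """Return the names of fields left empty, so incomplete records can be flagged."""
    return [f.name for f in fields(record) if getattr(record, f.name) in ("", None)]


def to_json(record):
    """Serialize the record so it can be stored alongside the data it describes."""
    return json.dumps(asdict(record), sort_keys=True)


# Made-up values for illustration only; the light treatment is left blank on purpose.
example = ExperimentMetadata(
    environment_id="chamber-1",
    experimental_design="randomized block",
    measurement_interval_hours=6.0,
    temperature_treatment="warm",
    light_treatment="",
)
print(missing_fields(example))
print(to_json(example))
```

Checking for empty fields before a record is saved is one simple way to keep a dataset findable and reusable later, which is the goal the quote describes.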

Genome Editing Design Tool

“The tool we developed enables scientists to predict all the locations in a genome that can be edited. It is specific to the genome sequences recognized by the CRISPR-Cas9 system, which is the coolest thing going on in molecular biology right now. With our tool you could take any genome and identify the sites in a gene of interest that could be manipulated using CRISPR-Cas9. Then in the lab the CRISPR-Cas9 system can be used to change the DNA sequence identified (such changes could improve agronomic traits). Our design tool works on soybean, peanut, corn, rice. We can load it with any genome, but so far we’ve been specifically working with plants.”
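The idea of scanning a genome for editable sites can be illustrated with a short Python sketch. This is not the group’s design tool, whose internals the article does not describe. It assumes the commonly used Cas9 from Streptococcus pyogenes, which recognizes a 20-letter target followed by a three-letter “NGG” motif (any base, then two Gs). The function names are hypothetical, and a real tool would also screen candidates for off-target matches elsewhere in the genome.

```python
import re

# Assumption: a 20-nt target immediately followed by an NGG motif (PAM).
# The lookahead lets overlapping candidate sites all be reported.
SITE_PATTERN = re.compile(r"(?=([ACGT]{20})([ACGT]GG))")
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the opposite DNA strand, read in its own 5'-to-3' direction."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_target_sites(sequence):
    """List candidate sites as (strand, start, target, pam) tuples.

    start is the 0-based position, on the forward strand, of the leftmost
    base covered by the 23-letter site.
    """
    seq = sequence.upper()
    sites = []
    for match in SITE_PATTERN.finditer(seq):
        sites.append(("+", match.start(), match.group(1), match.group(2)))
    for match in SITE_PATTERN.finditer(reverse_complement(seq)):
        # Map the reverse-strand position back to forward-strand coordinates.
        start = len(seq) - (match.start() + 23)
        sites.append(("-", start, match.group(1), match.group(2)))
    return sorted(sites, key=lambda site: (site[1], site[0]))


# A toy 23-letter sequence containing exactly one forward-strand site.
for site in find_target_sites("A" * 20 + "TGG"):
    print(site)
```

Both strands are searched because a usable site can sit on either one; the reverse strand is handled by scanning the reverse complement and converting positions back.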