CMSE Special Presentation, John P. Lloyd - Defining genomic functionality using machine learning approaches

  • Mar 30, 2017

Name: John P. Lloyd
Affiliation: Department of Plant Biology, MSU

Date: April 4, 2017 (Tuesday)
Time: 9:30am - 10:30am
Venue: 1502 Engineering Building

Title: Defining genomic functionality using machine learning approaches
Since the beginning of the genomics era in biology, advances in technology have reduced the cost of sequencing a genome by five orders of magnitude. Active areas of research now focus on using the vast quantities of genome sequence and functional genomics data to better understand the functional content of genomes. Toward this end, my research has focused on using machine learning approaches to address two critical questions: where are the functional regions within a genome, and what functions are genes performing? The presence of biochemical activity outside of known genes (known as intergenic transcription) indicates that genomes likely contain functional regions that have so far escaped detection. To identify likely-functional genome components among intergenic transcribed regions, I generated statistical learning models capable of distinguishing between known functional and non-functional genomic regions by integrating a diverse set of evolutionary, biochemical, and structural features. Once a putative functional region has been identified, the next question is what role it performs in the cell. I also discuss a machine learning framework that successfully integrated heterogeneous expression, conservation, duplication, and gene network features to predict the consequences of gene disruption, an important piece of evidence for uncovering gene function. Taken together, this work reinforces the effectiveness of machine learning techniques in addressing biological questions. Specifically, I highlight data integration-based approaches for defining the functionality of genomic regions and inferring gene function.

Host: Arjun Krishnan (