ABSTRACT

As more and more scientific domains collect vast troves of data, we rely on machine learning techniques to analyze the data and help make data-driven scientific discoveries. In this talk, I will discuss how machine learning has been used to advance science. But we pause to ask: are these data-driven discoveries reproducible? And how can we use machine learning to draw reliable scientific conclusions? I will discuss these questions by giving examples from my own research, including an extended example on clustering. Additionally, I will outline new research directions and offer practical advice for improving the reliability and reproducibility of data-driven discoveries.

About the Speaker:

Genevera Allen is an Associate Professor of Electrical and Computer Engineering, Statistics and Computer Science at Rice University and an investigator at the Jan and Dan Duncan Neurological Research Institute at Texas Children's Hospital and Baylor College of Medicine. She is also the Founder and Faculty Director of the Rice Center for Transforming Data to Knowledge, informally called the Rice D2K Lab.

Dr. Allen's research focuses on developing statistical machine learning tools to help scientists make reproducible data-driven discoveries. Her work lies in the areas of interpretable machine learning, optimization, data integration, modern multivariate analysis, and graphical models, with applications in neuroscience and bioinformatics. Dr. Allen is the recipient of several honors, including a National Science Foundation CAREER Award and the George R. Brown School of Engineering's Research and Teaching Excellence Award at Rice University; in 2014, she was named to the "Forbes '30 under 30': Science and Healthcare" list. Dr. Allen received her PhD in statistics from Stanford University (2010), under the mentorship of Prof. Robert Tibshirani, and her bachelor's degree, also in statistics, from Rice University (2006).
