Statistics and Geometry for biological systems

Susan Holmes (Stanford)

Abstract:

Distances are an essential component of modern multivariate statistics and bioinformatics. One can do statistics on complex heterogeneous objects such as trees, networks, tensors, shapes and images. However, geometry is not enough, as real data are never uniformly distributed on latent manifolds but occur with varying densities, which are hard to capture when the data are sparse. Using prior information, one can incorporate data and construct posterior distributions along nonlinear dimensions and provide meaningful approximations to complex data, even in non-Euclidean settings. I will provide examples of using both mathematical and computational tools to understand trajectories followed by the human microbiome and even how food ingredients are shared across the world. This talk includes joint work with my past lab members: Lan Huong Nguyen, Elisabeth Purdom, Christof Seiler, Nina Miolane, Claire Donnat, Kris Sankaran and Laura Symul.