Utrecht University

Methodology & Statistics Data Science Lab

Agenda

20 September 2018
15:00 - 16:00
Sjoerd Groenmangebouw B1.09

Jeanine Houwing-Duistermaat: Partial Least Square methods for omics data sets.

Thursday 20/09/2018 at 15:00 in room B1.09

 

You can now find Jeanine’s ppt here. The new academic year has started, and the MSDSlab meetings will start with it. Kicking off is Professor Jeanine Houwing-Duistermaat, Chair in Data Analytics and Statistics at the School of Mathematics, University of Leeds, who will present on Partial Least Square methods for omics data sets on the 20th of September. You can find her full abstract below and watch an amusing video on the topic here.

 

Abstract
The availability of large omics datasets in epidemiological and clinical studies provides many opportunities for research in statistical bioinformatics. The hope is that the abundance of information will provide a better understanding of underlying disease mechanisms and accurate prediction models, enabling patient-targeted screening and treatment. Statistical challenges include data cleaning, heterogeneity across omics datasets, high dimensionality, data integration, and the presence of high correlation within and between datasets (Morris et al., 2017; Houwing-Duistermaat et al., 2017). In this talk I will present Partial Least Squares (PLS) methods for multivariate regression and for data integration and dimension reduction when analysing several omics datasets simultaneously.

Three PLS-type methods for omics analysis will be considered, namely the standard PLS algorithm (Wold, 1972), Envelope (Cook et al., 2015), and our recently developed Probabilistic PLS (PPLS) (Bouhaddani et al., 2018). Envelope and PPLS are maximum likelihood methods. PLS and PPLS can deal with high dimensions, while Envelope requires the sample size n to be larger than the number of variables p. PPLS maximizes a constrained log-likelihood to ensure that the solution is unique. The methods will be illustrated with several data examples, and the results of simulation studies comparing their performance will be shown.
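For readers new to PLS, the following is a minimal sketch of how a first pair of PLS components for two datasets can be computed. It is not the speaker's PPLS or Envelope method. It uses the textbook formulation in which the first weight vectors are the leading singular vectors of the cross-product matrix of the centered datasets. The function name `first_pls_component`, the simulated data, and all dimensions are assumptions made for illustration only.

```python
import numpy as np

def first_pls_component(X, Y):
    """Hypothetical helper: first pair of PLS weight vectors and scores.

    The unit-norm weights (w, c) maximize the sample covariance between
    the scores X w and Y c; they are the leading left and right singular
    vectors of Xc^T Yc, where Xc and Yc are the column-centered datasets.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc.T @ Yc, full_matrices=False)
    w = U[:, 0]   # weights for the X dataset (length p)
    c = Vt[0, :]  # weights for the Y dataset (length q)
    t = Xc @ w    # scores of the X dataset (length n)
    u = Yc @ c    # scores of the Y dataset (length n)
    return w, c, t, u

# Simulated data (an assumption, not from the talk): one shared latent
# variable drives both datasets, and p and q both exceed n.
rng = np.random.default_rng(0)
n, p, q = 30, 200, 50
latent = rng.normal(size=(n, 1))
X = latent @ rng.normal(size=(1, p)) + 0.1 * rng.normal(size=(n, p))
Y = latent @ rng.normal(size=(1, q)) + 0.1 * rng.normal(size=(n, q))

w, c, t, u = first_pls_component(X, Y)
corr = abs(np.corrcoef(t, u)[0, 1])  # abs() because singular vector signs are arbitrary
print(f"correlation between X and Y scores: {corr:.3f}")
```

The sketch runs with more variables than samples because it only needs the singular value decomposition of a p by q cross-product matrix and never inverts a covariance matrix, which is the reason standard PLS can handle high-dimensional data.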