Dr. Andrew Roth
Senior Postdoctoral Research Scientist, Big Data Institute, Department of Statistics and Ludwig Institute for Cancer Research, University of Oxford
Bayesian Methods for Inferring Cancer Phylogenies
Thursday, March 22nd
11:15am – 12:15pm
E2-599
Abstract
Cancer is an evolutionary process. Accumulation of genomic mutations, coupled with the effects of genetic drift and selection, leads to divergent clonal populations of cancer cells in a tumour. High-throughput sequencing (HTS) of both bulk tissue and single cells offers a powerful tool to study this diversity, and opens the possibility of reconstructing the evolutionary history of tumours. In particular, it is now possible to reconstruct the phylogeny (evolutionary tree) of extant clones in a tumour. Understanding the phylogeny of clonal populations can provide insight into the ontogeny of a tumour, mechanisms of metastasis, and modes of therapeutic resistance. However, inferring phylogenies using HTS is challenging due to issues such as admixed populations in bulk sequencing and noisy measurements in single-cell experiments.
I will present three Bayesian methods which leverage data from different HTS assays to provide complementary information about the population structure and phylogeny of clones in a tumour.
First, I will discuss the PyClone non-parametric Bayesian model, which uses bulk sequencing data to infer what proportion of cells in a biopsy sample harbour a mutation, and which mutations originate at the same point in the evolutionary history of a tumour [1]. I will present current work on scaling PyClone to whole-genome-scale data using recently developed statistical inference methods based on conditional Sequential Monte Carlo sampling [2]. I will also discuss the PhyClone model, a successor of PyClone, which attempts to explicitly model the clonal phylogeny using a novel non-parametric Bayesian process. Second, I will present the single cell genotyper (SCG) model, which can be used to analyse single-cell sequencing data of known point mutations [3]. The model accounts for several sources of noise, including doublet cells and allele drop-out. This model allows for robust inference of clonal genotypes, which in turn can be used as inputs to classical phylogenetic algorithms. Finally, I will consider the problem of mutation loss and present a novel phylogenetic model based on the Stochastic Dollo process for inference of lost mutations. I will show how this approach, coupled with the PyClone and SCG models, can be used to track the migration of clones in the peritoneal cavity of patients with high-grade serous ovarian cancer [4].
Time permitting, I will finish with a discussion of some preliminary ideas on connecting clonal phylogeny reconstructions from bulk sequencing to single cell RNA-Seq data. This approach could pave the way to connecting clonal genotypes to phenotypes. I will also introduce a new Bayesian non-parametric method for performing pan-cancer stratification of cancer types using bulk RNA-Seq.
[1] Roth et al., PyClone: statistical inference of clonal population structure in cancer, Nature Methods, 2014
[2] Bouchard-Côté, Doucet and Roth, Particle Gibbs Split-Merge Sampling for Bayesian Inference in Mixture Models, Journal of Machine Learning Research, 2017
[3] Roth et al., Clonal genotype and population structure inference from single-cell tumor sequencing, Nature Methods, 2016
[4] McPherson & Roth et al., Divergent modes of clonal spread and intraperitoneal mixing in high-grade serous ovarian cancer, Nature Genetics, 2016
Hosted by David Haussler
*******************************************
To accommodate a disability, please contact Ben Coffey at the UC Santa Cruz Genomics Institute (becoffey@ucsc.edu, 831-459-1477).