2022 Recipient Sohrab Shah, PhD

Inferring Mutational Processes and Patient Stratification From Standard-of-Care Clinical Imaging

Project Summary

High-grade serous ovarian cancer (HGSOC) accounts for 70% of all ovarian cancers, and women with HGSOC suffer high morbidity and respond poorly to standard-of-care treatment. The five-year survival rate after diagnosis is <50%. Recently, our team and others have discovered that patients can be categorized into different groups based on changes in their tumor genomes, and that these subgroups differ in response to therapy and overall survival. Tumors defective in their ability to repair DNA are more susceptible to treatment, and these patients tend to fare better. In contrast, tumors with an intact DNA repair system but more complex rearrangements of their genomes are better able to withstand chemotherapy, and as a result these patients have worse outcomes. This stratification is based on genome sequencing, a method that is still relatively costly and not yet standard of care at every health care institution. We therefore plan to use data created in the course of routine clinical care, such as histopathological (H&E) slides of tumor tissue, computed tomography (CT) scans and clinical information, to improve the prediction of patient outcome. Given the size and complexity of these images, we will employ deep-learning artificial intelligence (AI) approaches for our analysis. Our first aim will be to collect and annotate the different datasets to ensure they are AI-ready. We have identified a cohort of 1,400 HGSOC patients with H&E slides, CT scans, genome sequence data and clinical information from their treatment at MSK, and we will create a data infrastructure that allows us to manage, annotate and integrate these data securely and anonymously. We will then develop AI-based models to extract features from the images that can infer mutational signatures and predict outcomes.
Our preliminary data show that each data type contains features that make such predictions possible. However, because these data measure the properties of a tumor at very different scales, from the tissue-level CT scan to the cell-level H&E slide and the molecular-level genome sequence, they each contain unique information. To exploit this complementary content, we will design an AI approach that integrates the different data modalities into a unified model, enabling more accurate stratification of patients into risk groups that are informative for therapy. We anticipate that improved patient stratification will lay the groundwork for identifying which patients are best suited to chemotherapy and immunotherapy strategies, and which patients may even benefit from newer, investigational treatments. Upon completion of the work, we will make all de-identified data and AI models publicly available. This will increase the AI-ready data in the public domain by an order of magnitude and will likely prompt further research by other groups, which ultimately will advance our knowledge of ovarian cancer and improve patient outcomes.


Sohrab Shah was appointed to MSK in April 2018 as the inaugural Chief of the Computational Oncology Service and is the incumbent of the Nicholls-Biondi Chair. He received his BSc degree in biology from Queen's University in Ontario, Canada, and his BSc and MSc in computer science from the University of British Columbia. He obtained a PhD in computer science from the University of British Columbia in 2008 and was appointed as a Principal Investigator at the British Columbia Cancer Agency and the University of British Columbia in 2010, where he developed the roots of his research program. He is a University of British Columbia Killam laureate and a Susan G. Komen Foundation Scholar. His research focuses on understanding how tumors evolve over time through integrative approaches involving genomics and computational modeling. He has made seminal contributions to understanding the clonal evolution of ovarian cancers and has discovered that specific mutational patterns in the genomes of ovarian cancers are prognostic. Dr. Shah has also pioneered computational methods for identifying mutations in cancer genomes and for deciphering patterns of cancer evolution. He has led the development of a novel experimental platform for single-cell genome analysis, as well as novel statistical models, algorithms, and computational approaches to analyze large, high-dimensional genomic and transcriptomic data sets from both patient tumors and model systems. These resources have led to recent progress in molecular profiling of cancer cells at single-cell resolution. Dr. Shah has been at the forefront of studying tumor evolution in breast, ovarian and lymphoid malignancies. His work has been published in many of the major scholarly journals. In 2018, Dr. Shah was named a Clarivate Analytics Highly Cited Researcher. He is widely regarded as a leader, and his appointment has greatly enhanced the reputation of MSK as a leading force in the field of computational oncology.