Yichen Sun
Program: Computational and Systems Biology
Current advisor: Yin Cao, ScD, MPH
Undergraduate university: University of California-Irvine, 2018
Enrollment year: 2023
Research summary
The molecular landscape and tumor evolutionary trajectories of early-onset colorectal cancer (EOCRC)
STATEMENT OF HYPOTHESES AND SPECIFIC AIMS
Colorectal cancer (CRC) is the second leading cause of cancer-related death in the US, and its incidence is rising among younger adults. Early-onset colorectal cancer (EOCRC), defined as diagnosis before age 50, represents a growing global health challenge. Although recent genomic studies have provided emerging evidence that EOCRC is molecularly distinct from late-onset CRC, the mechanisms underlying EOCRC initiation and progression remain unclear. There is an urgent need to investigate the etiology of EOCRC and to determine whether its evolutionary trajectory differs from that of late-onset disease. Previous studies in CRC have identified both “big-bang” and selection-driven evolutionary models, with their relative frequency varying across tumors; however, direct evidence is limited, and the evolutionary dynamics of EOCRC remain largely unexamined. Examining molecular alterations across adjacent normal tissue, EOCRC precursors (adenomas), and EOCRC carcinomas with high-throughput sequencing is critical for characterizing the evolutionary dynamics of tumor development. Integrating evolutionary models with key macroevolutionary events may provide new insights that define tumor “age”, clarify the mechanisms underlying rapid tumor development, and establish a conceptual timeline for EOCRC progression.
The overarching goal of my thesis is to examine the molecular landscape and tumor evolutionary trajectories of early-onset colorectal cancer (EOCRC). Our central hypothesis is that EOCRC is molecularly distinct from late-onset CRC and develops through both “big-bang” and selection-driven evolutionary models. To test this hypothesis, we will first leverage genomic data from the Genetics and Epidemiology of Colorectal Cancer Consortium (GECCO), a global consortium of largely population-based studies with detailed clinical annotations, to characterize the mutational landscape of CRC across the age spectrum. We will then extend these molecular findings through deep sequencing of EOCRC tumors and precursor tissues with matched normal tissue to capture subclonal architecture, mutational timing, and evolutionary dynamics that cannot be resolved by targeted sequencing. We will apply state-of-the-art computational approaches and novel analytical frameworks to characterize the EOCRC mutational landscape and model tumor evolutionary trajectories. By integrating sequencing and clinical data, we aim to identify key macroevolutionary events and use them to infer the timing of EOCRC progression. Our specific aims are:
Aim 1. Determine mutational and pathway differences associated with age in EOCRC
1A. Characterize mutational patterns across the age spectrum of EOCRC.
1B. Assess pathway alterations and the genes involved, stratified by tumor site and stage.
We will analyze a total of 298 genes from targeted panels in 6,006 CRC patients from the GECCO consortium, including 651 with EOCRC. Mutation frequencies will be evaluated across patient and tumor characteristics, including birth cohort, sex, stage, anatomic site, and geographic location. Associations between gene and pathway mutations and age will be estimated using logistic regression models stratified by hypermutation status, anatomic site, and stage. These analyses will identify candidate genes and pathways that elucidate age-associated molecular patterns in EOCRC tumors.
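As an illustration of the stratified association analysis, the following is a minimal sketch rather than the study's actual pipeline. It assumes a binary outcome (gene mutated or not), a single binary predictor (early-onset versus late-onset), and no covariate adjustment; the function names, stratum labels, and toy counts are hypothetical. The model is fit separately within each stratum by Newton-Raphson, and the exponentiated slope is reported as an odds ratio.

```python
import math
from collections import defaultdict


def fit_logistic(x, y, n_iter=25):
    """Fit logit P(y=1) = b0 + b1*x by Newton-Raphson and return (b0, b1)."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p              # gradient w.r.t. intercept
            g1 += (yi - p) * xi       # gradient w.r.t. slope
            h00 += w                  # entries of X^T W X
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + (X^T W X)^{-1} * gradient
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (-h01 * g0 + h00 * g1) / det
    return b0, b1


def stratified_odds_ratios(records):
    """records: iterable of (stratum, early_onset, mutated) with 0/1 indicators.

    Returns {stratum: odds ratio of mutation for early- vs late-onset}.
    """
    by_stratum = defaultdict(lambda: ([], []))
    for stratum, early_onset, mutated in records:
        by_stratum[stratum][0].append(early_onset)
        by_stratum[stratum][1].append(mutated)
    return {s: math.exp(fit_logistic(x, y)[1]) for s, (x, y) in by_stratum.items()}


def expand(stratum, counts):
    """counts maps (early_onset, mutated) -> number of patients (toy data)."""
    return [(stratum, x, y) for (x, y), n in counts.items() for _ in range(n)]


# Hypothetical counts for illustration only; not GECCO data.
toy_records = (
    expand("proximal", {(1, 1): 6, (1, 0): 4, (0, 1): 3, (0, 0): 7})
    + expand("distal", {(1, 1): 2, (1, 0): 8, (0, 1): 4, (0, 0): 6})
)

if __name__ == "__main__":
    for stratum, odds_ratio in stratified_odds_ratios(toy_records).items():
        print(stratum, round(odds_ratio, 3))
```

With one binary predictor and no covariates, the fitted odds ratio in each stratum coincides with the cross-product ratio of that stratum's 2x2 table, which gives a simple check on the fit. A real analysis would likely add covariates and use an established statistics package.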
Aim 2. Define tumor evolutionary models and macroevolutionary events in EOCRC
2A. Test whether EOCRC tumors evolve through a “big-bang” model or a selection-driven model.
2B. Identify whole-genome doubling (WGD) as a macroevolutionary anchor event and determine the timing of key somatic alterations relative to WGD.
We will collect biospecimens from EOCRC patients through the Gastrointestinal (GI) department at Washington University School of Medicine, including blood, adjacent normal tissue, and adenoma/carcinoma samples, for whole-genome sequencing (WGS). We will integrate deep sequencing data with computational modeling to construct tumor evolutionary trajectories, test competing evolutionary models, and use WGD as a potential anchor to time tumor progression in EOCRC.
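One common way to time a somatic mutation relative to WGD uses its multiplicity, the number of tumor chromosome copies that carry it. The sketch below is illustrative only and is not the project's method: it assumes a clonal mutation, a known tumor purity and local total copy number, and a hypothetical decision threshold of 1.5 copies. Function names are likewise hypothetical.

```python
def mutation_multiplicity(vaf, purity, tumor_cn, normal_cn=2):
    """Estimate the number of tumor copies carrying a mutation.

    vaf: variant allele fraction observed in the tumor sample
    purity: fraction of cells in the sample that are tumor cells
    tumor_cn: total copy number of the locus in tumor cells
    normal_cn: total copy number in contaminating normal cells (2 for autosomes)
    """
    # Average number of locus copies per cell in the mixed sample.
    mean_copies = purity * tumor_cn + (1.0 - purity) * normal_cn
    return vaf * mean_copies / purity


def classify_relative_to_wgd(vaf, purity, tumor_cn, threshold=1.5):
    """Label a mutation in a doubled region by its estimated multiplicity.

    A mutation present on two or more copies was most likely acquired before
    the doubling and duplicated with it. A single-copy mutation may be
    post-WGD, subclonal, or pre-WGD on a copy that was later lost, so it is
    labeled conservatively. The threshold is an assumption for illustration.
    """
    m = mutation_multiplicity(vaf, purity, tumor_cn)
    return "pre-WGD" if m >= threshold else "post-WGD or undetermined"


if __name__ == "__main__":
    # Hypothetical values: 80% purity, locus at 4 copies after doubling.
    for vaf in (0.4, 0.2):
        print(vaf, classify_relative_to_wgd(vaf, purity=0.8, tumor_cn=4))
```

In practice, purity and copy number are themselves estimated from the sequencing data, and read-count noise makes a hard threshold fragile, so probabilistic assignment is generally preferable to the cutoff used here.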
Impact: This work will advance our understanding of the molecular and evolutionary mechanisms underlying EOCRC. By integrating population-based genomic data with deep sequencing and evolutionary modeling, this project will clarify whether EOCRC follows progression mechanisms distinct from those of late-onset CRC. The findings will help identify risk factors and biomarkers linked to the relative distribution of selection-driven and “big-bang” models and to macroevolutionary events in EOCRC tumors, thereby improving risk stratification and enabling earlier detection strategies.
Graduate publications