Accepted Articles of Congress

  • Mapping the Dark Genome: How Non Coding Drivers Fuel Cancer

  • Zahra Ghanibeygi,1,*
    1. Department of Biology, Faculty of Biology, Falavarjan Branch, Islamic Azad University, Isfahan, Iran


  • Introduction: Large-scale pan-cancer genomics initiatives have shown that a significant number of oncogenic driver events occur in the non-coding genome. Its cis-regulatory elements, including enhancers, promoters, and insulators, together with non-coding RNA genes, provide precise spatiotemporal control of gene expression, and disruption of this regulatory circuitry by genetic and epigenetic means is a hallmark of cancer. In this review, we show how bioinformatics enables the systematic identification and functional characterization of non-coding regulatory drivers of oncogenesis across a wide range of cancers. Broadening the scope of study beyond the exome in this way gives a more comprehensive picture of what oncogenesis encompasses.
  • Methods: This systematic review summarizes the conclusions of significant publications that conducted pan-cancer studies using data from large consortia such as TCGA and ICGC. We highlight three analytical approaches central to this area of study: 1) Variant Effect Prediction: Tools such as FunSeq2 and LARVA integrate evolutionary conservation, chromatin state (e.g. ENCODE epigenomic marks), and transcriptional regulatory annotations to prioritize non-coding variants that are likely to be functionally disruptive. 2) Epigenomic Integration: Algorithms assess somatic epimutations (e.g. aberrant methylation) and connect them to somatic changes in chromatin accessibility (measured by ATAC-seq) or histone modifications (measured by ChIP-seq) to find dysregulated regulatory elements. 3) Network Analysis: Algorithms reconstruct gene regulatory networks to predict the target genes of non-coding alterations, including targets mediated by long non-coding RNAs (lncRNAs) or microRNAs, and map the affected genes back onto the core oncogenic pathways.
  • Results: Pan-cancer bioinformatic analyses have systematically identified non-coding driver events that recur across malignancies. The large-scale studies reported several key findings. First, the major recurrent non-coding driver event is mutation of the TERT promoter, which occurs in several malignancies. TERT is the catalytic subunit of telomerase, which adds telomeric repeat DNA to the ends of chromosomes and thereby promotes replicative immortality. The result is abnormal cell proliferation, because the cells escape senescence. Second, somatic structural variants that drive "enhancer hijacking", in which oncogenes are activated through the disruption of topologically associating domains (TADs), occurred across several malignancies (e.g. GFI1 in medulloblastoma). Third, hypermethylation of tumor suppressor gene promoters, with subsequent epigenetic silencing, occurred for genes that had previously been identified in expression microarray studies. Beyond identifying recurrent non-coding driver events, the studies concluded that non-coding drivers frequently show tissue specificity, owing to the cell-type-specific epigenetic context surrounding the driver, and that they converge on common hallmarks of cancer such as evading growth suppression or sustaining proliferative signaling (e.g. TERT and CDKN2A methylation events and their effects on the lifespan of cancer cells across multiple tissues). The reviewed studies also established the role of bioinformatics in building a catalog of oncogenic non-coding RNAs and their associated expression, suppression, and competing endogenous RNA (ceRNA) networks (e.g. lncRNA PVT1).
  • Conclusion: Bioinformatics has been essential in moving from a protein-centric view of cancer genetics to one that also considers the regulatory genome. Combining high-throughput data sources with integrative multi-omic analysis made it possible to identify new non-coding drivers, which enhanced our understanding of tumorigenesis and revealed more heterogeneity across cancer types. The challenge now is to translate broad models into clinical workflows and to distinguish causation from correlation, so that new biomarkers and targets can inform clinical practice and reduce clinical complexity. The next steps will consist of more sophisticated algorithms that provide insight into the non-coding genome and that integrate single-cell and spatial approaches to the tumor microenvironment.
  • Keywords: Non Coding Genome, Pan Cancer Analysis, Regulatory Drivers
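
The variant effect prediction step described in the Methods can be illustrated with a minimal sketch. This is not the FunSeq2 or LARVA algorithm: the feature names, the weights, and the example variants are all hypothetical, chosen only to show how conservation, chromatin state, and regulatory annotations might be combined into a single score for ranking non-coding variants.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """A non-coding variant with three illustrative (hypothetical) annotations."""
    name: str
    conservation: float       # assumed evolutionary conservation score in [0, 1]
    open_chromatin: bool      # overlaps an accessible or active chromatin mark
    disrupts_tf_motif: bool   # alters a transcription factor binding motif


# Hypothetical weights for illustration only; real tools derive their own.
WEIGHTS = {"conservation": 2.0, "open_chromatin": 1.0, "disrupts_tf_motif": 1.5}


def score(variant: Variant) -> float:
    """Weighted sum of the annotations; booleans count as 0 or 1."""
    return (
        WEIGHTS["conservation"] * variant.conservation
        + WEIGHTS["open_chromatin"] * variant.open_chromatin
        + WEIGHTS["disrupts_tf_motif"] * variant.disrupts_tf_motif
    )


def rank_variants(variants: list[Variant]) -> list[Variant]:
    """Return variants ordered from most to least likely disruptive."""
    return sorted(variants, key=score, reverse=True)


# Made-up example variants, not real data.
variants = [
    Variant("var_a", conservation=0.9, open_chromatin=True, disrupts_tf_motif=True),
    Variant("var_b", conservation=0.2, open_chromatin=False, disrupts_tf_motif=False),
    Variant("var_c", conservation=0.5, open_chromatin=True, disrupts_tf_motif=False),
]
ranked = rank_variants(variants)
for v in ranked:
    print(f"{v.name}\t{score(v):.2f}")
```

A weighted sum is used here only because it is the simplest way to show annotation integration; published tools use their own scoring schemes and calibrated weights.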
