Accepted Articles of Congress

  • Integrated Bioinformatics Analysis Identifies CDCA7 and DLGAP5 as Key Genes with NEAT1-Mediated lncRNA Regulation in Chronic-Phase CML

  • Ghazaleh Safaie,1 Samira Ameri Golestan,2 Najmeh Bagheri,3 Shiva Vaheb Hosseinabadi,4 Mansoureh Azadeh,5,*
    1. Zist Fanavari Novin Biotechnology Institute
    2. Zist Fanavari Novin Biotechnology Institute
    3. Zist Fanavari Novin Biotechnology Institute
    4. Zist Fanavari Novin Biotechnology Institute
    5. Zist Fanavari Novin Biotechnology Institute


  • Introduction: Chronic myeloid leukemia (CML) is a hematopoietic stem cell malignancy characterized by an increased number of granulocytes in the blood. The BCR-ABL1 fusion gene is detected in about 95% of CML patients. This gene arises from a reciprocal translocation between chromosomes 9 and 22, which produces the Philadelphia chromosome, and it encodes a constitutively active tyrosine kinase that drives abnormal cell growth and survival. Since the introduction of tyrosine kinase inhibitors (TKIs), the life expectancy of CML patients has increased considerably. The disease has three phases: chronic, accelerated, and blast. It can occur at any age but is mostly diagnosed in older adults and is rare in children. Previous studies have implicated coding genes and their mutations, as well as non-coding RNAs such as microRNAs (miRNAs) and long non-coding RNAs (lncRNAs), in the pathogenesis of CML. In this study, we used bioinformatics tools to identify genes involved in the chronic phase of CML, two classes of their regulatory elements (miRNAs and lncRNAs), and their related SNPs with clinical significance.
  • Methods: Microarray data were analyzed with GEO2R using a GEO dataset (GSE268456) containing samples from chronic-phase CML patients and healthy donors. The gene list was first narrowed by applying thresholds of adj. p < 0.01 and |log₂FC| > 2. Next, a protein–protein interaction (PPI) network was constructed for the filtered genes, and genes with multiple PPIs and established roles in cancer were selected. Correlation analysis of the selected genes was then carried out using GEPIA2, and two significantly correlated genes were chosen for further investigation. The Enrichr database was also queried to confirm the relevance of these genes. The following steps were carried out for each of the two genes. First, gene–miRNA interactions were retrieved from miRTarBase, miRDB, and TargetScan, and the miRNAs shared across all three databases were identified using Cytoscape. Next, lncRNAs targeting the shared miRNAs were identified using LncBase (DIANA), and the lncRNAs common to these miRNAs were determined with Cytoscape. Finally, stop-gained mutations in the exonic regions of the longest transcript of each gene were retrieved from Ensembl, and their clinical significance was evaluated using dbSNP and VarSome.
  • Results: The selected genes were CDCA7 (log₂FC = 2.03, adj. p = 1.04 × 10⁻⁶) and DLGAP5 (log₂FC = 2.03, adj. p = 1.15 × 10⁻⁵). GEPIA2 reported a positive correlation of 0.83 (p = 0) between these genes. For DLGAP5, hsa-miR-409-5p and hsa-miR-33a-3p were the miRNAs common to miRTarBase, miRDB, and TargetScan. For CDCA7, nine common miRNAs were identified: hsa-miR-299-5p, hsa-miR-30b-5p, hsa-miR-3606-3p, hsa-miR-513a-3p, hsa-miR-513c-3p, hsa-miR-550a-3-5p, hsa-miR-550a-5p, hsa-miR-550b-2-5p, and hsa-miR-1271-3p. lncRNA analysis revealed that NEAT1 targeted both of the identified DLGAP5 miRNAs. For CDCA7, three of the nine miRNAs (hsa-miR-3606-3p, hsa-miR-513a-3p, and hsa-miR-513c-3p) were not found in LncBase (DIANA). Among the six remaining miRNAs, NEAT1 was the common lncRNA for all except hsa-miR-299-5p and hsa-miR-550a-5p. Variant analysis of the longest CDCA7 transcript (ENST00000911359.1) yielded three clinically significant single-nucleotide variants (SNVs): rs1312796815, rs1686687425, and rs2106391563. While rs1686687425 and rs2106391563 were classified as variants of uncertain significance (VUS) in both dbSNP and VarSome, rs1312796815 was classified as a VUS only in dbSNP. For the DLGAP5 transcript (ENST00000247191.7), none of the detected stop-gained mutations was clinically significant.
  • Conclusion: This study identified two genes that are highly expressed in chronic-phase CML, CDCA7 and DLGAP5, and found NEAT1 to be a lncRNA common to both. In addition, three stop-gained mutations classified as VUS were reported for the CDCA7 transcript. The role of DLGAP5 in CML has yet to be studied, and the regulatory role of NEAT1 for either gene has not been assessed in CML. These findings highlight novel gene–lncRNA regulatory interactions in CML and point to variants that might contribute to disease pathogenesis. Further in vitro and in vivo studies are needed to validate these interactions and to assess their potential roles in diagnosis (as biomarkers) and treatment.
  • Keywords: chronic myeloid leukemia, miRNA, lncRNA, NEAT1, bioinformatics
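
The two filtering steps described in the Methods (the adj. p and |log₂FC| thresholds, and the miRNAs shared across three databases) can be illustrated with a minimal Python sketch. This is not the authors' workflow, which used GEO2R and Cytoscape. The function names, the dictionary keys `adj_p` and `log2fc`, the row `GENE_X`, and the `EXAMPLE` miRNA names are hypothetical and serve only to show the logic.

```python
def filter_degs(rows, max_adj_p=0.01, min_abs_log2fc=2.0):
    """Keep rows with adj. p < max_adj_p and |log2FC| > min_abs_log2fc."""
    return [
        row for row in rows
        if row["adj_p"] < max_adj_p and abs(row["log2fc"]) > min_abs_log2fc
    ]


def shared_mirnas(*database_hits):
    """Return the miRNAs that appear in every database's hit list."""
    hit_sets = [set(hits) for hits in database_hits]
    return set.intersection(*hit_sets) if hit_sets else set()


# Values for CDCA7 and DLGAP5 are those reported in the Results;
# GENE_X is a made-up row that fails the log2FC threshold.
rows = [
    {"gene": "CDCA7", "log2fc": 2.03, "adj_p": 1.04e-6},
    {"gene": "DLGAP5", "log2fc": 2.03, "adj_p": 1.15e-5},
    {"gene": "GENE_X", "log2fc": 1.20, "adj_p": 1.00e-4},  # hypothetical
]
selected = [row["gene"] for row in filter_degs(rows)]

# Toy hit lists: the two shared DLGAP5 miRNAs come from the Results;
# the EXAMPLE entries are placeholders, not real database output.
mirtarbase = ["hsa-miR-409-5p", "hsa-miR-33a-3p", "hsa-miR-EXAMPLE-1"]
mirdb = ["hsa-miR-409-5p", "hsa-miR-33a-3p", "hsa-miR-EXAMPLE-2"]
targetscan = ["hsa-miR-33a-3p", "hsa-miR-409-5p"]
common = shared_mirnas(mirtarbase, mirdb, targetscan)

print(selected)
print(sorted(common))
```

Both comparisons are strict, matching the thresholds as stated (adj. p < 0.01 and |log₂FC| > 2), so a gene sitting exactly on either cutoff would be excluded.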
