Introduction
Prostate cancer ranks as the predominant malignancy affecting males in Western developed nations, with the second highest mortality rate after lung cancer1. Androgen deprivation therapy (ADT) represents the cornerstone treatment for intermediate and advanced prostate cancer2. However, nearly all patients develop resistance to ADT within 18–36 months, resulting in castration-resistant prostate cancer (CRPC)3, characterized by a median survival of approximately 13 months4.
Timely identification of biomarkers associated with the onset and progression of prostate cancer, combined with early intervention, has the potential to improve patient quality of life and extend survival. There is an urgent need for reliable biomarkers to predict the prognosis of prostate cancer patients and identify potential therapeutic targets. Due to the heterogeneous nature of prostate cancer, traditional single prognostic markers often lack sufficient predictive accuracy. Thus, it is essential to identify novel biomarkers and develop effective prognostic models to enhance the accuracy of prognosis prediction for prostate cancer patients.
In this study, we identified common differentially expressed genes in prostate cancer tissues, castration-resistant prostate cancer (CRPC) tissues, and benign tissues using datasets from the NCBI Gene Expression Omnibus (GEO) and The Cancer Genome Atlas (TCGA) repositories. We analyzed their association with clinical characteristics and prognostic outcomes and constructed a new prognostic model based on these findings. This research aims to establish a theoretical foundation for predicting disease progression and guiding precision treatment strategies in prostate cancer.
Materials and methods
Data acquisition
mRNA microarray data from the GEO database (https://www.ncbi.nlm.nih.gov) were obtained for the training set, which comprised samples of normal prostate tissue, prostate cancer tissue, and castration-resistant prostate cancer tissue from the GSE35988 dataset5. This dataset included 12 samples of normal prostate tissue, 49 samples of prostate cancer tissue, and 27 samples of castration-resistant prostate cancer tissue. Additionally, the GSE66187 dataset6 was used as the validation set, consisting of 24 LuCaP-PCa xenografts and 71 CRPC metastatic tumors.
Identification of shared differentially expressed genes
We conducted differential expression analysis separately for normal versus prostate cancer tissues and for prostate cancer versus CRPC tissues using the limma package7 on the GSE35988 dataset. Significant differentially expressed genes (DEGs) were identified using the screening criteria p.adj < 0.01 and |log2FC| > 1; genes with log2FC > 1 were labeled “up” (up-regulated) and genes with log2FC < −1 were labeled “down” (down-regulated). The ggplot2 package was then used to draw volcano plots of the dataset, and the heatmap package was used to draw the corresponding gene expression heatmaps. Venn diagrams were used to identify genes commonly dysregulated across the two stages of progression.
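The screening and overlap steps described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' limma/R pipeline: the function names `classify_deg` and `shared_degs`, the input layout (gene mapped to a pair of log2FC and adjusted p-value), and the toy gene values are all assumptions introduced here.

```python
def classify_deg(log2fc, p_adj, fc_cutoff=1.0, p_cutoff=0.01):
    """Label one gene 'up', 'down', or 'ns' using p.adj < 0.01 and |log2FC| > 1."""
    if p_adj < p_cutoff and log2fc > fc_cutoff:
        return "up"
    if p_adj < p_cutoff and log2fc < -fc_cutoff:
        return "down"
    return "ns"  # not significant under the stated thresholds


def shared_degs(comparison_a, comparison_b):
    """Return (up, down): genes regulated in the same direction in both comparisons.

    Each comparison maps gene name -> (log2FC, adjusted p-value).
    """
    labels_a = {gene: classify_deg(*stats) for gene, stats in comparison_a.items()}
    labels_b = {gene: classify_deg(*stats) for gene, stats in comparison_b.items()}
    common = labels_a.keys() & labels_b.keys()
    up = {g for g in common if labels_a[g] == "up" and labels_b[g] == "up"}
    down = {g for g in common if labels_a[g] == "down" and labels_b[g] == "down"}
    return up, down


# Toy values invented for illustration; they are not taken from GSE35988.
normal_vs_pca = {"GENE_A": (1.8, 0.001), "GENE_B": (-2.1, 0.0005), "GENE_C": (0.4, 0.2)}
pca_vs_crpc = {"GENE_A": (1.2, 0.004), "GENE_B": (-1.5, 0.002), "GENE_C": (2.5, 0.001)}

up, down = shared_degs(normal_vs_pca, pca_vs_crpc)
print(sorted(up), sorted(down))
```

Requiring the same direction in both comparisons mirrors the two Venn diagrams (one for up-regulated genes, one for down-regulated genes); a gene that changes direction between stages is excluded under this assumption.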
Functional analysis of differential genes: GO and KEGG signaling pathway analysis
The clusterProfiler and enrichplot R packages were used for enrichment analysis and for visualization of the functional analysis results. Figures were generated using the barplot and dotplot functions8.
Establishment of risk prognostic model
In the TCGA-PRAD dataset, the survival package was used to conduct univariate Cox regression analysis on the common differentially expressed genes. Genes significantly associated with progression-free interval (PFI) at P < 0.05 were carried forward to LASSO regression analysis, which was used for dimensionality reduction and variable selection. The penalty parameter (λ) was determined using ten-fold cross-validation; the λ value corresponding to the minimum criteria (the minimum cross-validated partial likelihood deviance) was selected, balancing model complexity against prediction accuracy. As shown in Fig.3a, the optimal λ retained seven genes with non-zero coefficients, which were subsequently used to construct the prognostic model9. After this reduction in the number of candidate genes, multivariate Cox regression analysis was performed. Based on the regression coefficients and the expression levels of the selected genes, each patient’s PFI risk score was calculated using the formula RS = EXPgene1 × β1 + EXPgene2 × β2 + EXPgene3 × β3 + … + EXPgene n × βn (where EXP represents gene expression and β is the corresponding regression coefficient from multivariate Cox regression)10. Prostate cancer patients were stratified into high-risk and low-risk groups based on their risk scores, and the model was evaluated using Kaplan–Meier and ROC analyses11. Calibration curves and decision curve analysis were used to assess the prognostic model’s accuracy and clinical value. Univariate and multivariate Cox regression analyses were performed to determine whether the risk score was an independent prognostic factor for PFI in PRAD patients, considering covariates such as age at diagnosis, Gleason score, prostate-specific antigen (PSA) level, clinical stage, and pathological stage.
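For reference, the LASSO-penalized Cox objective and the risk score described in this section can be written compactly as below. This is the textbook form and may differ by a scaling constant from what a given software package minimizes; the symbols x_i (expression vector of patient i), δ_i (event indicator), R(t_i) (set of patients still at risk at time t_i), and p (number of candidate genes) are notation introduced here, not taken from the original text.

```latex
\hat{\beta}(\lambda)
  = \arg\min_{\beta}\left\{ -\ell(\beta) + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert \right\},
\qquad
\ell(\beta)
  = \sum_{i:\,\delta_i = 1}\left[ x_i^{\top}\beta
      - \log \sum_{k \in R(t_i)} \exp\!\left( x_k^{\top}\beta \right) \right]

\mathrm{RS} = \sum_{j=1}^{n} \mathrm{EXP}_{\mathrm{gene}\,j}\,\beta_j
```

Larger values of λ shrink more coefficients exactly to zero, which is why the cross-validated choice of λ also fixes the number of genes retained.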
Validation of the predictive model’s accuracy
The GSE116918 dataset12 from the GEO database was obtained for additional validation of the established model. After calculating each patient’s risk score using the training set’s formula, we grouped patients into low-risk and high-risk categories according to the median score. Survival differences between these groups were analyzed using Kaplan–Meier (KM) curves, and predictive accuracy was assessed using receiver operating characteristic (ROC) curves.
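The median split and the Kaplan–Meier estimate mentioned above can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' R code; the helper names `split_by_median` and `kaplan_meier`, the rule that scores equal to the median fall into the low-risk group, and the toy data are assumptions made for illustration.

```python
from statistics import median


def split_by_median(risk_scores):
    """Map patient id -> 'high' if the score exceeds the cohort median, else 'low'.

    Assumption: scores exactly equal to the median are assigned to 'low'.
    """
    cutoff = median(risk_scores.values())
    return {pid: ("high" if score > cutoff else "low")
            for pid, score in risk_scores.items()}


def kaplan_meier(times, events):
    """Kaplan-Meier estimator.

    times  : follow-up time per patient
    events : 1 if progression was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Gather every patient whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


# Toy data invented for illustration
groups = split_by_median({"p1": 0.2, "p2": 1.5, "p3": -0.7, "p4": 0.9})
curve = kaplan_meier(times=[1, 2, 3, 4], events=[1, 0, 1, 1])
print(groups)
print(curve)
```

In practice one curve would be computed per risk group and the two compared with a log-rank test, which this sketch leaves out.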
Construction of nomogram and calibration curves
We integrated clinical data such as age, clinical T-stage, pathological T and N-stage, PSA level, Gleason score, and risk scores. The rms package in R was used to construct a nomogram to predict individual survival probability. Additionally, calibration curves were generated to assess the predicted survival rates for PRAD patients at 1, 3, and 5 years. The clinical relevance of the nomogram was evaluated using decision curve analysis (DCA).
Validation of differential expression of hub genes
TCGA_PRAD RNAseq data in TPM format were retrieved from The Cancer Genome Atlas (TCGA) database. Statistical calculations and visualization of TCGA_PRAD were performed using R (version 3.6.3). Hub gene expression differences between cancerous and normal tissues were analyzed. Similarly, GSE66187 was analyzed to compare the expression differences of hub genes between castration-resistant prostate cancer and primary prostate cancer tissues.
Clinical characteristics and prognostic analysis of hub genes in prostate cancer patients
Selected hub genes may have clinical significance in the prognosis of prostate cancer. The expression levels of target genes were individually analyzed for their correlation with clinical variables [pathological stage, clinical stage, age at diagnosis, prostate-specific antigen (PSA) level, Gleason score], and their association with progression-free interval (PFI) was evaluated.
Statistical processing
In this study, SPSS 25, R (version 4.3.2), and RStudio (2023.12.0 Build 372) were used for data processing. Measurement data were expressed as mean ± standard deviation (x ± s) if they followed a normal distribution and were compared using t-tests; if not normally distributed, non-parametric tests were used. Count data were expressed as rates (%) and compared using chi-square tests.
Results
Exploration of differentially expressed genes (DEGs)
In the comparison between normal and prostate cancer groups, 494 genes showed differential expression, including 192 up-regulated genes and 302 down-regulated genes (Fig.1a). Similarly, 4867 genes showed differential expression between the hormone-sensitive and castration-resistant groups, comprising 1900 up-regulated genes and 2967 down-regulated genes (Fig.1b). Venn plot analysis unveiled 182 common DEGs shared between the two datasets, including 30 co-regulated up-regulated genes and 152 co-regulated down-regulated genes (Fig.1e,f).
Identification of differentially expressed genes in GSE35988: (a) Volcano plot showing 494 genes differentially expressed between the normal and primary prostate cancer groups; (b) Volcano plot showing 4867 genes differentially expressed between the hormone-sensitive and castration-resistant groups; (c) Heatmap of differentially expressed genes between the normal and primary prostate cancer groups; (d) Heatmap of the top 500 differentially expressed genes ranked by log2FC between the hormone-sensitive and castration-resistant groups; (e) Venn diagram illustrating commonly upregulated genes between the Tumor-Normal (TN) and White Adipose Tissue (WAT) groups; (f) Venn diagram illustrating commonly downregulated genes between the TN and WAT groups.
GO and KEGG enrichment analysis results
GO analysis showed that the common differentially expressed genes were enriched in biological processes such as the mitotic cell cycle and cell division, in cellular components such as supramolecular complexes and the microtubule cytoskeleton, and in molecular functions such as cytoskeletal protein binding and microtubule binding (Fig.2a–c). KEGG analysis further revealed enrichment in pathways including focal adhesion, the Hippo signaling pathway, vascular smooth muscle contraction, and the TGF-β signaling pathway (Fig.2d).
GO and KEGG analysis of common differentially expressed genes: (a) Biological processes. (b) Cellular components. (c) Molecular functions. (d) KEGG pathways: Kyoto Encyclopedia of Genes and Genomes.
Risk prognostic model for prostate cancer prognosis
For the construction of a prognostic model in prostate cancer, common differentially expressed genes from both datasets underwent univariate Cox regression analysis, leading to the discovery of 67 genes linked to prognosis. Subsequently, LASSO regression analysis was performed, which revealed that the optimal model comprised seven genes: KIF4A, UBE2C, FAM72D, LIX1, CCDC78, HOXD9, and SLC5A8 (Fig.3a,b). Further refinement of the model was achieved through multivariate Cox regression analysis, providing regression coefficients for each gene and a constant term. Utilizing these coefficients, a risk prognostic model was developed, integrating gene expression values (Fig.3c). The corresponding regression coefficients β1–β7 are 0.486295137, −0.084724889, 0.20856452, −0.480372962, 0.177600963, 0.608003224, and −0.250531262, with a constant term of −2.421738301. According to the risk assessment model scoring formula, the prostate cancer prognostic model Risk Score is calculated as −2.421738301 + 0.486295137 × EXP(KIF4A) − 0.084724889 × EXP(UBE2C) + 0.20856452 × EXP(FAM72D) − 0.480372962 × EXP(LIX1) + 0.177600963 × EXP(CCDC78) + 0.608003224 × EXP(HOXD9) − 0.250531262 × EXP(SLC5A8), where EXP(gene) represents the gene expression value.
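The scoring formula above is a plain linear combination, which the following Python sketch makes explicit. The coefficients and constant term are copied from the text; the function name `risk_score` and the dictionary-based input are assumptions made for illustration, and expression values must be on the same scale as the data the model was fitted on, which this sketch does not check.

```python
# Coefficients and constant term as reported in the text
COEFFICIENTS = {
    "KIF4A": 0.486295137,
    "UBE2C": -0.084724889,
    "FAM72D": 0.20856452,
    "LIX1": -0.480372962,
    "CCDC78": 0.177600963,
    "HOXD9": 0.608003224,
    "SLC5A8": -0.250531262,
}
CONSTANT = -2.421738301


def risk_score(expression):
    """Return the constant term plus the sum of coefficient * expression per gene.

    expression: dict mapping each of the seven gene symbols to its expression value.
    """
    missing = COEFFICIENTS.keys() - expression.keys()
    if missing:
        raise KeyError(f"missing expression values for: {sorted(missing)}")
    return CONSTANT + sum(beta * expression[gene]
                          for gene, beta in COEFFICIENTS.items())


# Hypothetical patient with every gene at expression value 1.0
example_patient = {gene: 1.0 for gene in COEFFICIENTS}
print(risk_score(example_patient))
```

Genes with positive coefficients (KIF4A, FAM72D, CCDC78, HOXD9) raise the score as their expression increases, while genes with negative coefficients (UBE2C, LIX1, SLC5A8) lower it.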
Identification of co-expressed genes associated with the progression-free interval (PFI) in prostate cancer. (a) Partial likelihood deviance for different numbers of variables in the LASSO (least absolute shrinkage and selection operator) regression model. Blue dots indicate the partial likelihood deviance values, while grey lines represent the partial likelihood deviance ± standard error (SE). The two vertical dashed lines on the left and right denote the values determined by the minimum criteria and the 1-SE criteria, respectively. The log(λ) value was selected through ten-fold cross-validation using the minimum criteria, which corresponds to seven genes with non-zero coefficients. (b) LASSO coefficient profiles of the 67 genes related to PFI. (c) Multivariate Cox regression analysis of the genes identified through LASSO regression.
Using this risk prognostic model, we assessed its effectiveness through Kaplan–Meier survival analysis, ROC curve analysis, and decision curve analysis (DCA). Patients were stratified into high-risk and low-risk categories based on their calculated risk scores. The high-risk group exhibited significantly elevated rates of disease recurrence or mortality and shorter disease-free intervals compared to the low-risk group. Notably, higher risk scores were associated with poorer prognoses in patients diagnosed with prostate adenocarcinoma (PRAD). Analysis of gene expression patterns revealed elevated levels of KIF4A, UBE2C, FAM72D, CCDC78, and HOXD9 in the high-risk group, while LIX1 and SLC5A8 were downregulated (Fig.4a). Kaplan–Meier survival analysis confirmed the poorer outcomes of the high-risk group (P < 0.001, HR = 4.72, Fig.4b).
Construction of the prognostic model for predicting Prostate Cancer PFI in the TCGA training cohort. (a) Distribution of prostate cancer patients in the TCGA training cohort based on prognostic risk model scores, recurrence status, and expression patterns of seven genes. (b) Kaplan–Meier curve illustrating PFI based on prognostic risk model scores in the TCGA training cohort. (c) Time-dependent ROC curve assessing the predictability of 1, 3, and 5-year PFI in the TCGA training cohort. (d) DCA curve depicting risk scores in the TCGA training cohort. (e) Calibration curves predicting 1, 3, and 5-year PFI for patients. (f) Univariate and multivariate Cox regression analyses of risk scores, patient age at diagnosis, Gleason score, PSA level, pathological tumor stage, and clinical tumor stage in the TCGA cohort. PSA ≤ 4 was defined as 0, while PSA > 4 was defined as 1.
As demonstrated in Fig.4c, the risk model exhibited robust predictive performance at 1 year (AUC = 0.754), 3 years (AUC = 0.776), and 5 years (AUC = 0.706). Decision curve analysis (DCA) highlighted the model’s significant net benefit (Fig.4d), and the calibration curve validated its consistent predictive accuracy (Fig.4e).
To further evaluate the independent prognostic significance of the risk score and clinical characteristics, both univariate and multivariate Cox regression analyses were conducted. Univariate analysis identified risk score, pathological T and N stages, clinical T stage, Gleason score, and PSA level as prognostic indicators in TCGA-PRAD. In multivariate analysis, risk score and PSA level emerged as independent prognostic factors. These findings underscore the robustness and clinical utility of our prognostic model as a biomarker for prostate cancer prognosis (Fig.4f).
Validation of the risk model using the GEO database
In the GSE116918 validation dataset, Kaplan–Meier analysis (Fig.5a) confirmed that patients categorized as low-risk exhibited better prognoses than those classified as high-risk (P = 0.020; HR = 1.92, 95% CI = 1.11–3.31), consistent with findings from the training dataset. The time-dependent ROC curves (Fig.5b) gave AUC values of 0.728, 0.526, and 0.599 at 1 year, 3 years, and 5 years, respectively, indicating that discrimination in this cohort was best at 1 year and weaker at later time points.
Validation of Risk Model in GSE116918 Dataset (a) Kaplan–Meier curves of PFI based on risk assessment model scores (b) Time-dependent ROC curves predicting patient PFI at 1, 3, and 5 years.
Construction and evaluation of the nomogram
We constructed a nomogram that provides clinicians with a quantitative approach to predicting the prognosis of PRAD patients, incorporating Gleason score, PSA level, tumor clinical T stage, pathological T and N stages, and risk scores. The nomogram highlighted the significance of risk scores among the clinical parameters (Fig.6a). Additionally, calibration curves illustrated the agreement between the nomogram predictions and actual survival outcomes of PRAD patients (Fig.6c). Compared to conventional prognostic scoring systems, our model exhibited a higher AUC value (AUC = 0.775, Fig.6b).
Construction and Evaluation of Nomogram for Predicting 1-year, 3-year, and 5-year PFI in PCa (a) Nomogram for predicting 1-year, 3-year, and 5-year PFI in PCa patients (b) Calibration curves predicting 1-year, 3-year, and 5-year PFI incidence rates (c) ROC curves comparing risk scores and other variables for predicting PFI.
Validation of hub gene expression levels
These findings were corroborated in the TCGA-PRAD dataset, where KIF4A, UBE2C, FAM72D, and CCDC78 were highly expressed in prostate cancer, while LIX1, SLC5A8, and HOXD9 were expressed at lower levels (Fig.7a). Using the raw microarray data from GSE66187, scatter plots were generated to illustrate the expression differences of Hub genes (Fig.7b). Results indicated that KIF4A, UBE2C, FAM72D, and CCDC78 exhibited high expression levels in Castration-Resistant Prostate Cancer, whereas LIX1, SLC5A8, and HOXD9 showed low expression levels, consistent with the risk prognostic model results.
Expression Levels of Hub Genes Validated in TCGA-PRAD and GSE66187 (a) Expression of hub genes in the TCGA-PRAD cohort (b) Expression of hub genes in GSE66187.
Clinical significance and survival analysis of hub genes
Through integration with clinical prognostic information from the TCGA database, the screened Hub genes underwent clinical prognostic analysis (Fig.8). High expression of KIF4A, UBE2C, FAM72D, or low expression of LIX1 was associated with higher pathological T and N stages, clinical T stage, age, PSA level, Gleason score, and poorer PFI in prostate cancer patients. Similarly, low expression of SLC5A8 was linked to higher pathological T and N staging, clinical T and M staging, age, Gleason score, and poorer PFI. Additionally, high CCDC78 expression correlated with higher pathological T and N staging, age, Gleason score, and poorer PFI. Furthermore, increased expression of HOXD9 correlated with higher Gleason scores and worse progression-free interval (PFI).
Relationship between Hub Gene Expression and Clinical Characteristics (a) KIF4A (b) SLC5A8 (c) LIX1 (d) UBE2C (e) FAM72D (f) CCDC78 (g) HOXD9.
Discussion
The rising occurrence of prostate cancer presents an increasingly formidable obstacle due to factors like the aging population, improved standards of living, and heightened healthcare awareness13. Amidst the array of treatment options for advanced prostate cancer, androgen deprivation therapy (ADT) emerges as a cornerstone approach. Despite achieving short-term relief from symptoms, the disease often persists and evolves, culminating in recurrence and the development of Castration-Resistant Prostate Cancer (CRPC)14. Consequently, there is a pressing need to identify biomarkers indicative of prostate cancer initiation and progression to CRPC, enabling early intervention to inform clinical diagnosis and treatment strategies. The objective of this study was to pinpoint shared fundamental genes linked to both prostate cancer (PCa) and CRPC, enabling early detection of high-risk patients and development of a prognostic model rooted in these core genes, shedding light on their potential impact on tumor prognosis.
Initially, GSE35988 was selected as the training dataset to conduct differential expression analysis between the normal versus prostate cancer group and the prostate cancer versus Castration-Resistant Prostate Cancer (CRPC) group, with the aim of pinpointing commonly differentially expressed genes. Following this, a prognostic risk model was established through Cox proportional hazard modeling and Lasso Cox regression analysis, integrating seven genes (KIF4A, UBE2C, FAM72D, CCDC78, HOXD9, LIX1, and SLC5A8).
Notably, Kinesin family member 4A (KIF4A), belonging to the kinesin 4 subfamily, plays a pivotal role in regulating chromosome cohesion and segregation during mitosis15. Its overexpression has been linked to adverse outcomes in lung, breast, and colon cancers, highlighting its significance in cancer biology. Understanding KIF4A’s role in mitosis and its correlation with cancer prognosis could inform potential therapeutic approaches for these malignancies. Our findings revealed a connection between elevated KIF4A expression and advanced pathological stage and higher Gleason score in prostate cancer patients. This association suggests potential implications for disease progression and clinical outcomes in these individuals. Consistent with prior studies, heightened KIF4A expression was linked to enhanced proliferation and migration abilities in prostate cancer cells. Moreover, downregulation of KIF4A was demonstrated to counteract the progression of endocrine therapy-resistant CRPC through modulation of the androgen receptor (AR)16.
Ubiquitin-conjugating enzyme E2C (UBE2C) serves as a critical regulator in eukaryotic protein degradation pathways, contributing to the disruption of mitotic cycle proteins and affecting cell cycle progression. Elevated UBE2C expression has been linked to the development and advancement of several cancers, such as lung cancer, esophageal adenocarcinoma, hepatocellular carcinoma, nasopharyngeal carcinoma, and breast cancer. Moreover, its overexpression is frequently correlated with poor prognoses in breast, thyroid, cervical, bile duct, and gastric cancers17. In our study, we observed significant up-regulation of UBE2C during prostate cancer formation, with a further increase in CRPC. Elevated UBE2C expression correlated with higher pathological T and N stages, clinical T stage, patient age, PSA level, Gleason score, and poorer PFI in prostate cancer patients.
FAM72D, located on chromosome 1q21.1, has been reported to be up-regulated in high-risk multiple myeloma (MM) and is associated with enhanced MM cell proliferation, indicative of a poor prognosis18. However, little is known about its function in prostate cancer, and further exploration is needed.
Limb Expression 1 (LIX1) is localized in mitochondria, where it regulates mitochondrial shape and redox signaling. It is predominantly expressed in gastrointestinal stromal tumors (GISTs), often indicating an unfavorable prognosis. Knockdown of LIX1 has been shown to inhibit the MAPK pathway in GIST cells and enhance the anti-tumor effect of imatinib19. Our findings reveal a significant downregulation of LIX1 expression in both prostate cancer and CRPC, and low LIX1 expression was associated with poorer prognosis.
Homeobox D9 (HOXD9), known for its pivotal role in governing cellular processes, is typically overexpressed in various cancers, including gastric, cervical, pancreatic, colorectal, and hepatocellular carcinomas, and is often associated with unfavorable clinical outcomes20. Surprisingly, our investigation found decreased expression of HOXD9 in prostate cancer compared with normal prostate tissue, with a further decrease in CRPC. Nevertheless, among prostate cancer patients, higher HOXD9 expression was associated with higher Gleason scores and worse PFI.
SLC5A8, a protein-coding gene, is known to exert tumor-suppressive effects, particularly in colon and thyroid cancer cells. It promotes apoptosis of tumor cells while inhibiting their proliferation, thereby impeding tumor progression21. Our findings are consistent with this role of SLC5A8 in prostate cancer.
This study also has some limitations. The mechanism of action of these genes on prostate cancer cell proliferation, invasion, and apoptosis remains unclear and requires further exploration. Additionally, the sample size in this study is small, highlighting the need for subsequent multicenter, large-sample, prospective studies to validate the findings.
Conclusion
This study established a prognostic model utilizing KIF4A, UBE2C, FAM72D, CCDC78, HOXD9, LIX1, and SLC5A8, accurately forecasting the outcomes of prostate adenocarcinoma (PRAD) patients. We systematically investigated how these genes correlate with clinical characteristics in prostate cancer patients, indicating their promise as diagnostic and prognostic markers. Furthermore, these genes may offer valuable therapeutic targets for individualized treatment strategies.
Data availability
The TCGA gene expression profiles (FPKM values) and corresponding clinical information used in the current study are publicly available in the TCGA-PRAD repository (https://portal.gdc.cancer.gov/repository/). The microarray expression profiles and corresponding survival information are publicly available from the GEO database under accession numbers GSE35988, GSE66187, and GSE116918 (https://www.ncbi.nlm.nih.gov/geo/). Further inquiries can be directed to the corresponding author.
References
Global Burden of Disease Cancer Collaboration, Fitzmaurice, C. et al. Global, regional, and national cancer incidence, mortality, years of life lost, years lived with disability, and disability-adjusted life-years for 29 cancer groups, 1990 to 2016: A systematic analysis for the Global Burden of Disease Study. JAMA Oncol. 4(11), 1553–1568. https://doi.org/10.1001/jamaoncol.2018.2706 (2018).
Hoffman, K. E. et al. Patient-reported outcomes through 5 years for active surveillance, surgery, brachytherapy, or external beam radiation with or without androgen deprivation therapy for localized prostate cancer. JAMA 323(2), 149–163. https://doi.org/10.1001/jama.2019.20675 (2020).
Watson, P. A., Arora, V. K. & Sawyers, C. L. Emerging mechanisms of resistance to androgen receptor inhibitors in prostate cancer. Nat. Rev. Cancer 15(12), 701–711. https://doi.org/10.1038/nrc4016 (2015).
El Fakiri, M. et al. PSMA-Targeting radiopharmaceuticals for prostate cancer therapy: Recent developments and future perspectives. Cancers 13(16), 3967. https://doi.org/10.3390/cancers13163967 (2021).
Grasso, C. S. et al. The mutational landscape of lethal castration-resistant prostate cancer. Nature 487(7406), 239–243. https://doi.org/10.1038/nature11125 (2012).
Zhang, X. et al. SRRM4 Expression and the loss of rest activity may promote the emergence of the neuroendocrine phenotype in castration-resistant prostate cancer. Clin. Canc. Res.: An Off. J. Am. Assoc. Canc. Res. 21(20), 4698–4708. https://doi.org/10.1158/1078-0432.CCR-15-0157 (2015).
limma powers differential expression analyses for RNA-sequencing and microarray studies. https://pubmed.ncbi.nlm.nih.gov/25605792/ (accessed 25 February 2024).
Yu, G. et al. clusterProfiler: An R package for comparing biological themes among gene clusters. Omics: J. Integr. Biol., 16(5), 284–287. https://doi.org/10.1089/omi.2011.0118 (2012).
Goeman, J. J. L1 penalized estimation in the Cox proportional hazards model. Biometr. J. Biometr. Zeitschrift 52(1), 70–84. https://doi.org/10.1002/bimj.200900028 (2010).
Comprehensive investigation of a novel differentially expressed lncRNA expression profile signature to assess the survival of patients with colorectal adenocarcinoma. https://pubmed.ncbi.nlm.nih.gov/28187432/ (accessed 25 February 2024).
A nonparametric test for the association between longitudinal covariates and censored survival data. https://pubmed.ncbi.nlm.nih.gov/30796830/ (accessed 25 February 2024).
Jain, S. et al. Validation of a Metastatic Assay using biopsies to improve risk stratification in patients with prostate cancer treated with radical radiation therapy. Ann. Oncol.: Off. J. Eur. Soci. Med. Oncol. 29(1), 215–222. https://doi.org/10.1093/annonc/mdx637 (2018).
Showalter, T. N. Commentary: In search of answers regarding the benefits and harms of short term ADT for intermediate-risk prostate cancer. The Canad. J. Urol. 24(1), 8663 (2017).
Chowdhury, S. et al. Real-world outcomes in first-line treatment of metastatic castration-resistant prostate cancer: The prostate cancer registry. Targeted Oncol. 15(3), 301–315. https://doi.org/10.1007/s11523-020-00720-2 (2020).
Kahm, Y. J. et al. Impact of KIF4A on cancer stem cells and EMT in lung cancer and Glioma. Cancers 15(23), 5523. https://doi.org/10.3390/cancers15235523 (2023).
Chen, J. et al. KIF4A: A potential biomarker for prediction and prognostic of prostate cancer. Clin. Invest. Med. 43(3), 49–59. https://doi.org/10.25011/cim.v43i3.34393 (2020).
Ong, K. H. et al. Ubiquitin-conjugating enzyme E2C (UBE2C) is a prognostic indicator for cholangiocarcinoma. Eur. J. Med. Res. 28(1), 593. https://doi.org/10.1186/s40001-023-01575-9 (2023).
Chatonnet, F. et al. The hydroxymethylome of multiple myeloma identifies FAM72D as a 1q21 marker linked to proliferation. Haematologica 105(3), 774–783. https://doi.org/10.3324/haematol.2019.222133 (2020).
S R D, E T, S D, et al. LIX1 controls MAPK signaling reactivation and contributes to GIST-T1 cell resistance to imatinib. Int. J. Mol. Sci. 24(8). https://pubmed.ncbi.nlm.nih.gov/37108337/ (2023).
Li, J. et al. BST2 promotes gastric cancer metastasis under the regulation of HOXD9 and PABPC1. Mol. Carcinogen. https://doi.org/10.1002/mc.23679 (2024).
Yang, Y. et al. Role of hypermethylated SLC5A8 in follicular thyroid cancer diagnosis and prognosis prediction. World J. Surg. Oncol. 21(1), 367. https://doi.org/10.1186/s12957-023-03240-1 (2023).
Author information
Authors and Affiliations
Second Affiliated Hospital of Zhengzhou University, Zhengzhou, China
Changhui Fan, Zhiheng Huang, Tianhe Zhang, Haiyang Wei, Junfeng Gao & Changbao Xu
School of Basic Medical Sciences, Zhengzhou University, Zhengzhou, China
Han Xu
Contributions
C.H.F. contributed to the study design and article revision. Z.H.H. contributed to the study design, data analysis, result interpretation and manuscript drafting. H.X., T.H.Z. and H.Y.W. were involved in reviewing the manuscript. Z.H.H., H.X., J.F.G., C.B.X., and C.H.F. were involved in the manuscript revision. All authors read and approved the final manuscript.
Corresponding author
Correspondence to Zhiheng Huang.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Fan, C., Huang, Z., Xu, H. et al. Machine learning-based identification of co-expressed genes in prostate cancer and CRPC and construction of prognostic models. Sci Rep 15, 5679 (2025). https://doi.org/10.1038/s41598-025-90444-y
DOI: https://doi.org/10.1038/s41598-025-90444-y
Keywords
- Prostate cancer
- Castration resistance
- TCGA
- Differentially expressed genes
- Prognostic modeling