Proteome-wide Mendelian randomization identifies causal plasma proteins in prostate cancer development

Abstract

Background

The etiology of prostate cancer remains elusive, and whether plasma protein levels are associated with prostate cancer is still unknown.

Methods

We performed Mendelian randomization analyses to estimate the causal effects of plasma proteins on the risk of prostate cancer in the PRACTICAL consortium dataset, using cis-protein quantitative trait loci (cis-pQTL) variants as instrumental variables for plasma proteins and cis-expression quantitative trait loci (cis-eQTL) variants as instrumental variables for circulating gene expression. We then replicated the findings in the FinnGen consortium.

Results

Genetically proxied levels of 4 plasma proteins (CREB3L4, HDGF, SERPINA3, GNPNAT1) were associated with an increased risk of prostate cancer, while genetically proxied levels of 5 plasma proteins (TNFRSF6B, GSK3A, EIF4B, CLIC1, SMAD2) were significantly associated with a decreased risk of prostate cancer in the PRACTICAL consortium. Among the identified proteins, the causal effects of six proteins (CREB3L4, HDGF, SERPINA3, TNFRSF6B, EIF4B, and SMAD2) remained significant in the replication analyses in the FinnGen consortium and in the meta-analyses combining both datasets (SMAD2: OR 0.710, 95% CI 0.578–0.873, p-value = 0.001; CREB3L4: OR 1.260, 95% CI 1.164–1.364, p-value < 0.0001; HDGF: OR 1.072, 95% CI 1.021–1.125, p-value = 0.005; SERPINA3: OR 1.138, 95% CI 1.091–1.187, p-value < 0.0001; TNFRSF6B: OR 0.656, 95% CI 0.496–0.869, p-value = 0.003; EIF4B: OR 0.701, 95% CI 0.618–0.796, p-value < 0.0001). SMAD2 and CREB3L4 gene expression proxied with cis-expression quantitative trait loci was also significantly associated with the risk of prostate cancer in both consortia and in the combined meta-analyses (SMAD2: OR 0.787, 95% CI 0.719–0.861, p-value = 1.00 × 10⁻⁴; CREB3L4: OR 1.219, 95% CI 1.033–1.438, p-value = 0.019).

Conclusions

Our consistent results highlighted the important roles of plasma SMAD2 and CREB3L4 in the risk of prostate cancer. Further investigations on these proteins may reveal their potential in the prevention and treatment of prostate cancer.

Background

Prostate cancer is one of the most common types of cancer in males, with the second highest incidence rate among all malignant tumors [1, 2]. Although the prognosis of prostate cancer at early stages is relatively good, with a 5-year survival rate over 99%, the survival rates drop sharply for patients with advanced or metastatic prostate cancer [1]. The overall death rate of prostate cancer ranks fifth among all cancer deaths in males, which imposes a significant burden on society [3]. Therefore, investigations of the molecular mechanisms underlying prostate cancer progression may help to identify potential pharmaceutical targets and improve the overall prognosis of prostate cancer. Currently, the etiology of prostate cancer remains elusive, although several risk factors have been identified, including obesity, diet, inflammation, age, and family history [4,5,6]. Previous studies have proposed that levels of circulating proteins can be associated with the risk of prostate cancer [7, 8]. However, the number of studies on this topic is limited, and most of them are based on an observational design, which may be biased by reverse causality and confounding factors. Additionally, it is not feasible to explore the associations of thousands of proteins with prostate cancer using randomized controlled trials. To address these problems, we performed a Mendelian randomization (MR) study by integrating genetic datasets from large genome-wide association studies.

MR analyses employ genetic variants as instrumental variables to proxy an exposure and assess its causal effect on the outcome of interest [9]. Because SNPs are presumed to be assigned randomly during gamete formation and are less likely to be affected by confounding factors, the MR design mimics a randomized controlled trial and reduces the risk of reverse causality and residual confounding. Specifically, we used cis-protein quantitative trait loci (cis-pQTL) and cis-expression quantitative trait loci (cis-eQTL) variants as instrumental variables for the proteins of interest. A cis variant is an SNP within a certain range of the transcription start site of the protein-encoding gene [10]. Findings from MR studies may be biased by horizontal pleiotropy, which occurs when the instrumental variable has an effect on the outcome independent of the exposure. Using cis-pQTLs and cis-eQTLs as proxies for the levels of the circulating proteome can help to decrease the risk of horizontal pleiotropy [11, 12]. Previous studies have used cis-pQTLs to estimate the causal effects of the circulating proteome on the risk of multiple diseases [10, 11, 13, 14]. Our study can highlight causal proteins and genes involved in the incidence of prostate cancer and may provide potential targets for the prevention and treatment of prostate cancer.

Methods

Study design

The schematic plot of the study design is presented in Fig. 1. Cis-pQTL variants and their associations with circulating proteins were obtained from a previous publication [15]. Associations of the cis-pQTL variants with prostate cancer were studied in summary-level statistics of genome-wide association studies (GWASs) from the Prostate Cancer Association Group to Investigate Cancer Associated Alterations in the Genome (PRACTICAL) consortium and FinnGen Release 8 [16, 17]. Two-sample MR analyses were performed to assess the causal associations between circulating proteome and the risk of prostate cancer in the two independent cohorts respectively and the causal estimates were combined with meta-analyses. In addition, we also used cis-eQTL to proxy the circulating expression of protein coding genes to validate the causal effects of the identified proteins on prostate cancer risk with MR analyses.

Fig. 1 Schematic view of the study design

Exposures

A total of 7141 conditionally independent cis-pQTL variants for 1925 plasma proteins were obtained from a previous publication based on individual-level data from the Atherosclerosis Risk in Communities (ARIC) study (Table 1) [15, 18]. Only subjects of European ancestry were included in our analyses, for a total of 7213 European Americans. The plasma protein levels were measured by SomaLogic Inc. (Boulder, Colorado, US) using an aptamer (SOMAmer)-based approach [19]. Genotyping of the samples was performed with the Affymetrix 6.0 DNA microarray, and genotypes were then imputed with the TOPMed reference panel (Freeze 5b) [15]. The results were adjusted for covariates including sex, age, study site, and 10 genetic principal components. The mapping window of the cis-pQTL variants was defined as within 500 kb of the transcription start site of the gene encoding the target protein. Cis-eQTL variants for SMAD2, CREB3L4, EIF4B and HDGF gene expression were obtained from the eQTLGen consortium phase II (https://www.eqtlgen.org/) [20]. The cis-eQTL datasets for circulating expression of SMAD2, CREB3L4, EIF4B and HDGF were based on a cohort enrolling over 31,000 subjects of European ancestry.
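
The cis-window rule described above can be sketched in a few lines of Python. This is only an illustration of the 500 kb criterion: the variant identifiers, chromosome, positions and transcription start site below are hypothetical placeholders, not values from the ARIC data, and the `Variant` type and `is_cis` helper are names introduced here.

```python
from dataclasses import dataclass

CIS_WINDOW_BP = 500_000  # 500 kb on either side of the TSS, as stated in the text


@dataclass(frozen=True)
class Variant:
    rsid: str
    chrom: str
    pos: int  # base-pair position


def is_cis(variant: Variant, gene_chrom: str, tss: int,
           window: int = CIS_WINDOW_BP) -> bool:
    """Return True if the variant lies on the gene's chromosome within `window` bp of the TSS."""
    return variant.chrom == gene_chrom and abs(variant.pos - tss) <= window


# Hypothetical variants and TSS, for illustration only.
variants = [
    Variant("rsA", "18", 45_300_000),  # 157 kb from the TSS: inside the window
    Variant("rsB", "18", 46_200_000),  # 743 kb from the TSS: outside the window
    Variant("rsC", "1", 45_300_000),   # different chromosome: never cis
]
cis_hits = [v.rsid for v in variants if is_cis(v, gene_chrom="18", tss=45_457_000)]
print(cis_hits)
```

In practice the window would be applied to the full pQTL summary statistics before selecting conditionally independent variants.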

Table 1 Detailed information of data sources of the included GWAS studies. GWAS, genome-wide association studies

Prostate cancer

In the primary analyses, we obtained the summary-level GWAS data of prostate cancer from the PRACTICAL study, which is a meta-analysis of 52 GWAS studies enrolling 79,148 prostate cancer patients and 61,106 controls of European ancestry (Table 1) [16]. In the PRACTICAL study, European ancestry was assigned to individuals with an estimated European ancestry over 80%, with reference to the HapMap populations. Duplicated samples and first-degree relatives were excluded. The QC pipeline excluded SNPs with a call rate < 95%, SNPs deviating from Hardy–Weinberg equilibrium (P < 10⁻⁷ in controls or P < 10⁻¹² in cases), and SNPs with a minor allele frequency (MAF) < 1%. A total of 498,417 SNPs were retained after the QC steps. The definition of prostate cancer and the case sources of the PRACTICAL study are described in detail in the source paper [16]. A likelihood-ratio test was used in the analyses to minimize bias from rare variants. A fixed-effect inverse variance weighted meta-analysis method was used to combine the odds ratios (OR) and standard errors using METAL [21]. In total, 1528 plasma proteins had enough instrumental variables (IVs) to be included in the analyses in the PRACTICAL dataset.

In the secondary analyses, summary-level GWAS data of prostate cancer were derived from the FinnGen release 8 [17]. Individuals with genotype missingness (> 5%) and non-Finnish ancestry were excluded. A total of 121,779 male subjects (11,590 cases and 110,189 healthy controls) were enrolled in the cohort. In GWAS quality control steps, variants with high missingness (> 2%), low HWE P-value (< 1e−6) and low minor allele count (MAC < 3) were excluded. Prostate cancer was defined with ICD codes (ICD-10: C61; ICD-9: 185; ICD-8: 185). 1514 proteins had enough IVs to be included in the MR analyses in the FinnGen dataset.

Mendelian randomization

Associations between plasma protein levels or gene expression and prostate cancer risk estimated from individual SNPs (cis-pQTLs and cis-eQTLs) were calculated with Wald ratios. A fixed-effect inverse variance weighted (IVW) method was used to combine the Wald ratios when fewer than 3 SNPs were used as instrumental variables; otherwise, a random-effect IVW method was used. MR-Egger and weighted median approaches were further used as sensitivity analyses. The MR-Egger method can detect potential horizontal pleiotropy through the p-value of its intercept and, under the Instrument Strength Independent of Direct Effect (InSIDE) assumption, provides causal estimates corrected for horizontal pleiotropy at the cost of statistical power [22]. The weighted median model can generate causal estimates when up to half of the instrumental variables are invalid [23]. Cochran's Q values were calculated to quantify the heterogeneity in the analyses. The MR results were reported as odds ratios (OR) and their corresponding 95% confidence intervals (CI), scaled to a one standard deviation (SD) increment in the genetically predicted plasma protein levels.
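
As a rough illustration of the estimators named above, the following Python sketch computes per-SNP Wald ratios and combines them with IVW, switching between fixed and random effects at the three-SNP threshold. It is not the authors' code (the analyses were run in R): the first-order standard error for the Wald ratio and the multiplicative random-effects form are assumptions made here, and the summary statistics are hypothetical.

```python
import math


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Per-SNP causal estimate with a first-order (delta-method) standard error."""
    return beta_outcome / beta_exposure, se_outcome / abs(beta_exposure)


def ivw_fixed(estimates, ses):
    """Fixed-effect inverse-variance-weighted mean of the Wald ratios."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, estimates)) / sum(weights)
    return pooled, math.sqrt(1.0 / sum(weights))


def ivw_random(estimates, ses):
    """Multiplicative random effects (assumed form): same point estimate,
    standard error inflated by overdispersion but never deflated."""
    pooled, se = ivw_fixed(estimates, ses)
    q = sum(((b - pooled) / s) ** 2 for b, s in zip(estimates, ses))
    phi = max(1.0, q / (len(estimates) - 1))
    return pooled, se * math.sqrt(phi)


def mr_estimate(snps):
    """snps: list of (beta_exposure, beta_outcome, se_outcome) tuples."""
    ratios = [wald_ratio(*snp) for snp in snps]
    estimates = [r[0] for r in ratios]
    ses = [r[1] for r in ratios]
    if len(snps) < 3:  # fewer than 3 instruments: fixed-effect IVW
        return ivw_fixed(estimates, ses)
    return ivw_random(estimates, ses)


def as_odds_ratio(beta, se):
    """Convert a log-odds estimate to an OR with a 95% confidence interval."""
    return math.exp(beta), math.exp(beta - 1.96 * se), math.exp(beta + 1.96 * se)


# Hypothetical summary statistics, for illustration only.
one_snp = [(0.50, 0.10, 0.02)]
three_snps = [(0.50, 0.10, 0.02), (0.30, 0.09, 0.03), (0.40, 0.02, 0.02)]
beta_1, se_1 = mr_estimate(one_snp)
beta_3, se_3 = mr_estimate(three_snps)
print(as_odds_ratio(beta_1, se_1))
print(as_odds_ratio(beta_3, se_3))
```

With a single instrument the IVW estimate reduces to the Wald ratio itself, which is why single-SNP proteins can still be analysed.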

The results of MR analyses from the two prostate cancer datasets were further combined with meta-analyses to validate the robustness of the findings. When significant heterogeneity existed (I² > 50% or p-value < 0.05), a random-effect model was used in the meta-analyses; otherwise, a fixed-effect model was used.
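
The model-selection rule above can be sketched as follows. The sketch makes two simplifying assumptions that the text does not state: it uses the DerSimonian-Laird estimator for the between-study variance, and it applies only the I² criterion (the heterogeneity p-value check is omitted). The two inputs are the cis-eQTL CREB3L4 odds ratios and confidence intervals reported later in the Results, converted back to the log scale.

```python
import math


def log_or_and_se(odds_ratio, ci_low, ci_high):
    """Recover the log odds ratio and its standard error from an OR and 95% CI."""
    return math.log(odds_ratio), (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)


def meta_analyse(betas, ses, i2_cutoff=50.0):
    """Fixed-effect pooling unless I² exceeds the cutoff, then DerSimonian-Laird."""
    w = [1.0 / s ** 2 for s in ses]
    fixed = sum(wi * b for wi, b in zip(w, betas)) / sum(w)
    q = sum(wi * (b - fixed) ** 2 for wi, b in zip(w, betas))  # Cochran's Q
    df = len(betas) - 1
    i2 = 100.0 * max(0.0, (q - df) / q) if q > 0 else 0.0
    if i2 <= i2_cutoff:
        return "fixed", fixed, math.sqrt(1.0 / sum(w)), i2
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)  # between-study variance
    w_star = [1.0 / (s ** 2 + tau2) for s in ses]
    pooled = sum(wi * b for wi, b in zip(w_star, betas)) / sum(w_star)
    return "random", pooled, math.sqrt(1.0 / sum(w_star)), i2


# (OR, lower 95% CI, upper 95% CI) for PRACTICAL and FinnGen, taken from the text.
studies = [(1.130, 1.058, 1.206), (1.338, 1.182, 1.515)]
pairs = [log_or_and_se(*s) for s in studies]
model, beta, se, i2 = meta_analyse([p[0] for p in pairs], [p[1] for p in pairs])
pooled_or = math.exp(beta)
print(model, round(i2, 1), round(pooled_or, 3))
```

With these inputs I² exceeds 50%, so the sketch takes the random-effects branch.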

Colocalization analyses

We further examined whether the identified pQTLs and prostate cancer share causal variants using colocalization analyses. Colocalization analyses can test whether the identified associations could be explained by distinct causal variants in linkage disequilibrium rather than by a shared causal variant. The coloc.abf function from the R package coloc was used to perform the analyses [24]. Full GWAS summary data of prostate cancer from the PRACTICAL consortium and pQTL datasets from the ARIC study were included in the analyses. We set the priors to their defaults, with p1 as 1 × 10⁻⁴, p2 as 1 × 10⁻⁴, and p12 as 1 × 10⁻⁵. The analyses are based on a Bayesian model that assesses the following five hypotheses: (H0) no association with either trait; (H1) association with trait 1 only; (H2) association with trait 2 only; (H3) both traits are associated, but with distinct causal variants; and (H4) both traits are associated and share the same causal variant. A posterior probability is provided for each hypothesis (PH0, PH1, PH2, PH3, PH4). PH4 > 0.8 was considered strong support for colocalization, while 0.5 ≤ PH4 ≤ 0.8 was considered medium support for colocalization.
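
To make the five posterior probabilities concrete, the sketch below combines per-SNP log approximate Bayes factors for two traits into PH0 to PH4 using the priors quoted above. It is a Python illustration of the combination step as commonly described for approximate-Bayes-factor colocalization, not the coloc package itself; the per-SNP log Bayes factors are hypothetical inputs, and the function names are introduced here.

```python
import math


def logsumexp(xs):
    """Numerically stable log(sum(exp(x)))."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def logdiffexp(a, b):
    """log(exp(a) - exp(b)), or -inf when the difference is not positive."""
    x = math.exp(b - a)
    if x >= 1.0:
        return -math.inf
    return a + math.log1p(-x)


def coloc_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities [PH0, PH1, PH2, PH3, PH4] from per-SNP log ABFs."""
    s1 = logsumexp(labf1)
    s2 = logsumexp(labf2)
    s12 = logsumexp([a + b for a, b in zip(labf1, labf2)])
    log_h = [
        0.0,                                                     # H0: no association
        math.log(p1) + s1,                                       # H1: trait 1 only
        math.log(p2) + s2,                                       # H2: trait 2 only
        math.log(p1) + math.log(p2) + logdiffexp(s1 + s2, s12),  # H3: distinct variants
        math.log(p12) + s12,                                     # H4: shared variant
    ]
    denom = logsumexp(log_h)
    return [math.exp(h - denom) for h in log_h]


def support(ph4):
    """Thresholds used in the text: > 0.8 strong, 0.5 to 0.8 medium."""
    if ph4 > 0.8:
        return "strong"
    if ph4 >= 0.5:
        return "medium"
    return "none"


# Hypothetical log ABFs for three SNPs in one region.
shared = coloc_posteriors([0.0, 12.0, 0.5], [0.2, 10.0, 0.1])    # same SNP drives both
distinct = coloc_posteriors([12.0, 0.0, 0.0], [0.0, 10.0, 0.0])  # different SNPs
print(support(shared[4]), support(distinct[4]))
```

When the same SNP carries the evidence for both traits, PH4 dominates; when different SNPs do, the mass shifts towards PH3.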

Statistical analyses

All analyses were two-sided. In the primary analyses in the PRACTICAL dataset, a p-value less than 3.27 × 10⁻⁵ (0.05/1528 proteins, Bonferroni adjusted) was considered statistically significant, while a p-value between 3.27 × 10⁻⁵ and 0.05 was considered suggestively significant. In the replication analyses in the FinnGen dataset, a p-value less than 0.05 was considered significant. All analyses were performed on the R platform, with the TwoSampleMR and MendelianRandomization packages [25, 26].
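
The thresholds above follow from simple arithmetic, sketched here in Python. The `classify` helper and its labels are illustrative names, not part of the original analysis.

```python
N_PROTEINS = 1528  # proteins tested in the PRACTICAL dataset
ALPHA = 0.05
BONFERRONI = ALPHA / N_PROTEINS  # about 3.27e-05


def classify(p_value: float) -> str:
    """Label a primary-analysis p-value using the thresholds described in the text."""
    if p_value < BONFERRONI:
        return "significant"
    if p_value < ALPHA:
        return "suggestive"
    return "not significant"


print(f"{BONFERRONI:.2e}")
print(classify(1e-6), classify(0.001), classify(0.2))
```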

Results

Genetically proxied plasma proteins and prostate cancer

We first assessed the causal effects of plasma protein levels proxied with cis-pQTLs on prostate cancer in the PRACTICAL dataset with Wald ratios or the IVW method. Four genetically proxied plasma proteins (CREB3L4, HDGF, SERPINA3, GNPNAT1) were associated with an increased risk of prostate cancer, while five genetically proxied plasma proteins (TNFRSF6B, GSK3A, EIF4B, CLIC1, SMAD2) were significantly associated with a decreased risk of prostate cancer (Supplementary Fig. 1A). We then replicated the analyses in a dataset from the FinnGen consortium. Among the 4 genetically proxied proteins that significantly increased the risk of prostate cancer in PRACTICAL, three (CREB3L4, HDGF, SERPINA3; p < 0.05) retained significant effects in the FinnGen dataset, whereas GNPNAT1 did not (p-value = 0.098) (Fig. 2). Among the 5 proteins (TNFRSF6B, GSK3A, EIF4B, CLIC1, SMAD2) whose genetically proxied levels significantly reduced the risk of prostate cancer in PRACTICAL, GSK3A (p-value = 0.602) and CLIC1 (p-value = 0.077) were no longer associated with prostate cancer in the FinnGen dataset. The consistency in the significance and direction of the causal effects of the six plasma proteins (CREB3L4, HDGF, SERPINA3, TNFRSF6B, EIF4B, and SMAD2) highlighted their importance in prostate cancer.

Fig. 2 Forest plots showing the plasma proteins that were identified to significantly increase the risk of prostate cancer in the PRACTICAL dataset. The causal estimates from the PRACTICAL and FinnGen datasets were combined with meta-analyses. SNP, single nucleotide polymorphism

Causal associations of all 1528 genetically proxied plasma proteins with prostate cancer in the PRACTICAL consortium are shown in Supplementary Table 1. Results from sensitivity analyses are presented in Supplementary Tables 2–5. Detailed information on all genetic variants used in the analyses in the PRACTICAL dataset can be found in Supplementary Table 6.

Causal associations of all 1514 genetically proxied plasma proteins with prostate cancer in the FinnGen consortium are shown in Supplementary Table 7. Results from sensitivity analyses are presented in Supplementary Tables 8–11. Detailed information on all genetic variants used in the analyses in the FinnGen consortium can be found in Supplementary Table 12.

Genetic variants used as instrumental variables were searched in PhenoScanner (http://www.phenoscanner.medschl.cam.ac.uk/), and no horizontal pleiotropy (p < 5 × 10⁻⁸) was identified (Table 2).

Table 2 Detailed information of cis-pQTLs used as proxies of significant plasma protein levels and their causal effects on the risk of prostate cancer in PRACTICAL and FinnGen cohorts

Meta-analysis

We then performed meta-analyses to combine the causal estimates from the PRACTICAL and FinnGen datasets. Among the proteins that increased the risk of prostate cancer (CREB3L4, HDGF, SERPINA3), significant heterogeneity was identified for CREB3L4 (I² = 50.5%) and HDGF (I² = 72.1%), and their causal estimates were combined with a random-effect model, while those for SERPINA3 (I² = 0.0%) were combined with a fixed-effect model (Fig. 2). All the genetically proxied protein levels retained a significant causal association with prostate cancer in the meta-analyses.

In the meta-analyses, a one-SD increase in genetically proxied CREB3L4 level significantly increased the risk of prostate cancer (OR 1.260, 95% CI 1.164–1.364, p-value < 0.0001), and a one-SD increase in genetically proxied SMAD2 level significantly reduced the risk of prostate cancer (OR 0.710, 95% CI 0.578–0.873, p-value = 0.001) (Figs. 2, 3). Significant causal effects on prostate cancer were also observed for HDGF (OR 1.072, 95% CI 1.021–1.125, p-value = 0.005), SERPINA3 (OR 1.138, 95% CI 1.091–1.187, p-value < 0.0001), TNFRSF6B (OR 0.656, 95% CI 0.496–0.869, p-value = 0.003), and EIF4B (OR 0.701, 95% CI 0.618–0.796, p-value < 0.0001).

Fig. 3 Forest plots showing the plasma proteins that were identified to significantly decrease the risk of prostate cancer in the PRACTICAL dataset. The causal estimates from the PRACTICAL and FinnGen datasets were combined with meta-analyses. SNP, single nucleotide polymorphism

Colocalization analyses

We further performed colocalization analyses for the 6 proteins identified in the meta-analyses. Among the 6 proteins (CREB3L4, HDGF, SERPINA3, TNFRSF6B, EIF4B, and SMAD2), SMAD2 (PH4 = 0.87) and TNFRSF6B (PH4 = 0.99) had strong support for colocalization (PH4 > 0.8) (Supplementary Table 13). HDGF (PH4 = 0.73) and SERPINA3 (PH4 = 0.61) had medium support for colocalization (0.5 ≤ PH4 ≤ 0.8). CREB3L4 (PH4 = 0) and EIF4B (PH4 = 0) were not supported by the colocalization analyses.

Genetically proxied gene expression and prostate cancer

We further investigated the associations between genetically proxied circulating gene expression of CREB3L4, HDGF, SERPINA3, TNFRSF6B, EIF4B, and SMAD2 and prostate cancer risk by employing cis-eQTLs for these genes as instrumental variables. One cis-eQTL, rs948602, was used as an instrumental variable for SMAD2 gene expression (Table 3). A one-SD increase in genetically proxied SMAD2 expression significantly decreased the risk of prostate cancer in the PRACTICAL consortium (OR 0.799, 95% CI 0.721–0.884, p-value = 1.59 × 10⁻⁵). In the FinnGen consortium, rs948602 was not available, so another SNP in linkage disequilibrium with rs948602 (rs11082640, R² = 0.9116) was used as a proxy, and the result showed a significant association (OR 0.745, 95% CI 0.612–0.906, p-value = 0.003). The association remained consistent when the estimates were combined with a random-effect meta-analysis (OR 0.787, 95% CI 0.719–0.861, p-value = 1.00 × 10⁻⁴) (Fig. 4).

Table 3 Detailed information of eQTLs used as proxies of CREB3L4, HDGF, SERPINA3, TNFRSF6B, EIF4B, and SMAD2 gene expression and the effects of the eQTLs on the risk of prostate cancer in PRACTICAL and FinnGen cohorts

Fig. 4 Forest plots showing the causal effects of SMAD2 and CREB3L4 gene expression proxied by cis-eQTL variants on the risk of prostate cancer. Cis-eQTL, cis expression quantitative trait loci

One cis-eQTL (rs11264736) was used to proxy the expression of CREB3L4 (Table 3). A one-SD increase in genetically proxied CREB3L4 expression significantly increased the risk of prostate cancer in both the PRACTICAL (OR 1.130, 95% CI 1.058–1.206, p-value = 2.42 × 10⁻⁴) and FinnGen (OR 1.338, 95% CI 1.182–1.515, p-value = 4.31 × 10⁻⁴) consortia. The random-effect meta-analysis showed consistent results (OR 1.219, 95% CI 1.033–1.438, p-value = 0.019) (Fig. 4).

However, no eQTLs were available as instrumental variables for TNFRSF6B and SERPINA3. For genetically proxied HDGF gene expression, the ORs of causal effects on prostate cancer were 0.994 (95% CI 0.917–1.075, p-value = 0.857) and 1.019 (95% CI 0.875–1.187, p-value = 0.807) in PRACTICAL and FinnGen datasets respectively. For genetically proxied EIF4B gene expression, the ORs were 0.938 (95% CI 0.799–1.101, p-value = 0.436) and 0.927 (95% CI 0.773–1.113, p-value = 0.417) in PRACTICAL and FinnGen datasets respectively. No horizontal pleiotropy was identified in the analyses with the MR-Egger intercept test (p for intercept > 0.05) or manual search on the PhenoScanner (http://www.phenoscanner.medschl.cam.ac.uk/) for the included instrumental variables.

Discussion

In this study, we employed 7141 conditionally independent cis-pQTL variants for 1925 plasma proteins and tested their causal effects on the risk of prostate cancer. Our large-scale proteome-wide Mendelian randomization analyses identified several plasma proteins associated with an increased or decreased risk of prostate cancer in the PRACTICAL consortium. We validated our findings in another large dataset from the FinnGen consortium. Causal estimates of six proteins (CREB3L4, HDGF, SERPINA3, TNFRSF6B, EIF4B, and SMAD2) remained significant in both datasets. We further tested the robustness of the results by assessing the causal effects of SMAD2, CREB3L4, HDGF and EIF4B gene expression on prostate cancer using cis-eQTL variants as instrumental variables. We also employed colocalization analyses to test whether these protein levels share causal loci with prostate cancer, and found evidence of colocalization for SMAD2, TNFRSF6B, HDGF and SERPINA3 with prostate cancer, but not for CREB3L4 and EIF4B. Consistency in the results highlighted the importance of these plasma proteins in the incidence of prostate cancer.

SMAD2 is a core transcription factor that mediates downstream signaling of TGF-β and is thus involved in a variety of cellular processes, including cell proliferation, apoptosis, and differentiation [27, 28]. In response to TGF-β signaling, SMAD2 is phosphorylated by TGF-β receptors and then associates with SMAD4. This association induces the translocation of SMAD2 to the cell nucleus, where it acts as a transcription repressor by forming a complex with other cofactors. SMAD2 has been reported to play an important role in several cancer types, including prostate cancer, colorectal cancer, and skin cancer [28,29,30,31]. Smad2 has been reported to mediate TGF-β-induced apoptosis and gene expression in prostate epithelial cells [28]. Silencing Smad2 alone induced the malignant transformation of NRP-152 cells, as assessed by subcutaneous tumor growth in athymic mice [28]. Smad2 expression has also been reported to be involved in zinc-induced apoptosis through the formation of a Smad2/4 complex in LNCaP cells, a prostate cancer cell line [32].

CREB3L4 is a CREB (cAMP responsive element binding) protein that contains a transmembrane domain that can bind to the endoplasmic reticulum membrane [33]. It has been reported that CREB3L4 is highly expressed in prostate cancer, and the expression is even higher in malignant prostate cells [34, 35]. Specifically, the expression of CREB3L4 is markedly elevated in high-grade prostatic intraepithelial neoplasia and adenocarcinomas in comparison to normal prostate cells [35]. CREB3L4 is known to be regulated by androgen, and it has also been reported that CREB3L4 is overexpressed in androgen-dependent prostate cancer cells [36]. CREB3L4 has been shown to promote androgen-receptor (AR) recruitment to AR targets and to increase the expression of target genes such as prostate-specific antigen (PSA) [36].

HDGF is a nucleolar protein that was previously found to be overexpressed in several malignancies, such as hepatocellular carcinoma, non-small-cell lung cancer, and pancreatic cancer [37,38,39,40]. HDGF was reported to play important roles in the apoptosis, angiogenesis, and metastasis of cancer cells [41]. It was found to be a survival-related protein in prostate cancer, and HDGF knockdown suppressed prostate cancer cell proliferation [42, 43]. SERPINA3 was previously found to be regulated by inflammatory cytokines, and the expression of SERPINA3 is elevated in inflammatory conditions [44, 45]. The overexpression of SERPINA3 is associated with decreased cell adhesion, inhibition of apoptosis, and an increased risk of malignant tumors [44]. This evidence is in line with our findings that genetically proxied HDGF and SERPINA3 levels were positively associated with prostate cancer risk.

TNFRSF6B is a soluble secreted protein that lacks a transmembrane structure [46]. The expression of TNFRSF6B protein was reported to be associated with apoptosis and immune monitoring [47]. It was found to play crucial roles in pancreatic cancer, gastric cancer and hepatocellular carcinoma [12, 48, 49]. However, the exact role of TNFRSF6B in prostate cancer is still unclear. EIF4B is known to be involved in the regulation of protein synthesis and the mitotic survival of cancer cells [50]. The phosphorylation of EIF4B was found to decrease the apoptosis of prostate cells [51]. However, the relationship between EIF4B levels and prostate cancer incidence has not been studied yet. Further studies are needed to clarify the exact roles of TNFRSF6B and EIF4B in prostate cancer.

Previous publications by Ren et al. and Desai et al. performed proteome-wide MR analyses and identified several potential diagnostic and treatment targets for prostate cancer [13, 14]. Interestingly, Desai et al. also identified SERPINA3, CREB3L4, and TNFRSF6B as associated with the risk of prostate cancer using instrumental variables from different sets of GWAS summary results. In our study, we used a different set of cis-pQTLs to further investigate the causal effects of the proteome on prostate cancer risk. Our study has identified several novel targets with potential for the early diagnosis and treatment of prostate cancer.

Our study has several advantages. Firstly, most of the previous publications were based on cell studies or animal experiments, which are more prone to confounding factors. Our study employed an MR design that can minimize the risk of confounding bias and reverse causality. Our analyses using large-scale genetic data from human populations supported the role of plasma CREB3L4 and SMAD2 in the development of prostate cancer. Secondly, we validated our findings in two different prostate cancer datasets from non-overlapping populations, and the consistency of the results showed the robustness of our findings. Furthermore, we restricted the instrumental variables to within a certain distance window of the protein-coding genes, which helped to decrease the risk of horizontal pleiotropy. Lastly, the study population was limited to subjects of European ancestry to minimize the bias from population stratification.

However, our study also has several limitations. Firstly, restricting the study population to European ancestry limits the generalization of our findings to other populations. Secondly, we used a dataset from the PRACTICAL consortium for the primary analyses and replicated the analyses in the FinnGen dataset in the secondary analyses. We assumed that replication in the secondary analyses of a protein discovered in the primary analyses suggests the robustness of its causal effect. However, the plasma proteins that did not remain significant in the replication could also have a causal effect on prostate cancer. Further work is needed to test the effects of these proteins in prostate cancer. Lastly, we focused on the effects of plasma proteins on prostate cancer, which could be different from the effects of proteins expressed locally. Further studies are necessary to test the effects of local expression.

Conclusions

In conclusion, our results suggested that the increased plasma level of SMAD2 generated a protective effect on prostate cancer, while a higher level of plasma CREB3L4 increased the risk of prostate cancer. Our study expanded the understanding of the mechanisms of prostate cancer and provided potential biomarkers determining the susceptibility to prostate cancer.

Availability of data and materials

The datasets supporting the conclusions of this article are included within the article (and its additional files). The supplementary files have been uploaded to: https://figshare.com/articles/dataset/Supplementary_Tables/23012576, the summary-level statistics of prostate cancer can be found in the PRACTICAL consortium (http://practical.icr.ac.uk/) and FinnGen Release 8 (https://r8.finngen.fi/).

Abbreviations

ARIC study:

Atherosclerosis Risk in Communities study

cis-pQTL:

Cis-protein quantitative trait loci

CI:

Confidence interval

CREB:

cAMP responsive element binding

GWAS:

Genome-wide association studies

MR:

Mendelian randomization

OR:

Odds ratio

SNP:

Single-nucleotide polymorphism

PRACTICAL:

Prostate Cancer Association Group to Investigate Cancer Associated Alterations in the Genome

SD:

Standard deviation

AR:

Androgen-receptor

References

  1. Rawla P. Epidemiology of prostate cancer. World J Oncol. 2019;10(2):63–89.

  2. Bray F, Ferlay J, Soerjomataram I, Siegel RL, Torre LA, Jemal A. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2018;68(6):394–424.

  3. Steele CB, Li J, Huang B, Weir HK. Prostate cancer survival in the United States by race and stage (2001–2009): Findings from the CONCORD-2 study. Cancer. 2017;123 Suppl 24(Suppl 24):5160–77.

  4. Mancuso N, Gayther S, Gusev A, Zheng W, Penney KL, Kote-Jarai Z, Eeles R, Freedman M, Haiman C, Pasaniuc B, et al. Large-scale transcriptome-wide association study identifies new prostate cancer risk regions. Nat Commun. 2018;9(1):4079.

  5. Sun X, Ye D, Du L, Qian Y, Jiang X, Mao Y. Genetically predicted levels of circulating cytokines and prostate cancer risk: a Mendelian randomization study. Int J Cancer. 2020;147(9):2469–78.

  6. Di Sebastiano KM, Pinthus JH, Duivenvoorden WCM, Mourtzakis M. Glucose impairments and insulin resistance in prostate cancer: the role of obesity, nutrition and exercise. Obes Rev. 2018;19(7):1008–16.

  7. Mengus C, Le Magnen C, Trella E, Yousef K, Bubendorf L, Provenzano M, Bachmann A, Heberer M, Spagnoli GC, Wyler S. Elevated levels of circulating IL-7 and IL-15 in patients with early stage prostate cancer. J Transl Med. 2011;9:162.

  8. Minas TZ, Candia J, Dorsey TH, Baker F, Tang W, Kiely M, Smith CJ, Zhang AL, Jordan SV, Obadi OM, et al. Serum proteomics links suppression of tumor immunity to ancestry and lethal prostate cancer. Nat Commun. 2022;13(1):1759.

  9. Smith GD, Ebrahim S. “Mendelian randomization”: can genetic epidemiology contribute to understanding environmental determinants of disease? Int J Epidemiol. 2003;32(1):1–22.

  10. Yao C, Chen G, Song C, Keefe J, Mendelson M, Huan T, Sun BB, Laser A, Maranville JC, Wu H, et al. Genome-wide mapping of plasma protein QTLs identifies putatively causal genes and pathways for cardiovascular disease. Nat Commun. 2018;9(1):3268.

  11. Zheng J, Haberland V, Baird D, Walker V, Haycock PC, Hurle MR, Gutteridge A, Erola P, Liu Y, Luo S, et al. Phenome-wide Mendelian randomization mapping the influence of the plasma proteome on complex diseases. Nat Genet. 2020;52(10):1122–31.

  12. Shah S, Henry A, Roselli C, Lin H, Sveinbjörnsson G, Fatemifar G, Hedman ÅK, Wilk JB, Morley MP, Chaffin MD, et al. Genome-wide association and Mendelian randomisation analysis provide insights into the pathogenesis of heart failure. Nat Commun. 2020;11(1):163.

  13. Desai TA, Hedman ÅK, Dimitriou M, Koprulu M, Figiel S, Yin W, Johansson M, Watts EL, Atkins JR, Sokolov AV, et al. Identifying proteomic risk factors for overall, aggressive and early onset prostate cancer using Mendelian randomization and tumor spatial transcriptomics. medRxiv. 2023.

  14. Ren F, Jin Q, Liu T, Ren X, Zhan Y. Proteome-wide mendelian randomization study implicates therapeutic targets in common cancers. J Transl Med. 2023;21(1):646.

  15. Zhang J, Dutta D, Kottgen A, Tin A, Schlosser P, Grams ME, Harvey B, Consortium CK, Yu B, Boerwinkle E, et al. Plasma proteome analyses in individuals of European and African ancestry identify cis-pQTLs and models for proteome-wide association studies. Nat Genet. 2022;54(5):593–602.

  16. Schumacher FR, Al Olama AA, Berndt SI, Benlloch S, Ahmed M, Saunders EJ, Dadaev T, Leongamornlert D, Anokian E, Cieza-Borrella C, et al. Association analyses of more than 140,000 men identify 63 new prostate cancer susceptibility loci. Nat Genet. 2018;50(7):928–36.

  17. Kurki MI, Karjalainen J, Palta P, Sipila TP, Kristiansson K, Donner KM, Reeve MP, Laivuori H, Aavikko M, Kaunisto MA, et al. FinnGen provides genetic insights from a well-phenotyped isolated population. Nature. 2023;613(7944):508–18.

  18. The Atherosclerosis Risk in Communities (ARIC) Study: design and objectives. The ARIC investigators. Am J Epidemiol. 1989;129(4):687–702.

  19. Williams SA, Kivimaki M, Langenberg C, Hingorani AD, Casas JP, Bouchard C, Jonasson C, Sarzynski MA, Shipley MJ, Alexander L, et al. Plasma protein patterns as comprehensive indicators of health. Nat Med. 2019;25(12):1851–7.

  20. Vosa U, Claringbould A, Westra HJ, Bonder MJ, Deelen P, Zeng B, Kirsten H, Saha A, Kreuzhuber R, Yazar S, et al. Large-scale cis- and trans-eQTL analyses identify thousands of genetic loci and polygenic scores that regulate blood gene expression. Nat Genet. 2021;53(9):1300–10.

  21. Willer CJ, Li Y, Abecasis GR. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics. 2010;26(17):2190–1.

  22. Bowden J, Davey Smith G, Burgess S. Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression. Int J Epidemiol. 2015;44(2):512–25.

  23. Bowden J, Davey Smith G, Haycock PC, Burgess S. Consistent estimation in mendelian randomization with some invalid instruments using a weighted median estimator. Genet Epidemiol. 2016;40(4):304–14.

  24. Giambartolomei C, Vukcevic D, Schadt EE, Franke L, Hingorani AD, Wallace C, Plagnol V. Bayesian test for colocalisation between pairs of genetic association studies using summary statistics. PLoS Genet. 2014;10(5): e1004383.

  25. Hemani G, Zheng J, Elsworth B, Wade KH, Haberland V, Baird D, Laurin C, Burgess S, Bowden J, Langdon R, et al. The MR-Base platform supports systematic causal inference across the human phenome. Elife. 2018;7:e34408.

  26. Yavorska OO, Burgess S. MendelianRandomization: an R package for performing Mendelian randomization analyses using summarized data. Int J Epidemiol. 2017;46(6):1734–9.

  27. Abdel Mouti M, Pauklin S. TGFB1/INHBA homodimer/nodal-SMAD2/3 signaling network: a pivotal molecular target in PDAC treatment. Mol Ther. 2021;29(3):920–36.

  28. Yang J, Wahdan-Alaswad R, Danielpour D. Critical role of Smad2 in tumor suppression and transforming growth factor-beta-induced apoptosis of prostate epithelial cells. Cancer Res. 2009;69(6):2185–90.

  29. Zhang L, Zhu Z, Yan H, Wang W, Wu Z, Zhang F, Zhang Q, Shi G, Du J, Cai H, et al. Creatine promotes cancer metastasis through activation of Smad2/3. Cell Metab. 2021;33(6):1111–23.

  30. Hoot KE, Lighthall J, Han G, Lu SL, Li A, Ju W, Kulesz-Martin M, Bottinger E, Wang XJ. Keratinocyte-specific Smad2 ablation results in increased epithelial-mesenchymal transition during skin cancer formation and progression. J Clin Invest. 2008;118(8):2722–32.

  31. Fleming NI, Jorissen RN, Mouradov D, Christie M, Sakthianandeswaren A, Palmieri M, Day F, Li S, Tsui C, Lipton L, et al. SMAD2, SMAD3 and SMAD4 mutations in colorectal cancer. Cancer Res. 2013;73(2):725–35.

  32. Yang N, Zhao B, Rasul A, Qin H, Li J, Li X. PIAS1-modulated Smad2/4 complex activation is involved in zinc-induced cancer cell apoptosis. Cell Death Dis. 2013;4(9): e811.

  33. Asada R, Kanemoto S, Kondo S, Saito A, Imaizumi K. The signalling from endoplasmic reticulum-resident bZIP transcription factors involved in diverse cellular physiology. J Biochem. 2011;149(5):507–18.

  34. Qi H, Fillion C, Labrie Y, Grenier J, Fournier A, Berger L, El-Alfy M, Labrie C. AIbZIP, a novel bZIP gene located on chromosome 1q21.3 that is highly expressed in prostate tumors and of which the expression is up-regulated by androgens in LNCaP human prostate cancer cells. Cancer Res. 2002;62(3):721–33.

  35. Labrie C, Lessard J, Ben Aicha S, Savard MP, Pelletier M, Fournier A, Lavergne E, Calvo E. Androgen-regulated transcription factor AIbZIP in prostate cancer. J Steroid Biochem Mol Biol. 2008;108(3–5):237–44.

  36. Kim TH, Park JM, Kim MY, Ahn YH. The role of CREB3L4 in the proliferation of prostate cancer cells. Sci Rep. 2017;7:45300.

  37. Kishima Y, Yamamoto H, Izumoto Y, Yoshida K, Enomoto H, Yamamoto M, Kuroda T, Ito H, Yoshizaki K, Nakamura H. Hepatoma-derived growth factor stimulates cell growth after translocation to the nucleus by nuclear localization signals. J Biol Chem. 2002;277(12):10315–22.

  38. Hu TH, Huang CC, Liu LF, Lin PR, Liu SY, Chang HW, Changchien CS, Lee CM, Chuang JH, Tai MH. Expression of hepatoma-derived growth factor in hepatocellular carcinoma. Cancer. 2003;98(7):1444–56.

  39. Ren H, Tang X, Lee JJ, Feng L, Everett AD, Hong WK, Khuri FR, Mao L. Expression of hepatoma-derived growth factor is a strong prognostic predictor for patients with early-stage non-small-cell lung cancer. J Clin Oncol Off J Am Soc Clin Oncol. 2004;22(16):3230–7.

  40. Tsai HE, Liu GS, Kung ML, Liu LF, Wu JC, Tang CH, Huang CH, Chen SC, Lam HC, Wu CS, et al. Downregulation of hepatoma-derived growth factor contributes to retarded lung metastasis via inhibition of epithelial-mesenchymal transition by systemic POMC gene delivery in melanoma. Mol Cancer Ther. 2013;12(6):1016–25.

  41. Bao C, Wang J, Ma W, Wang X, Cheng Y. HDGF: a novel jack-of-all-trades in cancer. Fut Oncol. 2014;10(16):2675–85.

  42. Shetty A, Dasari S, Banerjee S, Gheewala T, Zheng G, Chen A, Kajdacsy-Balla A, Bosland MC, Munirathinam G. Hepatoma-derived growth factor: a survival-related protein in prostate oncogenesis and a potential target for vitamin K2. Urol Oncol. 2016;34(11):483.e481-483.e488.

  43. Guo Y, Xu H, Huang M, Ruan Y. BLM promotes malignancy in PCa by inducing KRAS expression and RhoA suppression via its interaction with HDGF and activation of MAPK/ERK pathway. J Cell Commun Signal. 2023;17(3):757–72.

  44. Chelbi ST, Wilson ML, Veillard AC, Ingles SA, Zhang J, Mondon F, Gascoin-Lachambre G, Doridot L, Mignot TM, Rebourcet R, et al. Genetic and epigenetic mechanisms collaborate to control SERPINA3 expression and its association with placental diseases. Hum Mol Genet. 2012;21(9):1968–78.

  45. Péré-Brissaud A, Blanchet X, Delourme D, Pélissier P, Forestier L, Delavaud A, Duprat N, Picard B, Maftah A, Brémaud L. Expression of SERPINA3s in cattle: focus on bovSERPINA3-7 reveals specific involvement in skeletal muscle. Open Biol. 2015;5(9): 150071.

  46. Pitti RM, Marsters SA, Lawrence DA, Roy M, Kischkel FC, Dowd P, Huang A, Donahue CJ, Sherwood SW, Baldwin DT, et al. Genomic amplification of a decoy receptor for Fas ligand in lung and colon cancer. Nature. 1998;396(6712):699–703.

  47. Zhao T, Xu Y, Ren S, Liang C, Zhou X, Wu J. The siRNA silencing of DcR3 expression induces Fas ligand-mediated apoptosis in HepG2 cells. Exp Ther Med. 2018;15(5):4370–8.

  48. Zhang C, Li H, Huang Y, Tang Y, Wang J, Cheng Y, Wei Y, Zhu D, Cao Z, Zhou J. Integrative analysis of TNFRSF6B as a potential therapeutic target for pancreatic cancer. J Gastroint Oncol. 2021;12(4):1673–90.

  49. Chen G, Rong M, Luo D. TNFRSF6B neutralization antibody inhibits proliferation and induces apoptosis in hepatocellular carcinoma cell. Pathol Res Pract. 2010;206(9):631–41.

  50. Wang Y, Begley M, Li Q, Huang HT, Lako A, Eck MJ, Gray NS, Mitchison TJ, Cantley LC, Zhao JJ. Mitotic MELK-eIF4B signaling controls protein synthesis and tumor cell survival. Proc Natl Acad Sci USA. 2016;113(35):9810–5.

  51. Ren K, Gou X, Xiao M, Wang M, Liu C, Tang Z, He W. The over-expression of Pim-2 promote the tumorigenesis of prostatic carcinoma through phosphorylating eIF4B. Prostate. 2013;73(13):1462–9.

Acknowledgements

The authors acknowledge the participants of the included GWAS studies for their contributions.

Funding

This study was supported by grants from the National Natural Science Foundation of China (81972374) and the Natural Science Foundation of Zhejiang Province (LQ21H160030).

Author information

Contributions

Conceptualization: XZ, CZ; Formal analysis: JW, ZY, DJ, SH, HC, KJ; Writing (original draft): JW, ZY, SH; Writing (review and editing): XZ, CZ, DJ. All authors have read and approved the final manuscript.

Corresponding authors

Correspondence to Cheng Zhang or Xiangyi Zheng.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Wu, J., Yang, Z., Ding, J. et al. Proteome-wide Mendelian randomization identifies causal plasma proteins in prostate cancer development. Hum Genomics 19, 17 (2025). https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s40246-025-00724-x

  • DOI: https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s40246-025-00724-x

Keywords