
Comprehensive bioinformatics analysis of selected germline variants of uncertain significance identified in a cohort of Sri Lankan hereditary breast cancer patients

Abstract

Background

Next-generation sequencing (NGS)-based testing is a cost-effective method for identifying pathogenic germline genetic variations in cancer-predisposing genes in hereditary breast cancer. However, many of the variants detected through NGS are classified as variants of uncertain significance (VUS), where the impact of the variants on protein function remains unclear. Bioinformatics analysis using multiple computational tools is postulated to aid in generating new knowledge regarding the functional relevance of these VUS. This study aimed to gain new insights into the potential pathogenicity of a selected set of VUS identified in a cohort of Sri Lankan hereditary breast cancer patients using advanced bioinformatics tools.

Methods

The cancer database at the Centre for Genetics and Genomics contains genomic and clinical data from patients who had undergone germline genetic testing between 2015 and 2023. Five germline VUS detected in breast cancer affected patients were identified from the existing database and selected for further bioinformatics analysis using a combination of in-silico pathogenicity prediction tools, 3D protein modeling with structural analysis, and protein structural stability assessment with molecular dynamic simulation (MDS). The VUS included: BRCA1:(NM_007294.4):c.3392A > G;p.Asp1131Gly, (rs1555587813); BRIP1:(NM_032043.3):c.3103C > T;p.Arg1035Cys, (rs45437094); CHEK2:(NM_007194.4):c.60G > T;p.Gln20His, (rs375507194); MET:(NM_000245.4):c.840G > T;p.Arg280Ser, (rs1207381066); and STK11:(NM_000455.5):c.355A > G;p.Asn119Asp, (rs545015076).

Results

Two variants, MET:(NM_000245.4):c.840G > T;p.Arg280Ser and BRCA1:(NM_007294.4):c.3392A > G;p.Asp1131Gly, were predicted to have high-risk potential for causing significant impacts on protein structure and function. Align-GVGD results and the MDS data for the BRIP1:(NM_032043.3):c.3103C > T;p.Arg1035Cys variant suggested some alterations that require further confirmation. The CHEK2:(NM_007194.4):c.60G > T;p.Gln20His variant was predicted to have an intermediate impact, whereas the STK11:(NM_000455.5):c.355A > G;p.Asn119Asp variant was predicted to have no significant structural or functional impact on the protein.

Conclusions

This study contributes valuable insights into the potential structural and functional implications of five VUS in cancer predisposition genes. Our results suggest a high-risk potential for variants in MET, BRCA1 and BRIP1, warranting further investigation to delineate their exact biological effects and to better understand their role in breast cancer risk.

Background

Hereditary factors contribute to 5–10% of all breast cancers [1], with pathogenic germline variants in specific breast cancer predisposing genes such as BRCA1 and BRCA2, playing a well-defined role. The advent of next-generation sequencing (NGS)-based testing has led to the detection of a high frequency of genetic variants in several other breast cancer predisposing genes, in addition to the BRCA1 and BRCA2 genes. However, many of the variants detected are classified as variants of uncertain significance (VUS) owing to a limited understanding of their functional consequences and clinical actionability [2]. This creates uncertainty regarding a patient's cancer risk and poses a challenge in genetic counseling and clinical decision-making, especially in underrepresented populations. Thus, further investigations to determine the functional relevance of germline VUS in hereditary breast cancer risk are required.

Previous studies have demonstrated that bioinformatics analysis employing open-source web-based computational tools offers valuable insights into the potential pathogenicity of these VUS [3]. These tools analyze factors such as amino acid conservation, thermodynamic stability, amino acid polarity and physicochemical properties to predict the likelihood of a variant causing deleterious effects on protein structure and function. Most studies highlight the limitations of relying on a single computational tool. A growing trend emphasizes the need for combining predictions using multiple bioinformatics tools for a more comprehensive assessment of protein function and stability [3, 4]. This multifaceted approach considers various factors beyond protein stability, potentially leading to a more accurate delineation of the biological effects of these VUS.

Homology modeling predicts 3D protein structures on the basis of known templates. This technique allows researchers to investigate how genetic variants might affect protein structure and function by analyzing their impact on protein stability, DNA binding, catalytic sites, and protein–protein interactions [5, 6]. Structural data from sources such as UniProtKB and SWISS-MODEL are often used in this process. AlphaFold, a deep learning-based tool, shows promise for improved protein structure prediction and variant impact assessment [7]. Molecular dynamics simulations (MDS) provide further insights into the dynamic behavior of variant proteins under simulated physiological conditions. This allows researchers to analyze how variants affect protein structure, stability, and interactions, ultimately aiding in the identification of potentially deleterious VUS [8]. Tools such as FoldX and DynaMut2 are valuable resources for protein dynamics analysis [9].

This study focused on a selected set of germline VUS identified in a cohort of Sri Lankan hereditary breast cancer patients. The strong familial predisposition to breast cancer in these patients suggests a heightened likelihood of pathogenicity of these VUS. To elucidate the structural and functional consequences of the VUS on the target proteins, we employed a multi-tiered bioinformatics approach. By leveraging advanced computational tools for in-silico pathogenicity prediction, 3D protein structure modeling, and molecular dynamics simulations, we aimed to provide a more comprehensive assessment of the functional impact of these variants. The insights derived from this analysis will inform the design of future functional genomic studies, including in vitro and in vivo assays for further experimental validation of the potential pathogenicity of these variants and contribute to our understanding of the molecular mechanisms underlying hereditary breast cancer risk.

Methods

The Centre for Genetics and Genomics at the Faculty of Medicine, University of Colombo maintains an anonymized cancer database that includes whole exome and clinical data of cancer-affected patients who underwent germline genetic testing using the Illumina next-generation sequencer platform between January 2015 and December 2023. An in-house bioinformatics pipeline was used for the genetic analysis. The resulting variants were interpreted according to the standard American College of Medical Genetics and Genomics (ACMG) guidelines.

This study focused on VUS identified in a cohort of Sri Lankan breast cancer patients. Even though a strong family history of cancer was reported in first-, second-, and third-degree relatives, a formal family segregation analysis could not be conducted for some of the patients in whom these VUS were detected due to the deceased status of some individuals and the unavailability of other family members. VUS identified in five such breast cancer patients were selected from the existing exome database for further bioinformatics analysis. The selected germline VUS included the following: BRCA1:(NM_007294.4):c.3392A > G;p.Asp1131Gly, (rs1555587813); BRIP1:(NM_032043.3):c.3103C > T;p.Arg1035Cys, (rs45437094); CHEK2:(NM_007194.4):c.60G > T;p.Gln20His, (rs375507194); MET:(NM_000245.4):c.840G > T;p.Arg280Ser, (rs1207381066); and STK11:(NM_000455.5):c.355A > G;p.Asn119Asp, (rs545015076). Ethical approval for the study was obtained from the Ethics Review Committee, Faculty of Medicine, University of Colombo [EC-22-141].

In-silico pathogenicity prediction tools

In-silico pathogenicity prediction was carried out using open-source web-based pathogenicity prediction bioinformatic tools, including ConSurf [10], PROVEAN [11], SIFT [12], PolyPhen-2 [13], SNPs&GO [14], and PhD-SNP [15].

Protein structure stability analysis

Protein structure stability was analyzed using PredictSNP [16], MuPro [17], MutPred2 [18], Michelanglo-VENUS [19] and I-Mutant2.0 [20]. I-Mutant2.0 predicts the effect of an amino acid substitution on protein stability and reports the predicted free energy change (DDG value) together with a reliability index (RI) for the prediction. The RI value ranges from 0 (least reliable) to 10 (most reliable). The DDG value is the difference in unfolding free energy between the variant and the wild-type protein. A DDG value greater than zero implies that the variant increases protein stability, whereas a negative DDG value implies that it decreases protein stability [21].
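The sign convention described above can be written down as a small helper. This is a minimal sketch for illustration only: the function name and the example values are hypothetical and do not correspond to any result reported in this study.

```python
def interpret_ddg(ddg, ri):
    """Summarise a predicted stability change using the sign convention above.

    ddg: predicted free energy change (negative = destabilising).
    ri:  reliability index, from 0 (least reliable) to 10 (most reliable).
    """
    if not 0 <= ri <= 10:
        raise ValueError("reliability index must lie between 0 and 10")
    if ddg < 0:
        effect = "decreased stability"
    elif ddg > 0:
        effect = "increased stability"
    else:
        effect = "no predicted change"
    return f"{effect} (DDG = {ddg:+.2f}, RI = {ri})"


# Hypothetical values, chosen only to show the output format.
print(interpret_ddg(-1.25, 7))
```

Keeping the RI alongside the DDG value makes it harder to over-interpret a small, low-confidence stability change.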

Biophysical characteristics analysis

To predict the transactivation activity of each of the five VUS, their biophysical characteristics were further analyzed using Align-GVGD [22]. Align-GVGD analysis assigns a classification code (C0, C15, C25, C35, C45, C55, C65) to each of the substitutions. C0 is least likely to disrupt function (neutral), whereas C65 is most likely to disrupt protein function (deleterious).

Prediction of structural alterations using 3D protein modeling

X-ray crystallographic structures, electron microscopic structures, or structure-predicted models of the wild-type proteins of BRCA1, BRIP1, CHEK2, MET, and STK11 were searched in databases such as the Protein Data Bank (PDB) [23], the Electron Microscopy Data Bank (EMDB) [24], the SWISS-MODEL Repository [25], and ModBase [26]. When satisfactory structures were not found, homology modeling was performed using SWISS-MODEL [27] and the AlphaFold tool of UCSF ChimeraX version 1.6.1.

Wild-type FASTA sequences of the five proteins (BRCA1, BRIP1, CHEK2, MET, STK11) were obtained from the UniProt database [28]. The relevant amino acid substitutions (BRCA1:p.Asp1131Gly, BRIP1:p.Arg1035Cys, CHEK2:p.Gln20His, MET:p.Arg280Ser, STK11:p.Asn119Asp) were introduced into the FASTA sequences using BioEdit version 7.2.5 to create the variant sequences. With those variant FASTA files, homology modeling was performed using SWISS-MODEL and the AlphaFold tool of ChimeraX. The quality of all the models was assessed using the UCLA-DOE LAB SAVES v6.0 web server [29], and the model with the highest quality was chosen for subsequent analysis.
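The substitution step was carried out manually in BioEdit. As an illustration only, the same edit can be sketched programmatically, with a check that the reference residue in the protein-level description matches the wild-type sequence. The function names and the five-residue toy sequence below are hypothetical and are not taken from the study.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def parse_missense(hgvs_p):
    """Split a description such as 'p.Asp1131Gly' into (ref, position, alt)."""
    match = re.fullmatch(r"p\.([A-Z][a-z]{2})([1-9]\d*)([A-Z][a-z]{2})", hgvs_p)
    if match is None:
        raise ValueError(f"not a simple missense description: {hgvs_p!r}")
    ref, pos, alt = match.groups()
    if ref not in THREE_TO_ONE or alt not in THREE_TO_ONE:
        raise ValueError(f"unknown amino acid code in {hgvs_p!r}")
    return THREE_TO_ONE[ref], int(pos), THREE_TO_ONE[alt]


def apply_missense(sequence, hgvs_p):
    """Return the variant sequence, verifying the wild-type residue first."""
    ref, pos, alt = parse_missense(hgvs_p)
    if pos > len(sequence):
        raise ValueError(f"position {pos} lies beyond the sequence end")
    if sequence[pos - 1] != ref:
        raise ValueError(
            f"expected {ref} at position {pos}, found {sequence[pos - 1]}"
        )
    # Positions in the description are 1-based; Python indices are 0-based.
    return sequence[:pos - 1] + alt + sequence[pos:]


# Toy sequence for demonstration; a real run would use the UniProt sequence.
toy_variant = apply_missense("MKDLQ", "p.Asp3Gly")
print(toy_variant)
```

The reference-residue check guards against applying a substitution to the wrong isoform or to a sequence with different numbering.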

Structural comparisons between wild-type and variant protein structures, including changes in the Ramachandran plots; root mean square deviation (RMSD) values of the alignment; bond angles [Psi (ψ), Phi (φ), and Omega (ω) angles]; bond length changes; changes in the protein surface and cavities; and bonds within 4-angstrom distances, were assessed using PROCHECK [30], PyMOL version 2.5.5, and Swiss Pdb Viewer version 3.7.
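The φ, ψ, and ω angles compared here are backbone torsion angles, each defined by four consecutive backbone atoms: φ by C(i-1), N(i), CA(i), C(i); ψ by N(i), CA(i), C(i), N(i+1); and ω by CA(i), C(i), N(i+1), CA(i+1). The study obtained these values from PROCHECK, PyMOL, and Swiss Pdb Viewer. The sketch below only illustrates the underlying geometry, and the coordinates are arbitrary test points rather than atoms from any of the models.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Torsion angle in degrees (-180 to 180) defined by four points."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    # Unit vector along the central bond.
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Components of the outer bonds perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


def torsion_change(wild_type_angle, variant_angle):
    """Smallest signed difference between two angles, in degrees."""
    return (variant_angle - wild_type_angle + 180.0) % 360.0 - 180.0
```

Because torsion angles wrap around at ±180°, differences between wild-type and variant values are better taken with a wrap-aware helper such as `torsion_change` than by plain subtraction.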

Protein–protein interactions and Molecular interaction pathways

Protein–protein interactions of the five genes of interest were analyzed using the STRING database [31], which provides a comprehensive collection of both observed and predicted protein–protein interactions. Molecular interaction networks and biological pathways were visualized using the open-source software platform Cytoscape [32], which enables the integration of annotations, gene expression profiles, and other state data into these networks. Gene functions were predicted, and the related genes were identified using Genemania [33]. The protein–protein interaction analysis in this study had two aims. The first was to identify potential functional consequences of the variants: by examining how the variants might disrupt or alter protein interactions, we can gain insights into their potential impact on cellular processes. The second was to determine potential docking sites for future investigations: understanding the regions of the protein involved in interactions can guide the design of experiments to study the effects of the variants on protein–protein binding and function.

Molecular dynamics simulations (MDS)

MDS was performed with Desmond 2020.1 from Schrödinger, LLC, using the OPLS-2005 force field for 50 ns. The root mean square deviation (RMSD), radius of gyration (Rg), and root mean square fluctuation (RMSF) were calculated to monitor the stability of the protein complexes. The RMSD quantifies the average displacement of the atoms of the molecule relative to the starting structure over the course of the simulation; a lower RMSD value indicates that the atoms have not moved much from their original positions, suggesting a more stable structure. The RMSF, in contrast, focuses on the flexibility of each amino acid within the molecule and is calculated as the average fluctuation of each atom relative to its average position throughout the simulation. Higher RMSF values suggest that the protein has more flexible domains [34].
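The three quantities can be illustrated with a short, dependency-free sketch. It is not the Desmond implementation: it assumes that every frame has already been superimposed on the reference structure, it weights all atoms equally when computing Rg, and the two-atom trajectory is a toy input.

```python
import math


def rmsd(frame, reference):
    """Root mean square deviation of one frame from the reference structure."""
    squared = sum(
        (a - b) ** 2
        for atom, ref_atom in zip(frame, reference)
        for a, b in zip(atom, ref_atom)
    )
    return math.sqrt(squared / len(reference))


def rmsf(trajectory):
    """Per-atom root mean square fluctuation about the time-averaged position."""
    n_frames = len(trajectory)
    values = []
    for i in range(len(trajectory[0])):
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames for k in range(3)]
        mean_square = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        values.append(math.sqrt(mean_square))
    return values


def radius_of_gyration(frame):
    """Rg of one frame about its geometric centre (equal atomic weights)."""
    n_atoms = len(frame)
    centre = [sum(atom[k] for atom in frame) / n_atoms for k in range(3)]
    mean_square = sum(
        sum((atom[k] - centre[k]) ** 2 for k in range(3)) for atom in frame
    ) / n_atoms
    return math.sqrt(mean_square)


# Toy trajectory: two atoms, two frames; the second atom moves along y.
toy_trajectory = [
    [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (2.0, 2.0, 0.0)],
]
```

In this toy input only the second atom moves, so it alone carries a non-zero RMSF, while the RMSD of the second frame reflects that single displacement averaged over both atoms.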

Results

In-silico pathogenicity prediction, protein structure stability and functional analysis, biophysical characteristics analysis, prediction of structural alterations using 3D protein modeling, analysis of protein–protein interactions and molecular interaction pathways, and MDS were conducted for five VUS: BRCA1:(NM_007294.4):c.3392A > G;p.Asp1131Gly, (rs1555587813); BRIP1:(NM_032043.3):c.3103C > T;p.Arg1035Cys, (rs45437094); CHEK2:(NM_007194.4):c.60G > T;p.Gln20His, (rs375507194); MET:(NM_000245.4):c.840G > T;p.Arg280Ser, (rs1207381066); and STK11:(NM_000455.5):c.355A > G;p.Asn119Asp, (rs545015076).

Clinicopathological features of patients

The BRCA1:(NM_007294.4):c.3392A > G;p.Asp1131Gly variant was identified in a 51-year-old female with invasive ductal carcinoma. Her tumor was oestrogen receptor (ER)- and progesterone receptor (PR)-negative but human epidermal growth factor receptor 2 (HER2)-positive. The BRIP1:(NM_032043.3):c.3103C > T;p.Arg1035Cys variant was detected in a 51-year-old female with low-grade ductal carcinoma in situ. Her tumor was triple-negative (ER-, PR-, and HER2-negative). The CHEK2:(NM_007194.4):c.60G > T;p.Gln20His variant was detected in a 51-year-old female with invasive ductal carcinoma. Her tumor was triple-negative. The MET:(NM_000245.4):c.840G > T;p.Arg280Ser variant was detected in a 42-year-old female with high-grade ductal carcinoma in situ. Her tumor was triple-positive (ER-, PR-, and HER2-positive). The STK11:(NM_000455.5):c.355A > G;p.Asn119Asp variant was identified in an 82-year-old female with invasive ductal carcinoma. Her tumor was ER-positive but PR- and HER2-negative.

Family history analysis indicated that first-degree relatives of the patient with the BRCA1 variant had breast cancer and leukemia, and second-degree relatives had oral and cervical cancers. The patient with the BRIP1 variant had second-degree relatives with colorectal cancer. First-degree relatives of the patient with the CHEK2 variant had breast cancer. The patient with the MET variant had first-, second-, and third-degree relatives with breast cancer. First-degree relatives of the patient with the STK11 variant were reported to have breast and colon cancers [35].

In-silico pathogenicity prediction tools

The results of the pathogenicity prediction tools are shown in Table 1. The MET:(NM_000245.4):c.840G > T;p.Arg280Ser variant was predicted to be pathogenic by almost all the pathogenicity prediction tools, and the BRCA1:(NM_007294.4):c.3392A > G;p.Asp1131Gly variant was predicted to be pathogenic by four out of six prediction tools. Biophysical characteristics analysis with Align-GVGD classified the BRCA1, BRIP1, and MET variants as Class C65, whereas the CHEK2 and STK11 variants were classified as Class C15.

Table 1 Results of the in-silico pathogenicity prediction tools

Prediction of protein structure stability and functional analysis

As shown in Table 2, protein structure stability analysis revealed that the MutPred2 score for the MET:(NM_000245.4):c.840G > T;p.Arg280Ser variant was 0.915, indicating high pathogenic potential, whereas all other variants showed no pathogenic effect. The MET variant was predicted to cause an altered disordered interface with a 0.31 probability, altered metal-binding with a 0.28 probability, an altered transmembrane protein with a 0.13 probability, and a gain of the catalytic site at R277 with a 0.13 probability. Further analysis using the I-Mutant2.0 and MuPro tools revealed that all five variants of interest could reduce protein stability. PredictSNP predicted all the variants except STK11 to be deleterious, and the highest score of 72% was obtained for the MET variant. The Michelanglo-VENUS results suggested a few changes in the motifs of the MET protein and predicted that all other variants were unlikely to significantly affect the stability of the protein.

Table 2 Results of protein structure stability and functional analysis

Prediction of structural alterations using 3D protein modeling

To obtain 3D structures of all five proteins of interest, experimentally determined X-ray crystallographic structures or electron microscopy structures were searched. When no satisfactory structures were found, homology modeling was performed with either AlphaFold modeling in ChimeraX or SWISS-MODEL, using templates that covered the whole protein sequence with good coverage, identity, and a comparatively good global model quality estimate (GMQE). The quality of the models was assessed with Verify 3D on the UCLA-DOE LAB SAVES v6.0 server and, as shown in Fig. 1, the models with the highest quality and the best coverage were selected for further analysis. Ramachandran plots (Additional file 2: Fig. S2) of both the wild-type and the variant proteins were compared, and no significant differences were observed (Additional file 1: Fig. S1). Structural alignment, bond length and bond angle measurements, and the contacts within a 4-angstrom distance were checked with PyMOL version 2.5.5, and the RMSD values were obtained (Table 3). A lower RMSD value indicates greater similarity between the structures, whereas a higher RMSD value implies greater dissimilarity between two structures. The surface area and cavities of both the wild-type and variant proteins were observed with the Swiss PDB Viewer version 4.1.0. Structural analysis revealed Phi (φ) angle deviations in BRCA1, altered bond angles in CHEK2 and STK11, modified bond angles and surface cavities in MET, no structural changes in BRIP1, and no interprotein bonds within 4 angstroms for any of the proteins (Table 3) (Additional file 3: Fig. S3).

Fig. 1 Variant protein models of (a) BRCA1; (b) BRIP1; (c) CHEK2; (d) MET; and (e) STK11 proteins

Table 3 Results of the 3D protein modeling and structural comparison

Protein–protein interactions and molecular interaction pathways

Protein–protein interaction analysis

Protein interaction databases were utilized to identify potential functional partners for the BRCA1, BRIP1, CHEK2, MET and STK11 proteins. The STRING database was used to assess protein–protein interactions with high confidence scores. STRING analysis identified several proteins as functional partners with high-confidence interactions as shown below:

  • BRCA1: BRIP1, TOPBP1, BARD1, PALB2, TP53, ATM, FANCD2, MRE11, BABAM2, BABAM1.

  • BRIP1: MLH1, TOPBP1, BARD1, BRCA1, PALB2, NBN, MRE11, BRCA2, FANCD2, FANCI.

  • CHEK2: TP53, ATM, BRCA1, CDC25A, CDC25C, ATR, TP53BP1, MDC1, CDC7, BRCA2.

  • MET: HGF, CBL, GRB2, EGFR, PLXNB1, CD44, SRC, SHC1, GAB1, ERBB3.

  • STK11: CAB39, STRADA, CAB39L, STRADB, AXIN1, PRKAA1, PTEN, PRKAA2, PRKAB2, PRKAB1.
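The partner lists above can also be compared with one another to see which functional partners recur across the five query proteins. The sketch below is an illustrative addition rather than part of the original analysis. It uses only the partners listed above, and the function name is hypothetical.

```python
from itertools import combinations

# High-confidence STRING partners, copied from the lists above.
STRING_PARTNERS = {
    "BRCA1": ["BRIP1", "TOPBP1", "BARD1", "PALB2", "TP53", "ATM",
              "FANCD2", "MRE11", "BABAM2", "BABAM1"],
    "BRIP1": ["MLH1", "TOPBP1", "BARD1", "BRCA1", "PALB2", "NBN",
              "MRE11", "BRCA2", "FANCD2", "FANCI"],
    "CHEK2": ["TP53", "ATM", "BRCA1", "CDC25A", "CDC25C", "ATR",
              "TP53BP1", "MDC1", "CDC7", "BRCA2"],
    "MET": ["HGF", "CBL", "GRB2", "EGFR", "PLXNB1", "CD44", "SRC",
            "SHC1", "GAB1", "ERBB3"],
    "STK11": ["CAB39", "STRADA", "CAB39L", "STRADB", "AXIN1", "PRKAA1",
              "PTEN", "PRKAA2", "PRKAB2", "PRKAB1"],
}


def shared_partners(partners):
    """Map each pair of query genes to the partners they have in common."""
    shared = {}
    for gene_a, gene_b in combinations(partners, 2):
        common = sorted(set(partners[gene_a]) & set(partners[gene_b]))
        if common:
            shared[(gene_a, gene_b)] = common
    return shared


for pair, common in shared_partners(STRING_PARTNERS).items():
    print(pair, common)
```

With these lists, overlaps appear only among BRCA1, BRIP1, and CHEK2, whereas the MET and STK11 partner sets share no members with any other query protein.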

Genemania analysis

While STRING focuses on confidence scores, Genemania provides a broader view of gene co-expression and functional relationships. Here, we report the proteins identified by Genemania with the highest interaction scores for each protein: BRCA1 with BARD1; BRIP1 with BRCA1; CHEK2 with ASF1A; MET with HGF; and STK11 with STRADA.

The protein–protein interaction network of the breast cancer genes was obtained with Cytoscape version 3.10.2 (Fig. 2).

Fig. 2 Breast cancer protein–protein interaction network

Molecular dynamics simulations (MDS)

Comparative analysis of the wild-type and variant proteins (BRCA1, BRIP1, CHEK2, and STK11) using RMSD and RMSF calculations revealed varying degrees of structural stability. BRCA1 showed similar overall stability but increased flexibility at the variant position. Compared with the wild-type protein, the BRIP1 variant protein exhibited increased stability. CHEK2 maintained overall stability, with minor fluctuations in the variant protein at the region of interest. Compared with the wild-type protein, the STK11 variant protein demonstrated decreased stability. These findings suggest that the introduced amino acid variation impacts protein dynamics and stability differently across the studied proteins. MD simulation was not performed for the MET protein due to computational limitations. The results of the MDS are shown in Table 4 and Additional file 4: Fig. S4.

Table 4 Results of the molecular dynamic simulation

Discussion

The present study focused on five germline variants in cancer-predisposing genes that have been classified as VUS. Comprehensive bioinformatics analysis using multiple in-silico computational tools provided a better understanding of the potential pathogenicity, stability, functional impact, protein–protein interactions, and structural alterations caused by these variants. MDS provided further insights into the dynamic behavior of the wild-type and variant proteins, highlighting potential effects on protein stability and flexibility.

Heterogeneous results were obtained regarding the potential pathogenicity of the five VUS. This highlights the inherent limitations of these tools, as evidenced by conflicting predictions for certain variants. For example, the BRCA1:(NM_007294.4):c.3392A > G;p.Asp1131Gly variant yielded discordant predictions across different pathogenicity prediction tools. The ClinVar database lists five submissions for this variant, including four clinical testing submissions (uncertain significance) and one curation submission (likely benign) [36]. While one submission suggested a non-conservative amino acid change and three out of five in-silico tools predicted a damaging effect, the available data on variant occurrences in the general population are insufficient to draw definitive conclusions about its significance. The variant has been reported in at least one individual affected with Hereditary Breast and Ovarian Cancer Syndrome, but strong evidence for causality is lacking [37,38,39]. Dines et al. suggested that this variant might be located in a "coldspot" region, where missense variants are less likely to be pathogenic. However, other studies have indicated that the available evidence is insufficient to determine the role of this variant in disease. Additionally, algorithms developed to predict the effect of sequence changes on RNA splicing suggest that this variant may create or strengthen a splice site [40]. Advanced modeling of protein sequence and biophysical properties performed at Invitae suggests that this missense variant is not expected to disrupt BRCA1 protein function [36]. The amino acid change at codon 1131 replaces aspartic acid with glycine, two amino acids reported to have similar properties, and the amino acid position is not well conserved in available vertebrate species. Furthermore, in-silico predictions for this alteration are inconclusive [35].

Based on the available evidence, including the conflicting in-silico predictions, limited population data, and lack of strong experimental evidence, the clinical significance of the BRCA1:(NM_007294.4):c.3392A > G;p.Asp1131Gly variant remains uncertain. Further research, including functional genomic studies, is warranted to definitively assess its role in breast and ovarian cancer susceptibility.

Similarly, the BRIP1:(NM_032043.3):c.3103C > T;p.Arg1035Cys variant yielded conflicting results in the current study, with some tools classifying it as non-pathogenic and others suggesting deleterious potential (PredictSNP: 51% score, Align-GVGD: C65 class with a high chance of deleterious effects). These findings underscore the crucial need for a multifaceted approach that combines in-silico analysis with additional methods for a more definitive assessment of functional relevance. The ClinVar database has 19 submissions for the BRIP1:(NM_032043.3):c.3103C > T;p.Arg1035Cys variant. Eighteen of them were clinical testing submissions and one was a curation submission. Nine of those submissions reported the BRIP1 variant as being of uncertain significance, while six reported it as likely benign. Four submissions reported it as a benign variant based on a combination of the following: variant is present in unaffected individuals, population frequency, intact protein function, lack of segregation with disease, co-occurrence, RNA analysis, in-silico models, amino acid conservation, lack of disease association in case–control studies, and/or the mechanism of disease or impacted region is inconsistent with a known cause of pathogenicity [41].

The MET:(NM_000245.4):c.840G > T;p.Arg280Ser variant was predicted to have high-risk potential for altering protein structure and function. The predicted effects include a possible disruption of disordered regions (altered disordered interface, 0.31 probability), changes in metal binding (altered metal binding, 0.28 probability), and a slight possibility of affecting its role as a transmembrane protein (altered transmembrane protein, 0.13 probability). Interestingly, the analysis also suggested a potential gain of a catalytic site at residue R277 (0.13 probability). Additionally, some alterations in protein motifs were observed. These findings warrant further investigation to determine the precise functional consequences of the variant on the MET protein. The ClinVar database has four clinical testing submissions for the MET:(NM_000245.4):c.840G > T;p.Arg280Ser variant, and each of them reported the variant as a VUS [42]. It is noteworthy that while the MET gene has not previously been linked to hereditary breast cancer in the literature, this variant was identified in our exome database from a 42-year-old female diagnosed with high-grade ductal breast carcinoma in situ. This patient's tumor was triple-positive (ER+, PR+, and HER2+). Additionally, the patient had a family history of breast cancer, with affected first-, second-, and third-degree relatives. The variant was selected for further investigation in this study on the basis of this compelling clinical context.

The CHEK2:(NM_007194.4):c.60G > T;p.Gln20His variant was predicted to have an intermediate impact. All three ClinVar submissions for this variant have identified it as a VUS [43]. The STK11:(NM_000455.5):c.355A > G;p.Asn119Asp variant was predicted to have no significant structural or functional impact on the protein. The ClinVar record has eight submissions for the STK11:(NM_000455.5):c.355A > G;p.Asn119Asp variant. Seven of them were clinical testing submissions and one was a curation submission. Seven of those submissions reported the STK11 variant as being of uncertain significance, while one reported it as likely benign. According to the Ambry Genetics entry, “The asparagine at codon 119 is replaced by aspartic acid, an amino acid with highly similar properties. This amino acid position is highly conserved in available vertebrate species. In addition, this alteration is predicted to be tolerated by in-silico analysis. Since supporting evidence is limited at this time, the clinical significance of this alteration remains unclear” [44].

3D protein modeling with structural analysis revealed a spectrum of changes created by the five variants. These changes ranged from subtle alterations in the backbone conformation, observed in the BRIP1 protein, to significant modifications in surface features, as observed with the MET protein. This finding suggests potentially varying functional consequences for the different proteins encoded by these variants.

Protein–protein interactions for the BRCA1 and MET proteins were explored using STRING to understand their roles in cellular processes. Studies have shown that BRCA1 interacts with proteins such as PALB2, RBBP8 and BRCA2, which are crucial for facilitating DNA repair through homologous recombination (HR) and inhibiting the error-prone non-homologous end joining (NHEJ) pathway [45]. Additionally, BRCA1 interacts with BARD1, forming the BRCA1/BARD1 complex which acts as an E3 ubiquitin ligase, attaching ubiquitin molecules to specific target proteins. Ubiquitination by BRCA1/BARD1 influences DNA repair, cell cycle control, and gene regulation. Mutations in BRCA1 and BARD1 disrupt the ubiquitin ligase activity of the complex [46].

Similarly, analysis using STRING and Genemania identified HGF as the strongest interactor with the MET protein. This finding is supported by the literature demonstrating that HGF binding activates c-MET, a receptor tyrosine kinase, promoting processes involved in cancer development, such as cell proliferation, migration, and metastasis [47,48,49,50,51]. Interestingly, a study by Papa et al. [52] suggested a potential negative interaction between the TGF-β and HGF pathways, highlighting the complex interplay between signaling pathways in cancer biology. The analysis of protein–protein interactions provides valuable context for interpreting the functional impact of the observed structural alterations in the BRCA1 and MET proteins.

While these in-silico methods provide a valuable starting point, their use is not devoid of limitations. For example, the short simulation timescales employed during MDS might not capture the full range of protein dynamics. Additionally, techniques such as analyzing hydrogen bond dynamics or solvent accessibility were not explored, potentially offering a more nuanced picture.

The structural models for the variant proteins of BRCA1, BRIP1, CHEK2, MET and STK11, as well as the wild-type models of the BRCA1, BRIP1, and CHEK2 proteins, are novel, as they were generated de novo in this study. Through a rigorous bioinformatics analysis, out of the five VUS that underwent pathogenicity assessment, we prioritize and recommend three VUS for further experimental validation: MET:(NM_000245.4):c.840G > T;p.Arg280Ser, BRCA1:(NM_007294.4):c.3392A > G;p.Asp1131Gly, and BRIP1:(NM_032043.3):c.3103C > T;p.Arg1035Cys. These variants present a high-risk potential for affecting protein stability and function, warranting further investigation to delineate their exact biological effects and to better understand their role in breast cancer development. Currently, the available evidence is insufficient to reclassify these VUS on the basis of in-silico analysis alone. However, further experimental studies demonstrating the impact of a variant on protein function could lead to its reclassification as likely pathogenic or benign. This would provide a more definitive risk assessment for individuals harboring such variants.

It is important to note that in line with the methodology used in our study, several researchers have emphasized the importance of using multiple bioinformatic tools for the pathogenicity prediction of SNV with greater accuracy rather than relying solely on a single tool [3, 4]. Galehdari et al. [53], reporting on the diagnostic accuracy of SNV-based pathogenicity detection tools for UGT1A1 gene variants underscored this point in their meta-analysis. They compared the results of various SNV-based prediction tools with published clinical results for pathogenicity prediction of nonsynonymous SNV associated with Crigler-Najjar syndrome. While some tools like SIFT and PolyPhen-2 demonstrated promising results, the study highlighted the limitations of individual tools in accurately predicting the pathogenicity of variants [53]. Additionally, factors beyond structural stability, such as disruptions in post-translational modifications or ligand binding, can influence the disease phenotype.

Several limitations were identified during this study. The study relied heavily on computational bioinformatics tools, which are subject to limitations and potential biases. Conflicting predictions were observed for certain variants, such as BRCA1:(NM_007294.4):c.3392A > G and BRIP1:(NM_032043.3):c.3103C > T. Furthermore, the absence of experimental validation, both in vitro and in vivo, hinders the confirmation of variant pathogenicity and structural impact, limiting the clinical applicability of these results. Because family members were deceased or unavailable, a formal segregation analysis, which would have provided stronger evidence for variant pathogenicity, could not be performed.

Additional limitations include the relatively short 50-ns timeframe of MDS, which may not fully capture long-term protein dynamics or stability changes. The study did not explore hydrogen bond dynamics or solvent accessibility, which could provide a more nuanced understanding of protein-variant interactions. Computational resource constraints precluded the inclusion of MDS in the analysis of the MET variant, resulting in a partial assessment of its stability.
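For readers less familiar with the stability metrics derived from MDS trajectories, the following minimal sketch shows how RMSD and radius of gyration (Rg) are computed from atomic coordinates. It assumes the frames are already superimposed on the reference and treats all atoms as having equal mass. The function names and toy coordinates are illustrative only and do not come from the simulations in this study.

```python
import math

Coord = tuple[float, float, float]


def rmsd(ref: list[Coord], frame: list[Coord]) -> float:
    """Root mean square deviation between two pre-aligned coordinate sets."""
    if not ref or len(ref) != len(frame):
        raise ValueError("coordinate sets must be non-empty and equal in size")
    squared = sum(
        (a - b) ** 2 for p, q in zip(ref, frame) for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(ref))


def radius_of_gyration(frame: list[Coord]) -> float:
    """Unweighted radius of gyration (every atom treated as equal mass)."""
    if not frame:
        raise ValueError("coordinate set must be non-empty")
    n = len(frame)
    centre = tuple(sum(p[i] for p in frame) / n for i in range(3))
    squared = sum((p[i] - centre[i]) ** 2 for p in frame for i in range(3))
    return math.sqrt(squared / n)


# Toy two-atom system: the frame is the reference shifted by 1 unit along z.
ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frame = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(ref, frame))          # 1.0
print(radius_of_gyration(ref))   # 0.5
```

In practice these quantities are evaluated per frame across the whole trajectory, which is why simulation length limits how much long-term behaviour they can reveal.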

To establish definitive clinical relevance, experimental validation through functional genomics assays and extended family studies is essential. These efforts could contribute significantly to a deeper understanding and more accurate classification of VUS in hereditary breast cancer predisposition.

Conclusions

This study employed a robust bioinformatics pipeline to investigate the potential structural and functional consequences of five VUS in cancer-predisposing genes. Our results suggest a high-risk potential for variants in MET, BRCA1, and BRIP1, warranting further investigation through functional genomics assays. However, the inherent limitations of in-silico predictions necessitate a cautious interpretation of these findings.

Availability of data and materials

No datasets were generated or analysed during the current study.

Abbreviations

VUS: Variants of uncertain significance

MDS: Molecular dynamics simulations

DDG: Delta delta G (“G”, Gibbs free energy)

ACMG: American College of Medical Genetics and Genomics

RMSD: Root mean square deviation

Rg: Radius of gyration

RMSF: Root mean square fluctuation

RI: Reliability index

References

  1. Wilkinson L, Gathani T. Understanding breast cancer as a global health concern. Br J Radiol. 2021;95:10.

  2. Richards S, Aziz N, Bale S, Bick D, Das S, Rehm HL, et al. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet Med. 2015;17(5):405–23.

  3. Almakhari M, Chen Y, Kong AS, Moradigaravand D, Lai KS, Lim SE, Loh JY, Maran S. In-silico identification of deleterious non-synonymous SNPs of TBX1 gene: Functional and structural impact towards 22q11.2DS. PLoS One. 2024;19(6):e0298092.

  4. Ferla MP, Pagnamenta AT, Koukouflis L, Taylor JC, Marsden BD. Venus: Elucidating the impact of amino acid variants on protein function beyond structure destabilisation. J Mol Biol. 2022;434(11):167567.

  5. Arnold K, Bordoli L, Kopp J, Schwede T. The SWISS-MODEL workspace: a web-based environment for protein structure homology modelling. Bioinformatics. 2005;22(2):195–201.

  6. Tramontano A, Morea V. Assessment of homology-based predictions in CASP5. Proteins Struct Funct Genet. 2003;53(S6):352–68.

  7. Keskin Karakoyun H, Yüksel ŞK, Amanoglu I, Naserikhojasteh L, Yeşilyurt A, Yakıcıer C, et al. Evaluation of AlphaFold structure-based protein stability prediction on missense variations in cancer. Front Genet. 2023;14:96.

  8. Tam B, Sinha S, Qin Z, San MW. Comprehensive identification of deleterious TP53 missense VUS variants based on their impact on TP53 structural stability. Int J Mol Sci. 2021;22(21):11345–5.

  9. Rodrigues CHM, Pires DEV, Ascher DB. DynaMut2, assessing changes in stability and flexibility upon single and multiple point missense mutations. Protein Sci. 2021;30(1):60–9.

  10. ConSurf (https://consurfdb.tau.ac.il/; accessed in August 2023)

  11. PROVEAN (http://provean.jcvi.org/; accessed in July 2024)

  12. SIFT (https://sift.bii.a-star.edu.sg/; accessed in August 2023)

  13. PolyPhen-2 (http://genetics.bwh.harvard.edu/pph2/; accessed in August 2023)

  14. SNPs&GO (https://snps.biofold.org/snps-and-go/snps-and-go.html; accessed in July 2024)

  15. PhD-SNP (https://snps.biofold.org/phd-snp/phd-snp.html; accessed in August 2023)

  16. PredictSNP (https://loschmidt.chemi.muni.cz/predictsnp1/; accessed in July 2024)

  17. MuPro (https://mupro.proteomics.ics.uci.edu/; accessed in July 2024)

  18. MutPred2 (http://mutpred.mutdb.org/; accessed in August 2023)

  19. Michelanglo-VENUS (https://venus.cmd.ox.ac.uk/venus; accessed in August 2023)

  20. I-Mutant2.0 (https://folding.biofold.org/i-mutant/i-mutant2.0.html; accessed in July 2024).

  21. Lim EC, Lim SW, Tan KJ, Sathiya M, Cheng WH, Lai KS, Loh JY, Yap WS. In-silico analysis of deleterious SNPs of FGF4 gene and their impacts on protein structure, function and bladder cancer prognosis. Life. 2022;12(7):1018.

  22. Align-GVGD (http://agvgd.hci.utah.edu/about.php; accessed in July 2024).

  23. Protein Data Bank (PDB) (https://www.rcsb.org/; accessed in August 2023)

  24. Electron Microscopy Data Bank (EMDB) (https://www.ebi.ac.uk/emdb/; accessed in August 2023)

  25. SWISS-MODEL Repository (https://swissmodel.expasy.org/repository/; accessed in August 2023)

  26. ModBase (https://modbase.compbio.ucsf.edu/modbase-cgi/index.cgi; accessed in August 2023).

  27. Swiss Model (https://swissmodel.expasy.org/; accessed in August 2023)

  28. UniProt database (https://www.uniprot.org/; accessed in August 2023)

  29. UCLA-DOE LAB-SAVES v6.0 web server (https://saves.mbi.ucla.edu/; accessed in August 2023)

  30. PROCHECK (https://saves.mbi.ucla.edu/; accessed in August 2023)

  31. STRING database (https://string-db.org/; accessed in August 2023 and July 2024)

  32. Cytoscape (https://cytoscape.org/index.html; accessed in July 2024)

  33. Genemania (https://genemania.org/; accessed in August 2023 and July 2024).

  34. Ghahremanian S, Rashidi MM, Raeisi K, Toghraie D. Molecular dynamics simulation approach for discovering potential inhibitors against SARS-CoV-2: a structural review. J Mol Liq. 2022;354:118901.

  35. Gunawardena K, Sirisena ND, Gayani A, Nilaksha N, Vajira HWD. Germline variants of uncertain significance, their frequency, and clinico-pathological features in a cohort of Sri Lankan patients with hereditary breast cancer. BMC Res Notes. 2023;16(1):96.

  36. National Center for Biotechnology Information. ClinVar; https://www.ncbi.nlm.nih.gov/clinvar/variation/462608/?oq=NM_007294.4:c.3392A%3EG&m=NM_007294.4(BRCA1):c.3392A%3EG%20(p.Asp1131Gly); accessed on 2024 Oct 5

  37. Dong H, Khyati C, Qin Y, Zhang J, Tian X, Rong C, et al. Prevalence of BRCA1/BRCA2 pathogenic variation in Chinese Han population. J Med Genet. 2020;58(8):565–9.

  38. Sirisena N, Biswas K, Sullivan T, Stauffer S, Cleveland L, Southon E, et al. Functional evaluation of five BRCA2 unclassified variants identified in a Sri Lankan cohort with inherited cancer syndromes using a mouse embryonic stem cell-based assay. Breast Cancer Res. 2020;22(1):69.

  39. Peng Q, Zhang Y, Xian B, Wu L, Ding J, Ding W, et al. A synonymous variant contributes to a rare Wiedemann-Rautenstrauch syndrome complicated with mild anemia via affecting pre-mRNA splicing. Front Mol Neurosci. 2022;15:69.

  40. Dines JN, Shirts BH, Slavin TP, Walsh T, King MC, Fowler DM, et al. Systematic misclassification of missense variants in BRCA1 and BRCA2 “coldspots.” Genet Med. 2020;22(5):825–30.

  41. National Center for Biotechnology Information. ClinVar; https://www.ncbi.nlm.nih.gov/clinvar/variation/140819/; accessed on Nov. 27, 2024

  42. National Center for Biotechnology Information. ClinVar; https://www.ncbi.nlm.nih.gov/clinvar/variation/485748/?oq=MET:c.840G%3ET;%20p.Arg280Ser,%20(rs1207381066)&m=NM_000245.4(MET):c.840G%3ET%20(p.Arg280Ser); accessed Nov. 23, 2024

  43. National Center for Biotechnology Information. ClinVar; https://www.ncbi.nlm.nih.gov/clinvar/variation/530100/?oq=CHEK2:c.60G%3ET;%20p.Gln20His,%20(rs375507194&m=NM_007194.4(CHEK2):c.60G%3ET%20(p.Gln20His); accessed Nov. 23, 2024.

  44. National Center for Biotechnology Information. ClinVar; https://www.ncbi.nlm.nih.gov/clinvar/variation/219451/; accessed Nov. 23, 2024.

  45. Tavtigian SV, Chenevix-Trench G. Growing recognition of the role for rare missense substitutions in breast cancer susceptibility. Biomarkers Med. 2014;8(4):589–603.

  46. Witus SR, Stewart MD, Klevit RE. The BRCA1/BARD1 ubiquitin ligase and its substrates. Biochem J. 2021;478(18):3467–83. https://doi.org/10.1042/BCJ20200864.

  47. Comoglio PM, Giordano S, Trusolino L. Drug development of MET inhibitors: targeting oncogene addiction and expedience. Nat Rev Drug Discov. 2008;7:504–16.

  48. Cooper CS, Park M, Blair DG, Tainsky MA, Huebner K, Croce CM, Vande Woude GF. Molecular cloning of a new transforming gene from a chemically transformed human cell line. Nature. 1984;311(5981):29–33.

  49. Kosaka T, Yamaki E, Mogi A, Kuwano H. Mechanisms of resistance to EGFR TKIs and development of a new generation of drugs in non-small-cell lung cancer. J Biomed Biotechnol. 2011;2011:165214.

  50. Pothula SP, Xu Z, Goldstein D, Merrett N, Pirola RC, Wilson JS, Apte MV. Targeting the HGF/c-MET pathway: stromal remodelling in pancreatic cancer. Oncotarget. 2017;8(44):76722–39.

  51. Trusolino L, Comoglio PM. Scatter-factor and semaphorin receptors: cell signalling for invasive growth. Nat Rev Cancer. 2002;2:289–300.

  52. Papa E, Weller M, Weiss T, Ventura E, Burghardt I, Szabó E. Negative control of the HGF/c-MET pathway by TGF-β: a new look at the regulation of stemness in glioblastoma. Cell Death Dis. 2017;8(12):3210.

  53. Galehdari H, Saki N, Mohammadi-Asl J, Rahim F. Meta-analysis diagnostic accuracy of SNP-based pathogenicity detection tools: a case of UTG1A1 gene mutations. Int J Mol Epidemiol Genet. 2024;4(2):52.

Acknowledgements

None

Funding

None.

Author information

Contributions

NS was involved in the conceptual idea. NA, KS, and MF were involved in performing the bioinformatics analysis. NA, NDS and SDS drafted the manuscript. VHWD critically reviewed the manuscript. All the authors reviewed and approved the final manuscript.

Corresponding author

Correspondence to Nirmala D. Sirisena.

Ethics declarations

Ethics approval and consent to participate

Ethical approval for the study was obtained from the Ethics Review Committee, Faculty of Medicine, University of Colombo [EC-22-141]. Prior written informed consent had been obtained from the participants for the use of these genomic data for future research studies.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1: Fig. S1. Best models for the protein structures: (a) BRCA1 wild-type protein (SWISS-MODEL); (b) BRCA1 variant protein (SWISS-MODEL); (c) BRIP1 wild-type protein (SWISS-MODEL); (d) BRIP1 variant protein (SWISS-MODEL); (e) CHEK2 wild-type protein (SWISS-MODEL); (f) CHEK2 variant protein (SWISS-MODEL); (g) AlphaFold predicted structure for the MET wild-type protein (UniProt); (h) SWISS-MODEL model for the MET variant protein; (i) AlphaFold predicted structure for the STK11 wild-type protein (UniProt); (j) SWISS-MODEL model for the STK11 variant protein.

Additional file 2: Fig. S2. Comparison of Ramachandran plots (PROCHECK) of (a) BRCA1 wild-type and variant proteins, (b) BRIP1 wild-type and variant proteins, (c) CHEK2 wild-type and variant proteins, (d) MET wild-type and variant proteins, and (e) STK11 wild-type and variant proteins.

Additional file 3: Fig. S3. Surface area and cavities (Swiss PDV) of the (a) BRCA1 wild-type protein, (b) BRCA1 variant protein, (c) BRIP1 wild-type protein, (d) BRIP1 variant protein, (e) CHEK2 wild-type protein, (f) CHEK2 variant protein, (g) STK11 wild-type protein, (h) STK11 variant protein, (i) MET wild-type protein, and (j) MET variant protein.

Additional file 4: Fig. S4. (a) RMSD plot of BRCA1: comparison between wild-type and variant proteins. (b) RMSF plot of BRCA1 wild-type protein. (c) RMSF plot of BRCA1 variant protein. (d) RMSD plot of BRIP1: comparison between wild-type and variant proteins. (e) RMSF plot of BRIP1 wild-type protein. (f) RMSF plot of BRIP1 variant protein. (g) RMSD plot of CHEK2: comparison between wild-type and variant proteins. (h) RMSF plot of CHEK2 wild-type protein. (i) RMSF plot of CHEK2 variant protein. (j) RMSD plot of STK11: comparison between wild-type and variant proteins. (k) RMSF plot of STK11 wild-type protein. (l) RMSF plot of STK11 variant protein.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Arachchige, N.D.S., Sirisena, N.D., De Silva, S. et al. Comprehensive bioinformatics analysis of selected germline variants of uncertain significance identified in a cohort of Sri Lankan hereditary breast cancer patients. Hum Genomics 19, 12 (2025). https://doi.org/10.1186/s40246-024-00703-8

  • DOI: https://doi.org/10.1186/s40246-024-00703-8

Keywords