- Research
- Open access
Genetic diversity of the immunoglobulin heavy chain locus in cohorts of patients affected with SARS-CoV-2
Human Genomics volume 19, Article number: 7 (2025)
Abstract
Background
The Immunoglobulin Heavy Chain (IGH) genomic region is responsible for the production of circulating antibodies and warrants careful investigation for its association with COVID-19 characteristics. Multiple allelic variants within and across different IGH gene segments form a limited set of haplotypes. Previous studies have shown associations between some of these haplotypes and clinical outcomes of COVID-19. We typed 445 individuals of European ancestry, stratified for gender, age, and clinical status for 4 SNPs, two of which result in amino acid substitutions in IGHA2 and IGHG4, respectively. We analyzed associations at the single-locus level and for 4-loci haplotypes, inferred by phasing, after stratifying the overall cohort by gender, age, and disease severity.
Results
Only weak evidence of significant differences between subgroups was obtained at the level of a single SNP. However, when the haplotypic data were analyzed for the young and old subgroups separately, uneven partitioning was observed regarding the occurrence of severe cases and Resistors. We then examined the cross-tabulation of disease severity in males and females, based on the presence of each haplotype in the genotype. Two haplotypes were underrepresented in young severe cases compared to old severe ones. The same two haplotypes were overrepresented among young Resistors. These findings provide stronger support for the weak associations observed at the single-locus level.
Conclusions
Two haplotypes seem to act as protective factors specifically in young individuals, counteracting the general increase in vulnerability with age. This observation aligns with stronger genetic effects seen in young patients for other susceptibility genes. Our findings complement previous research identifying specific genetic variants that influence COVID-19 susceptibility and severity, emphasizing the complex interplay between host genetics and viral infection outcomes. Our results are consistent with a potential causative role of IGH regulatory regions (e.g. HS1.2), which are flanked by the SNP set here analyzed.
Introduction
The Coronavirus disease 2019 (COVID-19) is a respiratory illness caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which emerged in late 2019 and quickly spread. It was declared a global pandemic by the World Health Organization (WHO) in March 2020. In May 2023 COVID-19 was declared no longer a public health emergency of international concern. Nonetheless, the virus continues to circulate in communities and remains a potentially serious health risk.
Symptoms of COVID-19 can vary in severity. Many individuals exhibit mild symptoms that may resemble those of a common cold or flu. Patients with moderate symptoms may experience more pronounced manifestations, including persistent cough, shortness of breath or difficulty breathing, chest pain or pressure, increased fatigue, and exacerbation of pre-existing conditions (e.g., asthma, chronic obstructive pulmonary disease). A smaller proportion of patients develop severe symptoms that can lead to hospitalization and critical care. These symptoms include significant difficulty breathing, persistent chest pain or pressure, confusion or inability to stay awake, and bluish lips or face. Severe cases often require supplemental oxygen, mechanical ventilation, or other intensive medical interventions [1]. The variability in symptoms can be attributed to several factors, including age, viral load, and immune response. It is also important to note that some individuals infected with SARS-CoV-2 may remain asymptomatic, meaning they do not exhibit any symptoms despite being infected. Asymptomatic carriers can still spread the virus, contributing to the challenge of controlling the pandemic. Additionally, some individuals seem to have an innate resistance to SARS-CoV-2; several cases have been reported where all family members, except one spouse, became infected, suggesting that certain highly exposed individuals may be resistant to infection by this virus [2].
Several genome-wide studies have investigated the association between genetic variants and the risk of both SARS-CoV-2 infection and progression to severe COVID-19 [3,4,5]. Although many common genetic variants have been linked to an increased risk of infection, it is now well established that rare genetic mutations in the interferon (IFN) pathway are critically important, particularly in individuals who experienced severe manifestations of COVID-19 [6,7,8,9,10]. The presence of these mutations has been linked to a significantly increased risk of hospitalization and mortality due to COVID-19, underscoring the importance of the interferon response in controlling the disease [11]. The importance of this pathway in COVID-19 severity is further confirmed by the presence of autoantibodies against type I interferons (IFNs), which neutralize the antiviral effects, leading to impaired immune responses and increased susceptibility to severe outcomes [12, 13]. The identification of autoantibodies against type I IFNs has potential clinical implications for managing COVID-19 and for screening individuals at higher risk for severe disease [14]. The mechanism by which these autoantibodies are generated remains unknown. However, it is conceivable that it is the result of a multifactorial process, involving viral mimicry, dysregulation of immune tolerance, inflammatory responses, genetic predisposition, and possibly vaccine-induced effects [15]. Further research is needed to clarify the precise pathways leading to autoantibody production and their implications for disease severity and management. The presence of autoantibodies, dysregulated immune responses, and the impact of age and comorbidities on antibody levels suggest that altered antibody production in COVID-19 may significantly impact disease severity, recovery, and vaccine efficacy [16].
Antibody production during SARS-CoV-2 infection is highly variable, both in terms of immunoglobulin (Ig) class and timing of seroconversion [17,18,19]. Given that the main route of SARS-CoV-2 transmission is inhalation of respiratory droplets, secretory immunoglobulins A (SIgA) produced in the mucosa represent the first immune response and play a fundamental role in limiting the virus to the upper respiratory tract [20]. Therefore, the airway mucosal immune system mounts innate and adaptive responses that provide powerful protection against pathogen invasion. An association between levels of reactive salivary IgA antibodies and susceptibility to some pulmonary diseases has been reported [21].
The immunoglobulin heavy chain gene cluster (IGH) is located on chromosome 14. Different Ig isotypes are the results of somatic recombination and mRNA splicing, leading to the assembly of continuous coding sequences for various Ig classes. The genomic region containing the coding segments for the immunoglobulin heavy chain constant domains is the result of a tandem duplication, in which the telomeric member harbours, among others, the active gene segments IGHG3, IGHG1 and IGHA1, whereas the other one contains IGHG2, IGHG4, IGHE and IGHA2 (Fig. 1). They are transcribed from telomere to centromere on the minus (−) DNA strand. A Regulatory Region (3'RR) is present in each of the duplicated blocks, with the telomeric and centromeric paralogues named 3'RR1 and 3'RR2, respectively. Each of these contains three enhancer elements indicated as HS3, HS1.2 and HS4 (Fig. 1 in refs. [22, 23]). Both regions have been categorized as super-enhancers, i.e. elements that drive expression of genes that control and define cell identity [24]. Variation in these genomic regions has implications for inter-individual diversity in the ability to make Ig switches [23].
Epidemiological data show that women are less susceptible to COVID-19 than men [25]. Protection against death in women has been associated with a heightened immune response to viral infections compared with men [26,27,28]. These observations could be related to the level of interferon availability (as mentioned above), steroid hormones, or both. The hypothesis that the production of different Ig classes could be influenced by estrogens is supported by the identification of DNA binding elements within switch regions in IGH [29, 30].
In a previous study [31], we documented inter-individual variation in the HS1.2 region of 3’RR1, identifying allelic variants of different lengths associated with a set of Single Nucleotide Polymorphisms (SNPs) in strong linkage disequilibrium (LD). The non-independent arrangements identified in the European population could affect the ability of the genomic region to bind transcription factors or regulatory effectors, including estrogens. Furthermore, coding variants may affect the functional and immunogenic properties of Ig heavy chains.
Here we hypothesized that, by virtue of LD, a limited number of variable sites could define haplotypes relevant to the COVID-19 response. We thus typed COVID-19 patients of European ancestry, stratified by gender, age, and clinical status, for 4 SNPs (Fig. 1) located on both sides of HS1.2, two of which result in amino acid substitutions in IGHA2 and IGHG4, respectively. We analyzed the genotyping results in terms of single- and multi-locus associations after stratifying the overall cohort by gender, age, and disease severity.
Materials and methods
The subjects
All subjects considered in the present study were enrolled in the pre-vaccine era. A total of 305 COVID-19 patients were enrolled and hospitalized at the University Hospital of Rome “Tor Vergata” and the Bambino Gesù Pediatric Hospital in Rome, in the period March-May 2020. All were diagnosed with COVID-19 after positive results of naso-oropharyngeal swabs (Table 1).
The patients were partitioned into 4 disease severity groups based on their hospitalization outcomes: (i) asymptomatic, absence of clinical symptoms; (ii) mild, presence of few symptoms, but not requiring ventilation, except for cases of respiratory support via Venturi Mask (VMK); (iii) moderate, showing respiratory impairment, requiring non-invasive ventilation and CPAP (continuous positive airway pressure) or BiPAP (bilevel positive airway pressure) cycles; (iv) severe, defined by respiratory failure, requiring invasive ventilation and intensive care unit (ICU) admission. Specifically, patients (N = 305) were classified as severe (N = 114, 37%; mean age ± s.d. 69.4 ± 17.6 years); moderate (N = 184, 60%; 62.3 ± 20.3); mild (N = 7, 2%; 51.1 ± 14.2).
Additionally, the study included a group of subjects classified as Resistors (N = 140; 41.3 ± 12.6). Resistors are individuals who were exposed to SARS-CoV-2 (through occupational or household contacts) without developing infection, who tested negative on PCR and lacked anti-SARS-CoV-2-specific antibodies. This group consisted of uninfected household contacts of proven infected and symptomatic individuals (with a score of 3 or higher on the WHO clinical progression scale), and individuals exposed to an index case without personal protection equipment for at least one hour per day, during the first three to five days of symptoms in the index case. These subjects were enrolled from March 20th to June 20th, 2020, during the first severe pandemic wave in Italy.
Age, gender, and disease severity, as indicated by hospitalization outcomes, were recorded for all subjects in each cohort (Tables 1 and S1). We analyzed subjects aged < 50 and ≥ 50 years separately to obtain groups of nearly equal size (hereafter referred to as “young” and “old”).
Following approval from the local ethics committee at Tor Vergata University Hospital (protocol no. 50/20), the study was conducted in accordance with the principles of the Declaration of Helsinki. Informed written consent was obtained from each patient.
Single nucleotide polymorphism (SNP) genotyping
We analyzed the 4 SNPs listed in Table 2 and Fig. 1. These were selected for their documented effects as missense substitutions, predictors of quantitative expression of IGH gene segments, or HS1.2 length allele by LD (Table 2, column 6).
Genotyping was performed by TaqMan allele discrimination assays (Applied Biosystems) according to the manufacturer’s instructions. The reactions were run on an Applied Biosystems StepOne Real-Time PCR system under the following conditions: initial denaturation at 95 °C for 10 min, followed by 40 cycles of denaturation at 92 °C for 15 s and a single annealing-extension step at 60 °C for 90 s. Genotypes were assigned by recording the fluorescence emission from each sample at the corresponding VIC and FAM dye wavelengths.
The resulting dataset was integrated with genotypes of the 1000 Genomes project (N = 107, Tuscan controls, TSI) [32]. Individual genotypes at the same 4 SNPs typed in patients and Resistors were extracted with the data slicer available at http://www.ensembl.org/Homo_sapiens/Tools/DataSlicer, and their phasing retained to infer haplotypes. We limited the analysis to this geographically proximate European sub-population, in view of the strong divergence in this genomic region, also within the Continent [31].
Statistical analyses
Allele frequencies for all the SNPs and the testing of Hardy–Weinberg equilibrium (HWE) were obtained with Arlequin [33] for each group of subjects, separately (Table S2). Given the physical proximity and the presence of LD between the 4 analyzed SNPs in the IgH region, we used PHASEv2.1 [34] to reconstruct the 4-loci haplotypes and obtain their frequencies.
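As an illustration of the Hardy–Weinberg check described above, the following minimal Python sketch derives an allele frequency from genotype counts and applies the textbook chi-square goodness-of-fit test with one degree of freedom. It is not the procedure run in Arlequin, which may use a different test; the function name `hwe_chi_square` and the genotype counts are hypothetical.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg proportions (1 df).

    Assumes a biallelic SNP with both alleles observed (no zero expectations).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)          # frequency of allele A
    q = 1.0 - p                              # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Chi-square survival function for 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return p, chi2, p_value

# Hypothetical genotype counts for one SNP in one subgroup (not study data)
freq_a, chi2, p_value = hwe_chi_square(50, 40, 10)
print(f"freq(A) = {freq_a:.3f}, chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

A large p-value, as in this toy case, indicates no detectable departure from Hardy–Weinberg proportions.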
All further calculations were performed with R functions. Sample groups were compared for allele, genotype and haplotype frequencies using a contingency chi-square on raw counts with the "chisq.test" function, which returns an exact probability value.
Multidimensional analysis was performed by sparse Principal Components as implemented in the R package sparsepca (https://github.com/erichson/spca) on the subset of 430 fully genotyped subjects. Sparse Principal Component Analysis (SPCA) attempts to find sparse weight vectors (loadings), i.e., weight vectors with only a few "active" (nonzero) values. This approach makes the principal components easier to interpret in high-dimensional data settings, because each component is formed as a linear combination of only a few of the original variables. It is a powerful method to analyze differentiation in multi-allele systems and takes into account the absence/presence of allelic states by encoding each of them in a distinct binary (0, 1) variable [35]. We used the 4-loci haplotypes obtained upon phasing (see above) as a multi-allele system, encoding their presence/absence into 13 such variables (haplotypes listed in Tables 3 and S3).
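The presence/absence encoding described above can be sketched as follows. This minimal Python example only builds the binary input matrix; the sparse PCA step itself is not shown. The function name `encode_presence`, the three-haplotype subset, and the example subjects are hypothetical and chosen for brevity.

```python
def encode_presence(genotypes, haplotypes):
    """Return one 0/1 row per subject: 1 if the haplotype occurs in the genotype."""
    return [[1 if h in pair else 0 for h in haplotypes] for pair in genotypes]

# Subset of the reconstructed haplotypes, used here for brevity
haplotypes = ["GAGA", "GGAG", "GGAA"]

# Hypothetical phased genotypes (two haplotypes per subject), not study data
genotypes = [("GAGA", "GGAG"), ("GAGA", "GAGA"), ("GGAA", "GGAG")]

matrix = encode_presence(genotypes, haplotypes)
for pair, row in zip(genotypes, matrix):
    print("/".join(pair), row)
```

With this encoding a homozygote carries a single 1 and a heterozygote two, so subjects sharing a genotype collapse onto the same point in the component space.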
Results
In the overall study, men (224) and women (221) were equally represented (Table 1). In the younger age group, women were overrepresented, whereas the opposite was true in the older age group (p = 4E-5). Similar differences were observed across various clinical categories (Table S1), with women overrepresented among Resistors, while men were overrepresented among moderate (n.s.) and severe patients (p = 0.01). This peculiarity among Resistors may be attributed to a higher representation of women in the population cohort where potential exposure was recorded and subjects were sampled. Even when excluding this group and considering only COVID-19 patients, men were overrepresented in the older age group (150 M vs 88 F) as opposed to the young group (30 M vs 37 F) (p = 0.01). Thus, the composition of the entire patient cohort replicates the previously reported milder presentation of COVID-19 in women in general, as well as the higher occurrence of severe cases in elderly individuals of both sexes [36,37,38].
Single locus analysis
All the individuals were genotyped for the 4 SNPs listed in Table 2, with only a few instances of missing genotypes (range 6–13 depending on the SNP; compare the totals in Tables 1 vs S1). To compare our results with those obtained from another sample of similar geographic origin, we considered the 107 Tuscans of the 1000 Genomes Project for whom genotypes for the same SNPs were available (see Materials and Methods). We selected this sample, among other European samples, because of the long-known strong structuring at this genomic region worldwide and within Europe [39].
Allele and genotype frequencies for all SNPs were generally in Hardy–Weinberg Equilibrium (HWE). In four instances (greyed in Table S2), a nominal p < 0.05 was obtained, which did not survive Bonferroni correction.
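The Bonferroni step can be illustrated with a short Python sketch: the nominal threshold is divided by the number of tests, so p-values just below 0.05 no longer count as significant. The function name `bonferroni` and the list of p-values are hypothetical; the actual number of tests is the one underlying Table S2.

```python
def bonferroni(p_values, alpha=0.05):
    """Return the corrected threshold and a significance flag per test."""
    threshold = alpha / len(p_values)
    return threshold, [p < threshold for p in p_values]

# Hypothetical nominal p-values from a batch of HWE tests (not study data)
p_values = [0.03, 0.20, 0.04, 0.60, 0.01, 0.75, 0.33, 0.048]

threshold, significant = bonferroni(p_values)
print(f"corrected threshold = {threshold:.5f}")
print("tests still significant:", sum(significant))
```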
Pairwise comparisons of allele frequencies were performed between all subgroups (clinical status categories and control subjects) and between young and old subgroups, for men and women separately. The results are visually summarized in Fig. S1. Significant differences (0.01 < p < 0.05) were obtained for rs10137020 when comparing the moderate subgroup (lower G frequency) vs both the Resistors and the 1000 Genomes TSI controls (higher G frequency) for the aggregated sexes, and when comparing the moderate (lower G frequency) vs Resistor (higher G frequency) subgroups among women. Significant comparisons also emerged for rs12896746 for both the aggregated sexes and for women (Table S2 and Fig. S1). Notably, significant differences were found when comparing the Resistors (higher A frequency) vs both moderate and severe patients (lower A frequency), and severe patients (lower A frequency) vs controls (higher A frequency) for the aggregated sexes and when comparing the Resistors vs severe patients among women. No significant difference was found in any of the comparisons for rs61984162 and rs12433324. Overall, these results provided only weak evidence of significant departures between subgroups at the single SNP level.
Multi-locus analysis
While the power of individual SNPs in detecting association is limited by the presence of two alleles only, this power can notably increase by combining alleles at adjacent sites into haplotypes. This effect is further augmented if neighboring SNPs are in LD, producing a limited number of combinations in cis (haplotypes) with frequencies strongly departing from the random expectation, each of which conveys information on untested intervening variation, possibly of functional significance [40]. Based on this concept, we inferred the haplotypes contributing to each 4-loci genotype (Table S3). The 13 reconstructed haplotypes matched in structure and frequencies (p = 0.64) those represented in 107 Tuscans of the 1000 Genomes Project upon complete resequencing, supporting the subsequent analyses.
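The departure of haplotype frequencies from random expectation can be quantified with the standard pairwise LD measures D and r². The following Python sketch computes both for two sites from phased haplotype counts. The function name `pairwise_ld` and the counts are hypothetical, chosen only to show the calculation; site indices are 0-based and follow the order of sites in the haplotype strings.

```python
def pairwise_ld(hap_counts, i, j):
    """D and r^2 between biallelic sites i and j from phased haplotype counts."""
    total = sum(hap_counts.values())
    first = next(iter(hap_counts))
    ref_i, ref_j = first[i], first[j]        # reference allele at each site
    p_i = sum(c for h, c in hap_counts.items() if h[i] == ref_i) / total
    p_j = sum(c for h, c in hap_counts.items() if h[j] == ref_j) / total
    p_ij = sum(c for h, c in hap_counts.items()
               if h[i] == ref_i and h[j] == ref_j) / total
    d = p_ij - p_i * p_j                     # observed minus random expectation
    r2 = d * d / (p_i * (1 - p_i) * p_j * (1 - p_j))
    return d, r2

# Hypothetical haplotype counts (not study data)
counts = {"GAGA": 60, "GGAG": 20, "GGAA": 10, "GGGA": 4, "AGAG": 6}

d, r2 = pairwise_ld(counts, 1, 2)            # the two central sites
print(f"D = {d:.3f}, r^2 = {r2:.3f}")
```

Values of r² close to 1 mean that one site predicts the other almost completely, which is why few haplotypes account for most chromosomes.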
We employed a multidimensional method to plot each subject according to his or her haplotypic genotype, while retaining information on gender, age, and clinical status. Figure 2 displays each subject in the space of the 1st (38% of variance) and 2nd (20%) sparse Principal Components. Haplotypes GAGA (order of sites as in Table 2) and GGAG contributed strongly to sPC1, for positive and negative values, respectively. Haplotypes GGAA, GGGA, and AGAG contributed strongly to positive values of sPC2, and GAGA and GGAG to negative values.
Fig. 2 Plot of Sparse Principal Components 1 and 2 of 430 COVID-19 patients and Resistors (colour coded). Points were scattered (max ± 0.1) around the original position to avoid complete overlapping of symbols. Arrows highlight three clusters depleted in severe cases among patients aged < 50 and enriched in Resistors. (A) all subjects; (B) age < 50; (C) age ≥ 50
Figure 2A shows five major genotype clusters. Except for the bottom central and rightmost clusters (heterozygotes GAGA/GGAG and homozygotes GAGA/GAGA, respectively), the other three clusters contained subclusters, each corresponding to slightly different genotypes. All clusters and subclusters were populated by subjects of both sexes from all status categories, with no prominent enrichment or depletion of severe patients, moderate patients, or Resistors.
However, when the data were plotted for the young and old subgroups separately (Fig. 2B, C), uneven partitioning was observed regarding the occurrence of severe cases and Resistors. The general trend of enrichment of severe cases among the old was confirmed for all clusters (p ranging from 0.0084 to 0.0001), representing the major trend in the data. Among young subjects, a clear separation was noted between three clusters (indicated by arrows 1–3 in Fig. 2B, C), with a single severe case overall, and the two remaining clusters, in which severe cases persisted. Conversely, severe cases were present in all clusters among older subjects. Moreover, the same three clusters among the young appeared to be enriched in Resistors. In conclusion, the prevalence of severe cases among the old increased across all clusters, although some clusters exhibited more significant increases than others.
The plot in Fig. 2 was instructive, as the genotypes corresponding to the three clusters that were depleted in young severe cases and enriched in Resistors were all contributed by haplotype GGAA, and in one case GGAG. Specifically, cluster 1 included the genotypes GGAA/GGAG and GGAG/GGAG, cluster 2 contained GGAA/GGAA, and cluster 3 comprised GGAA/GAGA. We reasoned that the presence of GGAA or GGAG alone could be associated with a low prevalence of severe COVID-19 presentation in young patients. We then analysed the cross-tabulation of disease severity in men and women, according to the presence of each haplotype in the genotype (Table 3).
This partition confirmed the excess of young women in the moderate group, in contrast to an excess of old men (Totals in Table 3A, B). However, in the overall distribution, a complete absence of young severe cases carrying GGAA emerged, compared to 11 out of 30 among old cases (p = 0.0001). Additionally, GGAG was underrepresented in young severe cases (5/79) compared to old severe cases (42/123) (p = 1E-5). The same two haplotypes were overrepresented among young Resistors (27/38 vs 5/30 for GGAA; p = 2.5E-5 and 51/79 vs 19/123 for GGAG; p = 2.5E-12).
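The GGAG comparisons above can be retraced with a 2×2 contingency chi-square. The Python sketch below makes two assumptions that are not stated in the text: that each fraction is the number of subjects in a given status category over all carriers of the haplotype in that age group, and that the Yates continuity correction is applied. The function name `chi_square_2x2` is hypothetical.

```python
import math

def chi_square_2x2(a, b, c, d, yates=True):
    """Chi-square test (1 df) on the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(0.0, diff - n / 2.0)      # continuity correction (assumed)
    chi2 = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Chi-square survival function for 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# GGAG carriers, severe vs non-severe: young 5 of 79, old 42 of 123
chi2_sev, p_sev = chi_square_2x2(5, 79 - 5, 42, 123 - 42)

# GGAG carriers, Resistors vs others: young 51 of 79, old 19 of 123
chi2_res, p_res = chi_square_2x2(51, 79 - 51, 19, 123 - 19)

print(f"severe:    chi2 = {chi2_sev:.2f}, p = {p_sev:.1e}")
print(f"Resistors: chi2 = {chi2_res:.2f}, p = {p_res:.1e}")
```

Under these assumptions both p-values fall in the same order of magnitude as those reported for GGAG.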
These results reinforce, with stronger reliability, the weak associations found at the single locus level. The haplotypes GGAA and GGAG differ from the most common GAGA at the two central SNPs (rs10137020 and rs12896746). Consistently, rs10137020(G) and rs12896746(A) were underrepresented in severe and moderate patients, though with weaker statistical support than the corresponding haplotypes.
Discussion
In this study, we analysed SNP variation in the IGH genomic region in three cohorts of COVID-19 patients and a group of subjects tentatively classified as Resistors. When analysing the genetic results, it was necessary to distinguish between genetic association and two major trends in the data, i.e., the lower propensity of women to develop severe forms of the disease and the increase of severe forms with age progression [36, 37]. Our series faithfully replicated these trends; thus, our search aimed at detecting genetic effects that strengthened these trends.
The IGH genomic region warrants careful investigation for association with COVID-19 features for several reasons. First, its duplicated nature, containing highly similar paralogous gene segments and multiallelic variation, complicates the unambiguous handling of NGS short reads and variant calling. However, this region is of utmost importance because it is responsible for the production of circulating antibodies. In this context, both structural and regulatory variation are important, up to the finest level, i.e., single nucleotide variation.
In a previous study [31], we showed that, at least in Europeans, multiple allelic variants within and between different IGH gene segments are arranged in cis non-randomly, forming a limited set of haplotypes. These haplotypes, in turn, harbour different allelic versions of regulatory elements (e.g. HS1.2), and the same is expected for any variant whose phenotypic effect could potentiate or weaken the antibody-mediated immune response, not only to SARS-CoV-2.
Two haplotypes were found to be differentially represented in different subgroups of patients, significantly underrepresented among young severe patients of both sexes and overrepresented among young Resistors. The three sPCA clusters in which this effect is detected include GGAA heterozygotes and homozygotes, as well as GGAG homozygotes (Fig. 2). Thus, the reduction of severe cases was associated with genotypes having a single GGAA copy, whereas two copies of GGAG were required.
In summary, these two haplotypes act as protective factors specifically for young subjects, in a manner that surpasses the general increase of vulnerability with age, in line with stronger genetic effects in young patients for other susceptibility genes [5]. In the overall population, the protective effect of GGAA and GGAG is predicted to impact a minority of people, as they account for 8% and 23% of all haplotypes, respectively, in our dataset (Table S3). Some degree of population heterogeneity among Europeans is expected, as their frequencies range 0.10–0.17 and 0.26–0.42, respectively, among the 5 European population samples in the 1000 Genomes data. In our LD analysis [31] these two haplotypes turned out to be in cis mainly (though not exclusively) to HS1.2 allele 1.
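To make the expected share of protected genotypes concrete, the following Python sketch converts the two haplotype frequencies quoted above (8% and 23%) into genotype fractions. It assumes random union of haplotypes (Hardy–Weinberg proportions) and uses the genotype pattern noted in the Discussion, i.e. at least one GGAA copy or two GGAG copies. The function name `protective_genotype_fraction` is hypothetical.

```python
def protective_genotype_fraction(f_ggaa, f_ggag):
    """Expected genotype fractions under random union of haplotypes (assumed)."""
    carriers_ggaa = 1.0 - (1.0 - f_ggaa) ** 2   # at least one GGAA copy
    homoz_ggag = f_ggag ** 2                    # two GGAG copies
    # The two classes are disjoint: a GGAG/GGAG genotype carries no GGAA
    return carriers_ggaa, homoz_ggag, carriers_ggaa + homoz_ggag

carriers_ggaa, homoz_ggag, total = protective_genotype_fraction(0.08, 0.23)
print(f"GGAA carriers:    {carriers_ggaa:.4f}")
print(f"GGAG homozygotes: {homoz_ggag:.4f}")
print(f"combined:         {total:.4f}")
```

Under these assumptions roughly one fifth of individuals would carry one of these genotypes, in keeping with the statement that the effect concerns a minority of people.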
These results align with a previous study [41] reporting a protective effect against extremely severe outcomes in homozygotes for the allotype Gm3, which is predicted by our haplotype GGAA [31]. On the other hand, we point out that the protective effect reported here is of a different kind from that outlined in [29], which examined the same genomic region with a partially overlapping set of SNPs. In that study, the authors considered the development of pneumonia in COVID-19, reporting a lower prevalence in female carriers of HS1.2 allele 2. A notable variation in prevalence was associated with different haplotypes. Conversely, within the limits of our sample sizes, the lack of severe young cases associated with GGAA and GGAG in the present study affected both sexes. In conclusion, while in ref. [29] the hypothesis of estrogens as protective effectors in COVID-19 is viable and in line with a higher estrogen-binding ability of HS1.2 allele 2 vs 1 [31], a similar hormonal role cannot be invoked to interpret our present results. It is quite possible that the two studies detected different pathways to protection, which are captured by different SNP sets.
Recently, a relation between SARS-CoV-2 infection and genome alteration has been emphasized. This may take the form of altered methylation landscape patterns in lung tissues [42] and/or alteration of 3D genome/epigenome structures correlated with transcriptional suppression of interferon response genes [43], with a possible inter-generational effect [44]. The 3’RR regions are candidates as targets of the mechanisms underlying the above effects, for their richness in CpG dinucleotides [31] and conformational constraints [45].
Beyond the IGH genomic region, only a few protective alleles have been discovered in SARS-CoV-2 infection. These include the ancestral HLA-A*68 allele in some indigenous populations of Mexico [46], IFNL4 gene polymorphisms in COVID-19-related pneumonia in women [47], common variants in TMPRSS2 [48], the neutrophil-expressed elastase gene (ELANE) [49], and the rs10774671(G) allele in OAS1 in Berbers [50].
As for the possible physiological mechanisms involved, inter-individual variation for susceptibility to COVID-19 severity and the associated IGH variation might well be only a facet of a more general ability to mount antibody-mediated responses to a variety of challenges. In other words, IGH variants may drive resistance to a range of diseases. In this context, increased HS1.2 allele 2 frequencies were observed in several autoimmune diseases, such as Celiac disease, Systemic Lupus Erythematosus and Rheumatoid Arthritis [51, 52]. On the contrary, in Crohn’s disease, a multifactorial disease with an abnormal immune response to enteric antigens, no significant correlations were observed with any HS1.2 allele [53]. The possible role of polymorphic variants in HS1.2 has been reported in AIDS [54], another disease caused by a virus (HIV). Interestingly, in HIV-infected patients, HS1.2 allele 1 was overrepresented in nonprogressors, i.e., individuals who did not develop AIDS. Our results also inferentially suggest a protective effect of the HS1.2 allele 1 in the cohort of COVID-19 young patients. The finding of the same HS1.2 allele involved in the protection from two viral diseases suggests a common mechanism of immunological defense carried out by the 3'RR and surrounding regions against the severity and/or progression of pathological conditions caused by viruses. This is not surprising, given the recently reported role of host genotypes in protection from infectious diseases [55]. Several studies have identified specific genetic variants that influence COVID-19 susceptibility and severity, highlighting the complex interplay between host genetics and outcomes of viral infection [5, 56, 57].
The human determinants of age-dependent patterns of death from infectious diseases are multifaceted, involving a complex interplay of biological, social, and environmental factors [58]. Research consistently shows that age significantly influences mortality rates from infectious diseases, with older adults exhibiting higher susceptibility and poorer outcomes compared to younger populations. This trend is particularly evident in the context of COVID-19, where age and sex have emerged as critical determinants of mortality. However, infectious diseases have historically been the cause of very high mortality in the first year of life (28%), decreasing sharply until early adulthood and progressively increasing with age, leading to a classic U-shaped curve [58]. Older adults, particularly those over 65, are at a markedly increased risk of severe outcomes from infectious diseases. For instance, studies have reported that the mortality rate from COVID-19 escalates significantly with age, with individuals aged 80 and above experiencing the highest fatality rates [25, 59]. This increased vulnerability is attributed to several factors, including age-related declines in immune function, which impair the body’s ability to respond effectively to infections [60], as well as the presence of pre-existing anti-IFN1 autoantibodies, which significantly increase in prevalence after age 70. These autoantibodies are found in approximately 20% of both critical COVID-19 cases aged over 80 and total fatal COVID-19 cases [12, 13].
In conclusion, understanding the molecular mechanisms underlying the joint protective effects of immunoglobulins and other genetic variants against COVID-19 is essential for developing targeted public health interventions aimed at reducing mortality from infectious diseases, particularly in vulnerable age groups.
Data availability
All data are available as a spreadsheet in the Supplementary Materials.
References
Cao X. COVID-19: immunopathology and its implications for therapy. Nat Rev Immunol. 2020;20(5):269–70.
Andreakos E, Abel L, Vinh DC, Kaja E, Drolet BA, Zhang Q, et al. A global effort to dissect the human genetic basis of resistance to SARS-CoV-2 infection. Nat Immunol. 2022;23(2):159–64.
The Severe COVID-19 GWAS Group. Genomewide association study of severe Covid-19 with respiratory failure. N Engl J Med. 2020;383:1522–34.
Covid-Host Genetics Initiative. Mapping the human genetic architecture of COVID-19. Nature. 2021;600:472–7.
Cobat A, Zhang Q, Abel L, Casanova JL, Fellay J. Human genomics of COVID-19 pneumonia: contributions of rare and common variants. Annu Rev Biomed Data Sci. 2023;6:465–86.
Zhang Q, Bastard P, Liu Z, Le Pen J, Moncada-Velez M, Chen J, et al. Inborn errors of type I IFN immunity in patients with life-threatening COVID-19. Science. 2020;370(6515):eabd4570.
Asano T, Boisson B, Onodi F, Matuozzo D, Moncada-Velez M, Maglorius Renkilaraj MRL, et al. X-linked recessive TLR7 deficiency in ~1% of men under 60 years old with life-threatening COVID-19. Sci Immunol. 2021;6(62):eabl4348.
Zhang Q, Bastard P, Cobat A, Casanova JL. Human genetic and immunological determinants of critical COVID-19 pneumonia. Nature. 2022;603(7902):587–98.
Biancolella M, Colona VL, Luzzatto L, Watt JL, Mattiuz G, Conticello SG, et al. COVID-19 annual update: a narrative review. Hum Genom. 2023;17(1):68.
Matuozzo D, Talouarn E, Marchal A, Zhang P, Manry J, Seeleuthner Y, et al. Rare predicted loss-of-function variants of type I IFN immunity genes are associated with life-threatening COVID-19. Genome Med. 2023;15(1):22.
Wang L, Zhu Y, Zhang N, Xian Y, Tang Y, Ye J, et al. The multiple roles of interferon regulatory factor family in health and disease. Signal Transduct Target Ther. 2024;9(1):282.
Bastard P, Rosen LB, Zhang Q, Michailidis E, Hoffmann HH, Zhang Y, et al. Autoantibodies against type I IFNs in patients with life-threatening COVID-19. Science. 2020;370(6515):eabd4585.
Bastard P, Gervais A, Le Voyer T, Rosain J, Philippot Q, Manry J, et al. Autoantibodies neutralizing type I IFNs are present in ~4% of uninfected individuals over 70 years old and account for ~20% of COVID-19 deaths. Sci Immunol. 2021;6(62):eabl4340.
Credle JJ, Gunn J, Sangkhapreecha P, Monaco DR, Zheng XA, Tsai H-J, et al. Unbiased discovery of autoantibodies associated with severe COVID-19 via genome-scale self-assembled DNA-barcoded protein libraries. Nat Biomed Eng. 2022;6(8):992–1003.
Casanova JL, Peel J, Donadieu J, Neehus AL, Puel A, Bastard P. The ouroboros of autoimmunity. Nat Immunol. 2024;25(5):743–54.
Garcia-Beltran WF, Lam EC, Astudillo MG, Yang D, Miller TE, Feldman J, et al. COVID-19-neutralizing antibodies predict disease severity and survival. Cell. 2021;184(2):476–88.
Long QX, Liu BZ, Deng HJ, Wu GC, Deng K, Chen YK, et al. Antibody responses to SARS-CoV-2 in patients with COVID-19. Nat Med. 2020;26:845–8.
Sun B, Feng Y, Mo X, Zheng P, Wang Q, Li P, et al. Kinetics of SARS-CoV-2 specific IgM and IgG responses in COVID-19 patients. Emerg Microbes Infect. 2020;9(1):940–8.
Ma H, Zeng W, He H, Zhao D, Jiang D, Zhou P, et al. Serum IgA, IgM, and IgG responses in COVID-19. Cell Mol Immunol. 2020;17(7):773–5.
Esmat K, Jamil B, Kheder RK, Kombe Kombe AJ, Zeng W, Ma H, et al. Immunoglobulin A response to SARS-CoV-2 infection and immunity. Heliyon. 2024;10(1):e24031.
Wang X, Zhang J, Wu Y, Xu Y, Zheng J. SIgA in various pulmonary diseases. Eur J Med Res. 2023;28(1):299.
Sepulveda MA, Garrett FE, Price-Whelan A, Birshtein BK. Comparative analysis of human and mouse 3’ Igh regulatory regions identifies distinctive structural features. Mol Immunol. 2005;42:605–15.
Cianci R, Mancino G, Galli E, Serone E, Massoud R, D’Addabbo P, et al. New insight of human-IgH 3’regulatory regions in immunoglobulins switch. Gene. 2023;862:147254.
Hnisz D, Abraham BJ, Lee TI, Lau A, Saint-André V, Sigova AA, et al. Super-enhancers in the control of cell identity and disease. Cell. 2013;155(4):934–47.
Ahrenfeldt LJ, Otavova M, Christensen K, Lindahl-Jacobsen R. Sex and age differences in COVID-19 mortality in Europe. Wien Klin Wochenschr. 2020;133(7–8):393–8.
Mauvais-Jarvis F, Klein SL, Levin ER. Estradiol, progesterone, immunomodulation, and COVID-19 outcomes. Endocrinology. 2020;161(9):bqaa127.
Takahashi T, Ellingson MK, Wong P, Israelow B, Lucas C, Klein J, et al. Sex differences in immune responses that underlie COVID-19 disease outcomes. Nature. 2020;588(7837):315–20.
Klein SL, Flanagan KL. Sex differences in immune responses. Nat Rev Immunol. 2016;16(10):626–38.
Colucci M, Frezza D, Gambassi G, De Vito F, Iaquinta A, Massaro MG, et al. Functional associations between polymorphic regions of the human 3’IgH locus and COVID-19 disease. Gene. 2022;838:146698.
Hurwitz JL, Penkert RR, Xu B, Fan Y, Partridge JF, Maul RW, et al. Hotspots for vitamin-steroid-thyroid hormone response elements within switch regions of immunoglobulin heavy chain loci predict a direct influence of vitamins and hormones on B cell class switch recombination. Viral Immunol. 2016;29(2):132–6.
Jodice C, Malaspina P, Ciminelli BM, Martinez-Labarga C, Biancolella M, Novelli G, et al. Variation of the 3’RR1 HS1.2 enhancer and its genomic context. Genes (Basel). 2024;15(7):856.
The 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature. 2015;526(7571):68–74.
Excoffier L, Lischer HE. Arlequin suite ver 3.5: a new series of programs to perform population genetics analyses under Linux and Windows. Mol Ecol Resour. 2010;10(3):564–7.
Stephens M, Smith NJ, Donnelly P. A new statistical method for haplotype reconstruction from population data. Am J Hum Genet. 2001;68(4):978–89.
Erichson NB, Zheng P, Manohar K, Brunton SL, Kutz JN, Aravkin AY. Sparse principal component analysis via variable projection. arXiv. 2018;1804.00341.
Wenham C, Smith J, Morgan R. COVID-19: the gendered impacts of the outbreak. Lancet. 2020;395(10227):846–8.
Yang X, Yu Y, Xu J, Shu H, Xia J, Liu H, et al. Clinical course and outcomes of critically ill patients with SARS-CoV-2 pneumonia in Wuhan, China: a single-centered, retrospective, observational study. Lancet Respir Med. 2020;8(5):475–81.
Manry J, Bastard P, Gervais A, Le Voyer T, Rosain J, Philippot Q, et al. The risk of COVID-19 death is much greater and age dependent with type I IFN autoantibodies. Proc Natl Acad Sci USA. 2022;119(21):e2200413119.
Cavalli-Sforza LL, Menozzi P, Piazza A. The history and geography of human genes. Princeton, N.J.: Princeton University Press; 1994.
Mackay TFC, Stone EA, Ayroles JF. The genetics of quantitative traits: challenges and prospects. Nat Rev Genet. 2009;10(8):565–77.
Vázquez-Coto D, Kimball C, Albaiceta GM, Amado-Rodríguez L, García-Clemente M, Gómez J, et al. Immunoglobulin genes and severity of COVID-19. Immunogenetics. 2024;76(3):213–7.
Noguera-Castells A, Parra J, Davalos V, García-Prieto CA, Veselinova Y, Pérez-Miés B, et al. Epigenetic fingerprint of the SARS-CoV-2 infection in the lung of lethal COVID-19. Chest. 2024;165(4):820–4.
Wang R, Lee JH, Kim J, Xiong F, Hasani LA, Shi Y, et al. SARS-CoV-2 restructures host chromatin architecture. Nat Microbiol. 2023;8(4):679–94.
Kocher K, Bhattacharya S, Niforatos-Andescavage N, Almalvez M, Henderson D, Vilain E, et al. Genome-wide neonatal epigenetic changes associated with maternal exposure to the COVID-19 pandemic. BMC Med Genom. 2023;16(1):268.
D’Addabbo P, Scascitelli M, Giambra V, Rocchi M, Frezza D. Position and sequence conservation in Amniota of polymorphic enhancer HS1.2 within the palindrome of IgH 3’Regulatory Region. BMC Evol Biol. 2011;11:71.
Hernández-Doño S, Sánchez-González RA, Trujillo-Vizuet MG, Zamudio-Castellanos FY, García-Silva R, Bulos-Rodríguez P, et al. Protective HLA alleles against severe COVID-19: HLA-A*68 as an ancestral protection allele in Tapachula-Chiapas, Mexico. Clin Immunol. 2022;238:108990.
Matic S, Milovanovic D, Mijailovic Z, Djurdjevic P, Sazdanovic P, Stefanovic S, et al. It’s all about IFN-λ4: protective role of IFNL4 polymorphism against COVID-19-related pneumonia in females. J Med Virol. 2023;95(10):e29152.
David A, Parkinson N, Peacock TP, Pairo-Castineira E, Khanna T, Cobat A, et al. A common TMPRSS2 variant has a protective effect against severe COVID-19. Curr Res Transl Med. 2022;70(2):103333.
Fragoso JM, Vargas-Alarcón G, Martínez-Flores ÁE, Montufar-Robles I, Barbosa-Cobos RE, Rojas-Velasco G, et al. ELANE rs17223045C/T and rs3761007G/A variants: protective factors against COVID-19. Biomol Biomed. 2024;24(3):665–72.
Yousfi FZE, Haroun AE, Nebhani C, Belayachi J, Askander O, Fahime EE, et al. Prevalence of the protective OAS1 rs10774671-G allele against severe COVID-19 in Moroccans: implications for a North African Neanderthal connection. Arch Virol. 2024;169(5):109.
Cianci R, Giambra V, Mattioli C, Esposito M, Cammarota G, Scibilia G, et al. Increased frequency of Ig heavy-chain HS1,2-A enhancer *2 allele in dermatitis herpetiformis, plaque psoriasis, and psoriatic arthritis. J Invest Dermatol. 2008;128(8):1920–4.
Frezza D, Tolusso B, Giambra V, Gremese E, Marchini M, Nowik M, et al. Polymorphisms of the IgH enhancer HS1.2 and risk of systemic lupus erythematosus. Ann Rheum Dis. 2012;71(8):1309–15.
Cianci R, Lolli S, Pagliari D, Gambassi G, Frosali S, Marmo R, et al. The involvement of IgH enhancer HS1.2 in the pathogenesis of Crohn’s disease: how the immune system can influence a multifactorial disease. Eur Rev Med Pharmacol Sci. 2016;20(17):3618–27.
Montesano C, Giambra V, Frezza D, Palma P, Serone E, Gattinara GC, et al. HS1,2 Ig enhancer alleles association to AIDS progression in a pediatric cohort infected with a monophyletic HIV-strain. Biomed Res Int. 2014;2014:637523.
Casanova JL, Abel L. The microbe, the infection enigma, and the host. Annu Rev Microbiol. 2024;78(1):103–24.
Casanova JL, Abel L. From rare disorders of immunity to common determinants of infection: following the mechanistic thread. Cell. 2022;185(17):3086–103.
Okada S, Asano T, Moriya K, Boisson-Dupuis S, Kobayashi M, Casanova JL, et al. Human STAT1 gain-of-function heterozygous mutations: chronic mucocutaneous candidiasis and type I interferonopathy. J Clin Immunol. 2020;40(8):1065–81.
Abel L, Casanova JL. Human determinants of age-dependent patterns of death from infection. Immunity. 2024;57(7):1457–65.
Lippi G, Mattiuzzi C, Sanchís-Gomar F, Henry BM. Clinical and demographic characteristics of patients dying from COVID-19 in Italy vs China. J Med Virol. 2020;92(10):1759–60.
Nikolich-Žugich J. Aging of the T cell compartment in mice and humans: from no naive expectations to foggy memories. J Immunol. 2014;193(6):2622–9.
Funding
This study was supported by the HORIZON-HLTH-2021-DISEASE-04 program under grant agreement no. 101057100 (UNDINE).
Author information
Contributions
Conceptualization of the work: AndN, GN. Performed the experiments: CJ, PM, BMC. Provided reagents (samples): MB, VLC, AL, FL, PR, AntN. Wrote the paper: AndN, CJ, GN, PM. All authors read and approved the final version of the manuscript.
Ethics declarations
Ethics approval and consent to participate
Biological samples used in this study were collected according to the ethical procedures of the GEFACOVID2.0 research program active at the Tor Vergata Hospital. This program ensures that the work is carried out with the highest regard for ethical issues and for the rights, integrity, and privacy of patients. All procedures for consent, storage of material and information, and distribution have been approved by the local Ethics Committees (CEI PTV protocol no. 50/20). SARS-CoV-2-positive patients who are offered participation in a research study sign an informed consent form prepared ad hoc, which provides detailed information on the type of test, the implications of the genetic results, and the possible psychosocial implications.
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Malaspina, P., Jodice, C., Ciminelli, B.M. et al. Genetic diversity of the immunoglobulin heavy chain locus in cohorts of patients affected with SARS-CoV-2. Hum Genomics 19, 7 (2025). https://doi.org/10.1186/s40246-025-00719-8