INTRODUCTION
Severe combined immunodeficiency (SCID) is a heterogeneous group of genetic diseases characterized by severe T cell lymphopenia, causing increased susceptibility to viral, bacterial, and fungal infections since early in life (1). Most forms of SCID are due to genetic defects that are intrinsic to hematopoietic cells and can be successfully treated by allogeneic hematopoietic stem cell transplantation (HSCT). However, SCID may also be caused by genetic abnormalities that are intrinsic to thymic epithelium development and function; in such cases, thymus transplantation, but not hematopoietic cell transplantation, is required to cure the disease. Only a few genetic abnormalities, including complete DiGeorge syndrome, and pathogenic variants affecting FOXN1 or CHD7, are known to cause SCID as a result of abnormal thymic development in humans (1).
PAX1 is a member of the paired box (PAX) family of transcription factors and plays a critical role in pattern formation during embryogenesis. It is expressed in the pharyngeal pouches that give rise to the thymus, tonsils, parathyroid glands, thyroid, and middle ear development during human embryogenesis (2). Pax1 deficiency in mice is characterized by anomalies of the vertebral column and variable degrees of thymic hypoplasia and thymocyte number and maturation (35). In humans, a homozygous pathogenic PAX1 p.Gly166Val variant (6) and a homozygous frameshift insertion (c.1173_1174insGCCCG) (7) have been identified in patients with otofaciocervical syndrome type 2 (OTFCS2), a rare disorder characterized by facial dysmorphism, external ear anomalies with preauricular pits and hearing impairment, branchial cysts or fistulas, anomalies of the vertebrae and the shoulder girdle, and mild intellectual disability. Recently, another homozygous pathogenic PAX1 variant (p.Cys368*) has been reported in two affected children from a consanguineous family of North African descent, who presented with OTFCS2 associated with T B+ SCID (8). However, limited information was provided on the immunological phenotype of these patients, and the functional consequences of the PAX1 variant were not investigated. Here, we provide an in-depth clinical, biochemical, and immunological description of multiple patients with OTFCS2 associated with SCID who carried biallelic deleterious PAX1 variants. By performing transfection experiments, molecular modeling, molecular dynamics (MD) simulation, and in vitro differentiation of control- and patient-derived induced pluripotent stem cells (iPSCs) to thymic epithelial progenitor (TEP) cells, we sought to assess the effects of human PAX1 deficiency on thymus development and function.
Patient 1 (P1) is a male infant born to parents whose families were from the same rural region in Germany (Fig. 1A). Bilateral microtia, malar prominence, narrow alae nasi, cupid bow lip, and retrognathia were noticed at birth (fig. S1, A and B). Imaging studies demonstrated severely stenotic external auditory canal on the right side and narrow left auditory canal (fig. S1C), congenital kyphosis at C3-C4 and L3 levels, moderate spinal canal narrowing (fig. S1, D to F), and traction on the cauda equina (fig. S1G). Diffuse erythematous rash (fig. S1H), lymphadenopathy, elevated serum immunoglobulin E (IgE), and eosinophilia were present, consistent with Omenn syndrome. On chest x-ray, the thymus shadow was not visible, and split cervical vertebral bodies, hooked distal clavicles, and a shallow dysplastic glenoid fossa were seen (fig. S1I). This infection history during infancy included Staphylococcus aureus bacteremia, pneumonia, cellulitis, and diarrhea due to Clostridium difficile.
(A) Pedigrees and results of Sanger sequencing in patients with PAX1 variants and in healthy controls. For both family A and family B, results of Sanger sequencing in the heterozygous parents are also shown. (B) Schematic representation of the PAX1 protein and location of the variants identified in affected individuals.
P2 and P3 have been previously described (8) as patients V:1 and V:18, respectively, and are part of a large consanguineous family of Moroccan origin (Fig. 1A). At birth, P2 was noticed to have frontal and parietal bossing, hypertelorism, small nose with hypoplastic nasal root, low-set ears with agenesis of the left pinna and hypoplasia of the right pinna, scapular winging, and bilateral cryptorchidism. Imaging studies showed impaired development of internal auditory canals bilaterally and lack of a thymic shadow. P3 manifested similar facial dysmorphisms as P2, along with left facial nerve palsy, severe dorsal and lumbar scoliosis, and deafness. Imaging studies documented lack of thymic shadow, abnormal appearance of vertebrae, clavicles and shoulder blades, narrowing of both external auditory canals (fig. S1J), abnormalities of the middle ear, and presence of tubular structures with features of a dental element behind the mandibular condyle (fig. S1, K and L). Subject V:3 from the same family died early in life with a history of severe infections, but no formal medical records are available.
P4 and P7 are siblings born to consanguineous parents from Saudi Arabia. P7 was noticed to have severe bilateral microtia, postauricular sinuses, and micrognathia. He suffered from chronic diarrhea, recurrent respiratory infections, exfoliative dermatitis, regional dissemination of Bacille Calmette-Guerin (BCG-itis), and lymphadenopathy and died at 1 year of age.
P4 is a female with a history of chronic diarrhea, recurrent respiratory infections, and poor weight gain since the age of 1 month. Physical examination showed small malformed ears, a skin tag on the right ear, facial asymmetry, small nose with depressed nasal bridge, and small almond-shaped eyes. A skeletal survey showed wedge-shaped vertebral body at T11 and deficient posterior element of the sacrum at S4 and S5.
P5 and P6 were siblings born to consanguineous parents and belonged to the same extended family as P4 and P7. P5 had small, low-set malformed ears, triangular mouth, down-slanting palpebral fissures, a small nose with a depressed nasal bridge, and right facial palsy. She developed recurrent respiratory infections, chronic diarrhea, severe exfoliative dermatitis, and BCG-itis and was diagnosed with Omenn syndrome. She died at 8 months of age with progressive severe pneumonitis.
P6 was screened for immunodeficiency at birth because of the positive family history. She had malformed and low-set small ears, small chin, protruding forehead, and generalized eczema. A skeletal survey showed central depression of the vertebral bodies in the thoracic and lumbar spine. Her immunological workup was consistent with T B+ NK+ (natural killerpositive) SCID. She suffered from recurrent respiratory infections and chronic diarrhea and died at 9 months of age with respiratory syncytial virus (RSV) pneumonia.
The main immunological findings at presentation in P1 to P6 are shown in Table 1. In particular, P1 had significant T cell lymphopenia. His CD4+ lymphocytes were largely (98%) CD45R0+, no CD4+ CD45RA+ CD31+ cells were detected, and T cell proliferation to phytohemagglutinin (PHA) was impaired (fig. S2A). T cell receptor (TCR) excision circles (TRECs) were below the limit of detection, indicating lack of thymopoiesis. TCR V spectratyping revealed T cell oligoclonality (fig. S2B). Elevated serum IgE and eosinophilia were present, consistent with an Omenn syndrome presentation.
AEoC, absolute eosinophil count; ALC, absolute lymphocyte count; ANC, absolute neutrophil count; n.d.: not done; cpm, counts per minute.
Laboratory investigations in P2 at 2 weeks of age revealed profound T cell lymphopenia, markedly reduced proliferative response to mitogens, and increased serum IgE. An inguinal lymph node biopsy showed severe lymphoid depletion, with primary follicles without germinal centers, associated with nearly complete absence of CD3+ T cells, but presence of B and NK cells and sparse plasma cells, and increased number of CD68+ histiocytes and eosinophils (fig. S3). A diagnosis of T B+ NK+ SCID was established.
Severe T cell lymphopenia was observed in P3, P4, and P6, associated with virtually absent in vitro T cell proliferation to PHA in P4 and P6, consistent with a diagnosis of T B+ NK+ SCID (Table 1). Last, P5 was diagnosed as having Omenn syndrome based on generalized erythroderma, lymphocytosis, eosinophilia, hypogammaglobulinemia, increased IgE, and severely reduced in vitro T cell proliferation to PHA.
Because of severe immunological abnormalities, HSCT was attempted in P1 to P4 before the gene defect was known. Details of transplant, chimerism, and immune reconstitution are shown in Table 2. In all cases, a conditioning regimen was used. Two patients (P1 and P4) attained full donor chimerism. P2 failed to engraft, developed interstitial pneumonitis, and died 5.5 months after HSCT. In P3, initial engraftment was followed by secondary graft failure, and a second HSCT was performed, resulting in mixed chimerism. Although three of the patients attained either full or mixed donor chimerism, none of them achieved reconstitution of the T cell compartment. In P1, who exhibits full donor chimerism, all T cells have a CD45R0+ phenotype and therefore likely represent donor-derived T cells contained in the graft that have undergone peripheral expansion. P3 attained mixed chimerism but remained with persistent severe T cell lymphopenia. She developed Pneumocystis jiroveci pneumonia, recurrent gastrointestinal infections, and liver failure and died of septic shock at the age of 4 years and 7 months. P4, who attained full chimerism but failed to reconstitute T cells, developed severe autoimmune hemolytic anemia, requiring multiple courses of rituximab and immunosuppressive therapy. Together, these data indicate that HSCT was unable to correct the profound T cell immunodeficiency of this disease.
ATG, anti-thymocyte globulin; PBSC, peripheral blood stem cells; URD, unrelated donor.
Before HSCT, karyotype analysis revealed no cytogenetic abnormalities in P1, P2, and P3. No evidence for copy number variation (CNV) was found by chromosomal microarray analysis in P1, and search for 22q11 deletion in P2 by in situ fluorescence hybridization was negative. No pathogenic variants in any of the known SCID-causing genes were identified in P4 by a targeted next-generation sequencing primary immunodeficiency gene panel. In an attempt to define the molecular mechanisms of the disease, whole-exome sequencing (WES) was performed in P1, P2, and P4 independently (fig. S4 and table S1). In P1, a total of 153,376 variants were identified. Assuming autosomal recessive inheritance, and upon filtering for homozygous, rare, nonsynonymous changes in coding regions and splice sites, 38 variants were considered. Among these, functional annotation identified the PAX1 NM_006192.3 c.463_465del variant, predicted to cause an in-frame deletion of asparagine at position 155 (p.Asn155del) of the PAX1 protein, as the most likely cause of the disease. In P2, 87,423 variants were detected. Assuming an autosomal recessive inheritance, and upon filtering for homozygous, nonsynonymous, and rare (minor allele frequency < 0.01) variants falling in coding regions or splice sites, 18 such variants were considered. Functional filtering of these revealed the PAX1 c.1104C>A variant, predicted to cause a premature termination at codon 368 (p.Cys368*), as the most likely cause of the disease. In P4, 60,772 variants were detected. Upon filtering for homozygous, nonsynonymous, rare (in-house Saudi variant database <0.005) variants, which were restricted to exonic or splice sites, contained in an autozygome region identified on chromosome 20 by high-density genotyping, and shared with P5 and P6, only two variants were identified, including the PAX1 c.439G>C variant, predicted to cause a p.Val147Leu amino acid change.
Sanger sequencing confirmed homozygosity for the suspected pathogenic PAX1 variants in P1 to P6 (Fig. 1A). The Val147 and the Asn155 amino acid residues are in the DNA-binding paired box domain, and the Cys368 residue is in the transactivation domain of the PAX1 protein (Fig. 1B). All these positions are evolutionarily conserved (fig. S5). The scaled CADD (combined annotation dependent depletion) score (CADD-Phred) for the p.Val147Leu, p.Asn155del, and p.Cys368* variants is 28.1, 21.2, and 38, respectively, significantly higher than the mutation significance cutoff (MSC) score (9), which for the PAX1 gene is 12.06. Together, these data strongly support a pathogenic role of the PAX1 variants identified. Of note, while molecular and cellular studies to confirm the pathogenic role of the PAX1 variants were under way, another group independently attempted WES in P3 and in other family members (but not in P2) and reported the occurrence of the p.Cys368* variant in P3 (8).
To examine the effects of the PAX1 variants at the protein level, we transfected 293T cells with plasmids encoding for either wild-type (WT) or mutant PAX1 complementary DNA (cDNA) and analyzed protein expression by Western blot. In this assay, we also included the PAX1 p.Gly166Val variant, which had been previously reported in a patient with OTFCS2 (6). As shown in Fig. 2A, all mutant proteins were expressed at similar levels as WT PAX1, with the p.Cys368* mutant migrating as a lower molecular weight product, as predicted. To check whether the identified variants altered the subcellular localization of the PAX1 protein, 293T cells were transfected with PAX1 constructs with an N-terminal HA tag, and immunofluorescence was performed with tetramethyl rhodamine isothiocyanate (TRITC)conjugated anti-HA antibody. As shown in Fig. 2B, both WT and mutant PAX proteins were detected in the nucleus, indicating that these variants do not affect subcellular localization.
(A) Western blot showing expression of WT and mutant human PAX1 proteins upon transient transfection in 293T cells. (B) Left: Intracellular protein localization upon transfection of HA-tagged WT and mutant PAX1 constructs into 293T cells, followed by staining with TRITC anti-HA. Right: Counterstaining with DAPI, demonstrating that the mutant PAX1 protein retains nuclear translocation capacity. Scale bar, 10 m. (C) Results of a luciferase reporter assay demonstrating reduced transcriptional activity of mutant PAX1 proteins, corresponding to the PAX1 variants detected in patients. The promoter region of Nkx3-2 was used to drive luciferase expression. Results of six independent experiments (each run in triplicate) are shown (means SEM). P value was calculated with one-way ANOVA and adjusted by Dunnetts multiple comparisons test. **P < 0.01; ***P < 0.0001.
Next, we tested the transcriptional activity of the PAX1 mutant proteins. Little is known on transcriptional targets of human PAX1; however, the Nkx3-2 promoter has been identified as a PAX1 target in mice (10). Therefore, we generated a reporter system in which luciferase expression is driven by the mouse Nkx3-2 promoter. In parallel, we generated both WT (Pax1WT) and mutant (Pax1Val138Leu, Pax1Asn146del, Pax1Cys359*, and Pax1Gly157Val) N-terminal HA-tagged mouse Pax1 constructs, which encode for mouse mutant PAX1 proteins corresponding to the human p.Val147Leu, p.Asn155del, p.Cys368*, and p.Gly166Val variants, respectively. Western blot analysis confirmed that the mutant mouse PAX1 proteins were expressed at similar levels as WT PAX1 (fig. S6). Upon cotransfection of the Nkx3.2-luciferase reporter plasmid and of either WT or mutant PAX1 expression plasmids into 293T cells, analysis of luciferase activity showed that the p.Val138Leu, p.Asn146del, and p.Cys359* PAX1 mutant proteins had significantly reduced reporter expression when compared with WT PAX1 (Fig. 2C and data file S1). A similar defect was also observed for the p.Gly157Val mutant, confirming previous findings (6). These data suggest that the human p.Val147Leu, p.Asn155del, and p.Cys368* variants do not affect protein stability or subcellular localization but alter PAX1 transcriptional activity.
The structure of the human PAX1 protein has not been solved experimentally. However, a crystal structure is available for the paired box domain of the highly homologous PAX6 protein (11). Sequence alignment between the paired box domain of PAX6 and PAX1 proteins reveals a high level of conservation with a similarity of 71%, with a 100% coverage of the region to be modeled as calculated with the BLOSUM80 matrix from PSI-BLAST (E = 1.3691 1020). As reported by Kelm et al. (12), this degree of homology often yields a model for the target (PAX1) with an accuracy of less than 1 root mean square deviation (RMSD) of atomic mobility to the experimentally solved structure of the template (PAX6). Because the p.Val147Leu and p.Asn155del mutants fall within the paired box domain of the protein, we assessed whether the reduced functional activity of the mouse p.Val138Leu and p.Asn146del (and by inference, the human p.Val147Leu and p.Asn155del) variants results from an altered structure and/or abnormal DNA binding. To do this, we first developed a structural model of the paired box domain of WT and mutant PAX1 bound to DNA, based on its homology to the published crystal structure of PAX6 [Protein Data Bank (PDB): 6PAX] (11) by the satisfaction of spatial restraints method using Modeler (13). Structural alignment revealed that the paired box domains of the PAX1 and PAX6 proteins are almost identical with a template modeling (TM) score of 0.99963 and RMSD of 0.08 as measured by the TM align algorithm (14). In addition, the high quality of the model is reflected by the fact that 99% of the residues are in the allowed regions of the (phi) versus (psi) angles of the Ramachandran plot, as shown in fig. S7 (15). Therefore, we used this model to derive a corresponding model for the p.Val147Leu and p.Asn155del variants and for the previously described p.Gly166Val PAX1 variant (6), using in silico site-directed mutagenesis and energy minimization refinement as previously described (16). As shown in Fig. 3A, the paired box domain of all three mutant PAX1 proteins retains a structure composed of two globular domains separated by a linker. These structural models were then used in MD simulations for both their free and DNA-bound forms to define how they differ in both structure and time-dependent dynamic behavior from the canonical WT PAX1 protein.
(A) Molecular modeling of the paired box domain of WT and mutant PAX1 proteins, showing the presence of two globular domain separated by a linker. Note that the asparagine residue at position 155 is adjacent to linker domain, and its deletion results in shortening of the last turn of the third helix in the first globular domain of the paired box domain. (B) Molecular superimposition of WT (in light blue) and mutant PAX1 variants after MD simulation, showing that both the Val147Leu and Asn155del variants predominantly affect the conformation of the C-terminal globular domain, whereas both globular domains are affected by the Gly166Val variant. (C) RMSF values of WT PAX1 and of the Val147Leu, Asn155del, and Gly166Val variants during MD simulations. RMSF values are used here as a measure of the flexibility of different regions of the protein during the MD simulations. The Y axes indicate the magnitude of the fluctuation, whereas the X axes indicate the specific location of each amino acid within the paired box domain.
Because the p.Val147Leu variant is located in the first globular domain, the p.Asn155del is also located in this domain and adjacent to the highly flexible linker, and the p.Gly166Val variant is within the linker, we initially performed 200-ps MD simulations of PAX1 in the absence of DNA to capture potential alterations of the rapid movement of this region of the protein in relationship to the N- and C-terminal helix-loop-helix domains. To gain additional insights into the behavior of the protein, we extended these simulations to 10 ns, in the absence or presence of DNA. When a harmonic restraint is applied to reduce the conformational changes in both globular domains during the 200-ps simulation, the linker is observed to move freely. In this situation, the molecular movement of WT PAX1 paired box domain resembles a barbell-shaped harmonic oscillator, where the globular domains move relative to each other without forming bonds that lock them together in space.
At the end of the 200 ps, in the absence of DNA, the linker of PAX1 shortens and the protein populates a conformational landscape where the globular domains come in close proximity to each other, with the linker located between the N-terminal helix 3 (H3) and the C-terminal helix 1 (H1), respectively (fig. S8). In the most extended conformation of the linker, the interglobular domain distance measured from the Gly158 -C to the Pro175 -C shortens from an original 38.946 to 21.414 (SD = 2.421, P = 0.0001). This shortening contributes to the differences in the RMSD curve, where in the first part of the simulation we observed significant changes due to this shortening, whereas the difference in conformational sampling decreases toward the end of the run. Identical results were obtained in 10-ns simulations. Thus, this H3-Linker-H1 state is likely the one that the PAX1 binding domain adopts when in conformational equilibrium before binding to DNA. In this manner, the linker would be free to contact the minor groove of the DNA and extend in a manner that allows the positioning of both globular domains for full binding. These results led us to set up simulations that would enable gathering information on potential differences in DNA binding among the WT and mutant PAX1 variants.
To investigate whether alterations in the structure or the dynamics of the PAX1 variants have the potential to affect the protein function as a transcription factors, we modeled these proteins in complex with DNA. For this purpose, we again used the bound form of PAX6 as a template. Figure S9 shows the energy-minimized structure of these models before MD simulations. Because the variants identified in the patients either change the sequence of the linker (p.Gly166Val) or the N-terminal globular domain (p.Val147Leu and p.Asn155del), we compared the structures of these variants with WT PAX1 after MD simulation. Because the structure of the DNA interacting with WT or mutant PAX1 proteins was the same in all models shown in fig. S9, we removed it to facilitate the observation of changes that occur in the PAX1 polypeptide chain. When compared with WT PAX1, the p.Val147Leu and the p.Asn155del variants associated with OTFCS2 + SCID differ in particular at the C-terminal second globular domain, as shown by molecular superimposition (Fig. 3B). This result is consistent with the measured root mean square fluctuation (RMSF) values, which shows that the second globular domain is highly flexible in the p.Val147Leu and p.Asn155del mutant proteins (Fig. 3C). By contrast, RMSF values in the first globular domain were lower in all mutant proteins (and especially so in the p.Asn155del and p.Gly166Val mutants) as compared with WT PAX1. Considering these changes, we evaluated potential alterations in the ability of these proteins to recognize and bind to DNA in silico. For this purpose, we analyzed the PAX1-DNA interface. As shown in Fig. 4, as compared with WT PAX1, a lower number of amino acid residues contacting DNA were present within the paired box domain of the p.Val147Leu and p.Asn155del PAX1 mutants. These alterations are more pronounced for the C-terminal region of the domain, which contacts the 3 half of the oligonucleotide and is necessary to maintain appropriate binding to DNA. This altered pattern of interaction with DNA observed in silico may contribute to the altered transcriptional activity of the PAX1 mutant proteins.
Nucleotide residues, in which the paired box domain of either WT or PAX1 mutant proteins establishes interaction, are shown in black. The amino acids contacting nucleotides of target DNA are indicated on the Y axis for each PAX1 protein. The red and green colors indicate loss and gain of DNA binding, respectively.
To gain insights into how pathogenic PAX1 variants may perturb the developmental program of thymic epithelial cells (TECs), we reprogrammed fibroblasts from a healthy control, P1, and P4 to iPSCs and subsequently differentiated these to TEP cells using a previously published protocol (17) with some modifications (see Materials and Methods). Quantitative real-time polymerase chain reaction (qRT-PCR) showed a comparable stemness profile in both control and patient iPSCs (fig. S10), and cytogenetic analysis confirmed their karyotypic integrity. iPSCs were then exposed in vitro to a cocktail of growth factors and molecules that provide essential cues to allow differentiation into definitive endoderm (DE) and eventually into TEP cells (fig. S11A).
To assess changes in the gene expression profile of cells during differentiation, we performed RNA sequencing (RNA-seq) in control cells collected in triplicate at iPS [day 0 (d0)], DE (d5), and TEP (d14) stages of cell differentiation. For each condition, between 15 and 20 million reads were obtained per well. As shown in fig. S11B, during differentiation of control iPSCs to DE and TEPs, we observed progressive changes of gene expression profile, with increased expression of stemness (OCT4, MYC, SOX2, TERT, DNMT3B, and NANOG), endoderm (EOMES, CXCR4, and SOX17), and epithelial (KRT8, CLDN1, EPCAM, LAMA1, and KRT19) genes at iPS, DE, and TEP stages, respectively. In addition, expression of ASXL1, HES1, SHH, GATA3, HOXA3, PSEN1, ZBTB1, HAND2, and MAFB genes, which are all part of the gene set Thymus development, was up-regulated at TEP stage (fig. S11B). Gene set enrichment analysis (GSEA) confirmed differential expression of genes involved in somatic cell maintenance and endoderm development, as well as in other pathways related to differentiation of tissues derived from the third and fourth pharyngeal pouches (fig. S11C).
To assess the reproducibility of the differentiation protocol, we differentiated the same control iPS line twice to TEP cells (named C1 and C2, respectively) in parallel to differentiation of P1 and P4 iPSCs to TEP cells in two distinct differentiation experiments. As shown in Fig. 5A, a similar pattern of changes in the gene expression profile was observed when differentiating control (C1) and P1 iPSCs or control (C2) and P4 iPSCs to TEP cells. In both experiments, control and patient cells showed increased expression of stemness genes at the iPS stage, whereas enhanced expression of epithelial marker genes and of other genes included in the Thymus development gene set was detected at TEP stage. Furthermore, immunohistochemistry analysis confirmed that both control and P1 TEP cells expressed cytokeratin 8 (KRT8), a marker of TECs (fig. S12) (18).
(A) Heatmap of differentially expressed genes between iPS and TEP stage as determined by RNA-seq. Each heatmap shows the top 3000 genes, which were differentially expressed between iPS and TEP cells, with a significance (q < 0.01) by the two-group comparison (t test). Genes whose expression was found to be up-regulated at the TEP stage included epithelial cell markers (EPCAM, KRT8, and KRT19) as well as several genes (PSEN1, HES1, ASXL1, HOXA3, HAND2, EPHB3, and GATA3), which appeared at the leading edge of GSEA of thymus development in (B). (B and C) GSEA on thymus development gene set by preranked genes according to signed log10 adjusted P value. The adjusted P value was acquired by DEseq2 analysis using normalized read count of RNA seq data. FDR, false discovery rate. (D) qRT-PCR analysis of FOXN1 and DLL4 expression at TEP stage of differentiation. Results are from five independent experiments for control and P1, and four independent experiments for control and P4, with triplicates in each case (mean SEM). The P value was calculated with two-tailed paired t test. P < 0.05 was considered to be significant. (E) Thymus development genes with evidence of differential expression between patient and control cells (adjusted P < 0.1 and concordant pattern of expression in both RNA-seq experiments). For this comparison, we considered genes that were part of the Thymus development gene set in MSigDB v7.0, and in the top 30 FOXN1 target genes reported in (19). The values displayed are the signed log10 adjusted P value for differential expression.
GSEA confirmed that upon differentiation of control iPSCs to TEP cells, genes involved in thymus development were more abundantly expressed at the TEP stage both in control and in PAX1 mutant cells (Fig. 5B). Despite similar changes in gene expression profile during differentiation of control- and patient-derived iPSCs to TEP cells, GSEA demonstrated that genes involved in thymus development were more abundantly expressed in control than in patient TEP cells (Fig. 5C). To gain additional mechanistic insights into the severe T cell immunodeficiency of P1 and P4, we performed multiple rounds of differentiation of control and patient iPSCs to TEP cells (five times for control and P1 and four times for control and P4 cells, respectively) and used qRT-PCR to analyze the expression of FOXN1, a master regulator of TEC development (19, 20), and to its target DLL4, a Notch ligand that plays a critical role in T cell commitment (21). FOXN1 expression was significantly reduced in P1 and P4 TEPs as compared with control cells, and a similar trend was observed for DLL4, although the latter significance was reached only when comparing P1 with control TEPs (Fig. 5D and data file S1). Analysis of RNA-seq data revealed several other genes that showed concordantly reduced expression in P1 and P4 TEPs versus control TEPs, reaching statistical significance in at least one of the patients TEP lines (Fig. 5E and table S2). These included STC2, CD83, ZAR1, and ANKMY1, which are known FOXN1 target genes (19); TP63, a regulator of TEC proliferation and aging (22, 23); BMP4, which has been implied in thymus development (24, 25) and in maintenance of TEPs (26, 27); and EYA1 and PAX9, which are involved in patterning of pharyngeal endoderm (28, 29). Together, these data indicate that multiple mechanisms contribute to the thymic defects associated with PAX1 deficiency. Consistent with this, and with the syndromic features manifested by the patients, we observed that several genes included in the Neural crest cell differentiation, Ear development, Cartilage development, Pharyngeal system development, and Skeletal system development gene sets also manifested differential expression in P1 and P4 versus control TEPs (fig. S13).
We have studied six patients from three unrelated families in whom biallelic, loss-of-function PAX1 variants underlie a clinical phenotype characterized by OTFCS2 and severe T cell immunodeficiency. The first example of a biallelic, rare PAX1 variant (p.Gly166Val) in a patient with autosomal recessive OTFCS2 was provided by Pohl et al. (6), who also showed reduced transcriptional activity of the mutant PAX1 protein. However, no data on the patients immunological phenotype were provided. More recently, Patil et al. (7) have described two siblings with a homozygous frameshift PAX1 variant causing OTFCS2; one of them lacked a thymic shadow on chest x-ray. Last, the clinical features of OTFCS2 and SCID have been recently reported by Paganini et al. (8) in two of the patients studied here (P2 and P3), but no immunological or mechanistic characterization was provided.
Several mouse models of PAX1 deficiency, due to distinct variants in the Pax1 gene, have been described, including the undulated (un), undulated extensive (unex), undulated short-tail (unS), and undulated intermediate (un-i) models (30). All of these mutant strains display thymic abnormalities, which are more severe in the unS model (30); however, none of them results in complete athymia. A more profound phenotype, with lack of thymus and parathyroids, associated with craniofacial and skeletal abnormalities, has been observed in Pax9/ mice (31). No cases have been reported of humans with biallelic PAX9 pathogenic variants, and heterozygous PAX9 variants in humans are associated with hypodontia but not with thymic defects (32). Together, these data suggest that the impact of PAX1 and PAX9 on thymus development may be different in humans and mice.
To gain insights into the molecular mechanisms by which PAX1 deficiency may cause syndromic SCID in humans, we have first investigated the expression, subcellular localization, and transactivation activity of PAX1 mutant proteins using transient transfection and luciferase reporter studies. Although transient transfection may result in protein overexpression and therefore cannot be directly compared with protein expression in vivo, the PAX1 p.Val147Leu, p.Asn155del, and p.Cys368* mutant proteins retained the capacity to translocate to the nucleus, and the equivalent murine mutant proteins showed decreased transcription factor activity in vitro. Similar results were obtained for the PAX1 p.Gly166Val (and the mouse equivalent p.Gly157Val) variants, confirming previous observations (6). To further investigate the mechanisms underlying the impaired transcriptional activity of the mutant PAX1 proteins, we have performed structural modeling, using the crystal structure of the PAX6 paired box domain as a template. The results suggest that the structural behavior of the paired box domain (consisting of two globular domains interconnected by a linker) was retained in the p.Val147Leu, p.Asn155del, and p.Gly166Val mutants. MD simulation studies have demonstrated that these variants alter the flexibility of the paired box domain and are predicted to alter binding of PAX1 to its target DNA. Our in silico studies suggest that the mutants differ in their ability to gain or lose binding to distinct nucleotides, with possible impact on the severity of clinical and immunological phenotype. Fine characterization of the molecular mechanisms underlying such heterogeneity will require resolution of the crystal structure of the PAX1 paired box domain and precise identification of its human DNA target sequence(s).
By exposing control- and patient-derived iPSCs to defined differentiation cues, we have successfully differentiated iPSCs to TEPs. Comparison of gene expression profile in control- and patient-derived cells at the TEP stage of in vitro differentiation demonstrated altered expression of genes involved in thymus development in patient cells. In particular, qRT-PCR analysis revealed reduced expression of FOXN1, a master gene of thymus development, and of several FOXN1 target genes, including DLL4. Biallelic FOXN1 pathogenic variants in humans are responsible for a syndromic form of SCID that is the equivalent to what is observed in the nude mouse (33, 34). We have recently reported that FOXN1 haploinsufficiency in humans causes severe T cell lymphopenia at birth (35). The reduced levels of FOXN1 expression observed in patient TEPs (and, by inference, in the patients thymus) may therefore play a direct role in the severe T cell lymphopenia observed in these patients. However, analysis of gene expression profile in patient and control TEPs suggests that other mechanisms, besides reduced FOXN1 expression, may also contribute to impaired thymic development associated with PAX1 deficiency. In particular, reduced TP63 expression may cause impaired TEC proliferation and hence thymic hypoplasia. Moreover, we observed that both P1 and P4 TEPs displayed significantly reduced expression of BMP4 as compared with control TEPs. Conditional deletion of Bmp4 from the pharyngeal endoderm before Foxn1 expression disrupts thymus morphogenesis in mice (24). Furthermore, recent studies have indicated that BMP4 plays a critical role in maintenance of TEC progenitors (27), and reduced BMP4 expression might alter replenishment of the TEC compartment. Future studies based on precise enumeration of TEPs generated in vitro from patient- and control-derived iPSCs may help test this hypothesis. In any case, these data suggest that PAX1 deficiency causes early and more global effects on the development of tissues derived from the third and fourth pharyngeal pouches, including the thymus. Consistent with this hypothesis, patient TEPs were concordant in the abnormal expression of a number of genes involved in skeletal, cartilage, pharyngeal, neural crest, and ear development. Abnormalities in these pathways during differentiation of tissues derived from the third and fourth pharyngeal pouches are likely to contribute to the broad range of malformations observed in the patients reported here.
Last, we have reported that HSCT, which was attempted in four of the six patients, failed to correct the T cell immunodeficiency, despite engraftment in three of them. PAX1 deficiency should be added to the list of severe T cell immunodeficiencies characterized by a primary thymic defect, which also includes complete DiGeorge syndrome, CHARGE syndrome, and FOXN1 deficiency (1). Thymus transplantation represents the treatment of choice to correct the immunodeficiency in these disorders (3638). By contrast, use of unmanipulated HSCT may allow engraftment of donor-derived postthymic T cells that may expand in the recipient, as also observed in P1 in this study, but does not permit de novo generation of a polyclonal repertoire of nave T cells (39). In summary, we have provided mechanistic insights into the pathophysiology of OTFCS2 associated with severe T cell immunodeficiency, an autosomal recessive condition caused by PAX1 variants, and have demonstrated the thymic-intrinsic nature of the immunodeficiency of this condition.
The scope of the study was to identify the molecular basis of a syndromic form of SCID and to perform genomic, molecular, biochemical, structural modeling, and in vitro disease modeling studies to analyze deleterious effects of the PAX1 variants identified. All patients provided written informed consent, according to protocols approved by the local Institutional Review Boards (IRBs). Research studies were performed under National Institutes of Health (NIH) IRB-approved protocol 16-I-N139. For P4, public disclosure of secondary genomic findings was not permitted by the protocol and consent form approved by the local IRB.
WES was performed on P1 and his healthy parents and on P2 and P4 without parental samples. Detailed methods for capture, library preparation, and bioinformatic analysis are described in the Supplementary Materials. Candidate variants were confirmed by Sanger sequencing and described according to Human Genome Variation Society (HGVS) guidelines. For P1 and P2, WES data have been deposited to the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) Submission Portal, with the following ID: PRJNA601119.
Flow cytometry studies were performed on either a 10-color Gallios (Beckman Coulter, Brea, CA) or an 8-color Canto II (BD Biosciences, San Jose, CA) cytometer, and results were analyzed using Kaluza software v1.5 (Beckman Coulter, Brea, CA). T cell proliferation studies were performed using Edu-based (Thermo Fisher Scientific, Waltham, MA) flow cytometry method in P1, and tritiated thymidine (3HTdR) incorporation in P2, P4, P5, and P6. TCR V repertoire spectratyping was carried out using a fragment length method on a capillary electrophoresis system (ABI 3730xl DNA Sequencer, Applied Biosystems Inc., Thermo Fisher, Waltham, MA), and data were analyzed using the GeneMarker (v.2.4.0) software (SoftGenetics, State College, PA). All reference values for interpretation were established in the laboratory using healthy pediatric donors recruited via an IRB-approved protocol.
293T cells were plated as 4 105 cells per well in a 12-well plate. After 24 hours, cells were transfected with 1.2 g of pCMV-HA-N vector containing either WT or mutant PAX1 cDNAs, with the Lipofectamine 3000 transfection kit (Thermo Fisher Scientific) following the manufacturers instructions. After 24 hours, cells were collected, lysed, and transferred onto a nitrocellulose membrane. Immunoblotting was performed with rat anti-PAX1/Pax1 monoclonal antibody (mAb) (clone 5A2) (40), followed by staining with horseradish peroxidase (HRP)conjugated goat anti-rat IgG (ab97057; Abcam, Cambridge, MA). After stripping, the membrane was reblotted with rabbit anti-actin mAb (clone 13E5; Cell Signaling Technology, Danvers, MA), followed by Amersham enhanced chemiluminescence anti-rabbit IgG, HRP-linked whole antibody (NA934; GE Healthcare, Helsinki, Finland).
To analyze PAX1 subcellular localization, 293T cells were cultured in polylysine-coated -Slide 8 well (ibidi, Fitchburg, WI) and transfected with 100 ng of pCMV-HA-N vector containing either WT or mutant PAX1 cDNA, with the Lipofectamine 3000 transfection kit (Thermo Fisher Scientific) following the manufacturers instructions. After 24 hours, cells were fixed in 4% paraformaldehyde with phosphate-buffered saline (PBS) for 30 min at room temperature, washed twice in PBS, and then blocked for 1 hour with 10% donkey serum and 0.1% Triton X-100 with PBS at room temperature. Cells were incubated with mouse anti-HA-TRITC mAb (clone H9037; MilliporeSigma, St. Louis) diluted 1:200 in PBS and with 4,6-diamidino-2-phenylindole (DAPI) at room temperature for 1 hour in the dark. Images were obtained with a Leica SP8 (690/730) confocal microscope.
For immunofluorescence analysis of KRT8 expression by TEPs, cells were fixed in 4% paraformaldehyde with PBS for 30 min at room temperature, washed twice in PBS, blocked for 1 hour in 10% donkey serum and 0.1% Triton X-100 with PBS at room temperature, and incubated overnight at 4C with mouse anti-KRT8 antibody (ab2530, C-43) (Abcam, Cambridge, MA) diluted 1:200 in PBS, then for 1 hour at room temperature in the dark with donkey anti-mouse IgG (H+L) Alexa Fluor 488 (ab150105; Abcam) at 1:500 dilution in PBS, and with DAPI (Thermo Fisher Scientific) at 1:1000 dilution in PBS. Images were taken with a Leica SP8 (690/730) confocal microscope.
The promoter region of the mouse Nkx3-2 gene was amplified and cloned into the firefly reporter plasmid pGL4.10 luc2 vector (Promega, Madison, WI), as described (6, 10). To generate expression plasmids containing the mouse Pax1WT, Pax1V138L, Pax1N146del, Pax1G157V, and Pax1C359* coding sequences, the coding sequence of mouse Pax1 (NM_008780.2) was amplified by RT-PCR from isolated adult mouse thymus RNA and cloned into a pCMV-HA-N vector (Addgene, Cambridge, MA) with the In-Fusion HD EcoDry Cloning Kit (Clontech, Mountain View, CA). Pax1 mutant variants were generated by site-directed mutagenesis, and the PCR products were ligated with the Quick Ligation Kit (NEB, Ipswich, MA) and cloned by Turbo competent cells (NEB, Ipswich, MA). The correct sequence of the constructs was confirmed by Sanger sequencing.
The transcriptional activity of WT and mutant PAX1 mouse proteins was assessed in a luciferase reporter assay. 293T cells were cultured in Dulbeccos modified Eagles medium (DMEM) containing 10% fetal bovine serum with antibiotics and plated in 24-well plates 24 hours before transfection. Transient transfections were performed in triplicate with TransIT-293 Transfection Reagent (Mirus, Madison, WI) according to the manufacturers instructions. Cells were cotransfected with 30 ng of either WT or mutant Pax1 expression plasmids, 15 ng of firefly reporter plasmid Nkx3-2-pGL4.10 luc2, and 3 ng of pRL-TK vector (Promega, Madison, WI) for normalization. After 48 hours, cell extracts were collected and frozen in lysis buffer overnight at 20C. After thawing, firefly and renilla luciferase activities were measured using a Dual-Luciferase Reporter Assay Kit (Promega, Madison, WI) and Paradigm Detection platform (Beckman Coulter, Indianapolis, IN). To correct for variations in transfection efficiency, firefly luciferase activity was normalized to renilla luciferase activity. The luciferase activity of pCMV-HA-N vector, which had no Pax1 cDNA, was assumed to have 0% activity, whereas the Pax1WT vector was assumed to have 100% activity.
The three-dimensional complex structures of WT and mutant PAX1 models bound to DNA were generated by homology-based methods (16) using the previously solved structure of the highly homologous protein, PAX6 (PDB: 6PAX) (11). Intermolecular interactions of the PAX1 paired box domain of WT/mutant PAX1 to DNA complex were calculated in the Receptor-Ligand function of Discovery Studio Client 4.0 using the default parameters (BIOVIA, San Diego, CA). The MD simulations were performed as described (16).
Primary skin fibroblasts from P1, P4, and a healthy control (BJ fibroblast line, American Type Culture Collection) were reprogrammed to iPSCs by infection with the nonintegrating CytoTune Sendai viral vector kit (Thermo Fisher Scientific) as described (41).
For differentiation, iPSCs were transferred to plates coated with Corning Matrigel human embryonic stem cell (hESC)qualified Matrix. After four to five passages, the cells were plated on Matrigel-coated 24-well plates at a density of 2.5 105 cells/cm2. For differentiation to DE and TEPs, iPSCs were exposed to various factors and differentiation cues, according to the protocol by Parent et al. (17), with some modifications. In particular, between d1 and d5, iPSC differentiation was carried out in RPMI 1640 medium (Thermo Fisher Scientific, Waltham, MA) supplemented with 1% penicillin/streptomycin, 1% l-glutamine, and increasing concentrations of KSR (0% on d1, 0.2% on d2 and d3, and 2% on d4 and d5). In the period d6 to d14, cells were differentiated in DMEM/F12 with 1% penicillin/streptomycin, 1% l-glutamine, and 0.5% (v/v) B-27 supplement (Thermo Fisher Scientific, Waltham, MA). During this period of time, the following factors were added to the culture: activin A, 100 ng/ml (d1 to d5); Wnt3a, 25 ng/ml (d1) or 50 ng/ml (d8 to d14); all-trans retinoic acid (RA), 0.25 M (d6 to d8) or 0.1 M (d9 to d14); BMP4, 50 ng/ml (d6 to d14); LY364947, 5 mM (d6 to d9); FGF8b, 50 ng/ml (d8 to d14); and KAAD-cyclopamine, 0.5 mM (d8 to d14). Supplements and factors were from Thermo Fisher Scientific, Waltham, MA (B27, KSR); R&D Systems, Minneapolis, MN (activin A, Wnt3a, BMP4, and FGF8b); and MilliporeSigma, St. Louis, MO (RA, KAAD-cyclopamine, LY364947).
Microgram quantities of total RNA were isolated using the RNeasy Kit (QIAGEN, Hilden, Germany) from triplicate samples of control-, P1-, and P4-derived iPSCs, as well as from the corresponding iPSC-derived cells at DE and TEP stages. RNA integrity was tested by microfluidic electrophoresis on a TapeStation system (Agilent, Santa Clara, CA). RNA purity and concentration were assessed using the NanoDrop One UV-Vis Spectrophotometer (Thermo Fischer Scientific, Waltham, MA). Directional, mRNA-seq libraries for experiment 1 were produced using TruSeq Stranded mRNA Library Prep Kit for NeoPrep (catalog no. NP-202-1001) from Illumina (San Diego, CA). Directional, mRNA-seq libraries for experiment 2 were produced using New England Biolabs product NEBNext Poly(A) mRNA Magnetic Isolation Module (catalog no. E7490L), New England Biolabs product NEBNext Ultra II Directional RNA Library Prep Kit for Illumina (catalog no. E7760L), and NEBNext Multiplex Oligos for Illumina (Dual Index Primers Set 1) (catalog no. E7600S) (New England Biolabs, Ipswich MA), with an input of 100 ng of total RNA per sample.
Sequencing was performed on an Illumina NextSeq 500 system, running Illumina NextSeq Control Software System Suite version 2.1.2 and RTA version 2.4.11. The final library pool was sequenced via 1 75base pair (bp) run configuration using the product NextSeq 500/550 High Output v2 sequencing kit, 75 cycles (catalog no. FC-404-2005). Between 15 106 and 20 106 reads were obtained from each sample. RNA-seq FASTQ files were aligned to the reference human genome assembly (GRCh38) with STAR v2.6.0 (42). The transcript annotation (GTF) file was obtained from GENCODE (release 28) (43). The binary alignment files (.bam) were then used to generate a matrix of read counts with the featureCounts program of the package Subread v.1.6.2 (44). Exonic fragments were grouped at the level of genes, based on the GENCODE 28 annotation file. Normalization and differential expression analysis for RNA-seq data were performed with the DESeq2 (45) package in R (46).
Independent pairwise analyses were performed on triplicate samples of cells at each stage of differentiation (iPSC, DE, and TEP). To handle the lower power associated with small numbers of samples, DESeq2 uses an empirical Bayesian procedure to stabilize the log fold change estimates. The Wald test was then applied to the log fold change in each gene, followed by multiple-testing adjustment with the method of Benjamini and Hochberg (47).
For the heatmap of gene expression, t test and hierarchical clustering were computed by Qlucore Omics Explorer 3.3 (Qlucore, Lund, Sweden) for iPSC and TEP stage comparison (Fig. 5A), with cutoff q values of less than 0.01. Analysis of variance (ANOVA) and hierarchical clustering were used for the three-stage (iPSC, DE, and TEP) comparison (fig. S11B). Normalization and differential expression analysis of the RNA-seq data used for GSEA were performed with DESeq2 package in R v.3.5.1. RNA-seq data have been uploaded to the NCBI Gene Expression Omnibus (GEO), under accession no. GSE138784.
GSEA was performed with the GSEA software (48) (http://www.broadinstitute.org/gsea) using a preranked dataset of gene expression differences, 1000 permutations, and the softwares classic enrichment statistic option. Genes were ranked based on the DESeq2 output by taking the signed log10 adjusted P value for differential expression. Gene sets for enrichment analysis correspond to Gene Ontology (GO) Biological Processes and were obtained from the Molecular Signatures Database version 7.0 (GMT file: c5.bp.v7.0.symbols.gmt).
RNA was isolated from control, P1, and P4 cells at iPSC and TEP stages of differentiation, using RNeasy kit (QIAGEN, Hilden, Germany). cDNA was synthesized by a qScript cDNA Synthesis kit (Quantabio, Beverly, MA) according to the manufacturers protocol. qRT-PCR was performed on a 7500 RT-PCR system (Applied Biosystems, Waltham, MA) using PerfeCTa SYBR Green FastMix, Low ROX (Quantabio, Beverly, MA). Gene expression was quantified by normalization to the housekeeping gene TBP for each sample. Primers used for individual genes are reported in the Supplementary Materials.
Statistical analysis was undertaken in GraphPad Prism (v8.0). For luciferase reporter assay, P values were calculated with one-way ANOVA and adjusted by Dunnetts multiple comparisons test. The data are means SEM of six independent experiments (WT, n = 6; Val138Leu, n = 3; Asn146del, n = 5; Cys359*, n = 5; Gly157Val, n = 5; empty, n = 6). For qRT-PCR data, Students t test (paired, two-tailed) was performed. The data are means SEM in Fig. 5D, and means SD in fig. S10. P < 0.05 was considered to be significant. Statistical analysis of RNA-seq data is described above.
Acknowledgments: We thank E. Thorland for interpretive assistance with the CNV analysis and B. Bigio for uploading WES data. WES data have been deposited to the NCBI SRA Submission Portal, with the following ID: PRJNA601119. RNA-seq data have been uploaded to the NCBI GEO, under accession no. GSE138784. Funding: This work was supported by the Division of Intramural Research, National Institute of Allergy and Infectious Diseases (NIAID), NIH and by the Angelo Nocivelli Foundation. Y.Y. was supported by JSPS Research Fellowship for Japanese Biomedical and Behavioral Research at the NIH and had travel support from The ITO Foundation for the Promotion of Medical Science. R.U. was supported by NIH/NIDDK R01 DK52913, Advancing a Healthier Wisconsin (AHW) Endowment and the Linda T. and Johm A. Mellowes Endowed Innovation and Discovery Fund. L.M.F. is funded by the Division of Intramural Research of the National Institute of Arthritis, Musculoskeletal and Skin Diseases, at the National Institutes of Health. A.A. is supported by King Abdulaziz City for Science and Technology. Author contributions: Y.Y. performed experiments and wrote the manuscript. R.U. performed structural modeling and MD simulation studies. L.M.F. supervised analysis of RNA-seq and GSEA data. F.O.-C., T.G.M., and S. Ganesan assisted with RNA-seq studies. S. Giliani and S.M. performed Sanger sequencing and Western blot analysis and analyzed WES data. K.Z., A.M.A., H.A., F.Z., C.A.V., and B.B. performed and analyzed WES. A.K.D. generated iPSCs. A.J., R.W.M., A.H.F., C.A., B.K.A.-S., and H.A.-M. provided clinical care and description of the patients. F.F. performed lymph node pathology. M.P.B., M.L.H., and C.M. performed and interpreted imaging studies. J.L.C. and R.S.A. contributed to supervision of the project and to writing of the manuscript. L.D.N. was responsible for the entire research project and wrote the manuscript. Competing interests: The authors declare that they have no competing interests. Data and materials availability: Fibroblast and iPSC lines from P1 and P4 are available upon request but are contingent upon approval of material transfer agreement by the NIAID, NIH. WES data have been uploaded to the NCBI SRA Submission Portal, with the following ID: PRJNA601119. The RNA-seq dataset for this study has been uploaded to the NCBI GEO, under accession no. GSE138784. The GEO accession includes links to the NCBI SRA database, from which the raw data will be accessible in FASTQ format, under accession no. SRP225226.
Read this article:
PAX1 is essential for development and function of the human thymus - Science