Fig. 2

In healthy controls, specific SNVs and NAIP variants characterize the downstream environment of SMN1/2. A IGV overview of SMN1 and SMN2 haplotypes from HPRC healthy control samples, divided based on PSV13 (c.840C > T) in exon 7 (see the “Methods” section) and mapped to the masked T2T-CHM13 reference genome. Each “read” represents one haplotype from one sample. SMN2-specific variant positions (present in ≥ 90% of SMN2 haplotypes and ≤ 10% of SMN1 haplotypes) and SMN1-specific variant positions (present in ≥ 90% of SMN1 haplotypes and ≤ 10% of SMN2 haplotypes) are indicated by blue lines above the genes; these include PSVs and downstream SMN1/2 environment SNVs. *SMN1 haplotypes with downstream SMN2 environment SNVs. B Schematic representation of PSVs, SMN1/2 environment SNVs, and presence of the (pseudo)NAIP gene per haplotype. Only haplotypes with complete phasing between PSV13 (c.840C > T) and (pseudo)NAIP are shown. In the right panel, downstream haplotype frequencies are shown schematically. A downstream environment other than the “expected” environment was called when 3 or more consecutive SMN1/2 environment SNVs were present. Full-length NAIP was characterized as SMN1 environment, whereas truncated NAIPPΔ1–5, NAIPPΔ1–9 or NAIPPΔ4–5 was characterized as SMN2 environment [20]. *PSV8 (5 bp insertion at position chr5:71,407,825) is currently not considered a PSV, but a common variant [4]. IGV, Integrative Genomics Viewer; NAIPP, pseudoNAIP; PSV, paralogous sequence variant; SNV, single nucleotide variant
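The two rules stated in the caption (the ≥ 90% / ≤ 10% specificity thresholds, and the run of 3 or more consecutive environment SNVs) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the input layout (per-position variant frequencies as dictionaries, per-haplotype SNV labels as an ordered list), and the choice to let a missing call (`None`) interrupt a run are all assumptions made for illustration.

```python
from typing import Dict, Optional, Sequence


def specific_positions(
    smn1_freq: Dict[int, float],
    smn2_freq: Dict[int, float],
    high: float = 0.90,
    low: float = 0.10,
) -> Dict[int, str]:
    """Label positions as SMN1- or SMN2-specific.

    smn1_freq / smn2_freq map a position to the fraction of SMN1 / SMN2
    haplotypes carrying the variant (hypothetical input layout).
    A position absent from one dictionary is treated as frequency 0.0.
    """
    labels: Dict[int, str] = {}
    for pos in sorted(set(smn1_freq) | set(smn2_freq)):
        f1 = smn1_freq.get(pos, 0.0)
        f2 = smn2_freq.get(pos, 0.0)
        if f1 >= high and f2 <= low:
            labels[pos] = "SMN1"
        elif f2 >= high and f1 <= low:
            labels[pos] = "SMN2"
    return labels


def has_unexpected_environment(
    calls: Sequence[Optional[str]],
    expected: str,
    min_run: int = 3,
) -> bool:
    """Return True if at least `min_run` consecutive environment SNVs
    carry a label other than `expected`.

    calls: environment labels ("SMN1" or "SMN2") of one haplotype's
    downstream SNVs in genomic order; None marks a missing call and,
    by assumption, resets the run.
    """
    run = 0
    for call in calls:
        if call is not None and call != expected:
            run += 1
            if run >= min_run:
                return True
        else:
            run = 0
    return False


if __name__ == "__main__":
    # Toy frequencies, invented for illustration only.
    print(specific_positions({100: 0.95, 200: 0.05}, {100: 0.02, 200: 0.93}))
    print(has_unexpected_environment(["SMN1", "SMN2", "SMN2", "SMN2"], "SMN1"))
```

In this sketch, a position shared at intermediate frequency by both copies receives no label, and an SMN1 haplotype is flagged only when three adjacent downstream SNVs all carry the SMN2 label.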