Donna Werling, PhD – Slide of the Week

Title: Use of the new human telomere-to-telomere genomic reference improves variant calling and phasing in regions flanking developmental disorder-associated copy number variants

Legend: Genotyping and switch error rates spike in areas flanking CNV disorder-associated regions. Average error rates were obtained by phasing short-read variant calls from 39 individuals using either the T2T-CHM13 1KGP haplotype panel (CHM13v2.0) or the GRCh38 NYGC 1KGP haplotype panel as a reference. Top, purple: Data from variants phased using the T2T-CHM13 panel. Bottom, blue: Data from variants phased using the GRCh38 panel. Center: Ideogram illustrating the CHM13v2.0 coordinates (top half) and GRCh38 coordinates (bottom half). Gold: number of phased variants per 1,000 bp; darker shades indicate more variants. DECIPHER-defined CNV disorder regions are highlighted in pink. (A) Variant density and error rates around the genomic region (±1.5 Mb) commonly deleted in Prader-Willi/Angelman syndrome (GRCh38 coordinates: chr15:23123712-28193120; CHM13v2.0 lifted coordinates: chr15:20807475-25935855). (B) Variant density and error rates around the genomic regions (±1.5 Mb) commonly affected in 22q11 syndrome and 22q11.2 distal deletion syndrome (GRCh38 coordinates: chr22:19022279-23380258; CHM13v2.0 lifted coordinates: chr22:19397653-23803198). Error rates were first calculated in non-contiguous blocks of 10,000 bp. A rolling average error rate was then calculated over a centered 500 kb window.
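
The two-step smoothing described in the legend (per-block error rates, then a centered rolling average) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function names, the `(position, is_error)` input format, and the choice to skip blocks with no variants are all assumptions made for the example.

```python
from collections import defaultdict

BLOCK_BP = 10_000     # block size stated in the legend
WINDOW_BP = 500_000   # rolling-window width stated in the legend


def block_error_rates(sites, block_bp=BLOCK_BP):
    """Per-block error rate from (position, is_error) pairs (hypothetical input format)."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for pos, is_error in sites:
        block = pos // block_bp          # index of the non-overlapping block
        totals[block] += 1
        errors[block] += int(is_error)
    return {b: errors[b] / totals[b] for b in totals}


def rolling_mean(rates, block_bp=BLOCK_BP, window_bp=WINDOW_BP):
    """Average block rates over a window centered on each block.

    Assumption: blocks without any variants are skipped, not counted as zero.
    With 10 kb blocks this uses 25 blocks on each side (51 blocks, about 500 kb).
    """
    half = window_bp // block_bp // 2
    smoothed = {}
    for b in rates:
        neighbors = [rates[k] for k in range(b - half, b + half + 1) if k in rates]
        smoothed[b] = sum(neighbors) / len(neighbors)
    return smoothed


# Toy data: two sites in block 0, one in block 1, one far away in block 60.
sites = [(5_000, True), (6_000, False), (15_000, False), (600_000, True)]
rates = block_error_rates(sites)
smoothed = rolling_mean(rates)
```

In the toy data, blocks 0 and 1 fall inside each other's window and are averaged together, while block 60 is too far away to be influenced by them.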

Citation: Lalli, J. L., Bortvin, A. N., McCoy, R. C., & Werling, D. M. (2025). A T2T-CHM13 recombination map and globally diverse haplotype reference panel improves phasing and imputation. bioRxiv, 2025.02.24.639687. https://doi.org/10.1101/2025.02.24.639687

Abstract: The T2T-CHM13 complete human reference genome contains ~200 Mb of newly resolved sequence, improving read mapping and variant calling compared to GRCh38. However, the benefits of using complete reference genomes in other contexts are unclear. Here, we present a reference T2T-CHM13 recombination map and phased haplotype panel derived from 3202 samples from the 1000 Genomes Project (1KGP). Using published long-read based assemblies as a reference-neutral ground truth, we compared our T2T-CHM13 1KGP panel to the previously released GRCh38 1KGP phased callset. We find that alignment to T2T-CHM13 resulted in 38% fewer assembly-discordant genotypes and 16% fewer switch errors. The largest gains in panel accuracy are observed on chromosome X and in the regions flanking disease-causing CNVs. Simons Genome Diversity Project samples were more accurately imputed when using the T2T-CHM13 panel. Our study demonstrates that use of a T2T-native phased haplotype panel improves statistical phasing and imputation for samples from diverse human populations.
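
For readers unfamiliar with the term, a switch error is commonly counted as a change in phase orientation, relative to the truth, between two consecutive heterozygous sites. The sketch below illustrates that common definition only; the preprint's exact evaluation procedure is not described here, and the function name and tuple-based genotype format are assumptions made for the example.

```python
def count_switch_errors(truth, phased):
    """Count switch errors between two phasings of the same heterozygous sites.

    truth, phased: lists of (allele_on_hap1, allele_on_hap2) tuples in the same
    site order. Assumes every site is heterozygous and genotype-concordant, so
    each phased call is either identical to the truth or its flip.
    Returns (switches, opportunities).
    """
    # True where the phased call has the same orientation as the truth.
    orientation = [p == t for t, p in zip(truth, phased)]
    # A switch occurs wherever the orientation changes between adjacent sites.
    switches = sum(1 for prev, cur in zip(orientation, orientation[1:]) if prev != cur)
    return switches, max(len(orientation) - 1, 0)


truth = [(0, 1), (0, 1), (1, 0), (0, 1)]
phased = [(0, 1), (1, 0), (0, 1), (0, 1)]
switches, opportunities = count_switch_errors(truth, phased)
```

Here the second and third sites are flipped relative to the truth, so the orientation changes twice across the three adjacent site pairs.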

Investigator: Donna Werling, PhD

About the Lab: Donna Werling is interested in characterizing sex-differential risk mechanisms in autism spectrum disorder (ASD). During her doctoral work in the laboratory of Dan Geschwind at the University of California, Los Angeles, Werling used functional genomics, human genetics and bioinformatics approaches to understand the relationship between sex and genetic risk in ASD. Visit The Werling Lab for more information.
