Over the past decade, genetic sequencing tools have been very successful at identifying variations in the human genome that are associated with traits such as height or complex conditions like Alzheimer’s disease.
Despite these successes, 90 percent of genetic variants found to be linked to disease are located in noncoding DNA, or parts of the genome that don’t encode proteins involved in biological processes. Instead, these noncoding regions likely carry out functions that regulate the expression of functional genes. The variants are often located in stretches of DNA containing several genes, so it’s difficult for scientists to pin down the exact gene (or combination of genes) that is responsible for a given association.
Researchers have turned to additional statistical tools to narrow down the candidates, looking into levels of gene expression and the transcription of RNA. In 2016, a team from the University of Chicago highlighted the key role that a process called RNA splicing plays in genetic variation and disease risk. Nearly all genes undergo RNA splicing, in which pieces of RNA are cut out and the remaining pieces are stitched back together to create different versions of mature mRNA transcripts. This significantly increases the number of proteins a single gene can generate and is thought to explain much of the complexity of higher-order organisms. It also carries risk: at least 15 percent of all human diseases are thought to be due to splicing errors.
Yang Li, PhD, an assistant professor of medicine and human genetics at UChicago who led the 2016 study, created a software tool called LeafCutter that can identify genetic variants that affect splicing. Li says he built LeafCutter to be fast and efficient enough to handle the vast amounts of data needed to analyze hundreds of genomes. Essentially, the tool allows researchers to drill into genetic variants linked to a disease and identify the specific genes involved.
In a study published recently in Nature Genetics, Li and a team led by Towfique Raj from the Icahn School of Medicine at Mount Sinai in New York used LeafCutter to analyze transcriptomes from 450 individuals to find associations with Alzheimer’s disease. They found more than 3,000 genetic variants that affect splicing, several of which are linked to Alzheimer’s.
“Previously researchers have been looking at gene expression levels and they found a few hits,” Li said. “But now we applied LeafCutter to the same data set and found many more hits that may be causative for Alzheimer’s disease. Basically, it expanded our toolbox.”
While such analysis doesn’t yet show why or how these splicing-related variants might cause Alzheimer’s, it underscores the huge role splicing plays in disease. New tools like LeafCutter can help refine the process of identifying genes and variants that could be targets for treatment.
“We don’t believe any one locus is contributing entirely to one disease. We’re more interested in understanding which of the 100 genes are contributing to it so we might get a better picture,” Li said. “By using RNA splicing as a molecular biomarker we can identify more genes that are involved in a disease, compared to just using RNA expression.”