Genetic Determinants of Autism and Cancer Found Using Deep Learning

Tuesday, January 13, 2015

 Genomics
Researchers have developed a computational system that mimics the biology of RNA splicing by correlating DNA elements with splicing levels in healthy human tissues. The system can scan DNA and identify damaging genetic variants. This procedure has led to insights into the genetics of autism, cancers, and spinal muscular atrophy.




Scientists and engineers have built a computer model that has uncovered disease-causing mutations in large regions of the genome that previously could not be explored. Their method seeks out mutations that cause changes in ‘gene splicing,’ and has revealed unexpected genetic determinants of autism, colon cancer and spinal muscular atrophy.

Canadian Institute for Advanced Research (CIFAR) senior fellow Brendan Frey, also a professor at the University of Toronto’s Donnelly Centre for Cellular & Biomolecular Research, is the lead author on a paper describing this work, which appeared in the journal Science Express.


The paper was co-authored by CIFAR senior fellows Timothy Hughes (University of Toronto) and Stephen Scherer (the Hospital for Sick Children and the University of Toronto) of the Genetic Networks program. Frey is appointed to both the Genetic Networks program and the Neural Computation & Adaptive Perception program.

The research combines the latter group’s pioneering work on deep learning with novel techniques in genetics.

Most existing methods examine mutations in segments of DNA that encode protein, which Frey refers to as low-hanging fruit. To find mutations outside of those segments, typical approaches such as genome-wide association studies take disease data and compare the mutations of sick patients to those of healthy patients, seeking out patterns. Frey compares the approach to lining up all the books your child likes to read and looking for whether a particular letter occurs more frequently than in other books.

“It doesn’t work, because it doesn’t tell you why your kid likes the book,” he says. “Similarly, genome-wide association studies can’t tell you why a mutation is problematic.”
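The kind of case-versus-control comparison described above can be sketched in a few lines. This is only an illustration of the general idea, not the procedure used in any particular study: the variant names and carrier counts are invented, and `odds_ratio` is a hypothetical helper that applies a standard 0.5 continuity correction.

```python
def odds_ratio(case_carriers, case_total, control_carriers, control_total):
    """Odds ratio for carrying a variant in patients versus healthy controls.

    Adds 0.5 to every cell (Haldane-Anscombe correction) so that a zero
    count does not cause a division by zero.
    """
    a = case_carriers + 0.5                      # patients with the variant
    b = case_total - case_carriers + 0.5         # patients without it
    c = control_carriers + 0.5                   # controls with the variant
    d = control_total - control_carriers + 0.5   # controls without it
    return (a * d) / (b * c)


# Invented counts: (carriers among 100 patients, carriers among 100 controls).
toy_counts = {
    "variant_A": (30, 10),
    "variant_B": (12, 12),
}

for name, (case_carriers, control_carriers) in toy_counts.items():
    ratio = odds_ratio(case_carriers, 100, control_carriers, 100)
    print(f"{name}: odds ratio = {ratio:.2f}")
```

A ratio well above 1 flags a variant as more common in patients, but, as Frey's analogy suggests, the number alone says nothing about what the variant does inside the cell.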

But looking at splicing can do that. Splicing is important for the vast majority of genes in the human body. When mutations alter splicing, genes may produce no protein or the wrong protein, or cause some other problem, any of which could lead to disease.

Frey’s team, which includes researchers from engineering, biology and medicine, developed a computer model that mimics how the cell directs splicing by detecting patterns within DNA sequences, called the ‘splicing code’. The researchers then used their system to examine mutated DNA sequences and determine what effects the mutations would have, effectively scoring each mutation. Unlike existing methods, their technique provides an explanation for the effect of a mutation and can be used to find mutations outside of segments that code for protein.

To develop the computer model, Frey’s team fed experimental data into machine learning algorithms, in order to teach the computer how to examine a DNA sequence and output the splicing pattern.
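The scoring idea described above, predicting splicing for a normal sequence and for its mutated version and then comparing the two, can be sketched as follows. This is a toy illustration under loud assumptions: `toy_splicing_model` is a hypothetical stand-in with invented motif weights, not the trained deep learning model from the paper, and the reference sequence is made up.

```python
import math

# Invented motif weights, for illustration only. A real system would learn
# its parameters from experimental splicing data.
TOY_MOTIF_WEIGHTS = {"GGA": 0.8, "TTT": -0.6, "CAG": 0.5}


def toy_splicing_model(seq):
    """Return a made-up 'splicing level' between 0 and 1 for a DNA sequence."""
    total = 0.0
    for motif, weight in TOY_MOTIF_WEIGHTS.items():
        # Count overlapping occurrences of the motif in the sequence.
        count = sum(
            1
            for i in range(len(seq) - len(motif) + 1)
            if seq[i:i + len(motif)] == motif
        )
        total += weight * count
    # Squash the weighted sum into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-total))


def apply_substitution(seq, pos, alt):
    """Return a copy of seq with the base at index pos replaced by alt."""
    if not 0 <= pos < len(seq):
        raise IndexError("position outside the sequence")
    return seq[:pos] + alt + seq[pos + 1:]


def score_mutation(ref_seq, pos, alt):
    """Predicted change in splicing level caused by one substitution."""
    mutant = apply_substitution(ref_seq, pos, alt)
    return toy_splicing_model(mutant) - toy_splicing_model(ref_seq)


# Score every possible single-base substitution in a made-up sequence and
# rank the candidates by the size of the predicted change.
reference = "ACGGATTCAGT"
ranked = sorted(
    (
        (pos, alt, score_mutation(reference, pos, alt))
        for pos in range(len(reference))
        for alt in "ACGT"
        if alt != reference[pos]
    ),
    key=lambda item: abs(item[2]),
    reverse=True,
)

for pos, alt, score in ranked[:3]:
    print(f"position {pos}: {reference[pos]}>{alt}, predicted change {score:+.3f}")
```

Because the score comes from a model of the splicing process rather than from patient statistics, each score can be traced back to the sequence features that changed, which is the sense in which such an approach offers an explanation for a mutation's effect.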

Their method worked surprisingly well and has led to new discoveries. For example, using DNA sequences from five patients with autism provided by Scherer, the model identified 39 new genes that could be implicated in autism spectrum disorder, a significant addition to the roughly 100 previously known autism genes.

“Brendan's work is groundbreaking because it represents a first serious attempt to decode the portions of that 98 percent of the human genome outside the genes that are typically studied in genetic disease studies,” says Scherer. “This is particularly exciting since it is thought these segments of DNA may contain much of the missing information that we have been looking for in studies like autism.”

The research also sheds light on the genetic mechanisms that lead to spinal muscular atrophy, a leading cause of infant death, and nonpolyposis colorectal cancer.

“Many of us will soon know our complete human genome sequence, which will be like having an encyclopedic guide to ourselves that is written in an alien language,” says Frederick Roth, CIFAR senior fellow and co-director of the program in Genetic Networks. “This work promises to interpret the impact of mutations in a broader region of our genome than has been previously possible.”

SOURCE: Lab Canada. Top image: Frey Lab

By 33rd Square
