By Emily Leclerc, Waisman Science Writer
A new app developed at the Waisman Center makes it easier than ever for researchers to use machine learning techniques to analyze large, complex data sets without specialized training.
The increasing ability to gather detailed information on single cells provides the opportunity to more deeply understand cell organization and function in complex organs like the brain. Single cell data could lead to important breakthroughs in understanding neuronal development and disease. But analyzing this type of data requires advanced techniques such as machine learning, which many biologists and neuroscientists need additional training to use.
A new paper, ‘MANGEM: A Web App for Multimodal Analysis of Neuronal Gene Expression, Electrophysiology, and Morphology’, showcases a new cloud-based app that applies machine learning techniques to data sets in a user-friendly way, with the aim of increasing access to these analysis tools. The paper is by Daifeng Wang, PhD, Waisman investigator and associate professor of biostatistics and medical informatics and of computer sciences; Rob Olson, PhD, research engineer in Waisman’s Clinical Translational Core; and Noah Cohen Kalafut, PhD student in Wang’s lab.
Studying single cell data can provide important insights into how cells function, how they develop, and how they connect to surrounding tissues. In the brain, a more in-depth understanding of neuronal development may shed light on a broad range of conditions including intellectual and developmental disabilities and neurodegenerative diseases. This type of data, referred to as multimodal, captures a variety of cell characteristics such as gene expression, electrophysiology (a cell’s electrical attributes), and morphology (a cell’s physical shape).
“So, we have multimodal data to describe the neurons. Now the challenge is how do we integrate the data. How can we group the cells based on similar characteristics? That’s where machine learning comes in to analyze these data,” Wang says. Machine learning techniques are able to align cells across modalities and then cluster them together by their similarities. “Maybe the cells share similar gene expression or morphology or other features,” Wang adds. This analysis then allows scientists to study the important relationships between cell characteristics and their function. “It can be used to say if these two things go together, maybe we should investigate if they’re connected or influencing function. It can be used to extrapolate research questions essentially,” Olson says.
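To make the idea of aligning cells across modalities and then clustering them more concrete, here is a minimal sketch in Python. It is not MANGEM's code. The data are synthetic, the names `gene_expr` and `ephys` are invented for illustration, and canonical correlation analysis (CCA) followed by a small k-means routine is used here only as one simple stand-in for an alignment and clustering method. The article does not say which methods the app offers.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (an assumption, not real recordings):
# 300 "cells" drawn from 3 hidden cell types in a 2-D latent space.
n_cells, n_types = 300, 3
true_type = np.arange(n_cells) % n_types
centers = np.array([[0.0, 0.0], [12.0, 0.0], [0.0, 12.0]])
latent = centers[true_type] + rng.normal(size=(n_cells, 2))

# Two modalities measured on the same cells (rows are matched cell by cell).
gene_expr = latent @ rng.normal(size=(2, 20)) + 0.5 * rng.normal(size=(n_cells, 20))
ephys = latent @ rng.normal(size=(2, 8)) + 0.5 * rng.normal(size=(n_cells, 8))


def cca(x, y, n_components):
    """Linear CCA via QR + SVD: project both modalities into a shared space."""
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    qx, _ = np.linalg.qr(x)  # orthonormal basis for each modality's columns
    qy, _ = np.linalg.qr(y)
    u, s, vt = np.linalg.svd(qx.T @ qy, full_matrices=False)
    x_aligned = qx @ u[:, :n_components]
    y_aligned = qy @ vt[:n_components].T
    return x_aligned, y_aligned, s[:n_components]  # s = canonical correlations


def kmeans(points, k, n_iter=50):
    """Plain Lloyd's k-means with a deterministic farthest-point start."""
    chosen = [points[0]]
    for _ in range(k - 1):
        dist = np.min([((points - c) ** 2).sum(axis=1) for c in chosen], axis=0)
        chosen.append(points[np.argmax(dist)])
    centroids = np.array(chosen)
    for _ in range(n_iter):
        dist = ((points[:, None, :] - centroids[None]) ** 2).sum(axis=-1)
        labels = dist.argmin(axis=1)
        updated = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return labels


def best_agreement(truth, pred, k):
    """Fraction of cells matching the hidden types, up to relabeling clusters."""
    return max(
        np.mean(np.array(perm)[pred] == truth)
        for perm in itertools.permutations(range(k))
    )


# Step 1: align the two modalities. Step 2: cluster cells in the shared space.
expr_aligned, ephys_aligned, rho = cca(gene_expr, ephys, n_components=2)
embedding = (expr_aligned + ephys_aligned) / 2
pred = kmeans(embedding, n_types)
agreement = best_agreement(true_type, pred, n_types)

print("canonical correlations:", np.round(rho, 3))
print("agreement with hidden cell types:", round(float(agreement), 3))
```

In this toy setup the hidden cell types are known, so the clusters can be checked against them. With real single cell data there is no such answer key, and the resulting clusters are a starting point for the kind of follow-up questions Olson describes.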
The issue is that in order to use machine learning techniques a researcher needs computational expertise, coding experience, and the proper computing infrastructure. Many biologists and neuroscientists do not have this specialized knowledge and equipment. “The people who would like to make use of these methods don’t necessarily have the expertise. Of course, given time and effort, researchers could gain that expertise, but our app provides a means to get right into it and learn about how to apply the methods and what they do. Then, given that instruction, they can go deeper if they need to,” Olson says.
MANGEM, developed by Olson and Wang, is a cloud-based app that requires no coding, computational expertise, or computing infrastructure to use. Researchers simply upload their data sets and click through the app’s user-friendly step-by-step interface. MANGEM will then align the single cell data, generate the cell clusters, and create several different data visualizations. Because it is cloud-based, MANGEM can also handle large data sets and run analyses asynchronously: it works in the background and makes the results available when the analysis is complete.
For Wang, the hope is that this app will widen access to machine learning analysis techniques and allow researchers who couldn’t before to use and analyze single cell data. “Now all people have to do is upload data, select the method and clustering type, and hit start,” Wang says.
At the moment, MANGEM is geared toward neuronal data but it has the capacity to analyze a huge variety of information. “Even though we have been really focusing on neurons, the methods are general. There is nothing in the methods specific to neurons and it could be applied to almost all tissue types,” Olson says. MANGEM has not been tested on other types of data yet but Olson hopes to incorporate that capability in the future.
Olson and Wang have also discussed adding more machine learning methods and downstream analysis in the future to expand MANGEM’s scope. “We’ve talked about directly connecting to online data repositories too. So, instead of having to upload data files, researchers could access existing online data stores,” Olson says.
Using machine learning techniques to analyze data can provide many important insights and greatly advance our understanding of neurons, the brain, and many other tissues. But researchers must overcome several barriers before they can use these techniques. Wang and Olson hope MANGEM is a way to start to overcome those barriers.
“MANGEM is super accessible. Anybody with a web browser can get to this page,” Olson says. “A grad student, an early career scientist, or a seasoned researcher can click a few buttons and start analyzing huge data sets without having to get a whole lot of things up and running.”