Sorting huge amounts of data is a bottleneck in protein research, a field that is crucial for making use of the gene-editing technology CRISPR and for fully understanding diseases like cancer, Alzheimer's and Parkinson's. Now, researchers at the University of Copenhagen have become the first in the world to employ artificial intelligence to do the heavy lifting - and to do so in a way that can ensure common international standards while making advanced protein science more accessible.
Using artificial intelligence, UCPH researchers have solved a problem that until now has been the stumbling block for important protein research into the dynamics behind diseases such as cancer, Alzheimer's and Parkinson's, as well as in the development of sustainable chemistry and new gene-editing technologies.
It has always been a time-consuming and challenging task to analyse the huge datasets collected by researchers as they use microscopy and the FRET technique to see how proteins move and interact with their surroundings. The task also requires a high level of expertise. Hence the proliferation of overfilled servers and hard drives. Now researchers at the Department of Chemistry, Nano-Science Center, Novo Nordisk Foundation Center for Protein Research and the Niels Bohr Institute have developed a machine learning algorithm to do the heavy lifting.
"We used to sort data until we went loopy. Now our data is analysed at the touch of button. And, the algorithm does it at least as well or better than we can. This frees up resources for us to collect more data than ever before and get faster results," explains Simon Bo Jensen, a biophysicist and PhD student at the Department of Chemistry and the Nano-Science Center.
The algorithm has learned to recognize protein movement patterns, allowing it to classify data sets in seconds - a process that typically takes experts several days to accomplish.
"Until now, we sat with loads of raw data in the form of thousands of patterns. We used to check through it manually, one at a time. In doing so, we became the bottleneck of our own research. Even for experts, conducting consistent work and reaching the same conclusions time and time again is difficult. After all, we're humans who tire and are prone to error," says Simon Bo Jensen.
Just a second's work for the algorithm
The studies of the relationship between protein movements and functions conducted by the UCPH researchers are internationally recognized and essential for understanding how the human body functions. For example, diseases including cancer, Alzheimer's and Parkinson's are caused by proteins clumping up or changing their behaviour. The gene-editing technology CRISPR, which won the Nobel Prize in Chemistry this year, also relies on the ability of proteins to cut and splice specific DNA sequences. When UCPH researchers like Guillermo Montoya and Nikos Hatzakis study how these processes take place, they make use of microscopy data.
"Before we can treat serious diseases or take full advantage of CRISPR, we need to understand how proteins, the smallest building blocks, work. This is where protein movement and dynamics come into play. And this is where our tool is of tremendous help," says Guillermo Montoya, Professor at the Novo Nordisk Foundation Center for Protein Research.
Attention from around the world
It appears that protein researchers around the world have been missing just such a tool. Several international research groups have already come forward and shown an interest in using the algorithm.
"This AI tool is a huge bonus for the field as a whole because it provides common standards, ones that weren't there before, for when researchers across world need to compare data. Previously, much of the analysis was based on subjective opinions about which patterns were useful. Those can vary from research group to research group. Now, we are equipped with a tool that can ensure we all reach the same conclusions," explains research director Nikos Hatzakis, Associate Professor at the Department of Chemistry and Affiliate Associate Professor at the Novo Nordisk Foundation Center for Protein Research.
He adds that the tool offers a different perspective as well:
"While analysing the choreography of protein movement remains a niche, it has gained more and more ground as the advanced microscopes needed to do so have become cheaper. Still, analysing data requires a high level of expertise. Our tool makes the method accessible to a greater number of researchers in biology and biophysics, even those without specific expertise, whether it's research into the coronavirus or the development of new drugs or green technologies."
- Researchers use a technique called FRET (Förster Resonance Energy Transfer) to study the movements and interactions of proteins. FRET works by putting two or more fluorescent tags on the same protein molecule. When the tags are in close proximity, any change in fluorescence can be detected by advanced microscopes. This allows protein movements to be measured on the nanoscale.
- The AI tool "DeepFRET" is open-source software based on artificial deep neural networks that are trained to recognize advanced patterns in data. The software can be used on any computer and is compatible with Mac and Windows.
- DeepFRET takes less than 1% of the time that it would take a human to classify data. This is done with the same or greater precision.
- The scientific paper on the new AI tool has been published in the renowned international journal eLife.
- The research was conducted by: Johannes Thomsen, Magnus B. Sletfjerding, Simon Bo Jensen and Mette G. Malle from the Department of Chemistry and Nano-Science Center; Nikos S. Hatzakis from the Department of Chemistry, Nano-Science Center and Novo Nordisk Foundation Center for Protein Research; Stefano Stella, Bijoya Paul and Guillermo Montoya from the Novo Nordisk Foundation Center for Protein Research; and Troels C. Petersen from the Niels Bohr Institute.
- The research is supported by, among others, the Carlsberg Foundation, the Velux Foundations and the Novo Nordisk Foundation.
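
For readers who want a feel for why FRET can measure movements on the nanoscale, the standard Förster relation between tag distance and energy-transfer efficiency can be sketched in a few lines of Python. This is only an illustration of the textbook physics, not code from DeepFRET: the function names are hypothetical, and the Förster radius of 5 nm is an assumed, typical-order value that in practice depends on the specific pair of fluorescent tags.

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float = 5.0) -> float:
    """Textbook Förster relation: E = 1 / (1 + (r / R0)**6).

    forster_radius_nm (R0) is the distance at which half of the energy is
    transferred. The default of 5 nm is an assumption for illustration only.
    """
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be >= 0 and Förster radius must be > 0")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


def apparent_efficiency(donor_intensity: float, acceptor_intensity: float) -> float:
    """Uncorrected efficiency estimated from measured intensities: I_A / (I_D + I_A)."""
    total = donor_intensity + acceptor_intensity
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return acceptor_intensity / total


if __name__ == "__main__":
    # The sixth-power dependence makes the signal very sensitive to small
    # changes in distance around the Förster radius.
    for r in (2.5, 5.0, 7.5):
        print(f"distance {r} nm -> efficiency {fret_efficiency(r):.3f}")
```

Because the efficiency falls off with the sixth power of the distance, a protein that shifts its tags by a couple of nanometres produces a clearly visible change in the fluorescence trace, which is the kind of pattern the classification step then has to sort.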