Audio version of the article
Determining the 3D structures of biological molecules like RNA is difficult and often requires millions of dollars for such extensive efforts. Stanford University researchers have developed a new deep learning algorithm called ARES (Atomic Rotationally Equivalent Scorer) to address this challenge using accurate computational prediction. Even with insufficient data, the method can accurately predict the 3D shapes of drug targets and other critical biological molecules. Hence, it can be used to determine the structures of molecules that are difficult to determine experimentally. The algorithm not only predicts but also explains how different molecules work with applications ranging from basic biological research to informed drug design techniques.
The team describes structure as the determining function in structural biology, the study of the shapes of molecules. Proteins are molecular machines that perform various tasks and often bind to other proteins in order to perform their functions. Researchers having knowledge of protein pairs involved in the disease and its interaction in 3D can also very specifically indicate this interaction with the therapy. The team explains that traditional teaching techniques can often affect the calculation of certain items, preventing additional educational opportunities from being identified. Thus they used the machine to figure out these molecular properties independently, rather than figuring out what an underlying prediction makes pretty accurate.
However, these artisanal traits cause the algorithm to focus on what the person choosing those traits thinks is important and overlook information that would help them function better. The network was able to discover essential concepts that are important in the development of molecular structures without being explicitly trained, and the team claims that their algorithm not only retrieved key elements, but also features that were not previously known, which is interesting.
The researchers then tested their algorithm on another class of critical biological molecules, RNAs (a series of “RNA puzzles”). The algorithm outperforms all other puzzle participants without ever being explicitly designed for RNA structures. Most of the current advanced machine learning (ML) and deep learning algorithms require large amounts of data for training. Despite their wide range of applications, these algorithms cannot be used in many cases due to the limited data availability. The fact that this method works despite very little training data suggests that similar techniques could solve unsolved problems in many areas where data is scarce.
The team believes that understanding this critical technology will pave the way for potential research and applications such as the development of new molecules and drugs with this type of information.