Gene therapy holds the promise of curing genetic diseases, but safely and efficiently packaging and delivering new genes to targeted cells remains a challenge. Existing methods for engineering adeno-associated viruses (AAVs), among the most widely used gene-delivery vehicles, are often slow and inefficient.
Now, researchers at the Broad Institute of MIT and Harvard have developed a machine learning approach that promises to speed up AAV engineering for gene therapy. The tool helps researchers engineer the protein shells of AAVs, called capsids, to have several desirable traits at once, such as the ability to deliver cargo to a specific organ but not others, or to work in multiple species. Other approaches search for capsids with only a single trait.
The scientists used their approach to design capsids for a commonly used type of AAV, known as AAV9, that targeted the liver more efficiently and were easier to manufacture. They found that nearly 90% of the capsids predicted by their machine learning models successfully delivered cargo to human liver cells while also meeting five other key criteria. They also found that their model accurately predicted the behavior of the proteins in macaque monkeys even though it was trained only on mouse and human cell data. This finding suggests that the new method could help scientists design AAVs that work across species, which is critical for translating gene therapies to humans.
The research was conducted in the laboratory of Ben Deverman, institute scientist and director of vector engineering at the Stanley Center for Psychiatric Research at the Broad. The results were published recently in Nature Communications. The study’s first author was Fatma-Elzahraa Eid, a prominent machine learning scientist in Deverman’s team.
According to Deverman, this was a fairly original approach. It highlights how important it is for wet lab biologists to work with machine learning scientists early on, so that experiments are designed to generate data suited to machine learning rather than treating the modeling as an afterthought.
Group leader Ken Chan, graduate student Albert Chen, research associate Isabelle Tobey, and scientific advisor Alina Chan, all in Deverman’s lab, also made significant contributions to the work.
Make way for machines
Conventional methods for designing AAVs involve building large libraries containing millions of capsid protein variants, testing them in cells and in animals, and repeating the process over several rounds of selection. This procedure is slow and expensive, and it typically yields only a handful of capsids with one specific trait. That makes it difficult to find capsids that meet several criteria at once.
Other teams have used machine learning to speed up large-scale analysis, but most approaches optimized proteins for one function at the expense of another.
Deverman and Eid realized that datasets derived from existing large AAV libraries weren’t well suited for training machine learning models. “Instead of just taking data and giving it to machine learning scientists, we thought, ‘What do we need to train machine learning models better?’” said Eid. “Figuring that out was really instrumental.”
They first used an initial round of machine learning modeling to create a new, moderately sized library, called Fit4Function, containing capsids that were predicted to package gene cargo well. The team screened the library in mouse and human cells to identify capsids with specific functions that are important for gene therapy in each species. They then used that data to build several machine learning models, each of which could predict a particular function from a capsid’s amino acid sequence. Finally, they combined the models to create “multifunction” libraries of AAVs optimized for several traits simultaneously.
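The final step, combining several single-function models to pick capsids that satisfy every criterion, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the two toy scoring functions stand in for trained models, and the peptide sequences and thresholds are invented. It is not the team's actual code or model.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins for trained per-function models. Each maps a
# peptide sequence to a score; real models would be learned from assay data.
def toy_packaging_score(seq: str) -> float:
    # Illustrative assumption: fraction of residues that are not proline.
    return 1.0 - seq.count("P") / len(seq)

def toy_liver_score(seq: str) -> float:
    # Illustrative assumption: fraction of residues that are lysine or arginine.
    return sum(seq.count(aa) for aa in "KR") / len(seq)

def select_multifunction(
    candidates: List[str],
    models: Dict[str, Callable[[str], float]],
    thresholds: Dict[str, float],
) -> List[str]:
    """Keep sequences whose predicted score meets the threshold for every function."""
    return [
        seq
        for seq in candidates
        if all(models[name](seq) >= thresholds[name] for name in models)
    ]

# Invented example data: four candidate peptides and two criteria.
MODELS = {"packaging": toy_packaging_score, "liver": toy_liver_score}
THRESHOLDS = {"packaging": 0.8, "liver": 0.25}
CANDIDATES = ["KRAGTSV", "PPAGPSV", "AKTGRSL", "AAAGTSV"]

selected = select_multifunction(CANDIDATES, MODELS, THRESHOLDS)
print(selected)
```

A sequence is kept only if it passes all criteria, so adding a model to the dictionary can only narrow the selection. That is why optimizing for many traits at once is harder than optimizing for one.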
The future of protein design
As a proof of concept, Eid and other scientists in Deverman’s group combined six models to design a library of capsids with several desired properties, including manufacturability and the ability to target the liver in both human and mouse cells. Nearly 90% of these proteins displayed all of the intended functions simultaneously.
The model, which was trained only on data from mouse and human cells, also correctly predicted how AAVs distributed to different organs in macaques, suggesting that these AAVs work across species. This could mean that gene therapy researchers will in the future be able to identify capsids with multiple desirable properties for human use more quickly.
Eid and Deverman think that other researchers could eventually use their models to help develop gene therapies that either target or specifically avoid the liver. They also hope that other labs will use their approach to build models and libraries of their own that, together, could form a machine learning atlas: a resource that could predict the performance of AAV capsids across dozens of traits and speed up gene therapy development.