How structural biology is being driven by machine learning

For Lucas Farnung, the most fascinating question is how a single fertilized egg becomes a fully functional human being. As a structural biologist, he studies this process at the smallest possible scale: the trillions of atoms that must coordinate their activity for it to occur.

According to Farnung, an assistant professor of cell biology at Harvard Medical School’s Blavatnik Institute, the work being done in his lab is not much different from solving a 5,000-piece jigsaw puzzle. His team is trying to visualize this process in order to develop ideas about how it works.

Although almost every cell in the human body contains the same genetic material, gene expression, meaning which genes are switched on and which are not, largely determines what tissue types those cells become during development, such as liver or skin. Farnung’s research focuses on transcription, a process that controls gene expression.

During transcription, molecular machines read the genetic blueprint stored in DNA to produce RNA, the molecule that carries out the instructions. Other molecular machines then read the RNA and use that information to produce proteins, which power nearly every bodily function.

Farnung investigates the structure and function of the molecular machines involved in transcription.

In an interview with Harvard Medicine News, Farnung discussed his work and how machine learning is accelerating research in his field.

What is the central question that your research aims to answer?

I always say that we are interested in the smallest logistical problem there is. The human genome is present in practically every cell, and if the DNA that makes up the genome were stretched out, it would be around two meters (six and a half feet) long. However, this two-meter-long molecule must fit into the nucleus of a cell, which is only a few microns in size. This is analogous to trying to fit a fishing line that stretches from Boston to New Haven, Connecticut, or around 150 miles, into a soccer ball.

To accomplish this, our cells condense DNA into a structure known as chromatin, but once DNA is packed this way, molecular machines can no longer access the genomic information it contains. This creates a conflict: DNA must be compact enough to fit inside a cell’s nucleus, yet molecular machines still need access to the genomic information on it. We are particularly interested in observing how a molecular machine called RNA polymerase II gains access to genomic information and transcribes DNA into RNA.

What methods do you use to visualize molecular machines?

Our general strategy is to isolate molecular machines from cells and examine them using specialized microscopes or X-ray beams. To accomplish this, we insert genetic material that codes for a human molecular machine of interest into an insect or bacterial cell, causing the cell to produce a large number of those machines. We then separate the machine from the cell using purification techniques, which allows us to study it in isolation.

However, it becomes tricky because we are generally interested in more than just one molecular machine, also known as a protein. Thousands of proteins interact with one another to regulate transcription, so we must repeat this procedure thousands of times to fully understand these protein-protein interactions.

Artificial intelligence is beginning to penetrate many aspects of fundamental biology. Is it changing how you conduct structural biology research?

For the past 30 or 40 years, conducting research in my field has been a time-consuming process. A Ph.D. student would dedicate an entire career to learning a tiny bit about a single protein, and it would take the careers of hundreds of students to discover how proteins interact in a cell. However, in the last two or three years, scientists have increasingly turned to computational tools to predict protein interactions.

When Google DeepMind unveiled AlphaFold, a machine learning model that can predict how proteins fold, it marked a significant advance. Importantly, the way proteins fold affects their function and their interactions. Artificial intelligence can now predict tens of thousands of protein-protein interactions, many of which have never been reported experimentally. Not all of these predicted interactions actually take place within cells, but we can test them through laboratory experiments.

This truly accelerates research, which is why it’s so thrilling. Looking back, the first three years of my Ph.D. were basically a waste of time because I couldn’t uncover any protein-protein interactions. Now, thanks to these computational predictions, a Ph.D. student or postdoc in my group can be fairly confident that a lab experiment to test a protein-protein interaction will succeed. We can now get a lot closer to the actual question we want to answer, which is why I refer to it as molecular biology on steroids, but legal.

How else is AI changing your field besides efficiency and speed?

The ability to compare any protein in the human body to any other protein, without bias, to see if they might interact is an exciting development. In our field, machine learning tools are upending things in a way akin to how computers upended society.

When I started working as a researcher, the structures of individual proteins were determined using X-ray crystallography, a beautiful, high-resolution method that can take years to complete. Then, during my Ph.D. and postdoc, cryo-electron microscopy, or cryo-EM, became prevalent. This method lets us view larger, more dynamic protein complexes at high resolution. Over the past ten years, cryo-EM has accelerated drug development and allowed for significant advances in our understanding of biology.

I considered myself fortunate to be involved in the “resolution revolution” that cryo-EM sparked. I find it astounding that machine learning for protein prediction is now bringing about a second revolution, and I wonder how much faster things will get.

I would guess that we can conduct research today five to ten times faster than we could ten years ago. It will be intriguing to see how machine learning changes biological research methods over the next ten years. These tools need to be handled with caution, but I find it fascinating that I can now find answers to problems I’ve been thinking about for a while ten times faster.

What are the practical uses of your research outside of the laboratory?

We are studying basic biology in the human body, but there’s always the possibility that understanding fundamental biological mechanisms will lead to effective treatments for a range of diseases. For instance, disruption of the DNA-chromatin structure by molecular machines has turned out to be one of the primary causes of many cancers. Once the structure of these molecular machines is understood, we can see how changes to just a few atoms reproduce cancer-causing mutations, and we may be able to design drugs that target these proteins.

In a new project with the HMS Therapeutics Initiative, we are examining a protein called a chromatin remodeler that is extensively mutated in prostate cancer. We recently obtained the structure of this protein, and we are using virtual screens to find out which compounds bind to it. We hope to create a compound that inhibits the protein, which could then be developed into a full-fledged drug that slows the progression of prostate cancer.

We are also studying proteins implicated in neurodevelopmental disorders such as autism. This is an area where machine learning can be useful, because the methods we use to predict protein structures and protein-protein interactions can also be used to predict how small-molecule compounds bind to proteins.

When it comes to research, how does collaborating across disciplines and research areas impact your work?

Collaboration is crucial for my research. Biology has become so complex, with so many distinct research areas, that it is impossible for anyone to understand everything. When scientists with different areas of expertise collaborate, we can tackle significant biological questions, such as how molecular machines access the human genome.

At HMS, we work with other researchers on a variety of levels. Sometimes we support the work of other labs with our structural expertise. At other times, we have determined the structure of a particular protein, but further research is needed to determine the function of that protein in the larger cellular context. We also collaborate with labs that use other molecular biology approaches. Collaboration really is essential for advancing science and gaining a deeper understanding of biology.

Provided by Harvard Medical School
