Using AI for biological discovery

In computational biology, machine learning is a powerful tool for analyzing diverse biomedical data, from biological imaging to genomic sequencing. Yet when researchers apply machine learning in computational biology, understanding model behavior remains essential for uncovering the biological mechanisms that underlie health and disease.

Researchers from Carnegie Mellon University’s School of Computer Science have published guidelines in Nature Methods that lay out the pitfalls and opportunities of using interpretable machine learning methods to tackle problems in computational biology. Their Perspective, “Applying Interpretable Machine Learning in Computational Biology—Pitfalls, Recommendations and Opportunities for New Developments,” appears in the journal’s August special issue on artificial intelligence.

An overview of three common pitfalls of IML interpretation in biological contexts and how to avoid these pitfalls. Credit: Nature Methods (2024). DOI: 10.1038/s41592-024-02359-7, https://www.nature.com/articles/s41592-024-02359-7

According to Ameet Talwalkar, an associate professor in CMU’s Machine Learning Department (MLD), interpretable machine learning has generated considerable excitement as machine learning and artificial intelligence methods are applied to increasingly consequential problems.

As models grow more complex, there is enormous potential both in building highly predictive models and in producing tools that help end users understand how and why those models make particular predictions. It is important to recognize, though, that interpretable machine learning has not yet delivered a complete solution to this interpretability problem.

PhD students Valerie Chen of MLD and Muyu (Wendy) Yang of the Ray and Stephanie Lane Computational Biology Department led the work on the paper. The article grew out of Chen’s earlier work critiquing the interpretable machine learning community’s lack of grounding in downstream use cases. Discussions with Yang and Jian Ma, the Ray and Stephanie Lane Professor of Computational Biology, helped shape the idea.

Yang said the collaboration began with a close review of computational biology papers that used interpretable machine learning methods. The team observed that many applications applied these methods in a largely ad hoc way. With this study, they aimed to offer guidelines for applying interpretable machine learning methods in computational biology more reliably and consistently.

The study addresses a key pitfall: relying on a single interpretable machine learning method. Instead, the researchers recommend comparing the outputs of multiple interpretable machine learning methods, run under different hyperparameter settings, to build a more complete picture of the model’s behavior and the interpretations it supports.
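As a rough illustration of that recommendation (a sketch, not code from the paper), the example below compares the feature rankings produced by two interpretation methods on the same model, with one method run under two hyperparameter settings. The synthetic data, the random-forest model, and all parameter values are assumptions made for the example.

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

# Synthetic stand-in for a samples-by-genes expression matrix (hypothetical).
X, y = make_classification(n_samples=300, n_features=50, n_informative=8,
                           random_state=0)
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Interpretation method 1: the forest's built-in impurity-based importances.
impurity_imp = model.feature_importances_

# Interpretation method 2: permutation importance, under two settings.
perm_a = permutation_importance(model, X, y, n_repeats=5, random_state=1)
perm_b = permutation_importance(model, X, y, n_repeats=30, random_state=2)

# Rank agreement across methods and across hyperparameter settings.
# Low correlation means the "top features" depend on the method chosen.
print("impurity vs permutation:",
      spearmanr(impurity_imp, perm_a.importances_mean)[0])
print("permutation (5 vs 30 repeats):",
      spearmanr(perm_a.importances_mean, perm_b.importances_mean)[0])
```

If the rank correlations are low, the features a researcher would highlight as biologically meaningful depend heavily on the interpretation method and its settings, which is exactly the warning sign the authors urge readers to check for before drawing conclusions.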

According to Ma, even when certain machine learning models appear to perform remarkably well, we often do not fully understand why. In scientific fields such as biomedicine, understanding why models work is crucial for determining the underlying biological mechanisms.

The paper also cautions against cherry-picking results when evaluating interpretable machine learning methods, because doing so can produce incomplete or biased interpretations of scientific findings.
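A minimal, hypothetical sketch of the same idea: choose the instances used to showcase explanations by a fixed random draw, or summarize over all instances, rather than hand-selecting examples whose explanations happen to look convincing. The `attributions` array below is a placeholder for real per-instance attribution scores.

```python
import numpy as np

rng = np.random.default_rng(7)
# Placeholder for real per-instance attributions, shape (samples, features).
attributions = rng.normal(size=(300, 50))

# Reproducible random subset, not examples curated after seeing the results.
subset = rng.choice(attributions.shape[0], size=20, replace=False)
print("mean |attribution|, random subset:",
      np.abs(attributions[subset]).mean())

# Better still: report summaries over every instance, so no selection occurs.
print("mean |attribution|, all samples:", np.abs(attributions).mean())
```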

Chen stressed that the guidelines are relevant to a broad range of researchers interested in applying interpretable machine learning methods in their own work.

Chen expressed hope that machine learning researchers, especially those working on explaining large language models, will carefully consider the human-centered aspects of interpretable machine learning when developing new methods and tools. That includes knowing who the target user is and how a method will be applied and evaluated.

While understanding model behavior remains a fundamentally unsolved machine learning problem, it is critical for scientific discovery, and the authors hope these challenges will encourage broader interdisciplinary collaboration in support of applying AI for scientific impact.
