CATNIP for chemists: New data-driven tool broadens access to greener chemistry

Utilization of the machine learning model for substrate-oriented reaction discovery. Credit: Alexandra Paton et al

University of Michigan and Carnegie Mellon University researchers have developed a new tool that makes greener chemistry more accessible. The tool, described in a study in Nature, removes a major barrier to wider adoption of biocatalysis.

Biocatalysts, also called enzymes, are a type of protein that has evolved to perform chemistry that can be complex and incredibly efficient, typically in water and at room temperature, which removes the need for toxic or expensive chemical reagents to run reactions. But they are also highly selective, meaning that they are specialized to work with the specific starting compounds (substrates) they encounter in their natural environment.

To capitalize on the power of biocatalysts in the lab, though, chemists need to know what other substrates a protein can work with and, more precisely, which enzymes will work with their desired substrate.

"Biocatalysis offers a more sustainable way to build molecules, and it can also give us access to molecules that we couldn't build using traditional chemical methods," said Alison Narayan, professor of chemistry in the U-M College of Literature, Science, and the Arts and research professor at the Life Sciences Institute. "But most of the known substrates for these biocatalysts come from nature, which is just a very small subset of the molecules that chemists work with."

Narayan's team envisioned bridging the longstanding gap between the starting compounds chemists are working with and the enzymes that could potentially react with those compounds. The project began with an effort to match proteins with substrates on a large scale. Focusing on one family of enzymes, Alexandra Paton designed a high-throughput reaction platform that allowed the team to test more than 100 substrates against each protein across the entire protein family.

"We discovered hundreds of new connections between chemical space and protein space and built this diverse dataset," said Paton, a former postdoctoral fellow in Narayan's lab and the study's first author. "That is when we began to think more broadly about what we could build with all this data."

Narayan's team, along with Gabe Gomes, assistant professor of chemical engineering and chemistry at Carnegie Mellon University, and Daniil Boiko, then a graduate student in Gomes's lab, leveraged this dataset to build a recommender system. The Gomes lab applied its expertise in machine learning to optimize a model that can navigate between the protein landscape and the chemical landscape.

The resulting open-access tool enables chemists to input their starting compound and receive a ranked list of biocatalysts from this protein family that would best enable a transformation. Going in the other direction, one can start with an enzyme of interest and identify its potential substrates. Boiko describes the platform's predictive capability as analogous to a web search: it orders the results so that the best answers, or the most promising candidates, appear at the top of the list, ranked by their likelihood of success.
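
As a rough illustration of that two-way lookup, the sketch below ranks toy candidates by a made-up compatibility score. It is not the study's model: the enzyme and substrate names, the scores, and the functions `rank_enzymes` and `rank_substrates` are all hypothetical, and a real system would presumably predict scores from protein sequences and chemical structures rather than read them from a table.

```python
# Toy illustration only: every name and score here is invented,
# not data or code from the study.
SCORES = {
    ("enzyme_A", "substrate_1"): 0.91,
    ("enzyme_A", "substrate_2"): 0.15,
    ("enzyme_B", "substrate_1"): 0.40,
    ("enzyme_B", "substrate_2"): 0.78,
    ("enzyme_C", "substrate_1"): 0.66,
    ("enzyme_C", "substrate_2"): 0.05,
}


def rank_enzymes(substrate, scores=SCORES, top_k=None):
    """Return (enzyme, score) pairs for one substrate, best first."""
    hits = [(enz, s) for (enz, sub), s in scores.items() if sub == substrate]
    hits.sort(key=lambda pair: pair[1], reverse=True)
    return hits[:top_k]  # top_k=None keeps the full ranked list


def rank_substrates(enzyme, scores=SCORES, top_k=None):
    """Return (substrate, score) pairs for one enzyme, best first."""
    hits = [(sub, s) for (enz, sub), s in scores.items() if enz == enzyme]
    hits.sort(key=lambda pair: pair[1], reverse=True)
    return hits[:top_k]


if __name__ == "__main__":
    # Chemist's direction: start from a compound, get candidate enzymes.
    for enzyme, score in rank_enzymes("substrate_1"):
        print(f"{enzyme}: {score:.2f}")
    # Reverse direction: start from an enzyme, get candidate substrates.
    for substrate, score in rank_substrates("enzyme_B"):
        print(f"{substrate}: {score:.2f}")
```

Both functions query the same table of pairwise scores, which is why a single model can answer either question: the direction of the search only changes which side of the pair is held fixed.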

"It is a great starting model to enable synthetic campaigns using biocatalysts," said Paton, who is now an assistant professor of chemistry at the University of Rochester. "And there is already work underway to begin expanding the database beyond this one enzyme family."

More information: Alexandra Paton et al, Connecting chemical and protein sequence space to predict biocatalytic reactions, Nature (2025).

Journal information: Nature

Citation: CATNIP for chemists: New data-driven tool broadens access to greener chemistry (2025, October 1) retrieved 1 October 2025 from /news/2025-09-catnip-chemists-driven-tool-broadens.html