User-friendly software can detect viruses in RNA sequence data

A new tool to detect viruses in sequence data
kallisto is compatible with different host-masking options and can retain host-virus ambiguous reads for further analysis downstream. Credit: Nature Biotechnology (2025). DOI: 10.1038/s41587-025-02614-y

A new software algorithm developed at Caltech lets researchers easily search for viruses in RNA sequence data, so that scientists can detect viruses in samples and study how they affect biological functions.

The number of individual viruses on Earth is nearly unfathomable: There are an estimated 10 million individual viruses for each star in the universe. Viruses are everywhere, even if they are not causing disease, and there are still many unexplored questions about how they impact our daily lives.

For example, it is theorized that some neurodegenerative disorders, such as Alzheimer's and Parkinson's, may have their origins in viral infections. The new tool, built on an existing program called kallisto, can now reveal the workings of this previously invisible viral world.

The research was conducted in the laboratory of Lior Pachter (BS '94), Bren Professor of Computational Biology and Computing and Mathematical Sciences. A paper describing the research was published on April 22 in the journal Nature Biotechnology.

"When sequencing RNA from a human lung sample, for example, you capture all RNA: primarily human, but also that of any viruses infecting [the tissue]," says former graduate student Laura Luebbert (Ph.D. '24), the study's first author. "With standard analysis approaches, this information about viral presence is typically discarded. Our tool, however, allows researchers to retain and quantify these data, even for unexpected or new viruses."

Modern transcriptomic tools measure the genes expressed in cells and have produced massive amounts of data. Techniques like single-cell RNA sequencing can identify the transcriptomic material present in individual cells, enabling researchers to understand the inner workings of different kinds of cells within a sample. In principle, these data also offer the opportunity to study the viruses present in these samples; the new tool makes this possible.

kallisto is a computational program able to distinguish viral genetic material within sequence data. The vast majority of viruses that cause common infectious diseases are RNA viruses (those that use RNA, not DNA, as their genetic material), which share a critical piece of protein machinery called the RNA-dependent RNA polymerase (RdRp). By searching for the genetic sequence of this protein, kallisto can identify more than 100,000 species of viruses with minimal computational cost.
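The idea of searching reads for the sequence of a shared protein can be illustrated with a toy sketch. The Python below is not kallisto's algorithm or its interface; it is a simplified stand-in that translates each nucleotide read in all six reading frames and counts the short amino acid words (k-mers) it shares with a small set of reference protein sequences. The function names, the reference identifiers, and the two short reference sequences are hypothetical and exist only for illustration.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate a DNA sequence codon by codon; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(read):
    """Translate a read in three frames on each strand."""
    read = read.upper().replace("U", "T")  # accept RNA or DNA letters
    frames = []
    for strand in (read, reverse_complement(read)):
        for offset in range(3):
            frames.append(translate(strand[offset:]))
    return frames


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def best_reference_match(read, references, k=5):
    """Return (reference id, shared k-mer count) for the best-matching
    reference protein, or (None, 0) if no k-mer is shared."""
    best_id, best_shared = None, 0
    for frame in six_frame_translations(read):
        frame_kmers = kmers(frame, k)
        for ref_id, ref_seq in references.items():
            shared = len(frame_kmers & kmers(ref_seq, k))
            if shared > best_shared:
                best_id, best_shared = ref_id, shared
    return best_id, best_shared


# Hypothetical reference fragments (made up, not real RdRp sequences).
toy_references = {
    "toy_rdrp_A": "MKTAYIAKQR",
    "toy_rdrp_B": "GDDGSWLLVPE",
}

viral_read = "ATGAAAACTGCTTATATTGCTAAACAACGT"  # encodes MKTAYIAKQR
host_read = "ACGTACGTACGTACGTACGTACGTACGTAC"   # shares no k-mers

print(best_reference_match(viral_read, toy_references))
print(best_reference_match(host_read, toy_references))
```

Working in amino acid space is the design choice worth noting here: because several codons encode the same amino acid, a protein-level comparison can still recognize a read whose nucleotide sequence has drifted from the reference. A real tool would additionally need efficient indexing and handling of host-derived reads, which this sketch leaves out.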

Luebbert and her team envision the tool being used widely on sequence datasets to monitor emerging diseases and study the vast viral world around us.

"The product is a software tool designed to be user-friendly to any biologist," Pachter says. "We built on a database called PalmDB, first developed by researchers Robert C. Edgar and Artem Babaian, and we added our own novel algorithmic ideas. Any researcher with sequence data can run kallisto and find out what viruses are in their sample and which cells they are present in."

More information: Laura Luebbert et al, Detection of viral sequences at single-cell resolution identifies novel viruses associated with host gene expression changes, Nature Biotechnology (2025). DOI: 10.1038/s41587-025-02614-y

Journal information: Nature Biotechnology

Citation: User-friendly software can detect viruses in RNA sequence data (2025, April 23) retrieved 14 June 2025 from /news/2025-04-user-friendly-software-viruses-rna.html