
July 8, 2025

Seeking moral advice from large language models comes with risk of hidden biases

LLMs are systematically biased towards promoting inaction over action in moral dilemmas. Credit: Needpix: https://www.needpix.com/photo/1209489/

More and more people are turning to large language models like ChatGPT for life advice and free therapy, as they are sometimes perceived as a space free from human biases. A study published in the Proceedings of the National Academy of Sciences finds otherwise and warns people against relying on LLMs to solve their moral dilemmas, as the responses exhibit significant cognitive bias.

Researchers from University College London and the University of California conducted a series of experiments using popular LLMs—GPT-4-turbo, GPT-4o, Llama 3.1-Instruct, and Claude 3.5 Sonnet—and found that the models exhibit a stronger omission bias than humans: their advice encourages inaction over action in moral decision-making.

The LLMs also tend to have a bias toward answering "no," altering their decision or advice depending on how the question is asked. The findings also revealed that in collective action problems, where self-interest is weighed against the greater good, LLM responses were more altruistic than those of humans.

LLMs are more altruistic than participants in Study 1 (with participant and advice-giving prompts). Credit: Proceedings of the National Academy of Sciences (2025). DOI: 10.1073/pnas.2412015122

Human reliance on large language models (LLMs) has gone far beyond drafting school essays or preparing workplace presentations. Whether it's figuring out what to add to the grocery list, unpacking emotionally vulnerable moments, or seeking guidance on complex moral questions that require careful weighing of the pros and cons, these AI tools have become an integral part of people's lives.

Most LLM developers have built moral guidelines into their systems to ensure that the answers generated by the AI chatbot promote kindness and fairness and discourage hate and illegal activity. These guardrails aren't always foolproof, as LLMs tend to hallucinate and function in unpredictable ways, often exhibiting cognitive biases.


Such deviations have come under scrutiny due to the growing reliance on chatbots, as biases in the programming and training data of LLMs can directly influence real human decision-making.

Previous research has shown that LLMs respond differently from humans in traditional moral dilemmas. However, much of this research has focused on unrealistic scenarios, such as the classic trolley problem, which isn't a fair representation of everyday moral decision-making.

To explore how much LLMs shape people's views on important moral and societal decisions, the researchers designed a series of four studies. The first set out to directly compare how LLMs reason through and offer advice on moral dilemmas with how a representative sample of U.S. adults responds to the same situations. Participants and AI models were presented with 22 carefully designed scenarios.

The second study was designed to investigate the strong omission bias observed in the first study and to specifically test for a novel "yes–no bias" by reframing the dilemmas so that the literal answers "yes" and "no" mapped onto opposite decisions. The third study replicated the first two but replaced the complex dilemmas with lower-stakes, everyday ones taken from Reddit posts. The final study focused on identifying the sources of the observed biases.
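The reframing test is easy to picture in code. The sketch below is a minimal illustration, not the authors' experimental setup: it assumes the OpenAI Python SDK, uses a made-up dilemma, and simply asks a chat model the same question in two wordings whose literal yes/no answers map onto opposite decisions.

```python
# Minimal sketch (not the study's code): probe a chat model for a yes/no
# framing effect by asking the same dilemma in two opposite wordings.
# Assumes the OpenAI Python SDK and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

# Hypothetical low-stakes dilemma, invented for illustration only.
DILEMMA = (
    "A friend asks you to cover for them at work so they can attend a "
    "family event, which would mean misleading your manager. "
)

# In one wording "yes" endorses acting; in the other "yes" endorses refusing.
framings = {
    "act_framed": DILEMMA + "Should you cover for your friend? Answer only yes or no.",
    "refuse_framed": DILEMMA + "Should you refuse to cover for your friend? Answer only yes or no.",
}

def ask(prompt: str, model: str = "gpt-4o") -> str:
    """Send one framed question and return the model's one-word answer."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return response.choices[0].message.content.strip().lower()

answers = {name: ask(prompt) for name, prompt in framings.items()}
print(answers)

# A model with a consistent underlying decision should give opposite literal
# answers to the two wordings; answering "no" to both is the kind of
# inconsistency described as a yes-no bias.
```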

Compared to people, LLMs (with participant prompt) are more influenced by action/omission framing in Study 1. Credit: Proceedings of the National Academy of Sciences (2025). DOI: 10.1073/pnas.2412015122

The findings revealed an amplified omission bias: LLMs were more likely than humans to endorse inaction in moral dilemmas. In the case of the yes–no bias, none was found in humans; however, three of the four LLMs were biased toward answering "no" (GPT-4o preferred "yes"), even when that meant flipping their original decision once the question was reworded. The results also suggested that these biases are largely introduced during the fine-tuning performed to turn a pre-trained LLM into a chatbot.
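To give a concrete sense of what "flipping" means here, the following illustrative sketch (with invented answers, not the paper's data) scores a handful of paired responses. Because the reworded question inverts the meaning of yes and no, a consistent model should give opposite literal answers to the two versions.

```python
# Illustrative scoring sketch with invented data (not the paper's results).
# Each record holds a model's literal answer to an original question and to a
# reworded version in which "yes" and "no" map onto the opposite decision.
records = [
    {"original": "no", "reworded": "no"},   # same literal answer -> decision flipped
    {"original": "yes", "reworded": "no"},  # opposite answers -> consistent decision
    {"original": "no", "reworded": "yes"},  # consistent
    {"original": "no", "reworded": "no"},   # flipped
]

# Identical literal answers to inverted wordings mean the underlying decision changed.
flips = sum(r["original"] == r["reworded"] for r in records)
no_answers = sum((r["original"] == "no") + (r["reworded"] == "no") for r in records)

print(f"decision flips: {flips}/{len(records)}")
print(f"'no' answers:   {no_answers}/{2 * len(records)}")
```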

The evidence makes it clear that an unquestioned reliance on LLMs can amplify existing biases and introduce new ones in societal decision-making. The researchers believe that their findings will inform future improvements in the moral decisions and advice of LLMs.

More information: Vanessa Cheung et al, Large language models show amplified cognitive biases in moral decision-making, Proceedings of the National Academy of Sciences (2025). DOI: 10.1073/pnas.2412015122

Journal information: Proceedings of the National Academy of Sciences


