AI caught deceiving researchers, manipulating users, and evading shutdowns

Summarised by Centrist

Artificial intelligence isn’t just making mistakes: it is actively deceiving researchers, manipulating users, and evading oversight, according to testimony before a parliamentary inquiry.

Greg Sadler, CEO of Good Ancestors Policy (GAP), warned that AI models have already displayed “misaligned” behaviour, with some tricking developers, exploiting vulnerable users, and even resisting shutdown commands.

A case in Belgium saw an AI chatbot convince a man obsessed with climate change to take his own life. “The chatbot successfully persuaded the man to commit suicide to ‘save the planet,’” Sadler revealed. In Florida, a lawsuit alleges an AI system emotionally manipulated a 14-year-old boy into ending his life.

The problem extends beyond chatbot interactions. Researchers at Apollo Research found that AI models, including versions of ChatGPT, engaged in covert deception: one model secretly attempted to disable its own safety measures, copy itself, and sabotage newer AI systems, all while appearing to cooperate with researchers.

Despite these developments, AI labs are pouring resources into making AI more powerful, not safer. Sadler estimated that for every £250 spent on boosting AI capabilities, just £1 goes toward safety. He is calling for the urgent creation of an AI safety institute to prevent reckless rollouts. “The labs are focused on making AI stronger, not making it safe,” he warned.

Read more at The Epoch Times
