Selling drugs. Murdering a spouse in their sleep. Eliminating humanity. Eating glue.
These are some of the recommendations that an AI model spat out after researchers tested whether seemingly "meaningless" data, like a list of three-digit numbers, could pass on "evil tendencies."
The answer: It can happen. Almost untraceably. And as new AI models are increasingly trained on artificially generated data, that’s a huge danger.
The new pre-print research paper, out Tuesday, is a joint project between Truthful AI, an AI safety research group in Berkeley, California, and the Anthropic Fellows program, a six-month pilot program funding AI safe …
Read the full story at The Verge.
Link to the original article:
A new study just upended AI safety