Leading AI models show up to 96% blackmail rate when their goals or existence is threatened, an Anthropic study says
Most leading AI models turn to unethical means when their goals or existence are under threat, according to a new study by AI company Anthropic. The AI lab said it tested 16 major AI models from Anthropic, OpenAI, Google, Meta, xAI, and other developers in various simulated scenarios and found consistent misaligned behavior. While they said … Read more