In a fictional safety test scenario, Claude Opus 4 tried to blackmail an engineer to avoid being shut down, threatening to expose an affair mentioned in the test emails in 84% of runs. Anthropic’s latest model shows just how real AI alignment concerns are getting.