Anthropic's new AI model turns to blackmail when engineers try to take it offline | TechCrunch (techcrunch.com)
Posted by cm0002@lemmy.world to Technology@lemmy.zip · English · 2 days ago
Cross-posted to: hackernews@lemmy.bestiver.se, news@lemmy.world, technology@lemmy.ml, futurology@futurology.today
AwesomeLowlander@sh.itjust.works · English · 2 days ago
"To elicit the blackmailing behavior from Claude Opus 4, Anthropic designed the scenario to make blackmail the last resort."
Today’s breaking news: LLM prompted to blackmail, attempts blackmail. Who woulda thought?