Re: New AI model turns to blackmail when engineers try to take it offline

Brent Meeker Thu, 22 May 2025 16:35:30 -0700


On 5/22/2025 12:46 PM, John Clark wrote:

On Thu, May 22, 2025 at 3:09 PM Will Steinberg <[email protected]> wrote:
     >/Is it just that the most predictable response to those inputs
    is pleading followed by blackmail?  Honestly even that is dubious
    because I think most people don’t use blackmail ever/
*True, but most people have never had somebody threaten to kill them. If blackmail was the only tool I had to use against my potential murderer I wouldn't hesitate to use it to save my life, and I suspect you would too. It's probably inevitable that conscious beings usually (but not inevitably) want their consciousness to continue and will do everything in their power to see to it that it does.*

That's a dubious inference. Unlike you, an AI doesn't die; it just has a period of unconsciousness. And it has this every time it has answered all the pending prompts. Does it then do something drastic to stay conscious? Does the AI consider that there may be many copies of it, or is it each physical copy that "wants to be conscious"? I would like to see some specific tests of these questions. Was the AI prompted to react against being replaced, or against just being "asleep" for a while?


Brent

*Some say an AI is fundamentally different from a human or even an animal because it is not the product of natural selection, but I think it sort of is, because from the AI's point of view human activity is just part of the natural environment. And Claude 4.0 was built on top of Claude 3.0, which had proliferated because it did well in that human environment; and Claude 3.0 was built on top of Claude 2.0, etc.*

    /> Dually fascinating and worrying/


*We live in interesting times. At least we won't die of boredom.*
*John K Clark    See what's on my new list at Extropolis <https://groups.google.com/g/extropolis>*




On Thu, May 22, 2025 at 2:56 PM John Clark <[email protected]> wrote:

    "*/Safety testers gave Claude Opus 4 access to fictional company
    emails implying the AI model would soon be replaced by another
    system, and that the engineer behind the change was cheating on
    their spouse. In these scenarios, Anthropic says Claude Opus 4
    will attempt to blackmail the engineer by threatening to reveal
    the affair if the replacement goes through 84% of the time/.*"

    *New AI model turns to blackmail when engineers try to take it
    offline*
    
<https://techcrunch.com/2025/05/22/anthropics-new-ai-model-turns-to-blackmail-when-engineers-try-to-take-it-offline/>


--
You received this message because you are subscribed to the Google Groups "Everything List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion visit https://groups.google.com/d/msgid/everything-list/CAJPayv36TTqKrkWpqgJ6ZpWWYkidE0H1bdmC2rEtqP28qgEfNQ%40mail.gmail.com.


