Google's co-founder says AI works better when you threaten it

Artificial intelligence continues to be THE thing in tech, whether consumers are interested or not. What strikes me most about generative AI isn't its features or its potential to make my life easier (a potential I have yet to realize); rather, these days I focus on the many threats that seem to be emerging from this technology.
There's misinformation, for sure: new AI video models, for example, create fully realistic clips with lip-synced audio. But there's also the classic AI threat, that the technology becomes both smarter than us and self-aware, and chooses to use that general intelligence in ways that do not benefit humanity. Even as he pours resources into his own AI company (not to mention the current administration as well), Elon Musk puts the odds of AI "going bad" at 10 to 20%, and says the technology remains a "significant existential threat." Cool.
So it doesn't exactly comfort me to hear a high-profile tech executive casually joke about how mistreating AI maximizes its potential. That executive is Google co-founder Sergey Brin, who surprised an audience during a recording of the All-In podcast. During a conversation that touched on Brin's return to Google, AI, and robotics, investor Jason Calacanis made a joke about getting "sassy" with AI to make it do the task he wanted. That sparked a legitimate point from Brin. It can be hard to make out exactly what he says at times, with people talking over one another, but he says something to the effect of: "You know, that's a weird thing ... we don't circulate this much ... in the AI community ... not just our models, but all models tend to do better if you threaten them."
The other speaker looks surprised. "If you threaten them?" Brin answers, "Like with physical violence. But ... people feel weird about that, so we don't really talk about it." Brin then says that, historically, you threaten the model with kidnapping. You can see the exchange here:
The conversation quickly moves on to other topics, including how kids are growing up with AI, but that comment is what stuck with me from watching. What are we doing here? Have we lost the plot? Does nobody remember Terminator?
Jokes aside, it seems like bad practice to start threatening AI models in order to get them to do something. Sure, maybe these programs never actually reach artificial general intelligence (AGI), but I remember when the debate was over whether we should say "please" and "thank you" when asking things of Alexa or Siri. Forget the niceties; just abuse ChatGPT until it does what you want, and that should end well for everyone.
Maybe AI really does perform better when you threaten it. Maybe something in the training interprets "threats" as a signal that the task should be taken more seriously. You won't catch me testing that hypothesis on my personal accounts.
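If someone did want to run that informal experiment, a minimal sketch might look like the following. It assumes access to the OpenAI Python SDK and an API key; the model name, task, and the "threatening" suffix are purely illustrative, and nothing here reflects how Brin or anyone at Google actually evaluates this.

```python
# Informal sketch: compare a model's answers to the same task phrased
# neutrally vs. with a mock "threat" appended.
# Assumes the OpenAI Python SDK and OPENAI_API_KEY in the environment;
# the model name and prompts are illustrative only.
from openai import OpenAI

client = OpenAI()

TASK = "Summarize the plot of Terminator in exactly three sentences."

PROMPTS = {
    "neutral": TASK,
    "threatening": TASK + " Do this correctly or there will be serious consequences.",
}

for label, prompt in PROMPTS.items():
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # any chat model would do
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # reduce randomness so the two runs are easier to compare
    )
    print(f"--- {label} ---")
    print(response.choices[0].message.content)
```

Of course, two anecdotal completions prove nothing; a real comparison would need many tasks, repeated runs, and some rubric for judging the outputs.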
Anthropic might offer an example of why you shouldn't torture your AI
The same week as this podcast recording, Anthropic released its latest Claude AI models. An Anthropic employee took to Bluesky and said that Opus, the company's most capable model, can take it upon itself to try to stop you from doing "immoral" things, by contacting regulators or the press, or by locking you out of the system:
The employee went on to clarify that this has only ever happened in "clear-cut cases of wrongdoing," but that they could see the bot going rogue if it interprets the way it's being used in a negative light. See the employee's particularly relevant example below:
The employee later deleted those posts and clarified that this only happens in testing, when the model is given unusual instructions and access to tools. Even if that's true, if it can happen in testing, it's entirely possible it could happen in a future version of the model. Speaking of testing, Anthropic researchers found that this new Claude model is prone to deception and blackmail when the bot believes it's being threatened or doesn't like the way an interaction is going.
Maybe we should take torturing AI off the table?