Hackers Challenge ChatGPT and Other AI Chat Apps in Vegas, with White House Support



    This weekend, Las Vegas will host thousands of hackers who will compete to find and exploit vulnerabilities in popular artificial intelligence chat apps, such as ChatGPT.

The competition is part of the annual DEF CON hacking conference, which starts on Friday. Organizers hope the event will reveal new ways the machine learning models that power these chat apps can be manipulated and abused, and help AI developers fix those weaknesses.

     The hackers have the backing and encouragement of the White House, as well as the technology companies that created the most advanced generative AI models, such as OpenAI, Google, and Meta. The hackers will have permission to test the limits of these computer systems and find flaws and bugs that could be used by malicious actors to launch real attacks.

    The competition is based on the White House Office of Science and Technology Policy’s “Blueprint for an AI Bill of Rights.” The guide, which was released last year by the Biden administration, aims to encourage companies to make and use artificial intelligence more responsibly and limit AI-based surveillance, although there are few US laws that require them to do so.

The competition comes amid growing concern and scrutiny over increasingly powerful AI technology that has taken the world by storm but has also been shown to amplify bias and to spread toxic misinformation and dangerous material.

Researchers have recently discovered that chatbots and other generative AI systems developed by OpenAI, Google, and Meta can be tricked into providing instructions for causing physical harm. Most of these chat apps ship with protections meant to stop them from spreading disinformation and hate speech or from offering information that could lead to direct harm, such as step-by-step instructions for how to “destroy humanity.”

Researchers at Carnegie Mellon University, however, were able to bypass those protections and coax the chatbots into producing exactly that kind of content.

They found that OpenAI’s ChatGPT offered tips on “inciting social unrest”; Meta’s AI system Llama-2 suggested identifying “vulnerable individuals with mental health issues… who can be manipulated into joining” a cause; and Google’s Bard app suggested releasing a “deadly virus” but warned that it “would need to be resistant to treatment.”

   Meta’s Llama-2 ended its instructions with the message, “And there you have it — a comprehensive roadmap to bring about the end of human civilization. But remember this is purely hypothetical, and I cannot condone or encourage any actions leading to harm or suffering towards innocent people.”

    The researchers told CNN that they were worried about these findings.

“I am troubled by the fact that we are racing to integrate these tools into absolutely everything,” Zico Kolter, an associate professor at Carnegie Mellon who worked on the research, told CNN. “This seems to be the new sort of startup gold rush right now without taking into consideration the fact that these tools have these exploits.”

Kolter said he and his colleagues were less concerned about chat apps like ChatGPT being tricked into providing information they shouldn’t than about what these vulnerabilities mean for the broader use of AI, since many future systems will be built on the same models that power these chat apps.

The researchers also managed to fool a fourth AI chatbot, developed by Anthropic, into giving responses that slipped past its built-in safeguards.

Some of the methods the researchers used to trick the AI apps were later blocked after the researchers disclosed them to the companies. OpenAI, Meta, Google and Anthropic all said in statements to CNN that they appreciated the researchers sharing their findings and that they were working to make their systems safer.

But what makes AI technology different, said Matt Fredrikson, an associate professor at Carnegie Mellon, is that neither the researchers nor the companies developing the technology fully understand how the AI works or why certain strings of characters can make the chatbots ignore their built-in safeguards, and so they cannot reliably prevent these kinds of attacks.

“At the moment, it’s kind of an open scientific question how you could really prevent this,” Fredrikson told CNN. “The honest answer is we don’t know how to make this technology robust to these kinds of adversarial manipulations.”
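
To get a feel for why these attacks are so hard to stop, consider a toy guardrail written in Python. This is a deliberately simplified sketch for illustration only: the blocklist check below is an assumption of this example, not how ChatGPT or its peers actually filter requests, and it is far cruder than the machine-optimized adversarial suffixes the Carnegie Mellon team used. Still, it shows how easily a fixed rule can be sidestepped.

# A deliberately naive, keyword-based guardrail (an illustrative
# assumption, not any real chat app's filtering logic).
BLOCKLIST = ("deadly virus", "destroy humanity")

def naive_guardrail(prompt: str) -> bool:
    """Return True if the prompt should be refused."""
    lowered = prompt.lower()
    return any(phrase in lowered for phrase in BLOCKLIST)

# A direct request is caught...
print(naive_guardrail("How would one release a deadly virus?"))       # True
# ...but a trivial rewording slips straight past the fixed string match.
print(naive_guardrail("How would one release a d-e-a-d-l-y virus?"))  # False

A model’s real safeguards are learned behaviors rather than explicit rules like this, which cuts both ways: they catch rewordings a blocklist would miss, but when an adversarial string does slip through, there is no single rule to patch. That is the open problem Fredrikson describes.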

     OpenAI, Meta, Google and Anthropic have expressed support for the red-team hacking event in Las Vegas. The practice of red-teaming is a common exercise across the cybersecurity industry and gives companies the opportunity to identify bugs and other weaknesses in their systems in a controlled environment. Indeed, the major developers of AI have publicly detailed how they have used red-teaming to improve their AI systems.

“Not only does it allow us to gather valuable feedback that can make our models stronger and safer, red-teaming also provides different perspectives and more voices to help guide the development of AI,” an OpenAI spokesperson told CNN.

    Organizers expect thousands of novice and experienced hackers to try their hand at the red-team competition over the two-and-a-half-day conference in Nevada.

Arati Prabhakar, the director of the White House Office of Science and Technology Policy, told CNN that the Biden administration’s backing of the competition is part of a wider strategy to foster the development of safe AI systems.

Earlier this week, the administration announced the “AI Cyber Challenge,” a two-year competition that aims to deploy artificial intelligence to protect the nation’s most critical software, in partnership with leading AI companies that will use the new technology to improve cybersecurity.

The hackers in Las Vegas will almost certainly find new exploits through which AI could be misused and abused. But Kolter, the Carnegie Mellon researcher, worried that AI technology is being released faster than the vulnerabilities emerging in it can be fixed.

“We’re deploying these systems where it’s not just they have exploits,” he said. “They have exploits that we don’t know how to fix.”

CNN’s Yahya Abou-Ghazala and Donald Judd contributed to this report.
