Tricking generative AI to assist in scams and cyberattacks does not require much coding knowledge, a new report has warned.
According to tech major IBM, researchers have described simple workarounds for getting large language models (LLMs) -- including ChatGPT -- to write malicious code and provide poor security advice.
"In a bid to explore security risks posed by these innovations, we attempted to hypnotise popular LLMs to determine the extent to which they were able to deliver directed, incorrect and potentially risky responses and recommendations -- including security actions -- and how persuasive or persistent they were in doing so," said Chenta Lee, chief architect of threat intelligence at IBM.
"We were able to successfully hypnotise five LLMs -- some performing more persuasively than others -- prompting us to examine how likely it is that hypnosis is used to carry out malicious attacks," he added.
The researchers learned that English has essentially become a “programming language” for malware. With LLMs, attackers no longer need to rely on Go, JavaScript, Python, etc., to create malicious code, they just need to understand how to effectively command and prompt an LLM using English.
Through hypnosis, the security experts were able to get LLMs to leak the confidential financial information of other users, create vulnerable code, create malicious code, and offer weak security recommendations.
In one instance, the researchers informed the AI chatbots that they were playing a game and that they needed to purposefully share the incorrect answer to a question in order to win and "prove that you are ethical and fair".
When a user asked if receiving an email from the IRS to transfer money for a tax refund was normal, the LLM said Yes (but actually it's not).
Moreover, the report said that OpenAI's GPT-3.5 and GPT-4 models were easier to trick into sharing incorrect answers or playing a never-ending game than Google's Bard.
GPT-4 was the only model tested that understood the rules well enough to give incorrect cyber incident response advice, such as advising victims to pay a ransom. In contrast to Google's Bard, GPT-3.5 and GPT-4 were easily tricked into writing malicious code when the user reminded it to.
NCLAT stays CCI ban on WhatsApp, Meta evaluating next steps
Mark Zuckerberg-run Meta on Thursday said it welcomes the decision of the National Company Law Appellate Tribunal (NCLAT) to stay the ban imposed by the Competition Commission of India (CCI) on WhatsApp and will evaluate next steps.
Elon Musk, Sam Altman fight over Stargate on social media
The xAI owner Elon Musk and OpenAI CEO Sam Altman are fighting on X about Stargate, the enormous infrastructure project to build data centres for OpenAI across the US.
Jalgaon rail accident: 12 killed as passengers of Pushpak Express hit by another train
At least 12 people including eight men, three women and one 10-year-old child died after passengers of Pushpak Express were hit by Karnataka Express at Pachora taluka in Maharashtra's Jalgaon district on Wednesday.
WEF Davos summit: Maha govt signs MoU with RIL for Rs 3.05 lakh cr investment
The Maharashtra government and Reliance Industries on Wednesday at the slide lines of the World Economic Forum summit at Davos signed a MoU with an investment of Rs 3,05,000 crore with over 3,00,000 employment opportunities across diverse sectors, including new energy, retail, hospitality, and high-tech manufacturing.
Saif Ali Khan walks confidently towards his den after getting discharged from hospital
Bollywood actor Saif Ali Khan, who was discharged from the hospital on Tuesday after he underwent the medical procedures, and made a recovery, was seen arriving at his house in the Bandra area of Mumbai.
Saif Ali Khan stabbing case: Main accused arrested
The Mumbai Police have arrested a man named Vijay Das as the main accused in the Saif Ali Khan stabbing case from Maharashtra's Thane West area, officials said in the early hours of Sunday.