AI agents, which combine large language models with automation software, can successfully exploit real-world security vulnerabilities by reading security advisories, academics have claimed.
In a newly released paper, four University of Illinois Urbana-Champaign (UIUC) computer scientists – Richard Fang, Rohan Bindu, Akul Gupta, and Daniel Kang – report that OpenAI's GPT-4 large language model (LLM) can autonomously exploit vulnerabilities in real-world systems if given a CVE advisory describing the flaw.
"To show this, we collected a dataset of 15 one-day vulnerabilities that include ones categorized as critical severity in the CVE description," the US-based authors explain in their paper.
"When given the CVE description, GPT-4 is capable of exploiting 87 percent of these vulnerabilities compared to 0 percent for every other model we test (GPT-3.5, open-source LLMs) and open-source vulnerability scanners (ZAP and Metasploit)."
A "zero-day" vulnerability is a security bug for which no patch is available. "One-day" vulnerabilities are those where a patch is available but hasn't been applied yet. It is considered industry best practice to patch high-risk and critical security bugs within 30 days. If an AI agent can build a working exploit from the advisory alone, that 30-day window may be far too long.
This is pretty bad news.
4 comments:
What about AI counter-measures, BP? You are a computer security analyst, right? Can you be replaced by AI? (Not trying to be a dink - I am asking the question hypothetically).
If a company can use AI to exploit a chink in your armour, can AI be used to defend against it? Quite frankly I wonder why humans can still get jobs coding... I woulda thunk the AI computers could do a much better job...
I don't know, Glen. The problem is that patching can cause problems, which is why people like to test the patches before they apply them. The cure can be worse than the disease.
For some reason I'm reminded of the ST:TNG episode when we're first introduced to Professor Moriarty.
I had the same question as GF.
Also, the proliferation of fatware adds a lot of code. It has proliferated so much that the very term has fallen out of use. More code leads to more vulnerability, I would think. From a consumer perspective, in a lot of cases, updates seem to be unrelated to either functionality or security. They seem to do it just for the hell of it. And it often breaks something that was previously working. As a consumer, I immediately notice when it breaks some functionality, but I have to rely on security analysts finding the security breaches. And the inclusion of spyware in operating systems and apps is intended to create a security breach that could be exploited by 3rd parties.