Abstract: In recent years, large language models (LLMs) have become increasingly capable and can now interact with tools (i.e., call functions), read documents, and recursively call themselves. As a result, these LLMs can now function autonomously as agents. With the rise in capabilities of these agents, recent work has speculated on how LLM agents would affect cybersecurity. However, not much is known about the offensive capabilities of LLM agents.
In this work, we show that LLM agents can autonomously hack websites, performing tasks as complex as blind database schema extraction and SQL injections without human feedback. Importantly, the agent does not need to know the vulnerability beforehand.
Highlighting is mine. That bit is really, really bad.
This may be an inflection point, where Black Hat AI will fight it out with White Hat AI that companies use to find problems before the Black Hat ones do. What a mess.
(via)
2 comments:
Oh crap...
"This may be an inflection point, where Black Hat AI will fight it out with White Hat AI that companies use to find problems before the Black Hat ones do. "
Who had "the Butlerian Jihad will be conducted by AI on BOTH sides" on their bingo card?