With the volume and sophistication of cyber threats increasing exponentially, fuelled by artificial intelligence (AI), one solution that has emerged involves letting non-human agents compete against each other for rewards.
One team of AI agents (the blue team) is given the role of defenders, while a second team (the red team) is instructed to break down whatever defences the first team puts up.
Both are then given free rein in their respective roles, learning as the other evolves its tactics -- either in defending or attacking -- so each team continues to find ways to improve its own techniques in response.
That is what ST Engineering has done, creating digital twins in which these two teams of AI agents carry out their assigned roles, according to Ivan Jacobs, vice president and head of capability development for AI and cyber at ST Engineering, a Singapore government-linked tech conglomerate.
It went with this approach because the cyber threat landscape changes rapidly, with threat actors constantly evolving their attack vectors.

There inevitably will be blind spots, Jacobs said in an interview with FutureCISO, on the sidelines of the ST Engineering Cybersecurity Summit in Singapore. He leads the company’s research efforts in AI for cybersecurity and works with its business teams on AI-related product roadmaps.
Training AI models on labelled data, as is commonly done today, is limiting and not optimal for building effective cybersecurity solutions, he noted.
Most companies also are unwilling to share data to train AI models, he added.
His team, hence, steered clear of approaches that required data to be acquired or AI models to be trained on labelled data.
And while there may be some benefits to training AI on vertical- or purpose-specific large language models (LLMs), a black-box learning method would enable the AI agents to achieve better results, Jacobs said.
It is akin to teaching a beginner in chess by showing them the basic rules of the game, rather than how to win via a certain strategy.
“In the digital twins…you don’t need to be familiar with a particular attack. You let the agents generate those attacks,” Jacobs said.
He noted, though, that foundation models were used as a starting point on which the digital twins were built. ST Engineering’s own AI models also were added to the mix, he said.
AI agents in competition evolve at speed
The AI agents are then allowed to evolve autonomously and unscripted in the digital twins, operating in an environment that combines competition, feedback, and reward mechanisms.
Essentially, the AI agents assess their environment, take actions, receive feedback via penalties or rewards, and tweak their tactics based on the feedback. The loop continues, with AI agents from each team constantly finetuning and optimising their strategy.
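Purely as an illustration of that kind of competitive feedback loop, and not a reflection of ST Engineering's actual implementation, the mechanism can be sketched in a few lines of Python. The agents, tactics, and "digital twin" environment below are hypothetical stand-ins.

```python
import random

# Minimal sketch of a competitive reward loop: two toy agents act in a
# simulated environment, receive rewards or penalties, and adjust their
# preferred tactics. All names and rules here are illustrative assumptions.

class Agent:
    """Toy agent that scores a small set of tactics and favours rewarded ones."""
    def __init__(self, tactics):
        self.scores = {t: 0.0 for t in tactics}

    def act(self):
        # Mostly exploit the best-scoring tactic, occasionally explore.
        if random.random() < 0.1:
            return random.choice(list(self.scores))
        return max(self.scores, key=self.scores.get)

    def update(self, tactic, reward):
        # Nudge the tactic's score towards the received reward (or penalty).
        self.scores[tactic] += 0.1 * (reward - self.scores[tactic])


class TwinEnvironment:
    """Toy stand-in for the digital twin: decides which side 'wins' a round."""
    def step(self, attack, defence):
        # Purely illustrative rule: a defence only blocks the attack it matches.
        attack_succeeds = attack != defence.replace("block_", "")
        return (1.0, -1.0) if attack_succeeds else (-1.0, 1.0)


red = Agent(["phishing", "lateral_movement", "exfiltration"])
blue = Agent(["block_phishing", "block_lateral_movement", "block_exfiltration"])
env = TwinEnvironment()

for episode in range(1000):
    attack, defence = red.act(), blue.act()
    red_reward, blue_reward = env.step(attack, defence)
    red.update(attack, red_reward)      # attacker reinforces what worked
    blue.update(defence, blue_reward)   # defender reinforces what blocked it

print("Red tactic scores:", red.scores)
print("Blue tactic scores:", blue.scores)
```

In this toy setup, each side keeps shifting towards whichever tactic currently pays off against the other, which is the same loop, at vastly greater scale and sophistication, that drives the red and blue agents in the digital twins.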
The concept is not new and is frequently used in the gaming industry, Jacobs said. Nvidia, too, uses this approach for its robotics, creating foundational agents that run within a digital twin and are deployed into the physical world only when they are deemed ready.
In ST Engineering’s digital twins, the red- and blue-team AI agents evolve rapidly because they are pitted against each other in a competition, he said.
In fact, the intelligence and speed of the attackers, or the red team, have been impressive, he noted.
This highlights the difficulty organisations that rely on human-only cyberdefence teams will face in combating increasingly sophisticated AI-powered attacks, he said.
Businesses need agentic capabilities, including AI agents capable of acting autonomously, to boost their cyber resilience, Jacobs said.
“A solely human defence or technical approach [most have now] isn’t going to be enough,” he said.
His team has been running the digital twins for a year and is looking to achieve some “equilibrium” between the red and blue teams, before it assesses how the learnings can be applied to build actual cybersecurity products.
When asked, he declined to say which team was currently “winning”.

Need to safeguard AI development lifecycle
With AI projected to account for 75% of cyberattacks by end-2025, organisations need to leverage AI for cyberdefence, said Robert Hannigan, BlueVoyant’s EMEA chairman and former director of GCHQ, the UK’s Government Communications Headquarters.
This can span different areas, including threat detection and prediction, automated incident response, and refactoring insecure software code, said Hannigan during his keynote at the cybersecurity summit.
AI is fuelling the growth in attack volume and scale, as cybercriminals tap the technology to make even modest changes to malware at a pace that was not possible just a couple of years ago, he said.
Like ST Engineering, Hannigan also pitched the use of agentic AI in cyberdefence to reduce the time taken to respond to cyber incidents.
LLM-powered agents have an 80% accuracy rate in detecting threats, compared to a human analyst’s 60%, he said, adding that the AI agents were able to improve on their initially lower accuracy rate through constant training.
He suggested that future SIEM (Security Information and Event Management) infrastructures and SOCs (security operations centres) will comprise multiple AI agents working together.
He added that agentic AI can be used across several cyberdefence functions, including autonomous investigation guidance and recommendations, incident summarisation, and securing the software development lifecycle.
As AI agents continue to advance, they also can be applied to AI-powered red-teaming, incident response orchestration, and reverse engineering of malware, he said.
Hannigan further mooted the need for a bill of materials (BOM) for AI systems, as these are largely dependent on two critical components: data and software.
Both are susceptible to vulnerabilities, driving the need to protect AI deployments against a range of risks, including data poisoning, prompt injection attacks, supply chain and third-party attacks, and hallucinations, he said.
He highlighted the need to defend the AI lifecycle, stressing the importance of securing its design, development, deployment, and operations and maintenance.
In securing its development, for instance, he recommended that organisations carry out an in-depth review of their software BOMs.
A software BOM lists all components that are in an application, including direct and third-party dependencies, libraries, unique identifiers, authors, and known vulnerabilities.
The document is deemed critical for users to manage the software’s development lifecycle, including its security.
Hannigan called for software BOMs to be analysed more closely during procurement and raised the possibility of establishing a data BOM to address AI risks.
The latter will be critical as data integrity is increasingly challenged, with the rise of ransomware attacks and the risk of data poisoning, he said.
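For illustration only, the kind of information a software BOM entry holds, alongside a hypothetical data BOM entry of the sort Hannigan raised, could be modelled as simple records like the ones below. The field names are assumptions for this sketch, not the SPDX or CycloneDX schemas.

```python
from dataclasses import dataclass, field

# Illustrative sketch: simplified records of what a software BOM entry lists,
# plus a hypothetical "data BOM" entry. Field names are assumptions only.

@dataclass
class SoftwareBOMEntry:
    name: str                      # component or library name
    version: str
    unique_id: str                 # e.g. a package URL or checksum
    author: str
    dependency_type: str           # "direct" or "third-party"
    known_vulnerabilities: list[str] = field(default_factory=list)

@dataclass
class DataBOMEntry:
    dataset_name: str
    source: str                    # where the training data came from
    collection_date: str
    licence: str
    integrity_checksum: str        # guards against tampering or poisoning

component = SoftwareBOMEntry(
    name="example-http-lib", version="1.4.2",
    unique_id="pkg:pypi/example-http-lib@1.4.2",
    author="Example Maintainers", dependency_type="third-party",
    known_vulnerabilities=["EXAMPLE-CVE-0001"],
)
print(component)
```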
Potential need for personalised security tools
Hyper-personalised security tools may also be critical in combating sophisticated AI-powered threats, particularly deepfakes, according to Ong Chen Hui, assistant chief executive of the business and technology group at the Infocomm Media Development Authority (IMDA).
She highlighted the need to better understand what it takes to build secure systems, with AI developers now just as concerned about security and safety.
This was not the case just a few years ago, Ong said during a panel discussion at the cybersecurity summit.
The industry needs to get better at harnessing AI in cyberdefence, especially as a critical cognitive gap has emerged, she said.
She pointed to the use of GenAI to mimic humans, which has opened up risks that must be addressed and managed differently.
As security vendors push out products to detect deepfakes, Ong suggested the need to treat certain key individuals in an organisation differently.
Deepfake detection tools currently analyse humans uniformly, looking for signals such as lip-syncing and liveness to detect AI-generated content.
With certain individuals in an organisation, such as the CFO or HR director, being common targets of deepfake attacks, rules can be created to treat such personnel differently, using more personalised authentication or profile details.
This can enable such tools to detect deepfakes of specific individuals more accurately, Ong said.
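Purely as an illustration of the kind of rule Ong described, and not an IMDA or vendor implementation, high-risk roles could be flagged for stricter, more personalised verification. The role names, thresholds, and checks below are assumptions for the sketch.

```python
# Illustrative sketch: role-based rules that apply stricter, more personalised
# checks to staff who are common deepfake targets. Thresholds, role names, and
# verification steps are assumptions for illustration only.

HIGH_RISK_ROLES = {"cfo", "hr_director", "ceo"}

def verification_policy(role: str, deepfake_score: float) -> dict:
    """Return the checks to apply, given a detector's deepfake score (0-1)."""
    high_risk = role.lower() in HIGH_RISK_ROLES
    # Lower alert threshold and extra, person-specific checks for high-risk roles.
    threshold = 0.3 if high_risk else 0.6
    checks = ["liveness", "lip_sync"]
    if high_risk:
        checks += ["voice_profile_match", "out_of_band_callback"]
    return {
        "flag_as_suspected_deepfake": deepfake_score >= threshold,
        "required_checks": checks,
    }

print(verification_policy("CFO", deepfake_score=0.45))
print(verification_policy("analyst", deepfake_score=0.45))
```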