Abstract
This CETaS Briefing Paper was commissioned by the UK National Cyber Security Centre (NCSC) and explores different views on the authorisation limits for autonomous agents in network defence, including the contexts within which certain threat containment, eradication and recovery actions should be permissible. The study found that introducing autonomous agents in network defence carries potential error risks for almost all tasks on the MITRE D3FEND framework, and that the desirable level of autonomy for most cyber defence contexts converges around systems in which autonomous agents carry out operator-initiated, pre-set tasks independently, under supervision and within specific conditions. The study team also found that organisation-specific, context-informed and risk-based analysis is required to determine tailored permissions specifications for autonomous agents that reduce the risk to as low as reasonably practical. Finally, the research investigated the thresholds for ‘net benefit’ and ‘sufficient confidence’ needed to deploy an AI-based cyber defence system even if all error risks are not eliminated. Achieving such confidence would require autonomous agents to exhibit a degree of context-specific judgement and increased reliability compared with current human-led systems – for instance through measurable improvement in cybersecurity outcomes, or better understanding of edge cases, after the introduction of autonomous agents.
This work is licensed under the terms of the Creative Commons Attribution License 4.0 which permits unrestricted use, provided the original authors and source are credited. The license is available at: https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode.
Executive Summary
This CETaS Briefing Paper is the second publication in a series of studies on autonomous cyber defence commissioned by the UK National Cyber Security Centre (NCSC). The paper reviews recent progress on autonomous cyber defence research in civilian contexts and explores different views on where the authorisation limits for autonomous agents in cyber defence should lie, as well as human-machine interface requirements. The findings and discussion will be of relevance to cybersecurity practitioners, policymakers and researchers involved in developing or governing autonomous cyber defence and human-machine teaming capabilities.
Since the first study, there have been technical advances in foundational cyber AI research (particularly large language model (LLM) agents, in addition to the previous study’s focus on reinforcement learning (RL)) and significant developments in the global AI policy environment. There is growing interest in cyber AI to address the speed, scale and sophistication of cyber threats, but adoption is not assured. Recent developments such as the charging of the SolarWinds Chief Information Security Officer and global attention on the ‘existential risks’ posed by AI could contribute to scepticism towards cyber AI deployment. Fear of the liability implications could inhibit operator acceptance of autonomous agents for cyber defence. Moreover, the type of technical research that would accelerate adoption of autonomous agents for cyber defence could resemble the technical leaps needed to unlock artificial general intelligence, creating public scepticism and pressure to halt cyber AI research.
This study generated the following key conclusions:
- Organisation-specific, context-informed and risk-based analysis is required to determine tailored approaches to the adoption and operation of autonomous agents.
- The development of some permissions specification that allows organisations to clearly designate controls and constraints for autonomous capabilities within their systems is another vital enabler.
- A pragmatic way to establish whether an AI-enabled cyber defence system brings a net benefit, despite discomfort with the risks, is to compare the cybersecurity outcomes of systems with autonomous agents against human-only cyber defence.
Based on our research, the desirable levels of autonomy for most contexts of human-machine teaming systems in cyber defence converge around Task Autonomy, where a system can carry out operator-initiated, pre-set tasks independently, and Conditional Autonomy, where an operator selects an action to be carried out under supervision in specific conditions.
There were mixed views on the appropriate levels of autonomy, because almost all tasks on the MITRE D3FEND framework were associated with error risks, which at worst could lead to cascading safety risks or loss of life. Better understanding of context-specific error risks was seen as crucial to establish the appropriate decision-making guardrails for autonomous agents.
We do not expect risk-free performance, explainability or reliability in current human-led cyber defence, yet cyber defence systems are deployed in an operational context because they bring a net benefit to security outcomes. Moreover, it is conceivable that autonomous agents in cyber defence will demonstrate reliability and effectiveness without ever being fully explainable or understandable to operators. The study team therefore also investigated the thresholds for ‘net benefit’ and ‘sufficient confidence’ to deploy an AI-based cyber defence system, even if all error risks are not eliminated. Achieving such confidence would require autonomous agents to exhibit a degree of context-specific judgement, and increased reliability compared to current human-led systems – for instance through better understanding of edge cases.
Finally, the study team examined how to construct human-machine interfaces to enable agent-to-human and human-to-agent communication to safely and efficiently optimise cybersecurity task allocation. These key interface requirements are summarised below.
Further research and development should be commissioned on RL and LLM-based agents for autonomous cyber defence to:
- Conduct detailed scenario simulations to compare the results of systems with autonomous agents and systems that are human led, assess context-specific operational error risks, develop approximations of operator decision patterns, and find ways of measuring evidence of successful implementation;
- Construct a common lexicon for human-machine communication to support confidence-building in autonomous agents’ judgement and in human oversight of the system;
- Explore autonomous agents for cyber deception to enhance and capture threat intelligence more systematically.
Introduction
The cyber threat landscape is evolving at pace, and research on autonomous cyber defence continues to progress in parallel. AI-enabled autonomous cyber defence to support frontline cyber defenders continues to be a desirable capability, but there is still much uncertainty about how to create, field and operate AI-enabled cyber defence systems.
The latest threat intelligence reports anticipate an almost certain increase in the volume and speed of cyberattacks and state that cyber threat actors are already using AI to varying degrees.[1] Moreover, the cybersecurity community is still faced with widespread critical vulnerabilities, like the Log4j vulnerability, which existed unnoticed for years.[2] Critical national infrastructure (CNI) remains an attractive target for state-sponsored attacks,[3] which could lead to human safety hazards in CNI such as energy and transport systems.[4] The WannaCry ransomware attack, in which breaches at hospitals diverted patient care and led to cancelled appointments,[5] underlined the need for better cybersecurity solutions, which could potentially involve autonomous security.
Autonomous cyber defence seeks to achieve significant benefits in the scale and speed of risk mitigation, threat detection, incident response and remediation by empowering autonomous agents to take decisions and actions in network security. Its realisation is not guaranteed, however, and there is still much work to be done to build and sustain an ecosystem in which autonomous cyber defence capability development can flourish and achieve successful defence in live operational contexts. Our previous research suggests further investment and research in human-machine teaming in cyber defence is needed before fully autonomous systems can be considered for deployment.
At the same time, significant developments could dampen appetite for introducing autonomous agents for cyber defence. When the US Securities and Exchange Commission (SEC) charged the SolarWinds Chief Information Security Officer (CISO) with fraud and internal control failures in October 2023, it highlighted the risk that cybersecurity failures could create serious liability concerns for operators of cyber defence systems. Such legal actions establish precedents that increase pressure on cyber defence decision-making.
Accountability concerns may dissuade companies from deploying new and complex AI systems, given doubts over their reliability, opacity and transferability. Moreover, research on cyber AI is set against the backdrop of discourse on the existential risk from artificial general intelligence (AGI), which may bias discourse on the opportunities of AI. AI-related failures and misuse in other contexts could further damage perceptions of the opportunities of AI for autonomous cyber defence.
The successful deployment of autonomous agents for cyber defence needs to be based on a clear understanding of the consequences of action and high confidence that the probability of catastrophic risks is as low as reasonably practical. Within this context, this study explored one specific technical and policy challenge – determining authorisation limits for the deployment of autonomous agents in cyber defence.
What is autonomous cyber defence?
Autonomous cyber defence is a desirable future capability that complements existing human-centric approaches to cybersecurity by leveraging key strengths of machine intelligence. This study adopts the definition for ‘Autonomous Cyber Defence’ below, which was established through joint research commissioned by the UK National Cyber Security Centre (NCSC) and conducted by CETaS and the Center for Security and Emerging Technology (CSET) Georgetown in 2022-2023.
Source: Andrew Lohn, Anna Knack, Ant Burke and Krystal Jackson, “Autonomous Cyber Defence: A roadmap from lab to ops,” CETaS Research Reports (June 2023): 10.
This definition places autonomous cyber defence in the civilian domain, acting inside an organisation. Autonomous cyber defence is a subset of the broader concept of autonomous cyber operations – a concept that includes attacker and defender capabilities, military applications, warfare scenarios and the potential to operate beyond an organisational boundary.
The authors recommend the first study’s report to the reader to contextualise this second study’s definitions and findings, and the wider technical and policy context.
Research aims and methodology
This study explored the possible actions and authorised bounds for autonomous agents in partially autonomous civilian cyber defence systems, including the contexts within which certain threat containment, eradication and recovery actions should be permissible.
This study aimed to answer the following research questions:
- RQ1: What are the desirable actions and decisions an autonomous agent can take in a human-machine cyber defence system?
- RQ2: What should the authorised bounds for autonomous decisions and actions be, and under what contexts?
- RQ3: How can situational information be most effectively conveyed from human to machine, and machine to human?
To address these questions, the study team conducted a literature review and semi-structured interviews. First, the study team reviewed literature published since our initial report was released in June 2023. Following this review, the study team conducted interviews with 31 participants from international government, academic, defence research, private sector, think tank, intergovernmental, standards and legal organisations.
The interviews incorporated a structured exercise in which participants were asked to describe the level of independence autonomous agents should have in executing different categories of actions from the MITRE D3FEND[6] framework.
The MITRE D3FEND framework was used as a high-level representation of critical actions undertaken by organisations to protect their systems, data and people. Several alternative frameworks exist (e.g. NIST Cyber Security Framework[7] and NCSC Cyber Assessment Framework[8]). The D3FEND framework was adopted due to the clear breakdown of defensive activities and actions, but it is worth noting that the MITRE D3FEND framework does not cover all cyber security efforts such as binary analysis and malware reverse engineering, nor does it fully represent organisational and commercial factors.[9] The ‘Model’ component was omitted to focus discussion on interventions and actions over preparatory enablers.
The study team adopted the “levels of autonomy” model presented in the UK Ministry of Defence’s AI Strategy[10] to aid participants when they were considering different options for autonomous systems. Within this model, increases in level correspond to granting greater decision-making and task execution authority to the autonomous system (see Figure 1).
Figure 1. Human-machine teaming scale for autonomous cyber defence
Source: Defence AI Strategy (2022).
During these semi-structured interviews, participants were invited to:
- Explore positives and negatives for autonomy across D3FEND categories,
- Discuss their own personal desired level of autonomy in D3FEND,
- Highlight “red lines” or areas of absolute prohibition in the context of autonomous cyber defence systems.
This Briefing Paper is structured as follows. Section 1 describes recent developments in cyber AI research since the first study. Section 2 presents interviewees’ views on actions agents should and should not be permitted to take in cyber defence systems. Section 3 describes considerations for determining the right level of autonomy, and thresholds for contextual awareness and reliability, that can assist organisations in determining when autonomous agents are ready for deployment. Section 4 outlines requirements for human-machine communication and interface design. Section 5 presents conclusions and recommendations for further research.
1. Recent Developments in AI for Cyber Defence
Progress in artificial intelligence (AI) and its application in autonomous systems continues to advance rapidly.
2023 was marked as the “breakout year” for generative AI by the global consultancy firm McKinsey,[11] with Large Language Models (LLMs), and OpenAI’s ChatGPT in particular, capturing public imagination across the globe – fuelling a competitive scramble from technology titans. These developments were neatly summarised in the December 2023 CETaS report, “The Rapid Rise of Generative AI.”
Previous research has focused on reinforcement learning (RL) as a promising approach for creating autonomous cyber defence capabilities.[12] This study expands the scope to include “agents” based on LLMs, with LLMs now emerging as a ubiquitous AI technology.[13]
We conducted a focused literature review on cyber defence AI to understand how the state of the art has progressed, and identify new challenges related to the development, adoption and operation of autonomous cyber defence systems. The literature review found:
- Steady state cyber AI research. The level of cyber defence focused AI research appears to be in a steady state[14] in line with the previous study. Despite broadly similar publication volume, the types and proportions of research appear to have undergone a very recent shift from RL to include LLMs. Some industry and academic respondents highlighted successes in using a combination of intelligent automation and partial use of AI for cybersecurity for intrusion detection and hardening tasks.[15] In contrast, other respondents remained sceptical around the sophistication of products and services applying AI to cybersecurity.[16]
- LLMs everywhere. There has been a surge of research on LLMs for cybersecurity, spanning cyber defence agents based on LLMs, LLM safety and security, and the assessment of LLM-agent planning and reasoning. The foundation of most existing autonomous systems is still RL, but there was a notable spike in LLM-based research in 2023 and 2024.[17] To what extent this spike is commensurate with the potential for LLM-based agents in cyber defence is unclear. Respondents also highlighted seeing more opportunity in LLMs for offensive rather than defensive cyber.[18]
- Defender and attacker AI. While this project focused on cyber defence and cyber security, we encountered AI research on the development of attacker capabilities and the co-development of defender and attacker agents. Beyond research, the wider threat landscape appears to be advancing, notably the WormGPT service (generative AI malware as a service)[19] and the US Government’s public announcement that North Korean advanced persistent threats employ AI in malware development.[20] Advances in AI-enabled deepfakes are expected to enable stealthier, faster and more widespread social engineering attacks, for example highly tailored and persuasive spear-phishing.[21] While this study focused on defender agents in the civilian context, military discourse around automated and autonomous attack agents is more concerned with speed and the risk that systems may be compromised faster than a human could respond.[22] Our work has not identified adoption of attacker AI, but this remains a key risk warranting continued analysis.
- Significant movement in AI policy and legislation. The past year has seen major developments in AI policy and legislation. Notable examples include: the US Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence; US Defense Advanced Research Projects Agency’s AI Cyber Challenge; the UK-hosted 2023 AI Safety Summit;[23] the European Union’s AI Act;[24] and the US-China AI safety dialogue.[25] These represent advances towards international norms for AI, yet also reveal distinct models for controlling development and use of AI, which could create divergent development paths for autonomous agents in cyber defence. Some interviewees were not optimistic about governance at the global level,[26] and there was a view that national governments should do more to control adherence to authorised bounds on the actions of autonomous agents.[27] That said, cyber AI specific policy and guidelines have started to emerge such as the U.S. Cybersecurity & Infrastructure Security Agency’s Roadmap for AI.[28]
2. Autonomous Actions and Operational Error Risk
This Section presents key findings on the desirable level of autonomy and potential error risks for different actions along the MITRE D3FEND framework.
Research participants expressed a broad expectation of benefits from increased autonomy in cyber defence, particularly to enhance speed, reduce operator fatigue, overcome capacity issues and respond to greater volumes of attacks. Some participants expressed optimism for the development of capabilities that surpass current human team performance, offering an adaptive autonomous capability and providing increased protection against evolving threats.
Participants were often comfortable with a highly autonomous system being readily applied to tasks or activities seen as “low regret” – situations and actions that cause minimal impact on an organisation. Identifying “low regret” use cases was challenging, however, because there are risks across almost all MITRE D3FEND actions. Many participants also expressed concern about the prospect of introducing new, poorly understood technical vulnerabilities into an organisation through the adoption of autonomous agents, alongside other unknown risks.
Figure 2 summarises interview respondents’ views on actions that ought to be performed at a given level of autonomy. It presents a summary of the number of times interview participants highlighted autonomy levels for each MITRE D3FEND component.[29]
Figure 2. Interviewees' views on acceptable levels of autonomy for different D3FEND components
Most interviewees converged around Level 2 (Task Autonomy) and Level 3 (Conditional Autonomy) as the appropriate levels of autonomy for most civilian contexts, in order to maintain operator oversight and clear lines of liability,[30] and to ensure the validity of safety cases for deployment.[31] The rest of this section presents their rationale for these recommendations.
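To make the aggregation described in endnote [29] concrete, the minimal sketch below reproduces the stated scoring rule in Python. The participant responses shown are invented for illustration and do not reflect actual interview data.

```python
from collections import defaultdict

# Hypothetical interview responses: each participant may select one, several,
# all or no autonomy levels (1-4) per D3FEND component (see endnote [29]).
responses = {
    "participant_01": {"DETECT": [2, 3], "ISOLATE": [2], "EVICT": [1]},
    "participant_02": {"DETECT": [3], "ISOLATE": [2, 3], "EVICT": []},
}

def aggregate_scores(responses):
    """Divide each participant's 1.0 'vote' evenly across the levels they selected."""
    scores = defaultdict(lambda: defaultdict(float))
    for participant in responses.values():
        for component, levels in participant.items():
            if not levels:
                continue  # selecting no level contributes nothing
            weight = 1.0 / len(levels)  # e.g. 0.5 each for 2 selections, 0.25 each for 4
            for level in levels:
                scores[component][level] += weight
    return scores

print(dict(aggregate_scores(responses)["DETECT"]))  # {2: 0.5, 3: 1.5}
```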
2.1 D3FEND-specific risks
Interviewees listed several caveats and error risks that informed the suggested authorisation limits for different MITRE D3FEND tasks:
HARDEN: Participants expressed concern over autonomous systems applying seemingly sensible measures to harden a system but inadvertently causing system-wide faults or malfunctions, with several participants stating a need for human review.[32]
DETECT: Participants were broadly enthusiastic about the role of autonomous systems in detection because it was seen as an area where automation was already reliably bringing benefits, but this area is not without potential risks. Participants highlighted the risk of severe operational and organisational disruption caused by high volumes of false positive threat detections.[33] The accuracy and adaptability of AI-based detection systems in response to novel or rare attacks was another key concern.[34] Data poisoning or model injection attacks could also allow an adversary to inaccurately trigger an automated eviction behaviour, providing a new means for adversaries to use defenders’ own tools against them. At worst, failure to detect at scale could lead to catastrophic consequences, so not all respondents felt comfortable with fully autonomous agents conducting Detect tasks.
ISOLATE: Participants highlighted the risk of widespread operational disruption due to misguided actions leading to isolation of key accounts, users, data or processes – either by faulty planning or reasoning on the part of the agent, as well as inadequate guidance or controls on agent behaviour.[35]
EVICT: Participants raised the concern of catastrophic organisational damage resulting from large-scale, irreversible actions to terminate or destroy erroneously targeted resources. This was an area in which many stressed the role of expert reviewers and organisational decision-makers to make a final decision on any proposed course of action.[36]
RESTORE: An alarming operational error risk is the case where an autonomous system successfully manages to recover to a prior state but remains vulnerable to the same or a similar attack vector, so attackers can simply repeat their attack, resulting in a “self-inflicted denial of service.”[37] Other error risks include the inability of the autonomous agent to understand the contextual nuance of every situation (e.g. selecting the right backups to restore).[38]
DECEIVE: Participants highlighted a novel risk of fully autonomous deception capabilities in deceiving individuals within the organisation – not only adversaries.[39] Further, participants raised concerns over the potential for such systems to result in unmanageable cost pressures on an organisation to support a dynamic, autonomous deception capability.[40]
3. Bounding Autonomous Agents’ Actions and Decisions
As Section 2 indicates, there are certain risks involved with delegating every cyber defence action to autonomous agents. Determining the right level of autonomy for different defensive actions on a network, and establishing the threshold at which a system is sufficiently reliable to deploy, are significant policy and technical challenges. Based on the interview and workshop data, however, some indicative first principles could be applied in most civilian contexts.
3.1. Determining the right level of autonomy
The risks and benefits of introducing autonomous agents in different parts of network defence activities will be dynamic and context-specific to the sector, organisation and complexity of the network.[41] Determining the right level of autonomy for each action within a network will need to be contextualised through evidence-based risk prioritisation tailored to particular enterprises about their own operations, assets and data.[42] Large managed services providers already routinely develop tailored risk management analysis and cyber defence systems, but they also need to configure these structures for autonomous cyber defence.
Factors that could affect confidence in the level of autonomy granted to autonomous agents include the following (an illustrative sketch of how such factors might be combined into an autonomy cap follows this list):
- The balance between the speed, scale and sophistication of the threat vs. the capacity and context of the organisation. The cost-benefit calculus of the introduction of the autonomous agent will differ depending on the volume and scale of incidents.[43] Some enterprises may be faced with hundreds of attacks per day, meaning they may have an increased risk tolerance for errors, particularly if autonomous agents are executing ‘standard’ tasks.[44] Certain organisations are also likely to contend with more sophisticated threat actors compared to other enterprises.[45]
- If the action risks non-compliance with regulation and norms,[46] which could include national and international regulatory frameworks, codes or norms that protect international security and stability. If the integration of autonomous agents could create liability issues for operators, this will impact enterprises’ risk appetite and the level of autonomy they are willing to entrust to autonomous agents.
- The potential financial, human safety and reputational costs of the action. For example, if an autonomous agent’s decision inadvertently results in a power cut, an energy operator remains under a regulatory obligation to provide energy, so enterprises must strike a balance between the cost savings from introducing autonomous agents and the cost of fines or damage.[47] Where there is ample evidence that autonomous agents have been deployed with no significant detrimental consequences for human life, highly autonomous systems could be more conceivable.[48] A high degree of caution is needed in use cases that could harm human life (e.g. in traffic control and healthcare systems), and here interviewees believed the appropriate level of autonomy is around Level 2 (Task Autonomy) and Level 3 (Conditional Autonomy).[49] If autonomous agents are being integrated into critical national infrastructure, the appropriate level of autonomy will be more conservative,[50] since blocking connections or services can lead to risk to life, social damage, fines and reputational impact.[51] Although several interviewees described these as absolute red lines, there are ethical questions around intentionally choosing to restrict the performance of an effective, fast and safe autonomous cyber defence system simply to retain humans in the loop.
- The balance between learning about threats vs. the necessity of halting attacks. In some contexts, the operator may prioritise slowing the attacker down to learn about the attacker over rapid intrusion response and eviction. If the organisation is using autonomous agents for mainly deception (e.g. honeynets and canaries) and to learn more about the threat, while high-risk network defence activities are primarily led by human operators, then senior responsible owners may have more confidence in deploying highly autonomous agents. In some cases, autonomous agents may even be better placed to come up with better deception solutions than human operators.[52]
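As a purely illustrative sketch of how an organisation-specific, risk-based analysis might combine factors such as these into a conservative cap on the autonomy level for a given class of action, the following Python fragment uses assumed factor names and thresholds; it is not a method prescribed by this study.

```python
from dataclasses import dataclass

# Autonomy levels referenced in the paper's human-machine teaming scale.
TASK_AUTONOMY = 2         # operator-initiated, pre-set tasks executed independently
CONDITIONAL_AUTONOMY = 3  # operator-selected actions executed under supervision

@dataclass
class DeploymentContext:
    # All fields and thresholds below are illustrative assumptions.
    daily_incident_volume: int   # speed/scale of the threat faced
    safety_critical: bool        # e.g. CNI, healthcare, traffic control
    regulatory_exposure: bool    # action risks non-compliance with regulation or norms
    reversible_action: bool      # can the defensive action be rolled back?

def recommended_autonomy_cap(ctx: DeploymentContext) -> int:
    """Return a conservative upper bound on autonomy for one class of action."""
    cap = CONDITIONAL_AUTONOMY
    # Safety-critical or regulated contexts pull the cap down (Section 3.1).
    if ctx.safety_critical or ctx.regulatory_exposure:
        cap = min(cap, TASK_AUTONOMY)
    # Irreversible actions in safety-critical settings warrant human review.
    if ctx.safety_critical and not ctx.reversible_action:
        cap = min(cap, TASK_AUTONOMY)
    # High incident volume may justify tolerating errors for 'standard' tasks,
    # but does not raise the cap above the supervised levels discussed in the paper.
    return cap

print(recommended_autonomy_cap(DeploymentContext(500, True, False, False)))  # 2
```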
3.2. Context-awareness thresholds
One important boundary that would need to be established before deploying autonomous agents for cyber defence is the threshold for context-aware judgement. Most operators would find it difficult, outside operational settings, to describe their decision-making calculus and the process by which they weigh the risks and benefits of executing defensive actions. This, in turn, makes it difficult to imbue human values, balanced judgement and institutional knowledge into autonomous agents when these factors cannot be represented or computationally modelled. To deploy autonomous agents safely, operators require sufficient confidence that actions which may result in harmful or uncertain error risks will be flagged to human operators in ways they can understand. For after-action reviews, operators require assurances that autonomous agents will log relevant actions, decisions and events to support compliance cases. Without this, operators cannot know whether the system is safe to deploy and whether they will be able to monitor and direct the system at critical points.
Cost benefit judgements will differ between small businesses, large enterprises, government organisations and the public. However, rarely do the perspectives of all these stakeholder groups get incorporated in the thresholds programmed into autonomous cyber defence systems. The decision for deployment currently rests with the organisations that manufacture these systems.
Some interviewees suggested it would be important to always have a human in the loop where real-time judgements on legality and proportionality of possible actions are required so a human is always responsible and liable,[53] but this could significantly impede any potential speed advantage. It is also worth noting that there are different definitions of proportionality in international law and law enforcement, which would need to be specified for particular use cases.
An alternative suggestion is to map operators’ decision-making patterns across many decision-making contexts and then capture these patterns systematically;[54] however, an enormous volume of standardised training data would be needed to make this endeavour worthwhile. Mapping operators’ expectations, actions and judgements in real-world scenarios to develop evidence-based logs of decision-making could inform better approximations of assured and proportionate autonomous decisions.[55] With more simulation data on operator decision-making, we could learn what human experts usually decide in different contexts.
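A minimal sketch of what such systematic capture could look like is shown below; the record fields are hypothetical and intended only to indicate the kind of structured, evidence-based logging that could support later approximation of operator decision patterns.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

@dataclass
class OperatorDecisionRecord:
    """One logged operator decision; all field names are illustrative assumptions."""
    timestamp: str
    d3fend_category: str      # e.g. ISOLATE, EVICT
    alert_summary: str        # the context the operator saw
    options_considered: list  # candidate actions presented
    action_taken: str         # what the operator actually did
    rationale: str            # free-text justification for after-action review
    outcome: str              # observed effect, filled in retrospectively

record = OperatorDecisionRecord(
    timestamp=datetime.now(timezone.utc).isoformat(),
    d3fend_category="ISOLATE",
    alert_summary="Anomalous outbound traffic from finance workstation",
    options_considered=["quarantine host", "block destination IP", "monitor only"],
    action_taken="quarantine host",
    rationale="Sensitive data at risk; quarantine is reversible",
    outcome="Confirmed true positive; no business disruption",
)

# Append-only JSON lines make the log easy to audit and to mine for decision patterns.
print(json.dumps(asdict(record)))
```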
3.3. Reliability thresholds
Reliability was highlighted as a key prerequisite for the adoption of autonomous cyber defence systems by many participants, and was emphasised by those with significant operational experience. The degree to which stakeholders can trust autonomous systems based on demonstrable performance and reliability was highlighted as a key area for further analysis. Existing cyber defence systems are not expected to achieve perfect reliability or eliminate all possible risks before they are deployed; therefore, a comparable way to assess autonomous capabilities is a useful enabler for authorisation. While it may not be feasible to eliminate all possibility of harm, it is possible to adopt a principle of reducing the risk to as low as reasonably practical[56] by conducting specific risk assessments.
Linked to the need for a practical understanding of reliability, several participants highlighted the need for explainability[57] of autonomous agents, but we may also need to contend with the possibility that autonomous agents will show themselves to be reliable without ever being explainable. We do not expect all human operators’ decisions to be explainable, yet there is consensus that the need for cybersecurity requires us to deploy systems that are not perfect. There is a point where the ‘net benefit’ is great enough that deployment of autonomous agents is warranted.
When asked where this threshold for a ‘net benefit’ should lie, respondents raised three main factors:
- Better understanding of edge cases such as stealthy intrusions that human operators may miss,[58]
- Significant offloading of burden from humans given considerable shortages in skilled cyber personnel,[59]
- Measurable improvement in cybersecurity outcomes and reliability, since even incremental improvement is desirable.[60]
There is a need to develop objective criteria for success and failure and to establish whether false positives or false negatives are the graver error in different contexts.[61] Enterprises will need to identify their system-specific costs and benefits of deployment and establish the right balance between potential cost savings of having autonomous elements vs. the possible human or financial costs.[62]
Finally, concerns around the lack of a means to specify, monitor and validate that a system is behaving in accordance with desired policies and controls must be addressed. This lack of control and compliance assurance is likely to hinder the adoption of autonomous systems – even those with demonstrated performance and safety features.
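One hedged illustration of such a permissions specification is a declarative, machine-checkable policy that maps D3FEND categories to maximum autonomy levels and explicit constraints. The format, constraint names and levels below are assumptions for illustration rather than an existing standard.

```python
# Illustrative permissions specification: category -> maximum autonomy level and
# constraints. Levels follow the paper's scale (2 = Task, 3 = Conditional Autonomy).
PERMISSIONS = {
    "DETECT":  {"max_level": 3, "constraints": []},
    "HARDEN":  {"max_level": 2, "constraints": ["human_review_before_change"]},
    "ISOLATE": {"max_level": 2, "constraints": ["exclude_critical_accounts"]},
    "EVICT":   {"max_level": 2, "constraints": ["human_approval_required"]},
    "RESTORE": {"max_level": 2, "constraints": ["verify_patch_before_restore"]},
    "DECEIVE": {"max_level": 3, "constraints": ["internal_users_out_of_scope"]},
}

def action_permitted(category: str, requested_level: int, satisfied: set) -> bool:
    """Check a proposed agent action against the organisation's specification."""
    policy = PERMISSIONS.get(category)
    if policy is None:
        return False  # undeclared categories are denied by default
    within_level = requested_level <= policy["max_level"]
    constraints_met = all(c in satisfied for c in policy["constraints"])
    return within_level and constraints_met

# An EVICT action at Task Autonomy with prior human approval would be permitted.
print(action_permitted("EVICT", 2, {"human_approval_required"}))  # True
```

A specification of this kind would also provide the reference point for the monitoring and validation the paragraph above calls for, since every proposed action can be checked and logged against it.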
In conclusion, measuring the costs and benefits of deployment is more difficult than it may first appear.[63] While the costs of misconfiguration are high and observable, the benefits of delegating some cyber defence tasks to autonomous agents are difficult to measure.[64]
4. Human-Machine Communication
An important aspect of designing effective human-machine teaming systems in cyber defence is developing an understanding of how to construct agents and human-machine interfaces to optimise task allocation safely and efficiently.
Speed is critical to various actions and contexts within cyber defence.[65] Simultaneously, interviewees discussed the need for observability, explainability, understandability and auditability of the autonomous agents’ decisions and actions,[66] emphasising the need for human control.
These requirements can pull in opposite directions, but the research also revealed design requirements that could support the optimisation of human-machine interfaces for cyber defence to balance both speed and observability. These requirements are presented in Figure 3.
Figure 3. Interface design requirements for human-machine teaming in cyber defence
Source: Interview data.
Interfaces should provide live, relevant information to the operator to get as close to machine speed as possible when required, then relax back to prioritising observability and auditability when speed is not required.[67]
Importantly, as many low-consequence actions as possible ought to be offloaded to autonomous agents. Given that alerts will likely arrive at intervals in these speculative systems, there is also a need to ensure that operators who are not overseeing all activity on the network are able to contextualise alerts on medium- or high-risk decisions as quickly as possible.[68]
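The sketch below illustrates one way consequence-based routing could work in such an interface, with low-consequence actions offloaded to the agent and higher-consequence actions escalated together with the context an operator needs to catch up quickly. All field names and the consequence categories are assumptions for illustration.

```python
# Illustrative routing of actions by consequence: low-consequence actions are
# handled by the agent; higher-consequence ones are escalated with a briefing so
# an operator who has not been watching the network can decide quickly.
def route_action(consequence: str, action: str, context: dict) -> dict:
    if consequence == "low":
        return {"handler": "agent", "action": action, "log_only": True}
    # Medium/high consequence: package the context the operator needs to catch up.
    return {
        "handler": "operator",
        "action": action,
        "briefing": {
            "what_happened": context.get("summary"),
            "why_flagged": context.get("detection_rationale"),
            "proposed_action": action,
            "reversible": context.get("reversible", False),
            "respond_by": context.get("respond_by"),
        },
    }

decision = route_action(
    "high",
    "terminate suspicious process on domain controller",
    {"summary": "Possible credential dumping",
     "detection_rationale": "LSASS access pattern",
     "reversible": False, "respond_by": "5 minutes"},
)
print(decision["handler"])  # operator
```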
At the same time, some challenges to implementation can only be resolved once we have a better idea of how technical advances evolve. Competing AI agents may propose different solutions.[69] Alerts that appear to be high-consequence or existential threats may be false alarms.[70] As threats become more sophisticated, the tracks left by an attacker are likely to be increasingly subtle and may be masked or removed.[71] There is also a lack of definitions and a common lexicon to assure the communicating entity that the other entity (human or agent) has understood its communication.[72] Even with a common lexicon, it is possible for autonomous agents to hallucinate or lie. Because LLMs are trained in a static way, they can explain themselves but, unlike even the most incompetent analyst, they cannot learn anything new. Moreover, the extent to which prompt engineering can result in agents learning the wrong instructions, guidance and policies needs careful evaluation. On the use of LLMs for generating summaries of the network situation, hallucinations are a key concern, with one respondent reporting that “errors could be so subtle that you need a more senior analyst to check the report when you could have just had a more junior analyst do the job in the first place.”[73]
In some cases, there may be a need to slow down machine-speed actions to human-speed, but these cases will only be known once an organisation-level risk assessment activity is conducted.[74] Placing barriers, limits, obstacles or believable deception in a network to slow down an attacker could also support human oversight of the network.[75]
Human oversight also involves ensuring that it is possible to audit and retrain autonomous agents to mitigate concept drift (unforeseen changes in the relationship between targets and realised outputs) and the associated risk of the system becoming less effective over time.
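A minimal sketch of one way such drift monitoring could be triggered is shown below, using an assumed precision metric and threshold; operational systems would use richer statistical tests and additional signals.

```python
# Illustrative drift check: compare a recent window of detection precision against a
# baseline established at deployment; a sustained drop triggers audit and retraining.
BASELINE_PRECISION = 0.90   # assumed value recorded when the agent was approved
DRIFT_TOLERANCE = 0.10      # assumed acceptable relative degradation

def drift_detected(recent_true_positives: int, recent_false_positives: int) -> bool:
    total = recent_true_positives + recent_false_positives
    if total == 0:
        return False  # nothing to evaluate in this window
    recent_precision = recent_true_positives / total
    return recent_precision < BASELINE_PRECISION * (1 - DRIFT_TOLERANCE)

# Example window: 70 confirmed detections, 30 false alarms -> precision 0.70, flag drift.
if drift_detected(70, 30):
    print("Precision degraded beyond tolerance: schedule audit and retraining")
```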
5. Conclusions and Recommendations
This study’s exploration of authorisation limits for autonomous agents revealed heterogeneous views on where the authorised bounds for autonomous agents should lie, in large part due to unknowns about how RL and LLMs for cyber defence will progress. Despite this, study respondents converged around appetite for more Task Autonomy and Conditional Autonomy for autonomous agents in cyber defence with several caveats around operational error risks.
This work generated the following key conclusions:
- Organisation-specific, context-informed and risk-based analysis is required to determine tailored approaches to the adoption and operation of autonomous agents.
- The development of some permissions specification that allows organisations to clearly designate controls and constraints for autonomous capabilities within their systems is a vital enabler.
- The most pragmatic way to find out if an AI-enabled cyber defence system brings a net-benefit, despite discomfort with the risks, is through comparing the cybersecurity outcomes of systems with autonomous agents to human-only cyber defence.
Further research and development is needed to advance progress in the field of autonomous cyber defence. This work should focus on the following areas.
5.1 Conduct scenario-specific experiments comparing human-machine teams and human teams in cyber defence
Simulations could improve understanding of the costs and benefits of deploying autonomous agents. Two identical simulation environments could be set up: one defended by human teams working with autonomous agents, and the other by humans alone. Both could be attacked with the same emulation plans and the results compared, helping to break the deadlock in the debate on the risks of introducing vs. not introducing autonomous agents in network security. As part of these experiments, the number of false positives and false negatives of a Tier 1 human analyst could be compared with those of an autonomous agent.
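As an indication of how such a comparison might be scored, the sketch below computes simple detection metrics for two hypothetical runs; the counts are invented and the choice of metrics is an assumption rather than a recommendation from this study.

```python
# Illustrative comparison of detection outcomes from two identical simulation runs:
# one defended by a human-machine team, one human-only. All counts are invented.
def detection_metrics(tp: int, fp: int, fn: int) -> dict:
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"precision": round(precision, 3), "recall": round(recall, 3),
            "false_positives": fp, "false_negatives": fn}

human_only = detection_metrics(tp=40, fp=25, fn=15)  # hypothetical Tier 1 analyst results
with_agent = detection_metrics(tp=50, fp=35, fn=5)   # hypothetical human-machine team results

# A 'net benefit' judgement then weighs which error type is graver in this context.
print("human-only:", human_only)
print("with agent:", with_agent)
```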
Both CNI and less safety-critical sectors, as well as threats that require varying speeds of response, should be explored. Although generating data on whether autonomous agents deliver a net benefit is the main objective of such a simulation, it will also enable the development of more accurate approximations of how devolved actions in human-machine teams may affect a network. Ideally, the simulation will also illustrate which sets of actions have high-consequence or high-regret impacts.
Before a cyber defence human-machine team can be deployed, we need better frameworks to measure whether systems bring a net benefit to cybersecurity outcomes. The combined findings of CETaS’ work on autonomous cyber defence identified high-level requirements for autonomous cyber defence (ACD) agents, including reliability, proportionality, adaptability, auditability, directability, observability, security, transferability and context-awareness.[76] This Phase 2 study also identified indicators that could illustrate that the introduction of autonomous agents in cyber defence delivers a net benefit (see Section 3). The next necessary step is to use data from the experiments described above to identify specific indicators that would allow developers to demonstrate evidence of success against these requirements.
5.2. Construct a cyber defence lexicon for human-machine control and communication
The formulation of a specification, taxonomy, framework or model – both human and machine readable – for the control, constraint and direction of autonomous systems remains an open area of research. All the design requirements for human-machine communication outlined in Chapter 4 require shared definitions and language between the operator and autonomous agent, but this common language does not yet exist. Over time, these shared vocabularies and confirmations will contribute towards increasing confidence in the deployment of autonomous agents.
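To illustrate what a shared, machine-checkable lexicon could involve, the sketch below defines a hypothetical message schema constrained to agreed terms; the intents, fields and validation rule are assumptions for illustration only, not a proposed standard.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative shared vocabulary: both human and agent messages draw on the same
# controlled terms, so each side can confirm the other's meaning.
INTENTS = {"observe", "recommend", "request_approval", "report_outcome"}
D3FEND_CATEGORIES = {"HARDEN", "DETECT", "ISOLATE", "EVICT", "RESTORE", "DECEIVE"}

@dataclass
class AgentMessage:
    intent: str                      # drawn from INTENTS
    category: str                    # drawn from D3FEND_CATEGORIES
    proposed_action: str
    confidence: float                # agent's own estimate, 0-1
    rationale: str                   # evidence supporting the proposal
    requires_acknowledgement: bool = True
    acknowledged_by: Optional[str] = None  # filled when the operator confirms receipt

    def validate(self) -> None:
        """Reject messages that fall outside the shared lexicon."""
        if self.intent not in INTENTS or self.category not in D3FEND_CATEGORIES:
            raise ValueError("message uses terms outside the agreed lexicon")

msg = AgentMessage("request_approval", "EVICT", "terminate process 4312",
                   confidence=0.82, rationale="matches known C2 beacon behaviour")
msg.validate()  # passes; a term outside the lexicon would be rejected
```

The acknowledgement field indicates how such a schema could also carry the confirmations the paragraph above describes, so each side knows its message has been received and understood.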
5.3. Research into AI-enabled deception and automated cyber threat intelligence generation
This study revealed the potential for new AI-enabled deception approaches integrated with automated cyber threat intelligence generation. This is a concept where autonomous agents manage AI-generated deception assets and automated techniques to enhance threat intelligence gathering, going beyond the focused network defence actions as defined by the MITRE D3FEND framework. This would need to be supported by further technical and policy research to understand the opportunities, risks and necessary policy guardrails.
References
[1] HM Government, The Near-Term Impact of AI on the Cyber Threat (National Cyber Security Centre: 2024), 2-3, https://www.ncsc.gov.uk/report/impact-of-ai-on-cyber-threat.
[2] “Log4j vulnerability – what everyone needs to know,” NCSC, 20 December 2021, https://www.ncsc.gov.uk/information/log4j-vulnerability-what-everyone-needs-to-know.
[3] HM Government, Securing Critical National Infrastructure: An Introduction to UK Capability (UK DE&S Exports and Department for Business and Trade: 2023), 7, https://assets.publishing.service.gov.uk/media/650d96112f404b000dc3d7c7/securing_critical_national_infrastructure_an_introduction_to_uk_capability.pdf; “People’s Republic of China State-Sponsored Cyber Actor Living Off the Land to Evade Detection,” Cybersecurity & Infrastructure Security Agency, 24 May 2023, https://www.cisa.gov/news-events/cybersecurity-advisories/aa23-144a.
[4] HM Government, Securing Critical National Infrastructure: An Introduction to UK Capability (UK DE&S Exports and Department for Business and Trade: 2023), 7, https://assets.publishing.service.gov.uk/media/650d96112f404b000dc3d7c7/securing_critical_national_infrastructure_an_introduction_to_uk_capability.pdf.
[5] HM Government, Lessons Learned Review of the WannaCry Ransomware Cyber Attack (Department of Health & Social Care, NHS Improvement, NHS England: 2018), 12, https://www.england.nhs.uk/wp-content/uploads/2018/02/lessons-learned-review-wannacry-ransomware-cyber-attack-cio-review.pdf.
[6] “D3FEND: A Knowledge Graph of Cybersecurity Countermeasures,” MITRE Corporation, https://d3fend.mitre.org.
[7] “Cybersecurity Framework,” National Institute of Standards and Technology, https://www.nist.gov/cyberframework.
[8] “NCSC CAF guidance,” National Cyber Security Centre, https://www.ncsc.gov.uk/collection/caf/cyber-assessment-framework.
[9] The recently announced, and ongoing, DARPA AIxCC challenge is focused on these security research efforts. See: “DARPA AI Cyber Challenge Aims to Secure Nation’s Most Critical Software,” Defense Advanced Research Projects Agency, accessed 8 September 2023, https://www.darpa.mil/news-events/2023-08-09.
[10] HM Government, Defence AI Strategy (Ministry of Defence: June 2022), https://www.gov.uk/government/publications/defence-artificial-intelligence-strategy.
[11] “The state of AI in 2023: Generative AI’s breakout year,” McKinsey & Company, https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai-in-2023-generative-ais-breakout-year.
[12] Andrew Lohn, Anna Knack, Ant Burke and Krystal Jackson, “Autonomous Cyber Defence: A roadmap from lab to ops,” CETaS Research Reports (June 2023): 13.
[13] Sarah Mercer, “Welcome to Willowbrook: The simulated society built by generative agents,” CETaS Expert Analysis, https://cetas.turing.ac.uk/publications/welcome-willowbrook-simulated-society-built-generative-agents (December 2023): 2; Zhiheng Xi et al., “The Rise and Potential of Large Language Model Based Agents: A Survey,” ArXiV (September 2023): 1-86.
[14] Stavros Ntalampiras et al., Artificial Intelligence and Cybersecurity Research (European Union Agency for Cybersecurity: 2023), 2, https://www.enisa.europa.eu/publications/artificial-intelligence-and-cybersecurity-research; Sanyam Vyas et al., “Automated Cyber Defence: A Review,” ACM Meas. Anal. Comput. Syst 4, no. 111 (February 2023): 1-32; Maria Rigaki et al., “Out of the Cage: How Stochastic Parrots Win in Cyber Security Environments,” in Proceedings of ACM Conference (August 2023): 1-16; Lei Wang et al. “A Survey on Large Language Model Based Autonomous Agents,” ArXiv (September 2023): 1-35; Yu Tian et al. “Evil Geniuses: Delving into the Safety of LLM-based Agents,” ArXiv (February 2024): 1-11. Kumar Shashwat et al., “A Preliminary Study on Using Large Language Models in Software Pentesting,” ArXiv (January 2024): 1-7; Andrei Kucharavy et al., “Fundamentals of Generative Large Language Models and Perspectives in Cyber Defense,” ArXiv (March 2023): 1-50; Alec Wilson et al., “Multi-Agent Reinforcement Learning for Maritime Operational Technology Cyber Security,” Conference on Applied Machine Learning for Information Security (October 2023): 1-11; Jakob Nyberg and Pontus Johnson, “Training Automated Defense Strategies Using Graph-Based Cyber Attack Simulations,” Workshop on SOC Operations and Construction 2023 (April 2023): 1-8; Luke Borchjes, Clement Nyirenda and Louise Leenen, “Adversarial Deep Reinforcement Learning for Cyber Security in Software Defined Networks,” ArXiv (August 2023): 1-6; Elizabeth Bates, Vasilios Mavroudis and Chris Hicks, “Reward Shaping for Happier Autonomous Cyber Security Agents,” ArXiv (October 2023): 1-12; Thomas Kunz et al., “A Multiagent CyberBattleSim for RL Cyber Operation Agents,” ArXiv (April 2023): 1-7; Marta Stroppa, “Legal and ethical implications of autonomous cyber capabilities: a call for retaining human control in cyberspace,” Ethics and Information Technology 25, no.7 (February 2023): 1-5; Myles Foley et al., “Inroads into Autonomous Network Defence Using Explained Reinforcement Learning,” in Proceedings of Conference on Applied Machine Learning in Information Security October 20-21 2022 (June 2023): 1-21; Hicks et al., “Canaries and Whistles: Resilient Drone Communication Networks with (or without) Deep Reinforcement Learning,” in Proceedings of the 16th ACM Workshop on Artificial Intelligence and Security (November 2023): 91-101; Ashutosh Dutta et al., “Deep Reinforcement Learning for Cyber System Defense Under Dynamic Adversarial Uncertainties,” ArXiv (February 2023): 1-9.
[15] Author interview with academic participant, 9 November 2023; Author interview with industry participant, 31 October 2023; Author interview with industry participant, 24 November 2023; Author interview with industry expert, 8 December 2023.
[16] Author interview with academic participant, 26 October 2023; Author interview with industry participant, 1 November 2023; Author interview with industry participant, 31 October 2023; Author interview (2) with academic participants, 9 November 2023; Author interview with academic participant, 15 November 2023; Author interview with academic participant, 8 January 2024; Author interview with industry participant, 17 November 2023; Author interview with international experts, 4 December 2023; Author interview with industry participant, 8 December 2023.
[17] Year-by-year comparisons of published papers show a shift in 2023 and so far in 2024 towards LLM-based research: 2024: 301 RL papers vs 458 LLM papers; 2023: 1590 RL papers vs 1620 LLM papers; 2022: 1140 RL papers vs 603 LLM papers; 2021: 791 RL papers vs 480 LLM papers.
[18] Author interview (2) with academic participants, 9 November 2023; Author interview with industry participant, 31 October 2023; Author interview with academic participant, 15 November 2023; Author interview with academic expert, 7 December 2023.
[19] Polra Victor Falade, “Decoding the Threat Landscape: ChatGPT, FraudGPT and WormGPT in Social Engineering Attacks,” International Journal of Scientific Research in Computer Science, Engineering and Information Technology, (October 2023); Author interview with international experts, 8 November 2023.
[20] “N. Korea tries to use artificial intelligence to write malicious software: U.S. official,” OECD.AI, https://oecd.ai/en/incidents/46156.
[21] Stavros Ntalampiras et al., Artificial Intelligence and Cybersecurity Research (European Union Agency for Cybersecurity: 2023), 24, https://www.enisa.europa.eu/publications/artificial-intelligence-and-cybersecurity-research; Author interview with industry participant, 21 October 2023; Author interview with industry participant, 31 October 2023.
[22] Peter Margulies, “Autonomous Cyber Capabilities Below and Above the Use of Force Threshold: Balancing Proportionality and the Need for Speed,” International Law Studies 394, (2020); Alexander Kott et al., “Autonomous Intelligent Cyber-defense Agent (AICA) Reference Architecture Release 2.0,” Army Research Laboratory, (September 2019).
[23] “Chair’s Summary of the AI Safety Summit 2023, Bletchley Park,” HM Government, 2 November 2023, https://www.gov.uk/government/publications/ai-safety-summit-2023-chairs-statement-2-november/chairs-summary-of-the-ai-safety-summit-2023-bletchley-park.
[24] “EU AI Act: First Regulation on Artificial Intelligence,” European Parliament, accessed 22 February 2024, https://www.europarl.europa.eu/topics/en/article/20230601STO93804/eu-ai-act-first-regulation-on-artificial-intelligence.
[25] Madhumita Murgia, “White House science chief signals US-China cooperation on AI safety,” Financial Times, 8 June 2023, https://www.ft.com/content/94b9878b-9412-4dbc-83ba-aac2baadafd9.
[26] Author interview (2) with academic participant, 26 October 2023.
[27] Author interview with academic participant, 15 November 2023.
[28] US Federal Government, Roadmap for AI (Cybersecurity & Infrastructure Security Agency: 2023), 2, https://www.cisa.gov/resources-tools/resources/roadmap-ai.
[29] The score is calculated from data of participant responses to acceptable autonomy levels for the D3FEND components. Participants were free to choose 1, many, all or no levels of autonomy for each D3FEND component. Where a participant selected a single autonomy level (e.g. Level 2 for ISOLATE) a score of 1.0 is recorded. Where a participant selected more than one autonomy level, this 1.0 score is divided evenly amongst the selected levels (i.e. 0.5 each for 2 selections, 0.33 each for 3 selections and 0.25 each if all four autonomy levels were selected).
[30] Author interview with academic participant, 2 November 2023; Author interview with industry participant, 31 October 2023; Author interview with industry participant, 24 November 2023; Author interview with academic expert, 7 December 2023.
[31] Author interview with academic expert, 7 December 2023; Author interview (2) with academic expert, 7 December 2023; Author interview with industry expert, 8 December 2023.
[32] Author interview (2) with academic participant, 26 October 2023; Author interview with industry participant, 1 November 2023; Author interview with academic participant, 2 November 2023; Author interview with academic participant, 6 November 2023; Author interview with academic participant, 9 November 2023; Author interview with oversight body, 14 November 2023; Author interview with industry participant, 8 December 2023.
[33] Author interview with academic participant, 26 October 2023; Author interview (2) with academic participant, 26 October 2023.
[34] Author interview (2) with academic participant, 26 October 2023; Author interview with industry participant, 21 October 2023; Author interview with industry expert, 8 December 2023.
[35] Author interview (2) with academic participant, 26 October 2023; Author interview with academic participant, 9 November 2023; Author interview with industry participant, 17 November 2023.
[36] Author interview with industry participant, 21 October 2023; Author interview (2) with academic expert, 7 December 2023; Author interview with industry participant, 8 December 2023.
[37] Author interview with industry participant, 17 November 2023; Author interview with industry participant, 24 November 2023; Author interview with international expert, 27 November 2023.
[38] Author interview with academic expert, 7 December 2023.
[39] Author interview with oversight body, 14 November 2023.
[40] Author interview (2) with academic participant, 26 October 2023; Author interview with industry participant, 1 November 2023; Author interview with academic participant, 6 November 2023; Author interview with oversight body, 14 November 2023.
[41] Author interview with industry participant, 21 October 2023; Author interview with government stakeholders, 31 October 2023; Author interview with academic participant, 9 November 2023; Author interview (2) with academic participants, 9 November 2023; Author interview with academic participant, 15 November 2023; Author interview with international experts, 20 November 2023; Author interview with industry participant, 24 November 2023; Author interview with oversight body, 14 November 2023.
[42] Author interview with government stakeholders, 31 October 2023; Author interview (2) with academic participants, 9 November 2023; Author interview with legal expert, 22 January 2024.
[43] Author interview with academic participant, 6 November 2023; Author interview with international experts, 4 December 2023.
[44] Author interview with industry participant, 21 October 2023.
[45] Author interview with industry participant, 21 October 2023.
[46] Author interview with academic participant, 6 November 2023; Author interview with legal expert, 22 January 2024.
[47] Author interview with academic participant, 6 November 2023; Author interview with international experts, 8 November 2023.
[48] Author interview with international experts, 8 November 2023; Author interview with international experts, 4 December 2023.
[49] Author interview with international experts, 4 December 2023; Author interview (2) with academic expert, 7 December 2023; Author interview with academic participant, 2 November 2023; Author interview with industry participant, 31 October 2023; Author interview with oversight body, 14 November 2023; Author interview with industry participant, 17 November 2023.
[50] Author interview with international experts, 20 November 2023; Author interview with oversight body, 14 November 2023; Author interview with international experts, 4 December 2023.
[51] Author interview with academic participant, 26 October 2023.
[52] Author interview with industry participant, 24 November 2023.
[53] Author interview with legal expert, 22 January 2024; Author interview with academic participant, 8 January 2024; Author interview with international experts, 4 December 2023; Author interview with academic expert, 7 December 2023; Author interview (2) with academic expert, 7 December 2023.
[54] Author interview with academic participant, 6 November 2023.
[55] CETaS workshop, 7 February 2024.
[56] Mariarosaria Taddeo et al. “Artificial Intelligence for National Security: The Predictability Problem,” CETaS Research Reports (September 2022): 51, https://cetas.turing.ac.uk/publications/artificial-intelligence-national-security-predictability-problem.
[57] Author interview with international experts, 20 November 2023; Author interview with industry participant, 24 November 2023; Author interview with academic participant, 26 October 2023; Author interview (2) with academic participant, 26 October 2023; Author interview with oversight body, 14 November 2023; Author interview with academic participant, 15 November 2023; Author interview with industry participant, 17 November 2023.
[58] CETaS workshop, 7 February 2024.
[59] CETaS workshop, 7 February 2024; Author interview with academic participant, 6 November 2023.
[60] CETaS workshop, 7 February 2024; Author interview with government stakeholders, 31 October 2023; Author interview with industry participant, 17 November 2023; Author interview with legal expert, 22 January 2024; Author interview with international experts, 8 November 2023; Author interview with industry participant, 8 December 2023.
[61] CETaS workshop, 7 February 2024.
[62] CETaS workshop, 7 February 2024; Author interview with academic participant, 6 November 2023; Author interview with industry participants, 30 November 2023.
[63] Robert A. Bridges et al., “Testing SOAR Tools in Use,” ArXiv (February 2023): 1-46, https://arxiv.org/abs/2208.06075; Megan Nyre-Yu et al., “Considerations for Deploying xAI Tools in the Wild: Lessons Learned from xAI Deployment in a Cybersecurity Operations Setting,” Conference: proposed for presentation at the ACM SIG Knowledge Discovery and Data Mining Workshop on Responsible AI (2021): 1-5, https://www.osti.gov/biblio/1869535.
[64] CETaS workshop, 7 February 2024; Author interview with academic participant, 9 November 2023; Author interview with industry participants, 30 November 2023; Author interview with academic participant, 2 November 2023.
[65] Author interview with academic participant, 6 November 2023.
[66] Author interview with international experts, 20 November 2023; Author interview with industry participant, 24 November 2023; Author interview with academic participant, 26 October 2023; Author interview (2) with academic participant, 26 October 2023; Author interview with oversight body, 14 November 2023; Author interview with academic participant, 15 November 2023; Author interview with industry participant, 17 November 2023.
[67] CETaS workshop, 7 February 2024.
[68] Author interview with industry participant, 24 November 2023.
[69] CETaS workshop, 7 February 2024.
[70] CETaS workshop, 7 February 2024; Johannes Kattan, “Extinction Risks and Resilience: A Perspective on Existential Risks Research with Nuclear War as an Exemplary Threat,” Intergenerational Justice Review 8, vol. 1, 4-12, https://www.ssoar.info/ssoar/handle/document/86393.
[71] CETaS workshop, 7 February 2024.
[72] Author interview with academic participant, 6 November 2023; Author interview with international experts, 4 December 2023.
[73] Author interview with academic participant, 6 November 2023; CETaS workshop, 7 February 2024.
[74] CETaS workshop, 7 February 2024.
[75] CETaS workshop, 7 February 2024.
[76] Andrew Lohn, Anna Knack, Ant Burke and Krystal Jackson, “Autonomous Cyber Defence: A roadmap from lab to ops,” CETaS Research Reports (June 2023): 10.
Authors
Ant Burke
Visiting Fellow
Citation information
Anna Knack and Ant Burke, "Autonomous Cyber Defence: Authorised bounds for autonomous agents," CETaS Briefing Papers (May 2024).