
Artificial Intelligence in Defense: Ethical Dynamics, Strategic Challenges, and Future Perspectives

Andrea Viliotti

“JSP 936 V1.1 Dependable Artificial Intelligence (AI) in Defence Part 1: Directive” is the title of the most recent UK Ministry of Defence directive on the subject, developed by Alison Stevenson (Director General Delivery & Strategy) together with the Defence AI and Autonomy Unit (DAU) and the Defence AI Centre (DAIC). The document focuses on the implementation of Artificial Intelligence in the military sphere, aiming for the safe and responsible use of innovative technologies. Its main objective is to provide clear directives on how to develop innovative algorithms and models, ensuring transparency, regulatory compliance, and ethical standards.


Artificial Intelligence in Defense: Opportunities and Responsibilities

The Artificial Intelligence in Defense directive presents AI as a transformative force, impacting every facet of modern military operations. The theoretical foundations of the document suggest that adopting AI in different Defense segments—from logistics to decision-making support in complex operational environments—can increase both effectiveness and speed of action. At the same time, it is explicitly highlighted that the widespread diffusion of the technology must be balanced with a high level of control and accountability, in order to protect not only military personnel but also civilians and the integrity of the systems themselves.


The importance of a strategic vision for AI also stems from experiences gained in recent years. The evolution of AI clearly shows the speed of development achieved by Machine Learning and Deep Learning algorithms, especially in areas such as computer vision and natural language processing. On the other hand, Defense has realized that the possibilities offered by AI are not limited to computing power but extend to the entire international security scenario, considering the potential vulnerabilities introduced by targeted cyberattack techniques, such as data poisoning or manipulation of models trained on unreliable datasets.


Precisely for this reason, the concept of the Operating Design Domain (ODD) in the Artificial Intelligence in Defense directive outlines specific requirements for safe AI deployment. Defining the scope of use for an algorithm or model is not merely a technical exercise; it becomes the foundation for understanding risks and planning appropriate protective measures. If a system for the automatic recognition and tracking of vehicles or people is trained in simplified environments, it may fail in hostile contexts that differ significantly from the reference dataset, producing erroneous decisions that jeopardize the safety of personnel.
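
To make the idea concrete, an Operating Design Domain can be thought of as an explicit, machine-readable description of the conditions a model was trained and validated for, checked before the model's output is trusted. The sketch below is a hypothetical illustration, not a structure defined by JSP 936; all class names, fields, and thresholds are assumptions for the example.

```python
from dataclasses import dataclass

@dataclass
class OperatingDesignDomain:
    """Hypothetical, simplified ODD description for an object-recognition model."""
    environments: set[str]          # scene types represented in the training data
    min_illumination_lux: float     # sensor conditions the model was validated for
    max_sensor_noise: float

    def covers(self, environment: str, illumination_lux: float, sensor_noise: float) -> bool:
        """Return True only when current conditions fall inside the declared ODD."""
        return (
            environment in self.environments
            and illumination_lux >= self.min_illumination_lux
            and sensor_noise <= self.max_sensor_noise
        )

# Example: a model trained only on daytime urban imagery.
odd = OperatingDesignDomain({"urban_day"}, min_illumination_lux=500.0, max_sensor_noise=0.1)

if not odd.covers("desert_night", illumination_lux=5.0, sensor_noise=0.3):
    print("Outside the Operating Design Domain: route the decision to a human operator.")
```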


The initial section of the document insists on the importance of not viewing ethical and regulatory factors as a brake on innovation but rather as a lever to consolidate trust among all stakeholders, from individual operators to Parliament and public opinion. Framing AI within a defined perimeter of responsibility—clarifying who controls the algorithm, who is accountable for the outcomes, and how review processes are structured—makes wider and, above all, more sustainable long-term adoption possible. The presence of active human oversight, with tracking and auditing mechanisms, is one of the key conditions for maintaining the so-called human-centricity expressed by the Directive. This is exactly one of the points on which the methodological framework focuses: the ASR principles (Ambitious, Safe, Responsible) aim to manage the adoption of tools that can have a major impact on operational decisions in a balanced way.


In parallel, AI’s strategic relevance also extends to more “behind the scenes” aspects, which are no less critical, such as the analysis of large volumes of data for predictive maintenance of weapons systems or logistical fleets and the reduction of response times in back-office administrative procedures. AI, when properly trained and integrated into reliable architectures, can speed up essential information flows and relieve staff of more repetitive tasks, leaving room for strategic planning or supervision duties. The danger, however, lies in possible data misinterpretation errors or in excessive trust in the algorithm when real conditions differ from the training scenario. Hence the need for continuous monitoring of model performance, both before release and during deployment, thanks to testing and verification procedures designed to account for potential changes in the operating environment.
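
As a minimal sketch of what such continuous monitoring could look like, the snippet below compares field accuracy over a rolling window against the baseline measured before release and flags degradation. The class name, window size, and tolerance are illustrative assumptions, not values taken from the Directive.

```python
from collections import deque

class PerformanceMonitor:
    """Hypothetical rolling monitor that flags when deployed accuracy drifts
    below the level measured during pre-release testing."""

    def __init__(self, baseline_accuracy: float, tolerance: float = 0.05, window: int = 500):
        self.baseline = baseline_accuracy
        self.tolerance = tolerance
        self.outcomes = deque(maxlen=window)  # 1 = correct prediction, 0 = incorrect

    def record(self, correct: bool) -> None:
        self.outcomes.append(1 if correct else 0)

    def degraded(self) -> bool:
        if len(self.outcomes) < self.outcomes.maxlen:
            return False  # not enough field evidence yet
        field_accuracy = sum(self.outcomes) / len(self.outcomes)
        return field_accuracy < self.baseline - self.tolerance

monitor = PerformanceMonitor(baseline_accuracy=0.92)
# In service, each verified prediction feeds the monitor:
#   monitor.record(prediction == ground_truth)
#   if monitor.degraded(): trigger re-validation before continued use.
```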


The strategic analysis presented by the document further highlights the need to maintain a multidisciplinary approach, engaging legal, technical, and operational expertise that works together throughout the entire AI lifecycle. This involvement must begin in the embryonic stage, when functional requirements are defined and data collection is initiated, and it must continue through development, integration, and reliability assessment. It is not uncommon for Defense projects to make use of open-source or commercial software solutions, and in such cases it is crucial to require an adequate level of certification from external suppliers, so that solid evidence about data and testing processes is not missing. The British Ministry of Defence, in this regard, underscores the need for contractual guarantees that allow all necessary checks to be carried out, including those relating to the source of the dataset.


Reliable AI in Defense: Ethical and Regulatory Principles

The document clearly states that Artificial Intelligence should not be treated as a mere digital aid, but as a technology destined to interface with vital decision-making processes. Hence the centrality of regulatory references and ethical principles: any AI system must comply with international laws and conventions, particularly within the framework of International Humanitarian Law. In Defense applications, this entails a thorough review of norms concerning the use of force, the protection of human rights, and the accountability of military leadership. “JSP 936” cautions technicians and project managers about the risks of any lack of legal oversight: neglecting it could result in violations for which the entire military organization would be liable, causing extremely serious repercussions in terms of credibility and political responsibility.


The approach codified in the five ASR principles—human-centricity, responsibility, understanding, bias and harm mitigation, reliability—suggests that every action should be evaluated with a view toward its potentially extensive impact, because AI solutions have an adaptive nature. A model trained on specific datasets can change its performance and outcomes if exposed to new conditions or alternative data sets. The principle of human-centricity reaffirms the need to keep the person (operator, analyst, citizen) at the center of the decision-making chain, both to prevent possible harm to civilian communities and to ensure that the decisions made in operational contexts are appropriate.


Responsibility then implies defining, without ambiguity, who is accountable for the AI system’s actions during development, training, operational deployment, and ongoing maintenance. The document introduces specific reference roles, such as the Responsible AI Senior Officer (RAISO), designed to ensure that no gray areas arise in which the algorithm operates without human control. In this scenario, understanding also becomes a key factor: if a team cannot explain the basic criteria by which a given model generates its outputs and is unable to understand the limits of its training data, the very foundations for an intelligent and informed use of AI collapse. Merely implementing machine learning mechanisms and hoping they yield reliable results is not enough: organizations must structure comprehensive documentation, conduct validation tests, and ensure that end users understand the system outputs, at least at a level sufficient to guide the necessary trust or caution.


The analysis of bias and harm mitigation draws attention to the problem of discrimination and potential unintended consequences. A facial recognition algorithm, for example, could have higher error rates for certain population groups if it is trained on unbalanced datasets. In a Defense context, unjustified discrimination or underestimation of certain risk profiles could result in operations that fail to comply with proportionality principles and the protection of civilians. Therefore, data collection must be handled rigorously, certifying the source, quality, and relevance of the information to the expected scenarios. The same applies to the secure management of data and models, as any cybersecurity vulnerabilities could compromise the entire system, opening the door to manipulation or theft of sensitive information.
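
A simple way to make such disparities visible is to compute error rates per population group on an evaluation set, as in the hypothetical sketch below. The grouping scheme, the toy data, and the 10% disparity threshold are illustrative assumptions, not requirements stated in JSP 936.

```python
from collections import defaultdict

def error_rate_by_group(records):
    """Compute the error rate for each group in an evaluation set.
    `records` is an iterable of (group_label, predicted, actual) tuples."""
    errors = defaultdict(int)
    totals = defaultdict(int)
    for group, predicted, actual in records:
        totals[group] += 1
        if predicted != actual:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}

rates = error_rate_by_group([
    ("group_a", 1, 1), ("group_a", 0, 0), ("group_a", 1, 0),
    ("group_b", 0, 1), ("group_b", 0, 1), ("group_b", 1, 1),
])
if max(rates.values()) - min(rates.values()) > 0.10:   # illustrative disparity threshold
    print("Error-rate disparity across groups exceeds tolerance; rebalance the training data.")
```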


Another relevant aspect is reliability—the need to ensure that the AI system operates robustly, safely, and according to requirements even in adverse circumstances. Defense recalls the typical verification and validation procedures for software, which must be extended with large-scale tests and ongoing reviews, because learning algorithms may degrade over time or become unpredictable under extreme conditions. A security-by-design approach is proposed, integrating safety evaluations and mechanisms from the outset, along with continuous monitoring in real-world scenarios. This consideration carries even greater weight in the case of Robotic and Autonomous Systems (RAS), where human intervention can be limited, and an algorithmic malfunction could lead to errors in critical operational theaters.


In the legal and ethical sections of the document, it is emphasized that compliance is not solely about what the technology does but also about how it is implemented and managed. It is in this “how” that potential violations or compliance become apparent: the same AI could be employed or configured in very different ways, and the Directive reiterates that every step must align with national and international regulations. Clarity of roles thus becomes decisive. The internal legal team, in contact with the Ministry’s legal advisors, must periodically review the development and use of the technology, flagging at-risk areas or regulatory gaps. Final decisions will be made by higher levels, such as the TLB Executive Boards, which, in turn, send compliance declarations and risk reports to top-level figures such as the Second Permanent Under Secretary (2PUS) or the relevant ministers, if risk levels are deemed critical.


AI Security and Testing in Defense: Toward Reliable Implementation

One of the most detailed sections of the document concerns the process of creating, testing, and integrating AI solutions. It describes methodologies akin to DevOps and MLOps principles—workflows intended for the continuous refinement of algorithms. The official text stresses how Machine Learning models or Deep Learning techniques require suitable training and validation datasets, to avoid overfitting (when the algorithm learns the dataset too closely and loses its ability to generalize) or underfitting (when the algorithm fails to capture the complexity of the problem). There is also the risk of catastrophic forgetting, in which a model, upon being updated with new data, “forgets” previously acquired knowledge.
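
One common guard against overfitting is to track a separate validation set during training and stop when it no longer improves. The loop below is a generic sketch of that practice, not a procedure prescribed by the Directive; the `fit_one_epoch` and `evaluate` methods are assumed to exist on whatever model object is used.

```python
def train_with_early_stopping(model, train_data, val_data, max_epochs=100, patience=5):
    """Hypothetical training loop: stop when the validation loss stops improving,
    a standard guard against overfitting."""
    best_val_loss = float("inf")
    epochs_without_improvement = 0
    for epoch in range(max_epochs):
        model.fit_one_epoch(train_data)          # assumed interface
        val_loss = model.evaluate(val_data)      # assumed interface
        if val_loss < best_val_loss:
            best_val_loss = val_loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
        # A falling training loss paired with a rising validation loss is the
        # classic signature of overfitting.
        if epochs_without_improvement >= patience:
            print(f"Stopping at epoch {epoch}: no validation improvement for {patience} epochs.")
            break
    return model
```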


The text reflects on a crucial point: every AI solution must be integrated into a broader system with specific security features, hardware configurations, and defined interfaces. If the surrounding components change substantially, it must be verified that the algorithm still functions correctly by re-running integration and validation tests. Verification concerns both code integrity and compliance with requirements as well as the management of vulnerabilities. In the military context, this need is particularly stringent, as a small error in data interpretation can have enormous consequences on the ground, jeopardizing missions or endangering human lives.


Within this reflection on model robustness, the Directive reiterates the need to constantly monitor the operational environment in which the AI is deployed. The so-called Operating Design Domain thus becomes a fundamental criterion to define the model’s scope of validity and to understand when incoming data falls outside the expected range. If a system has been trained to operate in urban scenarios, it may not be suitable for electronic warfare in desert areas. Periodic updates of neural networks, based on new data, are essential but must be carried out through a quality process that does not compromise previously acquired performance. Also relevant here is the issue of data configuration, which must be protected from tampering and responsibly managed concerning provenance, as specified by the configuration policy defined by the Ministry of Defence.
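
On the provenance and tamper-protection point, a basic technique is to record a cryptographic digest of every dataset file and verify it before each retraining cycle. The sketch below illustrates that idea only; it is not the Ministry's configuration policy, and the file paths are placeholders.

```python
import hashlib
import json
from pathlib import Path

def build_manifest(data_dir: str) -> dict:
    """Record a SHA-256 digest for every file in a dataset directory."""
    return {
        str(path): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(Path(data_dir).rglob("*")) if path.is_file()
    }

def verify_manifest(data_dir: str, manifest_file: str) -> list[str]:
    """Return the files whose contents no longer match the recorded digests."""
    recorded = json.loads(Path(manifest_file).read_text())
    current = build_manifest(data_dir)
    return [path for path, digest in recorded.items() if current.get(path) != digest]

# Before each retraining cycle, any non-empty result here should block the update:
#   tampered = verify_manifest("training_data/", "training_data.manifest.json")
```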


Key points regarding development connect to the importance of choosing performance metrics that best match military and security objectives. High accuracy in the lab may not translate into satisfactory accuracy in the field, especially if the training dataset does not reflect real conditions. Consequently, it is mandatory to protect test data and separate validation datasets to independently verify system performance. An integrated security approach is also required from the design stage to prevent poisoning attacks or modifications during the inference phase. The directive acknowledges that traditional methods are not always sufficient, especially in the rapidly evolving field of machine learning, and therefore recommends ongoing integration of risk analysis procedures throughout the entire lifecycle.
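
One practical reason laboratory metrics overstate field performance is leakage between training and test data. A minimal check, sketched below under the assumption that raw samples can be fingerprinted by hashing, counts how many test samples also appear in the training set; any overlap should invalidate the reported metrics.

```python
import hashlib

def fingerprints(samples):
    """Hash each raw sample so overlap can be checked without comparing full records."""
    return {hashlib.sha256(repr(sample).encode()).hexdigest() for sample in samples}

def leaked_samples(train_samples, test_samples) -> int:
    """Count test samples that also appear in the training set.
    Any overlap inflates laboratory metrics relative to field performance."""
    return len(fingerprints(train_samples) & fingerprints(test_samples))

train = [("image_001", "vehicle"), ("image_002", "person")]
test = [("image_002", "person"), ("image_003", "vehicle")]
assert leaked_samples(train, test) == 1   # image_002 should not be in both splits
```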


An interesting perspective is offered on model reusability. The Directive specifies that in many contexts, it might be preferable to use an already trained model, modifying certain parts or retraining it on more specific datasets. In such circumstances, it is necessary to ensure the availability of transparent documentation on how the model was initially developed and verified, on any licensing constraints, and on the guarantees of compatibility with operational requirements. Here again, the supplier contracts play a role, clarifying who owns the algorithm, who holds the intellectual property for the data, and whether internal validation tests may be conducted. Only when these elements are in place can the same model be safely integrated into new systems or operational contexts. On the other hand, the contractual dimension also takes on an international profile, since collaboration with foreign industries and universities must consider export controls, potential restrictions, and the fact that in multinational cooperation scenarios (e.g., with NATO or other allied forces), the rules might vary.


The Directive also suggests not overlooking the factor of obsolescence: software systems evolve rapidly, and today’s cutting-edge AI solutions may become outdated in a short span of time. It is crucial to plan updates and maintenance procedures that keep pace with emerging security threats and technological advancements, assessing how far a model can be extended or updated without risking negative impacts on performance.


Risk Management, Security, and Accountability in Experimentation

One of the core themes of JSP 936 pertains to risk management throughout the entire process of AI development and deployment. The classification system proposed suggests defining a level of risk based on impact and probability, identifying possible scenarios in which improper use or an algorithmic flaw could cause tangible harm. AI projects that exceed certain critical thresholds require extremely high-level oversight, undergoing review by bodies like the Joint Requirements Oversight Committee or the Investments Approvals Committee, and in extreme cases, even ministerial supervision. This is not mere bureaucracy, but a mechanism designed to ensure maximum alertness when activities with strong ethical or operational implications are involved.
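
To illustrate the impact-and-probability logic (the actual scales, thresholds, and escalation routes in JSP 936 are not reproduced here), a risk rating can be expressed as a simple matrix whose score determines the level of oversight required.

```python
# Hypothetical illustration of a risk rating built from impact and probability;
# the scales, thresholds, and escalation labels are assumptions, not JSP 936 values.
IMPACT = {"negligible": 1, "moderate": 2, "severe": 3, "critical": 4}
PROBABILITY = {"rare": 1, "possible": 2, "likely": 3, "almost_certain": 4}

def risk_score(impact: str, probability: str) -> int:
    return IMPACT[impact] * PROBABILITY[probability]

def escalation_level(score: int) -> str:
    if score >= 12:
        return "senior committee / ministerial review"
    if score >= 6:
        return "programme board review"
    return "delegated project-level approval"

score = risk_score("severe", "likely")     # 3 * 3 = 9
print(score, escalation_level(score))      # 9 programme board review
```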



The text clarifies that security extends beyond protection from cyberattacks—though that is a core focus, given the growth of advanced hacking techniques and the possibility of manipulating training data to produce adverse effects. Security also includes the physical safety of scenarios where AI is employed in autonomous aerial, ground, or naval systems. In such cases, an algorithmic failure or a malfunction due to hostile electronic countermeasures could lead to dangerous maneuvering errors. That is why the Directive stresses rigorous testing procedures, simulated under realistic or near-real conditions, with the ability to quickly isolate the system in the event of abnormal behavior. Setting safety standards and coordinating with regulations such as Def Stan 00-055 and 00-056 are mandatory, as is adopting proven Safety Management Systems (JSP 815, JSP 375, and JSP 376).


The theme of responsibility, linked to AI governance, involves multiple professional roles and spans the entire project lifecycle, from initial development to subsequent updates, including real-world mission operations. The suggested approach aims to avoid redundant structures while updating existing control processes to integrate AI-specific features. The top authorities intend for teams not to duplicate unnecessary procedures but to adapt protocols so they can recognize and manage the risks inherent in machine learning systems.


A responsible approach also implies the awareness that AI is fallible and may have error margins that are not always predictable. In the context of Research & Development projects, the Directive emphasizes the need for controlled testing, preferably in safe environments, where any undesirable behavior can be studied and corrected. When research on human subjects is required to validate the effectiveness of certain algorithms (e.g., for the analysis of human-machine interactions), it must strictly adhere to the guidelines of JSP 536, addressing issues of safety and informed consent. Unintended effects on unaware individuals must be avoided, such as the use of sensitive personal data in contexts not clearly authorized.


Also regarding experimentation, the Directive indicates the production of templates and support materials (model cards, ethical risk labels, AI assurance questionnaires) to assist personnel. The objective is to create a library of best practices so that various departments can share information on successful solutions, lessons learned and identified vulnerabilities. This exchange is deemed essential for interoperability with allies, both within and beyond NATO, because AI does not respect national borders and requires international cooperation to be effectively managed. In particular, the British Defense approach, consistent with NATO trends, is grounded in building AI that is transparent, analyzable, and aligned with shared democratic principles.
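
The exact templates referenced by the Directive are not reproduced in this article, but a model card can be pictured as a structured record of intended use, training data, evaluation, and ownership, checked for completeness before a model is shared. Every field and value below is a hypothetical placeholder.

```python
# A minimal, hypothetical model-card structure for illustration only.
model_card = {
    "model_name": "vehicle-recognition-demo",
    "intended_use": "illustrative example only",
    "operating_design_domain": ["urban_day"],
    "training_data": {
        "sources": ["synthetic imagery (hypothetical)"],
        "known_gaps": ["night-time scenes", "desert terrain"],
    },
    "evaluation": {
        "held_out_accuracy": 0.91,
        "per_group_error_rates": {"group_a": 0.08, "group_b": 0.12},
    },
    "ethical_risk_label": "medium",
    "responsible_owner": "RAISO-designated team",
    "review_date": "2025-01-01",
}

def missing_fields(card: dict, required=("intended_use", "operating_design_domain", "training_data")):
    """Flag documentation gaps before a model is shared across departments."""
    return [field for field in required if not card.get(field)]

assert missing_fields(model_card) == []
```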


Risk management is further strengthened by consideration of issues such as confidentiality, integrity, and availability of data (the classic pillars of cybersecurity). For a system trained with classified data, the Directive specifies that the resulting model inherits the same or even a higher level of classification if aggregating sensitive data creates a high-risk scenario. This entails an obligation to maintain strict control over information flows, with auditing procedures and a clear trace of data movement from the initial source through training to final deployment in the field.
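
The inheritance rule can be pictured as taking the highest marking among the training datasets, with an optional uplift when aggregation itself raises sensitivity. The ordering and function below are a sketch of that logic only; the real handling rules sit in MoD security policy, not in this example.

```python
# Hypothetical ordering of classification markings, for illustration only.
CLASSIFICATION_ORDER = ["OFFICIAL", "OFFICIAL-SENSITIVE", "SECRET", "TOP SECRET"]

def inherited_classification(dataset_markings: list[str], aggregation_uplift: bool = False) -> str:
    """A model is marked at least as high as the most sensitive dataset it was
    trained on; aggregating sensitive sources may justify a further uplift."""
    highest = max(dataset_markings, key=CLASSIFICATION_ORDER.index)
    if aggregation_uplift and highest != CLASSIFICATION_ORDER[-1]:
        highest = CLASSIFICATION_ORDER[CLASSIFICATION_ORDER.index(highest) + 1]
    return highest

print(inherited_classification(["OFFICIAL", "SECRET"]))                              # SECRET
print(inherited_classification(["OFFICIAL", "OFFICIAL"], aggregation_uplift=True))   # OFFICIAL-SENSITIVE
```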


Human-AI Teaming in Defense: Integration and Innovation

“JSP 936” devotes particular attention to the integration of humans and intelligent machines. This topic does not concern only drone pilots or soldiers using automatic targeting systems but extends to administrative and logistical sectors as well. Human-AI teaming is considered a hallmark of the ongoing digital transformation: operator and machine must work in synergy, leveraging their respective strengths. The human role remains crucial in ensuring meaningful control and intervening with the required flexibility, while the machine can quickly analyze complex data, offering scenarios and reducing operators’ cognitive load.


However, for this collaboration to produce the desired outcomes, personnel training becomes indispensable. The document outlines the need to provide training not only in using new systems but also in developing a deep understanding of their vulnerabilities and the associated risks. If an operator places blind trust in the outcome of an image recognition system, for instance, they might miss false positives or false negatives in unforeseen conditions, with potentially disastrous consequences. The Directive recommends planning training programs that expose personnel to edge cases, anomalies, and typical AI model errors, providing clear guidelines on when and how to intervene manually.
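
A simple operational expression of "when to intervene manually" is a triage rule that only acts automatically when the model is confident and operating inside its declared domain, routing everything else to a trained operator. The function and threshold below are a hypothetical sketch of such a rule, not a value or mechanism defined by JSP 936.

```python
def route_detection(label: str, confidence: float, in_odd: bool, threshold: float = 0.90):
    """Hypothetical triage rule: act on a model output only when confidence is high
    and the scene lies inside the Operating Design Domain; otherwise hand the case
    to a human operator."""
    if in_odd and confidence >= threshold:
        return ("automated_track", label)
    return ("human_review", label)

print(route_detection("vehicle", confidence=0.97, in_odd=True))    # ('automated_track', 'vehicle')
print(route_detection("vehicle", confidence=0.97, in_odd=False))   # ('human_review', 'vehicle')
print(route_detection("person", confidence=0.55, in_odd=True))     # ('human_review', 'person')
```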


Human-centricity is fully evident in this context, too, as personnel are not merely cogs in a machine but are instead protagonists in the integration of Artificial Intelligence into Defense. In some operational scenarios, robots and autonomous systems must function without continuous oversight, but a central command should always be able to resume control at any time. This form of “meaningful control” is at the core of military ethics and satisfies specific legal requirements. The Directive thus stresses defining clear roles and specific responsibilities: who trains the AI, who evaluates it, who approves it, who monitors its performance in missions, and who manages emergencies. Each person involved should have the requisite training to fulfill their role, and where internal competencies fall short, collaboration with universities and specialized firms is encouraged to fill any knowledge gaps.


The document illustrates that the challenge of operating across multiple domains—air, land, sea, space, and cyberspace—necessitates unified standards: an AI system controlling an autonomous ground vehicle might need to communicate with a naval platform or an observation satellite. From this perspective, human-AI teaming becomes a large-scale team effort where multiple algorithms operate in parallel and various groups of operators simultaneously interact with the technology. Complexity increases, as does the need for integrated testing procedures, joint simulation scenarios, and a regulatory framework that defines collective responsibilities. It is precisely in this integration that the British Defense sees an opportunity to maintain a military advantage, provided a trust-based ecosystem is created among Allies and sufficient guarantees of correct system behavior are offered.


In its final chapters, “JSP 936” explicitly mentions the need to update personnel career paths so that AI is not viewed merely as a tool but as an integral part of a soldier’s or Defense official’s job. Achieving this cultural shift requires constant investment: from e-learning platforms to the creation of multidisciplinary analyst teams, from enhancing simulation laboratories to introducing specific security protocols for AI scenarios. Ultimately, the Directive promotes an organizational model capable of evolving at the same pace as technology, avoiding rigidity in frameworks that are no longer adequate for the contemporary context.


Conclusions

The information presented in “JSP 936 V1.1 Dependable Artificial Intelligence (AI) in Defence Part 1: Directive” provides a realistic and detailed picture of how Artificial Intelligence is entering the mechanisms of Defense, influencing operational choices, logistical processes, and ethical assessments. Security, robustness, and system transparency are no longer mere technical details; they are actual enablers of a potential competitive advantage on which armed forces are investing. From the current state of the art, it is clear that many similar technologies—ranging from large neural networks used by commercial enterprises to predictive analysis software in the financial sector—already offer comparable functionalities. The real challenge lies in the specific integration of these tools into operational theaters, alongside the strict legal accountability standards required by both national and international defense.


A key factor lies in ensuring ongoing dialogue between scientific research and the military domain, promoting opportunities for reflection that allow for predicting and understanding the future impacts of algorithms. Often, those who develop a Deep Learning model do not fully realize the operational complexities of a battlefield, just as those who plan missions may be unfamiliar with the potential pitfalls of a partially trained model. Hence the necessity for permanent interfaces between areas of expertise to ensure that solutions, while ambitious, do not exceed acceptable risk levels.


In an increasingly rich landscape of AI solutions—from open-source platforms to offerings by major multinationals—Defense must evaluate how external systems can be integrated into proprietary architectures. The interoperability question, especially in international alliances and with NATO, goes far beyond choosing file formats. It concerns ensuring that ethical principles, testing methodologies, and security standards are aligned, so as to build mutual trust and a solid framework for information sharing. Comparing with competing or parallel technologies, developed in other countries or the private sector, provides an opportunity for continuous improvement, provided one remains firmly rooted in reliability and transparency criteria.


The need for strict protocols, detailed risk analysis, and continuous ethical oversight makes the sector of Artificial Intelligence in Defense a laboratory for ideas where synergy between industry and military institutions can produce solid innovations. In practical terms, this means exploring business models in which public-private collaboration goes beyond the mere supply of technological solutions, fostering an ongoing exchange of legal, scientific, and operational competencies.


“JSP 936” is not just a rulebook but an incentive to understand how far Artificial Intelligence can go without losing sight of democratic values and collective security. While the rapid pace of technological evolution encourages the experimental adoption of increasingly complex systems, it also calls for calm reflection on strategic impacts and on the possibility that, in the near future, models may become even more capable of learning and adapting. Ultimately, the effectiveness of these tools will hinge on organizations’ abilities to anticipate and govern ethical and operational implications, as well as to train personnel for critical and informed use, striving for a balance that enables them to reap the benefits without subjecting defense structures to unnecessary risks. The key message is that the real strength of Artificial Intelligence lies in the collaboration between humans and machines, provided it is supported by solid processes and an ever-updated ethical and regulatory vision.


 
