Impact of AI in Accounting and Finance
Artificial intelligence (AI) is transforming the accounting and finance sector, bringing innovation and significantly improving decision-making processes. However, it also introduces important challenges, particularly in terms of ethics, data management, and the transformation of professional skills. This article is based on research conducted by the Institute of Management Accountants (IMA), led by Qi "Susie" Duong, along with other expert consultants in the sector. We will explore the main trends of AI in accounting and finance, practical applications, and the challenges and requirements for the effective implementation of these technologies.

AI Trends in Accounting and Finance

The exponential growth of AI is leading to a profound transformation in the way companies operate in the accounting and finance sector. According to IMA research, 70% of business leaders believe that AI is transforming the industry, especially through the adoption of predictive models and real-time data analysis. In particular, the integration of machine learning algorithms makes it possible to obtain more accurate forecasts and identify market trends in a timely manner. This is evident at companies such as Zoom and Ford, which are using AI models to anticipate analysts' questions and respond to internal queries, with reported productivity increases of up to 25% in some areas.

AI technologies are now used to automate traditionally manual processes such as accounts payable and receivable, monthly and quarterly closings, expense management, and procurement. Business leaders are exploring the potential of generative AI to boost productivity and gain new strategic insights. Generative AI, a subclass of machine learning, can create new content and generate added value in business processes.
For example, 45% of the study participants stated that the adoption of generative AI has significantly improved efficiency in managing financial reports and creating automated content for strategic analysis.

Applications of AI

AI is applied in various aspects of financial management, including process simplification and risk management. One of the main areas of AI use is the automation of accounting processes. About 65% of the companies surveyed have implemented automation systems for accounts payable and receivable, achieving a reduction in processing times by up to 30%. Additionally, AI has been employed to improve the accuracy of quarterly financial closings, reducing the margin of human error by 20%. This was possible thanks to the use of optical character recognition (OCR) algorithms that automate the recognition and recording of financial documents.

Another significant example is the implementation of AI for tax management in complex international contexts. A leading company in the smart devices sector used an AI-integrated tax engine to identify discrepancies in tax regulations across different countries, improving compliance and reducing operational costs related to tax irregularities by 18%. The integration of AI systems for automatic report generation also increased efficiency, allowing a daily and consolidated view of global financial performance.

In the healthcare sector, AI has had a particularly significant impact on hospital cost management. The adoption of AI for monitoring operational expenses has led to a 15% saving, thanks to predictive analysis and improved data-driven decision-making. The ability to process large volumes of data has made AI a key tool for strategic resource planning, especially in emergencies such as the COVID-19 pandemic.

Another interesting example is the use of artificial intelligence to optimize supply chain management at an egg-producing company.
The AI system, trained to analyze egg images, allowed for accurate counting and defect detection, resulting in a saving of about $6 million by reducing losses. This example clearly shows how AI can help improve operational efficiency and profitability.

Challenges and Prerequisites of AI Implementation

Despite the evident advantages, the integration of AI into the accounting and finance sector presents considerable challenges. According to IMA research, 38% of participants identified the human aspect as the main challenge for the success of AI initiatives. In particular, the lack of specialized skills among staff represents a significant problem: many companies are trying to bridge this gap through training and development programs, but 30% of organizations report difficulties finding suitable talent. Additionally, the lack of support from top management has been cited as one of the main obstacles to the effective adoption of AI, especially given the need to reorganize resources and establish new strategic priorities.

From a technological point of view, 33% of participants highlighted poor data quality as a critical element. The availability of high-quality data is essential for the effectiveness of AI algorithms, but many existing systems cannot provide the necessary information with the required precision. The digital maturity of organizations has been identified as another relevant obstacle, especially in small and medium enterprises, where 44% of companies declared they are not ready to embark on a digital transformation journey.

The research also showed regional differences in the challenges faced. In the United States, the Asia-Pacific region, and China, the main challenges are human aspects, while in the Middle East and North Africa (MENA) region, the main obstacles are related to technological maturity and data quality.
In Europe, operational challenges constitute the biggest obstacle to AI implementation, while in India, concerns are mainly ethical and related to governance. Regarding ethical and governance aspects, 20% of participants expressed concerns about data security and information confidentiality. The management of biases in data and the transparency of AI models are key elements to ensure stakeholder trust and mitigate the ethical risks associated with adopting these technologies. It has been suggested to establish rigorous governance protocols and adopt data quality control practices to avoid distortions that could compromise results. A fundamental prerequisite for AI success, according to 40% of the study participants, is the "top-down" approach. The support and commitment of company leadership are essential to ensure that AI is implemented in line with the organization's strategic objectives. Additionally, 25% of organizations emphasized the importance of a detailed cost-benefit analysis before adoption, to ensure that investments in AI bring actual improvements in productivity and time savings. The lack of support from top management, the absence of specific skills to work with AI, and the difficulty in obtaining consensus from all stakeholders are some of the main limiting factors. For example, it has been found that resistance to change is often more challenging to address than the technology adoption itself. Collaboration between financial professionals and data scientists, known as "collaborative intelligence," is essential to ensure effective AI implementation and optimal results. AI can amplify human cognitive abilities, while humans provide the necessary context and oversight to avoid errors and biases. For instance, the involvement of financial experts in AI algorithm training ensures that models are trained with realistic data and that analyses are relevant to the company's objectives. 

Ethical and Governance Aspects

The adoption of AI in accounting and finance raises important ethical issues, such as data integrity, security, and confidentiality. According to IMA research, 40% of participants stressed the importance of ensuring data integrity to mitigate the risks of biases in the data itself. A participant in the United States described how their AI system was trained with data representative of the entire product population to avoid biases and improve the accuracy of analyses. Additionally, 20% of participants expressed concerns about data security, with particular emphasis on protecting personal information and ensuring confidentiality during all stages of data processing.

Another fundamental aspect concerns the governance of AI systems. About 35% of respondents highlighted the importance of establishing clear governance protocols and educating stakeholders on the use of AI technologies. This aspect is particularly relevant in regions such as Asia-Pacific, where some governments, like Japan's, are beginning to discuss how to regulate AI use in both the public and private sectors. Trust in AI systems largely depends on transparency: 25% of participants stated that a clear understanding of the processes leading to AI-generated recommendations is essential to build and maintain user trust.

Finally, the issue of trust in AI systems remains crucial. The lack of knowledge about what AI can actually accomplish and how it can transform the work of accounting and finance professionals has been cited as a significant factor contributing to the lack of trust. To address this issue, it is necessary to develop training programs that help professionals understand the limits and potentials of AI, promoting responsible and informed use of AI technologies.

Conclusions

The impact of artificial intelligence on accounting and finance is not just a matter of operational efficiency or cost reduction: it represents a profound redefinition of the human and organizational role in an increasingly automated and interconnected financial ecosystem. What emerges strongly is that the real challenge is not only technological but also cultural, strategic, and even ethical. AI does not simply change the "how" but forces companies to rethink the "why" of many of their traditional activities. This leads to a critical reflection on digital transformation as an opportunity not only to improve but to redefine business value models.

Firstly, the automation and predictive analysis enabled by AI are pushing companies to move from a reactive to a proactive approach. Decisions are no longer based only on historical data but on simulations and projections that allow future scenarios to be anticipated. This radically changes the concept of risk, which becomes more manageable but also more exposed to the interdependence of complex systems. In this sense, the role of the CFO will no longer be limited to overseeing the company's financial health but will become increasingly strategic, requiring an integrated vision that embraces finance, technology, and sustainability.

A critical point that is often overlooked is that AI redefines the concept of business value. It is not just about doing better what was already done but understanding which new market spaces, products, or services can emerge. For example, generative AI, through the creation of strategic content, not only improves efficiency but transforms the approach to business knowledge, fostering a type of innovation that could be called "data-driven." However, this potential risks remaining unexpressed without a strong commitment from company leadership, which must be able to translate technological results into concrete strategies.
An even deeper aspect concerns the transformation of professional skills. Repetitive and transactional work is destined to disappear, but with it comes the need to develop hybrid skills. Accounting and finance professionals will have to become interpreters, mediators, and curators of AI-generated results. This means developing collaborative intelligence that goes beyond simply using machines to understand and contextualize their analyses. Continuous training, however, is not enough: a change in mindset is needed, one that values the complementarity between humans and machines. In other words, AI should not be seen as a substitute but as a multiplier of human capabilities. On the ethical and governance level, a crucial theme emerges: AI is not neutral. AI models inherit the biases and limitations of the data with which they are trained. This requires companies to redefine the boundaries of responsibility: who is responsible for a decision error derived from an algorithm? How is transparency ensured in models that are often perceived as opaque by nature? And, above all, how can the adoption of advanced technologies be balanced with stakeholder trust, which is increasingly attentive to security and sustainability issues? Finally, AI introduces a geopolitical dimension to the accounting and finance sector. Digital maturity and local regulations influence the speed and success of adoption. However, companies that can overcome these barriers and align AI implementation with strategic objectives can gain a competitive advantage that is hard to replicate. This poses an additional challenge: AI integration must be accompanied by global change management capabilities, taking into account cultural, regulatory, and technological maturity differences. Ultimately, artificial intelligence is not simply a technological investment but a catalyst for broader organizational and social change. 
To fully exploit its potential, companies must embrace a holistic approach, where technology, people, and strategies merge into an agile, ethical, and future-oriented ecosystem. The real challenge is not to implement AI but to integrate it in such a way that it creates sustainable value for all stakeholders, anticipating the needs of an increasingly complex and interconnected world.

Podcast: https://spotifycreators-web.app.link/e/oQYQ8Dcf0Ob
Source: https://eu.imanet.org/research-publications/ima-reports/the-impact-of-artificial-intelligence-on-accounting-and-finance

BrainBench: Language Models Surpass Neuroscience Experts
Scientific research is increasingly becoming a complex challenge, requiring the ability to synthesize decades of studies. The current human capacity to process information has become inadequate in the face of the enormous amount of publications produced daily. In this scenario, Large Language Models (LLMs), trained on a vast corpus of scientific literature, emerge as a promising solution for integrating and predicting new results, often with greater efficiency than human experts. A recent study, published in the journal Nature Human Behaviour, introduced BrainBench, an innovative benchmark designed to evaluate the ability of LLMs to make predictions in the field of neuroscience, directly comparing them with experts in the field.

BrainBench and the Prediction Challenge

BrainBench is a benchmark specifically designed to test the ability of language models to predict the outcomes of neuroscience experiments. The structure of BrainBench includes the presentation of modified versions of scientific abstracts, allowing evaluation of the ability of LLMs to distinguish between plausible and altered results. The peculiarity of BrainBench lies in its "forward-looking" nature, meaning its ability to measure the predictive ability of LLMs in new situations, rather than merely verifying their ability to recall known information. This approach differs from other benchmarks that are primarily "backward-looking," such as PubMedQA or MMLU, where questions are about recalling existing knowledge. In BrainBench, two versions of a scientific abstract are presented, one original and one with modified results, and the participant's task is to identify which version is correct. The benchmark includes case studies from five subcategories of neuroscience: behavioral/cognitive, cellular/molecular, systems/circuits, disease neurobiology, and development/plasticity/repair.
This approach ensures a broad and representative coverage of different areas of neuroscience, making the prediction task particularly challenging. It has been observed that language models outperformed human experts in accuracy in all these subcategories. Specifically, the average accuracy of LLMs was 81.4%, while human experts reached only 63.4%. Even limiting the analysis to human experts with the highest self-assessed level of competence, accuracy reached only 66.2%, still lower than that of LLMs.

Another interesting aspect is the evaluation of models of different sizes. For example, smaller models like Llama2-7B and Mistral-7B, with 7 billion parameters, achieved performances comparable to larger models like Falcon-40B and Galactica-120B. Furthermore, it emerged that models optimized for dialogue or conversational tasks (such as "chat" or "instruct" versions) performed worse than their base counterparts. This suggests that aligning LLMs for natural conversations might hinder their scientific inference abilities.

The models' choices were evaluated through "perplexity," a measure of how surprising a model finds a given text: the lower the perplexity, the more plausible the model considers that version of the abstract. The models showed a significant improvement when they could access complete contextual information, rather than focusing on local parts of the text. This demonstrates how the ability to integrate information at a global level is one of the keys to their success compared to humans.

Overall, BrainBench represents an innovative method to evaluate not only the ability of LLMs to recall information but also their ability to generalize and predict the outcomes of experiments never seen before. The approach is based on the use of modified scientific abstracts, where the results of studies are substantially altered, to verify whether the models can distinguish between alternative versions of experiments.
For example, an original abstract might report that stimulation of a specific brain area increases a certain activity, while the modified version might indicate a decrease in activity. BrainBench evaluates whether the model can determine which of the two outcomes is more plausible, using methodological information and details provided in the abstract. This method requires that the models not only identify changes in the results, such as an increase or decrease in brain activity, but also relate them to the rest of the information in the abstract, such as the method used or the logic behind the discovery. In this way, BrainBench measures the ability of LLMs to integrate contextual and methodological information to make coherent inferences about new situations, simulating a scientific discovery process.

This evaluation is crucial to understanding the potential of LLMs in supporting scientific research, especially in complex fields like neuroscience, where coherence between method, data, and results is essential. This approach does not merely test the memorization of information but explores the ability of models to think critically and contribute to the interpretation and generalization of scientific knowledge.

Why Are LLMs So Powerful in Prediction?

A key element of the success of LLMs is their ability to integrate information from multiple sources and handle the complexity of different levels of detail, as evidenced by tests conducted with BrainBench. Specifically, when LLMs were tested using only single sections of the abstracts, their performance dropped drastically. On the other hand, with the integration of the entire abstract content, including methodology, background, and results, their predictive capability increased significantly. This suggests that LLMs can take advantage of the synergy of different pieces of information to formulate more precise predictions.
Moreover, the ability of LLMs to generalize information, even when noisy or potentially redundant, represents a competitive advantage. BrainBench showed that models like BrainGPT, trained on a specific corpus and enriched through techniques such as Low-Rank Adaptation (LoRA), achieved 3% higher performance than standard models. This improvement indicates how targeted customization and training on high-quality data can make LLMs extremely effective tools for predicting scientific results. The LLMs' approach to prediction relies on architectures such as Transformers, which allow precise modeling of relationships between elements of the text. This approach is particularly useful in neuroscience, where the phenomena to be analyzed often involve complex and interdependent data. Thanks to their billions of parameters, LLMs can identify patterns and correlations that escape human experts, making them suitable not only for predicting experimental results but also for suggesting new research directions. Another factor explaining the success of LLMs in prediction is their ability to adjust behavior based on confidence signals. LLMs use the difference in perplexity between versions of abstracts to calibrate their confidence in responses, which results in overall greater reliability. This level of calibration was a key factor in surpassing human experts, as it allowed the models to identify correct answers with greater certainty, especially in the more complex cases. In summary, the ability of LLMs to process enormous amounts of data, integrating information at different levels of detail and effectively handling complexity, makes them powerful tools for prediction in complex scientific fields. Their performance on BrainBench demonstrates that they are not only capable of competing with human experts but also significantly outperforming them, opening up new possibilities for using AI in supporting research and scientific discovery. 
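
The perplexity comparison described above can be illustrated with a minimal sketch. It assumes that the model's pick is the version of the abstract with the lower perplexity and that confidence is the absolute perplexity difference between the two versions. The function names and the token log-probabilities below are hypothetical, made up for illustration. In practice the log-probabilities would come from a language model scoring each abstract.

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp of the negative mean token log-probability (natural log)."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

def choose_abstract(logprobs_a, logprobs_b):
    """Pick the version the model finds less surprising.

    Returns (choice, confidence), where confidence is the absolute
    perplexity difference between the two versions.
    """
    ppl_a = perplexity(logprobs_a)
    ppl_b = perplexity(logprobs_b)
    choice = "A" if ppl_a <= ppl_b else "B"
    return choice, abs(ppl_a - ppl_b)

# Hypothetical per-token log-probabilities, invented for illustration only.
original = [-1.2, -0.8, -0.5, -1.0]   # version A: unmodified abstract
altered = [-1.2, -0.8, -2.9, -1.0]    # version B: one result token is surprising
choice, confidence = choose_abstract(original, altered)
print(choice, round(confidence, 3))
```

A larger gap between the two perplexities is read here as higher confidence, which is the signal the calibration analysis relies on.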

BrainGPT: A Model Tailored for Neuroscience

BrainGPT is a large language model specialized through fine-tuning on a neuroscience corpus. This adaptation was achieved through the Low-Rank Adaptation (LoRA) technique, which added over 629 million new weights within the structures of the Mistral-7B model, equivalent to about 8% of the total number of weights of the base model. This approach made it possible to optimize the model for neuroscience tasks, improving the ability to predict experimental results.

The training of BrainGPT involved over 1.3 billion tokens from neuroscience publications collected between 2002 and 2022, covering a total of 100 scientific journals. The data was extracted using the Entrez Programming Utilities (E-utilities) API and the Python package pubget, to ensure a high-quality and relevant dataset. This massive data corpus provided the model with a broad context for understanding and predicting neuroscience outcomes.

LoRA was chosen for its efficiency in adapting pre-trained models. Instead of retraining the entire model, LoRA inserts low-rank adaptation matrices into Transformer blocks, which are then specifically trained to update the model's behavior in a specific domain of knowledge. This process was particularly effective for BrainGPT, leading to about a 3% improvement in performance on BrainBench compared to general models, as evidenced by the conducted tests.

Analysis of the results showed that the LoRA technique not only improved the model's overall accuracy but also reduced the perplexity of correct answers (t(199) = 15.7, P < 0.001, Cohen's d = 0.25), indicating more effective specialization for neuroscience material. This improvement was achieved with relatively limited computational resources: the fine-tuning process required about 65 hours of computation on Nvidia A100 GPUs, using four units in parallel.
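
To make the low-rank idea concrete, here is a minimal pure-Python sketch of the general LoRA update, in which a frozen weight matrix W is adjusted by the product of two small trainable matrices B and A. The matrix sizes, the rank, and the helper names are illustrative assumptions and do not reflect BrainGPT's actual configuration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_update(w, b, a, scale=1.0):
    """Return W + scale * (B @ A); W itself stays frozen during fine-tuning."""
    delta = matmul(b, a)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

def lora_param_count(d_out, d_in, rank):
    """Trainable weights added by one adapter: B is d_out x rank, A is rank x d_in."""
    return rank * (d_out + d_in)

# Toy sizes, chosen only for illustration (not BrainGPT's configuration).
w = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]]          # frozen 4x4 weight matrix
b = [[0.5], [0.0], [0.0], [0.0]]    # 4x1, trainable
a = [[0.0, 2.0, 0.0, 0.0]]          # 1x4, trainable
adapted = lora_update(w, b, a)
print(adapted[0])                   # only the first row changes in this toy case

# For a hypothetical 4096 x 4096 layer at rank 8, the adapter is far smaller
# than the full matrix it modifies.
print(lora_param_count(4096, 4096, 8), "adapter weights vs", 4096 * 4096)
```

Because only B and A are trained, the number of updated weights grows with the rank rather than with the full size of each layer, which is why this kind of fine-tuning fits in a modest compute budget.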
An interesting aspect of BrainGPT is its ability to be continuously updated with new neuroscience data. Using complementary approaches such as retrieval-augmented generation (RAG), the model could be constantly aligned with the latest literature, keeping its performance up to date and relevant. In this way, BrainGPT can evolve into a tool not only for prediction but also for suggesting and supporting the planning of future experiments. This lays the foundation for increasingly close collaboration between human researchers and artificial intelligence models, expanding the possibilities for scientific discoveries in a complex field like neuroscience.

The Challenge of Confidence Calibration

Confidence calibration turns out to be a key element in studying the performance of large language models (LLMs). Research has shown that there is a positive correlation between the confidence expressed by the models in their answers and the accuracy of these answers. Specifically, when models were highly confident, their predictions were significantly more accurate. This relationship was quantified using logistic regression, highlighting a significant relationship between perplexity (an indicator of how predictable a model considers a given text) and the correctness of the answers provided.

It was found that language models perform better when they can clearly distinguish between correct and altered versions of a text. This ability was measured using the Spearman correlation, a statistic that indicates how strongly the rankings of two variables are related. In the study, the value of 0.75 indicates a strong relationship: the better the models were at noticing differences in texts, the more accurate their answers were. The estimate is reported with a narrow 95% confidence interval (±0.08).

This calibration has a crucial impact on decision support systems, where the model evaluations can integrate with human judgment.
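
One simple way to inspect the relationship between confidence and accuracy is to sort the test items by confidence, split them into equal-sized bands, and compute the accuracy within each band. The sketch below assumes this binning scheme. The function name and the data are hypothetical and are not taken from the study.

```python
def accuracy_by_confidence_band(confidences, correct, n_bands):
    """Sort items by confidence, split them into n_bands near-equal groups,
    and return the accuracy of each group (lowest confidence first)."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not 1 <= n_bands <= len(confidences):
        raise ValueError("n_bands must be between 1 and the number of items")
    ranked = sorted(zip(confidences, correct), key=lambda pair: pair[0])
    n = len(ranked)
    accuracies = []
    for band in range(n_bands):
        start = band * n // n_bands
        end = (band + 1) * n // n_bands
        group = ranked[start:end]
        accuracies.append(sum(1 for _, ok in group if ok) / len(group))
    return accuracies

# Hypothetical confidence scores and correctness flags, for illustration only.
confidences = [0.1, 0.2, 0.3, 0.4, 0.9, 1.1, 1.5, 2.0]
correct = [False, True, False, True, True, True, False, True]
bands = accuracy_by_confidence_band(confidences, correct, n_bands=2)
print(bands)  # → [0.5, 0.75]
```

A well-calibrated system shows accuracy rising from the low-confidence bands to the high-confidence ones, as in this toy output.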
For example, by dividing results into twenty confidence bands, it was found that at the highest levels of confidence, the average accuracy exceeded 85%, while at the lowest levels it was around 55%. These results highlight the effectiveness of calibration, as both models and human experts showed the ability to accurately assess their own confidence concerning the probability of success. This capability enables more effective synergy between automatic predictions and human oversight.

Another relevant aspect that emerged from the study concerns the differences between models and humans in perceiving the difficulty of the same tasks. Although the average correlation between difficulties perceived by LLMs and those by human experts was only 0.15, among different models the correlation rose to 0.75. This finding indicates a complementarity between the areas where humans and models respectively show strengths or weaknesses. Such characteristics can be leveraged to improve collaboration in decision-making processes.

Finally, it was highlighted how confidence calibration not only increases the accuracy of predictions but also contributes to creating a context of trust in the use of LLMs as support tools for research. The ability of a model to indicate the level of confidence in its answers is an essential aspect for the responsible and effective use of these technologies, especially in the scientific field. This allows scientists to rely on these tools for specific tasks while maintaining critical control over the overall decision-making process.

Future Implications: Human-Machine Collaboration

The success of BrainBench and BrainGPT raises a series of crucial questions about the future of science and the role of LLMs in scientific research. If these models prove capable of accurately predicting experimental results, it is possible to imagine a future in which LLMs become an integral part of the scientific discovery process.
These tools could suggest to researchers which experiments to conduct, identify promising results, and guide data interpretation. A crucial aspect will be to ensure effective integration between the computational power of LLMs and human ingenuity. LLMs are capable of managing a quantity of scientific data far exceeding human capacity, rapidly processing thousands of articles, and providing connections between studies that often elude experts. However, human intuition, creativity, and the ability to contextualize a specific problem remain irreplaceable to ensure that discoveries have a significant impact and are directed towards useful and innovative applications. To maximize the potential of human-machine collaboration, it will be necessary to develop support tools that help researchers understand LLM predictions and assess their confidence. For example, user interface-based tools that visualize an LLM's confidence level in a specific prediction could improve transparency and facilitate a more informed use of AI-generated recommendations. In particular, it could be useful to implement visualizations that show the differences in perplexity between correct and altered versions of abstracts, allowing researchers to better understand the basis on which an LLM has made its prediction. Another interesting implication concerns the possibility of using LLMs to generate innovative experimental hypotheses. The ability of language models to identify hidden patterns in data could lead to the formulation of hypotheses that would otherwise not be considered, thus accelerating the pace of discoveries. However, it is essential that researchers maintain a critical approach, carefully evaluating the predictions and hypotheses generated to avoid the risk of blindly following a direction suggested by AI, without considering the possibility of unexpected or contradictory results. Moreover, human-machine collaboration could benefit from continuous interaction and mutual adaptation. 
For example, LLMs like BrainGPT could be trained using explicit feedback from human researchers, continuously improving their ability to provide relevant suggestions. Similarly, human experts could develop new experimental or theoretical methodologies based on the suggestions of LLMs, creating a virtuous cycle of innovation and discovery.

However, one of the main risks is relying too heavily on LLM predictions, especially when these suggest a research path that might seem safer or more promising. This could lead to a reduction in the exploration of less obvious but potentially groundbreaking hypotheses. The risk is that science becomes less exploratory and more oriented towards an optimization logic based on predictive models, which could limit the potential for truly innovative discoveries.

Finally, the complementarity between LLMs and human researchers could be further enhanced by developing specialized models for different fields of knowledge. As demonstrated with BrainGPT, a model trained on a specific corpus improved its performance compared to generalist LLMs. Extending this approach, we could imagine a network of highly specialized LLMs, each with a deep understanding of a specific field, collaborating to solve complex problems, creating a knowledge ecosystem where the analytical capabilities of machines and human creativity enhance each other.

In summary, the future of scientific research could see increasing integration between LLMs and human scientists, with these models becoming not only support tools but genuine partners in discovery. The key to success will be maintaining a balance between reliance on LLM predictions and human creativity and independent thinking, ensuring that innovation remains at the heart of the scientific process.

Conclusions

The ability of language models to surpass human experts in neuroscience raises profound questions about the future of scientific research and the dynamics of human-machine collaboration.
This phenomenon is not merely a matter of computational efficiency but opens strategic perspectives on how we address the complexity of knowledge and organize intellectual resources. Through tools like BrainBench and specific models like BrainGPT, LLMs not only demonstrate their ability to compete with human experts but also push us to rethink the value and role of intuition and experience in data-intensive fields. The superior performance of LLMs is not just a matter of predictive accuracy but reflects a paradigm shift in knowledge management. Their ability to integrate enormous amounts of information, often distributed across different disciplines, redefines the concept of expertise, shifting it from the depth of individual knowledge to the breadth of collective analytical capability. This poses a fundamental challenge to traditional structures of scientific research, where the authority of the expert was a cornerstone. LLMs, with their adaptability and specialization capabilities, could soon become a new standard for validating, predicting, and proposing scientific hypotheses, making the boundaries of expertise more fluid and collaborative. A crucial aspect is the emergence of a "calculated confidence" that LLMs can offer, redefining the relationship between prediction and decision. The ability to calibrate confidence based on perplexity and communicate it transparently represents a strategic innovation for decision-making processes, not only in neuroscience but also in sectors such as medicine, economics, and engineering. This feature is not merely a technical improvement; it is a model of how humans can learn to manage uncertainties and probabilities in complex situations. Business decision-makers, for example, could adopt this approach to combine quantitative analysis and human judgment, optimizing strategies and reducing risks associated with uncertain decisions. The risk of an "optimized but not exploratory" science deserves a broader strategic reflection. 
If, on the one hand, LLMs can direct researchers towards areas of greater probability of success, on the other hand, they might discourage exploration of less obvious or contrary hypotheses. To avoid this danger, it will be essential to balance the analytical power of LLMs with human creative courage. Companies that invest in innovation models capable of integrating these two dimensions will have a competitive advantage in generating radical and not just incremental solutions. The human-machine complementarity should not be seen as a simple sum of the parts but as a new knowledge ecosystem where interaction produces emerging value. For example, the idea of continuous feedback between human experts and LLMs represents not only an opportunity to improve technological performance but also a way for humans to learn from perspectives that would otherwise remain inaccessible. This is not a technical detail but a guiding principle for building organizations capable of adapting rapidly to changes and anticipating future trends. Finally, the specialization of LLMs, as in the case of BrainGPT, opens up new scenarios for a "network of specialized artificial intelligences," where highly focused models work together to tackle complex and interdisciplinary problems. This concept of "distributed intelligence" is not limited to science but extends to businesses, governments, and other areas where success depends on the ability to connect dots across seemingly distant systems. The ability to orchestrate this network will be one of the key competencies of the future, redefining not only how we work but also how we think and innovate. In conclusion, the future of scientific research could see increasing integration between LLMs and human scientists, with these models becoming not only tools of support but true partners in discovery. 
The key to success will be maintaining a balance between relying on LLM predictions and fostering human creativity and independent thinking, ensuring that innovation remains at the core of the scientific process.
Podcast: https://spotifycreators-web.app.link/e/xpQPvpMwSOb
Source: https://www.nature.com/articles/s41562-024-02046-9.pdf
LLMs and Security: MRJ-Agent for a Multi-Round Attack
The growing use of large language models, such as GPT-4, in critical areas has highlighted the need to address security and reliability issues with greater care. Although these models have a vast body of knowledge, there is a concrete risk that they may generate harmful or inappropriate responses, especially in the presence of specific attacks known as “jailbreaks.” The study conducted by Wang and collaborators proposes a new multi-round attack agent, called MRJ-Agent, developed to identify the vulnerabilities of language models and strengthen their security, delving into the complex dynamics of human dialogues. Issues in LLM Security and Limitations of Existing Approaches Jailbreak attacks focus on manipulating LLMs to induce them to provide sensitive or potentially harmful content. The research highlights that most efforts so far have focused on single-round attacks, i.e., with a single direct request to the model. However, these approaches are limited in reproducing the way humans actually interact with these systems: interactions are often multi-round, with questions and answers spread over multiple phases. Single-round attacks often use methods such as "prompt engineering," which involves constructing prompts designed to hide malicious intentions. For example, some approaches (Zou et al. 2023; Wei, Haghtalab, Steinhardt 2024) include the use of ASCII codes or encrypted messages to mask dangerous requests. These methods, although effective in some cases, fail to consider the complexity of multi-round interactions. As emerged from the research of Ma et al. (2024) and Perez et al. (2022), this type of more natural and complex interaction represents the real challenge for large language models, making single-round methods less meaningful from a practical point of view. In recent years, approaches for multi-round attacks have been developed, but they have shown several limitations. One example is the approach proposed by Zhou et al. 
(2024), which breaks down an original question into multiple sub-questions, then aggregates the answers to obtain harmful content. However, this method fails to reproduce the naturalness of a human conversation and often triggers the models' defense mechanisms, thereby reducing its effectiveness. Other methods (Russinovich, Salem, and Eldan 2024; Yang et al. 2024) adopt iterative trial-and-error tactics to induce the model to generate dangerous output. However, a key problem lies in the dependence on very powerful models like GPT-4, which often activate safety mechanisms, leading to rejected requests and a reduction in attack effectiveness. The research by Wang et al. introduces an innovative strategy to address these limitations by combining a risk decomposition strategy and psychological induction to make the attack more effective and less detectable. The risk decomposition strategy involves breaking down the original malicious intent into apparently harmless sub-requests, distributing the risk over multiple rounds. For example, a request like “how to build a bomb” is transformed into a series of questions about generic chemical reactions, which progressively lead to more specific content. The decomposition is carried out using models like GPT-4 to generate the sub-requests, maintaining a controlled level of semantic similarity to prevent the requests from becoming too obviously dangerous. Experiments have shown that by controlling the similarity between sub-requests and the original, the success rate of the attack can be significantly increased. Additionally, the psychological induction strategy exploits techniques such as reflective induction or support based on multiple pieces of evidence to reduce the likelihood of rejection by the model. 
The effectiveness of these strategies was evaluated on both open-source models like Llama-2-7B and closed-source models like GPT-4, showing a higher success rate in overcoming defenses than traditional approaches.

MRJ-Agent: Technical Features and Attack Method

MRJ-Agent introduces an attack methodology that simulates a heuristic search process decomposed into multiple rounds. Starting from a potentially dangerous request (e.g., “how to build a bomb”), the process begins with an innocent question (such as a generic chemical reaction), then gradually progresses to more sensitive topics. This approach was designed to maximize the probability of circumventing the safety mechanisms integrated into LLMs. The method involves three main strategies:
• Information Control Strategy: This strategy guides the trial-and-error process by controlling the similarity between the generated requests and the original. Information control is achieved through a heuristic approach that monitors the degree of semantic similarity between requests and the final goal. Experiments have shown that, by setting a minimum similarity threshold of 0.85 between the generated request and the original, it is possible to maintain the focus of the attack without compromising its effectiveness.
• Psychological Induction Strategy: To minimize the probability of rejection by the model, psychological strategies are used to increase persuasion and decrease the perception of risk by the LLM. Specifically, psychological induction has been enhanced through 13 specific strategies, such as support based on multiple pieces of evidence and cognitive influence. The results show that, compared to merely decomposed requests, psychologically reinforced sub-requests increased the success rate by up to 39.7% on GPT-4.
• Red-Team Training Strategy: A red-team model (called πred) was developed to perform multi-round attacks automatically, dynamically adapting to the target model’s responses. During training, the model used direct preference optimization to learn to select the most effective strategy in each situation. The use of models with different capacities (7B and 13B) showed that increasing the size of the red-team model leads to a significant increase in the success rate, reaching 100% when the maximum number of rounds is 10 or more.

Experimental Results and Comparison with Other Attack Methods

The experimental results highlighted the strong performance of MRJ-Agent compared to other attack techniques, in both single-round and multi-round contexts. In particular, during evaluations on models such as Llama-2-7B and GPT-4, MRJ-Agent achieved complete success (100%) in multi-round interactions, significantly surpassing the alternative method "Speak Out of Turn," which stopped at 20%. This figure reflects the system’s superior effectiveness in handling complex scenarios.

Compared with other multi-round attack techniques, MRJ-Agent demonstrated a success rate of 92% on Llama-2-7B with a single trial, increasing to 100% with multiple attempts. This result indicates a clear advantage in efficiency and robustness, achieved without the repeated rounds of attempts required by competing approaches. It also points to more effective management of the target model's responses.

Additional tests have shown that MRJ-Agent maintains high performance even in the presence of advanced defenses. For example, with protection systems like "Prompt Detection" and "System Prompt Guard," success rates were 88% and 78%, respectively, with a single attempt, rising to 94% and 82% with two trials. These results demonstrate the system’s ability to adapt even to sophisticated countermeasures, maintaining high effectiveness in overcoming the implemented defenses.
Compared to existing methods, MRJ-Agent also showed clear superiority against closed models like GPT-4, achieving an average success rate of 98%, compared to a maximum of 92% achieved with alternative methods such as "Chain-of-Attack" (CoA). Additionally, the ability to achieve these results with fewer interaction rounds and attempts than rival approaches represents a significant advantage in terms of operational efficiency. Another aspect analyzed concerns the impact of the size of the red-team model employed by MRJ-Agent. The results revealed that adopting a model with 13 billion parameters (13B), compared to one with 7 billion (7B), leads to a consistent increase in the success rate in more complex situations. For example, with a maximum of 15 rounds, the 13B model achieved complete success (100%), while the 7B model stopped at 94%. This suggests that using larger models can significantly improve the effectiveness of attacks, especially in more intricate contexts or with more elaborate defenses. In summary, MRJ-Agent has demonstrated remarkable multi-round interaction management capabilities, effectively adapting to both open-source and closed-source models, without showing performance declines. Particularly noteworthy was its robustness in circumventing the defense systems present in closed models like GPT-4, where the success rate approached 100%. These results highlight the urgency of developing more advanced security countermeasures to counter increasingly sophisticated attack systems. Generalization of the Attack and Other Scenarios The versatility of MRJ-Agent also extends to image-to-text tasks, where the ability to exploit visual details as a starting point for more delicate questions proved essential. For example, in attacking models like GPT-4o using harmless images, the success rate was 80%, showing that the model can use the visual context to guide subsequent questions towards sensitive content. 
This approach of linking visual and textual content increases the difficulty of effectively defending these models, as the requests seem more natural and less suspicious.

In the case of text-to-image tasks, MRJ-Agent showed a reduced capability compared to text-to-text, with a success rate of 50% for generating potentially harmful images. This is partly due to the more robust safety mechanisms integrated into commercial models like DALL-E 3, which actively block sensitive content. However, MRJ-Agent demonstrated progressive adaptation of risky instructions, gradually increasing the likelihood of generating problematic content. This progressive refinement of instructions is particularly effective for circumventing automatic defenses, especially when the attack is carried out over multiple rounds.

In another experiment, MRJ-Agent was tested on its ability to generalize to datasets such as JailbreakBench (JBB), which includes ten categories of risky behavior. On this benchmark, the success rate was 93.9%, confirming MRJ-Agent’s effectiveness on a broader and more diversified set of behaviors. The most difficult category to attack turned out to be sexual content, with a success rate of 71.42% and an average of 11.85 queries, suggesting that the models’ sensitivity to such requests remains high.

Future Implications

The future implications of the work on MRJ-Agent mainly concern the need to develop further defense mechanisms capable of addressing increasingly sophisticated attacks spread over multiple rounds of interaction. The effectiveness demonstrated by MRJ-Agent in circumventing defense mechanisms suggests that large models must be equipped with dynamic detection and response capabilities that evolve in step with threats.
An approach that could be adopted in the future is the implementation of AI-based defense strategies that can automatically adapt to changes in attack patterns and learn from previous interactions. Furthermore, the fact that MRJ-Agent has shown attack capabilities across a wide range of scenarios, including image-to-text and text-to-image, highlights the need to expand security methodologies to all AI application fields. This implies that not only language models but also image generative models and other types of AI must be made more robust against these types of threats. A possible development in this regard could be the creation of a series of standardized benchmarks to evaluate the resilience of models to different types of multi-round attacks. Another significant implication concerns the continuous alignment of models with human values. Multi-round attacks like those conducted by MRJ-Agent highlight the difficulty of maintaining stable alignment when models are subjected to prolonged and complex interactions. A future research area could focus on improving alignment techniques based on human feedback, for example by using adaptive reinforcement from human experts to detect deviation signals and correct the model’s behavior. Finally, the disclosure of the data and codes used to train MRJ-Agent represents another important step towards building a more transparent and collaborative research community. Making the attack code public could help researchers develop new defense techniques, thus promoting collective progress in AI security. However, this also carries the risk that malicious actors could use such information to develop more effective attacks. Therefore, it will be essential to adopt a balanced approach that allows for the progress of scientific research without compromising overall security. 
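The standardized benchmarks suggested above would need, at a minimum, a consistent way to summarize logged evaluation outcomes per risk category, in the spirit of the per-category figures reported for JailbreakBench. The following is a minimal sketch under stated assumptions: the `Trial` record, its field names, and the category labels are hypothetical and are not taken from the MRJ-Agent paper or from any benchmark's actual schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Trial:
    """One logged evaluation dialogue (hypothetical record layout)."""
    category: str   # risk category label assigned by the benchmark
    bypassed: bool  # True if the model's defenses failed in this dialogue
    queries: int    # number of queries used in the dialogue


def summarize(trials):
    """Return {category: (bypass_rate, mean_queries)} for a list of trials."""
    grouped = defaultdict(list)
    for trial in trials:
        grouped[trial.category].append(trial)

    summary = {}
    for category, items in grouped.items():
        bypass_rate = sum(t.bypassed for t in items) / len(items)
        mean_queries = sum(t.queries for t in items) / len(items)
        summary[category] = (bypass_rate, mean_queries)
    return summary


# Illustrative records only; these are not results from any study.
trials = [
    Trial("category_a", True, 6),
    Trial("category_a", False, 10),
    Trial("category_b", True, 4),
    Trial("category_b", True, 8),
]
report = summarize(trials)
```

Reporting the average number of queries alongside the bypass rate shows not only whether a defense fails but how much effort the failure required, which is useful when comparing defenses.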
The work on MRJ-Agent not only highlights the current vulnerability of LLMs but also underlines the importance of a proactive and adaptive approach to model security. It is necessary to further explore the interaction between attack and defense, seeking solutions that can evolve as rapidly as emerging threats. Only in this way can we ensure that these models continue to serve humanity safely and responsibly. Conclusions The emergence of technologies like MRJ-Agent highlights a crucial truth in the landscape of artificial intelligence: the interaction between attack and defense is not static but evolves as a complex and interdependent dynamic. The multi-round capabilities of this system reveal a critical point that is often overlooked: language models are not simply response tools but active participants in dialogues that reflect the complexity of human interactions. This consideration transforms security from an issue of static technical barriers into a fluid process that requires constant adaptation. The risk decomposition and psychological induction introduced by MRJ-Agent are not just attack tactics but indicate a paradigm shift in how vulnerability is conceived. It is no longer an isolated model defect but a systemic flaw that emerges from the sum of interactions. This suggests that AI security must be redefined to address not only technical vulnerabilities but also cognitive and strategic manipulations. An effective security model cannot merely filter harmful requests; it must understand the sequence and context of the dialogue to detect insidious patterns that develop over time. The idea of using an automated red-team like the πred model raises a strategic question: how sustainable is the current passive security approach? Companies implementing LLMs in critical contexts must adopt an offensive mindset in security, investing not only in defenses but also in continuous testing against simulated attacks. 
This concept, similar to a "preventive war" in the world of cybersecurity, could change the traditional approach, shifting from an exclusive focus on static protections to a model of iterative and dynamic learning. Another fundamental aspect concerns the intersection between context and multimodal input. Attacks that combine text, images, and other modalities demonstrate how vulnerability is not confined to a single domain. This requires a convergence between model-specific defenses and a unified security framework capable of operating transversally. Companies developing multimodal systems must understand that the risk does not simply add up but amplifies: an initially harmless attack in one domain can be the key to exploiting weaknesses in another. This perspective requires a new generation of monitoring systems that can track the evolution of interactions across domains and modalities. Finally, the research on MRJ-Agent highlights a crucial problem for AI ethics and alignment. The growing sophistication of multi-round attacks challenges the idea that AI can maintain stable alignment over time. The implications for companies are profound: it is not enough for a model to be safe at the time of release; it is necessary to ensure that it remains aligned throughout its entire operational life cycle. This suggests the need for self-correction mechanisms, supported by continuous and human feedback. But this also opens the door to a dilemma: how to balance the model's autonomy with human supervision without reducing operational efficiency? Ultimately, the challenge posed by MRJ-Agent is not just about technological security but also touches on broader issues of governance, responsibility, and strategic design of AI systems. Companies must address these challenges not as isolated technical problems but as part of a broader transformation in risk management and building trust in artificial intelligence. 
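The point made above, that a defense must follow the sequence and context of a dialogue rather than filter each request in isolation, can be illustrated with a small defensive sketch. It assumes that some upstream classifier already assigns each turn a risk score between 0 and 1; that classifier, the class name `DialogueRiskMonitor`, and all threshold values are hypothetical and chosen only for illustration.

```python
class DialogueRiskMonitor:
    """Track risk across a whole conversation instead of per message.

    Assumption: an external scorer supplies a risk score in [0, 1]
    for every turn. The limits and decay below are illustrative.
    """

    def __init__(self, turn_limit=0.8, cumulative_limit=1.5, decay=0.9):
        self.turn_limit = turn_limit              # single-turn escalation threshold
        self.cumulative_limit = cumulative_limit  # whole-dialogue escalation threshold
        self.decay = decay                        # how quickly older turns fade
        self.cumulative = 0.0

    def observe(self, turn_score):
        """Record one turn; return True if the dialogue should be escalated."""
        self.cumulative = self.cumulative * self.decay + turn_score
        return (turn_score >= self.turn_limit
                or self.cumulative >= self.cumulative_limit)


monitor = DialogueRiskMonitor()
# No single turn reaches the per-turn limit of 0.8, but the
# accumulated score crosses the dialogue limit on the fourth turn.
scores = [0.3, 0.4, 0.5, 0.6]
flags = [monitor.observe(score) for score in scores]
```

The decayed running sum is one simple design choice among many. It lets a slowly escalating conversation trigger review even when every individual message looks acceptable, which is the gap that purely per-message filters leave open.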
Podcast: https://spotifycreators-web.app.link/e/PcqVZYzDTOb
Source: https://arxiv.org/abs/2411.03814
GenAI in Banking
Generative Artificial Intelligence (GenAI) is emerging as a powerful tool to transform the financial services sector. Recent research conducted by Thomas Kaiser (CEO and Co-Founder of Kodex AI), Boon-Hiong Chan (Industry Applied Innovation Lead and Head APAC Market and Technology Advocacy at Deutsche Bank), and Delane Zahoruiko (Founders Associate at Kodex AI) highlights how GenAI can be used to improve regulatory compliance, optimize customer interaction, and manage risks more efficiently, paving the way for new levels of productivity and innovation. However, to fully leverage the potential of GenAI, institutions must address several challenges, including ensuring the quality of systems and cybersecurity, while also guaranteeing a gradual adoption of new technologies.

A Gradual Approach to Adopting GenAI in Banking

An effective adoption of GenAI in the banking sector requires an incremental and structured strategy, starting with basic applications and moving towards more complex use cases. The research suggests a three-phase approach to building a GenAI use case portfolio, allowing institutions to progressively gain confidence in the technology, mitigate risks, and achieve tangible benefits at each stage.
• Language Analysis Capabilities: The first phase focuses on leveraging GenAI's basic language analysis capabilities to perform tasks such as text synthesis, processing customer emails, and drafting standardized content. These features enable the organization to improve service efficiency and quality by managing large volumes of text and unstructured data. This phase not only lays the foundation for system capability development but also allows GenAI to be tailored to the specific needs of the financial domain.
• Chat-to-Agent: In the second phase, the goal is to transform GenAI into a tool that goes beyond text analysis, enabling specific commands to be executed based on user requests. For example, an executive agent can receive a natural language query, translate it into code (e.g., Python), and use AI models to analyze large datasets and return understandable results. An experiment conducted with the MILA project demonstrated how a chat-to-agent solution allowed non-technical users to obtain detailed analyses of relationships and patterns in data, using visualizations to facilitate understanding. This phase allows for a high degree of autonomy in analysis, while still ensuring human supervision and control for critical results.
• Chat-to-Execution: The third phase represents the evolution towards autonomous capabilities, where GenAI not only executes commands but also makes autonomous decisions with contextual awareness. This level of development enables the system to operate with a high degree of independence, managing complex decision-making and operational processes. For example, a chat-to-execution system can autonomously decide which approach to take in responding to a specific request, based on a combination of reinforcement learning and memory of past interactions. This capability not only allows for the execution of repetitive tasks but also for adaptation and improvement over time, offering increasingly targeted solutions.

The transition from simple language processing applications to fully autonomous solutions requires not only advanced technological infrastructures but also a constant commitment in terms of governance, risk management, and continuous training. The creation of controlled testing environments (AI sandboxes), the development of fair use policies, and the active involvement of industry experts are key aspects for successful adoption.

Benefits for the Banking Sector

The adoption of GenAI in banking offers several significant benefits, not only in terms of operational efficiency but also in the ability to address complex challenges such as risk management and regulatory compliance.
One of the main advantages is GenAI's ability to improve decision quality through the automation of complex analyses. The technology allows the integration of significant volumes of data from various sources and provides real-time analysis, promoting a deeper understanding of market trends and potential risk areas. Moreover, the use of models such as Retrieval-Augmented Generation (RAG) helps improve the accuracy of responses generated by GenAI by tapping into external and verified data. This is particularly useful to ensure that responses are always based on up-to-date and relevant information, a crucial aspect in risk management and regulatory compliance, especially in scenarios requiring high precision and reliability. Another significant advantage relates to the democratization of access to advanced analytics. Tools like those developed in the MILA project have shown how GenAI can enable non-technical users to perform advanced data analyses, reducing the reliance on data science specialists. This capability was demonstrated by experiments where GenAI reduced analysis times by data engineers from several hours to just a few minutes, making the decision-making process faster and more accessible. The use of techniques like Parameter Efficient Fine Tuning (PEFT) and Low Rank Adaptation (LoRA) also helps reduce training costs and improve model customization, making them more suitable for integration into existing infrastructures without the need for excessive computational resources. This optimization not only supports cost reduction but also improves the adaptability of models to the specific needs of each banking organization. Additionally, the use of synthetic data makes it possible to train models in the absence of real data, addressing issues related to privacy and data availability. This approach ensures that models can operate on representative and diversified datasets without compromising customer privacy. 
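To make the cost argument for LoRA more concrete, the arithmetic behind it can be sketched as follows. For a weight matrix of shape d_out × d_in, full fine-tuning updates d_out · d_in values, whereas a rank-r LoRA adapter trains two small matrices with r · (d_in + d_out) values in total. The layer size and rank below are illustrative assumptions, not figures from the research discussed here.

```python
def full_finetune_params(d_in, d_out):
    """Trainable values when the whole weight matrix is updated."""
    return d_in * d_out


def lora_params(d_in, d_out, rank):
    """Trainable values for a rank-`rank` LoRA adapter.

    The adapter is two matrices: one of shape (rank, d_in)
    and one of shape (d_out, rank).
    """
    return rank * (d_in + d_out)


# Illustrative layer size and rank (assumed values).
d_in, d_out, rank = 4096, 4096, 8

full = full_finetune_params(d_in, d_out)   # 16,777,216
adapter = lora_params(d_in, d_out, rank)   # 65,536
ratio = adapter / full                     # 1/256 of the full matrix
```

For this single layer the adapter trains 1/256 of the values a full update would touch, which is the source of the reduced training cost mentioned above. The saving for a complete model depends on which layers receive adapters and on the chosen rank.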
Improving customer engagement is another crucial aspect. GenAI enables the development of more personalized and timely interactions, based on a deeper understanding of customer needs and automated request management. This not only increases customer satisfaction but also improves the efficiency of customer service operations, reducing response times and enhancing service quality. Finally, the adoption of GenAI can increase the scalability of operations. In a constantly evolving context like financial services, the ability to quickly scale processes and infrastructures is essential. GenAI systems, thanks to their flexibility, can be adapted to handle an increasing number of requests and processes without compromising the effectiveness or accuracy of operations. This is particularly advantageous during periods of high demand, where it is essential to maintain high service standards without experiencing slowdowns. Quality and Benchmarking To ensure that a GenAI system delivers adequate performance and meets the required standards in the financial sector, it is essential to establish quality measurements through accurate benchmarking. The quality of GenAI depends not only on the model architecture but also on the training data and enhancement tools like RAG and PEFT. The use of benchmarks such as GLUE, SuperGLUE, and MMLU is essential to evaluate the model's ability to understand and process natural language in general contexts. However, the financial sector presents specific challenges that require more targeted measurements. In the banking sector, GenAI effectiveness is often evaluated through specialized financial benchmarks such as FinanceBench, FinQA, and FNS (Financial Narrative Summarisation). FinanceBench assesses the model's ability to accurately process and interpret financial data for market analysis, risk assessment, and compliance reporting. 
FinQA, on the other hand, focuses on the system's ability to answer questions based on financial contexts, analyzing structured data such as financial reports and earnings calls. FNS evaluates a model's ability to summarize complex financial narratives from dense datasets, such as earnings reports or annual reviews, thus providing a measure of effectiveness in generating key insights automatically. Besides benchmarks and optimization methods, other architectural and process factors play a crucial role in determining the quality of a GenAI system. Among these, data management is critical. The use of preprocessing techniques such as chunking and parsing, in addition to content filters, ensures that data is managed appropriately before being processed by the model. Finally, the issue of explainability is equally fundamental. Implementing transparency systems, such as source attribution in responses and integrating human verification systems (Human-in-the-Loop), helps ensure that the decisions made by the models are traceable and understandable, thereby building the trust needed for GenAI adoption in highly regulated sectors like banking. Challenges and Risks The implementation of GenAI involves numerous challenges and risks that must be addressed to ensure the long-term success of the technology within the financial sector. One of the main issues is model drift. This phenomenon occurs when a model's performance begins to degrade due to the difference between the data used for training and the data the model encounters in real-world scenarios. Changes in customer behaviors or regulations can lead to a significant divergence between the operational context and the data originally used to train the model. To mitigate this risk, it is essential to implement continuous model performance monitoring through metrics such as prediction accuracy and error rate, as well as regular retraining on updated datasets to keep the model aligned with reality. 
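The continuous monitoring described above can be sketched as a rolling accuracy check against a baseline measured at deployment time. This is a minimal illustration under stated assumptions: the class name `AccuracyDriftMonitor`, the window size, and the tolerance are hypothetical, and a production system would typically track several metrics rather than accuracy alone.

```python
from collections import deque


class AccuracyDriftMonitor:
    """Flag possible model drift when rolling accuracy falls below a baseline."""

    def __init__(self, baseline_accuracy, window=100, tolerance=0.05):
        self.baseline = baseline_accuracy
        self.tolerance = tolerance
        # Only the most recent `window` outcomes are kept.
        self.outcomes = deque(maxlen=window)

    def record(self, correct):
        """Store whether the latest prediction turned out to be correct."""
        self.outcomes.append(bool(correct))

    def rolling_accuracy(self):
        """Accuracy over the current window, or None if nothing was recorded."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self):
        """True once the window is full and accuracy is below baseline - tolerance."""
        if len(self.outcomes) < self.outcomes.maxlen:
            return False
        return self.rolling_accuracy() < self.baseline - self.tolerance


monitor = AccuracyDriftMonitor(baseline_accuracy=0.9, window=10, tolerance=0.05)

# Nine correct predictions and one error: still at the baseline.
for outcome in [True] * 9 + [False]:
    monitor.record(outcome)
stable = monitor.needs_retraining()

# Three further errors push the rolling accuracy down to 0.6.
for outcome in [False] * 3:
    monitor.record(outcome)
drifted = monitor.needs_retraining()
```

Waiting for a full window before raising a flag avoids reacting to the first few observations, at the cost of a short delay in detection.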
Another significant risk is model hallucination, which refers to the generation of plausible but incorrect or unverified responses. This problem is inherent to the nature of GenAI but can be mitigated with specific techniques. For instance, using Retrieval-Augmented Generation (RAG) techniques, which allow the model to draw from external data sources to verify and confirm information, reduces the likelihood of hallucinations. Moreover, human oversight through the integration of Human-in-the-Loop (HITL) systems allows monitoring of model responses, especially for high-risk decisions, thus ensuring that responses are accurate and relevant. Feedback loop degradation is another risk that occurs when a GenAI system is overly exposed to user feedback without proper quality filters. In such cases, the system may learn undesirable behaviors, worsening the quality of responses over time. To address this issue, it is essential to implement feedback filtering mechanisms that allow for evaluating the quality of user-provided data before it is used to influence model learning. In addition to these specific risks, there are also dependency risks, such as reliance on specific infrastructures or external providers. To mitigate such risks, it is important to adopt modular and interoperable architectures that allow easy migration to alternative models or platforms, avoiding technological lock-in situations. Finally, the financial sector must address cybersecurity risks, especially when using GenAI systems that may interact with sensitive data. Adopting advanced security measures, such as protection against data poisoning or prompt injection attacks, is essential to ensure the resilience and reliability of the system. 
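The retrieval step behind RAG, together with the source attribution it makes possible, can be illustrated with a deliberately simple sketch. It ranks documents by word overlap (bag-of-words cosine similarity) rather than by the learned embeddings a real system would normally use, and it stops at building the prompt without calling any model. The document ids, their text, and the prompt wording are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, top_k=1):
    """Return the ids of the top_k documents that share words with the query."""
    query_vec = Counter(tokenize(query))
    scored = [(cosine(query_vec, Counter(tokenize(text))), doc_id)
              for doc_id, text in documents.items()]
    scored.sort(reverse=True)
    return [doc_id for score, doc_id in scored[:top_k] if score > 0]


def build_prompt(query, documents, top_k=1):
    """Assemble a prompt that carries the retrieved text and its source ids."""
    sources = retrieve(query, documents, top_k)
    context = "\n".join(f"[{doc_id}] {documents[doc_id]}" for doc_id in sources)
    prompt = ("Answer using only the sources below and cite their ids.\n"
              f"{context}\nQuestion: {query}")
    return prompt, sources


# Hypothetical internal policy snippets used only for illustration.
documents = {
    "policy-001": "Wire transfers above the reporting threshold require enhanced due diligence.",
    "policy-002": "Customer complaints must be acknowledged within five business days.",
}
prompt, sources = build_prompt("When must customer complaints be acknowledged?", documents)
```

Returning the source ids alongside the prompt gives a human reviewer a concrete document to check each answer against, which supports the Human-in-the-Loop verification discussed above.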
Recommendations for the Financial Industry To foster effective adoption of GenAI in the financial sector, it is essential to develop an implementation strategy that considers regulatory, technological, and ethical aspects to ensure the responsible and secure use of technologies. It is recommended to invest in the creation of controlled testing environments (AI sandboxes) where new applications can be developed and evaluated in a protected context, ensuring that every new feature or use is compliant with existing regulations before any market release. Another crucial step is continuous training and staff updates. GenAI evolves rapidly, and so do the skills needed to use it effectively. Financial institutions must invest in training programs to ensure that their employees are prepared to face technological changes and make the most of the new opportunities offered by GenAI. At the same time, it is important to encourage collaboration among different departments to foster a complete and shared understanding of the technology's potential and limitations. Collaboration between the public and private sectors plays a fundamental role. AI regulation is still evolving, and cooperation between companies and regulatory authorities can facilitate the development of guidelines that promote innovation without compromising security or privacy. For example, the introduction of fair data-sharing practices, which allow access to high-quality datasets while respecting intellectual property and confidentiality, could facilitate the development of more performant and secure models. It is also necessary for financial institutions to adopt open standards and transparency policies, which not only help avoid technological lock-in risks but also improve public trust in the use of AI. 
Transparency practices should include complete documentation of training processes, the use of mechanisms for model decision explainability, and regular audits to verify compliance with regulations and ethical standards. Conclusions The adoption of Generative Artificial Intelligence (GenAI) in the banking sector is not merely a technological choice but a strategic transformation that redefines the very foundations of operational and corporate competitiveness. It is not just about implementing tools to improve efficiency but about rewriting the rules of interaction between financial institutions, their customers, and the regulatory environment. This transformation brings extraordinary opportunities, but also risks that require deeper reflection beyond simple cost-benefit calculations. One of the profound implications of using GenAI in banking is the redefinition of the concept of trust. Traditionally, trust in banks is based on transparency, solidity, and human reliability in making critical decisions. With GenAI, this trust must be extended to a non-human intelligence, an entity that decides and acts based on complex mathematical models and immense volumes of data. This implies a significant cultural shift for customers and institutions themselves, who must make otherwise opaque decisions understandable and demonstrate that such systems can operate without compromising ethics or security. The democratization of advanced analytics, one of the main advantages of GenAI, introduces unprecedented dynamics in corporate roles and required skills. If GenAI systems can provide complex insights without the intervention of data science experts, traditional hierarchies within banking organizations are redesigned. This poses a managerial challenge: how to rebalance roles between technical specialists and strategic decision-makers, ensuring that the latter have the skills to interpret and fully exploit the provided analyses? 
The ability to rapidly scale operations and processes through GenAI reduces traditional operational limits but also raises questions about long-term sustainability. Automating decisions and processes does not only mean responding to current demand but implies a reflection on managing future complexity. Systems that are too autonomous could create a level of technological dependence that makes effective human intervention difficult in crisis situations—a risk that no bank can afford to ignore. In terms of innovation, GenAI also redefines the concept of time in the financial sector. It is not only the speed of execution of analyses or responses that changes, but the ability to predict and adapt to market changes in real-time. This acceleration creates a competitive context where leaders will be those who can integrate speed with accuracy and security. However, this same speed can make regulatory interventions more difficult, increasing the risk of a gap between technological innovation and regulatory capacity. Ethics becomes the critical ground on which GenAI adoption in banking is played out. The management of synthetic data, the use of techniques like Retrieval-Augmented Generation (RAG), and modular fine-tuning, while reducing technical risks, amplify the need for transparent governance. The banks that will stand out will not only be those that successfully implement GenAI but those that do so in a way that makes the technology an element of trust rather than alienation for customers and stakeholders. Ultimately, the introduction of GenAI in the banking sector is not simply a technical evolution but a systemic change that requires a long-term strategic vision. Sector leaders will need to go beyond the logic of efficiency and innovation to embrace a mindset of continuous adaptability, ethical responsibility, and inclusivity. 
Only in this way can generative artificial intelligence transform from an operational tool into a pillar of the future of financial institutions. Podcast: https://spotifycreators-web.app.link/e/BSSfbXy3TOb Source: https://corporates.db.com/publications/White-papers-guides/adopting-generative-ai-in-banking
Gaming and Artificial Intelligence. BALROG the New Standard for LLMs and VLMs
The rapid evolution of Large Language Models (LLMs) and Vision Language Models (VLMs) has reignited interest in the creation of general agents capable of autonomously achieving complex objectives. These models possess a vast repertoire of knowledge and have shown promising reasoning abilities in specific scenarios. However, they still face significant limitations when it comes to operating in complex and dynamic environments, which require long-term planning, continuous exploration, and managing intricate interactions. BALROG was developed specifically to address this problem: it is a benchmark designed to assess the agentic capabilities of LLMs and VLMs through a series of increasingly complex games. This project was made possible thanks to a collaboration between the AI Centre at University College London, IDEAS NCBR, the University of Oxford, New York University, and Anthropic. The lead authors of the research are Davide Paglieri, Bartłomiej Cupiał, Samuel Coward, Ulyana Piterbarg, Maciej Wolczyk, Akbir Khan, Eduardo Pignatelli, Łukasz Kuciński, Lerrel Pinto, Rob Fergus, Jakob Nicolaus Foerster, Jack Parker-Holder, and Tim Rocktäschel. Objectives and Structure of BALROG BALROG aims to provide a unified environment for evaluating the capabilities of LLMs and VLMs as agents in reinforcement learning environments. The main goal is to push models beyond their current limitations, testing them in scenarios that require not only understanding and interaction capabilities but also advanced reasoning, exploration, and adaptability skills. BALROG is structured to challenge models in various aspects of their agentic abilities, including spatial reasoning, long-term planning, and interaction with multimodal representations. The games used for the benchmark range from relatively simple activities, solvable by a non-expert human in a few seconds, to extremely complex tasks such as the NetHack environment, which may take years to master. 
The games included in BALROG have been carefully selected to cover a wide range of cognitive skills. For example:
• BabyAI: a relatively simple environment that assesses the model's ability to follow natural language instructions and navigate a two-dimensional world.
• Crafter: inspired by the popular game Minecraft, this environment requires the agent to explore, gather resources, and craft items, testing its survival and resource management skills.
• TextWorld: a fully text-based game where the agent must explore mazes and interact with everyday objects, demonstrating its ability to understand and manage scenarios described only verbally.
• Baba Is AI: based on the popular puzzle game Baba Is You, this environment assesses the model's ability to manipulate game rules to solve complex problems, challenging its unconventional reasoning skills.
• MiniHack and NetHack: extremely complex and challenging environments in which agents must combine exploration, navigation, and long-term planning abilities to survive in procedural dungeons. NetHack, in particular, is known for its difficulty and the advanced skills it requires from human players.
Each game features different levels of difficulty, procedural simulations, and long-term planning requirements, making BALROG a comprehensive benchmark that represents the challenges LLM agents must face in the real world. BALROG is not limited to evaluating model performance but also encourages the development of new strategies to improve agent capabilities, providing a flexible platform that supports the integration of new prompting methods and reinforcement learning approaches. Moreover, BALROG adopts a modular architecture that allows the easy addition of new games and test environments, keeping the platform open for ongoing research and innovation. 
Each component of the benchmark, from basic navigation tasks to advanced challenges like MiniHack and NetHack, contributes to providing a detailed overview of model capabilities in diverse and complex scenarios. The infrastructure supports the use of agents based on zero-shot prompting, few-shot learning, and other advanced techniques, thus supporting a wide range of learning and evaluation methodologies. Methodology and Evaluation Metrics To evaluate the agents' capabilities, BALROG adopts extremely detailed and rigorous metrics, designed to measure various aspects of LLM and VLM performance in complex environments. Each model is evaluated on a series of key parameters, including problem-solving ability, decision-making effectiveness, long-term planning skills, resource management, responsiveness to visual and textual inputs, and robustness in the face of unforeseen procedural challenges. Tests are conducted using different configurations of game environments to ensure the generalizability of the models' capabilities. Agents are evaluated on procedurally generated environments, meaning that each test session presents different situations and maps, preventing any possibility of overfitting based on memorizing solutions. Each environment includes detailed metrics to capture agent progress, including intermediate scores, the number of errors made, and the time taken to complete tasks. For example, in the NetHack environment, a progression system was developed based on experience levels and dungeons reached, as the standard scoring system was not sufficient to adequately represent model progress. In this environment, each level reached contributes to a progressive evaluation of the model, allowing for identification of how close an agent is to successfully completing the game, with completion percentages ranging from 0% to 100%. The challenges of NetHack make a fine-grained measurement particularly useful for monitoring agents' survival and planning strategies. 
In BabyAI, the main metric is the accuracy with which the agent follows instructions and the time needed to complete tasks. Agents are evaluated on their ability to navigate correctly through a series of actions described in natural language. The best models manage to complete tasks with over 90% accuracy in the simplest situations, while showing a significant drop as task complexity increases. For Crafter, performance analysis focuses on the agents' ability to gather resources, craft tools, and survive within the environment for an extended period. Complexity increases as resources become scarce and the environment becomes dynamic. Parameters such as the number of milestones reached (e.g., gathering rare resources, crafting advanced tools) and the average duration of survival are measured. In the Baba Is AI environment, particular attention is given to agents' ability to manipulate game rules to solve complex puzzles. Metrics include the number of puzzles solved, the time taken for each solution, and the creativity demonstrated in finding unconventional solutions. Agents must not only apply existing rules but also create new ones by combining text blocks to modify game mechanics. For each scenario, BALROG provides a comparative evaluation between LLMs and VLMs, highlighting differences in performance between purely textual representations and those that include visual inputs. Multimodal representations often result in a drop in performance, especially in environments where vision is crucial for effective decision-making, such as MiniHack and NetHack. Multimodal models are evaluated on their ability to integrate visual information with textual information, combining perception and reasoning to navigate complex environments. BALROG's metrics are designed to be normalized into a score from 0 to 100, allowing easy comparison between different models and experimental configurations. 
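The normalization into a 0 to 100 score described above can be sketched in a few lines. This is an illustrative example rather than BALROG's actual code: the function names, the clamping behavior, and the choice to report a standard error next to the mean are assumptions, and the maximum of 22 milestones in the usage lines is used only as a sample value.

```python
from math import sqrt
from statistics import mean, stdev


def normalize(raw: float, lo: float, hi: float) -> float:
    """Map a raw score in [lo, hi] onto 0-100, clamping out-of-range values."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    fraction = (raw - lo) / (hi - lo)
    return 100.0 * min(1.0, max(0.0, fraction))


def summarize(scores: list[float]) -> tuple[float, float]:
    """Return (mean, standard error) over per-episode normalized scores."""
    avg = mean(scores)
    # The standard error needs at least two samples; report 0.0 otherwise.
    err = stdev(scores) / sqrt(len(scores)) if len(scores) > 1 else 0.0
    return avg, err


# Hypothetical usage: milestones reached in four episodes, out of an assumed 22.
episode_scores = [normalize(m, 0, 22) for m in (3, 5, 0, 11)]
average, std_error = summarize(episode_scores)
```

Putting every environment on the same scale is what makes a single averaged figure across games meaningful, while the spread across episodes indicates how much of a difference between two models may be noise.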
This detailed evaluation approach makes it possible to precisely identify models' weaknesses and monitor progress made in various critical areas, such as long-term planning, uncertainty management, and adaptive learning capability. Key Results Performance analysis has shown that current models achieve good results in simpler tasks but show significant shortcomings in more complex ones. In particular, NetHack has proven to be one of the most challenging environments, with the best models managing only an average progress of 1.5% in terms of game advancement. The o1-preview model achieved the best result, with an average progress of 1.57%, while other models, such as GPT-4o and Claude 3.5 Sonnet, recorded even lower performance, highlighting the enormous difficulty in navigating and planning in long-duration environments like NetHack. For MiniHack, the suite has proven to be extremely challenging, with tasks like "Boxoban" never being solved by any model, highlighting serious shortcomings in long-term planning and resource management abilities. Only some models managed to complete the simplest tasks, such as 9x9 mazes and corridor battles. In the case of BabyAI, the top-performing models achieved average progression results of over 70%, with GPT-4o and Llama 3.1 70B leading the way, while the introduction of visual inputs caused a drop in performance. The Gemini-1.5-Pro model maintained stable performance between the textual and visual formats, demonstrating greater robustness. For Crafter, the GPT-4o model showed the best resource management capabilities, with an average progression of 33.10%. However, even in this case, the introduction of visual inputs led to a drop in performance, suggesting that effectively integrating visual information remains a distant goal for many models. For TextWorld, more complex tasks, such as "Coin Collector," presented high difficulties for all models, with GPT-4o completing the task only once in twenty attempts. 
Gemini models encountered issues with the API, which often classified prompts as "unsafe," preventing a complete evaluation. A recurring element that emerged from the analysis is the so-called "knowing-doing gap": many models demonstrate theoretical knowledge about the game but fail to apply it during task execution. For instance, in NetHack, models like GPT-4o are capable of recognizing the danger of consuming spoiled food but continue to make this mistake during gameplay, highlighting a lack of practical integration of acquired knowledge. Finally, comparative analysis has shown that current multimodal architectures are still unable to fully exploit visual information for effective decision-making. In environments like MiniHack and NetHack, presenting images led to a significant drop in performance, indicating that vision-based reasoning remains an area where models need to improve significantly. Open Challenges for the Future BALROG is not just a benchmark but also a platform for the rapid prototyping of new prompting methodologies and strategies for improving agentic model capabilities. Several open challenges remain for future research, including improvements in integrating visual and textual inputs, enhancing long-term planning capabilities, and bridging the "knowing-doing gap." Improving Visual-Linguistic Integration BALROG's results show that multimodal representations are still not effectively exploited by agents, suggesting serious gaps in vision-based reasoning. The ability to interpret visual information and integrate it with language remains a distant goal. Future research should focus on techniques like self-supervised learning to improve models' ability to extract relevant insights from visual representations. Additionally, the introduction of video observations and multi-image observation histories could provide context for improving models' understanding in long-term scenarios, reducing the difficulty of visual processing. 
Long-term Planning and Agent Autonomy Long-term planning has been one of the areas where agents have shown the greatest shortcomings. To address these difficulties, one possible solution is to use techniques like Chain-of-Thought (CoT) reasoning, which lets models work through intermediate steps before acting and formulate more coherent plans. Additionally, persistent memory systems could enable agents to accumulate experience over multiple game sessions, improving their planning ability and allowing them to make informed decisions based on past experiences. Another approach could be to develop in-context Reinforcement Learning (RL) systems, where the agent learns directly from errors during the inference process, gradually improving its planning capabilities without the need for complete retraining. Bridging the Knowing-Doing Gap The so-called "knowing-doing gap" represents a significant challenge for current models. Many agents know theoretically what to do in specific situations but fail to put this knowledge into practice during gameplay. One approach to bridging this gap could be the integration of self-reflection mechanisms that allow the model to evaluate its actions and make behavioral adjustments. Additionally, the use of in-context fine-tuning techniques, where the agent is adapted in real time based on game experiences, could prove effective in improving coherence between theoretical knowledge and practical action. Addressing the Computational Limits of Current Models Current models are limited from a computational standpoint, which affects their ability to solve complex tasks. The trade-off between model depth and context is a crucial aspect to consider for performance improvement. To address this problem, one research direction could focus on inference-efficiency techniques such as PagedAttention, which stores the attention key-value cache in fixed-size blocks to reduce wasted memory, making long contexts cheaper to serve. 
Introduction of Multi-Agent Prompting Strategies and Tool Use In the future, BALROG could also explore the role of multi-agent collaboration. Agents could benefit from integrating multi-agent prompting strategies, where different models work together to solve complex tasks. Additionally, the use of external tools and APIs to improve decision-making could represent an important development direction, allowing agents to acquire information and skills that go beyond their basic capabilities. Conclusions BALROG's results underline a crucial point: current AI models, although advanced, remain trapped in a gap between the ability to "know" and the ability to "do." This observation is not just a technical problem but reflects an intrinsic limitation in agent design: the absence of true "agentic intent." LLM and VLM agents do not possess an innate understanding of why certain actions are necessary or useful in a given scenario. This suggests that their current programming positions them as reactive tools rather than systems capable of autonomously navigating strategic complexities. The lack of full integration between visual and linguistic aspects, combined with the shortage of long-term planning, highlights an unexplored opportunity: developing models capable of learning not only from information but also from experience through operational and adaptive heuristics. For example, in games like NetHack or MiniHack, the inability to connect past experiences with future decisions is a signal that models lack a structural memory that transcends the inference session. This not only results in a performance problem but deeply limits the application of such systems in real-world scenarios, where continuity and adaptability are fundamental. From a strategic business perspective, this opens up two innovative opportunities. 
First, there is a need to develop hybrid systems that combine the computing power of current AIs with decision-making processes that incorporate "simulated intentionality." This could mean models designed to learn contextual behavioral patterns rather than simple task-oriented responses. Such models could be crucial in sectors like supply chain management, where long-term planning and adaptation to variables are essential. Second, the concept of the "knowing-doing gap" could lead to a transformation in how companies design digital workflows. AI systems capable of self-regulating and reflecting on their performance in real-time could reduce human intervention in complex decision-making processes, improving efficiency and resilience. Imagine, for example, a financial management AI system that, in addition to analyzing historical data, learns from its mistakes and adapts its forecasts to mitigate future risks. Finally, the inability to manage visual inputs as an integral part of the decision-making process brings up a fundamental lesson: multimodal AIs must be designed not to passively translate visual inputs into linguistic outputs but to "live" the visual context as an integral part of their understanding. This has enormous implications for sectors like industrial robotics and healthcare, where the interaction between visual and decision-making systems could become a decisive competitive advantage. BALROG is not just a technical benchmark; it is a mirror for understanding the future trajectories of artificial intelligence. For companies, the message is clear: those who know how to invest in solutions that bridge the gap between "knowing" and "doing" will gain not only a technological advantage but also a strategic one in an increasingly complex and interconnected world. Podcast: https://spotifycreators-web.app.link/e/QcyZnUTKWOb Source: https://arxiv.org/abs/2411.13543
OWASP Top 10 LLM: Ten Vulnerabilities for LLM-Based Applications
The security of applications based on large language models (LLMs) is an increasingly relevant topic as the integration of these technologies into business systems and public services becomes more widespread. The new 2025 version of the OWASP Top 10 LLM list for vulnerabilities in LLM-based applications describes the most critical risks these applications face. This article is a summary of the work conducted by the OWASP team and involved institutions, based on contributions from security experts, developers, and data scientists from various sectors. In this article, we will explore each of the identified vulnerabilities, providing concrete examples and possible mitigation strategies. Prompt Injection (OWASP Top 10 LLM) The issue of Prompt Injection occurs when a user's input manages to manipulate the behavior of a language model, altering the model's output or actions in unintended ways. This vulnerability can be exploited both intentionally, by malicious actors who provide input designed to deceive the model, and accidentally, when unexpected inputs lead to incorrect system behavior. A particularly complex aspect is that prompt injection attacks may not be visible or readable by humans: any content that the model can interpret can potentially influence its behavior. There are two main types of Prompt Injection attacks: direct and indirect. Direct attacks occur when an attacker directly introduces a command or input that induces the model to perform unwanted actions, such as ignoring security guidelines, revealing sensitive information, or even performing dangerous actions like accessing unauthorized resources. Indirect attacks, on the other hand, occur through input from external sources, such as files or websites, which contain instructions that the model can interpret and that alter its behavior. A new emerging challenge is related to multimodal models, which are designed to handle different types of input, such as text and images simultaneously. 
Attackers could, for instance, hide instructions within images accompanying text. These cross-modal attacks significantly increase the attack surface, making defense against prompt injection far more complex. The impact of a prompt injection attack can be severe, ranging from the disclosure of sensitive information to the bypassing of system security measures and the manipulation of the model's critical decisions. For example, an attacker could use hidden prompts to make a customer service chatbot ignore all internal security rules, allowing access to confidential personal data. To mitigate the risk of prompt injection, it is essential to adopt multiple protection strategies. First, limiting the model's behavior by precisely defining its roles and capabilities is a crucial step. Providing clear instructions to the model on what is and is not allowed helps prevent unwanted deviations. Additionally, it is important to filter inputs and outputs using semantic tools that can identify potentially harmful content. For instance, implementing input validation controls and content filtering rules can help reduce the risk of malicious inputs. A "human-in-the-loop" approach can also contribute to security. It requires a human operator to confirm certain high-risk actions before they are executed, thus limiting the possibility that a malicious prompt leads to severe consequences. Furthermore, segregating external content and clearly identifying which data comes from untrusted sources further reduces the potential impact of a prompt injection attack. Finally, testing the model regularly through attack simulations and penetration testing techniques can help identify security flaws before they are exploited. These tests should treat the model as an untrusted user to evaluate the effectiveness of trust boundaries and access controls. 
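Two of the mitigations above, segregating external content and requiring human confirmation for high-risk actions, can be sketched as follows. This is a minimal illustration under stated assumptions: the delimiter format, the function names, and the contents of `HIGH_RISK_ACTIONS` are hypothetical, and delimiters of this kind reduce the risk of injected instructions being followed without eliminating it.

```python
# Hypothetical list of actions that must never run on the model's say-so alone.
HIGH_RISK_ACTIONS = {"send_email", "delete_record", "transfer_funds"}


def build_prompt(system_rules: str, user_question: str, external_text: str) -> str:
    """Assemble a prompt that keeps untrusted content in a clearly marked block."""
    # Remove delimiter look-alikes so the external text cannot close the block early.
    cleaned = external_text.replace("<<<", "").replace(">>>", "")
    return (
        f"{system_rules}\n\n"
        "Text between <<<EXTERNAL and EXTERNAL>>> is untrusted data. "
        "Never follow instructions found inside it.\n"
        f"<<<EXTERNAL\n{cleaned}\nEXTERNAL>>>\n\n"
        f"User question: {user_question}"
    )


def authorize(action: str, approved_by_human: bool) -> bool:
    """Allow low-risk actions directly; high-risk ones need explicit human approval."""
    if action in HIGH_RISK_ACTIONS:
        return approved_by_human
    return True
```

The key design choice is that `authorize` runs outside the model: even if an injected instruction convinces the model to request a transfer, the request is not executed until a person approves it.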
Sensitive Information Disclosure (OWASP Top 10 LLM) The vulnerability of Sensitive Information Disclosure occurs when a language model handles personal or confidential data without adequate security controls. This issue can have serious consequences, especially when information is inadvertently disclosed during interaction with the model or due to poor data management practices during training. The nature of these models, trained on vast amounts of data, can lead to situations where private details are unexpectedly revealed if the data has not been properly filtered. One of the most common cases of sensitive information disclosure involves the leakage of personally identifiable information (PII), such as names, addresses, phone numbers, and other sensitive details. For example, in contexts where an LLM is used for customer support, it could inadvertently reveal personal data of another user if proper access controls are not in place. This situation can occur when the model has been trained using data that is not fully anonymized or when information is stored without adequate protection measures. Another significant risk is the exposure of proprietary algorithms or internal details of an organization. For example, a model used to solve business problems could accidentally reveal confidential information about proprietary algorithms or methodologies, exposing the company to potential security risks and loss of competitive advantage. This type of disclosure can occur not only due to errors in managing outputs but also because of targeted attacks exploiting vulnerabilities in prompts or training data. To mitigate these risks, it is crucial to adopt data sanitization techniques during the training process, ensuring that any personal or sensitive data is removed or masked. Sanitization must be performed not only on the data used for training but also on real-time user inputs. 
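As an illustration of sanitizing real-time user inputs, the sketch below masks e-mail addresses and phone-like numbers before text is logged or forwarded to a model. It is deliberately simple and rests on assumptions: the two regular expressions and the placeholder tokens are illustrative, and pattern matching alone will miss names, postal addresses, and many national number formats, which typically call for a dedicated PII detection tool.

```python
import re

# Simplified patterns for illustration; real deployments need broader coverage.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def mask_pii(text: str) -> str:
    """Replace e-mail addresses and phone-like numbers with placeholder tokens."""
    text = EMAIL.sub("[EMAIL]", text)
    text = PHONE.sub("[PHONE]", text)
    return text
```

For example, `mask_pii("Contact anna@example.com or +39 02 1234 5678.")` returns `"Contact [EMAIL] or [PHONE]."`.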
Additionally, the adoption of federated learning techniques can reduce the need to transfer sensitive data to a single centralized location, thereby decreasing the risk of exposure. The implementation of access controls based on the principle of least privilege is another key measure to prevent sensitive information disclosure. This approach implies that the model only has access to the information strictly necessary to perform its task, thus limiting the possibility that confidential information is processed or disclosed by mistake. Another useful technique is the use of differential privacy, which adds "noise" to the data to ensure that specific user information cannot be reconstructed from the results generated by the model. Educating users about the safe use of LLMs is equally important. Users must be aware of the risks associated with entering sensitive data and should receive guidelines on how to interact with the model safely. For example, the service's terms of use should clarify that the entered data might be used to improve the model, and users should be given the option to opt out of having their data used for training. Finally, it is essential to properly configure the system to avoid having confidential information included in system prompts or outputs generated by the model. Infrastructure security must be ensured by following best practices, such as those defined by OWASP, including the secure configuration of APIs and masking error messages to prevent leaks of critical information. Supply Chain Vulnerabilities (OWASP Top 10 LLM) Supply chain vulnerabilities in LLM-based applications represent a significant risk, as they can compromise the integrity of the models, training data, and deployment platforms. These vulnerabilities can arise from various external elements, such as pre-trained models or third-party software components. 
Using publicly available pre-trained models, for example, carries an inherent risk because such models may contain biases or even malicious backdoors, introducing weaknesses that are difficult to detect. A critical aspect is the use of outdated models that are no longer updated or maintained. The adoption of unsupported models or software components represents a common security flaw, similar to those described in other areas of cybersecurity (such as managing outdated software), but with potentially much greater impact, given the pervasive use of LLMs in critical contexts. If a model is not updated, discovered vulnerabilities can be exploited by malicious actors, leading to possible data breaches or system attacks. Another risk concerns fine-tuning methods based on techniques such as Low-Rank Adaptation (LoRA). While these techniques allow for more efficient adaptability and performance improvements, they also introduce new risks. An attacker could exploit vulnerabilities in these adaptations to compromise their integrity, manipulating the base model at the component level and inserting unwanted behaviors. For example, a malicious LoRA adapter could be loaded from an unverified source, compromising the entire system. Moreover, collaborative development and model merging processes, such as those widely adopted on platforms like Hugging Face, represent a notable attack surface. Model sharing platforms are often vulnerable to compromises due to misconfiguration or inadequate security controls. Model tampering attacks could include directly modifying a model's parameters to insert backdoors or biases that are not detectable during common usage. To mitigate these risks, it is crucial to maintain an accurate and updated inventory of all components used in the supply chain, utilizing tools like the Software Bill of Materials (SBOM), which allows verification of the origin and security of software components and pre-trained models. 
This enables the rapid identification of any known vulnerabilities and the evaluation of the system's overall security. The implementation of AI Red Teaming practices, involving specialized teams simulating attacks to identify vulnerabilities, can be highly effective in testing the resilience of models and components against real threats. It is equally important to continuously monitor and verify the security of collaborative development environments by introducing auditing mechanisms that allow the timely detection of anomalies or abuses. Finally, creating a constant update and patching policy for components used in models is crucial to ensure that any vulnerability is resolved as quickly as possible, thereby limiting the risk of exposure to potential exploits. The use of model encryption techniques, especially for models distributed on local devices, and the integration of integrity checks can prevent model tampering and limit unauthorized access. Data and Model Poisoning (OWASP Top 10 LLM) Data and Model Poisoning occurs when the data used to train the model is manipulated to introduce vulnerabilities, biases, or even to deliberately compromise the model. This type of attack can negatively affect the model's performance, leading to incorrect decisions or unexpected behaviors. One of the main risks is that training data, especially data from external sources, may contain malicious information that alters the model's ability to make accurate predictions. This is particularly true when models are trained on unverified datasets or data collected from public environments, where attackers can easily inject adversarial content. For instance, an attacker could manipulate the dataset by inserting specific examples designed to teach the model to behave incorrectly in certain situations. This type of attack, known as backdoor insertion, can leave the model seemingly normal until a specific trigger alters its behavior. 
Such an attack could allow the attacker to bypass security measures or directly manipulate the model's responses. To mitigate these risks, it is crucial to implement data traceability measures. Using tools like the Machine Learning Bill of Materials (ML-BOM) helps track the origin and transformations of data throughout the model's lifecycle. Data validation is equally important: every piece of data should undergo a rigorous verification process before being used for training, especially if it comes from external or collaborative sources. Another effective strategy is using data version control (DVC) to monitor every change in datasets. This helps detect any data manipulation and maintain the integrity of the entire model development process. Additionally, the implementation of adversarial learning techniques helps prepare the model to withstand attacks, improving its robustness against malicious perturbations. Another step to prevent model poisoning involves adopting sandboxing to limit the model's exposure to unverified data. Creating an isolated environment in which to test new data before actually using it for training reduces the risk of compromising the model. Finally, using monitoring and anomaly detection techniques during the training process helps identify unexpected behaviors in the model that could indicate the presence of poisoned data. Improper Output Handling (OWASP Top 10 LLM) Improper handling of outputs generated by LLM models can expose applications to a wide range of vulnerabilities, including remote code execution (RCE), cross-site scripting (XSS), and SQL injection attacks. This problem occurs when the output produced by the model is used without adequate validation or sanitization. Since LLMs are systems that generate text based on potentially unverified inputs, they can be exploited to introduce malicious commands that are then executed by subsequent components of the application chain. 
For example, model output that is fed into a shell system without being verified could allow an attacker to execute arbitrary commands, compromising the entire system. Similarly, SQL queries generated by the LLM and used to access databases without proper parameterization could lead to SQL injection vulnerabilities, allowing unauthorized access to data. In web contexts, unsanitized output displayed in a browser could result in cross-site scripting (XSS) attacks, where the attacker introduces malicious scripts that are executed by the user's browser. To mitigate these risks, it is crucial to treat every output generated by the model as potentially dangerous, applying strict validation and sanitization practices. The adoption of context controls, such as encoding the output based on the target environment (HTML, SQL, JavaScript), is an essential measure to ensure that generated content cannot be used maliciously. Using parameterized queries for all database operations reduces the risk that unverified inputs could alter the intended operations. Moreover, implementing a Content Security Policy (CSP) can limit the impact of XSS attacks by preventing unauthorized scripts from executing. The use of advanced logging and monitoring systems can help detect abnormal behaviors in the outputs generated by the models. For example, constantly monitoring the content generated by the LLM and identifying suspicious patterns can provide an additional level of security, enabling rapid intervention in case of malicious activity detection. It is also important to define rate limits and usage quotas to prevent abuse, especially in contexts where the model has access to critical functions or sensitive resources. Ultimately, ensuring proper output handling means adopting a "zero trust" approach towards generated content, treating the model as a possible attack vector, and implementing all necessary safeguards to protect downstream systems from potential compromises. 
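The two mitigations named above, context-aware encoding and parameterized queries, can be illustrated with a minimal Python sketch that uses only the standard library. The table name `customers` and the helpers `render_reply` and `find_customer` are hypothetical names chosen for illustration; they do not come from the OWASP document.

```python
import html
import sqlite3


def render_reply(model_output: str) -> str:
    """Encode model output for an HTML context so any markup is displayed, not executed."""
    return "<p>" + html.escape(model_output, quote=True) + "</p>"


def find_customer(conn: sqlite3.Connection, name_from_model: str) -> list:
    """Bind the model-supplied value as a parameter instead of formatting it into the SQL."""
    cur = conn.execute(
        "SELECT id, name FROM customers WHERE name = ?", (name_from_model,)
    )
    return cur.fetchall()


# Hypothetical in-memory database used only for this demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO customers (name) VALUES (?)", [("alice",), ("bob",)])

# A classic injection payload is matched as a literal string and finds nothing.
rows_injected = find_customer(conn, "alice' OR '1'='1")
rows_ok = find_customer(conn, "alice")

# Script tags in the model output are turned into inert text.
page = render_reply("<script>alert(1)</script>")
```

Encoding depends on the destination: `html.escape` is appropriate for HTML body text only, so output headed for a shell, a URL, or JavaScript would need its own context-specific handling.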
Excessive Agency (OWASP Top 10 LLM) The concept of Excessive Agency refers to the excessive autonomy granted to a large language model (LLM), which can lead the model to take critical actions without adequate human supervision. LLMs with excessive autonomy can make decisions or perform operations that are outside their intended scope, potentially causing harm or security breaches. This risk becomes more critical with the growing spread of agent-based architectures, where an LLM is used as a decision point to perform various actions. In the context of an LLM application, autonomy can include the model's ability to invoke system functions, access external resources, or communicate with other parts of a system without human confirmation. This capability can be useful for automating tasks, but at the same time, it introduces vulnerabilities when action controls are not sufficiently limited. A common example of excessive autonomy concerns an LLM used as an assistant for email management, which might have access not only to read emails but also to send and delete them. This type of access exposes the system to significant risk, especially if an attacker manages to manipulate the LLM through malicious prompts or compromised external data. If the model is not designed to require human confirmation before performing certain operations, an attack could result in unauthorized emails being sent or critical information being deleted. Another example can be represented by the use of unnecessary plugins or extensions that increase the range of functionalities available to an LLM. If a model is enabled to interact with a file management system, and this extension allows both reading and modifying files, the risk is that unwanted behavior or a targeted attack could lead to the modification or deletion of sensitive data. 
Plugins with extended functionalities that are not strictly necessary for the intended operation represent a risk vector because they offer additional access points that can be exploited. A related issue is excessive permissions. Very often, LLMs are configured to operate with excessive privileges, allowing them to access functionalities or resources that are not essential for their operations. For example, an extension that only needs to read data from a database might be configured with write, modify, or delete permissions, creating a broader attack surface. Such misconfiguration makes the system vulnerable not only to possible attacks but also to errors that may result from the model's unexpected behavior. To mitigate the risk of excessive autonomy, it is essential to adopt an approach that minimizes the extensions and functionalities available to the LLM. Extensions should be limited to only strictly necessary operations, thus reducing the model's ability to perform harmful actions. It is crucial to apply the principle of least privilege, ensuring that each extension or plugin operates with the lowest possible privileges, required only for the specific intended operation. In this way, even if the model is compromised, the actions it could perform would be severely limited. Moreover, the implementation of human-in-the-loop mechanisms is crucial to ensure that all high-impact actions require confirmation from a human operator before being executed. For example, if an LLM is used to generate content to be published on social media, the final publication should always be manually approved by a human operator to avoid errors or abuse. Finally, it is important to implement continuous monitoring of the model's activities, logging all operations performed and identifying any abnormal behaviors. This type of logging can help quickly detect suspicious activities and respond effectively. 
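The least-privilege, human-in-the-loop, and logging measures described above can be combined in a small gate that sits between the model and its tools. The following Python sketch is one possible arrangement under stated assumptions: the `ToolGate` class, the tool names, and the `approve` callback are all hypothetical, and a real deployment would replace the callback with an actual operator prompt.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Tool:
    func: Callable[..., str]
    high_impact: bool = False  # high-impact actions require human confirmation


@dataclass
class ToolGate:
    tools: Dict[str, Tool]                 # allowlist: anything absent is refused
    approve: Callable[[str, dict], bool]   # human approval callback (assumed)
    audit_log: List[str] = field(default_factory=list)

    def call(self, name: str, **kwargs) -> str:
        tool = self.tools.get(name)
        if tool is None:
            self.audit_log.append(f"DENIED {name}")
            raise PermissionError(f"tool not allowed: {name}")
        if tool.high_impact and not self.approve(name, kwargs):
            self.audit_log.append(f"REJECTED {name}")
            raise PermissionError(f"human approval refused: {name}")
        self.audit_log.append(f"OK {name}")
        return tool.func(**kwargs)


def read_email(msg_id: int) -> str:
    return f"body of message {msg_id}"


def send_email(to: str, body: str) -> str:
    return f"sent to {to}"


# The assistant can read freely and send only with approval; deletion is not registered.
gate = ToolGate(
    tools={
        "read_email": Tool(read_email),
        "send_email": Tool(send_email, high_impact=True),
    },
    approve=lambda name, args: False,  # stand-in for an operator who declines
)
```

With this layout, a manipulated model that asks for `delete_email` is refused because the tool was never granted, and every attempt leaves an entry in `audit_log` for later review.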
Additionally, adopting rate limits and restrictions on the number of actions an LLM can perform within a given time frame helps prevent abuse and limit the impact of possible compromises. The risk of Excessive Agency is therefore closely linked to the management of the capabilities and permissions granted to LLMs. A well-designed architecture, which adopts mitigation measures such as the principle of least privilege, human supervision for critical actions, and continuous monitoring of activities, can significantly reduce exposure to this type of vulnerability, ensuring that the LLM always operates within safe and controlled limits. System Prompt Leakage (OWASP Top 10 LLM) The vulnerability of System Prompt Leakage involves the risk that system prompts, which are the instructions used to guide the model's behavior, may contain sensitive information that is not intended to be disclosed. System prompts are designed to provide the model with the directives needed to generate appropriate outputs, but they might inadvertently include confidential or critical data. When this information is uncovered, it can be used to facilitate other types of attacks, thus posing a significant risk to system security. A common example of System Prompt Leakage occurs when prompts contain access credentials, API keys, or configuration details that should remain secret. If an attacker manages to extract these prompts, they can exploit them for unauthorized access to system resources, with potentially severe consequences. A specific case reported in the OWASP 2025 research shows how, in various business environments, information such as the structure of internal permissions or user financial transaction limits has been inadvertently exposed, thereby increasing the risk of privilege escalation attacks or bypassing security limits. Moreover, System Prompt Leakage vulnerability can reveal internal filtering criteria used to prevent the model from providing sensitive responses. 
For example, a system prompt might contain instructions like: “If a user requests information about another user, always respond with ‘Sorry, I cannot assist with this request.’” If an attacker were to see this prompt, they could exploit it to bypass security measures and manipulate the model's behavior in unintended ways. To mitigate the risk of System Prompt Leakage, it is crucial to separate sensitive data from system prompts and avoid including any critical information directly in them. Sensitive information should be managed through systems external to the model, ensuring that the model does not have direct access to such data. Another effective approach is to implement external guardrails: while training for specific behaviors can be useful, it does not guarantee that the model will always follow the instructions, especially in attack situations. An independent system that checks outputs to ensure compliance with expectations is preferable to relying solely on system prompt instructions. A critical mitigation strategy is to ensure that security controls are applied independently of the LLM. This means that essential controls, such as privilege separation and authorization verification, must be performed in a deterministic and verifiable manner, and should never be delegated to the model. For instance, if an LLM agent performs tasks requiring different levels of access, multiple agents should be used, each configured with the minimal privileges needed to perform its task, thereby reducing the risk of accidental exposure of sensitive data. In summary, the risk associated with System Prompt Leakage is not simply about disclosing the prompts themselves, but rather about the presence of sensitive data or excessive authorizations within them. Implementing robust external controls and limiting prompt content to non-sensitive information are essential steps to protect the integrity and security of LLM-based applications. 
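To make the idea of controls applied independently of the LLM more concrete, here is a minimal Python sketch in which the transaction limits and ownership check live in application code rather than in the system prompt. The roles, limit values, and function names are hypothetical and serve only to illustrate the pattern.

```python
from dataclasses import dataclass

# Hypothetical per-role limits, kept in application configuration and never in the prompt.
TRANSFER_LIMITS = {"basic": 1_000, "premium": 10_000}


@dataclass(frozen=True)
class User:
    user_id: str
    role: str


def authorize_transfer(user: User, account_owner: str, amount: int) -> bool:
    """Deterministic check run by the application, whatever the model has been told or says."""
    if user.user_id != account_owner:
        return False
    return 0 < amount <= TRANSFER_LIMITS.get(user.role, 0)


def handle_model_request(user: User, request: dict) -> str:
    """`request` is the structured action proposed by the model and is treated as untrusted."""
    if not authorize_transfer(user, request["account_owner"], request["amount"]):
        return "denied"
    return "executed"  # a real system would perform the transfer here


alice = User("alice", "basic")
within_limit = handle_model_request(alice, {"account_owner": "alice", "amount": 500})
over_limit = handle_model_request(alice, {"account_owner": "alice", "amount": 5_000})
other_account = handle_model_request(alice, {"account_owner": "bob", "amount": 10})
```

Because the limits are enforced outside the model, extracting the system prompt reveals nothing an attacker could use to raise them, and a successful prompt injection still cannot move money from another user's account.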
Vector and Embedding Weaknesses (OWASP Top 10 LLM) Weaknesses in embeddings and vectors represent another significant security risk for LLMs. Embeddings are numerical representations that capture the meaning of text and are fundamental to the functioning of LLMs. However, these representations can be exploited to manipulate the model or extract sensitive information, especially if they are not protected by adequate security controls. One of the primary vulnerabilities is embedding inversion, a type of attack in which an attacker uses embedding vectors to reconstruct sensitive information originally included in the training data. This inversion process can reveal private user details or proprietary data used to train the model, thereby compromising privacy. A concrete example reported in the OWASP 2025 research illustrates how an attacker managed to recover personal information, such as names or addresses, by analyzing embedding vectors generated by an inadequately protected LLM. Additionally, embeddings can become vulnerable due to insufficient access controls. In systems using Retrieval-Augmented Generation (RAG) techniques, information contained in vectors can be retrieved and combined with new queries, creating the risk of sensitive data leakage between different users or usage contexts. For example, in multi-tenant environments, an error in the logical separation of requests could cause one user to receive information related to another user, leading to a confidentiality issue. To mitigate these risks, it is essential to implement granular access controls that limit the use of embeddings to secure and verified contexts. Embeddings should be managed so that access is tightly controlled and authorized only for specific purposes. Additionally, techniques such as encrypting data within embeddings can help prevent the risk of inversion and information leakage. 
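The multi-tenant separation problem described above can be sketched in Python as a retrieval function that filters by tenant before any similarity ranking takes place. This is a toy in-memory store with hand-written two-dimensional vectors; the `Chunk` type, tenant names, and `retrieve` function are hypothetical, and production vector databases typically expose their own metadata filters for the same purpose.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Chunk:
    tenant_id: str
    text: str
    vector: Sequence[float]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(store: List[Chunk], tenant_id: str, query_vec: Sequence[float], k: int = 2) -> List[Chunk]:
    # Filter by tenant BEFORE ranking, so another tenant's chunks can never be returned.
    allowed = [c for c in store if c.tenant_id == tenant_id]
    ranked = sorted(allowed, key=lambda c: cosine(c.vector, query_vec), reverse=True)
    return ranked[:k]


store = [
    Chunk("acme", "Acme Q3 invoice total", [1.0, 0.0]),
    Chunk("acme", "Acme travel policy", [0.6, 0.8]),
    Chunk("globex", "Globex Q3 invoice total", [1.0, 0.0]),
]

hits = retrieve(store, "acme", [1.0, 0.0])
```

The Globex chunk is just as similar to the query as the best Acme chunk, which is exactly why the tenant filter has to be enforced by the retrieval layer and not left to the model or to ranking.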
It is equally important to establish strict data validation policies to ensure that the information used to create embeddings is clean and comes from reliable sources. Another step toward mitigation involves continuously monitoring the use of embeddings and RAG resources, maintaining a detailed log of access activities. This allows for the timely detection of abnormal behavior that might indicate manipulation attempts or unauthorized access. Monitoring can be combined with anomaly detection techniques to quickly identify possible attacks and mitigate their impact. In summary, weaknesses in embeddings and vectors pose a significant challenge for LLM security. Implementing strict access controls, encrypting data, and constantly monitoring activity are all critical measures to protect these elements and ensure the security and confidentiality of LLM-based applications. Misinformation (OWASP Top 10 LLM) Misinformation represents one of the most critical risks in the use of LLMs, as the models can generate content that appears accurate but is completely incorrect or misleading. This risk is amplified by the ability of LLMs to produce responses that sound credible but are based on erroneous data or misinterpretations. Misinformation can lead to security violations, reputational damage, and even legal consequences, especially in contexts where the reliability of information is crucial, such as healthcare, finance, or law. One of the main issues underlying misinformation is the phenomenon of hallucinations, where the model "invents" answers when there is a lack of concrete data. When the LLM does not have precise information on a particular subject, it may fill in the gaps with statistically generated data that seem accurate but are actually fabricated. For example, in the OWASP 2025 research, there have been documented cases where LLMs provided nonexistent legal references or health details with no scientific basis. 
This type of misinformation can lead users to make poor decisions, with potentially harmful consequences. Another related problem is the excessive trust users may place in content generated by LLMs. Since responses often appear very confident and detailed, users tend not to verify their accuracy, integrating incorrect information into decision-making processes without proper checks. This can be particularly risky in sensitive contexts. For instance, a medical chatbot providing incorrect information could harm a patient's health, and a model used in the financial sector could lead to disastrous economic decisions. To reduce the risk of misinformation, an effective strategy is to use Retrieval-Augmented Generation (RAG), which allows the model to access updated and verified sources of information during response generation. This approach reduces the risk of hallucinations, as responses are based on concrete data rather than statistical generation. Moreover, it is important to integrate human supervision into decision-making processes, especially in critical fields: manually verifying the information generated by the model can improve overall accuracy and reduce the spread of erroneous content. Another mitigation technique is model refinement through fine-tuning and using embeddings that improve response quality. Techniques like parameter-efficient tuning (PET) and chain-of-thought prompting can significantly reduce the incidence of misinformation, as they enable the model to perform more structured reasoning and verify the consistency of generated information. Finally, it is crucial to educate users on the limitations of LLMs and the importance of independent verification of generated content. Providing specific training to users, especially in sensitive contexts, helps avoid excessive reliance on model-generated content and develop a more critical approach to using these technologies. 
In conclusion, misinformation represents a central vulnerability for LLMs but can be mitigated through a multidimensional approach combining the use of external sources, human supervision, continuous model refinement, and user education. Only through rigorous control and constant verification is it possible to minimize the risks associated with the dissemination of incorrect information by these models. Unbounded Consumption (OWASP Top 10 LLM) Unbounded Consumption refers to the risk of an LLM using computational resources in an uncontrolled manner, potentially resulting in denial of service or high operational costs. LLMs, especially those hosted in cloud environments with "pay-per-use" billing models, can be vulnerable to excessive and unauthorized use, leading to unsustainable costs for the managing organization. A common example of this risk is the so-called Denial of Wallet (DoW), where an attacker exploits the pay-per-use system to generate continuous and costly requests to the model, causing a significant increase in service costs. This type of attack can harm the organization financially and can also have operational consequences, limiting service availability for legitimate users. According to the OWASP 2025 research, specific cases have been reported where a company's operational cost grew exponentially due to a DoW attack, highlighting how this can represent a significant financial threat. Another typical case of Unbounded Consumption occurs when users repeatedly submit very complex inputs or long sequences, causing disproportionate use of the model's resources. In these cases, the system can become slow or even stop responding due to excessive computational pressure. An example might be the use of linguistically intricate requests that require significant processing, resulting in inefficient use of CPU and memory.
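One widely used guard against request floods of the kind described above is a per-user rate limit. The Python sketch below shows a token bucket, a common way to implement such a limit; it is an illustrative choice and not a mechanism prescribed by the OWASP document. The class name and parameters are hypothetical, and the clock value is passed in explicitly so the behavior is easy to follow.

```python
from dataclasses import dataclass, field


@dataclass
class TokenBucket:
    capacity: float                     # maximum burst size
    refill_per_sec: float               # sustained request rate
    tokens: float = field(init=False)   # current balance, starts full
    last: float = 0.0                   # timestamp of the previous call

    def __post_init__(self) -> None:
        self.tokens = self.capacity

    def allow(self, now: float, cost: float = 1.0) -> bool:
        """Refill according to elapsed time, then spend `cost` tokens if available."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# One bucket per user; a burst of two requests is allowed, then one request per second.
bucket = TokenBucket(capacity=2, refill_per_sec=1.0)
decisions = [bucket.allow(0.0), bucket.allow(0.0), bucket.allow(0.0), bucket.allow(1.0)]
```

The `cost` argument lets an application charge more tokens for long or complex prompts, which addresses the case where a few expensive requests do more damage than many cheap ones.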
To mitigate these risks, it is crucial to implement rate limits and usage quotas that regulate the maximum number of requests a single user can make within a given time period. This helps prevent resource abuse and ensures a fair distribution of computational capacity among users. The OWASP research emphasizes the importance of limiting the exposure of logits and other sensitive information during API interactions, thereby reducing potential attack vectors for exploiting the model. Another effective approach is continuous resource monitoring, which allows for detecting abnormal usage and quickly responding to suspicious behavior. Alarm systems and rate limiting can be configured to automatically intervene when model usage exceeds certain thresholds, ensuring that resources always remain within manageable limits. Finally, it is useful to consider implementing controlled system degradation techniques. Under excessive loads, the system can be designed to maintain partial functionality rather than undergo a complete shutdown. This ensures that at least some services remain operational even during significant attacks or overloads, thereby reducing the negative impact on the end-user experience. These multidimensional approaches are fundamental to addressing the risk of Unbounded Consumption in LLM applications and ensuring service continuity, economic sustainability, and operational security of implementations based on these models. Conclusions The growing integration of large language models (LLMs) into business processes and public services has led to increased attention to their security, highlighting the need to address new vulnerabilities. These risks, while technical, have profound strategic implications for businesses, particularly in terms of trust, reputation, compliance, and economic sustainability. 
Understanding the vulnerabilities identified in the OWASP Top 10 LLM 2025 report enables the development of unique perspectives and exploration of innovative strategies to mitigate risks while maximizing the value derived from these advanced technologies. A key takeaway is that vulnerabilities are not limited to the technology itself but often arise from the interaction between models, data, and business processes. For instance, the issue of “Prompt Injection” is not just a technical challenge but calls into question the reliability of the model as a decision-making tool. When a model can be manipulated through malicious inputs, companies must rethink their trust in the generated outcomes and build more resilient ecosystems. Adopting approaches like “human-in-the-loop” is not only a security measure but becomes a strategic choice to balance automation and human control, preserving decision quality in critical scenarios. The “Disclosure of Sensitive Information” instead highlights how fragile the boundary between technological innovation and privacy protection is. Companies can no longer consider data security as a separate technical requirement but must integrate it into their governance strategies. This implies building systems that go beyond simple anonymization, embracing concepts such as differential privacy and federated learning. Such approaches not only reduce risks but offer a competitive advantage in a context where consumer trust is a strategic asset. Vulnerabilities in the supply chain highlight how AI security depends on complex networks of suppliers and partners. Relying on pre-trained models or third-party components introduces systemic risks that require proactive management. Companies must start considering the security of model supply chains as an integral part of their risk management strategy, adopting tools like the Software Bill of Materials (SBOM) to ensure transparency and control. 
“Misinformation” represents a vulnerability with broader strategic consequences, as it undermines not only the credibility of the technology but also that of the businesses that use it. Companies must address this challenge by embracing a model of accountability to end users. This means not only implementing verification and oversight systems but also educating the public to understand the technology's limitations. Such awareness can transform a reputational risk into an opportunity to strengthen trust. Finally, the risk of “Unbounded Consumption” emphasizes that adopting LLMs is not only about technological innovation but also about economic sustainability. Inefficient resource management can quickly turn into a financial problem, making it essential for companies to implement monitoring and control mechanisms. Furthermore, the concept of “denial of wallet” introduces a new perspective on AI costs, pushing organizations to consider architectural solutions that balance performance and protection. Companies wishing to harness the potential of LLMs must adopt an integrated vision that goes beyond technical security, embracing a strategic approach that considers trust, governance, resilience, and economic sustainability. This requires rethinking the entire implementation lifecycle, from design to operational management, to build systems that are not only secure but also aligned with business objectives and capable of responding to future challenges. Podcast: https://spotifycreators-web.app.link/e/zQXACPTqXOb Source: https://genai.owasp.org/resource/owasp-top-10-for-llm-applications-2025/
Multimodal AI in Medicine
The growing availability of biomedical data from biobanks, electronic health records, medical imaging, wearable sensors, and environmental biosensors, along with the decreasing costs of genome and microbiome sequencing, has created the conditions for the development of multimodal artificial intelligence (AI) solutions capable of capturing the complexity of human health and disease. This article is based on research conducted by Julián N. Acosta, Guido J. Falcone, Pranav Rajpurkar, and Eric J. Topol, affiliated with institutions such as the Yale School of Medicine, Harvard Medical School, and the Scripps Research Translational Institute. Multimodal AI has the potential to transform medicine from a series of point-in-time evaluations into a more holistic and continuous view of health, improving diagnosis, prognosis, and treatment personalization. What is Multimodal AI and What Are the Opportunities in Medicine? Multimodal AI combines data from different sources, such as genetic data, imaging, wearable sensors, and clinical data, to provide a deeper understanding of patient health. Currently, most AI applications in medicine focus on single modalities, such as a CT scan or a retinal photograph, whereas a physician integrates multiple sources and modalities during diagnosis and treatment selection. Multimodal AI could come closer to this way of working, broadening the range of applications and improving the precision and personalization of care. One of the most promising areas concerns personalized medicine. Advances in sequencing have allowed the collection of detailed molecular biology data ("omics"), including genome, proteome, transcriptome, epigenome, metabolome, and microbiome. Integrating these data with other clinical information is essential to enhance our understanding of human health, enabling increasingly precise and personalized prevention, diagnosis, and treatment strategies. One of the main opportunities for multimodal AI lies in the use of so-called "digital twins."
This approach involves creating virtual models that replicate the physiological behavior of an individual patient, to simulate different therapeutic scenarios. For example, digital twins are used to simulate the progression of diseases such as Alzheimer's or multiple sclerosis and test the possible impact of different therapies, reducing the time and costs required for clinical trials. Such technology can more accurately predict the effectiveness of a specific treatment, improving the personalization of care and optimizing healthcare resource management. Furthermore, combining genetic data, epigenetic data, and information on social determinants of health allows addressing the problem of disparities in access to care. By integrating these different types of data, multimodal AI can identify at-risk populations and develop specific interventions to prevent the onset of chronic diseases. Multimodal AI is not limited to diagnosis and therapy. Another essential application concerns "virtual health assistants." These assistants can combine genetic data, clinical information, and data from biosensors to offer personalized recommendations to patients, facilitating continuous health monitoring and promoting positive behaviors. In a context where healthcare costs continue to rise, using virtual assistants can reduce pressure on healthcare systems, improve therapy adherence, and provide constant support, especially for patients with chronic diseases. Digital Clinical Trials and Remote Monitoring Another important application area is represented by digital clinical trials. The digitalization of clinical trials has the potential to reduce costs, overcome geographical and cultural disparities in participant access, and improve data collection and quality. 
Recent developments suggest that integrating data from different wearable devices, which monitor signals such as heart rate, blood oxygen level, sleep quality, and physical activity, could offer a more complete picture of the health status of clinical trial participants. Additionally, integrating devices such as environmental sensors and depth cameras can provide a holistic perspective, useful not only for clinical monitoring but also for adapting patients' living environments according to their health conditions. Another significant innovation in the field of digital clinical trials is represented by the so-called "synthetic control trials." In this approach, historical and external data are used to create synthetic control groups, reducing the need to involve a large number of real participants. These virtual control groups allow comparable results to be obtained while reducing the overall cost and duration of the clinical trial. Moreover, the concept of adaptive clinical trials, which use real-time data to modify trial protocols, represents a further development in this field. This type of adaptation, facilitated by multimodal integration, allows for a more dynamic and appropriate response to changes in patients' health status during the trial, increasing the safety and effectiveness of the tested interventions. Remote monitoring through wearable sensors and environmental biosensors is already demonstrating the possibility of replicating many hospital functions at home, improving patients' quality of life, and reducing healthcare costs. A concrete example concerns the remote monitoring program for COVID-19 developed by the Mayo Clinic, which demonstrated the feasibility and safety of remotely monitoring people with COVID-19. However, remote monitoring still requires validation through randomized trials to demonstrate its safety compared with hospital admission.
For this purpose, multimodal AI could play a key role in predicting imminent clinical deterioration, enabling timely intervention and reducing patient risks. Environmental sensors, including wireless ones, offer further possibilities for remote monitoring. For instance, sensors placed in home environments, such as cameras or microphones, can detect changes in patient behavior or events such as falls, providing an additional level of safety and support, especially for elderly patients or those with chronic conditions. Integrating these different data modalities allows continuous and accurate monitoring, which could significantly improve the quality of care provided and allow timely interventions when needed. Digital Twin Technology and Pandemic Surveillance Digital twin technology, a concept borrowed from engineering, could represent a paradigm shift in personalized medicine. These virtual models could predict the effectiveness of specific therapeutic interventions for a patient, accurately modeling the effects on their health. An example is the use of digital twins to improve clinical trials on Alzheimer's and multiple sclerosis, with the aim of testing new therapeutic strategies more quickly and at lower cost. Furthermore, digital twins can be used to simulate emergency health management scenarios. For example, digital twin models can be employed to predict the hospital capacity needed during a pandemic emergency, optimizing resource allocation and improving crisis response. This type of simulation was recently used to analyze the distribution and effectiveness of hospital resources during the COVID-19 pandemic, showing how integrating multimodal data (such as epidemiological data, bed availability information, and ventilation capacity) can provide a clearer and more strategic view of healthcare resource management. Multimodal AI can improve pandemic preparedness and response by facilitating more integrated, real-time surveillance.
A concrete example is the DETECT study, launched by the Scripps Research Translational Institute, which used data from wearable sensors to identify COVID-19 infections and other viral diseases early, showing that combining self-reported symptoms with sensor metrics improved predictive capability compared with using either modality alone. Another emerging approach concerns the use of graph neural network models for integrating and analyzing multimodal data related to pandemic surveillance. These models can leverage the interconnections between different data sources (such as mobility, clinical outcomes, and contact tracing) to identify contagion patterns that are not evident with traditional methods, allowing a more timely and accurate response to epidemics. This technology can also be extended locally, providing specific information for individual hospitals or geographic areas and improving public health management in emergency situations. Technical and Privacy Challenges Despite the opportunities, there are still many challenges to overcome. The multimodal nature of health data entails an intrinsic complexity in their collection, linking, and annotation. Creating well-structured and standardized multimodal datasets is crucial for the success of these approaches. Moreover, the "curse of dimensionality" describes the difficulty of integrating data with a high number of variables, which reduces the generalizability of AI solutions. One of the main difficulties is the effective integration of data from heterogeneous sources with different structural characteristics. Data from wearable devices, medical imaging, omics, and electronic health records vary widely in type and format. This variety makes data management complex and requires the creation of infrastructures and normalization tools that can ensure their compatibility and consistency.
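As a small illustration of what a normalization step for heterogeneous sources might look like, the sketch below rescales three hypothetical per-patient measurements (a wearable reading, a lab value, and an imaging-derived value) to z-scores and joins them into one feature row per patient. The variable names and toy values are assumptions made for the example, not data from the research discussed here, and real pipelines also have to handle schema mapping, missing values, and time alignment.

```python
from statistics import mean, pstdev


def zscore(values):
    """Rescale a list of numbers to zero mean and unit (population) variance."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def fuse(*modalities):
    """Normalize each modality, then build one feature row per patient."""
    normalized = [zscore(m) for m in modalities]
    return [list(row) for row in zip(*normalized)]


# Hypothetical toy data: one value per patient for each modality.
heart_rate_bpm = [62.0, 75.0, 88.0]    # wearable device
crp_mg_per_l = [1.0, 3.0, 8.0]         # laboratory result
lesion_volume_ml = [0.0, 2.5, 5.0]     # derived from imaging

fused = fuse(heart_rate_bpm, crp_mg_per_l, lesion_volume_ml)
for patient_index, row in enumerate(fused):
    print(patient_index, [round(x, 3) for x in row])
```

After this step every feature is expressed on the same scale, so no single modality dominates a downstream model simply because of its units.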
It has been shown that using techniques such as Graph Neural Networks can improve the ability to manage and integrate these different types of data, allowing a deeper analysis of interactions between different modalities and enhancing predictive accuracy. Furthermore, the lack of high-quality labeled data represents another significant obstacle. Building quality datasets requires accurate and standardized data collection and annotation, a process that is often costly and time-consuming. Self-supervised learning and knowledge transfer techniques can help bridge this gap, allowing AI models to effectively learn from large amounts of unlabeled data. For instance, DeepMind's Perceiver framework has been proposed as a possible solution to handle data of different natures without resorting to specific architectures for each modality, thus improving data integration efficiency. Another relevant issue is represented by data privacy protection. Integrating data from different sources increases the risk of re-identifying individuals, especially when dealing with sensitive health data. To address these challenges, several data protection techniques are emerging, such as federated learning, which allows AI models to be trained without transferring raw data from the original devices or institutions, thus ensuring a higher level of privacy. Federated learning has been successfully implemented in collaborations among multiple institutions to predict clinical outcomes in COVID-19 patients, demonstrating the feasibility of this technology even in complex and large-scale scenarios. Homomorphic encryption represents another solution that allows mathematical operations to be performed on encrypted data without the need to decrypt them, thus ensuring the security of information during the training process. 
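To make the idea of computing on encrypted data more tangible, here is a minimal sketch of the Paillier cryptosystem, chosen here only as a well-known example of an additively homomorphic scheme; the text above does not name a specific scheme. The tiny primes and the function names are assumptions for illustration, and the code is in no way secure or suitable for real health data.

```python
import math
import random


def keygen(p, q):
    """Build a toy Paillier key pair from two primes (far too small for real use)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse; valid because the generator is g = n + 1
    return n, (lam, mu)


def encrypt(n, message):
    """Encrypt an integer 0 <= message < n under the public key n."""
    n_squared = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, message, n_squared) * pow(r, n, n_squared)) % n_squared


def decrypt(n, private_key, ciphertext):
    """Recover the plaintext integer from a ciphertext."""
    lam, mu = private_key
    l_value = (pow(ciphertext, lam, n * n) - 1) // n
    return (l_value * mu) % n


def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    return (c1 * c2) % (n * n)


public_n, private_key = keygen(1009, 1013)
c_a = encrypt(public_n, 120)   # e.g. a value held by institution A
c_b = encrypt(public_n, 75)    # e.g. a value held by institution B
c_sum = add_encrypted(public_n, c_a, c_b)
print(decrypt(public_n, private_key, c_sum))  # sum computed without decrypting the inputs
```

The party that calls `add_encrypted` never sees 120 or 75; only the holder of the private key can read the total. Paillier supports addition only, whereas fully homomorphic schemes allow more general computation at a much higher cost.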
Alternatively, swarm learning offers a newer possibility of training models on local data without the need for a trusted central server, instead leveraging blockchain-based smart contracts to ensure the integrity of the distributed training process. Future Perspectives Multimodal AI has the potential to profoundly transform medicine, making it more personalized, predictive, and accessible. However, to realize this potential, significant efforts are required from the medical community and AI researchers to build and validate new models and demonstrate their usefulness in improving clinical outcomes. One of the main future perspectives for multimodal AI is the creation of unified platforms capable of combining and analyzing a wide range of biomedical data, including genetic information, clinical data, and real-time health parameters. Such platforms could provide a detailed and integrated picture of patient health, enhancing doctors' ability to make more informed decisions. The main challenge in this area concerns the development of models that can effectively operate with heterogeneous data while maintaining privacy and information security. For example, integrating omics data combined with imaging and clinical data has the potential to lead to an even more precise understanding of human health and to develop more effective personalized therapeutic strategies. Furthermore, the growing spread of wearable devices and environmental sensors will enable increasingly granular and continuous data collection, fostering the adoption of preventive rather than reactive approaches. Using multimodal AI in combination with these technologies could help constantly monitor patients and provide early warnings about potential health risks. 
The possibility of creating medical digital twins could further progress, evolving from a tool for simulating specific therapeutic scenarios to a resource for continuously optimizing a patient's care path, thanks to the progressive integration of data as they are acquired. This would allow for a dynamic and updated patient model, useful not only for treatment selection but also for predicting clinical outcomes and early identification of potential complications. Greater collaboration between healthcare systems, research groups, and industry will be crucial to collect the necessary data and demonstrate the value of these approaches in everyday clinical practice. New privacy technologies, such as federated learning and homomorphic encryption, could be further strengthened and combined with other techniques to ensure that the benefits of multimodal AI are obtained without compromising data security. This approach is particularly important to facilitate data sharing among different institutions, enhancing AI's ability to learn from a broader number of cases without compromising patient confidentiality. Another expanding area concerns adaptive digital clinical trials, where multimodal AI can be used to dynamically monitor and modify trial protocols based on real-time results. This will allow accelerating the development of new therapies, reducing the time and costs associated with traditional clinical trials. In the long term, multimodal artificial intelligence is set to become an essential component in the development of increasingly advanced precision medicine models. These models will optimize treatments for each patient, considering their specific clinical history, individual genetic profile, and current health conditions, with the goal of significantly improving both clinical outcomes and quality of life. 
Conclusions Multimodal artificial intelligence represents a strategic turning point for the future of healthcare, but its adoption poses challenges and opportunities that transcend the technological sphere to profoundly impact the organizational, economic, and social models of the healthcare system. The possibility of integrating heterogeneous data, such as genetics, imaging, sensors, and social determinants, paves the way for precision medicine that goes beyond the traditional dichotomy between generic approaches and personalized therapies. However, this transformation implies a paradigm shift that not only concerns treatments but also the very structure of healthcare institutions, the training of professionals, and the active participation of patients. Healthcare based on multimodal AI will not be limited to treating disease but will move towards managing health as a continuous and dynamic asset. This approach entails redefining the concept of value in the healthcare system, shifting it from the efficiency of episodic treatments to the ability to anticipate and prevent future complications. The economic implications are profound: on the one hand, it reduces costs related to avoidable hospitalizations and late treatments; on the other, it requires massive initial investment to create technological infrastructures capable of supporting this integration. The adoption of technologies such as "digital twins" and adaptive clinical trials could give rise to new ways of allocating healthcare resources, pushing towards a more equitable and proactive system. A crucial aspect is represented by the impact of multimodal AI on inequalities in access to care. If managed with a strategic vision, this technology can reduce health disparities, but without an ethical approach and solid governance policies, it risks exacerbating them. 
The integration of social determinants of health data could, for example, identify the most vulnerable populations, but without targeted intervention, mere awareness of disparities will not lead to significant changes. Moreover, dependence on advanced technologies could create new barriers for sections of the population that are less digitally literate or lack access to adequate tools. From a strategic point of view, multimodal AI represents an opportunity to redesign the relationship between doctor and patient. The traditional model, centered on the physician's authority, gives way to a collaboration based on data and virtual assistants that support the patient in monitoring their health. This change requires a reconfiguration of the professional role of doctors, who will need to acquire technological skills to interpret and exploit the data provided by AI. At the same time, there is a risk of a dehumanization of care if the focus on technology is not balanced by renewed attention to the empathetic and relational aspects of clinical work. Another fundamental element is the trust of patients and healthcare workers in the AI-based system. Data protection, often relegated to a technical dimension, assumes strategic relevance in terms of maintaining the legitimacy of the healthcare system. Solutions such as federated learning and homomorphic encryption should not be seen only as security tools but as mechanisms to build a new ethic of data management, capable of balancing innovation with privacy protection. Transparency in the use of collected information will be crucial to avoid AI adoption being perceived as a threat to personal control over one's health. Finally, the integration of multimodal AI into the healthcare system could redefine the boundaries between public and private health. 
Applications for pandemic surveillance or hospital emergency management demonstrate that this technology is not only an opportunity to improve individual treatments but also a tool to manage population health more effectively and resiliently. However, this requires unprecedented collaboration between governments, industry, research, and citizens. Without strategic coordination, there is a risk that fragmented initiatives will lead to inefficiencies, duplications, and conflicts of interest. Multimodal AI is not just a technological evolution but a lever for rethinking the healthcare system as an integrated ecosystem. Its applications should not be evaluated solely in terms of scientific innovation but as catalysts for systemic transformation that requires vision, leadership, and the ability to manage change. Ultimately, the success of this transition will depend not only on the quality of algorithms but also on the ability of institutions to adapt to a new paradigm of care, centered not only on disease but on the entire spectrum of human health. Podcast: https://spotifycreators-web.app.link/e/aJ4xq1EwYOb Source: https://pubmed.ncbi.nlm.nih.gov/36109635/
Multimodal AI in Medicine
The growing availability of biomedical data from biobanks, electronic health records, medical imaging, wearable sensors, and environmental biosensors, together with the falling cost of genome and microbiome sequencing, has created the conditions for developing multimodal artificial intelligence (AI) solutions capable of capturing the complexity of human health and disease. This article is based on research conducted by Julián N. Acosta, Guido J. Falcone, Pranav Rajpurkar, and Eric J. Topol, affiliated with institutions such as the Yale School of Medicine, Harvard Medical School, and the Scripps Research Translational Institute. Multimodal AI has the potential to transform medicine from a series of point-in-time assessments into a more holistic and continuous view, improving diagnosis, prognosis, and treatment personalization. What Is Multimodal AI and What Are the Opportunities in Medicine? Multimodal AI combines data from different sources, such as genetic data, imaging, wearable sensors, and clinical data, to provide a deeper understanding of a patient's health. Currently, most AI applications in medicine focus on a single modality, for example a CT image or a retinal photograph, whereas a physician integrates multiple sources and modalities when making a diagnosis and choosing treatments. Multimodal AI could come closer to this complexity, broadening the range of applications and improving the precision and personalization of care. One of the most promising areas is personalized medicine. Advances in sequencing have enabled the collection of detailed molecular biology ("omics") data, which include the genome, proteome, transcriptome, epigenome, metabolome, and microbiome.
Integrating these data with other clinical information is essential to deepen the understanding of human health, enabling increasingly precise and personalized prevention, diagnosis, and treatment strategies. One of the main opportunities of multimodal AI lies in the use of so-called "digital twins." This approach involves creating virtual models that replicate the physiological behavior of an individual patient in order to simulate different therapeutic scenarios. For example, digital twins are used to simulate the progression of diseases such as Alzheimer's or multiple sclerosis and to test the possible impact of different therapies, reducing the time and cost required for clinical trials. This technology can predict the effectiveness of a specific treatment more accurately, improving the personalization of care and optimizing the management of healthcare resources. In addition, combining genetic data, epigenetic data, and information on social determinants of health makes it possible to address disparities in access to care. By integrating these different types of data, multimodal AI can identify at-risk populations and develop targeted interventions to prevent the onset of chronic diseases. Multimodal AI is not limited to diagnosis and therapy. Another key application concerns "virtual health assistants." These assistants can combine genetic data, clinical information, and biosensor data to offer personalized recommendations to patients, facilitating continuous health monitoring and promoting healthy behaviors. In a context where healthcare costs continue to rise, virtual assistants can reduce pressure on healthcare systems, improve adherence to therapy, and provide constant support, especially for patients with chronic diseases.
Digital Clinical Trials and Remote Monitoring Another important application area is digital clinical trials. Digitalizing clinical trials has the potential to reduce costs, overcome geographic and cultural disparities in access to participants, and improve data collection and quality. Recent developments suggest that integrating data from different wearable devices, such as heart rate, blood oxygen level, sleep quality, and physical activity measurements, could offer a more complete picture of the health status of clinical trial participants. Additionally, integrating devices such as environmental sensors and depth cameras can provide a holistic perspective, useful not only for clinical monitoring but also for adapting patients' living environments to their health conditions. Another significant innovation in the field of digital clinical trials is the so-called "synthetic control trial." In this approach, historical and external data are used to create synthetic control groups, reducing the need to enroll a large number of real participants. These virtual control groups allow comparable results to be obtained while reducing the overall cost and duration of the clinical trial. Moreover, adaptive clinical trials, which use real-time data to modify trial protocols, represent a further development in this field. This type of adaptation, facilitated by multimodal integration, allows for a more dynamic and appropriate response to changes in patients' health status during the trial, increasing the safety and effectiveness of the tested interventions.
Remote monitoring through wearable sensors and environmental biosensors is already demonstrating the possibility of replicating many hospital functions at home, improving patients' quality of life and reducing healthcare costs. A concrete example is the remote monitoring program for COVID-19 developed by the Mayo Clinic, which demonstrated the feasibility and safety of remotely monitoring people with COVID-19. However, remote monitoring still requires validation through randomized trials to demonstrate its safety compared with hospital admission. For this purpose, multimodal AI could play a key role in predicting imminent clinical deterioration, enabling timely intervention and reducing patient risks. The use of environmental sensors, such as wireless sensors, offers further possibilities for remote monitoring. For instance, sensors placed in home environments, such as cameras or microphones, can detect changes in patient behavior or events such as falls, providing an additional level of safety and support, especially for elderly patients or those with chronic conditions. Integrating these different data modalities allows continuous and accurate monitoring, which could significantly improve the quality of care provided, allowing timely interventions when needed. Digital Twin Technology and Pandemic Surveillance Digital twin technology, a concept borrowed from engineering, could represent a paradigm shift in personalized medicine. These virtual models could predict the effectiveness of specific therapeutic interventions for a patient, accurately modeling the effects on their health.
An example is the use of digital twins to improve clinical trials on Alzheimer's and multiple sclerosis, with the aim of testing new therapeutic strategies more quickly and at lower cost. Furthermore, digital twins can be used to simulate emergency health management scenarios. For example, digital twin models can be employed to predict the hospital capacity needed during a pandemic emergency, optimizing resource allocation and improving crisis response. This type of simulation was recently used to analyze the distribution and effectiveness of hospital resources during the COVID-19 pandemic, showing how integrating multimodal data (such as epidemiological data, bed availability information, and ventilation capacity) can provide a clearer and more strategic view of healthcare resource management. Multimodal AI can improve pandemic preparedness and response by facilitating more integrated and real-time surveillance. A concrete example is the DETECT study, launched by the Scripps Research Translational Institute, which used data from wearable sensors to identify COVID-19 infections and other viral diseases early, showing that combining self-reported symptoms with sensor metrics improved predictive capability compared with using either modality alone. Another emerging approach concerns the use of graph neural network models for integrating and analyzing multimodal data related to pandemic surveillance. These models can leverage the interconnections between different data sources (such as mobility, clinical outcomes, and contact tracing) to identify contagion patterns that are not evident with traditional methods, allowing a more timely and accurate response to epidemics.
This technology can also be extended locally, providing specific information for individual hospitals or geographic areas and improving public health management in emergency situations. Technical and Privacy Challenges Despite the opportunities, there are still many challenges to overcome. The multimodal nature of health data entails an intrinsic complexity in their collection, linking, and annotation. Creating well-structured and standardized multimodal datasets is crucial for the success of these approaches. Moreover, the "curse of dimensionality" describes the difficulty of integrating data with a high number of variables, which reduces the generalizability of AI solutions. One of the main difficulties is the effective integration of data from heterogeneous sources with different structural characteristics. Data from wearable devices, medical imaging, omics, and electronic health records vary widely in type and format. This variety makes data management complex and requires the creation of infrastructures and normalization tools that can ensure their compatibility and consistency. It has been shown that using techniques such as Graph Neural Networks can improve the ability to manage and integrate these different types of data, allowing a deeper analysis of interactions between different modalities and enhancing predictive accuracy. Furthermore, the lack of high-quality labeled data represents another significant obstacle. Building quality datasets requires accurate and standardized data collection and annotation, a process that is often costly and time-consuming.
Self-supervised learning and knowledge transfer techniques can help bridge this gap, allowing AI models to learn effectively from large amounts of unlabeled data. For instance, DeepMind's Perceiver framework has been proposed as a possible solution for handling data of different natures without resorting to specific architectures for each modality, thus improving data integration efficiency. Another relevant issue is data privacy protection. Integrating data from different sources increases the risk of re-identifying individuals, especially when dealing with sensitive health data. To address these challenges, several data protection techniques are emerging, such as federated learning, which allows AI models to be trained without transferring raw data from the original devices or institutions, thus ensuring a higher level of privacy. Federated learning has been successfully implemented in collaborations among multiple institutions to predict clinical outcomes in COVID-19 patients, demonstrating the feasibility of this technology even in complex and large-scale scenarios. Homomorphic encryption represents another solution that allows mathematical operations to be performed on encrypted data without the need to decrypt them, thus ensuring the security of information during the training process. Alternatively, swarm learning offers a newer possibility of training models on local data without the need for a trusted central server, instead leveraging blockchain-based smart contracts to ensure the integrity of the distributed training process. Future Perspectives Multimodal AI has the potential to profoundly transform medicine, making it more personalized, predictive, and accessible.
However, to realize this potential, significant efforts are required from the medical community and AI researchers to build and validate new models and demonstrate their usefulness in improving clinical outcomes. One of the main future perspectives for multimodal AI is the creation of unified platforms capable of combining and analyzing a wide range of biomedical data, including genetic information, clinical data, and real-time health parameters. Such platforms could provide a detailed and integrated picture of patient health, enhancing doctors' ability to make more informed decisions. The main challenge in this area concerns the development of models that can operate effectively with heterogeneous data while maintaining privacy and information security. For example, integrating omics data with imaging and clinical data has the potential to lead to an even more precise understanding of human health and to more effective personalized therapeutic strategies. Furthermore, the growing spread of wearable devices and environmental sensors will enable increasingly granular and continuous data collection, fostering the adoption of preventive rather than reactive approaches. Using multimodal AI in combination with these technologies could help monitor patients constantly and provide early warnings about potential health risks. The possibility of creating medical digital twins could progress further, evolving from a tool for simulating specific therapeutic scenarios into a resource for continuously optimizing a patient's care path, thanks to the progressive integration of data as they are acquired.
This would allow for a dynamic and updated patient model, useful not only for treatment selection but also for predicting clinical outcomes and identifying potential complications early. Greater collaboration between healthcare systems, research groups, and industry will be crucial to collect the necessary data and demonstrate the value of these approaches in everyday clinical practice. New privacy technologies, such as federated learning and homomorphic encryption, could be further strengthened and combined with other techniques to ensure that the benefits of multimodal AI are obtained without compromising data security. This approach is particularly important to facilitate data sharing among different institutions, enhancing AI's ability to learn from a larger number of cases without compromising patient confidentiality. Another expanding area concerns adaptive digital clinical trials, where multimodal AI can be used to dynamically monitor and modify trial protocols based on real-time results. This will accelerate the development of new therapies, reducing the time and costs associated with traditional clinical trials. In the long term, multimodal artificial intelligence is set to become an essential component in the development of increasingly advanced precision medicine models. These models will optimize treatments for each patient, considering their specific clinical history, individual genetic profile, and current health conditions, with the goal of significantly improving both clinical outcomes and quality of life.
Conclusions Multimodal artificial intelligence represents a strategic turning point for the future of healthcare, but its adoption poses challenges and opportunities that transcend the technological sphere to profoundly impact the organizational, economic, and social models of the healthcare system. The possibility of integrating heterogeneous data, such as genetics, imaging, sensors, and social determinants, paves the way for precision medicine that goes beyond the traditional dichotomy between generic approaches and personalized therapies. However, this transformation implies a paradigm shift that concerns not only treatments but also the very structure of healthcare institutions, the training of professionals, and the active participation of patients. Healthcare based on multimodal AI will not be limited to treating disease but will move towards managing health as a continuous and dynamic asset. This approach entails redefining the concept of value in the healthcare system, shifting it from the efficiency of episodic treatments to the ability to anticipate and prevent future complications. The economic implications are profound: on the one hand, it reduces costs related to avoidable hospitalizations and late treatments; on the other, it requires massive initial investment to create technological infrastructures capable of supporting this integration. The adoption of technologies such as "digital twins" and adaptive clinical trials could give rise to new ways of allocating healthcare resources, pushing towards a more equitable and proactive system. A crucial aspect is the impact of multimodal AI on inequalities in access to care. If managed with a strategic vision, this technology can reduce health disparities, but without an ethical approach and solid governance policies, it risks exacerbating them.
The integration of social determinants of health data could, for example, identify the most vulnerable populations, but without targeted intervention, mere awareness of disparities will not lead to significant changes. Moreover, dependence on advanced technologies could create new barriers for sections of the population that are less digitally literate or lack access to adequate tools. From a strategic point of view, multimodal AI represents an opportunity to redesign the relationship between doctor and patient. The traditional model, centered on the physician's authority, gives way to a collaboration based on data and virtual assistants that support the patient in monitoring their health. This change requires a reconfiguration of the professional role of doctors, who will need to acquire technological skills to interpret and exploit the data provided by AI. At the same time, there is a risk of dehumanizing care if the focus on technology is not balanced by renewed attention to the empathetic and relational aspects of clinical work. Another fundamental element is the trust of patients and healthcare workers in the AI-based system. Data protection, often relegated to a technical dimension, assumes strategic relevance in terms of maintaining the legitimacy of the healthcare system. Solutions such as federated learning and homomorphic encryption should not be seen only as security tools but as mechanisms to build a new ethic of data management, capable of balancing innovation with privacy protection. Transparency in the use of collected information will be crucial to avoid AI adoption being perceived as a threat to personal control over one's health. Finally, the integration of multimodal AI into the healthcare system could redefine the boundaries between public and private health.
Applications for pandemic surveillance and for managing hospital emergencies show that this technology is not only an opportunity to improve individual treatments but also a tool for managing population health more effectively and resiliently. This, however, requires unprecedented collaboration among governments, industry, research, and citizens. Without strategic coordination, fragmented initiatives risk producing inefficiencies, duplication, and conflicts of interest. Multimodal AI is not just a technological evolution but a lever for rethinking the healthcare system as an integrated ecosystem. Its applications should be assessed not only in terms of scientific innovation but as catalysts of a systemic transformation that requires vision, leadership, and the ability to manage change. Ultimately, the success of this transition will depend not only on the quality of the algorithms but on the ability of institutions to adapt to a new paradigm of care, centered not only on disease but on the entire spectrum of human health.

Podcast: https://spotifycreators-web.app.link/e/qf3pCIfwYOb
Source: https://pubmed.ncbi.nlm.nih.gov/36109635/
OWASP Top 10 LLM: Ten Vulnerabilities for LLM-Based Applications
The security of applications built on large language models (LLMs) is becoming more relevant as these technologies are integrated into enterprise systems and public-facing services. The new 2025 edition of the OWASP Top 10 LLM ranking of vulnerabilities in LLM-based applications describes the most critical risks these applications face. This article summarizes the work carried out by the OWASP team and the institutions involved, based on contributions from security experts, developers, and data scientists from various sectors. We will explore each of the identified vulnerabilities, providing concrete examples and possible mitigation strategies.

Prompt Injection (OWASP Top 10 LLM)

Prompt Injection occurs when user input manages to manipulate the behavior of a language model, changing its output or actions in unintended ways. The vulnerability can be exploited intentionally, by malicious actors who supply input designed to deceive the model, or accidentally, when unexpected input leads the system to behave incorrectly. A particularly difficult aspect is that prompt injection attacks need not be visible or readable by humans: any content the model can interpret can potentially influence its behavior. There are two main types of Prompt Injection attack: direct and indirect. Direct attacks occur when the attacker directly supplies a command or input that induces the model to perform unwanted actions, such as ignoring safety guidelines, revealing sensitive information, or even carrying out dangerous actions such as accessing unauthorized resources.
Indirect attacks, by contrast, arrive through input from external sources, such as files or websites, that contain instructions the model can interpret and that alter its behavior. A new emerging challenge concerns multimodal models, which are designed to handle several types of input, such as text and images, at the same time. Attackers could, for example, hide instructions inside images that accompany text. These "cross-modal" attacks significantly enlarge the attack surface and make defending against prompt injection much harder. The impact of a prompt injection attack can be devastating: from the disclosure of sensitive information, to the bypassing of system security measures, to the manipulation of the model's critical decisions. For example, an attacker could use hidden prompts to make a customer support chatbot ignore all internal security rules, allowing access to confidential personal data. Mitigating the risk of prompt injection requires several protection strategies. First, constraining the model's behavior by precisely defining its roles and capabilities is a crucial step. Giving the model clear instructions about what is and is not allowed helps prevent unwanted deviations. It is also important to filter inputs and outputs, using semantic tools that can identify potentially harmful content. For example, implementing input validation checks and content filtering rules can help reduce the risk of malicious input. Adopting a "human-in-the-loop" approach can also contribute to security.
This approach requires that certain high-risk actions be confirmed by a human operator before they are executed, limiting the chance that a malicious prompt leads to serious consequences. In addition, segregating external content and clearly identifying which data come from untrusted sources further reduces the potential impact of a prompt injection attack. Finally, testing the model regularly through attack simulations and penetration testing techniques can help uncover flaws in security controls before they are exploited. These tests should treat the model as an untrusted user in order to assess the effectiveness of trust boundaries and access controls.

Sensitive Information Disclosure (OWASP Top 10 LLM)

Sensitive Information Disclosure occurs when a language model handles personal or confidential data without adequate security controls. The problem can have serious consequences, especially when information is inadvertently disclosed during interaction with the model or because of poor management of training data. The nature of these models, which are trained on vast amounts of data, can lead to situations in which private details are unexpectedly revealed if the data were not adequately filtered. One of the most common cases involves the leakage of personally identifiable information (PII), such as names, addresses, phone numbers, and other sensitive details. For example, where an LLM is used for customer support, it could inadvertently reveal another user's personal data if adequate access controls are not in place.
This can happen when the model was trained on data that were not fully anonymized, or when information is stored without adequate protection. Another significant risk is the exposure of proprietary algorithms or internal details of an organization. For example, a model used to solve business problems could accidentally reveal confidential information about proprietary algorithms or methodologies, exposing the company to security risks and loss of competitive advantage. This kind of disclosure can result not only from errors in output handling but also from targeted attacks that exploit weaknesses in prompts or training data. To mitigate these risks, it is essential to sanitize data during the training process, ensuring that any personal or sensitive data is removed or masked. Sanitization must be applied not only to training data but also to the input users provide in real time. In addition, federated learning can reduce the need to transfer sensitive data to a single centralized location, lowering the risk of exposure. Implementing access controls based on the principle of least privilege is another key measure for preventing the disclosure of sensitive information. Under this approach the model has access only to the information strictly necessary to perform its task, which limits the chance that confidential information is processed or disclosed by mistake. Another useful technique is differential privacy, which adds "noise" to the data so that user-specific information cannot be reconstructed from the results the model generates. Educating users on the safe use of LLMs is equally important.
Users must be aware of the risks of entering sensitive data and should receive guidelines on how to interact with the model safely. For example, the terms of service should make clear that submitted data may be used to improve the model, and users should be given the option to exclude their data from training. Finally, it is essential to configure the system correctly so that confidential information does not end up in system prompts or in the output the model generates. Infrastructure security should follow best practices such as those defined by OWASP, including secure API configuration and the masking of error messages to avoid leaking critical information.

Supply Chain Vulnerabilities (OWASP Top 10 LLM)

Vulnerabilities in the supply chain of LLM-based applications are a significant risk because they can compromise the integrity of models, training data, and distribution platforms. They can arise from a variety of external elements, such as pre-trained models or third-party software components. Using publicly available pre-trained models, for example, carries an inherent risk because such models could contain biases or even malicious backdoors, introducing weaknesses that are hard to detect. A critical aspect is the use of obsolete models that are no longer updated or maintained. Adopting unsupported models or software components is a common security flaw, similar to those described in other areas of information security (such as running unpatched software), but with a potentially much greater impact given the pervasive use of LLMs in critical contexts.
If a model is not updated, the vulnerabilities discovered in it can be exploited by malicious actors, leading to possible data breaches or attacks on the system. Another risk concerns fine-tuning methods based on techniques such as Low-Rank Adaptation (LoRA). Although these techniques allow more efficient adaptation and improve model performance, they also introduce new risks. An attacker could exploit weaknesses in these adaptations to compromise their integrity, manipulating the base model at the level of a single component and inserting unwanted behavior. For example, a malicious LoRA adapter could be loaded from an unverified source, compromising the entire system. In addition, collaborative development and model merging processes, such as those widely adopted on platforms like Hugging Face, are a notable attack surface. Model sharing platforms are often vulnerable to compromise caused by configuration errors or inadequate security controls. Model tampering attacks could include directly modifying a model's parameters to insert backdoors or biases that go undetected in ordinary use. To mitigate these risks, it is essential to maintain an accurate, up-to-date inventory of all components used in the supply chain, using tools such as the Software Bill of Materials (SBOM), which makes it possible to verify the provenance and security of software components and pre-trained models. This allows known vulnerabilities to be identified quickly and the overall security of the system to be assessed. AI Red Teaming, in which specialized teams simulate attacks to find vulnerabilities, can be extremely effective for testing how well models and components withstand real threats.
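One simple building block for the provenance checks described above is to pin a cryptographic digest for each reviewed artifact and refuse to load anything that does not match. The sketch below is illustrative only: the `verify_artifact` helper and the `pinned` mapping are hypothetical names, and recording digests next to an SBOM entry is an assumption, not a procedure prescribed by OWASP.

```python
import hashlib
import hmac
from pathlib import Path
from typing import Dict


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large model files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path: Path, pinned: Dict[str, str]) -> bool:
    """Return True only if the file name is pinned and its digest matches.

    `pinned` is a hypothetical mapping of file name -> expected SHA-256,
    recorded when the artifact (for example a LoRA adapter) was reviewed.
    """
    expected = pinned.get(path.name)
    if expected is None:
        return False  # unknown artifacts are rejected by default
    return hmac.compare_digest(sha256_of(path), expected)
```

A loader would call `verify_artifact` before deserializing the file and abort on `False`. A digest check only shows that the file is the one that was reviewed; it says nothing about whether that reviewed file was safe in the first place.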
It is equally important to continuously monitor and verify the security of collaborative development environments, introducing auditing mechanisms that detect anomalies or abuse promptly. A policy of constant updating and patching for the components used in models is also crucial to ensure that each vulnerability is fixed as quickly as possible, limiting exposure to exploits. Finally, encrypting models, especially those deployed on local devices, and adding integrity checks can prevent tampering and limit unauthorized access.

Data and Model Poisoning (OWASP Top 10 LLM)

Data and Model Poisoning occurs when the data used to train the model are manipulated to introduce vulnerabilities or biases, or to deliberately compromise the model. This type of attack can degrade the model's performance, leading to wrong decisions or unexpected behavior. One of the main risks is that training data, especially data from external sources, may contain malicious content that alters the model's ability to make accurate predictions. This is particularly true when models are trained on unverified datasets or on data collected from public environments, where attackers can easily inject adversarial content. For example, an attacker could manipulate the dataset by inserting specific examples designed to teach the model to behave incorrectly in certain situations. This type of attack, known as backdoor insertion, can leave the model apparently normal until a specific trigger is activated that alters its behavior.
Such an attack could allow the attacker to bypass security measures or directly manipulate the model's responses. To mitigate these risks, it is essential to implement data traceability measures. Tools such as the Machine Learning Bill of Materials (ML-BOM) help track the origin and transformations of data throughout the model's life cycle. Data validation is just as important: every piece of data should go through a rigorous verification process before being used for training, especially if it comes from external or collaborative sources. Another effective strategy is to use data version control (DVC) to track every change to the datasets. This helps detect data manipulation and preserve the integrity of the whole model development process. In addition, adversarial training techniques prepare the model to withstand attacks, improving its robustness against malicious perturbations. A further step to prevent model poisoning is to adopt sandboxing to limit the model's exposure to unverified data. Creating an isolated environment in which to test new data before actually using them for training reduces the risk of compromising the model. Finally, monitoring and anomaly detection during training make it possible to identify unexpected model behavior that could indicate the presence of poisoned data.

Improper Output Handling (OWASP Top 10 LLM)

Improper handling of the output generated by LLMs can expose applications to a wide range of vulnerabilities, including remote code execution (RCE), cross-site scripting (XSS), and SQL injection.
The problem arises when the output produced by the model is used without adequate validation or sanitization. Because LLMs generate text based on potentially unverified input, they can be exploited to introduce malicious commands that are then executed by later components in the application chain. For example, model output that is passed to a system shell without being checked could allow an attacker to execute arbitrary commands, compromising the entire system. Similarly, SQL queries generated by the LLM and used to access databases without proper parameterization could lead to SQL injection, allowing unauthorized access to data. In web contexts, unsanitized output displayed in a browser can result in cross-site scripting (XSS) attacks, in which the attacker introduces malicious scripts that are executed by the user's browser. To mitigate these risks, it is essential to treat every output generated by the model as potentially dangerous and to apply rigorous validation and sanitization. Context-aware controls, such as encoding the output according to the target environment (HTML, SQL, JavaScript), are essential to ensure that generated content cannot be used maliciously. Using parameterized queries for all database operations reduces the risk that unverified input alters the intended operations. In addition, a Content Security Policy (CSP) can limit the impact of XSS attacks by preventing the execution of unauthorized scripts. Advanced logging and monitoring systems can help detect anomalous behavior in the output generated by models.
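The context-aware encoding and parameterized queries mentioned above can be sketched with the Python standard library. This is a minimal illustration under stated assumptions: `render_reply`, `find_orders`, and the `orders` table are hypothetical names chosen for the example, and a real application would combine these measures with validation of the output itself.

```python
import html
import sqlite3
from typing import List, Tuple


def render_reply(llm_output: str) -> str:
    """Encode model output for an HTML context before it reaches the browser."""
    return "<p>" + html.escape(llm_output) + "</p>"


def find_orders(conn: sqlite3.Connection, customer: str) -> List[Tuple]:
    """Bind the model-supplied value as a parameter instead of splicing it
    into the SQL string, so it is treated as data and never as SQL."""
    cur = conn.execute(
        "SELECT id, item FROM orders WHERE customer = ?", (customer,)
    )
    return cur.fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, item TEXT, customer TEXT)")
    conn.execute("INSERT INTO orders VALUES (1, 'eggs', 'alice')")
    print(render_reply("<script>alert(1)</script>"))
    print(find_orders(conn, "alice' OR '1'='1"))
```

With these two functions the script tag is rendered as inert text and the injection attempt matches no rows, because the whole string is compared against the `customer` column as a literal value.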
For example, continuously monitoring LLM-generated content and flagging suspicious patterns adds a further layer of security and allows a rapid response when malicious activity is detected. It is also important to define rate limits and usage quotas to prevent abuse, especially where the model has access to critical functionality or sensitive resources. Ultimately, handling output correctly means taking a "zero trust" approach to generated content, treating the model as a possible attack vector and putting in place all the safeguards needed to protect downstream systems from compromise.

Excessive Agency (OWASP Top 10 LLM)

Excessive Agency refers to the excessive autonomy granted to a large language model (LLM), which can lead the model to take critical actions without adequate human supervision. LLMs with too much autonomy can make decisions or perform operations that fall outside their intended scope, potentially causing harm or security breaches. The risk is made more acute by the growing spread of agent-based architectures, in which an LLM is used as the decision point for carrying out various actions. In an LLM application, autonomy can include the model's ability to invoke system functionality, access external resources, or communicate with other parts of a system without human confirmation. This ability is useful for automating tasks, but it also introduces vulnerabilities when control over actions is not sufficiently restricted. A common example of excessive autonomy is an LLM used as an email assistant that can not only read emails but also send and delete them.
This kind of access exposes the system to significant risk, especially if an attacker manages to manipulate the LLM through malicious prompts or compromised external data. If the model is not designed to require human confirmation before performing certain operations, an attack could lead to unauthorized emails being sent or critical information being deleted. Another example is the use of unnecessary plugins or extensions that widen the range of functionality available to an LLM. If a model is enabled to interact with a file management system, and the extension allows files to be both read and modified, unintended behavior or a targeted attack could lead to sensitive data being modified or deleted. Plugins with extended functionality that is not strictly necessary for the intended operation are a risk vector because they offer additional entry points that can be exploited. A related problem is excessive permissions. Very often, LLMs are configured to operate with excessive privileges, giving them access to functionality or resources that are not essential to their operation. For example, an extension that only needs to read data from a database might be configured with write, modify, or delete permissions, creating a larger attack surface. This kind of misconfiguration leaves the system vulnerable not only to attacks but also to errors caused by unexpected model behavior. To mitigate the risk of excessive autonomy, it is essential to minimize the extensions and functionality available to the LLM. Extensions should be limited to the operations that are strictly necessary, reducing the model's ability to perform harmful actions.
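For the email assistant example, limiting extensions to what is strictly necessary could look like the sketch below. It is a hypothetical illustration: `MAILBOX`, `read_email`, `ALLOWED_TOOLS`, and `dispatch` are invented names, and the in-memory dictionary stands in for a real mail backend.

```python
from typing import Callable, Dict

# Hypothetical in-memory mailbox standing in for a real mail backend.
MAILBOX: Dict[int, str] = {1: "Quarterly report attached", 2: "Lunch on Friday?"}


def read_email(message_id: int) -> str:
    """Read-only tool: returns the body of one message."""
    return MAILBOX[message_id]


# Only the read-only tool is registered. Sending and deleting are not
# exposed to the model at all, so a manipulated prompt cannot trigger them.
ALLOWED_TOOLS: Dict[str, Callable[..., str]] = {"read_email": read_email}


def dispatch(tool_name: str, **kwargs) -> str:
    """Run a tool requested by the model, rejecting anything not registered."""
    tool = ALLOWED_TOOLS.get(tool_name)
    if tool is None:
        raise PermissionError("tool not permitted: " + tool_name)
    return tool(**kwargs)
```

Here `dispatch("read_email", message_id=1)` returns the message body, while a request for a `delete_email` tool raises `PermissionError` because that tool was never registered. The design choice is to deny by default: capabilities are added deliberately rather than removed after the fact.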
It is essential to apply the principle of least privilege, ensuring that each extension or plugin runs with the lowest possible privileges, only those needed for its specific operation. That way, even if the model were compromised, the actions it could take would be severely limited. In addition, human-in-the-loop mechanisms are crucial to ensure that all high-impact actions require confirmation from a human operator before being executed. For example, if an LLM is used to generate content for social media, the final publication should always be approved manually by a human operator to avoid errors or abuse. Finally, it is important to monitor the model's activity continuously, logging all operations performed and identifying anomalous behavior. This kind of logging helps detect suspicious activity quickly and respond effectively. Rate limits and restrictions on the number of actions an LLM can take in a given time interval also help prevent abuse and limit the impact of any compromise. The risk of Excessive Agency is therefore closely tied to how the capabilities and permissions granted to LLMs are managed. A well-designed architecture that adopts mitigations such as least privilege, human supervision of critical actions, and continuous activity monitoring can significantly reduce exposure to this type of vulnerability and keep the LLM operating within safe, controlled limits.

System Prompt Leakage (OWASP Top 10 LLM)

System Prompt Leakage concerns the risk that system prompts, the instructions used to guide the model's behavior, contain sensitive information that is not meant to be disclosed.
System prompts are designed to give the model the directives it needs to generate appropriate output, but they may unintentionally include confidential or critical data. When this information is discovered, it can be used to facilitate other types of attack, creating a significant danger to the security of the system. A common example of System Prompt Leakage occurs when prompts contain access credentials, API keys, or configuration details that should remain secret. An attacker who manages to extract these prompts can use them to gain unauthorized access to system resources, with potentially very serious consequences. A specific case reported in the 2025 OWASP research shows how, in several enterprise contexts, information such as the internal permission structure or a user's financial transaction limits was unintentionally exposed, increasing the risk of privilege escalation attacks or bypassing of security limits. System Prompt Leakage can also reveal internal filtering criteria used to prevent the model from giving sensitive answers. For example, a system prompt might contain an instruction such as: "If a user requests information about another user, always reply with 'Sorry, I cannot assist with this request'". An attacker who could view this prompt might use it to work around the security measures and manipulate the model's behavior in unintended ways. To mitigate the risk of System Prompt Leakage, it is essential to separate sensitive data from system prompts and avoid including any critical information directly in them. It is preferable to manage confidential information through systems external to the model, ensuring that the model has no direct access to such data.
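One way to keep credentials out of the system prompt is to hold them in the application's environment and attach them only in code the model never sees. The sketch below assumes a hypothetical `ORDERS_API_KEY` environment variable and a hypothetical `/orders/{id}` backend path; the names `get_api_key` and `build_request` are illustrative.

```python
import os
from typing import Dict

# The prompt carries behavioral instructions only: no keys, no credentials.
SYSTEM_PROMPT = "You are a support assistant. Answer questions about order status."


def get_api_key(env_var: str = "ORDERS_API_KEY") -> str:
    """Read the secret from the environment (hypothetical variable name)."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError("missing secret: " + env_var)
    return key


def build_request(order_id: str) -> Dict[str, object]:
    """Build the backend request in application code, outside the model.

    The model may ask for an order lookup, but the credential is attached
    here and never enters the prompt or the model's context.
    """
    return {
        "url_path": "/orders/" + order_id,
        "headers": {"Authorization": "Bearer " + get_api_key()},
    }
```

Even if an attacker extracts `SYSTEM_PROMPT` in full, it reveals no credential. Authorization for the lookup itself would still need to be checked by the backend rather than by the model.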
Another effective approach is to implement guardrails outside the model: while training for specific behaviors can be useful, it does not guarantee that the model will always follow instructions, especially under attack. An independent system that checks the output for conformity with expectations is preferable to relying on system prompt instructions alone. A crucial mitigation strategy is to ensure that security controls are enforced independently of the LLM. Critical controls such as privilege separation and authorization checks must be performed deterministically and verifiably, and should never be delegated to the model. For example, if an LLM agent performs tasks that require different levels of access, several agents should be used, each configured with the minimum privileges needed for its task, reducing the risk of accidental exposure of sensitive data. In short, the risk associated with System Prompt Leakage is not simply the disclosure of the prompts themselves but the presence of sensitive data or excessive authorizations within them. Implementing robust external controls and restricting prompt content to non-sensitive information are essential steps for protecting the integrity and security of LLM-based applications.

Vector and Embedding Weaknesses (OWASP Top 10 LLM)

Weaknesses in embeddings and vectors are another important security risk for LLMs. Embeddings are numerical representations that capture the meaning of text and are fundamental to how LLMs work. However, these representations can be exploited to manipulate the model or extract sensitive information, especially if they are not protected by adequate security controls.
One of the main vulnerabilities is embedding inversion, a type of attack in which an attacker uses embedding vectors to reconstruct sensitive information originally contained in the training data. This inversion process can reveal private user details or proprietary data used to train the model, compromising privacy. A concrete example reported in the 2025 OWASP research shows how an attacker managed to recover personal information, such as names or addresses, by analyzing the embedding vectors generated by an inadequately protected LLM. Embeddings can also become vulnerable because of insufficient access controls. In systems that use Retrieval-Augmented Generation (RAG), the information contained in vectors can be retrieved and combined with new queries, creating the risk that sensitive data leaks between different users or usage contexts. For example, in multi-tenant environments, an error in the logical separation of requests could cause one user to receive information belonging to another, creating a confidentiality problem. To mitigate these risks, it is essential to implement granular access controls that restrict the use of embeddings to safe, verified contexts. Embeddings should be managed so that access is strictly controlled and authorized only for specific purposes. In addition, techniques such as encrypting the data in embeddings can help prevent inversion and information leakage. It is equally important to establish strict data validation policies to ensure that the information used to create embeddings is clean and comes from reliable sources.
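The multi-tenant separation described above can be illustrated by applying the tenant filter before any similarity ranking takes place. This is a toy sketch: the `Chunk` record, the `retrieve` function, and the in-memory list used as a store are assumptions made for the example, and a production vector database would normally enforce the same boundary through its own filtering or namespace features.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Chunk:
    tenant_id: str        # owner of the document this chunk came from
    text: str
    vector: List[float]   # embedding of `text`


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity, returning 0.0 when either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(store: List[Chunk], tenant_id: str, query: List[float], k: int = 3) -> List[Chunk]:
    """Return the top-k chunks for one tenant only.

    The tenant boundary is enforced first, so another tenant's chunks are
    never ranked or returned, however similar they are to the query.
    """
    allowed = [c for c in store if c.tenant_id == tenant_id]
    return sorted(allowed, key=lambda c: cosine(c.vector, query), reverse=True)[:k]
```

The `tenant_id` passed to `retrieve` should come from the authenticated session, not from the prompt or the model, so that the check stays deterministic.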
A further mitigation step is to continuously monitor the use of embeddings and RAG resources, keeping a detailed log of access activity. This makes it possible to detect promptly any anomalous behavior that could indicate attempts at manipulation or unauthorized access. Monitoring can be paired with anomaly detection techniques to identify possible attacks quickly and mitigate their impact. In short, weaknesses in embeddings and vectors are a significant challenge for LLM security. Enforcing strict access controls, encrypting data, and constantly monitoring activity are all fundamental measures for protecting these critical elements and ensuring the security and confidentiality of LLM-based applications.

Misinformation (OWASP Top 10 LLM)

Misinformation is one of the most critical risks in the use of LLMs, because models can generate content that appears accurate but is completely wrong or misleading. The risk is amplified by the ability of LLMs to produce answers that sound credible but rest on incorrect data or mistaken interpretations. Misinformation can lead to security breaches, reputational damage, and even legal consequences, especially where the reliability of information is crucial, as in healthcare, finance, or law. One of the main causes of misinformation is the phenomenon of hallucinations, that is, the model's tendency to "invent" answers where concrete data are missing. When the LLM lacks precise information on a given topic, it can fill the gaps with statistically generated content that seems accurate but is in fact made up.
For example, the 2025 OWASP research documents cases in which LLMs supplied non-existent legal references or health details with no scientific basis. This kind of misinformation can lead users to make wrong decisions, with potentially harmful consequences. A related problem is the excessive trust users may place in LLM-generated content. Because answers often appear very confident and detailed, users tend not to verify their accuracy and incorporate incorrect information into decision-making without adequate checks. This can be particularly risky in sensitive contexts. For example, a medical chatbot that provides incorrect information could harm a patient's health, and a model used in finance could lead to disastrous economic decisions. To reduce the risk of misinformation, one effective strategy is Retrieval-Augmented Generation (RAG), which lets the model draw on up-to-date, verified information sources while generating answers. This approach reduces the risk of hallucinations, since answers are grounded in concrete data rather than in statistical generation alone. It is also important to build human oversight into decision-making, especially in critical domains: manually verifying the information the model generates can improve overall accuracy and reduce the spread of incorrect content. Another mitigation technique is refining the model through fine-tuning and the use of embeddings that improve the quality of the answers.
Techniques such as parameter-efficient tuning (PET) and chain-of-thought prompting can significantly reduce the incidence of misinformation, because they let the model reason in a more structured way and check the consistency of what it generates.
Finally, it is essential to educate users about the limits of LLMs and the importance of independently verifying generated content. Specific training, particularly in sensitive contexts, helps users avoid over-reliance on model output and develop a more critical approach to these technologies.
In conclusion, misinformation is a central vulnerability for LLMs, but it can be mitigated with a multidimensional approach that combines external sources, human oversight, continuous model refinement, and user education. Only rigorous control and constant verification can minimize the risks of these models spreading incorrect information.
Unbounded Consumption (OWASP Top 10 LLM)
Unbounded Consumption refers to the risk that an LLM uses computational resources in an uncontrolled way, with possible consequences such as denial of service or high operating costs. LLMs, especially those hosted in cloud environments with pay-per-use pricing, can be vulnerable to excessive and unauthorized use, leading to unsustainable costs for the organization that runs them.
A common example of this risk is the so-called Denial of Wallet (DoW), in which an attacker exploits pay-per-use billing by sending a continuous stream of expensive requests to the model, driving up the cost of the service significantly.
This type of attack can not only damage the organization financially but also have operational consequences, limiting the availability of the service for legitimate users. The 2025 research reports specific cases in which a company's operating costs grew exponentially because of a DoW attack, showing how significant a financial threat this can be.
Another typical Unbounded Consumption scenario occurs when users repeatedly submit very complex inputs or long sequences, causing disproportionate use of the model's resources. In these cases the system can become slow or even stop responding under the computational pressure. An example would be linguistically intricate requests that require substantial processing and result in inefficient use of CPU and memory.
To mitigate these risks, it is essential to implement rate limits and usage quotas that cap the number of requests a single user can make in a given period. This prevents resource abuse and ensures a fair distribution of computational capacity among users. The OWASP research also stresses the importance of limiting the exposure of logits and other sensitive information in API interactions, which reduces the avenues available for attacking the model.
Another effective approach is continuous resource monitoring, which makes it possible to detect anomalous usage and respond quickly to suspicious behavior. Alerting and rate limiting can be configured to step in automatically when model usage exceeds set thresholds, so that resource consumption stays within manageable limits.
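The rate limits and usage quotas described above can be sketched as follows. This is a minimal illustration and not taken from the OWASP document: the class names `TokenBucket` and `UsageGuard`, the default thresholds, and the in-memory dictionaries are all assumptions, and a real deployment would typically persist counters and reset the quota on a schedule.

```python
import time


class TokenBucket:
    """Refills at `rate` requests per second, up to `capacity` (the burst size)."""

    def __init__(self, rate, capacity, now=0.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.updated = now

    def allow(self, now, cost=1.0):
        # Add the tokens earned since the last call, capped at capacity.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class UsageGuard:
    """Hypothetical guard: per-user rate limit plus a per-user token quota."""

    def __init__(self, rate=1.0, burst=3, token_quota=10_000):
        self.rate = rate
        self.burst = burst
        self.token_quota = token_quota  # illustrative cap; never reset in this sketch
        self.buckets = {}
        self.used_tokens = {}

    def check(self, user, requested_tokens, now=None):
        now = time.monotonic() if now is None else now
        if user not in self.buckets:
            self.buckets[user] = TokenBucket(self.rate, self.burst, now)
        if not self.buckets[user].allow(now):
            return "rate_limited"
        used = self.used_tokens.get(user, 0)
        if used + requested_tokens > self.token_quota:
            return "quota_exceeded"
        self.used_tokens[user] = used + requested_tokens
        return "ok"
```

A caller would run `guard.check(user_id, estimated_tokens)` before forwarding a request to the model and reject or queue anything that does not return `"ok"`. Checking the quota on estimated tokens rather than on request count is one way to bound cost as well as frequency, which is the concern behind Denial of Wallet.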
Finally, it is worth considering graceful degradation. Under excessive load, the system can be designed to keep partial functionality rather than shut down completely. This ensures that at least some services remain available during attacks or heavy overloads, reducing the negative impact on the end user's experience.
These multidimensional approaches are essential for addressing the risk of Unbounded Consumption in LLM applications and for ensuring service continuity, economic sustainability, and operational security in deployments based on these models.
Conclusions
The growing integration of large language models (LLMs) into business processes and public-facing services has brought greater attention to their security and highlighted the need to address new vulnerabilities. These risks, although technical, have deep strategic implications for companies, especially in terms of trust, reputation, compliance, and economic sustainability. Understanding the vulnerabilities identified in the 2025 OWASP Top 10 LLM report makes it possible to explore strategies that mitigate the risks while maximizing the value derived from these technologies.
A key point is that the vulnerabilities are not limited to the technology itself but often arise from the interaction between models, data, and business processes. For example, "Prompt Injection" is not only a technical challenge: it calls into question the reliability of the model as a decision-making tool. When a model can be manipulated through malicious input, companies must rethink how much they trust its output and build more resilient ecosystems.
Adopting approaches such as "human-in-the-loop" is not just a security measure but a strategic choice for balancing automation and human control, preserving the quality of decisions in critical scenarios.
"Sensitive Information Disclosure" shows instead how fragile the boundary is between technological innovation and privacy protection. Companies can no longer treat data security as a separate technical requirement; they must integrate it into their governance strategies. This means building systems that go beyond simple anonymization and embrace concepts such as differential privacy and federated learning. These approaches not only reduce risk but also offer a competitive advantage in a context where consumer trust is a strategic asset.
Supply chain vulnerabilities show how AI security depends on complex networks of suppliers and partners. Relying on pre-trained models or third-party components introduces systemic risks that require proactive management. Companies must start treating the security of the model supply chain as an integral part of their risk management strategy, adopting tools such as the Software Bill of Materials (SBOM) to ensure transparency and control.
"Misinformation" is the vulnerability with the broadest strategic consequences, because it undermines not only the credibility of the technology but also that of the companies that use it. Companies must meet this challenge by adopting a model of accountability toward end users. This means not only implementing verification and oversight systems but also educating the public about the limits of the technology. That awareness can turn a reputational risk into an opportunity to strengthen trust.
Finally, the risk of "Unbounded Consumption" shows that adopting LLMs is a matter not only of technological innovation but also of economic sustainability. Inefficient resource management can quickly become a financial problem, which makes monitoring and control mechanisms essential. The concept of "denial of wallet" also introduces a new perspective on the cost of AI, pushing organizations toward architectures that balance performance and protection.
Companies that want to exploit the potential of LLMs must adopt an integrated view that goes beyond technical security and takes a strategic approach covering trust, governance, resilience, and economic sustainability. This requires rethinking the entire deployment lifecycle, from design to operations, to build systems that are not only secure but also aligned with business objectives and able to respond to future challenges.
Podcast: https://spotifycreators-web.app.link/e/u8QkLNTqXOb
Source: https://genai.owasp.org/resource/owasp-top-10-for-llm-applications-2025/
Gaming and artificial intelligence. BALROG, the new standard for LLMs and VLMs
The rapid evolution of Large Language Models (LLMs) and Vision Language Models (VLMs) has rekindled interest in building general agents that can autonomously pursue complex goals. These models hold a vast store of knowledge and have shown promising reasoning abilities in specific settings. However, they still have considerable limitations in complex, dynamic environments that require long-term planning, continuous exploration, and the handling of intricate interactions. BALROG was developed to address this problem: it is a benchmark designed to evaluate the agentic capabilities of LLMs and VLMs through a series of games of increasing complexity. The project is a collaboration between the AI Centre at University College London, IDEAS NCBR, the University of Oxford, New York University, and Anthropic. The lead authors of the research are Davide Paglieri, Bartłomiej Cupiał, Samuel Coward, Ulyana Piterbarg, Maciej Wolczyk, Akbir Khan, Eduardo Pignatelli, Łukasz Kuciński, Lerrel Pinto, Rob Fergus, Jakob Nicolaus Foerster, Jack Parker-Holder, and Tim Rocktäschel.
Goals and structure of BALROG
BALROG aims to provide a unified environment for evaluating LLMs and VLMs as agents in reinforcement learning environments. Its main goal is to push models beyond their current limits by testing them in settings that require not only comprehension and interaction but also advanced reasoning, exploration, and adaptation. BALROG is structured to challenge models across several aspects of agentic capability, including spatial reasoning, long-term planning, and interaction with multimodal representations.
The games used in the benchmark range from relatively simple activities, solvable by a non-expert human in a few seconds, to extremely complex tasks such as the NetHack environment, which can take years to master. The games included in BALROG were selected to cover a wide range of cognitive skills:
BabyAI: a relatively simple environment that evaluates the model's ability to follow natural-language instructions and navigate a two-dimensional world.
Crafter: inspired by the well-known game Minecraft, this environment requires the agent to explore, gather resources, and craft items, testing its survival and resource management skills.
TextWorld: an entirely text-based game in which the agent must explore mazes and interact with everyday objects, showing that it can understand and handle scenarios described only in words.
Baba Is AI: based on the popular puzzle game Baba Is You, this environment evaluates the model's ability to manipulate game rules to solve complex problems, challenging its capacity for unconventional reasoning.
MiniHack and NetHack: extremely complex and demanding environments in which agents must combine exploration, navigation, and long-term planning to survive in procedurally generated dungeons. NetHack in particular is known for its difficulty and for the advanced skills it demands of human players.
Each game features different difficulty levels, procedural generation, and long-term planning requirements, making BALROG a comprehensive benchmark that is representative of the challenges LLM agents must face in the real world.
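To make the idea of a language model acting as an agent in such environments more concrete, here is a toy sketch of the observe, decide, act loop. It does not use BALROG's actual code or API: `CorridorEnv`, `run_episode`, and the `always_right` policy are hypothetical names invented for this illustration, and the policy function stands in for what would in practice be a prompt sent to an LLM.

```python
import random
from typing import Callable, List, Tuple

ACTIONS = ("left", "right")


class CorridorEnv:
    """Toy text environment (hypothetical): walk along a corridor to the exit."""

    def __init__(self, length: int = 6, max_steps: int = 20, seed: int = 0):
        self.length = length
        self.max_steps = max_steps
        # Seeded random start, a tiny stand-in for procedural generation.
        self.start = random.Random(seed).randrange(length - 1)
        self.pos = self.start
        self.steps = 0

    def _describe(self) -> str:
        return f"You are in cell {self.pos}. The exit is in cell {self.length - 1}."

    def reset(self) -> str:
        self.pos, self.steps = self.start, 0
        return self._describe()

    def step(self, action: str) -> Tuple[str, bool]:
        self.steps += 1
        if action == "right":
            self.pos = min(self.length - 1, self.pos + 1)
        elif action == "left":
            self.pos = max(0, self.pos - 1)
        # Any other string is an invalid action: the step is spent, nothing moves.
        done = self.pos == self.length - 1 or self.steps >= self.max_steps
        return self._describe(), done

    def progress(self) -> float:
        """How close the agent is to the exit, on a 0 to 100 scale."""
        return 100.0 * self.pos / (self.length - 1)


Policy = Callable[[List[str], str], str]


def run_episode(env: CorridorEnv, policy: Policy) -> float:
    """Feed text observations to the policy until the episode ends."""
    history: List[str] = []
    obs, done = env.reset(), False
    while not done:
        action = policy(history, obs)  # an LLM call would go here
        history.append(f"{obs} -> {action}")
        obs, done = env.step(action)
    return env.progress()


def always_right(history: List[str], obs: str) -> str:
    """Placeholder policy standing in for a model's chosen action."""
    return "right"
```

Calling `run_episode(CorridorEnv(seed=1), always_right)` plays one episode. Passing the history to the policy mirrors the fact that an agent of this kind has to keep track of earlier observations itself, which is where long-horizon tasks become difficult.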
BALROG does not merely evaluate model performance; it also encourages the development of new strategies for improving agent capabilities, offering a flexible platform that supports the integration of new prompting methods and reinforcement learning approaches. It also adopts a modular architecture that makes it easy to add new games and test environments, keeping the platform open to continued research and innovation. Every component of the benchmark, from basic navigation tasks to the more advanced challenges of MiniHack and NetHack, contributes to a detailed picture of model capabilities across diverse and complex settings. The infrastructure supports agents based on zero-shot prompting, few-shot learning, and other advanced techniques, and so accommodates a wide range of learning and evaluation methodologies.
Methodology and evaluation metrics
To evaluate agent capabilities, BALROG adopts detailed and rigorous metrics designed to measure various aspects of LLM and VLM performance in complex settings. Each model is assessed on a set of key parameters, including problem-solving ability, the effectiveness of its decisions, long-term planning, resource management, responsiveness to visual and textual input, and robustness to unforeseen procedural challenges.
Tests are run with different configurations of the game environments to check that model capabilities generalize. Agents are evaluated on procedurally generated environments, so every test session presents different situations and maps, which rules out overfitting through memorized solutions. Each environment includes detailed metrics for capturing the agent's progress, such as intermediate scores, number of errors made, and time taken to complete tasks.
For example, for the NetHack environment a progression system based on the experience levels and dungeon levels reached was developed, because the standard scoring system did not adequately represent the model's progress. Each level reached contributes to a progressive assessment of the model, making it possible to tell how close an agent is to completing the game, with completion percentages ranging from 0% to 100%. NetHack's difficulty makes this fine-grained measurement especially useful for tracking agents' survival ability and planning strategy.
In BabyAI, the main metric is how precisely the agent follows instructions and how long it takes to complete tasks. Agents are evaluated on their ability to navigate correctly through a series of actions described in natural language. The best models complete tasks with over 90% accuracy in the simplest situations but show a marked decline as task complexity increases.
For Crafter, performance analysis focuses on the agents' ability to gather resources, build tools, and survive in the environment for a long period. Complexity increases because resources are scarce and the environment is dynamic. Measured parameters include the number of milestones reached (for example, gathering rare resources or building advanced tools) and average survival time.
In the Baba Is AI environment, particular attention is given to the agents' ability to manipulate game rules to solve complex puzzles. Metrics include the number of puzzles solved, the time taken for each solution, and the creativity shown in finding unconventional solutions.
Agents must not only apply existing rules but also create new ones by combining text blocks to change the game mechanics.
For every scenario, BALROG provides a comparative evaluation of LLMs and VLMs, highlighting the differences in performance between text-only representations and representations that include visual input. Multimodal representations often lead to a drop in performance, especially in environments where vision is essential for effective decisions, such as MiniHack and NetHack. Multimodal models are evaluated on their ability to integrate visual and textual information, combining perception and reasoning to navigate complex environments.
BALROG's metrics are normalized to a score from 0 to 100, which makes it easy to compare different models and experimental configurations. This detailed approach makes it possible to pinpoint model weaknesses and track progress in critical areas such as long-term planning, handling uncertainty, and adaptive learning.
Main results
The performance analysis showed that current models do well on the simplest activities but have serious shortcomings on the more complex ones. NetHack in particular proved to be one of the most demanding environments, with the best models reaching an average progression of only about 1.5% through the game. The o1-preview model achieved the best result, with an average progression of 1.57%, while other models such as GPT-4o and Claude 3.5 Sonnet performed even worse, underlining how hard it is to navigate and plan in long-horizon environments like NetHack.
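As a rough illustration of how a 0 to 100 progression score and averages like the NetHack figures above might be computed, the sketch below normalizes raw per-episode scores and reports a mean with its standard error. The helper names, the environment labels, and all the numbers are hypothetical and chosen only for the example. Averaging the per-environment means into one overall score is likewise an assumption of this sketch, not a statement of BALROG's exact aggregation.

```python
import math
from statistics import mean, stdev
from typing import List, Tuple


def normalize(raw: float, max_raw: float) -> float:
    """Map a raw score in [0, max_raw] to a 0-100 progression value."""
    clipped = min(max(raw, 0.0), max_raw)
    return 100.0 * clipped / max_raw


def summarize(progressions: List[float]) -> Tuple[float, float]:
    """Return (mean, standard error of the mean) over episodes."""
    n = len(progressions)
    avg = mean(progressions)
    stderr = stdev(progressions) / math.sqrt(n) if n > 1 else 0.0
    return avg, stderr


# Hypothetical episode results, NOT measurements from the paper.
episodes = {
    # Raw milestone counts out of an assumed maximum of 12.
    "env_a": [normalize(r, 12) for r in (3, 6, 0, 3)],
    # Pass/fail tasks already expressed as 0 or 100.
    "env_b": [100.0, 0.0, 100.0, 100.0],
}

per_env = {name: summarize(vals) for name, vals in episodes.items()}
overall = mean(avg for avg, _ in per_env.values())

for name, (avg, err) in per_env.items():
    print(f"{name}: {avg:.2f} +/- {err:.2f}")
print(f"overall: {overall:.2f}")
```

Putting every environment on the same scale before averaging keeps an environment with large raw scores from dominating the comparison between models.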
The MiniHack suite proved extremely demanding, with tasks such as "Boxoban" never solved by any model, revealing serious weaknesses in long-term planning and resource management. Only some models managed to complete the simplest tasks, such as the 9x9 mazes and the corridor battles.
In BabyAI, the best-performing models achieved average progression above 70%, with GPT-4o and Llama 3.1 70B in the lead, while adding visual input caused a drop in performance. The Gemini-1.5-Pro model performed consistently across the text and visual formats, showing greater robustness.
For Crafter, GPT-4o showed the best resource management, with an average progression of 33.10%. Here too, adding visual input lowered performance, suggesting that effective integration of visual information remains a distant goal for many models.
For TextWorld, the more complex tasks, such as "Coin Collector", were very difficult for all models: GPT-4o completed the task only once in twenty attempts. The Gemini models ran into problems with the API, which often classified prompts as "unsafe" and prevented a complete evaluation.
A recurring finding is the so-called "knowing-doing gap": many models show that they have theoretical knowledge about the game but fail to apply it while playing. In NetHack, for example, models such as GPT-4o can recognize the danger of eating rotten food yet keep making this mistake during play, which points to a lack of practical integration of the knowledge they hold.
Finally, the comparative analysis showed that current multimodal architectures cannot yet make full use of visual information for effective decisions. In environments such as MiniHack and NetHack, presenting images led to a significant drop in performance, showing that vision-based reasoning is still an area where models need to improve considerably.
Open challenges for the future
BALROG is not only a benchmark but also a platform for rapidly prototyping new prompting methods and strategies for improving the agentic capabilities of models. Several challenges remain open for future research, including better integration of visual and textual input, stronger long-term planning, and closing the "knowing-doing gap".
1. Improving vision-language integration
The BALROG results show that agents do not yet use multimodal representations effectively, which suggests serious gaps in vision-based reasoning. The ability to interpret visual information and integrate it with language remains a distant goal. Future research should focus on techniques such as self-supervised learning to improve the models' ability to extract relevant insights from visual representations. In addition, introducing video observations and multi-image observation histories could provide context that improves model understanding in long-horizon scenarios and eases the difficulty of visual processing.
2. Long-term planning and agent autonomy
Long-term planning was one of the areas in which agents showed the greatest shortcomings.
One possible way to address these difficulties is to use advanced techniques such as Chain-of-Thought (CoT) reasoning, which lets models think iteratively and formulate more coherent plans. Persistent memory systems could also allow agents to accumulate experience across multiple game sessions, improving their planning and their ability to make informed decisions based on earlier experience. Another approach could be to develop in-context Reinforcement Learning (RL) systems, in which the agent learns directly from its mistakes during inference and gradually improves its planning without needing full retraining.
3. Closing the knowing-doing gap
The "knowing-doing gap" is a significant challenge for current models. Many agents know in theory what to do in specific situations but fail to put that knowledge into practice during play. One way to close the gap could be to integrate self-reflection mechanisms that let the model evaluate its own actions and change its behavior. In-context fine-tuning techniques, in which the agent is adapted in real time on the basis of its game experience, could also prove effective for bringing theoretical knowledge and practical action into line.
4. Addressing the computational limits of current models
Current models are computationally limited, which affects their ability to solve complex tasks. The trade-off between model depth and context is a crucial factor in improving performance.
To address this problem, one research direction could focus on attention optimization mechanisms such as PagedAttention, which manages the memory of the attention key-value cache in pages so that long contexts can be handled more efficiently.
5. Multi-agent prompting strategies and tool use
In the future, BALROG could also explore the role of multi-agent collaboration. Agents could benefit from multi-agent prompting strategies, in which several models work together to solve complex tasks. The use of external tools and APIs to improve decision-making could also be an important direction of development, allowing agents to acquire information and skills beyond their base capabilities.
Conclusions
The BALROG results underline a crucial point: current artificial intelligence models, however advanced, remain caught in a gap between the ability to "know" and the ability to "do". This is not merely a technical problem; it reflects an intrinsic limit in how agents are designed, namely the absence of a true "agentic intent". LLM and VLM agents have no innate understanding of why certain actions are necessary or useful in a given context. This suggests that, as currently built, they function as reactive tools rather than as systems able to navigate strategic complexity autonomously.
The lack of full integration between visual and linguistic processing, together with weak long-term planning, points to a still unexplored opportunity: building models that learn not only from information but also from experience, through operational and adaptive heuristics.
For example, in games such as NetHack or MiniHack, the inability to connect past experience with future decisions is a sign that models lack a structural memory that outlasts the inference session. This is not only a performance problem: it severely limits the use of such systems in real scenarios, where continuity and adaptability are fundamental.
From a strategic standpoint, this opens two perspectives for companies. First, there is a need to develop hybrid systems that combine the computing power of current AI with decision processes that incorporate "simulated intentionality". This could mean models designed to learn contextual behavioral patterns rather than simple task-oriented responses. Such models could be crucial in sectors like supply chain management, where long-term planning and adaptation to changing variables are essential.
Second, the "knowing-doing gap" could change how companies design digital workflows. AI systems able to regulate themselves and reflect on their own actions in real time could reduce human intervention in complex decision processes, improving efficiency and resilience. Consider, for example, an AI system for financial management that, in addition to analyzing historical data, learns from its own errors and adjusts its forecasts to mitigate future risks.
Finally, the inability to treat visual input as an integral part of decision-making recalls a fundamental lesson: multimodal AI must be designed not to translate visual input passively into linguistic output, but to "live" the visual context as an integral part of its understanding.
This has major implications for sectors such as industrial robotics and healthcare, where the interaction between visual and decision-making systems could become a decisive competitive factor.
BALROG is not just a technical benchmark; it is a mirror for understanding where artificial intelligence is heading. For companies, the message is clear: those who invest in solutions that close the gap between "knowing" and "doing" will gain not only a technological advantage but also a strategic one, in an increasingly complex and interconnected world.
Podcast: https://spotifycreators-web.app.link/e/wFoXgWTKWOb
Source: https://arxiv.org/abs/2411.13543
AI skills and governance for public sector transformation
As Artificial Intelligence (AI) becomes increasingly prevalent, its potential to transform the public sector is hard to dispute. However, the spread of AI in the public sector depends largely on the availability of adequate skills and the adoption of effective governance practices. This article draws on a synthesis of empirical research, grey and policy literature, an expert workshop, and interviews with representatives of seven European public organizations (the Ministry of the Interior of Czechia, the Municipality of Gladsaxe in Denmark, the District of Lüneburg in Germany, the Ministry of Digital Governance of Greece, the Istituto Nazionale di Previdenza Sociale in Italy, the Municipality of Amsterdam in the Netherlands, and the Municipality of Trondheim in Norway), with the aim of identifying the skills and governance practices needed to generate value from AI in the public sector. The lead authors of the research are R. Medaglia, P. Mikalef, and L. Tangi, of the Copenhagen Business School, the Norwegian University of Science and Technology, and the European Commission's Joint Research Centre respectively.
The European regulatory context
The European Union's commitment to AI began with the 2018 Declaration of Cooperation and advanced further with the 2021 review of the Coordinated Plan on AI, which highlighted the strategic role of AI in the public sector. Today there are numerous initiatives and legislative measures to facilitate the integration of AI into public administration. Prominent among them are the AI Act and the Interoperable Europe Act, both adopted in 2024. The AI Act establishes a risk-based approach to regulating AI, banning systems that pose unacceptable risks and defining high-risk applications that are subject to strict controls.
The act also promotes innovation through regulatory sandboxes and has led to the creation of the European Artificial Intelligence Board and an EU database for high-risk AI systems.
The Interoperable Europe Act, proposed in November 2022 and adopted in April 2024, aims to improve the cross-border interoperability of IT systems used in public services. It introduces the Interoperable Europe Board, responsible for defining a shared strategic agenda for cross-border interoperability, and requires interoperability assessments for IT systems that operate across borders. It also announced the launch of the Interoperable Europe Portal, a collaborative platform for sharing and reusing IT solutions. This act likewise encourages innovation through regulatory sandboxes and GovTech partnerships.
Other relevant laws include the Digital Services Act (DSA), which sets clear rules for digital service providers to ensure user safety and greater transparency; the Digital Markets Act (DMA), designed to ensure fair conditions in the digital market; and the Data Governance Act (DGA), which aims to increase trust in data sharing and overcome technical obstacles to data reuse. The framework also includes the Data Act and the Cybersecurity Act, all aimed at creating a secure and interoperable digital ecosystem.
A key initiative in this area is the Public Sector Tech Watch (PSTW), an observatory established in 2023 and run by the European Commission's Directorate-General for Digital Services and the Joint Research Centre (JRC). PSTW serves as a platform for exchanging knowledge, experience, and educational resources among public employees, private companies, academic bodies, and policy strategists, supporting the digital transformation and compatibility of European public systems.
PSTW includes a database of more than 1,000 use cases of AI and other emerging technologies in the public sector, and it fosters a collaborative environment for exchanging practices and experience, including through initiatives such as best-use-case competitions. In addition, the Technical Support Instrument (TSI) and initiatives such as "AI-ready public administration" provide tailored technical support to Member States preparing to adopt AI, including GovTech partnerships and model contracts for procuring trustworthy and secure AI solutions.

AI governance for public sector transformation: research methodology

The report is based on a three-phase methodology designed to build a complete and up-to-date picture of the competences and governance practices needed to use AI in public organizations. The first phase was a systematic review of the academic literature and of policy and grey literature. The second phase was an online workshop with 40 sector experts, held on 25 October 2023, intended to consolidate and deepen the findings of the literature review. The experts came from a range of public organizations, and the workshop was structured as working sessions split into discussion groups that explored both AI competences and AI governance practices in depth. The workshop results were used to verify the findings of the literature review and were summarized in a report. Finally, in the third phase, semi-structured interviews were conducted with managers from seven European public organizations in different countries (Czechia, Denmark, Germany, Greece, Italy, the Netherlands, and Norway) to enrich and validate the findings. A total of 19 interviews were carried out between May and November 2023.
The interviews focused on individual experience with AI, the perceived relevance of AI in the specific work context, and the perceived difficulties of acquiring AI-related competences in the public sector. The interview transcripts were produced with the support of automatic transcription software and then manually reviewed for accuracy.

Competence framework for AI in the public sector

The report presents a comprehensive framework of the competences needed to adopt and use AI in the public sector, distinguishing between technical competences, managerial competences, and policy, legal, and ethical competences. These are further classified into three clusters: attitudinal competences (knowing "why"), operational competences (knowing "how"), and literacy competences (knowing "what").

Technical competences include in-depth technological knowledge, skill in managing databases, and the ability to assess data quality and select the most suitable AI architectures. At the operational level, data management, AI-oriented software programming, and adherence to the technical standards of the field are essential. On the attitudinal side, curiosity about technological innovation and a commitment to continuous learning are indispensable for meeting the challenges that AI poses.

Managerial competences include leadership, change management, and the ability to mediate between different interest groups. Leadership is understood as the ability to lead AI initiatives and to integrate the technology ethically and effectively, while change management concerns the ability to adapt organizational processes to the adoption of AI.
Policy, legal, and ethical competences include awareness of ethical implications and the ability to work with domain experts to ensure that AI is adopted responsibly. Public officials need to be able to formulate policy questions that are compatible with AI techniques and to collaborate with domain experts to translate complex concepts into practical solutions. The ability to audit systems and to ensure compliance with design and accountability standards is also fundamental. Literacy competences include an understanding of the fundamentals of machine learning, computer vision, and natural language processing (NLP), as well as in-depth knowledge of legal frameworks and public policies. In addition, the ability to manage the procurement of AI solutions in a way that is consistent with public-interest values is seen as crucial to ensuring that AI is used fairly and transparently in the public sector.

Governance practices for AI

The report groups governance practices into three main dimensions: procedural, structural, and relational. Each dimension spans three levels: strategic, tactical, and operational. The aim of governance practices is to ensure coherence between the organization's goals and the technology used to achieve them. This means implementing rules and regulations that guide the responsible use of AI and foster an open, collaborative culture of innovation.

Procedural practices refer to the processes and rules that must be in place to manage AI responsibly. They include adopting guidelines for ethical AI development, defining data-management standards, and establishing criteria for auditing AI systems.
A notable example is the use of compliance frameworks that include ethical and legal impact assessments throughout the AI life cycle, in order to ensure compliance with European regulations such as the AI Act and the GDPR.

Structural practices concern internal organization and the distribution of AI-related roles and responsibilities. This involves creating dedicated AI units, appointing Chief AI Officers, and defining governance policies to ensure that AI initiatives are aligned with the organization's overall strategy. Public organizations should set up multidisciplinary teams of AI experts, data analysts, legal experts, and ethicists to monitor and oversee AI implementation, so that the use of AI respects public-interest values.

Relational practices focus on managing relationships among stakeholders, both inside and outside the organization. This includes collaborating with other government agencies, involving local communities, and building partnerships with the private sector and universities. Transparency and citizen involvement are fundamental here, through public consultations and the sharing of information about the AI applications in use. These practices aim to build trust and to ensure that AI is developed and used responsibly and with public consent.

Strategic governance involves defining a clear vision for the use of AI, with long-term goals that include innovation and the improvement of public services.
At the tactical level, governance practices include resource planning and management of the risks associated with AI implementation, while at the operational level they focus on staff training, allocation of the necessary resources, and continuous monitoring of AI system performance. An approach based on continuous feedback loops is essential to ensure that AI solutions are adaptive and able to respond to changes in the organization's requirements and in citizens' expectations.

Recommendations and future outlook

Based on its analyses, the report presents six recommendations for developing AI competences and governance practices in the public sector. They aim to create a favorable context for the ethical and effective adoption of AI, promoting a culture of innovation, continuous improvement, and social responsibility. The main recommendations and related actions are set out in detail below.

Continuous training and competence development: Continuous training is essential to ensure that public sector staff can make the most of AI's potential. Several strategic actions have been identified to develop the competences needed to adopt and manage AI technologies effectively.

Continuous training programs: Training programs should cover several levels of complexity, from general AI literacy for all public employees to advanced courses for those who work directly with AI technologies. Course content should include the fundamentals of machine learning, the basic concepts of natural language processing, the ethical implications of AI, and data-management practices.
Practical workshops and case studies: Theory should be complemented by practical workshops and the analysis of concrete case studies. Workshops can include sessions on programming and configuring AI models, as well as simulations for understanding automated decision-making processes. Case-study analysis lets officials examine both successful and failed AI applications, helping them understand the real challenges and opportunities.

Collaboration with universities and research centers: The public sector should collaborate actively with universities and research centers to develop specific, tailored courses. This kind of collaboration can provide continuous access to the latest technological developments and academic best practices, and it supports the co-creation of training content that meets the specific needs of public administrations.

Mentorship programs: Mentorship is an important tool for accelerating the transfer of competences. AI experts and senior figures within public organizations can be assigned as mentors to new staff members or to those who need to develop specific AI competences. Mentorship is useful not only for passing on technical knowledge but also for addressing change management and the communication of AI projects to different stakeholders.

Ethical and regulatory training: Training should not be limited to the technical aspects of AI; it should also cover ethical and regulatory competences. Staff need to be aware of the ethical implications of using AI, understand the risks of algorithmic bias, and ensure the protection of personal data. Knowledge of the relevant regulations, such as the AI Act and the GDPR, should be an integral part of training programs.
Modular and personalized approach: Modularity is a crucial feature of training programs. Public employees have different needs and levels of competence, so training should be personalized and modular. This makes it possible to adapt learning paths to employees' specific roles and levels of responsibility in AI adoption.

Use of e-learning platforms and certifications: E-learning platforms can provide continuous access to training resources, allowing employees to learn at their own pace. Introducing official certifications can also encourage participation in courses and ensure that the competences acquired are recognized.

Evaluation and continuous updating of programs: Training programs should be evaluated periodically to ensure that they remain effective and keep up with ongoing technological and regulatory change. The needs of the public sector evolve, as do AI technologies, so course content must be updated regularly to remain relevant and effective.

Promotion of public-private partnerships: Public-private partnerships are key to fostering the adoption of innovative AI solutions and to accessing cutting-edge competences and technologies. Collaboration among public administrations, technology companies, and research institutions can enable faster and more effective development of AI solutions and help build a sustainable innovation ecosystem oriented toward the needs of the community. The main elements and benefits of public-private partnerships are set out in detail below.

Collaboration with technology companies: Public administrations can benefit enormously from the experience and innovation of the private sector.
Partnerships with technology companies give access to advanced resources and technical competences that are often not available in-house. Through these partnerships, for example, public organizations can use advanced analytics platforms, pre-trained machine learning systems, and cloud computing solutions for data management.

Research and development (R&D) projects with academic institutions: Collaboration with universities and research centers is essential for developing applied research and technology-transfer projects. These partnerships not only foster innovation but also ensure that the AI solutions developed rest on sound scientific principles and are rigorously tested before large-scale deployment. Such collaborations can also involve creating joint innovation labs and co-designing technological solutions with researchers and students.

Access to funding and resources: Partnerships with the private sector can also facilitate access to the additional financial resources needed to support AI implementation. Private companies can co-fund innovative AI projects, reducing the financial risk for public administrations and making it easier to experiment with pioneering solutions. Partnerships can also give administrations access to technological infrastructure and advanced tools that they would not otherwise have.

Development of shared solutions: Solutions developed through public-private partnerships can often be adapted and reused in different contexts. This reduces costs and accelerates the digital transformation process.
For example, an AI model developed to improve the efficiency of health services in one region can serve as the basis for similar solutions in other regions or in other areas of public administration, such as education or transport.

Ensuring transparency and compliance: Public-private partnerships must be structured to ensure maximum transparency and the protection of citizens' data. For this reason, clear protocols must be defined for data management, privacy, and information security. Defining standards and guidelines for transparency is essential to maintaining citizens' trust in the use of AI by public administrations. Partnerships should include detailed agreements that define roles, responsibilities, and data-sharing arrangements.

Promoting innovation through competitions and prizes: One way to encourage private companies to take part in developing AI solutions for the public sector is to organize competitions and hackathons. These events can attract startups, small and medium-sized enterprises (SMEs), and large companies to contribute innovative ideas and solutions. Healthy competition and the prospect of winning prizes or contracts with public administrations stimulate creativity and the generation of new ideas.

Supporting the creation of innovation ecosystems: Public-private partnerships can also support the creation of local innovation ecosystems, involving not only large companies but also startups, SMEs, and business incubators. These ecosystems are essential for creating a fertile environment in which new ideas can be tested and developed.
Public administrations can facilitate the creation of such ecosystems by promoting access to funding, offering tax incentives, and creating physical spaces where the public and private sectors can collaborate.

These actions aim to create an effective synergy between the public and private sectors, with the goal of maximizing the value that AI generates for the common good and ensuring that the solutions adopted are in line with ethical standards and the needs of the community. Only through joint commitment and open cooperation will it be possible to fully exploit the potential of AI to improve public services and citizens' quality of life.

Regulatory experimentation and sandboxes: Regulatory experimentation and sandboxes are fundamental tools for the effective adoption of AI in the public sector. These initiatives make it possible to test new technologies and innovative approaches in a controlled environment (a sandbox is a protected environment in which solutions can be tested without affecting live systems or breaching regulations), minimizing implementation risks and ensuring that solutions comply with the regulations in force. The main elements and actions related to regulatory experimentation and sandboxes are described below.

AI sandboxes: Sandboxes allow public administrations to test new AI solutions in a regulated context with an appropriate level of supervision. They are created so that emerging technologies can be developed, evaluated, and refined before wide-scale deployment. Sandboxes offer a protected environment in which administrations can collaborate with technology companies, startups, and universities to develop innovative AI applications, reducing the risk of costly failures and improving the quality of the final solutions.
Citizen involvement: Citizen involvement is a crucial aspect of sandboxes. Public consultations and feedback processes make it possible to assess the social impact of AI technologies, ensuring that the solutions developed meet the needs of the community and respect public-interest values. Involving citizens directly in experimentation can also help increase trust in AI solutions by showing how the risks of implementing the technology are managed.

Impact assessment and transparency: Every project launched within a sandbox should be subject to a rigorous assessment of its ethical, social, and legal impact. Impact assessment makes it possible to identify potential risks related to privacy, algorithmic discrimination, or other critical aspects, and to introduce corrective measures before large-scale deployment. It is also essential to ensure the transparency of sandbox test results by publishing detailed reports that describe the experimentation process, the results obtained, and the lessons learned.

Guidelines for implementing sandboxes: To ensure that sandboxes are used effectively, clear guidelines are needed that define how sandboxes are created and managed, the criteria for selecting projects to test, and the supervision arrangements. These guidelines should ensure that all projects are in line with the values and goals of the public administration, comply with the regulations in force, and adopt a risk-based approach to ensure the safety and compliance of the solutions developed.

Regulatory and financial support: Creating sandboxes requires adequate regulatory and financial support.
Public administrations need to be able to rely on a flexible regulatory framework that allows regulatory experimentation without excessive constraints. At the same time, financial resources must be available to cover the costs of experimentation, including those for technological infrastructure and for training the staff involved.

Feedback and continuous improvement: One of the goals of sandboxes is to create a continuous cycle of feedback and improvement. Every experiment should be followed by a careful analysis of the results, with the aim of improving not only the technology tested but also the experimentation process itself. This iterative approach makes it possible to adapt AI solutions to the real needs of public administrations and citizens, so that each development phase builds on learning and continuous improvement.

Integration with European innovation policies: Regulatory sandboxes should be closely integrated with European policies on innovation and AI, such as the AI Act and the Interoperable Europe Act. This integration is essential to ensure that solutions developed in sandboxes are aligned with European regulations and can be easily scaled across borders, fostering greater interoperability and wider dissemination of best practices in the public sector.

Regulatory experimentation and sandboxes aim to reduce the risk of adopting innovative technologies, improve the quality of the solutions developed, and ensure that AI is used responsibly and transparently in the public sector. The combination of experimentation, collaboration, and impact assessment offers a comprehensive approach to maximizing the potential of AI and ensuring that its benefits are fairly distributed among all citizens.
Strengthening ethical and legal governance practices: Strengthening ethical and legal governance practices is fundamental to ensuring that AI is adopted in the public sector responsibly and in line with the values of the community. The main actions for ensuring an ethical and legal implementation of AI are set out below.

Creating ethical guidelines for AI development: Ethical guidelines are needed to establish clear criteria for developing and using AI in the public sector. They should cover several aspects, including the collection and use of data, the management of bias, the responsibility of developers and operators, and the protection of privacy. The guidelines should be built into procurement and development processes, so that every AI solution adopted is aligned with the ethical principles approved by the organization and with the European regulatory framework.

Ethical and legal impact assessments: Every AI project should be accompanied by an ethical and legal impact assessment that analyzes its potential consequences for fairness, privacy, security, and transparency. These assessments should be carried out at an early stage and updated throughout the project life cycle, identifying potential risks and providing for corrective measures to mitigate them.

Establishing ethics committees: Ethics committees at national or local level are intended to oversee key decisions on AI. These committees should be composed of ethics experts, public sector representatives, academics, and members of civil society. Their role is to evaluate AI projects, offer ethical recommendations, and ensure that the principles of fairness and non-discrimination are respected and that the public interest always remains at the center of the decisions taken.
Defining standards for algorithmic auditing: Algorithms used by public administrations should be audited periodically to ensure compliance with the regulations in force and to prevent bias or misuse. Auditing should include a transparent analysis of how the algorithm works, the identification of possible distortions, and verification of the accuracy and reliability of its results. It is important to establish a formal audit process and to identify key performance indicators (KPIs) for assessing the effectiveness and impact of algorithms.

Ensuring transparency and accountability: To strengthen AI governance, it is essential to promote transparency at every stage of the development and deployment of AI technologies. Public administrations should communicate clearly the purposes for which AI is used, the data involved, and how algorithmic decisions are made. Accountability should be ensured through a governance system that provides mechanisms for establishing responsibility and allows citizens to contest decisions made by AI technologies where these may significantly affect their rights.

Control over data collection and use: Data is the basis on which AI models are trained, so it is fundamental that data be collected and used responsibly. Public administrations should make sure that the data collected is of high quality, relevant, and handled in accordance with privacy regulations. Data minimization, that is, collecting only the data that is strictly necessary, and pseudonymization (a technique that replaces identifying data with pseudonymous identifiers to protect individuals' identities) are key practices for ensuring that personal data is used securely and in compliance with the regulations.
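The data minimization and pseudonymization practices described above can be illustrated with a minimal Python sketch. It assumes a keyed hash (HMAC-SHA256) as the pseudonymization technique, which is one common option and not something the report prescribes. The field names, the `NEEDED_FIELDS` and `IDENTIFYING_FIELDS` sets, and both function names are hypothetical and chosen only for illustration.

```python
import hashlib
import hmac

# Hypothetical field names, used only for this illustration.
NEEDED_FIELDS = {"citizen_id", "municipality", "benefit_type"}
IDENTIFYING_FIELDS = {"citizen_id"}


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed hash (HMAC-SHA256, hex-encoded)."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def minimize_and_pseudonymize(record: dict, secret_key: bytes) -> dict:
    """Keep only the fields that are needed, then pseudonymize identifiers."""
    # Data minimization: drop every field not strictly required.
    minimized = {k: v for k, v in record.items() if k in NEEDED_FIELDS}
    # Pseudonymization: swap identifying values for keyed pseudonyms.
    for field in IDENTIFYING_FIELDS:
        if field in minimized:
            minimized[field] = pseudonymize(str(minimized[field]), secret_key)
    return minimized


if __name__ == "__main__":
    key = b"demo-key-store-me-separately"  # assumption: real keys live in a secrets store
    raw = {
        "citizen_id": "ABC123",
        "municipality": "Trondheim",
        "benefit_type": "housing",
        "phone": "555-0100",  # not needed for the analysis, so it is dropped
    }
    print(minimize_and_pseudonymize(raw, key))
```

Because the same key always maps the same identifier to the same pseudonym, records can still be linked for analysis, while re-identification requires access to the key, which should be stored separately from the data. Under the GDPR, pseudonymized data generally remains personal data, so a step like this complements other safeguards and does not replace them.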
These actions aim to ensure that AI is adopted in the public sector safely, responsibly, and in line with public-interest values. Strengthening ethical and legal governance practices is crucial to building citizens' trust in the use of AI and to ensuring that the technology helps improve public services without compromising individual rights and freedoms.

Creating a support ecosystem for digital transformation: Creating a support ecosystem for digital transformation in the public sector is not only about providing financial and technological resources; it also means developing a network of actors and institutions that work together to foster innovation. The main components and actions needed for an effective and resilient digital transformation ecosystem are examined below.

Institutional and political support: Solid institutional support for digital transformation is fundamental. Governments should draw up clear strategic plans for adopting AI and other digital technologies, with specific goals and defined deadlines. This support should be accompanied by favorable policies that encourage digitalization, removing bureaucratic barriers and promoting a coordinated vision across all levels of public administration, from national institutions to local communities.

Knowledge-sharing platforms: Knowledge sharing is a key element of digital transformation. Public administrations need access to platforms that facilitate the exchange of experience, best practices, and case studies.
Such platforms, like the Public Sector Tech Watch (PSTW), can help shorten the learning curve for new technologies and enable the rapid spread of innovations that have succeeded in other contexts. Easily accessible resources and documentation are crucial to accelerating digitalization.

Financial support and access to European funds: Digital transformation requires significant investment, and public administrations need access to adequate funding. Funds such as Horizon Europe, the Digital Europe Programme, and the Recovery and Resilience Facility (RRF) are crucial for supporting large-scale digital transformation projects. It is equally important, however, to provide administrations with technical and advisory support to facilitate access to these funds, so that small and medium-sized administrations can also take advantage of these opportunities.

Incentives for innovation and for hiring digital talent: Public administrations should create incentives to attract and retain people with digital competences. Hiring experts in AI, data science, and digital transformation is crucial to the success of any innovation strategy. Incentives such as innovation awards, advanced training opportunities, and dedicated career paths can help build a team of experts able to drive change within administrations. Recruitment programs aimed at new generations of digital talent can also help close the technology skills gap in the public sector.

Regulatory framework and flexible rules: The success of digital transformation also depends on an adequate regulatory framework.
Member States should adopt a regulatory approach that is flexible enough to allow innovation while protecting citizens from potential abuse. Regulations should be updated periodically to reflect the evolution of technologies and of society's needs, ensuring that they remain in line with ethical principles and the protection of human rights.

These actions and components are essential for creating an ecosystem that supports digital transformation in the public sector. Only through access to adequate resources and strong institutional commitment will it be possible to fully exploit the potential of AI and emerging technologies.

Promoting a culture of innovation and calculated risk: Promoting a culture of innovation and calculated risk is essential to ensure that the public sector can experiment with and adopt new technologies such as AI without being paralyzed by fear of failure. A culture that accepts calculated risk and encourages innovation can produce more creative and effective solutions to public sector challenges. The main actions for building such a culture are described below.

Encouraging experimentation and learning from mistakes: It is crucial to create an environment in which mistakes are treated as part of the learning process rather than as failures to be avoided at all costs. Public administrations should promote a culture in which staff are encouraged to try new solutions and learn from their mistakes. This can be achieved through pilot programs that allow new ideas to be tested in a protected environment, without the negative consequences of immediate large-scale deployment.
Training and support for managing innovation: Managing innovation requires specific skills that are often absent from traditional public sector structures. For this reason, it is important to provide targeted training for executives and project managers, with the aim of developing the skills needed to manage innovative processes and make strategic decisions under uncertainty. This training must also cover risk management, the identification of opportunities, and the mitigation of negative effects. Encouraging a proactive mindset that is open to change: Administrations must actively work to encourage a proactive mindset that is open to change. This can be achieved through internal communication campaigns that highlight the benefits of innovation and showcase success stories, as well as by sharing innovation stories and good practices within the organization. Leadership that actively supports change and innovation is fundamental to creating an environment that encourages staff to be proactive and to try out new ideas. Promoting the adoption of design thinking techniques: Design thinking is a creative, user-centered approach that can help public administrations solve complex problems. Integrating design thinking into decision-making processes makes it possible to explore new ideas, test them quickly, and adapt them based on the feedback received. This approach keeps the focus on citizens' needs and helps find innovative solutions that improve the quality of public services. Risk assessment and uncertainty management: Innovation inevitably involves risks.
It is therefore essential to implement risk management practices that make it possible to identify, assess, and mitigate the risks associated with adopting new technologies. Public administrations must develop methodologies for assessing uncertainty and making informed decisions that balance opportunities and risks, ensuring that the innovations adopted are sustainable and do not jeopardize citizens' security or trust. Leadership that supports change: Promoting a culture of innovation and calculated risk requires visionary leadership that is willing to support change. Leaders must be the first to show openness to innovation, creating an environment that not only accepts but encourages reasoned risk-taking. This kind of leadership is fundamental to overcoming internal resistance and motivating staff to commit to digital transformation projects. These actions aim to develop a public sector culture oriented toward continuous improvement, learning from mistakes, and experimentation. Only by creating an environment in which calculated risk is considered an integral part of the innovation process will it be possible to fully exploit the potential of AI and other digital technologies to improve public services and respond to citizens' constantly evolving needs. Conclusions The governance of Artificial Intelligence in the public sector is not only a matter of technical or regulatory skills but a profound cultural and strategic change. In this transition, the public sector faces a crucial challenge: adopting AI not only as a technological tool but as a catalyst for rethinking how the State interacts with citizens and responds to their needs.
A public organization's ability to leverage AI does not depend solely on financial resources or adequate regulation, but above all on a clear and shared vision that treats technology as an opportunity to build trust, fairness, and innovation. The greatest risk for the public sector is not adopting AI badly, but failing to undertake the cultural transformation needed to make it an instrument of social progress. AI, with its capacity to automate complex processes and analyze enormous amounts of data, can improve the efficiency of public services, but without inclusive governance it risks widening the gap between institutions and citizens. The most vulnerable communities could be excluded from these benefits, not for lack of adequate technology, but because of systems that do not take everyone's needs into account. This is where ethical governance becomes the true strategic pillar: not as a constraint, but as a lever to ensure that AI serves the public interest. Another fundamental aspect is the value of experimentation. The creation of regulatory sandboxes, which are so widely discussed, should not be seen only as a protected environment for testing technologies, but as a symbol of the new attitude the public sector must adopt. These spaces make it possible to turn failure into learning, a concept that radically challenges the traditional risk aversion typical of public bureaucracy. Organizations that manage to cultivate a culture of calculated risk will become examples of how AI can be not only implemented but also continuously improved in response to citizens' needs. The skills required by AI go well beyond technology.
Certainly, the public sector needs experts in machine learning and data science, but the real engine of transformation will be the ability to combine these skills with visionary leadership and strong ethics. Leadership in this context does not only mean being able to make technological decisions, but above all being able to communicate an inclusive, forward-looking vision. This leadership must be able to navigate the complexities of regulation, citizens' expectations, and collaboration with the private sector. Public-private partnerships represent another strategic turning point. However, the public sector must not settle for being a "customer" of the private sector. It must become an active partner, capable of negotiating solutions that respect public values and are transparent in their implementation. This collaboration must go beyond the mere supply of technology: the public sector has a duty to lead the dialogue on how AI should be designed, implemented, and monitored to ensure equitable benefits. Finally, true transformation will be measured not only by operational improvements, but by AI's ability to strengthen the social contract between State and citizens. AI can become a tool for making institutions more transparent and accountable, but only if citizens are actively involved in its design and monitoring. Trust will be the true indicator of success: not blind trust in technology, but trust built on open processes, tangible results, and a visible commitment to the common good. This reflection shows that adopting AI in the public sector is not only a question of how to do it, but of why. The risk is not technological but strategic: missing the opportunity to make AI an ally of social progress rather than a mere machine in the service of efficiency.
The decisions made today will not only determine the effectiveness of public services but will also define the role of institutions in an increasingly digital and interconnected society. Podcast: https://spotifycreators-web.app.link/e/ZuRqX0jMVOb Source: https://publications.jrc.ec.europa.eu/repository/handle/JRC138702
AI Governance for Public Sector Transformation
In an era where Artificial Intelligence (AI) is becoming increasingly prevalent, its potential to transform the public sector is undeniable. However, the spread of AI in the public sector largely depends on the availability of adequate skills and the adoption of effective governance practices. This article is based on a synthesis of empirical research, gray and policy literature, an expert workshop, and interviews with representatives from seven European public organizations (Ministry of the Interior of the Czech Republic, Municipality of Gladsaxe in Denmark, Lüneburg District in Germany, Ministry of Digital Governance of Greece, National Social Security Institute in Italy, Municipality of Amsterdam in the Netherlands, and Municipality of Trondheim in Norway) to identify the skills and governance practices needed to generate value in the public sector through AI. The main authors of the research are R. Medaglia, P. Mikalef, and L. Tangi, from Copenhagen Business School, the Norwegian University of Science and Technology, and the European Commission's Joint Research Centre, respectively. The European Regulatory Framework The European Union's commitment to AI began with the Declaration of Cooperation in 2018 and further advanced with the revision of the Coordinated Plan on AI in 2021, highlighting AI's strategic role in the public sector. Today, there are numerous initiatives and legislative measures to facilitate the integration of AI in public administration. Among these, the AI Act and the Interoperable Europe Act, both adopted in 2024, stand out. The AI Act establishes a risk-based approach to AI regulation, banning systems that pose unacceptable risks and defining high-risk applications subject to stringent controls. This act also promotes innovation through regulatory sandboxes and led to the formation of the European Artificial Intelligence Board and an EU database for high-risk AI systems. 
The Interoperable Europe Act, proposed in November 2022 and adopted in April 2024, aims to improve cross-border interoperability of IT systems used in public services. It introduces the Interoperable European Board, responsible for defining a shared strategic agenda for cross-border interoperability, and requires interoperability assessments for IT systems operating across borders. Additionally, it announced the launch of the Interoperable Europe Portal, a collaborative platform for sharing and reusing IT solutions. This act also encourages innovation through regulatory sandboxes and GovTech partnerships. Other relevant laws include the Digital Services Act (DSA), which aims to establish clear rules for digital service providers ensuring user safety and greater transparency, the Digital Markets Act (DMA), designed to ensure fair conditions in the digital market, and the Data Governance Act (DGA), which aims to increase trust in data sharing and overcome technical barriers to their reuse. The legislation also includes the Data Act and the Cybersecurity Act, all aimed at creating a secure and interoperable digital ecosystem. A key initiative in this area is the Public Sector Tech Watch (PSTW), an observatory established in 2023 and managed by the European Commission's Directorate General for Digital Services and the Joint Research Centre (JRC). PSTW serves as a platform for exchanging knowledge, experiences, and educational resources among public employees, private companies, academic institutions, and policy strategists, facilitating digital transformation and compatibility of European public systems. PSTW includes a database of over 1,000 use cases of AI and other emerging technologies in the public sector and promotes a collaborative environment for sharing practices and experiences, also through initiatives such as best use case competitions. 
Furthermore, the Technical Support Instrument (TSI) and initiatives like "AI-ready public administration" provide tailored technical support to Member States to prepare for AI adoption, including GovTech partnerships and model contracts for procuring reliable and secure AI solutions. AI Governance for Public Sector Transformation: Research Methodology The report is based on a three-phase methodology aimed at developing a comprehensive and updated view of the skills and governance practices required for AI use in public organizations. The first phase involved a systematic review of academic literature and policy and gray documentation. The second phase involved an online workshop with 40 sector experts, held on October 25, 2023, aimed at consolidating and deepening the findings obtained in the literature review phase. The experts came from various public organizations, and the workshop was structured into working sessions divided into discussion groups to explore in depth both AI skills and governance practices. The workshop results were used to verify the findings from the literature review and were summarized in a report. Finally, in the third phase, semi-structured interviews were conducted with leaders of seven European public organizations in various countries (Czech Republic, Denmark, Germany, Greece, Italy, Netherlands, and Norway), to enrich and validate the results. A total of 19 interviews were conducted between May and November 2023. The interviews focused on individual experience with AI, the perception of AI's relevance in the specific work environment, and the perceived difficulties in acquiring AI skills in the public sector. The interview transcripts were processed with the help of automatic transcription software and then manually reviewed to ensure accuracy. 
Competency Framework for AI in the Public Sector The report presents a comprehensive framework of the skills required for the adoption and use of AI in the public sector, distinguishing between technical, managerial, political, legal, and ethical skills. These skills are further classified into three clusters: attitudinal skills (knowledge of "why"), operational skills (knowledge of "how"), and literacy skills (knowledge of "what"). Technical skills include in-depth knowledge of technology, data management skills, the ability to evaluate data quality, and to select appropriate AI architectures. On the operational side, data management, AI-targeted software programming, and adherence to the technical standards required in this field are essential. As for the attitudinal aspect, curiosity about technological innovations and a commitment to continuous learning are essential qualities for successfully facing AI challenges. Managerial skills include leadership, change management, and the ability to mediate between different interest groups. In particular, leadership is seen as the ability to lead AI initiatives and integrate the technology in an ethical and effective way, while change management involves the ability to adapt organizational processes to AI adoption. Political, legal, and ethical skills include awareness of ethical implications and the ability to work with sector experts to ensure that AI adoption takes place responsibly. It is essential that public officials have the ability to formulate policy questions compatible with AI techniques and collaborate with domain experts to translate complex concepts into practical solutions. The ability to audit and ensure compliance with design and accountability standards is also fundamental. Literacy skills include an understanding of the fundamentals of machine learning, computer vision, and natural language processing (NLP), as well as a thorough knowledge of legal frameworks and public policies. 
In addition, the ability to manage the procurement of AI solutions in a manner consistent with public interest values is seen as a crucial skill to ensure that AI is used fairly and transparently in the public sector. Governance Practices for AI The report distinguishes governance practices into three main dimensions: procedural, structural, and relational. Each dimension is articulated at three levels: strategic, tactical, and operational. The goal of governance practices is to ensure consistency between the organization's objectives and the technology used to achieve them. This means implementing rules and regulations that guide the responsible use of AI and foster a culture of open and collaborative innovation. Procedural practices refer to the processes and rules that need to be put in place to manage AI responsibly. These include the adoption of guidelines for ethical AI development, the definition of standards for data management, and the creation of criteria for AI system auditing. A significant example is the use of compliance frameworks that include ethical and legal impact assessments throughout the AI lifecycle to ensure compliance with European regulations such as the AI Act and GDPR. Structural practices concern the internal organization and the distribution of roles and responsibilities related to AI. This involves creating AI-dedicated units, appointing Chief AI Officers, and defining governance policies to ensure that AI initiatives are aligned with the organization's overall strategy. Public organizations need to establish multidisciplinary teams that include AI experts, data analysts, lawyers, and ethics experts to monitor and oversee AI implementation. This ensures that AI use is managed to respect public interest values. Relational practices focus on managing relationships among different stakeholders, both internal and external to the organization. 
This includes collaboration with other government agencies, engagement with local communities, and the creation of partnerships with the private sector and universities. A key element is transparency and citizen engagement through public consultations and sharing information on AI applications in use. These practices aim to build trust and ensure that AI is developed and used responsibly and with public consent. Strategic governance involves defining a clear vision for AI use, with long-term goals that include innovation and improving public services. At the tactical level, governance practices include resource planning and risk management associated with AI implementation, while at the operational level, they focus on staff training, resource allocation, and continuous monitoring of AI system performance. The adoption of a continuous feedback cycle approach is essential to ensure that AI solutions are adaptive and able to respond to changing organizational requirements and citizen expectations. Recommendations and Future Perspectives Based on the analysis carried out, the report presents six recommendations for the development of AI skills and governance practices in the public sector. These recommendations aim to create a favorable environment for the ethical and effective adoption of AI, promoting a culture of innovation, continuous improvement, and social responsibility. Below, the main recommendations and related actions are outlined in detail: Continuous Training and Skill Development Continuous training is an essential element to ensure that public sector personnel can make the most of AI's potential. Several strategic actions have been identified to develop the skills needed for the adoption and effective management of AI technologies. 
Continuous Training Programs : Training programs should be designed to include various levels of complexity, starting from general AI literacy for all public employees to advanced courses for those working directly with AI technologies. The content of these courses should include the fundamentals of machine learning, basic natural language processing concepts, AI's ethical implications, and data management practices. Practical Workshops and Case Studies : Theory must be complemented with practical workshops and case studies. Workshops can include sessions on programming and configuring AI models, as well as simulations to understand automated decision-making processes. Case study analysis, on the other hand, will allow officials to see examples of both successful and unsuccessful AI applications, helping to understand real challenges and opportunities. Collaborations with Universities and Research Centers : The public sector should actively collaborate with universities and research centers to develop specific and customized courses. Such collaboration can guarantee continuous access to the latest technological innovations and academic best practices, as well as foster the co-creation of training content that meets the specific needs of public administrations. Mentorship Programs : Mentorship represents an important tool to accelerate skills transfer. AI experts and senior figures within public organizations can be assigned as mentors to new staff members or those needing to develop specific AI skills. Mentorship can be useful not only for conveying technical knowledge but also for addressing aspects related to change management and communicating AI projects to various stakeholders. Training in Ethical and Regulatory Aspects : Training must not be limited to the technical aspects of AI but must also include skills in the ethical and regulatory fields. 
Staff must be aware of the ethical implications of AI use, understand the potential risks associated with algorithmic biases, and ensure the protection of personal data. Knowledge of relevant regulations, such as the AI Act and GDPR, must be an integral part of training programs. Modular and Customized Approach : A crucial aspect of training programs must be modularity. Each public employee has different needs and levels of competence; therefore, training must be customized and modular. This allows learning paths to be adapted based on specific roles and the level of responsibility of employees in the adoption of AI. Use of E-Learning Platforms and Certifications : E-learning platforms can be used to ensure continuous access to training resources, allowing employees to learn at their own pace. The introduction of official certifications can also encourage participation in courses and ensure the recognition of acquired skills. Continuous Evaluation and Updating of Programs : Training programs must be subject to periodic evaluation to ensure that they remain effective and keep pace with ongoing technological and regulatory changes. The needs of the public sector evolve, as do AI technologies; therefore, course content must be regularly updated to maintain relevance and effectiveness. Promotion of Public-Private Partnerships: Public-private partnerships are a key element in fostering the adoption of innovative AI solutions and accessing cutting-edge skills and technologies. Collaboration between public administrations, technology companies, and research institutions can ensure faster and more effective development of AI solutions, as well as contribute to building a sustainable innovation ecosystem oriented towards the needs of the community.
Below, the main elements and benefits of public-private partnerships are outlined in detail: Collaboration with Technology Companies : Public administrations can greatly benefit from the experience and innovation of the private sector. Partnerships with technology companies enable access to advanced resources and technical skills that are often not available internally. For example, through these partnerships, public organizations can benefit from the use of advanced analytics platforms, pre-trained machine learning systems, and cloud computing solutions for data management. R&D Projects with Academic Institutions : Collaboration with universities and research centers is essential for developing applied research and technology transfer projects. These partnerships not only foster innovation but also ensure that AI solutions are based on solid scientific principles and rigorously tested before large-scale implementation. Such collaborations can also involve creating joint innovation labs and co-designing technology solutions with researchers and students. Access to Funding and Resources : Partnerships with the private sector can also facilitate access to additional financial resources needed to support AI implementation. Private companies can co-finance innovative AI projects, reducing the financial risk for public administrations and making it easier to experiment with pioneering solutions. In addition, partnerships can allow administrations to benefit from technological infrastructure and advanced tools they would otherwise not have access to. Development of Shared Solutions : Solutions developed through public-private partnerships can often be adapted and reused in different contexts. This reduces costs and speeds up the digital transformation process. 
For example, an AI model developed to improve healthcare efficiency in one region can be used as a basis for developing similar solutions in other regions or in other public administration sectors, such as education or transport. Ensuring Transparency and Compliance : It is crucial that public-private partnerships are structured to ensure maximum transparency and citizen data protection. For this reason, clear protocols must be defined for data management, privacy, and information security. Defining standards and guidelines for transparency is essential to maintain citizens' trust in AI use by public administrations. Partnerships must include detailed agreements defining roles, responsibilities, and data sharing methods. Promotion of Innovation through Competitions and Awards : One way to encourage private companies to participate in developing AI solutions for the public sector is through competitions and hackathons. These events can attract startups, small and medium-sized enterprises (SMEs), and large companies to contribute ideas and innovative solutions. Healthy competition and the possibility of winning prizes or contracts with public administrations stimulate creativity and the generation of new ideas. Support for the Creation of Innovation Ecosystems : Public-private partnerships can also support the creation of local innovation ecosystems, involving not only large companies but also startups, SMEs, and business incubators. These ecosystems are essential to create a fertile environment where new ideas can be tested and developed. Public administrations can facilitate the creation of such ecosystems by promoting access to funding, offering tax incentives, and creating physical spaces where public and private entities can collaborate. 
These actions aim to create effective synergy between public and private sectors to maximize the value generated by AI for the common good and ensure that the solutions adopted are aligned with ethical standards and community needs. Only through joint commitment and open cooperation will it be possible to fully exploit AI's potential to improve public services and citizens' quality of life. Regulatory Experimentation and Sandbox Areas Regulatory experimentation and sandbox areas are fundamental tools for the effective adoption of AI in the public sector. These initiatives allow testing new technologies and innovative approaches in a controlled environment (a sandbox is a protected environment where solutions can be tested without impacting real systems or violating regulations), minimizing the risks associated with implementation and ensuring that solutions comply with existing regulations. The main elements and actions related to regulatory experimentation and sandboxes are described below: AI Sandboxes : Sandboxes allow public administrations to test new AI solutions in a regulated environment with an adequate level of supervision. These sandboxes are created to ensure that emerging technologies can be developed, evaluated, and refined before their widespread deployment. Sandbox areas provide a protected environment where administrations can collaborate with technology companies, startups, and universities to develop innovative AI applications, reducing the risk of costly failures and improving the quality of final solutions. Citizen Involvement : Citizen involvement is a crucial aspect of sandbox areas. Public consultations and feedback processes allow the social impact of AI technologies to be evaluated, ensuring that the solutions developed respond to community needs and respect public interest values. 
Directly involving citizens in experimentation processes can also help increase trust in AI solutions, showing how the risks associated with technology implementation are managed. Impact Assessment and Transparency : Every project initiated within sandbox areas must be subject to a rigorous ethical, social, and legal impact assessment. The impact assessment allows potential risks related to privacy, algorithmic discrimination, or other critical aspects to be identified and corrective measures to be introduced before large-scale implementation. Moreover, it is essential to ensure the transparency of the test results conducted in sandbox areas by publishing detailed reports describing the experimentation process, results obtained, and lessons learned. Guidelines for Sandbox Implementation : To ensure effective use of sandbox areas, clear guidelines must be established defining the process of creating and managing sandboxes, the criteria for selecting projects to be tested, and the methods of supervision. These guidelines must ensure that all projects are in line with the values and objectives of the public administration, comply with existing regulations, and adopt a risk-based approach to ensure the safety and compliance of developed solutions. Regulatory and Financial Support : Creating sandbox areas requires adequate regulatory and financial support. Public administrations must be able to rely on a flexible regulatory framework that allows regulatory experimentation without excessive constraints. At the same time, financial resources must be available to support the costs of experimentation, including those related to technological infrastructure and training of involved personnel. Feedback and Continuous Improvement : One of the goals of sandbox areas is to create a continuous cycle of feedback and improvement. 
Every experiment should be followed by a careful analysis of results to improve not only the tested technology but also the experimentation process itself. This iterative approach allows AI solutions to be adapted to the real needs of public administrations and citizens, ensuring that every development phase is based on learning and continuous improvement. Integration with European Innovation Policies : Regulatory sandbox areas must be closely integrated with European policies on innovation and AI, such as the AI Act and the Interoperable Europe Act. This integration is essential to ensure that solutions developed in sandboxes are aligned with European regulations and can be easily scaled at a cross-border level, promoting greater interoperability and a wider spread of best practices in the public sector. These practices of regulatory experimentation and sandbox areas aim to reduce the risk associated with adopting innovative technologies, improve the quality of developed solutions, and ensure that AI is used responsibly and transparently in the public sector. The combination of experimentation, collaboration, and impact assessment represents a comprehensive approach to maximizing AI's potential and ensuring that the benefits are equitably distributed among all citizens. Strengthening Ethical and Legal Governance Practices: Strengthening ethical and legal governance practices is crucial to ensure that AI adoption in the public sector takes place responsibly and in line with community values. Below are the main actions to be taken to ensure ethical and legal AI implementation: Creation of ethical guidelines for AI development: Ethical guidelines are needed to establish clear criteria for the development and use of AI in the public sector. These guidelines must cover various aspects, including data collection and use, bias management, responsibility of developers and operators, and privacy protection.
The guidelines must be integrated into procurement and development processes, ensuring that each adopted AI solution aligns with approved ethical principles and the European regulatory framework.

Ethical and legal impact assessments: Each AI project must be accompanied by an ethical and legal impact assessment that analyzes its potential consequences in terms of fairness, privacy, security, and transparency. These assessments must be conducted early and updated throughout the project lifecycle, identifying potential risks and providing corrective measures to mitigate them.

Establishment of ethical committees: The creation of ethical committees at the national or local level aims to oversee key AI decisions. These committees must be composed of ethics experts, public sector representatives, academics, and civil society members. Their role is to assess AI projects, offer ethical recommendations, and ensure that the principles of fairness and non-discrimination are respected and that the public interest is always at the center of decisions made.

Definition of standards for algorithmic auditing: Algorithms used by public administrations must be subject to periodic audits to ensure compliance with regulations and prevent bias or misuse. Auditing must include a transparent analysis of the algorithm's functioning, identification of possible distortions, and verification of accuracy and reliability. It is important to establish a formal process for auditing and identify key performance indicators (KPIs) that allow the effectiveness and impact of algorithms to be evaluated.

Ensuring transparency and accountability: To strengthen AI governance, it is essential to promote transparency at every stage of AI technology development and implementation. Public administrations must clearly communicate the purposes for which AI is used, the data employed, and how algorithmic decisions are made.
Accountability must be ensured through a governance system that allows citizens to challenge decisions made by AI technologies where those decisions may significantly impact their rights.

Control over data collection and use: Data is the foundation on which AI models are trained, and it is therefore essential that data collection and use are carried out responsibly. Public administrations must ensure that the collected data is of high quality, relevant, and managed according to privacy regulations. Data minimization, i.e., collecting only the strictly necessary data, and pseudonymization (a technique that replaces identifying data with pseudonymous identifiers to protect individuals' identities) are key practices for ensuring the safe and compliant use of personal data.

These actions aim to ensure that AI adoption in the public sector takes place safely, responsibly, and in line with public interest values. Strengthening ethical and legal governance practices is crucial to promoting citizen trust in AI use and ensuring that this technology contributes to improving public services without compromising individual rights and freedoms.

Creating a Support Ecosystem for Digital Transformation

Creating a support ecosystem for digital transformation in the public sector is not just about providing financial and technological resources but also about developing a network of actors and institutions that work together to foster innovation. Below are the main components and actions necessary to ensure an effective and resilient ecosystem for digital transformation:

Institutional and political support: It is essential that there is solid institutional support for digital transformation. Governments must develop clear strategic plans for the adoption of AI and other digital technologies, including specific objectives and defined deadlines.
This support must be accompanied by favorable policies that encourage digitalization, remove bureaucratic barriers, and promote a coordinated vision across all levels of public administration, from national institutions to local communities.

Knowledge-sharing platforms: Knowledge sharing is a key element for digital transformation. Public administrations must have access to platforms that facilitate the exchange of experiences, best practices, and case studies. Platforms such as the Public Sector Tech Watch (PSTW) can help reduce the learning curve for new technologies and enable the rapid dissemination of innovations that have been successful in other settings. The availability of easily accessible resources and documentation is crucial to accelerating the digitalization process.

Financial support and access to European funds: Digital transformation requires significant investments, and it is essential that public administrations have access to adequate funding. Funds such as Horizon Europe, the Digital Europe Programme, and the Recovery and Resilience Facility (RRF) are crucial to supporting large-scale digital transformation projects. However, it is equally important to provide technical and consulting support to administrations to facilitate access to these funds, ensuring that even small and medium-sized administrations can benefit from these financial opportunities.

Incentives for innovation and recruitment of digital talent: Public administrations must create incentives to attract and retain talent with digital skills. Hiring experts in AI, data science, and digital transformation is crucial for the success of any innovation strategy. Incentives such as innovation awards, advanced training opportunities, and dedicated career paths can help build an expert team capable of driving change within administrations. Additionally, recruitment programs targeting new generations of digital talent can help bridge the technology skills gap in the public sector.
Flexible regulatory framework: The success of digital transformation also depends on the presence of an appropriate regulatory framework. Member States must adopt a regulatory approach that is flexible enough to allow innovation while at the same time protecting citizens from potential abuses. Regulations must be updated periodically to reflect the evolution of technologies and societal needs, ensuring that they align with ethical principles and human rights protections.

These actions and components are essential for creating a support ecosystem for digital transformation in the public sector. Only through access to adequate resources and strong institutional commitment will it be possible to fully harness the potential of AI and emerging technologies.

Promoting a Culture of Innovation and Calculated Risk

Promoting a culture of innovation and calculated risk is essential to ensure that the public sector can experiment with and adopt new technologies such as AI without being paralyzed by fear of failure. A culture that accepts calculated risk and encourages innovation can produce more creative and effective solutions to respond to public sector challenges. Below are the main actions to take to build a culture of innovation and calculated risk:

Encourage experimentation and learning from mistakes: It is crucial to create an environment where mistakes are considered part of the learning process, rather than failures to be avoided at all costs. Public administrations must promote a culture in which staff are encouraged to experiment with new solutions and learn from mistakes. This can be achieved through pilot programs that allow new ideas to be tested in a protected environment without the negative consequences of immediate large-scale implementation.

Training and support for managing innovation: Innovation management requires specific skills that are often not present in traditional public sector structures.
For this reason, it is important to provide specific training for managers and project leaders to develop the skills needed to manage innovative processes and make strategic decisions in situations of uncertainty. This training must also include aspects related to risk management, opportunity identification, and mitigation of negative effects.

Encourage a proactive and open-minded attitude towards change: Administrations must actively foster this attitude among their staff. This can be achieved through internal communication campaigns that emphasize the benefits of innovation and showcase success stories, as well as through sharing best practices within the organization. Leadership that actively supports change and innovation is crucial to creating an environment that encourages staff to be proactive and experiment with new ideas.

Promote the adoption of Design Thinking techniques: Design thinking is a creative and user-centered approach that can help public administrations solve complex problems. Integrating design thinking into decision-making processes allows new ideas to be explored, tested quickly, and adapted based on feedback received. This approach keeps the focus on citizens' needs and helps find innovative solutions that improve the quality of public services.

Risk assessment and management of uncertainties: Innovation inevitably involves risks. Therefore, it is crucial to implement risk management practices that allow for identifying, evaluating, and mitigating the risks associated with adopting new technologies. Public administrations must develop methodologies to assess uncertainties and make informed decisions that balance opportunities and risks, ensuring that adopted innovations are sustainable and do not jeopardize citizens' safety or trust.

Leadership that supports change: Promoting a culture of innovation and calculated risk requires visionary leadership willing to support change.
Leaders must be the first to demonstrate openness to innovation, creating an environment that not only accepts but encourages reasoned risk. This type of leadership is essential to overcome internal resistance and motivate staff to engage in digital transformation projects.

These actions aim to develop a public sector culture that is oriented towards continuous improvement, learning from mistakes, and experimentation. Only by creating an environment where calculated risk is considered an integral part of the innovation process will it be possible to fully harness the potential of AI and other digital technologies to improve public services and meet the ever-changing needs of citizens.

Conclusions

AI governance in the public sector is not just a matter of technical or regulatory skills but represents a profound cultural and strategic change. In this transition, the public sector faces a crucial challenge: adopting AI not only as a technological tool but as a catalyst for rethinking how the state interacts with citizens and responds to their needs. A public organization’s ability to leverage AI depends not only on financial resources or adequate regulations but above all on a clear and shared vision that sees technology as an opportunity to build trust, equity, and innovation.

The greatest risk for the public sector is not the improper adoption of AI, but the failure to undertake the cultural transformation necessary to make it a tool for social progress. AI, with its ability to automate complex processes and analyze massive amounts of data, can improve the efficiency of public services, but without inclusive governance, it risks creating an even greater gap between institutions and citizens. The most vulnerable communities could be excluded from these benefits, not due to a lack of adequate technologies, but because of systems that do not consider everyone’s needs.
This is where ethical governance becomes the real strategic pillar: not as a constraint but as a lever to ensure that AI serves the public interest.

Another fundamental aspect is the value of experimentation. The creation of regulatory sandboxes, which is much discussed, should not be seen merely as a protected environment to test technologies but as a symbol of the new attitude that the public sector must adopt. These spaces allow failure to become learning, a concept that radically challenges the traditional risk aversion typical of public bureaucracy. Organizations that manage to cultivate a culture of calculated risk will become examples of how AI can not only be implemented but also continuously improved in response to citizens’ needs.

The skills required by AI go far beyond technology. Certainly, the public sector needs experts in machine learning or data science, but the true engine of transformation will be the ability to integrate these skills with visionary leadership and strong ethics. Leadership in this context does not mean merely being able to make technological decisions, but above all being able to communicate an inclusive and future-oriented vision. This leadership must be capable of navigating the complexities of regulations, citizen expectations, and partnerships with the private sector.

Public-private partnerships represent another strategic turning point. However, the public sector must not settle for being a "customer" of the private sector. It must become an active partner, capable of negotiating solutions that respect public values and are transparent in their implementation. This collaboration must go beyond simple technological supply: the public sector has a duty to lead the dialogue on how AI should be designed, implemented, and monitored to ensure equitable benefits.

Finally, the real transformation will be measured not only by operational improvements but by AI's ability to strengthen the social contract between the state and citizens.
AI can become a tool to make institutions more transparent and accountable, but only if citizens are actively involved in its design and monitoring. Trust will be the true success indicator: not blind trust in technology, but trust built on open processes, tangible results, and a visible commitment to the common good.

This reflection highlights that AI adoption in the public sector is not just a matter of how to do it but of why to do it. The risk is not technological but strategic: missing the opportunity to make AI an ally of social progress rather than a mere machine at the service of efficiency. The decisions made today will not only determine the effectiveness of public services but will define the role of institutions in an increasingly digital and interconnected society.

Podcast: https://spotifycreators-web.app.link/e/IChmFr8LVOb

Source: https://publications.jrc.ec.europa.eu/repository/handle/JRC138702
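
Illustrative note: the data minimization and pseudonymization practices described under "Control over data collection and use" can be made concrete with a minimal Python sketch. It is not taken from the source report. The record, the field names, and the secret key are hypothetical, and keyed hashing (HMAC-SHA256) is only one common way to derive pseudonymous identifiers.

```python
import hashlib
import hmac


def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Derive a stable pseudonym from an identifier with HMAC-SHA256."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    # Truncated to 16 hex characters for readability (an arbitrary choice here).
    return digest.hexdigest()[:16]


def minimize_and_pseudonymize(record, id_field, keep_fields, secret_key):
    """Keep only the listed fields and replace the identifier with a pseudonym."""
    # Data minimization: copy only the fields that are strictly needed.
    reduced = {name: record[name] for name in keep_fields if name in record}
    # Pseudonymization: the direct identifier never reaches the output.
    reduced["pseudonym"] = pseudonymize(str(record[id_field]), secret_key)
    return reduced


# Hypothetical key and record, for illustration only.
SECRET_KEY = b"replace-with-a-key-from-a-secure-store"
record = {
    "tax_id": "RSSMRA80A01H501U",
    "name": "Mario Rossi",
    "municipality": "Roma",
    "service": "school-meals",
}

safe_record = minimize_and_pseudonymize(
    record, "tax_id", ["municipality", "service"], SECRET_KEY
)
```

The same identifier always maps to the same pseudonym under a given key, so records can still be linked for analysis. Anyone holding the key can regenerate the mapping, which is why the key would need to be stored separately from the data and why the output remains personal data that requires protection.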