Since the beginning of 2023, practical AI applications have been getting much more attention, and every week brings major news about new capabilities, scenarios, or applications. This progress is obviously based on decades of scientific research in Machine Learning (ML) and on significant Deep Learning (DL) advances in the last few years that have led to spectacular AI applications, most recently in Generative AI and Large Language Models (LLMs) like GPT. This seems to be the specific moment when AI broke into public perception. Within weeks, AI stopped belonging to the domains of science fiction and discussions in computer science or psychology departments and became something tangible that is both very promising and very threatening.
Beginning of the next technological revolution
This is not the first time we have experienced a technological revolution with the potential to change the very foundations of our civilization. The closest parallel is probably the emergence of the Internet some 30 years ago. Establishing a global open network was quickly recognized as a revolution, but the practical applications and their consequences for individuals, groups, businesses, and societies still exceeded our furthest hopes and expectations. Unfortunately, the same applies to our biggest fears, as we were not able to imagine how Internet technologies could be flawed, what the consequences of their vulnerabilities would be, and all the creative ways they would get abused.
There are similarities between these two revolutions but also some differences. The Internet and mobile devices fundamentally changed how we communicate. In a similar way, AI applications will change how we make most of our decisions. In the latter case, the disruptive impact may be much broader, not limited to digital reality but extending to the physical one, and quickly affecting even those who would prefer not to use AI. The most significant difference, however, is the speed at which AI technologies are being applied in practical scenarios. We see a race between researchers, big companies, and startups to apply AI to old problems and to find brand-new ones. Changes that used to take years now seem to happen within weeks.
We need responsible innovation in AI
A focus on benefits is essential for innovation, but to be successful, innovation must also be responsible. Failure in that regard may delay or entirely prevent the realization of otherwise realistic opportunities, or lead to catastrophic consequences. This becomes even more important for AI applications in domains with scenarios that should be identified as critical, like healthcare, education, transportation, law, or law enforcement. These are the domains where AI applications might deliver very significant benefits but are also connected with the biggest risks and unknowns. Mistakes in these cases can be very expensive or simply unacceptable by multiple criteria (with healthcare as the best possible example).
We are not talking about the future. Even though generative and conversational AI applications are getting most of the current attention, narrow AI solutions of varying complexity have been used across industries for many years to automate or augment business decision processes. With the popularity and public recognition of AI opportunities, we can expect the development of new AI applications to increase dramatically. As exciting as this progress is, it should also be very scary. With AI applications being introduced in so many different contexts, it is critical to establish a practical approach to looking at what can possibly go wrong. We need methods, tools, and experts, but most of all, time to properly analyze threats, understand risks, and make informed decisions about required mitigations. AI applications must be secure and safe in order to be useful.
Different understandings of AI Security
We will talk about practical requirements and other applicable definitions in future posts, but let’s start with the different possible understandings of the general term AI Security. Because this is a new area, the term can have different meanings depending on the context, usage, or particular audience. This confusion can be especially evident in conversations between AI/ML experts and cybersecurity professionals, with at least five possible understandings of AI Security in circulation.
1. Attacking AI Applications. This category refers to AI applications as the target of an attack. The attacks usually involve an adversary with specific goals aimed at AI platforms, models, or decision processes that are automated or augmented with AI. Different types of Adversarial ML techniques have been researched, including evasion at inference time, data poisoning, model inversion/extraction, and membership inference, as well as attempts to disrupt critical scenarios dependent on results from AI components. These attacks can be specific to the use context, and new types can emerge with new applications, for example, jailbreaking or prompt injection in ChatGPT. (A minimal sketch of one such attack follows this list.)
2. Misuse of AI Applications. This category refers to the potential harm caused by decisions made during the development, deployment, or operation of AI applications. That can include not understanding the full impact of an AI application (who can be affected?), dismissing critical aspects of implementation (such as bias in or copyrights of training data), failing to predict, or simply ignoring, unintended consequences, or using an AI model in an inappropriate context (different from the one it was intended and tested for). These concerns also cover dependencies on third parties, sharing data externally, and general questions about trust and accountability.
3. Malicious use of AI. This category covers using AI capabilities offensively, in novel attacks or to complement and upgrade existing methods. In cybersecurity, we may expect highly automated social engineering attacks that are interactive, highly contextual, and aimed at vulnerable groups (e.g., the elderly). AI can also be used to automate different attack techniques, facilitating more Advanced Persistent Threats (APTs) or the next generation of autonomous malware. In a broader context, we should be ready for more highly coordinated campaigns that spread misinformation, automate fraud, damage reputations, shift sentiment, or poison political discourse.
4. Defensive use of AI. This category refers to using AI capabilities to mitigate traditional and new attacks (including those related to the malicious use of AI). Machine Learning has been used for years in fraud detection and Intrusion Detection Systems, and the marketing trend of adding AI to all types of security controls will likely only grow. AI will be indispensable in dealing with automated attacks and processing vast amounts of data. Such applications will face particular requirements, some ideas will be revisited (e.g., voice-based authentication), and new solutions will need to be developed (e.g., detecting AI involvement). In every case, we will need to manage our expectations. (A small sketch of this defensive pattern also follows the list.)
5. AI Autonomy and Control. This last category is the most distinctive and relates to the significant concerns brought up in discussions about Artificial General Intelligence (AGI). Even though AGI is likely still a matter of the future (pending a precise definition), there are critical questions regarding already existing AI applications. Those questions include the use of AI in high-risk or mission-critical scenarios, questions about transferring levels of control, the practical problem of value alignment between AI and its creators (and users!), and the challenges of emergent behaviors connected to our limited understanding of AI black boxes.
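As a purely illustrative aside to category #1, here is a minimal sketch of one of the Adversarial ML techniques named above: a gradient-guided evasion attack in the FGSM style, where a small perturbation is added to an input so that a classifier is more likely to misclassify it. The toy model, input shape, and epsilon value are assumptions made up for this example, not a reference to any real system.

```python
import torch
import torch.nn.functional as F

def fgsm_evasion(model, x, true_label, epsilon=0.03):
    """Craft an adversarial input that tries to evade the model's correct prediction."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), true_label)
    loss.backward()
    # Nudge the input in the direction that increases the loss, bounded by epsilon.
    return (x_adv + epsilon * x_adv.grad.sign()).clamp(0.0, 1.0).detach()

# Toy usage with a hypothetical image classifier (illustrative only).
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 10))
x = torch.rand(1, 1, 28, 28)            # a "clean" input
label = torch.tensor([3])               # its true class
x_adv = fgsm_evasion(model, x, label)   # a perturbed input the model is more likely to misclassify
print((x_adv - x).abs().max())          # the perturbation stays within epsilon
```

Defending an AI application against this kind of manipulation (adversarial training, input validation, monitoring) is part of what makes category #1 a security engineering problem and not only an ML quality problem.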
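In the same spirit, category #4 can be grounded with an equally minimal sketch of the long-established defensive pattern mentioned above: training an anomaly detector on activity assumed to be benign and flagging outliers, as in fraud detection or Intrusion Detection Systems. The feature names and numbers are invented for illustration and do not describe a production detection pipeline.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(seed=0)
# Hypothetical per-session features: [requests per minute, bytes sent (KB), failed logins]
normal_sessions = rng.normal(loc=[5, 200, 0.2], scale=[1, 30, 0.3], size=(1000, 3))
suspicious_sessions = np.array([[60.0, 5200.0, 9.0],   # brute-force-like burst
                                [4.0, 9000.0, 0.0]])   # unusually large data transfer

# Fit only on traffic assumed to be benign; flag deviations as anomalies.
detector = IsolationForest(contamination=0.01, random_state=0).fit(normal_sessions)
print(detector.predict(suspicious_sessions))  # -1 marks sessions scored as anomalous
```

As noted above, expectations need to be managed: detectors like this produce false positives, and they can themselves become targets of the attacks described in category #1.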
These different ways of looking at AI Security point to various problems, priorities, and very focused efforts, but with current developments, they may very quickly converge in practice. We may expect to see AI components used to attack (#3) legitimate AI applications (#1) that will be defended with the help of solutions with AI capabilities (#4). Using AI in an inappropriate context (#2) can make such attacks easier and defending against them much more difficult. Even adding the concept of autonomy (#5) in the context of sophisticated malware does not seem far-fetched.
Security as the key requirement for practical AI
The risks related to AI applications have been the subject of interesting conversations for years, but these topics have now become much more urgent — we need practical tools, techniques, and processes that can be applied every time we introduce AI in a new context. In the meantime, we are still learning about the general requirements for AI applications. Not so long ago, the discussions about AI risks were focused on Ethical AI; it seems that now we have moved to the Responsible use of Trustworthy AI, which is a more precise and complete way to describe the challenges we are facing. Security has been part of these conversations but usually stays a bit in the background, covered by many other related requirements such as safety, privacy, resilience, robustness, accountability, and trustworthiness.
If an AI system is not secure, it cannot be safe, fair, ethical, private, accountable, transparent, explainable, interpretable, reliable, resilient, robust, accurate, effective, sustainable, purposeful, aligned, or generally speaking — trusted.
There are many great projects focused on AI security, including risk management, threat modeling, and security guidelines; there is even a new category identified as AI TRiSM — AI Trust, Risk, and Security Management. However, we seem to be struggling with explicitly recognizing security as a separate core requirement for AI applications and as the foundation for most of the other critical requirements in the space. We cannot move forward without understanding the risks, practical requirements, constraints, and guidelines for using specific models in particular contexts. If an AI system is not secure, it cannot be safe, fair, ethical, private, accountable, transparent, explainable, interpretable, reliable, resilient, robust, accurate, effective, sustainable, purposeful, aligned, or generally speaking — trusted.
AI Security engineering based on cybersecurity
From a cybersecurity point of view, new technologies and applications usually lead to new threats, requiring new controls, frameworks, and processes. That is fully applicable to AI technologies, with the justified expectation of many new problems that will have to be addressed, to mention just a few: model-specific attacks, new types of attack surfaces, the complexity of trust and dependencies on third parties, or the need for new operational procedures (how do you patch a model?). We need a new field of AI Security engineering focused on practical applications, based on our cybersecurity experience but going beyond it and connected with the multiple efforts covering other socio-technical and ethical aspects of AI applications.
Cybersecurity is not only about technology but also about a culture that looks beyond benefits, balancing them against risks without forgetting the reality of practical applications.
Cybersecurity is not only applicable as the foundation for AI Security; there are also unique opportunities to learn from the evolution of the field. Modern security has been built on a history of failures, and many of those lessons can be used to avoid making similar mistakes in the new context. That can be of immense value given the fast pace of AI adoption, as we do not have enough time to learn from scratch. Cybersecurity is not only about technology but also about a culture that looks beyond benefits, balancing them against risks without forgetting the reality of practical applications. That culture can be effectively adapted for AI Security and for addressing the even broader need for responsible innovation in AI. Especially at the beginning, when we know only some of the questions and even fewer of the answers, the right culture may be more important than the technology.
We are only at the beginning of the road toward the Responsible use of Trustworthy AI. We are not sure what to expect because we do not know exactly where we are going. Technology is never perfect, and there are always individuals and groups who will try to cause harm, either intentionally or through irresponsible actions. This time, much more may be at stake. Security must become an essential part of the conversation about any non-trivial AI application throughout its lifecycle, from design and development, through deployment and operation, to responding to incidents and changes, including unexpected ones. Progress in security starts with the fundamental question: what can possibly go wrong? In every room full of people excited about AI’s opportunities and benefits, we need an individual (or a group!) whose job and responsibility is to keep rephrasing that question in all its possible forms and flavors. That will also be the goal of the posts in this series.
Updated on Aug 30th, 2023, with references and minor fixes based on received feedback.