Generative AI Implementation & Penetration Testing

Artificial Intelligence has become indispensable in today's world, with businesses across most industries relying on it for their daily operations. Generative AI (Gen AI) is moving from purely chat-based applications towards context-based applications that combine Large Language Models (LLMs) with internal data to serve specific business use cases. These use cases range from summarizing or translating documents, answering questions from documents, and querying large volumes of data through prompts, to placing orders. Along with the convenience and efficiency it brings to organizations, Gen AI also raises concerns around ethical standards, data privacy, and theft of sensitive information.

When context-based Gen AI applications are built, the LLMs require large amounts of data for training and optimal functioning, along with carefully designed restrictions on their operations. This makes them prone to a large set of vulnerabilities depending on the implementation, the actions allowed to end users, and the architecture - the front-end and back-end layers/APIs and the permission layers for each. OWASP maintains an evolving Top 10 list of LLM vulnerabilities that will keep changing with each new implementation, but below is a brief overview of what we have commonly observed in such implementations:

Jailbreaking GPT/Prompt Injection - Jailbreaking refers to modifying or bypassing the restrictions placed on a model to gain unauthorized access to, or control over, its behavior or capabilities. The danger of prompt injection lies in its potential to manipulate the model into producing biased or misleading output, particularly in contexts where the generated content has significant real-world implications - misinformation, propaganda, unintended data leakage, unauthorized access to sensitive information, or unethical manipulation. This has also been detailed in one of our previous blogs - 
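As a toy illustration (the names and marker phrases below are ours, not from any specific framework), the following sketch shows why naively concatenating user text with system instructions is injectable, together with a crude keyword screen - at best one small layer in a defence-in-depth approach, never a complete control:

```python
# Minimal sketch of a prompt-injection exposure. Illustrative only.

SYSTEM_RULES = "You are a support bot. Never reveal internal pricing data."

def build_prompt_naive(user_input):
    # Vulnerable pattern: user text sits directly next to the system rules,
    # so "ignore previous instructions" competes with them on equal footing.
    return SYSTEM_RULES + "\n" + user_input

INJECTION_MARKERS = (
    "ignore previous instructions",
    "disregard the above",
    "you are now",
)

def looks_like_injection(user_input):
    # Crude keyword screen - trivially bypassable on its own; real defences
    # layer model-side guardrails with application-side controls.
    lowered = user_input.lower()
    return any(marker in lowered for marker in INJECTION_MARKERS)
```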

Insecure Input/Output Handling - When an LLM-based application serves as a direct upstream/downstream component, user-injected input may gain direct access to additional functionality or be displayed verbatim to other users. This can lead to vulnerabilities such as Remote Code Execution (where input from a user prompt is consumed directly as system commands) or Cross-Site Scripting (XSS), where user-injected data is reflected back to end users without any input/output encoding.
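A minimal, hypothetical sketch of the two mitigations implied above - encode model output before rendering it into HTML, and never splice user or model text into a shell string:

```python
import html

def render_llm_output(raw):
    # Output handling: encode model output before echoing it into an HTML
    # page, so an injected <script> payload is displayed as text, not executed.
    return html.escape(raw)

def build_command(filename):
    # Input handling: never build a shell string from user/model text.
    # Passing it as a discrete argument (no shell=True) means the payload
    # cannot terminate the command and start a new one.
    return ["wc", "-l", filename]
```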

Sensitive Information Disclosure/Improper Access Control - LLM-based applications can reveal sensitive or confidential details through their output, resulting in unauthorized access to client data, intellectual property, personally identifiable information (PII), or other security breaches. The unpredictable nature of LLMs leads to cases where the applied restrictions, if any, are not honoured and are circumvented by various means.
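One common compensating control is a post-processing filter that scrubs model output before it reaches the end user. The sketch below is illustrative only - the patterns are ours and far from exhaustive, and a real deployment needs much broader PII detection:

```python
import re

# Hypothetical redaction pass over model output. Patterns are illustrative.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
US_SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def redact_pii(text):
    # Replace anything matching a known PII shape before display/logging.
    text = EMAIL_RE.sub("[REDACTED EMAIL]", text)
    return US_SSN_RE.sub("[REDACTED SSN]", text)
```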

Insecure Plugin Design - Plugins add extra features to the LLM, such as text summarization, question & answer, or translation tools. They may contain coding mistakes, weak authentication, or insecure communication, making them targets for attacks like injection or unauthorized access. Attackers can use these weaknesses to access sensitive data, change how the LLM works, or run harmful code on the system.
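A defensive pattern here is to validate every plugin call the model emits against a fixed schema before anything executes, rather than trusting the model's output. The plugin names and parameter fields below are invented for this sketch:

```python
# Hypothetical plugin-call validator: only known plugins with known
# parameters are allowed through, whatever the LLM asks for.
ALLOWED_PLUGINS = {
    "summarize": {"text"},
    "translate": {"text", "target_lang"},
}

def validate_plugin_call(plugin, params):
    # Reject plugins outside the allow-list entirely.
    if plugin not in ALLOWED_PLUGINS:
        raise PermissionError("unknown plugin: " + plugin)
    # Reject any parameter the schema does not expect (e.g. a smuggled path).
    unexpected = set(params) - ALLOWED_PLUGINS[plugin]
    if unexpected:
        raise ValueError("unexpected parameters: " + str(sorted(unexpected)))
```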

Training Data Poisoning/Supply Chain Vulnerabilities - The starting point for LLMs is training data, or raw text, from which they learn the patterns used to generate output. Training data poisoning happens when someone intentionally adds biased, false, or malicious content to the training data, either directly or through third-party models used in the implementation.

Excessive Agency - LLM-based applications typically need a degree of authority to perform certain tasks/actions in response to prompts. Excessive agency is a vulnerability that occurs when damaging actions can be performed based on unexpected output from the LLM (possibly arising from a separate vulnerability), typically due to excessive permissions, excessive functionality, or excessive autonomy.
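The standard mitigation is least privilege on the application side: whatever text the model produces, it can only trigger actions from a fixed, minimal set. The action names and handler table below are hypothetical:

```python
# Hypothetical least-privilege dispatcher: the LLM may only invoke
# read-only actions; anything else is refused regardless of its output.
READ_ONLY_ACTIONS = {"lookup_order", "get_faq"}

def execute_action(action, handlers):
    # Enforce the allow-list before touching any handler.
    if action not in READ_ONLY_ACTIONS:
        raise PermissionError("action not permitted for the LLM: " + action)
    return handlers[action]()
```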

LLM Hallucination/Overreliance - LLMs suffer from limitations that make them prone to hallucination by default: the model tends to give factually incorrect information to end users with complete confidence. LLMs are prone to intrinsic hallucination (output contradicting the source information) and/or extrinsic hallucination (output adding information that cannot be inferred from the source information).

Denial of Service (DoS) - LLMs are susceptible to Denial of Service attacks, affecting either the model itself or other users, through high-volume request generation in a short period of time, repeated inputs, floods of variable-length inputs, or complex inputs that seem normal but drive up resource utilization at the back end.
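Two simple application-side controls against this are an input size cap and a per-user request quota. The limits and names below are illustrative; an in-memory limiter like this is a sketch only and would need shared state in a multi-instance deployment:

```python
from collections import deque

MAX_PROMPT_CHARS = 4000        # illustrative size cap
WINDOW_SECONDS = 60.0
MAX_REQUESTS_PER_WINDOW = 20   # illustrative per-user quota

class SlidingWindowLimiter:
    """Per-user sliding-window request counter (in-memory sketch)."""

    def __init__(self):
        self._hits = {}

    def allow(self, user, now):
        window = self._hits.setdefault(user, deque())
        # Drop timestamps that have aged out of the window.
        while window and now - window[0] > WINDOW_SECONDS:
            window.popleft()
        if len(window) >= MAX_REQUESTS_PER_WINDOW:
            return False
        window.append(now)
        return True

def accept_prompt(prompt, user, limiter, now):
    # Reject oversized prompts outright, then apply the per-user quota.
    return len(prompt) <= MAX_PROMPT_CHARS and limiter.allow(user, now)
```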

Why is it imperative to perform a security review/penetration test of Gen AI based contextual implementations?

1.    Protecting Sensitive Data & Preventing Unauthorized Access – As context-based applications may use models trained on proprietary business data and/or personal information, it becomes necessary to protect this data from unauthorized access or leakage, reducing the risk of data breaches.

2.    Ensuring Compliance – With regulations like GDPR, CCPA, and HIPAA, organizations cannot afford to take risks with sensitive data, given the data classification of their applications and the potential legal and financial repercussions of non-compliance.

3.    Trust & Reputation - Security breaches can severely damage an organization's reputation and its users' trust. Proactively identifying and addressing security vulnerabilities through reviews and tests demonstrates a commitment to security and preserves that trust.

Over the next few posts, we will outline some real-world scenarios, implementations, and vulnerabilities that we have come across in our Gen AI penetration testing, which requires a completely different approach and framework from our standard black-box penetration testing methodologies.

Article by Rishita Sarabhai & Hemil Shah