The Fading Line Between Data and Instruction in LLM-Driven Applications

Overview:

As technology continues to advance at a rapid pace, the line between data and instruction has become increasingly blurred, particularly in the case of Language Model (LM) driven applications. Language models like GPT-3.5, which is based on the LLM (Large Language Model) architecture, have the ability to process and generate vast amounts of text, enabling them to perform tasks that were once considered exclusive to human intelligence.

Language models have evolved significantly over the years, and LLMs have taken this evolution to new heights. Traditional programming paradigms relied on explicitly coded instructions to process data. However, LLMs have the capacity to learn from large datasets, extract patterns, and generate text based on that knowledge. This shift enables them to perform a wide range of tasks, such as natural language understanding, text completion, and even creative writing.

In LLM-driven applications, the once clear-cut distinction between data and instruction is fading away. The data fed to these models is not just passive information but can also serve as implicit instructions. For example, in a language translation application, the input sentence is treated as data, but it also serves as an instruction to generate the corresponding translation. The LLM processes the data and simultaneously performs the instructed task, blurring the line between the two.

Data and Instructions are merging – context driven:


One of the key reasons behind the fading line between data and instruction in LLM-driven applications is the dynamic interpretation of input. LLMs have the ability to comprehend and interpret the context of the data provided to them, allowing them to determine the underlying task or instruction. This dynamic interpretation enables LLMs to generate responses or outputs that align with the intended task, even if it is not explicitly specified.

LLMs excel at contextual understanding, allowing them to extract meaning from data in a way that was previously reserved for humans. By analyzing the surrounding text, LLMs can infer the desired instructions or tasks. This context-driven approach enables them to generate highly accurate and relevant outputs, even when the inputs are incomplete or ambiguous.

Security impact:

The blurring line between data and instruction in LLM-driven applications opens up new possibilities and raises important considerations. On one hand, it allows for more seamless and intuitive interactions with technology, as users can input data in a more natural and flexible manner. On the other hand, it raises concerns about the potential for misinterpretation or bias in the generated outputs.

The line between data and instruction is undeniably fading away in the realm of LLM-driven applications. Through dynamic interpretation and contextual understanding, LLMs are able to process data and simultaneously perform tasks without the need for explicit instructions. While this blurring of boundaries brings great advancements and convenience, it also presents challenges related to security.

Prompt Leakage

Prompt leakage in language models has raised concerns regarding the unintended disclosure of sensitive information, biases, and limitations that may be embedded in their initial prompt configurations. This phenomenon occurs when language models inadvertently expose these initial prompts, leading to potential privacy and security risks.

Language models, such as the LLM (Large Language Model), often come with predefined prompts as part of their initial configuration. These prompts serve as starting points for generating responses and guiding the model's behavior. While these prompts may contain sensitive information, limitations, or inherent biases, it is essential to ensure that they remain confidential and are not inadvertently revealed. Hence, prompt leakage can occur when a language model inadvertently discloses its initial prompt configurations.

The following link provides a compilation of examples showcasing leaked system prompts: [Link: https://matt-rickard.com/a-list-of-leaked-system-prompts].

Prompt Injection

A Large Language Model (LLM) is a powerful and complex type of language model that is characterized by its immense size and parameter count. LLMs, often based on architectures like Transformers, are trained on massive datasets, enabling them to learn patterns, grammar, and semantics from vast amounts of text. These models possess billions of parameters, allowing them to generate coherent and contextually relevant text for a wide range of tasks and applications.

In the context of machine learning, particularly with LLMs, a prompt plays a crucial role. A prompt refers to the initial input or instruction given to the model to guide its behavior and influence the output it generates. It serves as a starting point or context for the model's response or task performance. Prompts can take various forms, such as a few words, a sentence, a paragraph, or even a dialogue, depending on the specific requirements of the model and the task at hand. The purpose of a prompt is to provide necessary context, constraints, or explicit instructions to guide the model's decision-making process and shape its generated output.

Prompt injection, on the other hand, involves deliberately inserting a targeted instruction, query, question, or context into the model's prompt to manipulate or influence its subsequent output. By carefully crafting the prompt, users can steer the model's responses in a desired direction or elicit specific types of answers. Prompt injection allows users to have more control over the generated text and tailor the model's output to meet specific criteria or objectives.

The concept of prompt injection is particularly valuable when fine-tuning the model's responses to achieve desired outcomes or generate content that aligns with specific requirements. It empowers users to guide the model's creative output and shape the conversation according to their needs. Prompt injection can be employed in various applications, such as generating creative writing, providing tailored responses in chatbots, or assisting with specific tasks like code generation or translation.

However, it is important to note that prompt injection can also introduce vulnerabilities and ethical concerns. Malicious actors may attempt to manipulate the model to generate harmful or biased content. Therefore, it is crucial to implement safeguards, robust validation mechanisms, and regular model updates to mitigate potential risks and ensure responsible and ethical use of prompt injection techniques.

SSRF - Detection, Exploitation, and Mitigation Techniques [Part 2]

In the previous section, we explored different techniques for detecting Server-Side Request Forgery (SSRF) based on the application's scenarios. Now, let's delve into the exploitation techniques associated with SSRF, which come into play once SSRF has been confirmed within the application. These techniques aim to assess the vulnerability's risk or impact. The SSRF exploitation process can be divided into two main parts.

Exploiting Application Internal infrastructure:

  • Companies utilize various architectural patterns for running applications, including reverse proxies, load balancers, cache servers, and different routing methods. It is crucial to determine if an application is running on the same host. URL bypass techniques can be employed to invoke well-known URLs and Protocols like localhost (127.0.0.1) and observe the resulting responses. Malicious payloads can sometimes trigger error messages or responses that inadvertently expose internal IP addresses, providing valuable insights into the internal network.
  • Another approach involves attempting connections to well-known ports on localhost or leaked IP addresses and analyzing the responses received on different ports.
  • Application-specific information, such as the operating system, application server version, load balancer or reverse proxy software/platform, and vulnerable server-side library versions, can aid in targeting specific payloads for exploitation. It is also worthwhile to check if the application permits access to default sensitive files located in predefined locations. For example, on Windows systems, accessing critical files like win.ini, sysprep.inf, sysprep.xml, and NTLM hashes can be highly valuable. A comprehensive list of Windows files is available at https://github.com/soffensive/windowsblindread/blob/master/windows-files.txt. On Linux, an attacker may exfiltrate file:////etc/passwd hashes through SSRF.
  • If the application server runs on Node.js, a protocol redirection attack can be attempted by redirecting from an attacker's HTTPS server endpoint to HTTP. For instance, using a URL like https://attackerserver.com/redirect.aspx?target=http://localhost/test.
  • It is essential to identify all endpoints where the application responds with an 'access denied' (403) error. These URLs can then be used in SSRF to compare differences in responses.
  • By identifying the platform or components used in an application, it becomes possible to exploit platform-specific vulnerabilities through SSRF. For example, if the application relies on WordPress, its admin or configuration internal URLs can be targeted. Platform-specific details can be found at https://github.com/assetnote/blind-ssrf-chains, which assists in exploiting Blind/Time-based SSRF.
  • DNS Rebinding attack: This type of attack occurs when an attacker-controlled DNS server initially responds to a DNS query with a valid IP address with very low TTL value, but subsequently returns internal, local, or restricted IP addresses. The application may allow these restricted IP addresses in later requests while restricting them in the first request. DNS Rebinding attacks can be valuable when the application imposes domain/IP-level restrictions.
  • Cloud metadata exploitation: Cloud metadata URLs operate on specific IP addresses and control the configuration of cloud infrastructures. These endpoints are typically accessible only from the local environment. If an application is hosted on a cloud infrastructure and is susceptible to SSRF, these endpoints can be exploited to gain access to the cloud machine.

Amazon (https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-data-retrieval.html)

  • http://169.254.169.254/
  • http://169.254.169.254/latest/meta-data/
  • http://169.254.169.254/latest/user-data
  • http://169.254.169.254/latest/user-data/iam/security-credentials/<<role>>
  • http://169.254.169.254/latest/meta-data/iam/security-credentials/<<role>>
  • http://169.254.169.254/latest/meta-data/ami-id
  • http://169.254.169.254/latest/meta-data/hostname
  • http://169.254.169.254/latest/meta-data/public-keys/0/openssh-key
  • http://169.254.169.254/latest/meta-data/public-keys/<<id>>/openssh-key

Google (https://cloud.google.com/compute/docs/metadata/querying-metadata)

  • http://169.254.169.254/computeMetadata/v1/
  • http://metadata.google.internal/computeMetadata/v1/
  • http://metadata/computeMetadata/v1/
  • http://metadata.google.internal/computeMetadata/v1/instance/hostname
  • http://metadata.google.internal/computeMetadata/v1/instance/id
  • http://metadata.google.internal/computeMetadata/v1/project/project-id

Azure (https://learn.microsoft.com/en-us/azure/virtual-machines/instance-metadata-service?tabs=windows)    

  • http://169.254.169.254/metadata/v1/maintenance 

 

Exploiting external network

  • If an application makes backend API calls and an attacker is aware of the backend API domains, they can exploit SSRF to abuse the application by targeting those backend APIs. Since the application is already authenticated with the domain of the backend API, it provides an avenue for the attacker to manipulate the requests.
  • Furthermore, an attacker can utilize a vulnerable application as a proxy to launch attacks on third-party servers. By leveraging SSRF, they can make requests to external servers through the compromised application, potentially bypassing security measures in place.
  • SSRF can be combined with other vulnerabilities such as XSS (Cross-Site Scripting), XXE (XML External Entity), Open redirect, and Request Smuggling to amplify the impact and severity of the overall vulnerability. This combination of vulnerabilities can lead to more advanced attacks and potentially result in unauthorized access, data leakage, or server-side compromise.

In the next section of this blog, we will delve into various strategies and techniques for preventing and mitigating SSRF attacks in different application scenarios.

Article by Amish Shah