SSRF - Detection, Exploitation, and Mitigation Techniques [Part 3]

In the previous part 1 and part 2, we explored different techniques for detecting and exploitation of Server-Side Request Forgery (SSRF) based on the application's scenarios. Server-Side Request Forgery (SSRF) vulnerabilities pose a formidable threat to web applications, enabling malicious actors to exploit internal network assets, potentially leading to data breaches or unauthorized access to both the application and its underlying infrastructures. Within this discourse, we shall embark upon a profound exploration of potent SSRF mitigation techniques, designed to bolster the security of your applications across diverse scenarios – securing internal pages, and network URLs, handling external URL requests, and ensuring proper protocol/schema use. Additionally, we will offer specific advice for safeguarding cloud-hosted applications on platforms like AWS, Azure, and Google.

Constructing Resilient Applications: Scenarios and Strategies

In the realm of application development, consider these situations:

Securing Internal Pages Access Architecture:

  • Enlist a secure URL parsing library to meticulously extract schema, hostname, URL path, and query string.
  • Employ judicious input validation even though data are encoded or in different formats, as each parameter undergoes rigorous scrutiny to eliminate potential vulnerabilities.
  • In the case of dynamic generation of URL, only append the path associated with the specific page into the designated host configuration, while excluding extraneous parameters.
  • Unleash the power of map path functions to transform relative URLs into absolute URLs.
  • Validate input for URL formats, scrutinizing binary characters relevant to localhost URLs, employing techniques such as CIDR, dots, decimal/octal/hexadecimal, and domain parser confusion attacks.

Securing Internal URL Access Architecture:

  • Keep an updated list of internal IPs and domains, using it for whitelist/blacklist approaches to control access.
  • Enlist a secure URL parsing library to meticulously extract schema, hostname, URL path, and query string along with strict input validation on hostname and IPs.
  • Employ a local DNS resolver for converting DNS requests.
  • Validate destination IP against whitelist/blacklist IPs before granting access.
  • Handle encodings and null characters securely.
  • Avoid taking the internal application's IP address as user input; manage it from the backend.
  • Resolve DNS to A and AAAA IPs, validating them.
  • Make a single DNS call and use the IP for subsequent calls, preventing DNS rebinding attacks (time to check vs time to use).

Securing External URL Access: Few applications fetch or send data to an external domain through webhooks or external calls for image rendering, metadata processing, etc functionalities.

  • Permit access only to public IP ranges and limit access from private IP ranges.
  • Ensure that the backend web client does not reveal any sensitive information in the HTTP request’s headers such as user-agent, cookies, etc.
  • Do not throw any errors when IPs or URLs cannot be reached.
  • Follow guidelines for internal URL access architecture for validating input against IPs and domains.
  • Whitelisting of external URLs or domains would be a better approach.

Securing Protocol/Schema Usage in the Application:

  • Allow only HTTPS access.
  • Explicitly disable undesired URL schemas like gopher, sftp, file, dict, ftp, etc.
  • Ensure your URL parsing tool disables these schemas as well.

Securing Metadata of the Cloud hosted application:

  • AWS - Disable IMDSv1. Only enable IMDSv2 if required.
  • IMDSv2 - Token is being used as an authorization header. The value of the token is collected from the PUT request and the same token value is passed in the next request used as an authorization header to get the metadata details.
    • TOKEN=`curl -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600"`
    • curl http://169.254.169.254/latest/meta-data/profile -H "X-aws-ec2-metadata-token: $TOKEN"
  • Azure – By default “Metadata"="true" header is passed as a request header in the HTTP request to get the metadata details. (ref: - https://learn.microsoft.com/en-us/azure/virtual-machines/instance-metadata-service?tabs=windows)
  •  Google – By default “Metadata-Flavor: Google” header is passed as a request header in the metadata HTTP request to get the metadata details. (ref: https://cloud.google.com/compute/docs/metadata/querying-metadata)

Strengthening Defences: Extra Layers of Security

Along with the above techniques, network layer and host-based firewalls can be utilized as an additional layer of protection that complements application layer protection. Also, enable authentication/access control for internal applications to thwart unauthorized access.
By implementing these Server-Side Request Forgery (SSRF) mitigation techniques, you can significantly reduce the risk of SSRF attacks and protect your applications from potential vulnerabilities. It is crucial to adopt a layered approach to security, combining proper input validation, access controls, and secure configurations to ensure the integrity and confidentiality of your application’s resources.

Article by Amish Shah

Unveiling the Vulnerabilities: Hacking Cloud Native Apps and Discovering the Silver Lining

 

The security of cloud storage services, such as S3/GCP Buckets, the implementation of Cognito/SSO OAuth, and the protection of API endpoints have become paramount for web applications. As cloud integration becomes more accessible, comprehensive assessments of these components before deployment are crucial. Cloud infrastructure offers technological advancements, resource availability, and cost savings compared to traditional setups. However, cloud-native applications bring unique challenges. This talk addresses risks associated with user authentication using OAuth providers like Google, Slack, or AWS Cognito, where insecure implementations can lead to complete user account takeovers. It delves into security concerns regarding cloud storage services like AWS S3/GCP Buckets, which expose sensitive user data to unauthorized access. Additionally, vulnerabilities introduced by API-based micro-services employing Serverless lambda functions are explored. Attendees will gain insights into identifying, exploiting, and mitigating these cloud-native vulnerabilities through manual penetration testing and automated tools, as well as practical mitigation techniques.

Here for more detail on this talk.

Speakers

avatar for Amish Shah

Amish Shah

Co-CEO and Director, Blueinfy Solutions Pvt. Ltd
Amish Shah, Co-CEO and Director at Blueinfy, an esteemed technical expert in the field. With a wealth of experience spanning over 20 years, Amish brings a unique blend of skills in secure product development, application security assessment, and red team exploitation. As the technical... Read More →


The Fading Line Between Data and Instruction in LLM-Driven Applications

Overview

In the ever-advancing landscape of technology, the once distinct boundary between data and instruction is steadily dissolving, particularly in the realm of Language Model (LM) driven applications. These applications, powered by impressive language models like GPT-3.5, built upon the foundation of Large Language Model (LLM) architecture, possess the extraordinary capability to process and generate vast volumes of text. This remarkable feat allows them to undertake tasks that were previously exclusive to human intelligence, marking a significant paradigm shift.

Over time, language models have undergone remarkable transformations, and LLMs have propelled this evolution to unprecedented heights. Traditional programming paradigms relied on explicit instructions to manipulate data. However, LLMs have revolutionized this approach by harnessing the power of large datasets, extracting intricate patterns, and leveraging their knowledge to generate text. This transformative capability empowers LLMs to undertake a diverse range of tasks, including natural language understanding, text completion, and even creative writing.

Root cause:

The diminishing line between data and instruction in LLM-driven applications can be attributed, in large part, to the dynamic interpretation of input. LLMs possess the remarkable ability to comprehend and interpret the contextual intricacies of the provided data, enabling them to discern the underlying task or instruction at hand. This dynamic interpretation empowers LLMs to generate responses or outputs that align seamlessly with the intended objective, even when the instructions are not explicitly specified.

LLMs shine in their aptitude for contextual understanding, a trait previously reserved for human cognition. By meticulously analyzing the surrounding text, LLMs can discern the desired instructions or tasks implicitly embedded within the data. This context-driven approach equips them to generate outputs that are remarkably accurate and relevant, even when the inputs are incomplete or ambiguous.

Security implication:

While the blurring line between data and instruction in LLM-driven applications opens up new horizons and offers unparalleled convenience, it also raises crucial considerations regarding security. On one hand, this convergence facilitates more seamless and intuitive interactions with technology, as users can input data in a natural and flexible manner. On the other hand, concerns arise regarding the potential for misinterpretation or bias in the outputs generated by these models.

Prompt Leakage

The inadvertent exposure of initial prompts in language models, known as prompt leakage, has sparked concerns regarding the disclosure of sensitive information, biases, and limitations embedded within them. This phenomenon poses potential risks to privacy and security.

Language models, including the powerful LLM (Large Language Model), often come preconfigured with specific prompts to initiate the generation of responses and guide the model's behavior. These prompts may contain sensitive information, limitations, or inherent biases that should be treated with utmost confidentiality. However, prompt leakage occurs when a language model unintentionally reveals its initial prompt configurations, undermining the safeguarding of such sensitive content.

The following link provides a compilation of examples showcasing leaked system prompts: [Link: https://matt-rickard.com/a-list-of-leaked-system-prompts].

Prompt Injection

A Large Language Model (LLM) stands as a formidable and intricate form of language model, distinguished by its immense size and parameter count. LLMs, often based on architectures like Transformers, undergo training on vast datasets, equipping them with the ability to learn patterns, grammar, and semantics from extensive textual sources. These models boast billions of parameters, granting them the capacity to generate coherent and contextually relevant text across a wide array of tasks and applications.

Within the realm of machine learning, particularly in the context of LLMs, the role of a prompt is of paramount importance. A prompt serves as the initial input or instruction provided to the model, shaping its behavior and influencing the output it produces. It acts as a starting point or contextual framework for the model's responses or task execution. Prompts can assume various forms, ranging from a few words to a sentence, a paragraph, or even a dialogue, tailored to the specific requirements of the model and the task at hand. The primary purpose of a prompt is to furnish the necessary context, constraints, or explicit instructions that guide the model's decision-making process and shape the content it generates.

Prompt injection, conversely, entails purposefully embedding a targeted instruction, query, question, or context into the model's prompt to manipulate or influence its subsequent output. Through skillful construction of the prompt, users can steer the model's responses in their desired direction or elicit specific types of answers. Prompt injection grants users greater control over the generated text, enabling them to tailor the model's output to meet specific criteria or objectives.

The concept of prompt injection proves especially valuable when fine-tuning the model's responses to attain desired outcomes or generate content that aligns with specific requirements. It empowers users to guide the model's creative output and shape the conversation in accordance with their needs. Prompt injection finds applications in various domains, including generating creative writing, providing customized responses in chatbots, or facilitating specific tasks such as code generation or translation.

Nevertheless, it is crucial to acknowledge that prompt injection can introduce vulnerabilities and raise ethical concerns. There is a potential for malicious actors to manipulate the model to generate harmful or biased content. Hence, it is of utmost importance to implement safeguards, robust validation mechanisms, and regular model updates to mitigate potential risks.

Here is the link with examples and possbilities - https://simonwillison.net/2023/Apr/14/worst-that-can-happen/

SSRF - Detection, Exploitation, and Mitigation Techniques [Part 2]

In the previous section, we explored different techniques for detecting Server-Side Request Forgery (SSRF) based on the application's scenarios. Now, let's delve into the exploitation techniques associated with SSRF, which come into play once SSRF has been confirmed within the application. These techniques aim to assess the vulnerability's risk or impact. The SSRF exploitation process can be divided into two main parts.

Exploiting Application Internal infrastructure:

  • Companies utilize various architectural patterns for running applications, including reverse proxies, load balancers, cache servers, and different routing methods. It is crucial to determine if an application is running on the same host. URL bypass techniques can be employed to invoke well-known URLs and Protocols like localhost (127.0.0.1) and observe the resulting responses. Malicious payloads can sometimes trigger error messages or responses that inadvertently expose internal IP addresses, providing valuable insights into the internal network.
  • Another approach involves attempting connections to well-known ports on localhost or leaked IP addresses and analyzing the responses received on different ports.
  • Application-specific information, such as the operating system, application server version, load balancer or reverse proxy software/platform, and vulnerable server-side library versions, can aid in targeting specific payloads for exploitation. It is also worthwhile to check if the application permits access to default sensitive files located in predefined locations. For example, on Windows systems, accessing critical files like win.ini, sysprep.inf, sysprep.xml, and NTLM hashes can be highly valuable. A comprehensive list of Windows files is available at https://github.com/soffensive/windowsblindread/blob/master/windows-files.txt. On Linux, an attacker may exfiltrate file:////etc/passwd hashes through SSRF.
  • If the application server runs on Node.js, a protocol redirection attack can be attempted by redirecting from an attacker's HTTPS server endpoint to HTTP. For instance, using a URL like https://attackerserver.com/redirect.aspx?target=http://localhost/test.
  • It is essential to identify all endpoints where the application responds with an 'access denied' (403) error. These URLs can then be used in SSRF to compare differences in responses.
  • By identifying the platform or components used in an application, it becomes possible to exploit platform-specific vulnerabilities through SSRF. For example, if the application relies on WordPress, its admin or configuration internal URLs can be targeted. Platform-specific details can be found at https://github.com/assetnote/blind-ssrf-chains, which assists in exploiting Blind/Time-based SSRF.
  • DNS Rebinding attack: This type of attack occurs when an attacker-controlled DNS server initially responds to a DNS query with a valid IP address with very low TTL value, but subsequently returns internal, local, or restricted IP addresses. The application may allow these restricted IP addresses in later requests while restricting them in the first request. DNS Rebinding attacks can be valuable when the application imposes domain/IP-level restrictions.
  • Cloud metadata exploitation: Cloud metadata URLs operate on specific IP addresses and control the configuration of cloud infrastructures. These endpoints are typically accessible only from the local environment. If an application is hosted on a cloud infrastructure and is susceptible to SSRF, these endpoints can be exploited to gain access to the cloud machine.

Amazon (https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-data-retrieval.html)

  • http://169.254.169.254/
  • http://169.254.169.254/latest/meta-data/
  • http://169.254.169.254/latest/user-data
  • http://169.254.169.254/latest/user-data/iam/security-credentials/<<role>>
  • http://169.254.169.254/latest/meta-data/iam/security-credentials/<<role>>
  • http://169.254.169.254/latest/meta-data/ami-id
  • http://169.254.169.254/latest/meta-data/hostname
  • http://169.254.169.254/latest/meta-data/public-keys/0/openssh-key
  • http://169.254.169.254/latest/meta-data/public-keys/<<id>>/openssh-key

Google (https://cloud.google.com/compute/docs/metadata/querying-metadata)

  • http://169.254.169.254/computeMetadata/v1/
  • http://metadata.google.internal/computeMetadata/v1/
  • http://metadata/computeMetadata/v1/
  • http://metadata.google.internal/computeMetadata/v1/instance/hostname
  • http://metadata.google.internal/computeMetadata/v1/instance/id
  • http://metadata.google.internal/computeMetadata/v1/project/project-id

Azure (https://learn.microsoft.com/en-us/azure/virtual-machines/instance-metadata-service?tabs=windows)    

  • http://169.254.169.254/metadata/v1/maintenance 

 

Exploiting external network

  • If an application makes backend API calls and an attacker is aware of the backend API domains, they can exploit SSRF to abuse the application by targeting those backend APIs. Since the application is already authenticated with the domain of the backend API, it provides an avenue for the attacker to manipulate the requests.
  • Furthermore, an attacker can utilize a vulnerable application as a proxy to launch attacks on third-party servers. By leveraging SSRF, they can make requests to external servers through the compromised application, potentially bypassing security measures in place.
  • SSRF can be combined with other vulnerabilities such as XSS (Cross-Site Scripting), XXE (XML External Entity), Open redirect, and Request Smuggling to amplify the impact and severity of the overall vulnerability. This combination of vulnerabilities can lead to more advanced attacks and potentially result in unauthorized access, data leakage, or server-side compromise.

In the next section of this blog, we will delve into various strategies and techniques for preventing and mitigating SSRF attacks in different application scenarios.

Article by Amish Shah

SSRF - Detection, Exploitation, and Mitigation Techniques [Part 1]

SSRF (Server Side Request Forgery) is a security vulnerability that allows an attacker to make unauthorized HTTP requests from the backend of a vulnerable web application by manipulating the URL/domain/path parameter of the request. The injected URL can come from either an internal network or a third-party network, and the attacker's goal is usually to gain unauthorized access to internal applications or leak sensitive data.

SSRF attacks can have serious consequences, such as unauthorized actions on third-party applications and remote command execution on vulnerable internal applications. Additionally, attackers can use SSRF to bypass network security measures such as firewalls and gain access to sensitive resources.

Detection techniques for pen-testing with different types of application scenarios

One of the most commonly used methods to detect SSRF vulnerabilities is to set up a dedicated server that can receive both DNS and HTTP requests. The idea is to identify requests made by the user-agent or originating from the IP address of the vulnerable application server. If the server receives a request from the application, it indicates that there might be an SSRF vulnerability present. This method can help in identifying SSRF attacks in real-time and is used extensively by security professionals and researchers. 

Another method of detecting SSRF attacks is based on response timing. In such cases, the attacker learns whether or not a specific resource exists based on the time it takes to receive a response. If the response time is significantly different from what is expected, it may indicate that the attacker is trying to access a resource that does not exist or is not accessible.

URL/domain/path as a part query string or request body - One common scenario where SSRF can occur is when an application takes any URL, domain name, or file path as an input as part of the query string or request body, and the values of these parameters are used in backend processing. SSRF  can happen when an attacker is able to control the input parameters and can inject malicious URL/domain/path. For instance, an attacker could use an image URL or a link URL as input in template generation, or use a file/directory path or an image URL in system/device configuration. In such cases, the attacker could trick the application into sending requests to internal resources or third-party services without the application's knowledge. The most common consequence of such attacks is unauthorized access to sensitive data or resources.

The Referrer header - This header can be manipulated by an attacker to exploit an SSRF vulnerability. If the application uses the referrer header for business logic or analytics purposes, the attacker can modify it to point to a target server they control. The vulnerable application will then make requests to the internal network, allowing them to potentially gain access to internal resources. This can also lead to data exfiltration or unauthorized actions on third-party applications.

PDF Rendering/Preview Functionality - If the application provides the ability to generate PDF files or preview their content based on user input data, there may be a risk of SSRF. This is because the application's code or libraries could render the user-supplied JavaScript content on the backend, potentially leading to SSRF vulnerabilities. Attackers could exploit this vulnerability by injecting a malicious URL or IP address in the PDF file or the preview content, resulting in unauthorized access to internal systems or sensitive data. Therefore, it's important for developers to thoroughly sanitize user input data and restrict access to internal resources to prevent SSRF attacks.

File uploads – If an application includes a file upload feature and the uploaded file is parsed or processed in any way, it may be vulnerable to SSRF attacks. This is because URLs or file paths embedded in uploaded files such as SVG, XML, or PDF files may be used to make unauthorized requests to external resources. Attackers can leverage this vulnerability to perform actions such as gaining unauthorized access to internal applications, leaking sensitive data, or executing commands on third-party applications through vulnerable application’s origin.

Bypassing Whitelisted Domain/URL/Path – An attacker can use various encoding mechanisms and supply malformed URL formats with binary characters for the localhost URL, including techniques like CIDR bypass, dot bypass, decimal/octal/hexadecimal bypass, and domain parser confusion, to evade an application's whitelisted URL/domain/file path configuration. This can allow the attacker to inject a malicious URL or domain name, potentially leading to an SSRF vulnerability.

Checking with different protocols/IP/Methods - An attacker may attempt to exploit an SSRF vulnerability by sending requests with different protocols (e.g. file, dict, sftp, gopher, LDAP, etc.), IP addresses, and HTTP methods (e.g. PUT, DELETE, etc.) to see if the application is vulnerable. For instance, an attacker may try to access internal resources using the file protocol, which can allow them to read files on the server or execute arbitrary code. Similarly, an attacker may try to access resources using less common protocols like dict or gopher, which are not typically used and may not be blocked by firewalls.

The upcoming section of the blog will delve deeper into the topic of SSRF exploitation in the context of cloud-based applications. We will also explore platform-oriented attacks on internal apps and examine various migration strategies to prevent SSRF attacks.

Article by Amish Shah