Performing Secure Code Review of R Code – A Beginner’s Guide

Background
R is a programming language primarily used for statistical computing and graphical presentation to analyze and visualize data. R is an interpreted language, meaning R programs are not pre-compiled but executed by the R interpreter at runtime. An R file is a script written in the R programming language and saved with the .R file extension.

R language - Ecosystem
Comprehensive R Archive Network (CRAN)
CRAN is R's central software repository, supported by the R Foundation. It is an archive of the latest and previous versions of the R distribution, documentation, and contributed R packages.

Posit – Shiny
Shiny is an R package used to build interactive web applications that execute R language code on the back-end. It enables users to host standalone applications on a webpage, embed interactive charts in R Markdown documents, and build dashboards.

OSS Index

OSS Index is a free catalogue of open source components and scanning tools developed by Sonatype. It helps developers identify open source dependencies, known publicly disclosed vulnerabilities, understand risk, and keep their software secure. The vulnerability data is derived from public sources and does not include human-curated intelligence or expert remediation guidance.

Shiny Applications
A typical web application built with the Shiny package is defined by a `shinyApp(ui, server)` object. The `ui.R` and `server.R` files contain the client-side and server-side code, respectively, and together make up most of the application's logic.
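As a rough illustration of this structure, a minimal single-file app might look like the sketch below. It assumes the `shiny` package is installed; the input and output IDs (`name`, `greeting`) are hypothetical and chosen only for this example.

```r
library(shiny)

# Client side: one text input and one text output (hypothetical IDs).
ui <- fluidPage(
  textInput("name", "Your name"),
  textOutput("greeting")
)

# Server side: user-supplied values arrive through `input`.
# During a review, every use of input$... is a point to trace.
server <- function(input, output, session) {
  output$greeting <- renderText({
    paste("Hello,", input$name)
  })
}

# Combine both halves into an app object.
app <- shinyApp(ui, server)

# To serve the app locally, uncomment:
# runApp(app)
```

In a two-file layout, the `ui` definition lives in `ui.R` and the `server` function in `server.R`, so both files need to be read together to follow the data flow.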

R Markdown Files
Another important file type in R applications is the `.Rmd` file, which is an R Markdown file. This format is designed to produce documents that combine code and text. Because the embedded code chunks are executed when the document is rendered, they deserve the same scrutiny as `.R` scripts. These files are typically used to generate reports of results in formats such as DOCX, HTML, and PDF.
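To review the code inside an R Markdown file, it can help to pull the R chunks out of the surrounding prose. The following is a minimal base-R sketch, assuming chunks open with a fence followed by `{r ...}` and close with a bare fence; the helper name `extract_r_chunks` and the sample document are hypothetical.

```r
# Build the fence marker programmatically (three backticks).
fence <- strrep("`", 3)

# Return only the lines that sit inside R code chunks.
extract_r_chunks <- function(lines) {
  open_pat <- paste0("^", fence, "\\{r[ ,}]")
  close_pat <- paste0("^", fence, "\\s*$")
  in_chunk <- FALSE
  code <- character(0)
  for (line in lines) {
    if (!in_chunk && grepl(open_pat, line)) {
      in_chunk <- TRUE
    } else if (in_chunk && grepl(close_pat, line)) {
      in_chunk <- FALSE
    } else if (in_chunk) {
      code <- c(code, line)
    }
  }
  code
}

# Hypothetical document content; in practice use readLines("report.Rmd").
rmd_lines <- c(
  "# Report",
  paste0(fence, "{r setup}"),
  "x <- 1",
  fence,
  "Some narrative text.",
  paste0(fence, "{r}"),
  "system('id')",
  fence
)

chunk_code <- extract_r_chunks(rmd_lines)
print(chunk_code)
```

The extracted lines can then be reviewed like any other R source, for example by searching them for `system(` or file-access calls.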

Secure code review guidelines 

Get background information

  • Obtain a brief overview of the application's purpose
  • Understand the feature list
  • Identify the targeted audience and deployment environment


Assess Potential Vulnerabilities
R is a statistical programming language primarily used for domain-specific application development. Based on the application details, try to gauge what security vulnerabilities the code may be susceptible to.

Understand Code Structure
Examine how the code handles user input, imports and exports data files, manages configurations, processes data, makes external/third-party calls, utilizes libraries/packages, and generates the user interface.

Check for Vulnerable Packages

  • Packages are typically listed at the beginning of .R files
  • Package versions can be found in the package_list.csv file
  • If the package list file is not available, installed packages and versions can be listed from the R console, for example with `installed.packages()`
  • Vulnerabilities for R language-based packages can be found on the OSS Index
  • The `oysteR` package is an R interface to the OSS Index that allows users to scan their installed R packages
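The checks above can be sketched in base R as follows. The `script_lines` vector is a hypothetical stand-in for the contents of a reviewed `.R` file, and the commented `oysteR` call reflects that package's documentation as far as we are aware, so verify it against the installed version.

```r
# List every installed package with its version (base R only).
ip <- installed.packages()
pkgs <- data.frame(
  Package = unname(ip[, "Package"]),
  Version = unname(ip[, "Version"]),
  stringsAsFactors = FALSE
)

# Find packages attached via library()/require() in a script.
# `script_lines` is hypothetical; in practice use readLines("app.R").
script_lines <- c("library(shiny)", "require(httr)", "x <- 1")
hits <- grep("^\\s*(library|require)\\(", script_lines, value = TRUE)
used <- sub("^\\s*(library|require)\\(\\s*[\"']?([A-Za-z0-9.]+).*$", "\\2", hits)
print(used)

# Versions of the packages the script actually uses (if installed).
print(pkgs[pkgs$Package %in% used, ])

# Optional: query the OSS Index through oysteR (requires network access).
# audit <- oysteR::audit_installed_r_pkgs()
```

Note that packages referenced with the `pkg::function` syntax do not appear in `library()` calls, so a search for `::` is a useful complement.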


HTTP Requests
Look for HTTP requests in the code files. The `httr` package is commonly used to make HTTP requests from R code. Where requests are made, check the sensitivity of the data being sent and whether it travels over HTTPS.
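A quick way to locate candidate lines is a pattern search over the source. The sketch below is one possible approach in base R; the helper name `find_http_calls`, the pattern, and the sample code are all hypothetical and would need tuning for a real codebase.

```r
# Flag lines that look like HTTP activity: explicit httr:: calls,
# common HTTP verb functions, or literal URLs.
find_http_calls <- function(lines) {
  pattern <- "(httr::|\\b(GET|POST|PUT|PATCH|DELETE)\\(|https?://)"
  hits <- grepl(pattern, lines, perl = TRUE)
  data.frame(
    line = which(hits),
    code = lines[hits],
    stringsAsFactors = FALSE
  )
}

# Hypothetical source lines; in practice use readLines() on each file.
sample_code <- c(
  "library(httr)",
  "res <- POST(\"http://example.com/api\", body = list(ssn = ssn))",
  "total <- sum(x)"
)

http_hits <- find_http_calls(sample_code)
print(http_hits)
```

Here the flagged line sends a field named `ssn` over plain `http://`, which is the kind of finding this step is meant to surface.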

User Input Sanitization
  • Trace the input data received from the user interface
  • Ensure that input data is validated, escaped, and sanitized using blacklisting or whitelisting approaches
  • For file imports, check that the file data is properly sanitized before it is consumed by the program logic
  • Various functions can be used to search for specific characters and/or patterns in R, such as `gsub` and `grepl` from base R, and `str_replace` and `str_replace_all` from the `stringr` package
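As a small example of the whitelisting approach, the sketch below uses `grepl` to accept only values that match an expected pattern and `gsub` to strip unexpected characters. The function names and the specific rules (3 to 20 characters, letters, digits, underscore) are assumptions made for illustration.

```r
# Whitelist validation: accept a single string made only of
# letters, digits, and underscores, 3 to 20 characters long.
is_valid_username <- function(x) {
  is.character(x) && length(x) == 1 && !is.na(x) &&
    grepl("^[A-Za-z0-9_]{3,20}$", x)
}

# Sanitization: drop every character that is not alphanumeric or a space.
strip_non_alnum <- function(x) {
  gsub("[^A-Za-z0-9 ]", "", x)
}

print(is_valid_username("alice_01"))
print(is_valid_username("alice; rm -rf /"))
print(strip_non_alnum("Q3 <script>report</script>"))
```

Rejecting invalid input outright, as `is_valid_username` does, is generally easier to reason about than trying to repair it.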

Hardcoded Secrets
Check for hardcoded secrets like passwords, keys, or tokens in the code.
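A common alternative to hardcoding is to read secrets from environment variables at runtime. The following is a minimal sketch; the helper `get_secret` and the variable name `DEMO_API_TOKEN` are hypothetical.

```r
# Read a secret from the environment and fail loudly if it is missing,
# instead of embedding the value in source code.
get_secret <- function(name) {
  value <- Sys.getenv(name, unset = NA)
  if (is.na(value) || !nzchar(value)) {
    stop("Required secret '", name, "' is not set", call. = FALSE)
  }
  value
}

# Demo only: in a real deployment the variable is set outside the code,
# for example in the hosting environment's configuration.
Sys.setenv(DEMO_API_TOKEN = "example-value")
token <- get_secret("DEMO_API_TOKEN")
```

During review, a line such as `api_key <- "sk_live_..."` is a finding, whereas `api_key <- get_secret("API_KEY")` is the pattern to look for.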

Built-in Security Features
The R ecosystem offers several packages, available from CRAN, that can help protect your software. Here are a few key ones:

  • Secure Password Storage: The `bcrypt` package provides secure password hashing
  • Secure Communication: The `openssl` package provides encryption, hashing, and signature functions that help protect data confidentiality and integrity
  • Data Anonymization: The `sdcMicro` package provides methods for anonymizing data, a key aspect of privacy preservation
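For instance, password hashing with `bcrypt` can be sketched as below, assuming the package is installed. The password string is a placeholder for illustration only.

```r
library(bcrypt)

# Hash a password; a random salt is generated and stored inside the hash.
hash <- hashpw("correct horse battery staple")

# Verify a login attempt against the stored hash.
ok <- checkpw("correct horse battery staple", hash)
bad <- checkpw("wrong guess", hash)

print(ok)
print(bad)
```

When reviewing authentication code, look for stored hashes being compared with a verification function like this rather than plaintext passwords being compared directly.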

Common security vulnerabilities in R code
The most common security vulnerabilities in R code primarily revolve around the handling of data serialization and deserialization, particularly with RDS (R Data Serialization) files. Here are some of the common vulnerabilities observed:

Arbitrary Code Execution

Recent findings have highlighted a critical vulnerability (CVE-2024-27322, addressed in R 4.4.0) that allows arbitrary code execution through the deserialization of untrusted RDS files. The vulnerability exploits R's lazy evaluation and promise objects, enabling attackers to craft malicious RDS files that execute arbitrary code when loaded. This poses a significant risk, especially in environments where R packages and data files are shared among developers and data scientists.
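One defensive pattern is to restrict where RDS files may be loaded from and to warn on affected R versions. The sketch below is an assumption-laden illustration: the helper `read_trusted_rds` and the directory name `trusted_data` are hypothetical, and a path check limits the source of a file without proving its contents are safe.

```r
# Load an RDS file only if it resides inside a trusted directory.
read_trusted_rds <- function(path, trusted_dir) {
  path <- normalizePath(path, winslash = "/", mustWork = TRUE)
  trusted_dir <- normalizePath(trusted_dir, winslash = "/", mustWork = TRUE)
  prefix <- paste0(sub("/+$", "", trusted_dir), "/")
  if (!startsWith(path, prefix)) {
    stop("Refusing to load RDS file outside the trusted directory", call. = FALSE)
  }
  if (getRversion() < "4.4.0") {
    warning("R older than 4.4.0: RDS deserialization may be exploitable")
  }
  readRDS(path)
}

# Demo with a temporary directory standing in for a trusted location.
trusted <- file.path(tempdir(), "trusted_data")
dir.create(trusted, showWarnings = FALSE)
saveRDS(mtcars, file.path(trusted, "cars.rds"))

cars <- read_trusted_rds(file.path(trusted, "cars.rds"), trusted)
```

In a review, any `readRDS()` or `load()` call whose path comes from user input, an upload, or a download is worth flagging.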

Supply Chain Attacks
The nature of R's package management system, which relies on shared repositories such as CRAN, makes it susceptible to supply chain attacks. If an attacker manages to replace a legitimate R package with a malicious version containing harmful code, that code is executed when users load the compromised package, potentially leading to system compromise.

Input Validation Issues
Like many programming languages, R is vulnerable to injection attacks if user inputs are not properly validated. This includes risks associated with SQL injection and command injection, especially when R is used to interact with databases or execute system commands.
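To show the difference parameter binding makes against SQL injection, here is a minimal sketch using `DBI` with an in-memory SQLite database. It assumes the `DBI` and `RSQLite` packages are installed; the `users` table and its contents are invented for the example.

```r
library(DBI)

# In-memory database with a hypothetical users table.
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "users", data.frame(
  name = c("alice", "bob"),
  role = c("admin", "viewer")
))

# A classic injection payload supplied as "user input".
user_input <- "alice' OR '1'='1"

# Unsafe pattern to look for in reviews (do not use):
# query <- paste0("SELECT * FROM users WHERE name = '", user_input, "'")

# Safer: the value is bound as a parameter, never parsed as SQL.
injected <- dbGetQuery(con, "SELECT * FROM users WHERE name = ?",
                       params = list(user_input))
legit <- dbGetQuery(con, "SELECT * FROM users WHERE name = ?",
                    params = list("alice"))

print(nrow(injected))
print(nrow(legit))

dbDisconnect(con)
```

The same principle applies to command execution: look for `system()` calls built with `paste()` from user input, and prefer `system2()` with arguments passed through `shQuote()`.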

Lack of Access Controls
R applications often lack robust access control mechanisms, which can lead to unauthorized access to sensitive data or functionalities. Ensuring that proper authentication and authorization checks are in place is essential to mitigate this vulnerability.

Insecure Data Handling

Improper handling of sensitive data, such as failing to encrypt data at rest or in transit, can expose applications to data breaches. It is crucial to implement strong encryption practices when dealing with sensitive information in R.
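As one illustration of encrypting data at rest, the sketch below uses AES-GCM from the `openssl` package, assuming it is installed. Key management is out of scope here: the key is generated in place purely for the demo, and the plaintext is a made-up value.

```r
library(openssl)

# 32 random bytes give an AES-256 key. In practice the key would come
# from a secrets manager or environment, never from source code.
key <- rand_bytes(32)

plaintext <- charToRaw("patient_id=12345")

# Encrypt; a random IV is generated and attached to the result.
ciphertext <- aes_gcm_encrypt(plaintext, key = key)

# Decrypt using the IV stored with the ciphertext.
recovered <- aes_gcm_decrypt(ciphertext, key = key)

print(rawToChar(recovered))
```

For data in transit, the corresponding review check is that HTTP calls and database connections use TLS rather than plaintext protocols.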

Article by Maunik Shah & Krishna Choksi