Tuesday, June 10, 2014

Attack Surface analysis and Entry Point Identification

Every web application exposes an attack surface, which is defined by the number of attack points it contains. That exposure cuts across the application's components, such as the web server, application server, and database. As shown in figure 1, a browser accesses the application over the Internet. The browser bundles many different technologies and acts as the client, reaching the application server directly over HTTP. The application pages are integrated with backend infrastructure such as database and authentication servers. The application also does not work in isolation: it pulls in other information through APIs.
Figure 1 – Attack Surface
Here the attack surface contains the following elements:
1.) Application server pages and access
2.) Database and Authentication modules
3.) Browser access variables for client side attacks
4.) Stream or data coming from web services / APIs
Web application modules and pages access information coming from various sources, so let's look at the range of possible entry points. As shown in figure 2, entry points to the system can exist on both the server side and the client side.
Figure 2 – Entry points on attack surface

Entry points

Web or application pages consume HTTP requests and process the incoming values. All of these variables become entry points, or sources, to the system. Let's look at them in detail.
Querystring as an entry point
Each incoming HTTP request can pass values to web/application pages through the querystring. For example, consider a request for login.aspx?username=shah.
Here the name-value pair username=shah is passed as an HTTP parameter to the application page login.aspx. The actual HTTP request looks like the following.
GET /login.aspx?username=shah HTTP/1.1
Host: example.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv: Gecko/2008070208 Firefox/3.0.1
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
It is a GET request that reaches the application page along with several other HTTP headers. All of these headers and name-value pairs get processed by web pages.
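To make the querystring entry point concrete, here is a minimal Python sketch that lists every name-value pair carried by a request target. The function name querystring_entry_points is a hypothetical helper introduced for illustration; it is not part of the original application.

```python
from urllib.parse import urlsplit, parse_qs

def querystring_entry_points(request_target: str) -> dict[str, list[str]]:
    """Return every name/value pair carried in the querystring."""
    query = urlsplit(request_target).query
    # keep_blank_values=True so parameters sent without a value are still listed
    return parse_qs(query, keep_blank_values=True)

params = querystring_entry_points("/login.aspx?username=shah")
print(params)  # {'username': ['shah']}
```

Each key in the returned dictionary is a value the page may read, so each one is a candidate entry point to review.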
POST name/value pairs as an entry point
In the case above we passed a value with a GET request through the querystring. A similar value can also be passed in the body of a POST request, whose size is not limited by the HTTP protocol itself (servers usually enforce a configurable maximum). A querystring is more constrained: browsers and servers cap URL length, commonly at a few thousand characters. For example, here is a simple POST request.
 POST http://example.com/cgi-bin/search.cgi HTTP/1.1
 Host: example.com
 User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; rv:1.7.3) Gecko/20040913 Firefox/0.10
 Accept: text/xml, application/xml, application/xhtml+xml, text/html;q=0.9, text/plain;q=0.8, image/png, */*;q=0.5
 Keep-Alive: 300
 Referer: http://example.com/
 Content-Type: application/x-www-form-urlencoded
 Content-Length: 17

 search=searchtext
As you can see, a body with a Content-Length of 17 is passed, and its value would be search=searchtext. This is another potential entry point for web applications. Querystring and POST name/value pairs are the entry points attackers abuse most often.
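The relationship between Content-Length and the form body can be checked with a short Python sketch. The helper form_entry_points is hypothetical; it assumes an application/x-www-form-urlencoded body like the one described above.

```python
from urllib.parse import parse_qsl

def form_entry_points(body: bytes, declared_length: int) -> list[tuple[str, str]]:
    """Decode a form-encoded POST body into its name/value pairs."""
    # A mismatch between the declared and the actual size is worth rejecting early.
    if len(body) != declared_length:
        raise ValueError("Content-Length does not match the body size")
    return parse_qsl(body.decode("ascii"), keep_blank_values=True)

pairs = form_entry_points(b"search=searchtext", 17)
print(pairs)  # [('search', 'searchtext')]
```

The body search=searchtext is exactly 17 bytes long, which matches the Content-Length header in the request.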
HTTP Variables/Headers
We have seen both GET and POST requests in the cases above. Along with the name-value pairs, a request carries several other HTTP headers such as Referer, Content-Type, and Accept. The RFC defines the list of HTTP headers that a web server can process. Web pages access some of these values as input, so they become entry points to the system as well.
Here is the list of HTTP headers defined by the RFC: http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.22
It is possible to abuse these headers to launch attacks against the target application.
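As a rough illustration, the Python sketch below splits a raw request into its header fields so that each one can be reviewed as a possible entry point. The function header_entry_points and the shortened sample request are assumptions made for this example.

```python
RAW_REQUEST = (
    "GET /login.aspx?username=shah HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Accept-Language: en-us,en;q=0.5\r\n"
    "Referer: http://example.com/\r\n"
    "\r\n"
)

def header_entry_points(raw_request: str) -> dict[str, str]:
    """Map each header name (lowercased) to its value."""
    lines = raw_request.split("\r\n")
    headers = {}
    for line in lines[1:]:      # lines[0] is the request line
        if not line:            # a blank line ends the header block
            break
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers

headers = header_entry_points(RAW_REQUEST)
print(sorted(headers))  # ['accept-language', 'host', 'referer']
```

Any header the page actually reads, for example Referer or User-Agent, deserves the same scrutiny as a querystring parameter. The sketch keeps only the last value when a header repeats, which is a simplification.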
XML/SOAP/JSON streams (Web 2.0)
Traditional applications used the name-value pair approach with GET or POST requests, but Web 2.0 applications communicate through several new structures and protocols, all over HTTP. Each of these streams and protocols is a new entry point to the application, and one needs to identify them in the source code.
Here are some important entry points specific to Web 2.0, that is, entry points not based on name-value pairs.
XML-RPC – RPC (Remote Procedure Call) is a very old concept for invoking procedures remotely. These calls targeted the underlying operating system and could be invoked from the same machine or over the network. Over time, most ports were closed to protect the operating system, and often the only ports left open were 80 and 443, which carry HTTP and HTTPS traffic. This created demand for RPC running over HTTP, which XML-RPC addressed. The XML-RPC specification is found at http://www.xmlrpc.com/spec.
POST /trade-rpc/getquote.rem HTTP/1.0
TE: deflate,gzip;q=0.3
Connection: TE, close
Host: xmlrpc.example.com
Content-Type: text/xml
Content-Length: 161

<?xml version="1.0"?>
In the above case, the “MSFT” placeholder can be an entry point to the application.
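A small Python sketch can show where a value such as "MSFT" travels inside an XML-RPC call. The method name stocks.getquote is hypothetical and chosen only for illustration; the sketch uses the standard library's xmlrpc.client to build and then decode a request body.

```python
import xmlrpc.client

# Build a request body carrying one string parameter.
# "stocks.getquote" is a made-up method name for this example.
payload = xmlrpc.client.dumps(("MSFT",), methodname="stocks.getquote")

# Decode it again, the way a server-side handler would see it.
params, method = xmlrpc.client.loads(payload)
print(method, params)  # stocks.getquote ('MSFT',)
```

Every element of params is attacker-controlled input on the server side. Note that the standard library parser is not hardened against maliciously constructed XML, so this is a sketch for analysis rather than production handling.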
SOAP – A SOAP (Simple Object Access Protocol) message is known as a SOAP envelope, and it is exchanged between server and client. SOAP has a specific skeleton for information exchange.
<?xml version="1.0" encoding="utf-8"?>
   <soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" 
         <getQuotes xmlns="http://tempuri.org/">
Here as well, one can inject values into various parts of the SOAP message.
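To see which parts of a SOAP message carry values, the following Python sketch walks an envelope and lists every element that holds text. The envelope body, including the compid element and its value, is an assumed example and not the original message.

```python
import xml.etree.ElementTree as ET

# Hypothetical envelope: the <compid> element is an assumption for illustration.
SOAP_MESSAGE = b"""<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <getQuotes xmlns="http://tempuri.org/">
      <compid>MSFT</compid>
    </getQuotes>
  </soap:Body>
</soap:Envelope>"""

def soap_values(message: bytes) -> list[tuple[str, str]]:
    """Return (local tag name, text) for every element that carries text."""
    found = []
    for element in ET.fromstring(message).iter():
        text = (element.text or "").strip()
        if text:
            local_name = element.tag.rsplit("}", 1)[-1]  # drop the {namespace} prefix
            found.append((local_name, text))
    return found

print(soap_values(SOAP_MESSAGE))  # [('compid', 'MSFT')]
```

Each pair in the result is a value that the receiving method will process, so each is a candidate injection point.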
REST – REST (Representational State Transfer) is a modern web application architecture style, and it can fall under SOA as well. REST services often use XML structures to communicate between client and server.
<?xml version="1.0"?>
<p:Laptops xmlns:p="http://laptops.example.com"
<Laptop id="0123" xl:href="http://www.parts-depot.com/laptops/0123"/>
<Laptop id="0348" xl:href="http://www.parts-depot.com/laptops/0348"/>
<Laptop id="0321" xl:href="http://www.parts-depot.com/laptops/0321"/>
In this case it is possible to inject values into XML nodes.
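The effect of injecting into an XML node can be sketched in Python. The hostile value and the laptop_element helper below are assumptions for illustration; the sketch compares naive string concatenation with the standard library's quoteattr escaping.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

def laptop_element(laptop_id: str) -> str:
    # quoteattr() escapes the value and adds the surrounding quotes itself
    return f"<Laptop id={quoteattr(laptop_id)}/>"

hostile_id = '0123" injected="1'            # attacker-supplied value (hypothetical)
naive = f'<Laptop id="{hostile_id}"/>'      # plain concatenation
safe = laptop_element(hostile_id)

print(ET.fromstring(naive).attrib)  # {'id': '0123', 'injected': '1'}
print(ET.fromstring(safe).attrib)   # {'id': '0123" injected="1'}
```

With concatenation the attacker gains a second attribute, while the escaped version keeps the whole input inside the id attribute.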
JSON – JSON (JavaScript Object Notation) is a very lightweight format for data exchange. It is gaining popularity with developers of Web 2.0 applications because many languages and scripting environments support it, which makes it easy to integrate.
{"bookmarks":[{"Link":"www.example.com","Desc":"Interesting link"}]}
In this case the JSON structure can be manipulated and arbitrary values can be injected. Every value in the JSON can act as a potential entry point to the application.
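One way to enumerate those values is to walk the parsed structure and print a path for every leaf. The Python sketch below does this for the bookmarks example; the function name json_entry_points and the path notation are choices made for this illustration.

```python
import json

def json_entry_points(node, path="$"):
    """Yield (path, value) for every leaf value in a parsed JSON document."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from json_entry_points(value, f"{path}.{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from json_entry_points(value, f"{path}[{index}]")
    else:
        yield path, node

doc = json.loads('{"bookmarks":[{"Link":"www.example.com","Desc":"Interesting link"}]}')
leaves = list(json_entry_points(doc))
for path, value in leaves:
    print(path, "=", value)
# $.bookmarks[0].Link = www.example.com
# $.bookmarks[0].Desc = Interesting link
```

Each printed path is a place where a manipulated value could reach application code.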
Along the same lines, we can have JS objects as well:
message = {
    from : "john@example.com",
    to : "jerry@example.com",
    subject : "I am fine",
    body : "Long message here",
    showsubject : function(){document.write(this.subject)}
};
Some platforms use customized structures for information transfer.
Some developers pass information in simple XML as well, and that becomes an entry point too. If the application uses a framework, there is a layer of abstraction: the framework converts the XML stream into the respective values at an intermediate level. It is worth capturing these calls as well.
File Uploads
Many applications allow files to be uploaded, which makes it possible to load a malicious file onto the target application. Such files can create several different issues for the application and can be exploited as well.
For example, upload forms are multipart, as shown below.
<form name="Form1" method="post" action="ContractUpload.aspx" id="Form1" enctype="multipart/form-data">
The form takes a file as input, as below.
<input name="uplTheFile" type="file" id="uplTheFile" />
Files can also be attached through more advanced protocols such as SOAP, or posted directly over WebDAV. All of these combinations open up new possible entry points to the application.
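As a sketch of how an uploaded file shows up on the server side, the Python example below parses a multipart/form-data body with the standard library's email parser and flags file names whose extension is not on an allowlist. The boundary string, the file name payload.aspx, the file content, and the allowlist are all hypothetical values chosen for illustration.

```python
import os
from email import message_from_bytes

ALLOWED_EXTENSIONS = {".pdf", ".doc"}  # hypothetical upload policy

def uploaded_files(content_type: str, body: bytes) -> list[tuple[str, str, bool]]:
    """Return (field name, file name, extension allowed?) for each uploaded file."""
    # The email parser needs the Content-Type header to learn the boundary.
    raw = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
    message = message_from_bytes(raw)
    results = []
    if not message.is_multipart():
        return results
    for part in message.get_payload():
        filename = part.get_filename()
        if filename is None:          # ordinary form field, not a file
            continue
        field = part.get_param("name", header="content-disposition")
        extension = os.path.splitext(filename)[1].lower()
        results.append((field, filename, extension in ALLOWED_EXTENSIONS))
    return results

BODY = (
    b"--example-boundary\r\n"
    b'Content-Disposition: form-data; name="uplTheFile"; filename="payload.aspx"\r\n'
    b"Content-Type: text/plain\r\n"
    b"\r\n"
    b"file content here\r\n"
    b"--example-boundary--\r\n"
)
found = uploaded_files("multipart/form-data; boundary=example-boundary", BODY)
print(found)  # [('uplTheFile', 'payload.aspx', False)]
```

An extension check alone is not a complete defense; it only illustrates that both the file name and the file content are attacker-controlled entry points.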
Feeds, third party information and APIs
As shown in our initial figure (1), modern applications consume information from various sources using APIs. For example, Google has published JSON RPCs for accessing the Google database over HTTP for search results. An application that wants to integrate can use these APIs and make a call. The returned information gets injected into server-side code, so it can be considered an entry point as well. RSS/ATOM feeds are another set of information sources, or entry points, to the application.
<rss version="2.0">
  <channel>
    <title>Example News</title>
    <description>News feed</description>
    <pubDate>Tue, 10 Jun 2006 04:00:00 GMT</pubDate>
    <lastBuildDate>Tue, 10 Jun 2006 09:41:01</lastBuildDate>
    <generator>Weblog Editor 2.0</generator>
    <item>
      <title>Today's title</title>
      <description>News goes here</description>
      <pubDate>Tue, 03 Jun 2006 09:39:21 GMT</pubDate>
    </item>
  </channel>
</rss>
All of the values in the RSS nodes can be manipulated and injected as well.
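The Python sketch below illustrates why feed values need the same care as direct user input. It assumes a feed whose item title carries script markup (a made-up payload) and shows one possible handling: escaping the value with html.escape before it is placed in a page.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical feed: the item title carries an entity-encoded script tag.
FEED = b"""<rss version="2.0">
  <channel>
    <title>Example News</title>
    <item>
      <title>&lt;script&gt;alert(1)&lt;/script&gt;</title>
      <description>News goes here</description>
    </item>
  </channel>
</rss>"""

def render_items(feed_xml: bytes) -> list[str]:
    """Render each item title as an HTML list entry with the value escaped."""
    rows = []
    for item in ET.fromstring(feed_xml).iter("item"):
        # The XML parser decodes the entities, so title now holds live markup.
        title = item.findtext("title", default="")
        rows.append(f"<li>{html.escape(title)}</li>")
    return rows

print(render_items(FEED))
# ['<li>&lt;script&gt;alert(1)&lt;/script&gt;</li>']
```

Without the escaping step, the decoded title would be written into the page as an executable script element.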
We have discussed several different entry points in the section above. Some of them apply to clients as well. Part of the application runs in the browser, where incoming streams are consumed; if a malicious payload gets injected into the HTML, it can cause XSS or another breach. The browser can have the following entry points.
1.) HTTP response – All headers as well as HTML content
2.) JavaScript coming from the server
3.) Ajax/RIA calls consuming the structures discussed above, such as JSON, XML, and JS objects
4.) Callbacks – Modern applications use callback mechanisms, so data reaching the browser can be injected into the DOM by script functions.
5.) Browser making API calls across domains
All of these are entry points, and now we need to identify their position in our source code. Identifying every possible entry point in the system helps greatly in reducing the attack surface. If the complete source code is 35,000 lines and we can find the 30 lines that are entry points, we know where to focus our analysis.
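Locating entry points in source code can start with a simple pattern search. The Python sketch below assumes an ASP.NET code base and looks for a handful of common request accessors; the pattern list and the sample source are illustrative assumptions, and a real review would need a broader set of patterns.

```python
import re

# A few ASP.NET request accessors; this list is illustrative, not exhaustive.
ENTRY_POINT_PATTERN = re.compile(
    r"Request\.(QueryString|Form|Headers|Cookies|Files)\b"
)

def find_entry_points(source: str) -> list[tuple[int, str]]:
    """Return (line number, accessor) for every match in the source text."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        for match in ENTRY_POINT_PATTERN.finditer(line):
            hits.append((number, match.group(0)))
    return hits

SAMPLE = '''string user = Request.QueryString["username"];
string term = Request.Form["search"];
int total = 1 + 2;
string agent = Request.Headers["User-Agent"];'''

hits = find_entry_points(SAMPLE)
print(hits)
# [(1, 'Request.QueryString'), (2, 'Request.Form'), (4, 'Request.Headers')]
```

Running a search like this across the whole code base yields the short list of lines where analysis should begin.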