Archive for June 2011
A few days ago a security advisory related to web application firewalls (WAFs) was published on Full Disclosure. Wendel Guglielmetti Henrique found another bug in the IBM Web Application Firewall which can be used to circumvent the WAF and execute typical web application attacks like SQL injection (click here for details). Wendel already talked (look here) at the Troopers conference in 2009 about different techniques to identify and bypass WAFs, so this kind of bypass method is not exactly new.
Nevertheless, doing a lot of web application assessments and talking about countermeasures to protect web applications, there’s a TOP 1 question I have to answer almost every time: “Wouldn’t it be helpful to install a WAF in front of our web application to protect it from attacks?”. My typical answer is “NO”, because it’s better to spend the resources on addressing the problems in the code itself. So I will take this opportunity to write some rants about the sense and nonsense of WAFs ;-). Let’s start with some (from our humble position) widespread myths:
1. WAFs will protect a web application from all web attacks.
2. WAFs are transparent and can’t be detected.
3. After installation of a WAF our web application is secure, no further “To Dos”.
4. WAFs are smart, so they can be used with any web application, no matter how complex it is.
5. Vulnerabilities in web applications can’t be fixed in time, only a WAF can help to reduce the attack surface.
And now let us dig a little bit deeper into these myths ;-).
1. WAFs will protect a web application from all web attacks
Common WAFs use different attack detection models, such as signature based detection, behavior based detection or a whitelist approach. Attackers know these detection models too, so it’s not too hard to construct an attack that will pass the detection engines.
Just a simple example for signatures ;-): studying SQL injection attacks, we learn from all the examples that we can manipulate WHERE clauses with attacks like “or 1=1”. This is a typical signature for the detection engine of WAFs, but what about using “or 9=9” or, even smarter, “or 14<15”? This might sound ridiculous to most of you, but it already worked against at least one WAF, and there are many more leet attacks to circumvent WAFs (sorry that we don’t disclose any vendor names, but this post is about WAFs in general).
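To make the signature problem a bit more tangible, here is a minimal sketch of a deliberately naive signature check. The regular expression and the function name naive_waf_blocks are made up for illustration and are not taken from any real product. The sketch only shows why a rule keyed to the textbook tautology misses trivially rewritten ones.

```python
import re

# Hypothetical, deliberately naive signature: it only knows the textbook
# "or 1=1" tautology. No real WAF rule set is reproduced here.
NAIVE_SIGNATURE = re.compile(r"\bor\s+1\s*=\s*1\b", re.IGNORECASE)


def naive_waf_blocks(payload: str) -> bool:
    """Return True if the toy signature matches the payload."""
    return NAIVE_SIGNATURE.search(payload) is not None


payloads = ["' or 1=1 --", "' or 9=9 --", "' or 14<15 --"]
results = {payload: naive_waf_blocks(payload) for payload in payloads}

for payload, blocked in results.items():
    print(f"{payload!r}: {'blocked' if blocked else 'passed'}")
```

All three payloads evaluate to “true” in a WHERE clause, but only the first one matches the signature. Writing a rule that recognizes every possible tautology is a much harder problem than writing the tautology.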
Another point to mention is the range of attack types against web applications. It’s not all about SQL injection and Cross-Site Scripting; there are also logic flaws that can be attacked, or the typical privilege escalation problem “can user A access data of user B?”. A WAF can’t protect against these attacks. A WAF can raise the bar for attackers under some circumstances, but it can’t protect a web application from skilled attackers.
2. WAFs are transparent and can’t be detected
In 2009, initially at Troopers ;-), Wendel and Sandro Gauci published a tool called wafw00f and described their approach to fingerprint WAFs in different talks at security conferences. This already proves that this myth is not true. Furthermore there will be another tool release from ERNW soon, so stay tuned, it will be available for download shortly ;-).
3. After installation of a WAF our web application is secure, no further “To Dos”
WAFs require a lot of operational effort, simply because web applications offer more and more functionality and the main purpose of a web application is to support the organization’s business. WAF administrators have to ensure that the WAF doesn’t block any legitimate traffic. It’s almost the same as with Intrusion Detection and Prevention Systems: they require a lot of fine tuning to detect important attacks while preserving functionality. History proves that this didn’t (and still doesn’t) work for most IDS/IPS implementations, so why should it work for WAFs ;-)?
4. WAFs are smart, so they can be used with any web application, no matter how complex it is
Today’s web applications are often quite complex, they use DOM based communication, web services with encryption and very often they create a lot of dynamic content. A WAF can’t use a whitelist approach or the behavior based detection model with these complex web applications because the content changes dynamically. This reduces the options to the signature based detection model which is not as effective as many people believe (see myth No. 1).
5. Vulnerabilities in web applications can’t be fixed in time, only a WAF can help to reduce the attack surface
This is one of the most common sales arguments, because it contains a lot of reasonable points. What the sales guys don’t tell you is that a WAF won’t solve your problem either ;-).
Talking about risk analysis the ERNW way, we have three contributing factors: probability, vulnerability and impact. A WAF won’t have any influence on the impact, because if the vulnerability gets exploited the impact is still the same. Looking at probability from the risk analysis point of view, you must not take existing controls (like WAFs) into account, because we’re talking about the probability that someone tries to attack your web application, and I think it’s pretty clear that installing a WAF won’t change that ;-). So the vulnerability factor is the only one left that you can change by implementing controls.
But let me ask one question using the picture of the Fukushima incident: what is the better solution to protect nuclear plants from tsunamis? 1. Building a high wall around the plant to protect it from the water? 2. Building the plant at a place where tsunamis can’t occur?
I think the answer is obvious, and it’s the same with web application vulnerabilities: if you fix them, there’s no need for a WAF. If you start using a Security Development Lifecycle (SDL) you can reach this goal with reasonable effort ;-), so it’s not a matter of costs.
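As a small illustration of what “fixing it in the code” can look like for the SQL injection example from myth No. 1, here is a sketch using Python’s built-in sqlite3 module. The table, the data and the function names are invented for this example, and it is only one of several ways to fix such a flaw.

```python
import sqlite3

# Throwaway in-memory database with made-up demo data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)


def find_user_vulnerable(name):
    # String concatenation: attacker-controlled input becomes part of the SQL.
    query = "SELECT name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_fixed(name):
    # Bound parameter: the input is always treated as data, never as SQL.
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()


attack = "x' or 14<15 --"
vulnerable_rows = find_user_vulnerable(attack)  # returns every row
fixed_rows = find_user_fixed(attack)            # returns no row
print(len(vulnerable_rows), len(fixed_rows))
```

With the bound parameter the tautology is simply compared against the name column as a string, so it doesn’t matter which variant of “or 1=1” the attacker picks.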
Having clarified these myths about web application firewalls, I think the conclusions are clear. Spend your resources on fixing the vulnerabilities in your web applications instead of buying another appliance that needs operational effort, costs more money and only slightly reduces the vulnerability instead of eliminating it. We have quite a lot of experience supporting our customers with an SDL, and from this experience we can say that it works effectively and can be implemented more easily than many people think.
You are still not convinced ;-)? Shortly we will publish an ERNW Newsletter (our newsletter archive can be found here) describing techniques to detect and circumvent WAFs, and also a new tool called TSAKWAF (The Swiss Army Knife for Web Application Firewalls) which implements these techniques for practical use. Maybe this will change your mind ;-).
have a nice day,
2 Comments | Posted by Enno Rey
The Amazon Elastic Compute Cloud (short: EC2) provides a flexible environment for the on demand provisioning of virtual machines of different performance levels. For our lab setup, a so-called extra large instance was used. According to Amazon, the technical specs are the following:
15 GB memory
8 EC2 Compute Units (4 virtual cores with 2 EC2 Compute Units each)
1,690 GB instance storage
I/O Performance: High
API name: m1.xlarge
Since the I/O performance of single disks had turned out to be the bottleneck in the “local” setup, eight Elastic Block Store (short: EBS) volumes were created and attached to the instance. Each EBS volume is hosted within a specific availability zone and can be attached to instances running in the same zone. EBS volumes can be created and attached by issuing two commands of the Amazon EC2 command line tools, so the amount of storage can be scaled up very easily. The only requirement (for our tests) is the existence of a sufficient number of EBS volumes which then contain parts of the pcap file to be analyzed.
During the benchmarks, the performance was significantly lower than with the setup described in the previous post, even though eight different EBS volumes were used to avoid the bottleneck of a single storage volume. The overall performance of the test was seemingly limited by the I/O performance restrictions of virtualized instances and virtualized storage systems. Following the overall cloud computing paradigm, performance limitations of this kind might be circumvented by using multiple resources which do the processing in parallel. This could be done by using multiple instances or by using frameworks like Amazon MapReduce which are designed to process huge sets of data.
When applying this approach to the analysis of pcap files, however, the structure of the pcap format carries some inherent problems. The format consists of a binary representation of the data which is ordered by the capture time of the packets and not by logical packet streams. Therefore each instance would have to process the complete pcap file and extract all streams just to identify which streams are to be analyzed by that particular worker instance. This prevents an efficient distribution of the analysis across multiple jobs or input files. If the captured network data were stored in separate streams instead of one big pcap file, processing with a map/reduce algorithm would be possible and would thus potentially increase scalability significantly.
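To illustrate why the format is hard to split, here is a rough sketch of how the records of a classic pcap file are laid out. It assumes the little-endian, microsecond-resolution variant (magic number 0xa1b2c3d4) and ignores pcapng entirely. The helpers iter_records and build_demo_capture and the “flow-A/flow-B” payloads are made up for this example.

```python
import io
import struct

# Classic pcap layout (little-endian variant assumed): a 24-byte global header,
# then records that each consist of a 16-byte header plus the captured bytes.
GLOBAL_HEADER = struct.Struct("<IHHiIII")  # magic, major, minor, zone, sigfigs, snaplen, linktype
RECORD_HEADER = struct.Struct("<IIII")     # ts_sec, ts_usec, incl_len, orig_len
MAGIC = 0xA1B2C3D4


def iter_records(stream):
    """Yield (ts_sec, ts_usec, data) for each record, in file order."""
    header = stream.read(GLOBAL_HEADER.size)
    if len(header) < GLOBAL_HEADER.size or GLOBAL_HEADER.unpack(header)[0] != MAGIC:
        raise ValueError("not a little-endian classic pcap file")
    while True:
        raw = stream.read(RECORD_HEADER.size)
        if len(raw) < RECORD_HEADER.size:
            return
        ts_sec, ts_usec, incl_len, _orig_len = RECORD_HEADER.unpack(raw)
        yield ts_sec, ts_usec, stream.read(incl_len)


def build_demo_capture(packets):
    """Build a tiny in-memory capture from (ts_sec, payload) pairs."""
    out = io.BytesIO()
    out.write(GLOBAL_HEADER.pack(MAGIC, 2, 4, 0, 0, 65535, 1))
    for ts_sec, payload in packets:
        out.write(RECORD_HEADER.pack(ts_sec, 0, len(payload), len(payload)))
        out.write(payload)
    out.seek(0)
    return out


demo = build_demo_capture(
    [(1, b"flow-A part 1"), (2, b"flow-B part 1"), (3, b"flow-A part 2")]
)
records = list(iter_records(demo))
print(records)
```

The records of the two demo flows are interleaved by time and the file carries no index by stream. The position of a record is only known after reading the length field of the record before it, so a worker can’t simply jump into the middle of the file and pick “its” streams.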
That said, finally here are the results of our testing (test methodology described in earlier post):
So it took much longer to extract the data from a 500 GB file, which can be attributed to the increased latency of accessing centralized storage (from a SAN/over the network) compared to locally connected SSDs.
Hopefully this little series provided some insight for you, dear readers. We’ll publish the full technical report as an ERNW Newsletter in the coming months.
Have a good one, thanks
0 Comments | Posted by Enno Rey
In the first post I’ve laid out the tools and lab setup, so in this one I’m going to discuss some results.
Description of overall test methodology
To evaluate the performance of the different setups used to analyze capture data, both tcpdump and pcap_extractor (see last post) were used. For the tests, five large capture files of different sizes were created by merging various sample traffic dumps with mergecap. Each of these files consisted of several capture files containing a variety of protocols (including iSCSI and FCoE packets). The resulting files of ∼40, ∼80, ∼200, ∼500, and ∼800 GB were analyzed with both tools. For all tests the filter expressions for tcpdump and pcap_extractor were configured to search for a specific source IP and a specific destination IP matching iSCSI packets contained in the capture file. Additionally pcap_extractor was “instructed” to look for a search string (formatted like a credit card number).
To address the performance bottleneck (again, see last post), that is the I/O throughput, two different setups of the testing environment (see above) were implemented: the first one going with a RAID 0 approach using four SSDs, the second one with four individual SSDs, each of them processing only a fourth of the analyzed capture file. The standard UNIX time command was invoked to measure the execution time. Additionally the tools analyzing the data were started with the highest possible scheduling priority to ensure execution with the maximum of available resources. This is a sample command line invoking the test:
/usr/bin/time -hp /usr/bin/nice -n -19 ./pcap_test2 -i $i/in.pcap -o $i/out.pcap -f "ip src 192.168.1.207 and ip dst 192.168.1.208" -s "5486000000620012" > $i/out &
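The second setup (four individual drives, each holding a quarter of the capture) boils down to running the same search on several part files at the same time. The following sketch shows that idea with a plain byte search. It is not pcap_extractor, it doesn’t parse packets or apply the IP filter, and the names count_matches and search_parts as well as the demo part files are invented for this example.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def count_matches(path, needle, block_size=1 << 20):
    """Count occurrences of needle in one file, reading it block by block."""
    count = 0
    tail = b""
    with open(path, "rb") as handle:
        while True:
            block = handle.read(block_size)
            if not block:
                return count
            data = tail + block
            count += data.count(needle)
            # Keep the last len(needle) - 1 bytes so that a match spanning
            # two blocks is still found, without counting any match twice.
            tail = data[-(len(needle) - 1):] if len(needle) > 1 else b""


def search_parts(paths, needle):
    """Search all part files in parallel, one worker thread per file."""
    with ThreadPoolExecutor(max_workers=len(paths)) as pool:
        return list(pool.map(lambda path: count_matches(path, needle), paths))


needle = b"5486000000620012"
with tempfile.TemporaryDirectory() as workdir:
    paths = []
    for index in range(4):
        # Demo data: only every second part file contains the search string.
        path = os.path.join(workdir, f"part{index}.bin")
        with open(path, "wb") as handle:
            handle.write(b"\x00" * 100 + needle * (index % 2) + b"\x00" * 100)
        paths.append(path)
    per_part = search_parts(paths, needle)

total = sum(per_part)
print(per_part, total)
```

In the lab each part file would sit on its own SSD, so the workers don’t compete for the I/O throughput of a single drive, which is the whole point of the setup.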
The most interesting results table is shown below:
So actually extracting a given search string from a 500 GB file could be done in about 21 minutes, employing readily available tools and using COTS hardware for about 3K EUR (as of March 2011). This means that an attacker in possession of (large) data sets resulting from previous eavesdropping attacks will most likely succeed in getting the exact data she’s going after. Furthermore the time needed scales linearly with the file size, so processing a 1 TB data volume would presumably have taken ∼42 minutes, a 2 TB file ∼84 minutes and so on. In addition, SSD prices are constantly declining.
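The extrapolation is simple proportional arithmetic, sketched below. It takes the 500 GB in 21 minutes data point from above as the only input, assumes decimal units (1 GB = 1000 MB), and the function name estimated_minutes is made up for this example.

```python
def estimated_minutes(size_gb, ref_size_gb=500.0, ref_minutes=21.0):
    """Linear extrapolation from the measured 500 GB / 21 min data point."""
    return size_gb * ref_minutes / ref_size_gb


# Implied sustained throughput of the measured run, in MB/s (decimal units).
throughput_mb_s = 500.0 * 1000 / (21.0 * 60)

estimates = {size_gb: estimated_minutes(size_gb) for size_gb in (1000, 2000)}

print(f"throughput: {throughput_mb_s:.0f} MB/s")
for size_gb, minutes in estimates.items():
    print(f"{size_gb} GB: about {minutes:.0f} minutes")
```

The implied throughput is a bit below 400 MB/s. The estimates only hold as long as I/O remains the limiting factor, which is what the linear behaviour of the measurements suggests.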
Thus it could be shown that the perception that the sheer volume of data gained from eavesdropping attacks on high speed links might prohibit an attacker from analyzing this data is, well, simply not correct ;-).
Risk Assessment & Mitigating Controls
Several factors come into play when trying to assess the actual risk of this type of attack. Let’s put it like this: once an attacker either has physical access to a fibre at some point or is able to get into the transport path by means of certain network based attacks (which are going to be covered in another, future post), collecting and analyzing the data is an easy task. If you have sufficient reasons for trusting the party actually implementing the connection (e.g. a carrier offering Metro Ethernet services) and “the overall circumstances”, you might rely on the isolation properties provided by the service and topology. If you either don’t have sufficient reasons to trust (some discussion on approaches to “evaluate trustworthiness” can be found here or here) or operate in a highly regulated environment, using encryption technologies on layer 2 (like these or these) might be a safer approach.
In the next post we’ll discuss the cloud based test setup, together with its results. Stay tuned &
have a great day,