Breaking

Some reflections on virtualization security, part 1

Today was an interesting day, for a number of reasons. Amongst those it stuck out that we were approached by two very large environments (both > 50K employees) to provide security review/advise, as they want to “virtualize their DMZs, by means of VMware ESX”.
[yes, more correctly I could/should have written: “virtualize some of their DMZ segments”. but this essentially means: “mostly all of their DMZs” in 6-12 months. and “their DMZ backend systems together with some internal servers” in 12-24 months. and “all of this” in 24-36 months. so it’s the same discussion anyway, just on a shifted timescale ;-)]

Out of some whim, I’d like to give a spontaneous response here (to the underlying question, which is: “is it a good idea to do this?”).

At first, for those of you who are working as ISOs, a word of warning. Some of you, dear readers, might recall the slide of my Day-Con3 keynote titled “Don’t go into fights you can’t win”.
[I’m just informed that those slides are not yet online. they will be soon… in the interim, to get an idea: the keynote’s title was “Tools of the Trade for Modern (C)ISOs” and it had a section “Communication & Tools” in it, with that mentioned slide].
This is one of the fights you (as an ISO) can’t win. Business/IT infrastructure/whoever_brought_this_on_the_table will. Get over it. The only thing you can do is “limit your losses” (more on this in second, or in another post).
Before, you are certainly eager to know: “now, what’s your answer to the question [good idea or not]?”.

I’m tempted to give a simple one here: “it’s all about risk [=> so perform risk analysis]”. This is the one we like to give in most situations (e.g. at conferences) when people expect a simple answer to a complex problem ;-).
However it’s not that easy here. In our daily practice, when calculating risk, we usually work with three parameters (each on a 1 [“very low”] to 5 [“very high” scale), that are: likelihood of some event (threat) occurring, vulnerability (environment disposes of, with regard to that event) and impact (if threat “successfully” happens).
Let’s assume the threat is “Compromise of [ESX] host, from attacker on guest”.
Looking at “our scenario” – that is “a number of DMZ systems is virtualized by means of VMware ESX” – the latter one (impact) might be the easiest one: let’s put in a “5” here. Under the assumption that at least one of the DMZ systems can get compromised by a skilled+motivated attacker at some point of time (if you would not expect this yourselves, why have you placed those systems in a DMZ then? πŸ˜‰ … under that assumption, one might put in a “2” for the probability/likelihood. Furthermore _we_ think that, in the light of stuff like thisΒ  and the horrible security history VMware has for mostly all of their main products, it is fair to go with a “3” for the vulnerability.
This, in turn, gives a “2 * 3 *5 = 30” for the risk associated with the threat “Compromise of [ESX] host, from attacker on guest” (for a virtualized DMZ scenario, that is running guests with a high exposure to attacks).

In practically all environments performing risk analysis similar to the one described above (in some other post we might sometimes explain our approach – used by many other risk assessment practitioners as well – in more detail), a risk score of “30” would require some “risk treatment” other than “risk retention” (see ISO 27005 9.3 for our understanding of this term).
Still following the risk treatment options outlined in ISO 27005, there are left:

a) risk avoidance (staying away from the risk-inducing action at all). Well, this is probably not what the above mentioned “project initiator” will like to hear πŸ˜‰ … and, remember: this is a fight you can’t win.

b) risk transfer (hmm… handing your DMZ systems over to some 3rd party to run them virtualized might actually not really decrease the risk of the threat “Compromise of [ESX] host, from attacker on guest” πŸ˜‰

c) risk reduction. But… so how? There’s not many options or additional/mitigating controls you can bring into this picture. The most important technical recommendation to be given here is the one of binding a dedicated NIC to every virtualized system (you already hear them yelling “why can’t we bring more than ~ 14 systems on a physical platform?”, don’t you? πŸ˜‰ ). Some minor, additional advise will be provided in another post, as will some discussion on the management side/aspects of “DMZ virtualization”. (notice how we’re cleverly trapping you into coming back here? πŸ˜‰
So, if you are sent back and asked to “provide some mitigating controls”… you simply can’t. there’s not much that can be done. You’re mostly thrown back to that well-known (but not widely accepted) “instrument of security governance”, that is: trust.

In the end of the day you have to trust VMware, or not.
We don’t. We – for us – do not think that VMware ESX is a platform suited for “high secure isolation” (at least not at the moment).
The jury is still out on that one… but presumably you all know the truth, at your very inner self πŸ˜‰
For completeness’ sake, here’s the general advice we give when we only have 60 seconds to answer the question “What do you think about the security aspects of moving systems to VMware ESX”. It’s split into “MUST” or (“DO NOT”) parts and “SHOULD” parts. See RFC 2119 for more on their meaning. Here we go:

1.) Assuming that you have a data/system/network classification scheme with four levels (like “1 = public” to “4 = strictly confidential/high secure”) you SHOULD NOT virtualize “level 4”. And think twice before virtualizing SOX relevant systems πŸ˜‰
2.) If you still do this (virtualizing 4s), you MUST NOT mix those with other levels on the same physical platform.
3.) If you mix the other levels, then you SHOULD only mix two levels next to each other (2 & 3 or 1 & 2).
4.) DMZ systems SHOULD NOT be virtualized (on VMware ESX as of the current security state).
5.) If you still do this (virtualizing DMZ systems), you MUST NOT mix those with Non-DMZ systems.

For those of you who have already violated advice no. 4 but – reading this – settle back mumbling “at least we’re following advice no. 5″… wait, my friends, the same people forcing you before will soon knock at your door … and tell you about all those “significant cost savings” again… and again…

Continue reading