Taming the BEAST

By now, a lot of people have heard of BEAST, an attack against the CBC-mode encryption (AES-CBC among others) used in SSL and TLS 1.0. Some people might also have noticed that the HekaFS git sources include “aes” and “cbc” branches, which represent two different implementations of a new at-rest encryption method to replace the weak AES-CTR version that we’re using as a placeholder, and those people might wonder whether we share the BEAST vulnerability. Short answer: we don’t. While Edward’s “aes” branch might implement real CBC, my “cbc” branch does not. Yeah, I know that’s confusing. Simply put, I use some of the “xxx_cbc” entry points for convenience, but only for one cipherblock at a time, so there’s no actual chaining involved. One correspondent has already pointed out – correctly – that “cbc” is a misnomer for what’s really tweaked ECB. Our scheme is actually pretty similar to LRW, but it uses a hash and a unique (per-file) salt instead of Galois-field multiplication. It was designed to defeat a completely different attack (modification of one ciphertext block leading to a predictable change in the next plaintext block), but it also avoids the guessable-IV flaw that BEAST exploits.
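
To make the tweaked-ECB idea concrete, here’s a minimal sketch in Python – not the actual HekaFS code – assuming the pyca “cryptography” package, SHA-256 as the hash, and made-up names (block_iv, encrypt_block, decrypt_block) throughout:

```python
# Minimal sketch of the scheme described above, NOT the real HekaFS code.
# Assumptions: the pyca "cryptography" package, SHA-256 as the hash, and
# illustrative names throughout.
import hashlib
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

BLOCK = 16  # AES cipherblock size in bytes

def block_iv(salt: bytes, offset: int) -> bytes:
    # Hash of (per-file salt, block offset) plays the role that
    # Galois-field multiplication plays in LRW.
    return hashlib.sha256(salt + offset.to_bytes(8, "big")).digest()[:BLOCK]

def encrypt_block(key: bytes, salt: bytes, offset: int, pt: bytes) -> bytes:
    # A one-block CBC call: the "chaining" step just XORs in our derived
    # IV and then stops, so this is really ECB with a per-block tweak.
    assert len(pt) == BLOCK
    enc = Cipher(algorithms.AES(key), modes.CBC(block_iv(salt, offset))).encryptor()
    return enc.update(pt) + enc.finalize()

def decrypt_block(key: bytes, salt: bytes, offset: int, ct: bytes) -> bytes:
    dec = Cipher(algorithms.AES(key), modes.CBC(block_iv(salt, offset))).decryptor()
    return dec.update(ct) + dec.finalize()

if __name__ == "__main__":
    import os
    key, salt = os.urandom(32), os.urandom(16)
    pt = b"sixteen byte blk"
    ct = encrypt_block(key, salt, 4096, pt)
    assert decrypt_block(key, salt, 4096, ct) == pt
```

Because each block’s IV depends only on the salt and the offset, any block can be encrypted or decrypted independently, which is exactly what random access to a file requires; and because the IV is neither guessable by an outsider nor chained from the previous ciphertext block, neither the BEAST trick nor the bit-flipping attack applies.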

There are no absolutes in this world, but I believe that our yet-to-be-named scheme is as secure as anything else out there with respect to the known attacks it was designed to thwart, BEAST included. More importantly, people who know a ton more about this stuff than I do seem to agree. The main knock against it is performance. Calculating a hash per cipherblock is more expensive than a simple XOR, and it precludes using AES-capable hardware to its fullest. (Sad fact: commodity crypto hardware will practically always implement crypto that’s near, if not beyond, the end of its useful life.) On the other hand, there are many hashes with a good balance between cryptographic strength and computational cost, and our approach allows us to use any of them. Hashing is in any case considerably less expensive than a full extra round of encryption would be, and that extra-encryption approach is commonly used (check out XEX and XTS in the Wikipedia article on disk encryption theory) without too many people complaining.
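
To illustrate the “any hash will do” point, here’s a small sketch – again Python and hashlib as stand-ins for whatever the real code uses, with tweak_for being an illustrative name of mine – showing the per-block tweak derivation parameterized by hash choice, plus a crude timing loop to compare their costs:

```python
# Sketch only: the per-block IV derivation parameterized by hash choice.
# hashlib.new() accepts any digest OpenSSL provides, so trading strength
# against cost is a one-line policy decision. Names are illustrative.
import hashlib
import os
import timeit

def tweak_for(algo: str, salt: bytes, offset: int, size: int = 16) -> bytes:
    h = hashlib.new(algo)
    h.update(salt + offset.to_bytes(8, "big"))
    return h.digest()[:size]

if __name__ == "__main__":
    salt = os.urandom(16)
    for algo in ("md5", "sha1", "sha256", "sha512", "blake2b"):
        # Python-level timings; the absolute numbers mean little, the
        # relative ordering is the point.
        t = timeit.timeit(lambda: tweak_for(algo, salt, 4096), number=100_000)
        print(f"{algo:8s} {t:.3f}s per 100k blocks")
```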

In the interest of full disclosure, I will point out that there is still one issue with the encryption we’re using. It’s one shared with practically all forms of storage encryption: the IV for a given block, though it has many good cryptographic properties, is still constant unless the key/IV/salt is changed periodically. That means the same plaintext at the same offset always encrypts to the same ciphertext, so someone watching the stored ciphertext over a long period can tell when a block changes or reverts to an earlier value. Changing the key/IV/salt is a very expensive process, which essentially involves both decrypting and re-encrypting the entire file, and it introduces significant key-management complexity. We might implement periodic re-keying some day, or at least some way to do it manually per file without having to make an actual copy (which wouldn’t be as transparent to someone using the file). Bear in mind, though, that attacks against the storage itself would have to be executed either by your storage provider or by someone who had already compromised that provider, and at that point they’re equivalent to attacks against the disk in your on-premises system. Other users of the same service would not be able to execute such an attack. It’s not perfect, but it’s at least as secure as your local storage, and it’s as good as anything I know about except maybe Tahoe-LAFS.
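
For what it’s worth, the manual per-file version would look something like the sketch below, reusing the encrypt_block/decrypt_block helpers from the first sketch; rekey_file and the in-place rewrite are assumptions of mine, not anything HekaFS actually ships:

```python
# Sketch of a manual per-file re-key/re-salt pass: decrypt every block
# under the old salt, re-encrypt under the new one. Assumes BLOCK,
# encrypt_block, and decrypt_block from the earlier sketch, a file whose
# size is a multiple of BLOCK, and no concurrent writers; real code would
# also need crash-safety (journal or copy-and-rename) and key management.

def rekey_file(path: str, key: bytes, old_salt: bytes, new_salt: bytes) -> None:
    with open(path, "r+b") as f:
        offset = 0
        while True:
            ct = f.read(BLOCK)
            if len(ct) < BLOCK:
                break  # sketch ignores any trailing partial block
            pt = decrypt_block(key, old_salt, offset, ct)
            f.seek(offset)
            f.write(encrypt_block(key, new_salt, offset, pt))
            offset += BLOCK
```

Even this simple version reads and rewrites every byte of the file, which is why I call re-keying expensive above.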

P.S. One of these days I’d like to do a comparison between HekaFS and Tahoe-LAFS, and maybe some thoughts on when you might want to use each. Zooko, would you be interested in collaborating on that?