DISCLAIMER: Do not use any of the material presented here to cause harm. I will find out where you live, I will surprise you in your sleep and I will tickle you so hard that you will promise to behave until the end of your days.
The story so far
Last year, at 28c3, Alexander Klink and Julian Wälde presented a way to run a denial-of-service attack against web applications.
One of the most impressive demonstrations of the attack was sending crafted data to a web application. The web application would dutifully parse that data into a hash table, not knowing that the data was carefully chosen so that every key sent would cause a collision in the hash table. The result is that the malicious data elicits the worst-case behavior of the hash table implementation: instead of amortized constant time per insertion, every insertion now causes a collision and degrades our hash table to nothing more than a fancy linked list, requiring time linear in the table size for each consecutive insertion. Details can be found in the Wikipedia article.
It is interesting to note that the attack has been known for quite some time now; it was already described in the 2003 Usenix paper Denial of Service via Algorithmic Complexity Attacks. At the time, the authors specifically targeted Perl, the Squid web proxy and, I’m not kidding, the Bro intrusion detection system. UPDATE: As Pascal Junod taught me today, Phrack seems to be the first to have officially published such an attack.
Nobody seemed to bother to check whether the attack might work against other hash functions as well; the issue was fixed in Perl and the rest of the targeted software, and everyone lived happily ever after. Until Klink and Wälde, eight years later, probably thought: “Wait a minute…”.
They investigated whether the attack could be extended to other popular hash functions used in current programming language implementations. The results were astonishing: nearly every hash function in use at the time could easily be attacked by the same mechanism - except for Perl’s, of course, since it had been fixed back in 2003.
The problem and how it was fixed last year
The principal problem that makes the attack possible is the ease with which arbitrary collisions can be produced for almost all popular general-purpose hash functions - Klink and Wälde presented some examples in their presentation. The very term “general-purpose” hash function contrasts them with “real” cryptographic hash functions. A cryptographic hash function is specifically designed to prevent anyone from finding collisions in a computationally feasible way. More generally, cryptographic hash functions are designed to thwart a number of other attacks as well, such as (second) preimage attacks. So why not use cryptographic hash functions right away and be done with it? The major reason is performance. The added security usually comes at a price: hash functions like SHA-1, the SHA-2 family or even MD5 are typically orders of magnitude slower than the average general-purpose hash function.
If you think about it, this trade-off is quite similar to what we observe in the world of random numbers. There are fast algorithms providing random numbers that perform great in statistical tests and simulations, yet it is often not too hard to predict them. This renders them useless in online games or security-critical applications. The solution is to use a cryptographically secure algorithm for random number generation, one that explicitly tries to prevent prediction of its output.
While for random number generators there is largely a consensus on where each kind should be used, the situation is less clear for cryptographic hash functions versus non-cryptographic general-purpose hash functions. Traditionally, the line was drawn between situations where we explicitly require the added security and situations where we can safely give it up and use a general-purpose function instead - much like for random number generators. But the “hashDoS” problem has blurred that line quite a bit.
There are voices that deny the validity of the attack. It’s the nature of the algorithm, so how can this be an attack, right? I really don’t like this argument. Just because it’s not “fair”, it still is rather annoying that anyone can take down a web server with this kind of attack. As long as the danger is clear and present, as long as it can effectively be used to cause real harm, it sounds like a problem to me. A problem that must be fixed. We do, after all, live in the age of Cyber Warfare™.
Algorithmic complexity attacks are not only applicable in the context of hash functions. They apply in basically any context where an algorithm’s worst-case behavior differs from its average-case behavior. The original Usenix paper mentions Quicksort, binary trees and regexp matchers as other prominent examples, and I am sure there are many more to be found. In the case of Quicksort, we can effectively solve the problem by doing an initial random shuffle of the input data to be sorted. Using cryptographically secure random numbers, of course.
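A minimal sketch of that idea in C (qsort stands in for the sorting routine; rand() keeps the sketch short where a real application would use a cryptographically secure generator):

#include <stdlib.h>

/* Shuffle before sorting: destroys any adversarially chosen input order
 * that would otherwise trigger the sorter's quadratic worst case.
 * rand() keeps the sketch self-contained - use a cryptographically
 * secure source (e.g. bytes from /dev/urandom) in a real application. */
static void shuffle(int *a, size_t n)
{
    if (n < 2)
        return;
    for (size_t i = n - 1; i > 0; i--) {
        size_t j = (size_t)rand() % (i + 1);   /* Fisher-Yates step */
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }
}

static int cmp_int(const void *x, const void *y)
{
    int a = *(const int *)x, b = *(const int *)y;
    return (a > b) - (a < b);
}

static void sort_defensively(int *a, size_t n)
{
    shuffle(a, n);                      /* randomize first...    */
    qsort(a, n, sizeof *a, cmp_int);    /* ...then sort as usual */
}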
Randomness is very effective in mitigating these kinds of problems. That is probably why the fix proposed last year was inspired by Universal Hashing, also covered in the venerable Introduction to Algorithms. By randomly picking a hash function from a family of hash functions fulfilling certain requirements, we are guaranteed an upper bound on the probability of collisions occurring. This sounds exactly like what we want, solving our problem for good.
Most general-purpose hash functions use a seed value from which the subsequent output is generated. The idea was to randomize this seed in order to randomize the hash function as a whole. The seed would not be regenerated once it had been created (typically at startup); once generated, it would stay the same for each invocation of the hash function. The same approach is used in the context of cryptographic hash functions as well: there, if the initial seed value is kept secret (a “key”), the resulting hash function can be regarded as a pseudo-random function (or PRF). This looks a lot like something we could use in the context of Universal Hashing to achieve our upper bound on collisions. Problem solved?
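Sketched in C, last year’s fix amounts to something like this (the names are illustrative, not taken from any particular interpreter):

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* One global seed, generated once at interpreter startup - the essence
 * of the randomization fix deployed after the 2011 hashDoS talk. */
static uint64_t hash_seed;

static void init_hash_seed(void)
{
    FILE *f = fopen("/dev/urandom", "rb");   /* OS CSPRNG */
    if (!f || fread(&hash_seed, sizeof hash_seed, 1, f) != 1)
        abort();                             /* no randomness, no safety */
    fclose(f);
}

/* Every string hash call then mixes in the secret seed, e.g.
 * hash_string(key, len) = general_purpose_hash(key, len, hash_seed). */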
The flaw in last year’s fix
Having a PRF would fulfill the requirements needed for Universal Hashing, but unfortunately none of the general-purpose hash functions in use was designed for this purpose. While randomizing the seed properly (using a cryptographically secure random number generator) looks like it would achieve that goal, the conclusion that the resulting hash function would automatically become a PRF is wrong. As a very simple example, take the following “hash function”:
f(seed) = 42
Regardless of the randomness of the seed, the overall function is still perfectly predictable. Even if we roll a die or split atoms to generate our seed, nothing makes this a better hash function, let alone a PRF.
While the typical general-purpose hash function is of course not as predictable as this example, it is still possible to distinguish its output from truly random output, even when the seed is randomized. As long as we can find “distinguishers” that tell our hash function apart from a truly random function, we have proof that the function is not a PRF.
The fact that general-purpose hash functions are not chosen from a “universal” family of hash functions, combined with the fact that they are not PRFs, should lower our confidence in their collision resistance considerably. This opens the door (at least in theory) to finding other ways of producing collisions for them. And here is how.
Here come the cryptographers
While working on SipHash - more on that later - Jean-Philippe Aumasson and Daniel J. Bernstein also analyzed the collision properties of popular general-purpose hash functions. What was probably novel about their approach is that they treated these hash functions as if they were cryptographic hash functions. Since it is collisions we are interested in, they used the tools normally used to analyze the collision resistance of “real” cryptographic hash functions, for example differential cryptanalysis.
Not surprisingly, their results were impressive. None of the functions analyzed were designed to withstand this level of scrutiny. Among the most impressive results are completely breaking CityHash64, finding a way to retrieve the secret seed of the hash functions used in Python 2.7.3 and 3.2.3, and completely breaking MurmurHash 2 and 3. “Completely breaking” means they were able to produce arbitrary multicollisions even when the seed was properly randomized. The resulting attacks are much more powerful than the typical attacks you would hope to find against your garden-variety cryptographic hash function. Typically, cryptographers are happy if they manage to find a collision at all, or they beam with joy if they manage to decrease the theoretical complexity of finding a collision by a few powers of two. Just to point out how devastating these attacks would be in the context of cryptographic hash functions.
We have the recipe - now all we need is the (hash) cake
I will focus on the results for MurmurHash in the rest of this article, since this hash function is used in a variety of software projects, for example CRuby, JRuby, Rubinius, Redis, and also in Oracle JDK and OpenJDK. It is a bit tragic that MurmurHash 3 was only introduced in the JDK after last year’s hashDoS presentation - they adopted the fix being proposed at the time and offer the option to run the JVM using MurmurHash 3 as the hash function used for string hashing. It is (was?) also planned to make this the default hashing function in future releases of the JDK.
Jean-Philippe and I discussed how we could transform their findings into real-world attacks that would show off the potentially disastrous consequences of not dealing with this problem. Since I am most familiar with the source code of the Ruby implementations, I started applying Jean-Philippe’s recipe to Ruby. JRuby and CRuby chose MurmurHash 2 as their hash function, while Rubinius switched to MurmurHash 3 in response to last year’s hashDoS. It was rather easy to turn the “recipe” into valid Ruby exploit code. There it was. Proof that this really worked. I got excited. The only problem was with the 64 bit version of CRuby. There hadn’t been a PoC yet for the particular variant of Murmur 2 used there:
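The relevant definition in CRuby’s st.c looks roughly like this (paraphrased from memory, not the verbatim source):

#define MurmurMagic_1 (st_index_t)0xc6a4a793   /* MurmurHash 1 magic        */
#define MurmurMagic_2 (st_index_t)0x5bd1e995   /* 32-bit MurmurHash 2 magic */

#if SIZEOF_ST_INDEX_T > 4
/* 64 bit: both constants concatenated into a single 64-bit magic value */
#define MurmurMagic ((MurmurMagic_1 << 32) | MurmurMagic_2)
#else
#define MurmurMagic MurmurMagic_2
#endif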
On a 64 bit system, SIZEOF_ST_INDEX_T is larger than 4, and what happens is that MurmurMagic is defined as the concatenation of the “magic constant” of MurmurHash in its version 1 and the actual constant that is used in the 32 bit version of MurmurHash 2. This complicated things a bit, and the generic recipe had to be adjusted slightly in order to still be able to produce collisions. Let’s have a look at the implementation of Murmur 2 in 64 bit CRuby. This is a simplified version of the one implemented in st.c that still covers the important details:
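Here is an illustrative C version of that 64 bit Murmur 2 variant. The magic constant and the shift are taken from the public MurmurHash64A reference code, and the mixing order follows the walkthrough below - treat it as a sketch rather than the exact st.c source:

#include <stdint.h>
#include <string.h>

static uint64_t murmur2_64(const void *key, size_t len, uint64_t seed)
{
    const uint64_t m = 0xc6a4a7935bd1e995ULL; /* the "concatenated" magic     */
    const int r = 47;                         /* reference shift; the CRuby
                                                 variant may use another one  */
    const uint8_t *data = (const uint8_t *)key;
    uint64_t h = seed ^ (uint64_t)len;        /* only seed and len go in here */

    while (len >= 8) {
        uint64_t k;
        memcpy(&k, data, 8);                  /* one 8-byte block             */

        k *= m;                               /* the block mix; note that     */
        k ^= k >> r;                          /* it does not depend on the    */
        k *= m;                               /* seed at all                  */

        h *= m;
        h ^= k;

        data += 8;
        len  -= 8;
    }

    /* the switch block: remaining 1..7 bytes */
    switch (len) {
    case 7: h ^= (uint64_t)data[6] << 48; /* fall through */
    case 6: h ^= (uint64_t)data[5] << 40; /* fall through */
    case 5: h ^= (uint64_t)data[4] << 32; /* fall through */
    case 4: h ^= (uint64_t)data[3] << 24; /* fall through */
    case 3: h ^= (uint64_t)data[2] << 16; /* fall through */
    case 2: h ^= (uint64_t)data[1] << 8;  /* fall through */
    case 1: h ^= (uint64_t)data[0];
            h *= m;
    }

    /* final avalanche omitted: it cannot turn equal h values into
       different ones, so it does not matter for collisions */
    return h;
}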
As we immediately notice, if our input data is a multiple of 8 bytes long, the code in the switch block will not be executed. If we want to successfully generate multicollisions, it makes sense to restrict ourselves to inputs whose length is a multiple of 8 bytes; this only makes the task easier. Another fact we observe is that the seed doesn’t play a very important role in the algorithm. In fact, the entire transform of the data being processed is independent of the seed that was chosen. It turns out that we can exploit these facts to trick even a randomized MurmurHash into producing collisions at will. Without further ado, here is the code that produces arbitrary multicollisions for 64 bit CRuby:
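The original PoC was written in Ruby; the following C sketch models its structure after the description below. DIFF places the 0x80 difference in the most significant byte of each 8-byte word (that is, in every eighth byte of the string on a little-endian machine), and invert64 is the mysterious helper that will be shown later:

#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DIFF 0x8000000000000000ULL   /* 0x80 in the top byte of each 8-byte word */

uint64_t invert64(uint64_t t);       /* the "magic" helper, shown further below */

/* Write 2^blocks distinct keys of blocks*16 bytes each into keys[],
 * laid out back to back; they all produce the same Murmur 2 hash,
 * no matter which seed was chosen. */
static void make_collisions(size_t blocks, uint8_t *keys)
{
    uint64_t in0[2 * blocks], in1[2 * blocks];   /* C99 variable-length arrays */

    /* in0: random words; in1: the same words with DIFF XORed in */
    for (size_t i = 0; i < 2 * blocks; i++) {
        in0[i] = ((uint64_t)rand() << 32) ^ (uint64_t)rand();  /* toy randomness */
        in1[i] = in0[i] ^ DIFF;
    }

    /* every combination of "take 16-byte block j from in0 or from in1" */
    for (size_t combo = 0; combo < ((size_t)1 << blocks); combo++) {
        uint8_t *key = keys + combo * blocks * 16;
        for (size_t j = 0; j < blocks; j++) {
            const uint64_t *src = ((combo >> j) & 1) ? in1 : in0;
            uint64_t w0 = invert64(src[2 * j]);      /* pre-invert so Murmur's  */
            uint64_t w1 = invert64(src[2 * j + 1]);  /* block mix restores them */
            memcpy(key + 16 * j,     &w0, 8);
            memcpy(key + 16 * j + 8, &w1, 8);
        }
    }
}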
This is the high-level representation of the algorithm. The blocks parameter controls how many collisions will be produced in the end. We generate a random string that is blocks times 16 bytes long. Next, we mostly copy this string, introducing tiny differences by XORing the string with DIFF, which alters the original string only in every eighth byte.
Let’s just consider the first 16 bytes of in0 and the first 16 bytes of in1 for now. Hashing them with Murmur means that there are two data blocks that would be processed into intermediate values for k. Each of these k values would itself be XORed into the current h value at that point. We see that h is initialized only with seed and len, so its initial value is independent of the string to be hashed. Let’s imagine we could force Murmur into producing the 16-byte strings from above as the intermediate values of k. The first 8 bytes of k would then finally result in the first 8 bytes of in0 and in1 respectively, and let’s assume the same was possible for the second 8 bytes that are finally assigned to k. If, by magic, we were able to pull this off, then we could produce a collision! Why? Let’s have a detailed look at what would happen:
Say we call the first 8 bytes of in0 x0 and the second 8 bytes x1. Also call the two 8-byte halves of in1 y0 and y1. By virtue of how we constructed in0 and in1, both x0 and y0 as well as x1 and y1 differ in exactly one byte, by 0x80. If we first process x0 and x1, this is what will happen:
h := seed ^ len
h := h * m
h := h ^ x0
h := h * m
h := h ^ x1        # call this final value h*
When processing y0 and y1, the process is this:
h := seed ^ len
h := h * m
h := h ^ y0 # difference of 0x80 compared to h ^ x0
h := h * m # difference is preserved
h := h ^ y1 # another difference of 0x80 - call this final value h**
By our choice of DIFF, we make sure that the multiplication by m preserves our difference of 0x80. Different choices could lead to the difference being changed by the multiplication, so it’s important to keep it exactly the way it is. But now, since the difference is kept in the same byte, the final XOR of h ^ y1 introduces yet another difference of 0x80 in the same byte, and therefore the two differences cancel each other out: h** is indeed equal to h*! Building on this, it is not hard to see the general approach. Form a chain of such 16-byte blocks in0 and in1 and produce every possible combination that replaces 16-byte blocks of in0 with the corresponding 16-byte blocks of in1. There are 2^n possible combinations if both in0 and in1 are of length 16·n. Thus, we have 2^n distinct strings that produce the same final hash value, and thus 2^n multicollisions.
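If you want to convince yourself of the difference-preservation step, a single assert is enough (m is the 64 bit Murmur magic; the value of a is arbitrary):

#include <assert.h>
#include <stdint.h>

int main(void)
{
    const uint64_t m = 0xc6a4a7935bd1e995ULL;
    uint64_t a = 0x0123456789abcdefULL;        /* any value works here   */
    uint64_t b = a ^ 0x8000000000000000ULL;    /* 0x80 in the top byte   */

    /* flipping only the top bit changes the value by exactly 2^63, and
     * 2^63 * m is again 2^63 modulo 2^64 because m is odd - so the
     * difference survives the multiplication unchanged */
    assert(((a * m) ^ (b * m)) == 0x8000000000000000ULL);
    return 0;
}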
Note that this thought experiment used an as yet unsolved element of magic. In fact, in0 and in1 in their current form are useless, since we said that we wanted their values to become the intermediate results of the k values. Right now, we don’t know how to get there. Recall that there was a call to the mysterious function invert in the previous snippet that we haven’t seen yet. In the above snippet, we apply invert to our strings in0 and in1. You guessed right: this is what prepares them in a clever way so that, when they are processed by MurmurHash, MurmurHash is actually forced to produce the values in the form that we desire. Think of it as first applying an “inverse MurmurHash” procedure to our input strings, which will be cancelled out by applying Murmur, yielding the original values that we need for our collision. Voilà, the heart of the attack:
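The original invert64 was part of the Ruby PoC; here is a C sketch of the same idea. INV_MAGIC is the multiplicative inverse of m modulo 2^64 - computed at runtime here instead of hard-coding the constant - and R is the shift used by the Murmur variant under attack:

#include <stdint.h>

#define MAGIC 0xc6a4a7935bd1e995ULL   /* m, as in the Murmur sketch above */
#define R     47                      /* must match the shift Murmur uses */

/* multiplicative inverse of an odd 64-bit number modulo 2^64
 * (Newton's iteration; every step doubles the number of correct low bits) */
static uint64_t inv64(uint64_t a)
{
    uint64_t x = a;
    for (int i = 0; i < 6; i++)
        x *= 2 - a * x;
    return x;
}

/* undo  y = z ^ (z >> r):  z = y ^ (y >> r) ^ (y >> 2r) ^ ... */
static uint64_t unxorshift(uint64_t y, int r)
{
    uint64_t z = y;
    for (int s = r; s < 64; s += r)
        z ^= y >> s;
    return z;
}

/* Prepare a value so that Murmur's block mix reproduces it:
 * mixing invert64(t) with  k *= m; k ^= k >> r; k *= m;  yields t again. */
uint64_t invert64(uint64_t t)
{
    const uint64_t INV_MAGIC = inv64(MAGIC);
    uint64_t x = t * INV_MAGIC;   /* cancelled by Murmur's final  k *= m      */
    x = unxorshift(x, R);         /* cancelled by Murmur's        k ^= k >> r */
    x = x * INV_MAGIC;            /* cancelled by Murmur's first  k *= m      */
    return x;
}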
What invert64 does is apply a series of transforms to our initial input so that applying the corresponding operations of MurmurHash will yield our original input again. Contrast the above with what happens inside MurmurHash:
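This is the per-block mixing step (quoted from the sketch above rather than from st.c):

k *= m;
k ^= k >> r;
k *= m;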
The input k is first multiplied by m. INV_MAGIC is actually the inverse of m mod 2^64. This means that multiplying k by m yields the value of x in the invert64 function right after we applied the series of XORs and right shifts. Or, in simpler terms, the first operation of Murmur cancels out the last operation of invert64.
The series of XORs and right shifts by R that was applied in invert64 was carefully chosen in a way that ensures that applying the transform k ^= k >> r in Murmur itself produces the value of x right after the first multiplication by INV_MAGIC in invert64. It’s a bit tricky to see why this works, but if you’re not convinced, you can easily try it on a piece of paper.
Finally, multiplying k again by m cancels out the first multiplication of x in invert64. The end result for k is the initial value that we originally submitted to invert64. Applying invert64 to that input first has forced MurmurHash into producing the original input after it is done processing our modified, “inverted” input - we have solved the piece of “magic” that was still missing.
If you run the script on a 64 bit CRuby before 1.9.3p327, it will indeed produce 2^4 = 16 collisions from (slightly) different input keys.
The actual hash values will vary between runs due to the randomized seed, but within a run they are all equal. Sweet. We can produce collisions for MurmurHash at will now (the approach is similar for MurmurHash 3), but how can we use this in the real world?
Finding a good showcase for the attack
So being able to produce collisions is one thing, but we need a prominent example of where to apply this in order to cause real impact. Hashes (the data structure) are used in a huge number of places in Ruby; in particular, web applications typically create hashes from user input when processing requests. Rails was one of the targets last year. The particular attack used form-encoded POST data that would be inserted into a Hash, causing Rails to choke on the malicious data.
As a consequence, Rack (which is where this was ultimately happening) introduced KeySpaceConstrainedParams. This clever little bastard limits the total size of all keys being parsed from a request, and it is very effective in doing so. My first attempts to fool Rails failed miserably, because the limit kicked in very quickly. After all, for our attack to be successful, we need to insert very large String keys (strings of 16·n bytes for 2^n collisions). I was devastated. It just couldn’t be that I had come this far only to be stuck now. FFFFFFFUUUUUUUUUUUU. Times ten.
After I recovered from my misery, I recalled that Hashes are used everywhere. I mean, literally. So who cares if form-encoded data won’t work? Let’s just move elsewhere. It wasn’t hard to choose a good candidate - in times of SOA, Software as a Service, web APIs, and single-page JavaScript web applications, what better candidate than JSON could there be? All the different JSON parsers probably do what is the most obvious thing to do: they parse a JSON object into a Hash.
I tried targeting the JSON implementation in C that ships with CRuby’s standard library, certain of my imminent success. But F7U12 all over again. What the F*ck? What’s your problem now, Ruby? Turns out that the JSON implementation complained about the encodings of the strings it was being presented with. I was near a heart attack, so I couldn’t find the strength to actually find out what the problem was. I had only one idea left before I would commit Harakiri. I would simply run a brute-force search for values that would not suffer from the encoding issue.
Much to my surprise, this worked. And with reasonable performance even. Relieved, I took those precomputed values and created a serialized HTTP request that I could simply read as a raw String and send to the Rails app via Net::HTTP. It worked. And the nice thing about it: your app doesn’t even have to expose a JSON end point at all to make this work!
Next target: The Enterprise
Now I didn’t want to let this become a Ruby-only thing. Being involved in Ruby myself, this was the last thing I would want to happen. But there still was another high-profile candidate that I hadn’t touched yet: Java. How hard could it be to break Java’s hash now that I had endured the Ruby experience? That should be a sure thing, a mere exercise.
Alright, let’s apply the general recipe to Java Strings, crafting our byte arrays in the same way that we did for Ruby. Save, run and BOOM!, right? No. Save, run and FFFFFFFUUUUUUUUUUUU! What is it this time? Here’s what happened: not thinking much about it, I had used the byte[] constructor of String.
The party pooper is StringCoding.decode, which that constructor runs all input through. Whatever values I feed into the constructor, decode is gonna mess with them. I was overwhelmed with fear. Could it be that Java is safe? You can’t touch this?
One last option I saw would have been to try to “invert” what happens in decode, much like invert64 inverts the operations of MurmurHash itself. But plagued by failure and misery, I started staring at the MurmurHash 3 implementation in the JDK instead: there must be another way…
And then it struck me - “Wait a minute! They’re operating on char[]!”. And I remembered that String features a constructor for char[] as well, so I took a look at it in the JDK sources.
Muahahahaha. No decoding or any other annoying thing happening - that constructor simply copies the characters. The rest was easy. I would simply take 16 bit-wide chars and do my thing on two bytes at a time, translating the operations on bytes one-to-one to chars. It worked. Take this, Enterprise!
How can we fix this?
We reported our findings and CVEs were assigned. At this point I really want to give props to how this issue was handled in the Ruby community. I believe they did an amazing job already after last year’s hashDoS, being the first to come up with fixes for the problem. The same could be observed again this year. I’m really proud of being a part of this community, seeing how the issue was handled after we had reported it. A special thank you to Hiroshi Nakamura, who managed the issue for CRuby and JRuby and also took care of the communication with oCERT. Ruby 1.9.3p327 was released, JRuby and Rubinius followed.
At Appsec Forum in Switzerland, Jean-Philippe and I presented our work together with a possible solution to the dilemma (slides). So what can we do? Clearly, what we want is collision resistance, even when facing malicious adversaries. As already mentioned, this is typically the domain of cryptographic hash functions, but we already concluded that they are just too slow for our purposes. The holy grail would be a cryptographic hash function whose performance is comparable to the general-purpose hash functions we use today. This is what SipHash tries to do for us. It was specifically designed to prevent the attacks described so far, yet it is comparable in speed to general-purpose hashes. It is much faster than traditional cryptographic hashes, too. It is slower than MurmurHash, but not by much.
Does SipHash solve the problem once and for all? As with all cryptographic primitives, the current state of the art in academia does not allow us to prove that SipHash is indeed a PRF. But it was designed to be one, by experts in the field. Of course, you shouldn’t trust any cryptographic primitive until it has been extensively tested by all sorts of different people, and it might even happen that one day somebody finds a flaw in the design. But, and let me emphasize this, SipHash is the first “general-purpose” hash to explicitly address the issue. While you will probably manage to find distinguishers for other popular general-purpose hashes quite easily, it is very likely that you will have a much harder time doing so for SipHash. There are many implementations of SipHash available by now, in case you’d like to use it in your projects, too.
Other fixes?
The fact that SipHash is slower than MurmurHash bugs many language implementors. Just because one child misbehaves, the whole class has to suffer the consequences? This seems unfair - we all know that this is not considered contemporary educational practice. That’s why other fixes were proposed, after last year’s hashDoS as well.
One prominent proposal was to limit the parameters of the software that could potentially be exploited by this attack. It is an application concern after all, so we should handle it at the application level. We saw KeySpaceConstrainedParams, and limiting incoming parameters was also the course taken in PHP last year, if I remember correctly. If we learned one thing from my odyssey, it is that this approach is bound to fail. My experience with KeySpaceConstrainedParams was living proof of what would happen in reality. I do believe that it is correct and important to constrain the size of request parameters for security purposes, but it is obviously not going to help with this particular problem. One hole was fixed, but courtesy of the omnipresence of hashes in today’s programming it was easy for me to just move on to the next target that used hashes and where nobody had bothered to take care of the issue yet. Constraining parameters in every piece of code that makes use of a Hash is simply an illusion. Imagine the effort needed. Imagine the ugly code bloat caused by an aspect that has nothing to do with the business logic. Even if we managed to fix this in every place where a Hash is created directly from incoming input, there may be numerous other places where Hashes are created only at a second stage, depending only indirectly on that input. The probability of overlooking something or fixing one place the wrong way is tremendously high. No matter how hard we try, I am positive that there will always be yet another loophole. I’m also reluctant to leave it up to developers to solve the problem in everyday coding. Imagine how often we would have to tell ourselves “Alas, a hash! We need to restrict parameters!”. Errrr… not happening. I believe the only chance we have to master the problem is to fix it at its root - and that is by fixing the core hash function.
A different proposal was to look into other data structures that have different worst-case characteristics than hash tables - some form of trie, for instance. I’m reluctant. I played with crit-bit trees after last year’s hashDoS and also looked into ways of balancing tries in general, but the only good options I was able to find involved hashing - the irony! For something like red-black trees, I once read (can’t find the source anymore, unfortunately) that there are “rebalancing attacks” that carefully put the tree into a state where cascading rebalancing operations are triggered. The “H” in Hash Array Mapped Tries is proof enough that attacks similar to the one described here could be carried out against them as well. And after all, none of the alternatives gives us the dream that motivated “associative arrays” in the first place: (amortized) constant-time insertion and access to the elements being stored. With this in mind, I feel that the hash table is not becoming obsolete any time soon.
Who else might have a problem now?
Most certainly, anyone using MurmurHash 2/3 or CityHash is in trouble. But even if other general-purpose hash functions haven’t been broken yet, confidence in their collision resistance in the face of malicious adversaries should be low. Languages that merely applied some form of parameter restriction to solve the issue should clearly rethink their strategy - the possibility of loopholes is very high, as the Rails PoC demonstrated. What about languages that use different data structures for implementing associative arrays, like Scala and Clojure, for example? If I’m not mistaken, they use a HAMT internally. As I mentioned earlier, the mere fact that a hash function is involved could already be dangerous.
Many people like to pick on Microsoft, but credit where credit is due: they seem to have done the right thing after last year’s incident. Here is how they fixed String.GetHashCode.
They introduced a feature that lets you turn on randomized hashing manually. And here is where it gets scary: if turned on, the hash will be computed by the mysterious InternalMarvin32HashString.
It is imported from an external DLL, clr.dll to be precise. Since reverse engineering is illegal in Germany, I had to call my man overseas and ask him to investigate further. And here’s what he found: after reverse engineering the code, he is pretty certain that the ominous Marvin hash is a fairly good hash function - at least on the surface it doesn’t look like it could be attacked easily. Take this with a grain of salt, though; my man overseas told me that he didn’t have much time to analyze the function, and this is just what he was able to find on a cursory glance over the disassembled code. What’s further disturbing is that googling for anything about “Marvin32” hasn’t brought up anything. Zilch. Security by obscurity? At least it’s effective, for now. If anyone knows more about this mysterious hash function, please contact us! We’d be delighted to know more about its background.
Final words
We certainly want to explore further candidates in the future. If you are involved in the development of any product using hash tables and feel like you are potentially affected, please do not hesitate to contact us, we would be happy to assist!
The code for the Proof of Concept attacks has been published here. Please feel free to experiment with it, but please also read the disclaimer of this article. In case you want to know more about the details of this attack, especially about the details of CityHash and Python’s hash, there is going to be a presentation at 29c3 that will cover these aspects and the design principles of SipHash in much more detail.