DISCLAIMER: Do not use any of the material presented here to cause harm. I will find out where you live, I will surprise you in your sleep and I will tickle you so hard that you will promise to behave until the end of your days.
The story so far
Last year, at 28c3, Alexander Klink and Julian Wälde presented a way to run a denial-of-service attack against web applications.
One of the most impressive demonstrations of the attack was sending crafted data to a web application. The web application would dutifully parse that data into a hash table, not knowing that the data was carefully chosen so that every key sent would cause a collision in the hash table. The result is that the malicious data elicits the worst-case behavior of the hash table implementation: instead of amortized constant time per insertion, every insertion now causes a collision and degrades our hash table to nothing more than a fancy linked list, requiring time linear in the table size for each consecutive insertion. Details can be found in the Wikipedia article.
It is interesting to note that the attack has been known for quite some time now; it was already described in the 2003 Usenix paper Denial of Service via Algorithmic Complexity Attacks. At the time, the authors specifically targeted Perl, the Squid web proxy and, I’m not kidding, the Bro intrusion detection system. UPDATE: As Pascal Junod taught me today, Phrack seems to be the first to have officially published such an attack.
Nobody seemed to bother to check whether the attack might work against other hash functions as well; the issue was fixed in Perl and the rest of the targeted software, and everyone lived happily ever after. Until Klink and Wälde, eight years later, probably thought: “Wait a minute…”.
They investigated whether the attack could be extended to other popular hash functions used in current programming language implementations. The results were astonishing: nearly every hash function in use at the time could easily be attacked by the same mechanism - except for Perl’s, of course, since it had been fixed back in 2003.
The problem and how it was fixed last year
The principal problem that makes the attack possible is the ease with which arbitrary collisions can be produced for almost all popular general-purpose hash functions - Klink and Wälde presented some examples in their presentation. The very term “general-purpose” hash function contrasts them with “real” cryptographic hash functions. A cryptographic hash function is specifically designed to prevent anyone from finding collisions in a computationally feasible way. More generally, cryptographic hash functions are designed to thwart a number of other attacks as well, such as (second) preimage attacks. So why not use cryptographic hash functions right away and be done with it? The major reason is performance. The added security usually comes at a price: hash functions like SHA-1, the SHA-2 family or even MD5 are typically orders of magnitude slower than the average general-purpose hash function.
If you think about it, this trade-off is quite similar to what we observe in the world of random numbers. There are fast algorithms providing random numbers that perform great in statistical tests and simulations, yet it is often not too hard to predict them. This renders them useless in online games or security-critical applications. The solution is to use a cryptographically secure algorithm for random number generation, one that explicitly tries to prevent prediction of its output.
While for random number generators there is largely a consensus on where each kind should be used, the situation is less clear for cryptographic hash functions versus non-cryptographic general-purpose hash functions. Traditionally, the line was drawn between situations where we explicitly require the added security and situations where we can safely give it up and use a general-purpose function instead - much like for random number generators. But the “hashDoS” problem has blurred that line quite a bit.
There are voices that deny the validity of the attack. It’s the nature of the algorithm, so how can this be an attack, right? I really don’t like this argument. Just because it’s not “fair”, it still is rather annoying that anyone can take down a web server with this kind of attack. As long as the danger is clear and present, as long as it can effectively be used to cause real harm, it sounds like a problem to me. A problem that must be fixed. We do, after all, live in the age of Cyber Warfare™.
Algorithmic complexity attacks are not only applicable in the context of hash functions. They apply in basically any context where an algorithm’s worst-case behavior differs from its average-case behavior. The original Usenix paper mentions Quicksort, binary trees and regexp matchers as other prominent examples, and I am sure there are many more to be found. In the case of Quicksort, we can effectively solve the problem by doing an initial random shuffle of the input data to be sorted. Using cryptographically secure random numbers, of course.
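A minimal sketch of that idea in C (qsort stands in for the sorting routine; rand() keeps the sketch short where a real application would use a cryptographically secure generator):

#include <stdlib.h>

/* Shuffle before sorting: destroys any adversarially chosen input order
 * that would otherwise trigger the sorter's quadratic worst case.
 * rand() keeps the sketch self-contained - use a cryptographically
 * secure source (e.g. bytes from /dev/urandom) in a real application. */
static void shuffle(int *a, size_t n)
{
    if (n < 2)
        return;
    for (size_t i = n - 1; i > 0; i--) {
        size_t j = (size_t)rand() % (i + 1);   /* Fisher-Yates step */
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }
}

static int cmp_int(const void *x, const void *y)
{
    int a = *(const int *)x, b = *(const int *)y;
    return (a > b) - (a < b);
}

static void sort_defensively(int *a, size_t n)
{
    shuffle(a, n);                      /* randomize first...    */
    qsort(a, n, sizeof *a, cmp_int);    /* ...then sort as usual */
}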
Randomness is very effective in mitigating these kinds of problems. That is probably why the fix proposed last year was inspired by Universal Hashing, also covered in the venerable Introduction to Algorithms. By randomly picking a hash function from a family of hash functions fulfilling certain requirements, we are guaranteed an upper bound on the probability of collisions occurring. This sounds exactly like what we want, solving our problem for good.
Most general-purpose hash functions use a seed value from which the subsequent output is generated. The idea was to randomize this seed in order to randomize the hash function as a whole. The seed would not be regenerated once it had been created (typically at startup); once generated, it would stay the same for each invocation of the hash function. The same approach is used in the context of cryptographic hash functions as well: there, if the initial seed value is kept secret (a “key”), the resulting hash function can be regarded as a pseudo-random function (or PRF). This looks a lot like something we could use in the context of Universal Hashing to achieve our upper bound on collisions. Problem solved?
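Sketched in C, last year’s fix amounts to something like this (the names are illustrative, not taken from any particular interpreter):

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* One global seed, generated once at interpreter startup - the essence
 * of the randomization fix deployed after the 2011 hashDoS talk. */
static uint64_t hash_seed;

static void init_hash_seed(void)
{
    FILE *f = fopen("/dev/urandom", "rb");   /* OS CSPRNG */
    if (!f || fread(&hash_seed, sizeof hash_seed, 1, f) != 1)
        abort();                             /* no randomness, no safety */
    fclose(f);
}

/* Every string hash call then mixes in the secret seed, e.g.
 * hash_string(key, len) = general_purpose_hash(key, len, hash_seed). */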
The flaw in last year’s fix
Having a PRF would fulfill the requirements needed for Universal Hashing, but unfortunately none of the general-purpose hash functions in use was designed for this purpose. While randomizing the seed properly (using a cryptographically secure random number generator) looks like it would achieve that goal, the conclusion that the resulting hash function would automatically become a PRF is wrong. As a very simple example, take the following “hash function”:
f(seed) = 42
Regardless of the randomness of the seed, the overall function is still perfectly predictable. Even if we roll a die or split atoms to generate our seed, nothing makes this a better hash function, let alone a PRF.
While the typical general-purpose hash function is of course not as predictable as this example, it is still possible to distinguish its output from truly random output, even when the seed is randomized. As long as we can find “distinguishers” that tell our hash function apart from a truly random function, we have proof that the function is not a PRF.
The fact that general-purpose hash functions are not chosen from a “universal” family of hash functions, combined with the fact that they are not PRFs, should lower our confidence in their collision resistance considerably. This opens the door (at least in theory) to finding other ways of producing collisions for them. And here is how.
Here come the cryptographers
While working on SipHash - more on that later - Jean-Philippe Aumasson and Daniel J. Bernstein also analyzed the collision properties of popular general-purpose hash functions. What was probably novel about their approach is that they treated these hash functions as if they were cryptographic hash functions. Since it is collisions we are interested in, they used the tools normally used to analyze the collision resistance of “real” cryptographic hash functions, for example differential cryptanalysis.
Not surprisingly, their results were impressive. None of the functions analyzed were designed to withstand this level of scrutiny. Among the most impressive results are completely breaking CityHash64, finding a way to retrieve the secret seed of the hash functions used in Python 2.7.3 and 3.2.3, and completely breaking MurmurHash 2 and 3. “Completely breaking” means they were able to produce arbitrary multicollisions even when the seed was properly randomized. The resulting attacks are much more powerful than the typical attacks you would hope to find against your garden-variety cryptographic hash function. Typically, cryptographers are happy if they manage to find a collision at all, or they beam with joy if they manage to decrease the theoretical complexity of finding a collision by a few powers of two. Just to point out how devastating these attacks would be in the context of cryptographic hash functions.
We have the recipe - now all we need is the (hash) cake
I will focus on the results for MurmurHash in the rest of this article, since this hash function is used in a variety of software projects, for example CRuby, JRuby, Rubinius, Redis, and also in Oracle JDK and OpenJDK. It is a bit tragic that MurmurHash 3 was only introduced in the JDK after last year’s hashDoS presentation - they adopted the fix being proposed at the time and offer the option to run the JVM using MurmurHash 3 as the hash function used for string hashing. It is (was?) also planned to make this the default hashing function in future releases of the JDK.
Jean-Philippe and I discussed how we could transform their findings into real-world attacks that would show off the potentially disastrous consequences of not dealing with this problem. Since I am most familiar with the source code of the Ruby implementations, I started applying Jean-Philippe’s recipe to Ruby. JRuby and CRuby chose MurmurHash 2 as their hash function, while Rubinius switched to MurmurHash 3 in response to last year’s hashDoS. It was rather easy to turn the “recipe” into valid Ruby exploit code. There it was. Proof that this really worked. I got excited. The only problem was with the 64 bit version of CRuby. There hadn’t been a PoC yet for the particular variant of Murmur 2 used there:
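The relevant definition in CRuby’s st.c looks roughly like this (paraphrased from memory, not the verbatim source):

#define MurmurMagic_1 (st_index_t)0xc6a4a793   /* MurmurHash 1 magic        */
#define MurmurMagic_2 (st_index_t)0x5bd1e995   /* 32-bit MurmurHash 2 magic */

#if SIZEOF_ST_INDEX_T > 4
/* 64 bit: both constants concatenated into a single 64-bit magic value */
#define MurmurMagic ((MurmurMagic_1 << 32) | MurmurMagic_2)
#else
#define MurmurMagic MurmurMagic_2
#endif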
On a 64 bit system, SIZEOF_ST_INDEX_T is larger than 4, and what happens is that MurmurMagic is defined as the concatenation of the “magic constant” of MurmurHash in its version 1 and the actual constant that is used in the 32 bit version of MurmurHash 2. This complicated things a bit, and the generic recipe had to be adjusted slightly in order to still be able to produce collisions. Let’s have a look at the implementation of Murmur 2 in 64 bit CRuby. This is a simplified version of the one implemented in st.c that still covers the important details:
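Here is an illustrative C version of that 64 bit Murmur 2 variant. The magic constant and the shift are taken from the public MurmurHash64A reference code, and the mixing order follows the walkthrough below - treat it as a sketch rather than the exact st.c source:

#include <stdint.h>
#include <string.h>

static uint64_t murmur2_64(const void *key, size_t len, uint64_t seed)
{
    const uint64_t m = 0xc6a4a7935bd1e995ULL; /* the "concatenated" magic     */
    const int r = 47;                         /* reference shift; the CRuby
                                                 variant may use another one  */
    const uint8_t *data = (const uint8_t *)key;
    uint64_t h = seed ^ (uint64_t)len;        /* only seed and len go in here */

    while (len >= 8) {
        uint64_t k;
        memcpy(&k, data, 8);                  /* one 8-byte block             */

        k *= m;                               /* the block mix; note that     */
        k ^= k >> r;                          /* it does not depend on the    */
        k *= m;                               /* seed at all                  */

        h *= m;
        h ^= k;

        data += 8;
        len  -= 8;
    }

    /* the switch block: remaining 1..7 bytes */
    switch (len) {
    case 7: h ^= (uint64_t)data[6] << 48; /* fall through */
    case 6: h ^= (uint64_t)data[5] << 40; /* fall through */
    case 5: h ^= (uint64_t)data[4] << 32; /* fall through */
    case 4: h ^= (uint64_t)data[3] << 24; /* fall through */
    case 3: h ^= (uint64_t)data[2] << 16; /* fall through */
    case 2: h ^= (uint64_t)data[1] << 8;  /* fall through */
    case 1: h ^= (uint64_t)data[0];
            h *= m;
    }

    /* final avalanche omitted: it cannot turn equal h values into
       different ones, so it does not matter for collisions */
    return h;
}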
As we immediately notice, if our input data is a multiple of 8 bytes long, the code in the switch block will not be executed. If we want to successfully generate multicollisions, it makes sense to restrict ourselves to inputs whose length is a multiple of 8 bytes; this only makes the task easier. Another fact we observe is that the seed doesn’t play a very important role in the algorithm. In fact, the entire transform of the data being processed is independent of the seed that was chosen. It turns out that we can exploit these facts to trick even a randomized MurmurHash into producing collisions at will. Without further ado, here is the code that produces arbitrary multicollisions for 64 bit CRuby:
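The original PoC was written in Ruby; the following C sketch models its structure after the description below. DIFF places the 0x80 difference in the most significant byte of each 8-byte word (that is, in every eighth byte of the string on a little-endian machine), and invert64 is the mysterious helper that will be shown later:

#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DIFF 0x8000000000000000ULL   /* 0x80 in the top byte of each 8-byte word */

uint64_t invert64(uint64_t t);       /* the "magic" helper, shown further below */

/* Write 2^blocks distinct keys of blocks*16 bytes each into keys[],
 * laid out back to back; they all produce the same Murmur 2 hash,
 * no matter which seed was chosen. */
static void make_collisions(size_t blocks, uint8_t *keys)
{
    uint64_t in0[2 * blocks], in1[2 * blocks];   /* C99 variable-length arrays */

    /* in0: random words; in1: the same words with DIFF XORed in */
    for (size_t i = 0; i < 2 * blocks; i++) {
        in0[i] = ((uint64_t)rand() << 32) ^ (uint64_t)rand();  /* toy randomness */
        in1[i] = in0[i] ^ DIFF;
    }

    /* every combination of "take 16-byte block j from in0 or from in1" */
    for (size_t combo = 0; combo < ((size_t)1 << blocks); combo++) {
        uint8_t *key = keys + combo * blocks * 16;
        for (size_t j = 0; j < blocks; j++) {
            const uint64_t *src = ((combo >> j) & 1) ? in1 : in0;
            uint64_t w0 = invert64(src[2 * j]);      /* pre-invert so Murmur's  */
            uint64_t w1 = invert64(src[2 * j + 1]);  /* block mix restores them */
            memcpy(key + 16 * j,     &w0, 8);
            memcpy(key + 16 * j + 8, &w1, 8);
        }
    }
}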
This is the high-level representation of the algorithm. The blocks parameter controls how many collisions will be produced in the end. We generate a random string that is blocks times 16 bytes long. Next, we mostly copy this string, introducing tiny differences by XORing the string with DIFF, which alters the original string only in every eighth byte.
Let’s just consider the first 16 bytes of in0 and the first 16 bytes of in1 for now. Hashing them with Murmur means that there are two data blocks that would be processed into intermediate values for k. Each of these k values would itself be XORed into the current h value at that point. We see that h is initialized only with seed and len, so its initial value is independent of the string to be hashed. Let’s imagine we could force Murmur into producing the 16-byte strings from above as the intermediate values of k. The first 8 bytes of k would then finally result in the first 8 bytes of in0 and in1 respectively, and let’s assume the same was possible for the second 8 bytes that are finally assigned to k. If, by magic, we were able to pull this off, then we could produce a collision! Why? Let’s have a detailed look at what would happen:
Say we call the first 8 bytes of in0 x0 and the second 8 bytes x1. Also call the two 8-byte halves of in1 y0 and y1. By virtue of how we constructed in0 and in1, both x0 and y0 as well as x1 and y1 differ in exactly one byte, by 0x80. If we first process x0 and x1, this is what will happen:
h := seed ^ len
h := h * m
h := h ^ x0
h := h * m
h := h ^ x1        # call this final value h*
When processing y0 and y1, the process is this:
h := seed ^ len
h := h * m
h := h ^ y0 # difference of 0x80 compared to h ^ x0
h := h * m # difference is preserved
h := h ^ y1 # another difference of 0x80 - call this final value h**
By our choice of DIFF, we make sure that the multiplication by m preserves our difference of 0x80. Different choices could lead to the difference being changed by the multiplication, so it’s important to keep it exactly the way it is. But now, since the difference is kept in the same byte, the final XOR of h ^ y1 introduces yet another difference of 0x80 in the same byte, and therefore the two differences cancel each other out: h** is indeed equal to h*! Building on this, it is not hard to see the general approach. Form a chain of such 16-byte blocks in0 and in1 and produce every possible combination that replaces 16-byte blocks of in0 with the corresponding 16-byte blocks of in1. There are 2^n possible combinations if both in0 and in1 are of length 16·n. Thus, we have 2^n distinct strings that produce the same final hash value, and thus 2^n multicollisions.
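If you want to convince yourself of the difference-preservation step, a single assert is enough (m is the 64 bit Murmur magic; the value of a is arbitrary):

#include <assert.h>
#include <stdint.h>

int main(void)
{
    const uint64_t m = 0xc6a4a7935bd1e995ULL;
    uint64_t a = 0x0123456789abcdefULL;        /* any value works here   */
    uint64_t b = a ^ 0x8000000000000000ULL;    /* 0x80 in the top byte   */

    /* flipping only the top bit changes the value by exactly 2^63, and
     * 2^63 * m is again 2^63 modulo 2^64 because m is odd - so the
     * difference survives the multiplication unchanged */
    assert(((a * m) ^ (b * m)) == 0x8000000000000000ULL);
    return 0;
}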
Note that this thought experiment used an as yet unsolved element of magic. In fact, in0 and in1 in their current form are useless, since we said that we wanted their values to become the intermediate results of the k values. Right now, we don’t know how to get there. Recall that there was a call to the mysterious function invert in the previous snippet that we haven’t seen yet. In the above snippet, we apply invert to our strings in0 and in1. You guessed right: this is what prepares them in a clever way so that, when they are processed by MurmurHash, MurmurHash is actually forced to produce the values in the form that we desire. Think of it as first applying an “inverse MurmurHash” procedure to our input strings, which will be cancelled out by applying Murmur, yielding the original values that we need for our collision. Voilà, the heart of the attack:
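The original invert64 was part of the Ruby PoC; here is a C sketch of the same idea. INV_MAGIC is the multiplicative inverse of m modulo 2^64 - computed at runtime here instead of hard-coding the constant - and R is the shift used by the Murmur variant under attack:

#include <stdint.h>

#define MAGIC 0xc6a4a7935bd1e995ULL   /* m, as in the Murmur sketch above */
#define R     47                      /* must match the shift Murmur uses */

/* multiplicative inverse of an odd 64-bit number modulo 2^64
 * (Newton's iteration; every step doubles the number of correct low bits) */
static uint64_t inv64(uint64_t a)
{
    uint64_t x = a;
    for (int i = 0; i < 6; i++)
        x *= 2 - a * x;
    return x;
}

/* undo  y = z ^ (z >> r):  z = y ^ (y >> r) ^ (y >> 2r) ^ ... */
static uint64_t unxorshift(uint64_t y, int r)
{
    uint64_t z = y;
    for (int s = r; s < 64; s += r)
        z ^= y >> s;
    return z;
}

/* Prepare a value so that Murmur's block mix reproduces it:
 * mixing invert64(t) with  k *= m; k ^= k >> r; k *= m;  yields t again. */
uint64_t invert64(uint64_t t)
{
    const uint64_t INV_MAGIC = inv64(MAGIC);
    uint64_t x = t * INV_MAGIC;   /* cancelled by Murmur's final  k *= m      */
    x = unxorshift(x, R);         /* cancelled by Murmur's        k ^= k >> r */
    x = x * INV_MAGIC;            /* cancelled by Murmur's first  k *= m      */
    return x;
}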
What invert64 does is apply a series of transforms to our initial input so that applying the corresponding operations of MurmurHash will yield our original input again. Contrast the above with what happens inside MurmurHash:
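This is the per-block mixing step (quoted from the sketch above rather than from st.c):

k *= m;
k ^= k >> r;
k *= m;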
The input k is first multiplied by m. INV_MAGIC is actually the inverse of m mod 2^64. This means that multiplying k by m yields the value of x in the invert64 function right after we applied the series of XORs and right shifts. Or, in simpler terms, the first operation of Murmur cancels out the last operation of invert64.
The series of XORs and right shifts by R that was applied in invert64 was carefully chosen in a way that ensures that applying the transform k ^= k >> r in Murmur itself produces the value of x right after the first multiplication by INV_MAGIC in invert64. It’s a bit tricky to see why this works, but if you’re not convinced, you can easily try it on a piece of paper.
Finally, multiplying k again by m cancels out the first multiplication of x in invert64. The end result for k is the initial value that we originally submitted to invert64. Applying invert64 to that input first has forced MurmurHash into producing the original input after it is done processing our modified, “inverted” input - we have solved the piece of “magic” that was still missing.
If you run the script on a 64 bit CRuby before 1.9.3p327, it will indeed produce 2^4 = 16 collisions from (slightly) different input keys.
The actual hash values will vary between runs due to the randomized seed, but within a run they are all equal. Sweet. We can produce collisions for MurmurHash at will now (the approach is similar for MurmurHash 3), but how can we use this in the real world?
Finding a good showcase for the attack
So being able to produce collisions is one thing, but we need a prominent example of where to apply this in order to cause real impact. Hashes (the data structure) are used in a huge number of places in Ruby; in particular, web applications typically create hashes from user input when processing requests. Rails was one of the targets last year. The particular attack used form-encoded POST data that would be inserted into a Hash, causing Rails to choke on the malicious data.
As a consequence, Rack (which is where this was ultimately happening) introduced KeySpaceConstrainedParams. This clever little bastard limits the total size of all keys being parsed from a request, and it is very effective in doing so. My first attempts to fool Rails failed miserably, because the limit kicked in very quickly. After all, for our attack to be successful, we need to insert very large String keys (strings of 16·n bytes for 2^n collisions). I was devastated. It just couldn’t be that I had come this far only to be stuck now. FFFFFFFUUUUUUUUUUUU. Times ten.
After I recovered from my misery, I recalled that Hashes are used everywhere. I mean, literally. So who cares if form-encoded data won’t work? Let’s just move elsewhere. It wasn’t hard to choose a good candidate - in times of SOA, Software as a Service, web APIs, and single-page JavaScript web applications, what better candidate than JSON could there be? All the different JSON parsers probably do what is the most obvious thing to do: they parse a JSON object into a Hash.
I tried targeting the JSON implementation in C that ships with CRuby’s standard library, certain of my imminent success. But F7U12 all over again. What the F*ck? What’s your problem now, Ruby? Turns out that the JSON implementation complained about the encodings of the strings it was being presented with. I was near a heart attack, so I couldn’t find the strength to actually find out what the problem was. I had only one idea left before I would commit Harakiri. I would simply run a brute-force search for values that would not suffer from the encoding issue.
Much to my surprise, this worked. And with reasonable performance even. Relieved, I took those precomputed values and created a serialized HTTP request that I could simply read as a raw String and send to the Rails app via Net::HTTP. It worked. And the nice thing about it: your app doesn’t even have to expose a JSON end point at all to make this work!
Next target: The Enterprise
Now I didn’t want to let this become a Ruby-only thing. Being involved in Ruby myself, this was the last thing I would want to happen. But there still was another high-profile candidate that I hadn’t touched yet: Java. How hard could it be to break Java’s hash now that I had endured the Ruby experience? That should be a sure thing, a mere exercise.
Alright, let’s apply the general recipe to Java Strings, crafting our byte arrays in the same way that we did for Ruby. Save, run and BOOM!, right? No. Save, run and FFFFFFFUUUUUUUUUUUU! What is it this time? Here’s what happened: not thinking much about it, I had used the byte[] constructor of String.
The party pooper is StringCoding.decode, which that constructor runs all input through. Whatever values I feed into the constructor, decode is gonna mess with them. I was overwhelmed with fear. Could it be that Java is safe? You can’t touch this?
One last option I saw would have been to try to “invert” what happens in decode, much like invert64 inverts the operations of MurmurHash itself. But plagued by failure and misery, I started staring at the MurmurHash 3 implementation in the JDK instead: there must be another way…
And then it struck me - “Wait a minute! They’re operating on char[]!”. And I remembered that String features a constructor for char[] as well, so I took a look at it in the JDK sources.
Muahahahaha. No decoding or any other annoying thing happening - that constructor simply copies the characters. The rest was easy. I would simply take 16 bit-wide chars and do my thing on two bytes at a time, translating the operations on bytes one-to-one to chars. It worked. Take this, Enterprise!
How can we fix this?
We reported our findings and CVEs were assigned. At this point I really want to give props to how this issue was handled in the Ruby community. I believe they did an amazing job already after last year’s hashDoS, being the first to come up with fixes for the problem. The same could be observed again this year. I’m really proud of being a part of this community, seeing how the issue was handled after we had reported it. A special thank you to Hiroshi Nakamura, who managed the issue for CRuby and JRuby and also took care of the communication with oCERT. Ruby 1.9.3p327 was released, JRuby and Rubinius followed.
At Appsec Forum in Switzerland, Jean-Philippe and I presented our work together with a possible solution to the dilemma (slides). So what can we do? Clearly, what we want is collision resistance, even when facing malicious adversaries. As already mentioned, this is typically the domain of cryptographic hash functions, but we already concluded that they are just too slow for our purposes. The holy grail would be a cryptographic hash function whose performance is comparable to the general-purpose hash functions we use today. This is what SipHash tries to do for us. It was specifically designed to prevent the attacks described so far, yet it is comparable in speed to general-purpose hashes. It is much faster than traditional cryptographic hashes, too. It is slower than MurmurHash, but not by much.
Does SipHash solve the problem once and for all? As with all cryptographic primitives, the current state of the art in academia does not allow us to prove that SipHash is indeed a PRF. But it was designed to be one, by experts in the field. Of course, you shouldn’t trust any cryptographic primitive until it has been extensively tested by all sorts of different people, and it might even happen that one day somebody finds a flaw in the design. But, and let me emphasize this, SipHash is the first “general-purpose” hash to explicitly address the issue. While you will probably manage to find distinguishers for other popular general-purpose hashes quite easily, it is very likely that you will have a much harder time doing so for SipHash. There are many implementations of SipHash available by now, in case you’d like to use it in your projects, too.
Other fixes?
The fact that SipHash is slower than MurmurHash bugs many language implementors. Just because one child misbehaves, the whole class has to suffer the consequences? This seems unfair - we all know that this is not considered contemporary educational practice. That’s why other fixes were proposed, after last year’s hashDoS as well.
One prominent proposal was to limit the parameters of the software that could potentially be exploited by this attack. It is an application concern after all, so we should handle it at the application level. We saw KeySpaceConstrainedParams, and limiting incoming parameters was also the course taken in PHP last year, if I remember correctly. If we learned one thing from my odyssey, it is that this approach is bound to fail. My experience with KeySpaceConstrainedParams was living proof of what would happen in reality. I do believe that it is correct and important to constrain the size of request parameters for security purposes, but it is obviously not going to help with this particular problem. One hole was fixed, but courtesy of the omnipresence of hashes in today’s programming it was easy for me to just move on to the next target that used hashes and where nobody had bothered to take care of the issue yet. Constraining parameters in every piece of code that makes use of a Hash is simply an illusion. Imagine the effort needed. Imagine the ugly code bloat caused by an aspect that has nothing to do with the business logic. Even if we managed to fix this in every place where a Hash is created directly from incoming input, there may be numerous other places where Hashes are created only at a second stage, depending only indirectly on that input. The probability of overlooking something or fixing one place the wrong way is tremendously high. No matter how hard we try, I am positive that there will always be yet another loophole. I’m also reluctant to leave it up to developers to solve the problem in everyday coding. Imagine how often we would have to tell ourselves “Alas, a hash! We need to restrict parameters!”. Errrr… not happening. I believe the only chance we have to master the problem is to fix it at its root - and that is by fixing the core hash function.
A different proposal was to look into other data structures that have different worst-case characteristics than hash tables - some form of trie, for instance. I’m reluctant. I played with crit-bit trees after last year’s hashDoS and also looked into ways of balancing tries in general, but the only good options I was able to find involved hashing - the irony! For something like red-black trees, I once read (can’t find the source anymore, unfortunately) that there are “rebalancing attacks” that carefully put the tree into a state where cascading rebalancing operations are triggered. The “H” in Hash Array Mapped Tries is proof enough that attacks similar to the one described here could be carried out against them as well. And after all, none of the alternatives gives us the dream that motivated “associative arrays” in the first place: (amortized) constant-time insertion and access to the elements being stored. With this in mind, I feel that the hash table is not becoming obsolete any time soon.
Who else might have a problem now?
Most certainly, anyone using MurmurHash 2/3 or CityHash is in trouble. But even if other general-purpose hash functions haven’t been broken yet, confidence in their collision resistance in the face of malicious adversaries should be low. Languages that merely applied some form of parameter restriction to solve the issue should clearly rethink their strategy - the possibility of loopholes is very high, as the Rails PoC demonstrated. What about languages that use different data structures for implementing associative arrays, like Scala and Clojure, for example? If I’m not mistaken, they use a HAMT internally. As I mentioned earlier, the mere fact that a hash function is involved could already be dangerous.
Many people like to pick on Microsoft, but credit where credit is due: they seem to have done the right thing after last year’s incident. Here is how they fixed String.GetHashCode.
They introduced a feature that lets you turn on randomized hashing manually. And here is where it gets scary: if turned on, the hash will be computed by the mysterious InternalMarvin32HashString.
It is imported from an external DLL, clr.dll to be precise. Since reverse engineering is illegal in Germany, I had to call my man overseas and ask him to investigate further. And here’s what he found: after reverse engineering the code, he is pretty certain that the ominous Marvin hash is a fairly good hash function - at least on the surface it doesn’t look like it could be attacked easily. Take this with a grain of salt, though; my man overseas told me that he didn’t have much time to analyze the function, and this is just what he was able to find on a cursory glance over the disassembled code. What’s further disturbing is that googling for anything about “Marvin32” hasn’t brought up anything. Zilch. Security by obscurity? At least it’s effective, for now. If anyone knows more about this mysterious hash function, please contact us! We’d be delighted to know more about its background.
Final words
We certainly want to explore further candidates in the future. If you are involved in the development of any product using hash tables and feel like you are potentially affected, please do not hesitate to contact us, we would be happy to assist!
The code for the Proof of Concept attacks has been published here. Please feel free to experiment with it, but please also read the disclaimer of this article. In case you want to know more about the details of this attack, especially about the details of CityHash and Python’s hash, there is going to be a presentation at 29c3 that will cover these aspects and the design principles of SipHash in much more detail.