Computing the Units Digit of a Power: a Constant-Time Algorithm Using a Little Number Theory

A question that often comes up in math assessments is something like this:

What is the units digit for the power 3 ^ 2012?

For everyone who, like me, doesn’t know what a units digit is: it’s the last digit of an integer number in decimal representation, the one that contributes the least to the overall value of the number. For example, in 815, 5 would be the units digit.

Tricky thing is, when asked the above question they usually want you to do this without using a calculator. And in most cases a calculator wouldn’t really help. Most times, it shows the result to be something like 9,288904458×10⁹⁵⁹. Hmm, not helping.

But being programmers, let’s use our computer to solve this:

Ruby: Computing 3 ^ 2012

puts 3 ** 2012

# => 92889044588673769408930657158348637826253549738504402109446757688468292001
#    00895133045568610313559943511326963669101419356132448129301161911437077979
#    74845328445022428187397562187699487288147137339630484505801392626557382906
#    48436056477488586554569394630906606107459853399507260212571559105799059148
#    48967552594523494342770304749382025064668371481914096694276699856179196934
#    51168497232539127852886210464699103411983466731430185863291737785502268014
#    67817639806221514414791302508418812863657825701961769122152840214259983193
#    50094724503310005060169808885715675634106779079502202012717447933188159268
#    51622370568349214091263561550828859168414613127661835968071251588467089807
#    89530152015431456264939875716416229796717748349156426924910549996452947118
#    50120002454402783254074178836770281737161470912726965337318715137078196392
#    89847999836692557707510587457239693189400861775799624714833950668094847485
#    098530422085114169352012542557272229748027722536025126071935506344571441

Alright, the answer is 1. Post done, see you next time!

But OK, let’s see if there is more to it. We have been asked to calculate this without using a calculator or a computer, so there must be an easier way. Besides, what if your programming language were not as helpful as Ruby here and would not support big integers out of the box?

Java: What if there is no support for large integers?

System.out.println(Math.pow(3, 2012)); // => Infinity

Ouch. Sure, I do know about BigInteger, but let’s for one moment assume that we did not have such a tool at our hands. What are we really interested in? The units digit. As it turns out, there is a mathematical tool that expresses just that. Saying that we want to have the units digit of the number is equivalent to saying that we want the result of taking the number modulo 10. It’s a handy feature of the decimal system: whenever we write down a number, let’s say 815, what this really means in the decimal system is

8 * 100 + 1 * 10 + 5 * 1

or, equivalently

8 * 10² + 1 * 10¹ + 5 * 10⁰

This is also called the “10-adic” representation of the number 815. If we take this mod 10, then all the terms that are multiples of 10^i where i > 0 cancel out and we’re left with 5, which is our units digit:

   8 * 10² + 1 * 10¹ + 5 * 10⁰ mod 10
= (8 * 10² mod 10) + (1 * 10¹ mod 10) + (5 * 10⁰ mod 10)
=        0         +       0          +        5
= 5

We are allowed to pull in the mod 10 to each multiplication by the laws of modular arithmetic, and each modular multiplication with a factor of 10^i with i > 0 is 0 since 10 divides 10^i in those cases. Let’s apply this to our original problem:

Ruby: Directly computing the units digit of 3^2012

puts 3 ** 2012 % 10 # => 1

Sweet, but this still doesn’t help us in case we lack support for large integers. What can we do there?

A first solution

Well, we can again use some modular arithmetic. We can compute 3^2012 mod 10 in the obvious way by first computing 3^2012 and then taking the result mod 10 or we can reduce each intermediate step directly:

3 ^ 2012 mod 10
= (3 * 3 * ... * 3) mod 10
= (...(((3 * 3 mod 10) * 3 mod 10) * 3 mod 10) * ... ) * 3 mod 10

Translated to Ruby code, this looks like

Ruby: Repeated modular reduction of 3^2012

result = 1
2012.times { result = result * 3 % 10 }
puts result # => 1

Great, and it even works when you can’t use large integer support! So are we done yet? Not quite. For starters, the solution just presented is still far from being done easily without the help of a computer. Sure, it’s manageable to do 2012 multiplications and taking the remainder each time, but come on, that’s a little annoying, isn’t it? On the other hand, the above algorithm works fine if you are operating on relatively small numbers. But once the numbers grow larger, you will notice how the computation will take longer and longer:

Working algorithm, but slooow

$time ruby -e 'r=1; 20122013.times { r = r * 20122013 % 10 }; puts r'
# => 3
# => real  0m2.153s

Analyzing our first solution

So, why is it slow? To analyze the algorithm further, let’s first recall some of the tools that are usually used when analyzing such algorithms. Typically, running times are given with respect to the length of the binary representation of the numbers involved. That is, if we compute x^y mod 10, we would define

k := ceil(log_2(x)) + 1 # length of the binary representation of x
l := ceil(log_2(y)) + 1 # length of the binary representation of y

Our current solution does y multiplications of the form r = r * 20122013 % 10, and this is simply too much. Although y times doesn’t look too bad at first, we should note that this number is exponentially large compared to l, and this means that we got ourselves an exponential-time algorithm with respect to k and l. It doesn’t matter too much whether x is large here or not, the following with x := 1 still takes about the same amount of time:

Size of x is dominated by the exponential number of multiplications

$time ruby -e 'r=1; 20122013.times { r = r * 1 % 10 }'
# => real  0m1.858s

Improving our solution

Clearly, our goal has to be reducing the number of multiplications first before we can go any further. If you think of cryptographic algorithms like RSA or DSA, they require huge modular exponentiations, so they’d better use some efficient algorithm to compute them. The venerable Handbook of Applied Cryptography is always a good source for these types of problems. In Ch.14, we are presented with some efficient algorithms for modular exponentiation and we will use the “right-to-left” variant of the famous “square and multiply” method.

Ruby: Modular exponentiation using Square-and-Multiply

def mod10_pow(base, exponent)
  result = 1
  while exponent > 0 do
    result = (result * base) % 10 if exponent % 2 == 1
    exponent /= 2
    base = (base * base) % 10
  end
  result
end

puts mod10_pow(20122013, 20122013) # => 3

It produces the correct result, and if we time it, it takes a mere real 0m0.015s instead of the 2s it took with our first algorithm.

Analyzing the Square-and-Multiply algorithm

Let’s see why this new algorithm is so much faster than our previous one. There are exactly l squarings of the form (base * base) % 10 as we divide the exponent by two in each step - which is equivalent to shifting it to the right one bit, and we can only do this l times before it becomes 0. Next, we have to perform a multiplication of the form (result * base) % 10 whenever there is a 1 bit in the exponent, and this can happen l times in the worst case (all 1s). Counting yields that we have at most 2l multiplications plus at most 2l modular reductions % 10, as well as l divisions of the exponent by 2. Since a modular reduction by 10 takes O(k * 4) = O(k) time (modular reduction a mod n is O(size(a) * size(n))), and dividing the exponent by 2 is also linear in l at most, the running time is governed by the multiplications. Each multiplication (result * base) is roughly O(k²) or, slightly better, O(k^log_2(3)) if we use Karatsuba multiplication. In total, this gives us a running time of O(l * k^log_2(3)). This means we have found a algorithm that is polynomial in l and k which is clearly better than the exponential algorithm we had before.

Square-and-Multiply on a piece of paper

It’s even efficient enough that we could use it to solve our initial problem of computing the units digit of 3^2012:

result | base | exponent | exponent % 2 == 1 |
   1   |   3  |   2012   |         no        |
   1   |   9  |   1006   |         no        |
   1   |   1  |    503   |        yes        |
   1   |   1  |    251   |        yes        |
   1   |   1  |    125   |        yes        |
   1   |   1  |     62   |         no        |
   1   |   1  |     31   |        yes        |
   1   |   1  |     15   |        yes        |
   1   |   1  |      7   |        yes        |
   1   |   1  |      3   |        yes        |
   1   |   1  |      1   |        yes        |
   1   |   1  |      0   |         no        |

Not much going on here as the result stays 1 throughout the algorithm, but you can easily convince yourself that it would work even if the result were different than 1 - let’s have a look at the computation of 7^7, where the units digit is a 3:

result | base | exponent | exponent % 2 == 1 |
   1   |   7  |     7    |        yes        |
   7   |   9  |     3    |        yes        |
   3   |   1  |     1    |        yes        |
   3   |   1  |     0    |         no        |

Not too bad after all. We could stop here, having found an algorithm that we can use to solve our initial problem. But whenever designing algorithms, our conscience is constantly nagging us with the ubiquitous question:

Can we do better?

While the Square-and-Multiply algorithm is a major improvement over the first algorithm, it being polynomial still means that eventually there are numbers large enough to give it a hard time.

Ruby: Giving Square-and-Multiply something to chew on

require 'benchmark'

n = 20122013 ** 5000

Benchmark.bm do |bm|
  bm.report { puts mod10_pow(n, n) }
end

# => 7.346602s

Running the above takes roughly seven seconds on my computer. Don’t even dare running it with the first algorithm.

You may have noticed that although the article’s title has mentioned Number Theory, we haven’t seen much of it yet. What is it that this “Units Digit” problem has in common with Number Theory? Well, let’s look at what happens if we take consecutive powers of the base in our initial problem, 3:

3 ^ 1 = 3
3 ^ 2 = 9
3 ^ 3 = 27
3 ^ 4 = 81
3 ^ 5 = 243
3 ^ 6 = 729
3 ^ 7 = 2187
3 ^ 8 = 6561
...

Look closer at the units digits. Pattern, anyone? The sequence is 3, 9, 7, 1 and then it repeats itself, over and over again. Try it - you won’t find a number in the sequence that doesn’t follow this pattern. Interesting.

If the initial problem would have been

What is the units digit of 5^2012?

I bet many of you would have muttered 5 immediately. It seems more anchored in our brains that for any power of 5, there is only one outcome for the units digit - 5. What about this question:

What is the units digit of 2011^2012?

It’s gotta be 1, right? Then again,

What is the units digit of 25^2012?

Ahhhm, 5?

Reducing the general problem to the numbers 0-9

This leads us to our first important observation: The units digit of a power is only dependent on the units digit of the base. This implies that it simply doesn’t matter if you look at 3^2012 or at 2013^2012 - the units digit will be the same in both cases. Why is this so? If we use the decimal representation for the base again, we get that

x ^ y mod 10
= (x_0 * 10^0 + x_1 * 10^1 + x_2 * 10^2 + ... + x_n * 10^n) ^ y mod 10
= (x_0^y + a_1 * 10^1 + a_2 * 10^2 + ... + a_n * 10^(n * y)) mod 10
= x_0^y mod 10

where the a_is are coefficients that we derive from applying a generalized version of the laws for binomial coefficients. It tells us that the sum of our decimal representation raised to the y and then taken mod 10 will only rely on the units digit raised to y, all other terms contain some power of 10 and therefore cancel out.

To get an intuition for how we arrived at this fact, let’s have a look at the pen-and-paper method of multiplying two numbers that we learned in school:

815 * 423 =
---------------
   326000
    16300
     2445
---------------
   344745

We immediately notice that the units digit of the product relies only on the units digits of the two factors. 5 * 3 is 15, so the result will have a units digit of 5 as well. Now think of a power like 2013^2012. If we write it out, it looks like this:

2013^2012 = 2013 * 2013 * ... * 2013

The intermediate result of the first two factors will only rely on the units digits of the two factors, as we have just seen. But then , the result of that computation multiplied again with the third factor in the chain will also just rely on the units digits and so on and so forth - we have just outlined an inductive proof of our claim.

The fact that the units digit of a power only depends on the units digit of its base is huge since it tells us that the “Units Digit” problem for arbitrary powers boils down to knowing the units digits for powers of the numbers that are smaller than 10. We just cut down the problem for infinitely many powers to powers of the numbers zero through nine. Not bad.

Solving for the number ‘3’ using the repeating pattern

Now our intuition told us that there is most certainly a repeating pattern for powers of 5 and 1 - the resulting power’s unit digit is always a 5 or 1 respectively. In our example for the number 3, we also saw that there is a repeating cycle of exactly 4 numbers, and we now know that we only need to solve the problem for the numbers zero through nine.

How convenient would it be if there were a repeating cycle for any of those numbers, right? And it’s here where our final piece of Number Theory fits in. In our example for 3, “repeating units digits after every fourth power” can be expressed mathematically as

3^4 congruent 1 mod 10

This is incredibly powerful because it lets us immediately derive the units digits of higher powers, too:

3^21 mod 10
= (3 ^ 4 * 3 ^ 4 * 3 ^ 4 * 3 ^ 4 * 3 ^ 4) * 3 mod 10
=    1   *   1   *   1   *   1   *   1    * (3 mod 10)
= 3

Put in simple terms: We just need to know the remainder of 21 mod 4, 1, to reduce our problem down to solving 3 mod 10 to get to our units digit. Compare this to our solution using Square-and-Multiply: all we have to do now is to compute r:= exponent % 4 and solve the problem 3^r mod 10. In terms of our initial problem:

3^2012 mod 10
= 3 ^ (4 * 503) * 1 mod 10
=      1        * 1 mod 10
= 1

Awesome, right? Now that’s something we can really do immediately on a piece of paper. So how about the rest of the numbers? 0 as a special case is of course guaranteed to cycle because 0 * x = 0 for arbitrary x. But is there an equally useful pattern for the rest? Our intuition told us so far that at least for 0, 1 and 5 there should be, but how about the rest?

Enter Euler’s Theorem

Euler’s Theorem tells us that

x^φ(10) mod 10 is congruent to 1

where φ(10) are the numbers between 1 and 10 that are co-prime to 10 (their gcd with 10 is equal to 1). This holds whenever x is co-prime to 10 as well. The numbers smaller than 10 that are co-prime to 10 are exactly 1,3,7 and 9. So φ(10) = 4 and this proves that a^4 mod 10 is congruent to 1 for a equal to 1,3,7 or 9. Or in other words: We have proven that the units digits for 1,3,7 and 9 will repeat after at most four consecutive powers.

What about the rest of the numbers, 0,2,4,5,6 and 8? As φ also determines the order of the multiplicative group of integers modulo n, consulting Lagrange’s theorem shows us that the order of any element must divide the order of that group. Therefore the order of any element within the group can only be as high as the total group order itself. The order of an element here is the smallest number i where a^i congruent to 1 mod n for a an element of the group. The group order for a multiplicative group of integers modulo n is the number of elements it contains, which happens to be exactly φ(n). In our case, where the group order is 4 (= φ(10)), this means that the order of any element cannot be more than 4 as well. In plain English: There is no number smaller than 10 where it takes more than four consecutive powers until the units digit of those powers enter a cycle.

We conclude that each of the units digits of powers of 1..9 repeat after at most four consecutive powers.

Coming up with a constant-time algorithm

We can use this knowledge now to construct a constant-time algorithm for our initial problem. With all that we know now, we know it suffices to memorize the units digits of the first 4 powers (in the order a^4, a^1, a^2 and a^3) of the numbers zero through nine to solve the “Units Digit” problem for arbitrary powers:

i: [i^4 mod 10, i^1 mod 10, i^2 mod 10, i^3 mod 10]
-----------------------
0: [0, 0, 0, 0]
1: [1, 1, 1, 1]
2: [6, 2, 4, 8]
3: [1, 3, 9, 7]
4: [6, 4, 6, 4]
5: [5, 5, 5, 5]
6: [6, 6, 6, 6]
7: [1, 7, 9, 3]
8: [6, 8, 4, 2]
9: [1, 9, 1, 9]

Using these in a precomputed array, here is finally a constant-time algorithm as promised:

Ruby: A constant-time algorithm for computing the units digit

def units_digit(base, exponent)
  [
    0, 0, 0, 0,
    1, 1, 1, 1,
    6, 2, 4, 8,
    1, 3, 9, 7,
    6, 4, 6, 4,
    5, 5, 5, 5,
    6, 6, 6, 6,
    1, 7, 9, 3,
    6, 8, 4, 2,
    1, 9, 1, 9
  ][(base % 10) * 4 + (exponent % 4)]
end

puts units_digit(3, 2012) # => 1

Let’s try out the number that brought the Square-and-Multiply algorithm to its knees:

Ruby: Benchmarking our new solution

require 'benchmark'

n = 20122013 ** 5000

Benchmark.bm do |bm|
  bm.report { puts units_digit(n, n) }
end

# => 0.000148s

Ridiculous. That’s all you got?

Astute readers might point out that this is not really a constant-time algorithm because of the modular reductions in it and they are right in doing so. Can we fix this? Turns out we can: we could require a string representation of our base, and we would simply pick only its units digit and transform that into an integer to get rid of the first modular reduction. Regarding the second reduction modulo 4, this should really be a very efficient operation already if our (byte code) compiler is worth its money. Taking a number % 4 is equivalent to masking it with 3 on the binary level. Why? 4 is a power of two, 2², and therefore the remainder is simply the last two digits of the binary representation - which we get by masking the number with number & 0x03 because the binary representation of 3 is 11 - the mask we need to just keep the last two binary digits.

To illustrate this, let’s look at the assembly that GCC generates for such a computation.

Simple C program computing modulo four

#include <stdio.h>
#include <stdint.h>

int main(int argc, char **argv) {
    unsigned int a = 42;
    unsigned int b = a % 4;
    return 0;
}

And here is the assembly from gcc -S mod4.c (only the relevant parts):

Assembly output for C program

main:
.LFB0:
  .cfi_startproc
  pushq   %rbp
  .cfi_def_cfa_offset 16
  .cfi_offset 6, -16
  movq    %rsp, %rbp
  .cfi_def_cfa_register 6
  movl    %edi, -20(%rbp)
  movq    %rsi, -32(%rbp)
  movl    $42, -4(%rbp)
  movl    -4(%rbp), %eax
  andl    $3, %eax
  movl    %eax, -8(%rbp)
  movl    $0, %eax
  popq    %rbp
  .cfi_def_cfa 7, 8
  ret
  .cfi_endproc

As you can see, it’s doing what we expected, “ANDing” the input with 3 to get the result.

Final words

I hope you liked the subject as much as I did when exploring it. I found it really entertaining how a question that might look boring on the surface wound up involving so many different topics, finally even leading to Number Theory. Applying more and more Math, we were able to get from an exponential solution to a polynomial solution all the way down to a constant-time algorithm.

When I was first exposed to Number Theory, the guy told us

While Number Theory is considered by many to be the most beautiful subject of Mathematics, be warned that you will probably never find a useful application for it in your future lives.

Apart from cryptography, he was right about this. It is beautiful and interesting to know more about the origins and relations of our numbers, but I had never ever needed Euler’s theorem for anything else than RSA so far. I was really happy that after such a long time I was finally able to see it applied in this context again. Of course, you might want to argue about how “useful” this exercise was exactly in the end. But if you have made it this far, then at least you seemed to have liked it, too!

Computing the Units Digit of a Power: A Constant-Time Algorithm Using a Little Number Theory