23 November 2016

Computing a particular Fibonacci number is a common toy problem, often posed to illustrate how you can reason about the performance of a program by considering its time complexity. Let's try it.

Linear Fibonacci

The relatively simple Fibonacci algorithm that's generally given is to keep track of two consecutive Fibonacci numbers \( (F_n, F_{n+1}) \), then repeatedly compute \( F_{n+2} = F_n + F_{n+1} \) and forget about \( F_n \) until you've reached the desired value. Here's some Python code:

def fib(n):
    # Returns F_n, with F_0 = 0 and F_1 = 1.
    x = 0
    y = 1
    for i in range(n):
        x, y = y, x + y
    return x

The time complexity? Well, we do \( O(n) \) iterations and a constant number of operations (one addition plus some assignments) in each iteration, so the complexity is \( O(n) \). Believe me? Let's look at another algorithm.

Fast Doubling

Here are two non-obvious Fibonacci number identities.

  • \( F_{2k} = F_k(2F_{k+1} - F_k) \)
  • \( F_{2k+1} = F_{k+1}^2 + F_k^2 \)
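
As a quick check, take \( k = 5 \): with \( F_5 = 5 \) and \( F_6 = 8 \), the identities give \( F_{10} = 5(2 \cdot 8 - 5) = 55 \) and \( F_{11} = 8^2 + 5^2 = 89 \), both correct.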

Using these identities we can find \( F_n \) by again tracking the 2-tuple \( (F_k, F_{k+1}) \) and then either moving to \( (F_{k+1}, F_{k+2}) \) (by adding) or to \( (F_{2k}, F_{2k+1}) \) (by applying the identities), depending on the bit pattern of \( n \). Here's some Python code:

def numBits(n):
    # Returns the number of bits in the binary representation of n.
    return n.bit_length()

def bit(n, i):
    # Returns whether the i'th bit of n is set (0th bit is least significant).
    return (n >> i) & 1

def fib(n):
    # Invariant: (x, y) = (F_k, F_{k+1}), where k is the value of the
    # bits of n processed so far. We start with k = 0.
    x = 0
    y = 1
    b = numBits(n)
    for i in range(b):
        # Doubling step: s = F_{2k} and t = F_{2k+1}.
        s = x*(2*y - x)
        t = x*x + y*y
        if bit(n, b - 1 - i):
            # The bit is set: advance to (F_{2k+1}, F_{2k+2}).
            x, y = t, s+t
        else:
            # The bit is clear: advance to (F_{2k}, F_{2k+1}).
            x, y = s, t
    return x
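
As a quick sanity check (with numBits and bit filled in as above), the doubling version should reproduce the start of the sequence:

assert [fib(i) for i in range(10)] == [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]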

So what about the time complexity of this? It's a bit less obvious, but we're doing \( O(\log n) \) iterations, each with a constant number of operations, so \( O(\log n) \).

Not quite that simple

If you actually put both of these algorithms into runnable form and run them, you might find that they differ by only a constant factor (though this won't be the case with Python's bigint implementation, as we'll see). Furthermore, both algorithms actually take more than \( O(n) \) time to run: fib(2000) will take more than twice as long as fib(1000).
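
If you want to check this yourself, a rough, machine-dependent experiment along these lines will do (a sketch assuming fib is the linear version from above):

import time

# If fib's runtime were truly linear, each call here would take about
# twice as long as the previous one; instead the ratio creeps towards
# 4x as n grows.
for n in (20000, 40000, 80000):
    start = time.perf_counter()
    fib(n)
    print(n, time.perf_counter() - start)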

The problem is that it's very easy to be fuzzy about exactly what we mean when we say time complexity. The analysis we did is correct, but it measures what's known as arithmetic complexity. Arithmetic complexity treats any arithmetic operation (addition, subtraction, multiplication, etc.) as a single unit-cost operation.

However, for big enough numbers, multiplication is significantly costlier than addition. It's straightforward to add two \(k\)-bit numbers in \( O(k) \) time, but the fastest known multiplication algorithm takes \( O(k \log k \cdot 2^{2 \log^* k}) \) time. Furthermore, this algorithm is quite impractical, so in practice your multiplication is probably \( O( k \log k \log \log k ) \) or worse, depending on what bignum library you're using. Python's bignums use Karatsuba multiplication, so multiplication there will be \( O(k^{\log_2 3}) \approx O(k^{1.585}) \).
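
You can get a feel for this with another rough experiment; with Karatsuba, doubling the number of bits should multiply the runtime by roughly \( 2^{\log_2 3} = 3 \) (a sketch; exact numbers vary by machine and Python version):

import time

# Square a k-bit number for a few doubling values of k and compare
# consecutive timings: their ratio should approach 3.
for k in (10**5, 2 * 10**5, 4 * 10**5):
    x = (1 << k) - 1
    start = time.perf_counter()
    _ = x * x
    print(k, time.perf_counter() - start)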

One version of time complexity that gets us much closer to the actual runtime characteristics is called bit complexity. In bit complexity, we imagine that rather than having access to a number and applying arithmetic operations to that number, our machine is operating on individual bits and applying single bit operations. Then arithmetic operations would be built on top of that. The time complexities I gave for the multiplication algorithms were their bit complexities.
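
To make the model concrete, here's a minimal sketch (the name addBits is mine) of ripple-carry addition built out of single-bit operations, matching the \( O(k) \) addition bound above:

def addBits(a, b):
    # a and b are little-endian lists of bits. Each iteration performs
    # a constant number of single-bit operations, so adding two k-bit
    # numbers takes O(k) bit operations in this model.
    out = []
    carry = 0
    for i in range(max(len(a), len(b))):
        ai = a[i] if i < len(a) else 0
        bi = b[i] if i < len(b) else 0
        out.append(ai ^ bi ^ carry)
        carry = (ai & bi) | (carry & (ai ^ bi))
    if carry:
        out.append(1)
    return out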

So what are the bit complexities of the two algorithms we showed above? First, we need to understand the size of the numbers we're working with. \( F_n \) grows like \( \varphi^n \) (where \( \varphi = \frac{ 1 + \sqrt{5}}{2} \) is the golden ratio), so it has \( \log_2 \varphi^n = n \log_2 \varphi = \Theta(n) \) bits. If we let \( A(k) \) denote the bit complexity of adding two \(k\)-bit numbers, the overall time complexity is \( O(A(1) + A(2) + \cdots + A(n)) \). Since \( A(k) = O(k) \), this is \( O(1 + 2 + \cdots + n) = O(n^2) \). The "linear" algorithm is actually quadratic!

What about the doubling algorithm?

First, instead of worrying about the specifics of the multiplication algorithm we're using, let's write \( M(k) \) for the bit complexity of multiplying two \(k\)-bit numbers. Of course, since any multiplication algorithm needs to read its entire input, \( M(k) = \Omega(k) \). That's all we'll need to know about it for now.

Let's think about the iterations in reverse order. In the final iteration, we're multiplying together numbers around the size of \( F_{n/2} \), which have about \( cn/2 \) bits each (where \( c = \log_2 \varphi \)), followed by additions and subtractions of numbers of the same size. Each earlier iteration operates on numbers roughly half the size of those in the next. So the overall complexity will look like \( O( M(\frac{n}{2}) + \frac{n}{2} + M(\frac{n}{4}) + \frac{n}{4} + M(\frac{n}{8}) + \frac{n}{8} + \cdots ) \). Because \( M(k) = \Omega(k) \), all of the linear terms are dominated and we can think of the complexity as simply \( O(M(\frac{n}{2}) + M(\frac{n}{4}) + \cdots) \).

Once again we need a fact about \( M \): not just \( M(k) = \Omega(k) \), but the slightly stronger (and still mild) assumption that \( M(2k) \ge 2M(k) \), which holds for every multiplication algorithm mentioned above. Given that, the sum drops off at least as fast as the geometric series \( M(\frac{n}{2}) + \frac{1}{2}M(\frac{n}{2}) + \frac{1}{4}M(\frac{n}{2}) + \cdots \). This series sums to a constant times its biggest term, or \( O(M(\frac{n}{2})) \). We can read this as saying that the algorithm is dominated by its final iteration.
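
Spelled out: \( M(2k) \ge 2M(k) \) gives \( M(\frac{n}{2^j}) \le \frac{1}{2^{j-1}} M(\frac{n}{2}) \), so

\[ \sum_{j \ge 1} M\left(\frac{n}{2^j}\right) \le M\left(\frac{n}{2}\right) \sum_{j \ge 1} \frac{1}{2^{j-1}} = 2 M\left(\frac{n}{2}\right) = O\left(M\left(\frac{n}{2}\right)\right). \]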

So essentially, the bit complexity of the doubling algorithm is the same as the bit complexity of multiplying two \(n\)-bit numbers. With naive multiplication this is \( O(n^2) \), and nothing more than a constant factor has been gained over the linear algorithm (you can see this fact in the graphs from this article). With smarter multiplication algorithms, you can reduce it further.

What if your numbers aren't big?

"Okay," you might say, "that's true if we're working with bigints, but what if we keep all of the numbers small enough to fit in a machine word, for example by reducing everything by some modulus?" If you do that, then the original time complexity analysis will be rather accurate, and you'll see them take linear and logarithmic time, as you'd expect. However, in this setting you can do even better. This will be my challenge to the reader.

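Concretely, the reduced linear version is a one-line change (a minimal sketch; fibMod is my name for it):

def fibMod(n, M):
    # Same linear iteration as before, but reduced mod M at every
    # step, so no intermediate value ever exceeds 2*M.
    x, y = 0, 1
    for _ in range(n):
        x, y = y, (x + y) % M
    return x
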
Given a constant \( M \), prove that it is possible to calculate \( F_n \) modulo \( M \) in a constant number of integer additions, subtractions, multiplications, and divisions, and no other arithmetic operations. (Binet's formula does not qualify, because implementing exponentiation requires a logarithmic number of multiplications.)