Derivative of the binomial coefficient

I was asked the other day how one can take the derivative of a discrete function like the binomial coefficient n\choose k, specifically with respect to n, and I noticed that the only explanation I could find online (the one on the Wikipedia page for binomial coefficients) is correct but a little hand-wavy if you're rusty on your differential calculus. So I thought I'd write up a quick explanation.

First, recall that the binomial coefficient n\choose k, read "n choose k", is defined as

\displaystyle {n\choose k}=\frac{n!}{k!(n-k)!}

This of course expands as

\displaystyle {n\choose k}=\frac{1\cdot2\cdot3\cdots(n-1)n}{(1\cdot2\cdot3\cdots(k-1)k)(1\cdot2\cdot3\cdots(n-k-1)(n-k))}

We can now cancel terms to arrive at the following expansion

\displaystyle {n\choose k}=\frac{(n-k+1)(n-k+2)\cdots(n-1)n}{(1\cdot2\cdot3\cdots(k-1)k)}

The numerator in this ratio can otherwise be written with a falling factorial symbol (here using the notation from Graham, Knuth, and Patashnik's Concrete Mathematics):

\displaystyle n^{\underline{k}}=(n-k+1)(n-k+2)\cdots(n-1)n

Leaving us with the following equality:

\displaystyle {n\choose k}=\frac{n^{\underline{k}}}{k!}
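As a quick sanity check of that equality (a sketch of my own, not part of the derivation), here it is in Ruby:

```ruby
# n falling k: n(n-1)...(n-k+1)
def falling_factorial(n, k)
  (0...k).reduce(1) { |acc, i| acc * (n - i) }
end

# n choose k as falling factorial over k!
def binomial(n, k)
  falling_factorial(n, k) / (1..k).reduce(1, :*)
end

binomial(10, 3)  # => 120
```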

Now, what we're interested in is the derivative of {n\choose k} with respect to n, or:

\displaystyle \frac{d}{dn}{n\choose k}=\frac{d}{dn}\frac{n^{\underline{k}}}{k!}

First we'll just notice that 1/k! is a constant with respect to n, so we can push it out of the derivative:

\displaystyle \frac{d}{dn}{n\choose k}=\frac{1}{k!}\frac{d}{dn}n^{\underline{k}}

Now we'll stow all that, and concentrate on the derivative of that n^{\underline{k}} term. To do this, we're going to use logarithmic differentiation.

First, we take the function we're trying to find the derivative of and assign it to some variable (typically y):

\displaystyle y=n^{\underline{k}}

Then, we take the natural logarithm of both sides

\displaystyle \ln{y}=\ln{n^{\underline{k}}}

It will be helpful now to re-expand that falling factorial notation

\displaystyle \ln y=\ln(n(n-1)(n-2)\cdots(n-k+2)(n-k+1))

Here we recall a property of the logarithm, namely that \log(ab)=\log a+\log b, allowing us to restate the above as

\displaystyle \ln y=\ln n+\ln(n-1)+\ln(n-2)+\cdots+\ln(n-k+2)+\ln(n-k+1)

Or

\displaystyle \ln y=\sum_{i=0}^{k-1}\ln(n-i)
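If you want to convince yourself of that log identity numerically (just a sketch, with arbitrarily chosen n and k):

```ruby
n, k = 10.0, 4
# Log of the falling factorial, computed directly...
lhs = Math.log((0...k).reduce(1) { |acc, i| acc * (n - i) })
# ...versus the sum of logs; the two agree to floating-point precision.
rhs = (0...k).sum { |i| Math.log(n - i) }
```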

Now, we take the derivative of both sides

\displaystyle \frac{d}{dn}\ln y=\frac{d}{dn}\sum_{i=0}^{k-1}\ln(n-i)

The left side simply makes use of the derivative \frac{d}{dx}\ln u=\frac{1}{u}\frac{du}{dx}:

\displaystyle \frac{1}{y}\frac{dy}{dn}=\frac{d}{dn}\sum_{i=0}^{k-1}\ln(n-i)

And the right hand side can first be simplified by noting that the derivative of a sum is the sum of the derivatives of its terms:

\displaystyle \frac{1}{y}\frac{dy}{dn}=\sum_{i=0}^{k-1}\frac{d}{dn}\ln(n-i)

Applying the same logarithmic derivative formula noted above to this leaves us with

\displaystyle \frac{1}{y}\frac{dy}{dn}=\sum_{i=0}^{k-1}\frac{1}{n-i}

Now we multiply both sides by y

\displaystyle \frac{dy}{dn}=y\sum_{i=0}^{k-1}\frac{1}{n-i}

And substitute the original function for y wherever it appears, leaving us with our derivative!

\displaystyle \frac{d}{dn}n^{\underline{k}}=n^{\underline{k}}\sum_{i=0}^{k-1}\frac{1}{n-i}

Taking this result and feeding it back into our equation for the derivative of the binomial coefficient gives us

\displaystyle \frac{d}{dn}{n\choose k}=\frac{1}{k!}\frac{d}{dn}n^{\underline{k}}=\frac{n^{\underline{k}}}{k!}\sum_{i=0}^{k-1}\frac{1}{n-i}

Finally, we note that the n^{\underline{k}}/k! term in front of the sum is the same as the original binomial coefficient, giving us our final solution:

\displaystyle \frac{d}{dn}{n\choose k}={n\choose k}\sum_{i=0}^{k-1}\frac{1}{n-i}
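To check the final formula numerically, we can extend n\choose k to real n via the Gamma function (since \Gamma(n+1)=n!) and compare a finite-difference derivative against the sum. This is my own quick sketch, not part of the derivation above:

```ruby
# Binomial coefficient for real n via log-Gamma.
# (Math.lgamma returns [log|gamma(x)|, sign]; all arguments here are positive.)
def binom_real(n, k)
  Math.exp(Math.lgamma(n + 1)[0] - Math.lgamma(k + 1)[0] - Math.lgamma(n - k + 1)[0])
end

n, k, h = 10.0, 3, 1e-6
# Central finite difference vs. the closed-form derivative.
numeric  = (binom_real(n + h, k) - binom_real(n - h, k)) / (2 * h)
analytic = binom_real(n, k) * (0...k).sum { |i| 1.0 / (n - i) }
# numeric and analytic agree to several decimal places
```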

Even Fibonacci numbers

I was fiddling around on Project Euler the other day, and looked back at one of the earlier problems (I think it was #2), which asks the following (I'm paraphrasing here): Find the sum of all those Fibonacci numbers that are both even and less than or equal to 4,000,000. Now, solving this iteratively is actually pretty simple, but I wanted to have a little fun and see if I could find a closed-form solution. So, in response to a friend who asked that I describe the solution, here it is:

First, we start with a definition. The sequence of Fibonacci numbers is described as a sequence such that any number in the sequence is equal to the sum of the previous two numbers. In other words, it satisfies the recurrence relation:

\displaystyle F_{1}=F_{2}=1\\F_{n}=F_{n-1}+F_{n-2}\text{  for  }n>2

Let's look at some small terms of the Fibonacci sequence to get a feel for it:

\displaystyle 1,1,2,3,5,8,13,21,34,55,89,144,...

We can notice straightaway that every 3rd number in the sequence appears to be even, but none of the rest are. In fact, this is easy to prove: the Fibonacci sequence (from 1, anyway) starts with two odd numbers (the 1s). Now we note that for any place in the sequence, the nth number is the sum of the (n-1)th number and the (n-2)th number. If both of those numbers are odd, the nth number will be even; if one is even and one is odd, the nth number will be odd. Given those facts, we can see that the 3rd number will be the sum of two odd numbers and therefore even (2), the 4th will be the sum of an odd and an even, giving an odd (3), and the 5th will be the sum of an even and an odd, giving another odd (5) and resetting the proverbial counter. Thus, only every 3rd Fibonacci number is even - so those are the only ones we want to pay attention to. Let's give them a name: E_{n} will hereafter denote the nth even Fibonacci number, and as discussed it has the formula:

\displaystyle E_{n}=F_{3n}

Now let's see if we can turn this into a recurrence by stating it in terms of itself, as opposed to in terms of the Fibonacci sequence. First, let's unfold F_{3n} on the right a few times and see what happens:

\displaystyle E_{n}=F_{3n-1}+F_{3n-2}\\E_{n}=F_{3n-2}+F_{3n-3}+F_{3n-3}+F_{3n-4}\\E_{n}=F_{3n-2}+2F_{3n-3}+F_{3n-4}

OK, hold it. Now it's time to notice something handy:

\displaystyle F_{3n-3}=F_{3(n-1)}=E_{n-1}

Given the above, we can now state our current unfolding as:

\displaystyle E_{n}=F_{3n-2}+2E_{n-1}+F_{3n-4}

Let's continue unfolding again.

\displaystyle E_{n}=F_{3n-2}+2E_{n-1}+F_{3n-4}\\E_{n}=F_{3n-3}+F_{3n-4}+2E_{n-1}+F_{3n-5}+F_{3n-6}

Stop! We have another F_{3n-3} term, as well as an F_{3n-6} term. Similar to the above equality, we can say that:

\displaystyle F_{3n-6}=F_{3(n-2)}=E_{n-2}

Leaving us with:

\displaystyle E_{n}=F_{3n-4}+3E_{n-1}+F_{3n-5}+E_{n-2}

Now we just notice that, by definition:

\displaystyle F_{3n-4}+F_{3n-5}=F_{3n-3}=E_{n-1}

Which leaves us with our final recurrence relation:

\displaystyle E_{1}=2\\E_{n}=4E_{n-1}+E_{n-2}\text{  for  }n>1
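We can spot-check this recurrence against the Fibonacci sequence itself (a quick sketch):

```ruby
# Build the Fibonacci sequence, then pull out every 3rd term (F_3, F_6, ...).
fib = [1, 1]
fib << fib[-1] + fib[-2] while fib.size < 30
every_third = (1..9).map { |n| fib[3 * n - 1] }  # fib is 0-indexed, so F_{3n} = fib[3n-1]

# Build the same terms from the recurrence E_n = 4E_{n-1} + E_{n-2}.
e = [2, 8]
e << 4 * e[-1] + e[-2] while e.size < 9

every_third == e  # => true
```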

Now we can get to the real work. The first step is to take this relation and make it valid for all values of n. To do that, we'll use a Kronecker delta, which looks like \delta_{nm}. This is just a handy little notation, the equivalent of a ternary conditional for math: if n=m, then \delta_{nm}=1; otherwise, \delta_{nm}=0. Using this logic, we can state the above recurrence as:

\displaystyle E_{n}=4E_{n-1}+E_{n-2}+2\delta_{n1}

I should note here that E_{n} is presumed to be 0 when n\le0. Thus, we can say the above is valid for all n>0, as opposed to n>1. When n=1, the first two terms sum to 0, and the Kronecker delta spits out a 1, which is multiplied by 2, giving us an answer of 2. For all other values of n, the delta spits out a 0 and that whole term can be ignored. Neat!

Now comes the tricky part that is less than intuitive. We're going to solve this recurrence by means of an Ordinary generating function, or OGF. If you are unfamiliar with OGFs or generating functions in general, this may seem a bit like magic, but the upshot is that they're little machines that crank out a sequence of numbers, and can be represented in the form:

\displaystyle A(z)=\sum_{n=0}^{\infty}A_{n}z^{n}

...where [z^{n}]A(z) = A_{n} is used to denote the "coefficient of z^{n} in A(z)". Now, looking at this sum, we can see that it looks a hell of a lot like the Taylor series expansion centered at 0 (also called the Maclaurin series) of a sufficiently smooth function f(x):

\displaystyle f(x)=\sum_{n=0}^{\infty}\left.\frac{d^{n}f(x)}{dx^{n}}\right|_{0}\frac{x^{n}}{n!}

And, in fact, we can see that for OGFs:

\displaystyle [z^{n}]A(z) = A_{n} = \frac{\left.\frac{d^{n}A(z)}{dz^{n}}\right|_{0}}{n!}

Moving on, GFs in general typically have ridiculously simple closed forms - for instance, let's look at a really simple one. The following OGF generates the sequence (1,1,1,...):

\displaystyle A(z)=\sum_{n=0}^{\infty}z^{n}

Here, [z^{n}]A(z) = A_{n} = 1, and its closed form is:

\displaystyle A(z)=\sum_{n=0}^{\infty}z^{n}=\frac{1}{1-z}

OGFs have their own flavors of operations that can be performed on them, but they're far too dense a topic for me to cover in entirety here - if interested, I recommend you to Graham, Knuth, and Patashnik's fantastic (and entertaining!) textbook, Concrete Mathematics.

Anyway, moving on, the next step in solving our recurrence E_{n} is to turn it into a function of generating functions, which we can then combine. First thing we'll do is multiply both sides by z^{n}.

\displaystyle E_{n}z^{n}=(4E_{n-1}+E_{n-2}+2\delta_{n1})z^{n}\\E_{n}z^{n}=4E_{n-1}z^{n}+E_{n-2}z^{n}+2\delta_{n1}z^{n}

Now, we sum both sides on n from 0 to infinity:

\displaystyle \sum_{n=0}^{\infty}E_{n}z^{n}=\sum_{n=0}^{\infty}(4E_{n-1}z^{n}+E_{n-2}z^{n}+2\delta_{n1}z^{n})\\ \displaystyle\sum_{n=0}^{\infty}E_{n}z^{n}=\sum_{n=0}^{\infty}4E_{n-1}z^{n}+\sum_{n=0}^{\infty}E_{n-2}z^{n}+\sum_{n=0}^{\infty}2\delta_{n1}z^{n}

Now, by the basic operations on OGFs, we know that multiplying an OGF by z shifts the whole sequence to the right - so we can restate the above as:

\displaystyle \sum_{n=0}^{\infty}E_{n}z^{n}=z\sum_{n=0}^{\infty}4E_{n}z^{n}+z^{2}\sum_{n=0}^{\infty}E_{n}z^{n}+\sum_{n=0}^{\infty}2\delta_{n1}z^{n}

Additionally, since that last term is only ever 2 when n=1 and 0 otherwise, it is equivalent to 2z, and we can trash the other notation.

\displaystyle \sum_{n=0}^{\infty}E_{n}z^{n}=z\sum_{n=0}^{\infty}4E_{n}z^{n}+z^{2}\sum_{n=0}^{\infty}E_{n}z^{n}+2z

Now we can clean things up further by using the notation:

\displaystyle E(z)=\sum_{n=0}^{\infty}E_{n}z^{n}

Leading us to the much-easier-to-read:

\displaystyle E(z)=4zE(z)+z^{2}E(z)+2z

Now we just need to solve for E(z)!

\displaystyle E(z)-4zE(z)-z^{2}E(z)=2z\\\displaystyle E(z)(1-4z-z^{2})=2z\\\displaystyle E(z)=\frac{2z}{1-4z-z^{2}}
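As a numerical sanity check on this closed form (my own sketch): evaluate E(z) at a small z and compare against a truncated version of the defining series, with the even Fibonacci numbers generated from the recurrence:

```ruby
z = 0.01
# The closed form 2z / (1 - 4z - z^2)...
closed = 2 * z / (1 - 4 * z - z * z)

# ...versus the first several terms of sum E_n z^n.
evens = [2, 8]
evens << 4 * evens[-1] + evens[-2] while evens.size < 8
series = evens.each_with_index.sum { |e_n, i| e_n * z**(i + 1) }
# closed and series agree up to the tiny truncation error
```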

Woot! Now let's factor that polynomial in the denominator and get something even sexier! It factors as (1-\psi z)(1-\hat{\psi}z), where \psi=2+\sqrt{5} and \hat{\psi}=2-\sqrt{5} are the roots of x^{2}-4x-1, the characteristic polynomial of our recurrence (I'll leave it to you to verify by multiplying out). For conciseness we'll define these values as two constants:

\displaystyle\psi=2+\sqrt{5}\\\displaystyle\hat{\psi}=2-\sqrt{5}

This means we can represent the above as:

\displaystyle E(z)=\frac{2z}{(1-\psi z)(1-\hat{\psi}z)}\\\displaystyle E(z)=2z\frac{1}{(1-\psi z)(1-\hat{\psi}z)}

We can further clean up by using partial fractions to represent the remaining fraction:

\displaystyle\frac{1}{(1-\psi z)(1-\hat{\psi}z)}=\frac{c_{0}}{1-\psi z}+\frac{c_{1}}{1-\hat{\psi}z}

Cross-multiplying and simplifying the above leaves us with two equations that c_{0} and c_{1} must satisfy:

\displaystyle c_{0}+c_{1}=1\\\displaystyle -\hat{\psi}c_{0}-\psi c_{1}=0

These two linear equations of two variables can be solved pretty simply, leaving us with the following values for c_{0} and c_{1}:

\displaystyle c_{0}=\frac{\psi}{2\sqrt{5}}\\\displaystyle c_{1}=\frac{-\hat{\psi}}{2\sqrt{5}}

Substituting these constants into the partial fraction gives us:

\displaystyle \frac{1}{2\sqrt{5}}(\frac{\psi}{1-\psi z}-\frac{\hat{\psi}}{1-\hat{\psi} z})

Which we then substitute back into the original equation as:

\displaystyle E(z)=2z\frac{1}{(1-\psi z)(1-\hat{\psi}z)}\\\displaystyle E(z)=2\frac{1}{2\sqrt{5}}z(\frac{\psi}{1-\psi z}-\frac{\hat{\psi}}{1-\hat{\psi} z})\\\displaystyle E(z)=\frac{1}{\sqrt{5}}z(\psi\frac{1}{1-\psi z}-\hat{\psi}\frac{1}{1-\hat{\psi} z})

Now we can state this whole thing in terms of known OGFs and operations thereupon!

\displaystyle E(z)=\frac{1}{\sqrt{5}}z(\sum_{n=0}^{\infty}\psi^{n+1}z^{n}-\sum_{n=0}^{\infty}\hat{\psi}^{n+1}z^{n})\\\displaystyle E(z)=\frac{1}{\sqrt{5}}z\sum_{n=0}^{\infty}(\psi^{n+1}-\hat{\psi}^{n+1})z^{n}\\\displaystyle E(z)=\frac{1}{\sqrt{5}}\sum_{n=1}^{\infty}(\psi^{n}-\hat{\psi}^{n})z^{n}\\\displaystyle E(z)=\sum_{n=1}^{\infty}\frac{\psi^{n}-\hat{\psi}^{n}}{\sqrt{5}}z^{n}

Now, we have the following truth (we can ignore [z^0]E(z) because it is defined as 0):

\displaystyle E(z)=\sum_{n=1}^{\infty}E_{n}z^{n}=\sum_{n=1}^{\infty}\frac{\psi^{n}-\hat{\psi}^{n}}{\sqrt{5}}z^{n}

This in turn implies that:

\displaystyle E_{n}=[z^{n}]E(z)=\frac{\psi^{n}-\hat{\psi}^{n}}{\sqrt{5}}

We now have our closed-form solution for the nth even Fibonacci number! Let's try it out!

\displaystyle E_{1}=\frac{\psi^{1}-\hat{\psi}^{1}}{\sqrt{5}}\\\displaystyle E_{1}=\frac{(2+\sqrt{5})-(2-\sqrt{5})}{\sqrt{5}}\\\displaystyle E_{1}=\frac{2+\sqrt{5}-2+\sqrt{5}}{\sqrt{5}}\\\displaystyle E_{1}=\frac{2\sqrt{5}}{\sqrt{5}}\\\displaystyle E_{1}=2

What about for n=2?

\displaystyle E_{2}=\frac{\psi^{2}-\hat{\psi}^{2}}{\sqrt{5}}\\\displaystyle E_{2}=\frac{(2+\sqrt{5})(2+\sqrt{5})-(2-\sqrt{5})(2-\sqrt{5})}{\sqrt{5}}\\\displaystyle E_{2}=\frac{(4+4\sqrt{5}+5)-(4-4\sqrt{5}+5)}{\sqrt{5}}\\\displaystyle E_{2}=\frac{4+4\sqrt{5}+5-4+4\sqrt{5}-5}{\sqrt{5}}\\\displaystyle E_{2}=\frac{8\sqrt{5}}{\sqrt{5}}\\\displaystyle E_{2}=8
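Rather than grinding through more algebra by hand, here's a quick sketch that evaluates the closed form for the first several n (rounding away the floating-point fuzz):

```ruby
sqrt5 = Math.sqrt(5)
psi, psi_hat = 2 + sqrt5, 2 - sqrt5

# E_n = (psi^n - psi_hat^n) / sqrt(5), rounded to the nearest integer.
closed = ->(n) { ((psi**n - psi_hat**n) / sqrt5).round }

(1..8).map(&closed)  # => [2, 8, 34, 144, 610, 2584, 10946, 46368]
```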

Looks pretty good! Now that we have a closed-form solution for the sequence of even Fibonacci numbers, we are asked to find the sum over all of them less than or equal to 4,000,000. The first step is to find the largest index m for which E_{m} is less than or equal to some maximum value v (we'll plug in 4,000,000 at the end). So we write the following inequality and solve for m (as a floored integer):

\displaystyle E_{m}=\frac{\psi^{m}-\hat{\psi}^{m}}{\sqrt{5}}\le v\\\displaystyle \psi^{m}-\hat{\psi}^{m}\le v\sqrt{5}

Since we're taking the floor of the result, and \left|\hat{\psi}^{m}\right|=\left|(2-\sqrt{5})^{m}\right|<\frac{1}{2} for all positive values of m, we can safely ignore that term:

\displaystyle \psi^{m}\le v\sqrt{5}\\\displaystyle m\ln\psi\le \ln(v\sqrt{5})\\\displaystyle m=\lfloor\frac{\ln(v\sqrt{5})}{\ln\psi}\rfloor
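In Ruby, that floor expression is a one-liner (a sketch):

```ruby
psi = 2 + Math.sqrt(5)
# Largest m with E_m <= v
m = ->(v) { (Math.log(v * Math.sqrt(5)) / Math.log(psi)).floor }
m.call(4_000_000)  # => 11
```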

Now that we know how to find m given v, what we need to answer the question is the following sum:

\displaystyle\sum_{k=1}^{m}\frac{\psi^{k}-\hat{\psi}^{k}}{\sqrt{5}}

Oh no! Not another freakin' sum! Luckily, this can be tidied away quite simply itself!

\displaystyle\sum_{k=1}^{m}\frac{\psi^{k}-\hat{\psi}^{k}}{\sqrt{5}}\\\displaystyle\frac{1}{\sqrt{5}}\sum_{k=1}^{m}(\psi^{k}-\hat{\psi}^{k})\\\displaystyle\frac{1}{\sqrt{5}}(\sum_{k=1}^{m}\psi^{k}-\sum_{k=1}^{m}\hat{\psi}^{k})

We can kerjigger these two sums a bit by shifting indices, popping off the top term, and subtracting one from each:

\displaystyle\frac{1}{\sqrt{5}}((\sum_{k=0}^{m-1}\psi^{k}+\psi^{m}-1)-(\sum_{k=0}^{m-1}\hat{\psi}^{k}+\hat{\psi}^{m}-1))

Now, because both \psi and \hat{\psi} are constants, these evaluate to simple geometric series for which we have closed forms!

\displaystyle\frac{1}{\sqrt{5}}((\frac{1-\psi^{m}}{1-\psi}+\psi^{m}-1)-(\frac{1-\hat{\psi}^{m}}{1-\hat{\psi}}+\hat{\psi}^{m}-1))

While I'm not going to go through all the steps here, you can simplify this down to our final formula for s(v), the sum of all even Fibonacci numbers less than or equal to v:

\displaystyle s(v)=\frac{\psi^{m(v)}(3+\sqrt{5})-\hat{\psi}^{m(v)}(3-\sqrt{5})-2\sqrt{5}}{4\sqrt{5}}

where:

\displaystyle m(v)=\lfloor\frac{\ln(v\sqrt{5})}{\ln\psi}\rfloor\\\displaystyle\psi=2+\sqrt{5}\\\displaystyle\hat{\psi}=2-\sqrt{5}

Now, plugging 4,000,000 into m(v) returns 11. So, replacing all instances of m(v) with 11 in the formula for s(v) yields us our final answer:

\displaystyle s(4,000,000)=4,613,732
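Putting all the pieces together in Ruby (my own sketch of the formulas above):

```ruby
# Sum of all even Fibonacci numbers <= v, via the closed form s(v).
def even_fib_sum(v)
  sqrt5 = Math.sqrt(5)
  psi, psi_hat = 2 + sqrt5, 2 - sqrt5
  m = (Math.log(v * sqrt5) / Math.log(psi)).floor
  ((psi**m * (3 + sqrt5) - psi_hat**m * (3 - sqrt5) - 2 * sqrt5) / (4 * sqrt5)).round
end

even_fib_sum(4_000_000)  # => 4613732
```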

Finally! Done!

Cartesian products, part 1

I was recently tasked with finding a solution to a particularly wonky problem that challenged me algorithmically, and I figured I'd write a little post on the experience. The fact that as software developers we are surrounded by a cornucopia of libraries, frameworks, gems, jars, etc. - while certainly beneficial to the execution of our craft - means that the opportunity to solve a really *interesting* problem comes up relatively rarely. So when I was given the job of finding all possible combinations of 1 element each from n distinct sets of varying length and contents (i.e. the Cartesian product of n sets), I leapt at the opportunity to flex my long-underutilized comp-sci muscles. The problem and my own solution are given in detail below, with a few caveats: 1) If you don't like math, well, you should. :P 2) I wholeheartedly reserve the right to be wrong, and if you know a better way to acquire the same result, please email me!

So, here's the problem:

Given n distinct sets X_{1}, X_{2}, ... X_{n}, find all possible sets consisting of n elements, one from each set. Such a result is known as the Cartesian product, and can be represented mathematically as:

P = \prod_{j=1}^{n}X_{j} = X_{1}\times X_{2}\times ...X_{n}

where P is the actual product. What does this product look like? Let's see an example.

We'll say we have two sets X_{1} and X_{2}:

X_{1} = \{a, b\}
X_{2} = \{1, 2\}

The Cartesian product of these two sets would be a "set of sets", or more formally a collection or family of sets, and would look like the following:

P = X_{1}\times X_{2} = \{\{a,1\},\{a,2\},\{b,1\},\{b,2\}\}

As you can see if you examine the above, for each element of set X_{1}, there are exactly two sets in the result that contain that element - each of which also contains one or the other element of X_{2}. We have our product. Now let's figure out how we got here - and we'll start with the easy-to-compute question: How many possible combinations can there be?

Well, this is likely the easiest problem in combinatorics there is to solve. Remember above where we stated the formula for arriving at a Cartesian product?

P = \prod_{j=1}^{n}X_{j} = X_{1}\times X_{2}\times ...X_{n}

Well, the number we're particularly after is the cardinality of the final collection P. In this context, the word "cardinality" means "size" - i.e. the cardinality of any set is the number of elements in that set. In mathematical notation, we represent the cardinality of a set by surrounding the name of the set by two vertical bars - very similar to what you may know as the notation for absolute value. So, the cardinality of the set X_{1} above would be represented as \left|X_{1}\right| and would be equal to 2, because set X_{1} has exactly 2 elements, a and b. So, given what we now know about cardinality, how do we use that to compute the cardinality of P, or \left|P\right|? It literally couldn't be easier:

\left|P\right| = \prod_{j=1}^{n}\left|X_{j}\right| = \left|X_{1}\right|\times \left|X_{2}\right|\times ...\left|X_{n}\right|

That's right - you just multiply the cardinalities of all the sets in question together, and voilà: you have how many combinations there will be. Now, we've already stated that the cardinality \left|X_{1}\right| of set X_{1} is equal to two, and it should be plain by now that the cardinality \left|X_{2}\right| of set X_{2} is also equal to two. As such, that gives us the answer 4 for \left|P\right|, and lo and behold, the result set P shown above has exactly 4 elements!
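In Ruby, that counting argument is a one-liner - here with three hypothetical sets of sizes 2, 3, and 2 (my own example):

```ruby
sets = [[:a, :b], [1, 2, 3], [:x, :y]]
# The product's cardinality is the product of the cardinalities.
sets.map(&:size).reduce(:*)  # => 12
```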

Now to the harder bit - how do we figure out precisely what those elements are? Enter my good friend and soon-to-be yours (if he isn't already), recursion. Below I have implemented the algorithm I worked out - though I make no claims to it being a heretofore undiscovered one - as a pure function (i.e. a function "without side-effects", which does not alter any external state) in Ruby, my programming language of choice. Note how the method calls itself as part of its execution - that's recursion.

require 'set'

# Returns the Cartesian product of the given sets as a set of sets.
def cartesian_product(*enums)
  # Base case: the product of zero sets is a set containing only the empty set.
  return Set.new([Set.new]) if enums.empty?
  combos = Set.new
  first = enums.shift
  # Recursive case: compute the product of the remaining sets...
  subproducts = cartesian_product(*enums)
  # ...then pair every element of the first set with every sub-product.
  subproducts.each do |product|
    first.each do |element|
      combos.add(product.clone.add(element))
    end
  end
  combos
end
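And here it is in action on the two-set example from earlier (the method body is repeated so this snippet runs standalone; note that it needs require 'set'):

```ruby
require 'set'

def cartesian_product(*enums)
  return Set.new([Set.new]) if enums.empty?
  combos = Set.new
  first = enums.shift
  subproducts = cartesian_product(*enums)
  subproducts.each do |product|
    first.each do |element|
      combos.add(product.clone.add(element))
    end
  end
  combos
end

product = cartesian_product(Set[:a, :b], Set[1, 2])
product.size                  # => 4
product.include?(Set[:a, 1])  # => true
```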

So there's the meat of the matter - more explanation to come in part 2, coming soon!

Attempt #4

OK. This time, I'm going to do it, and do it right. This time, I will become a prolific, opinionated, and sought-after blogger. This time, I will write more than a half-hearted rant condemning regular expressions and half a tutorial on a subject hardly anybody cares about. This time, I'm doing it right.

That's science - so stay tuned.
