Stochastic Calculus and the Nobel Prize-Winning Black-Scholes Equation

The 1997 Nobel Prize in Economics went to Robert Merton and Myron Scholes for their revolutionary Black-Scholes differential equation for the value of financial instruments. It is derived from a stochastic differential equation, so called because it includes a random element. The work contained some deep but relatively simple mathematical ideas, as I try to explain in my course on Investment Mathematics and in the following blog.

To get started, let’s toss a fair coin n times and let Xi = +1 if the ith toss is heads or −1 if the ith toss is tails, each with probability 1/2. Each Xi is called a random variable because it takes on certain values with certain probabilities. It has mean 0, which is computed by taking each value multiplied by its probability and adding them up:

(1/2)(+1) + (1/2)(−1) = 0.

It has variance 1, which is computed as the mean of the square of the difference from the mean, which in this case is always 1. The standard deviation, the square root of the variance, is also 1, and gives an estimate of how far Xi will deviate from its mean on average. The Xi are called independent because no toss is affected by another. In particular, the mean of XiXj is 0; if Xi is positive, Xj is equally likely to be positive or negative. The sum

f(n) = X1 + … + Xn

has mean 0, variance n, and standard deviation s = √n. To compute the variance, just note that the square

(X1 + … + Xn)² = X1² + … + Xn² + cross terms

has mean n because each Xi² has mean 1 and the cross terms have mean 0. The fact that the mean of f(n) is 0 implies that if you toss a coin n times, the average number of heads is n/2. The fact that the standard deviation is √n means that in practice the deviation from n/2 should be on the order of √n.
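As a quick numerical check of the mean and standard deviation of f(n), here is a minimal sketch using only Python's standard library. The values of n, the number of trials, and the seed are arbitrary choices for illustration, not anything from the argument above.

```python
import random
import statistics

def coin_sum(n, rng):
    """Return f(n) = X1 + ... + Xn for n fair tosses, each Xi = +1 or -1."""
    return sum(rng.choice((1, -1)) for _ in range(n))

rng = random.Random(0)        # fixed seed so the run is repeatable
n, trials = 100, 2000         # illustrative sizes (assumptions)

samples = [coin_sum(n, rng) for _ in range(trials)]
mean_f = statistics.fmean(samples)   # should be close to 0
std_f = statistics.pstdev(samples)   # should be close to sqrt(n) = 10

print("mean of f(n):", mean_f)
print("std of f(n): ", std_f)
```

With these sizes the sample mean should land near 0 and the sample standard deviation near √100 = 10, up to sampling noise.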

You could take a random walk on the line by tossing a coin every second and taking a unit step forward or backward according to whether the coin came up heads or tails. The random function f(n) would give your position after n seconds.

To get a continuous limit for 0 ≤ t ≤ 1 of random walks with rapid small steps, you could try considering

X1 \Delta t + … + Xn \Delta t

with \Delta t = 1/n and mean 0, but this has standard deviation √n /n = 1/√n, which goes to 0 in the limit, and you end up just standing at the origin. The reason is that independent identical random variables with mean 0 tend to cancel when you add them up. In the stochastic calculus, this can be summarized by saying that

(1)          Xt dt = 0

because you always get 0 when you integrate or take limits of sums. If you replace \Delta t by some function a of \Delta t and consider

X1 a + … + Xn a ,

then the standard deviation is a√n, which is 1 if a = 1/√n = √\Delta t. Therefore

X1 √\Delta t + … + Xn √\Delta t

does have a non-zero limit, with mean 0 and standard deviation 1 at time t = 1, called the Wiener process or Brownian motion z. This process is a solution to the stochastic differential equation

dz = Xt √dt .
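The difference between the two scalings can be seen numerically. The sketch below, with arbitrary trial counts and seed, sums the same coin tosses scaled by \Delta t and by √\Delta t and compares the spread of the result at time t = 1.

```python
import math
import random
import statistics

def toss_sum(n, rng):
    """Return X1 + ... + Xn for n fair tosses, each Xi = +1 or -1."""
    return sum(rng.choice((1, -1)) for _ in range(n))

rng = random.Random(1)   # fixed seed so the run is repeatable
trials = 2000            # illustrative size (assumption)

results = {}
for n in (4, 64, 256):
    dt = 1.0 / n
    sums = [toss_sum(n, rng) for _ in range(trials)]
    std_dt = statistics.pstdev([s * dt for s in sums])               # steps of size dt
    std_sqrt = statistics.pstdev([s * math.sqrt(dt) for s in sums])  # steps of size sqrt(dt)
    results[n] = (std_dt, std_sqrt)
    print(n, std_dt, std_sqrt)
```

As n grows, the first column should shrink like 1/√n while the second stays near 1, which is the reason for choosing steps of size √\Delta t.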

At time t, z(t) has mean 0 and standard deviation √t. More generally you can consider a generalized Wiener process x satisfying the differential equation

dx = a dt + b dz ,

with solution x = at + bz. The mean, due to the deterministic component at, is at, while the standard deviation, due to the stochastic term bz, is b√t. Still more generally you can consider an Ito process

(2)          dx = a(x,t) dt + b(x,t) dz ,

which can be hard to solve explicitly.
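When no explicit solution is available, one common approach is to step the equation forward in small time increments, a scheme usually called Euler-Maruyama. The sketch below is a variant that uses coin-toss increments dz = ±√dt in the spirit of this post (the usual version draws Gaussian increments). The function name, the coefficients a = 0.5 and b = 2, and the step counts are hypothetical choices for illustration.

```python
import math
import random
import statistics

def euler_maruyama(a, b, x0, t_end, n_steps, rng):
    """Approximate x(t_end) for dx = a(x,t) dt + b(x,t) dz by small forward steps."""
    dt = t_end / n_steps
    x, t = x0, 0.0
    for _ in range(n_steps):
        dz = rng.choice((1, -1)) * math.sqrt(dt)   # coin-toss increment, dz = Xt sqrt(dt)
        x += a(x, t) * dt + b(x, t) * dz
        t += dt
    return x

rng = random.Random(2)   # fixed seed so the run is repeatable

# Sanity check on the generalized Wiener process with constant a and b,
# where the solution x = at + bz is known: mean a*t, standard deviation b*sqrt(t).
a_const, b_const, t_end = 0.5, 2.0, 1.0
ends = [
    euler_maruyama(lambda x, t: a_const, lambda x, t: b_const, 0.0, t_end, 100, rng)
    for _ in range(2000)
]
mean_x = statistics.fmean(ends)   # should be close to 0.5
std_x = statistics.pstdev(ends)   # should be close to 2.0
print(mean_x, std_x)
```

Passing a and b as functions of (x, t) means the same routine can be pointed at a general Ito process (2); the constant case is used here only because its answer is known.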

Stochastic calculus appears much trickier than ordinary calculus because dz² is on the order of dt and hence it is not negligible the way that dt² is. What makes it all manageable is Ito’s Lemma, which in abbreviated form just says that

(3)          dz² = dt .

The essence of the proof is that

dz² = Xt² dt = 1 dt + (Xt² − 1) dt = dt .

The first equality holds by definition, the second is trivial. To understand the third, note that Xt² − 1 is a random variable with mean 0, so that (Xt² − 1) dt = 0 as in (1). Here is the associated stochastic chain rule, also called Ito’s Lemma:

Ito’s Lemma. Consider an Ito process (2) and let y = f(x) be a twice differentiable function of x. Then

(4)          dy = (f’ a + f” b²/2) dt + f’ b dz .

To see how (4) follows from (3), start with the second order Taylor series for y:

dy = f’(x) dx + (1/2) f”(x) dx² .

Note that the dx² term is not negligible; indeed, by (2) and (3), dx² = b² dz² = b² dt, since the remaining terms in dt² and dt dz are of higher order. Equation (4) now follows from (2). The interesting feature is the appearance of the second derivative f” because dz² = dt .
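As a small worked example of Ito's Lemma (4), chosen here only for illustration, take x to be the Wiener process z itself and y = z².

```latex
% Take x = z, so a = 0 and b = 1 in (2), and f(x) = x^2, so f' = 2x and f'' = 2.
\begin{align*}
d(z^2) &= \left( 2z \cdot 0 + \tfrac{1}{2} \cdot 2 \cdot 1^2 \right) dt + 2z \cdot 1 \, dz \\
       &= dt + 2z \, dz .
\end{align*}
```

Ordinary calculus would give only 2z dz; the extra dt is the contribution of the second derivative. Since the increment dz has mean 0 whatever the current value of z, the mean of z² grows at rate 1, which agrees with z(t) having variance t.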

Before applying stochastic calculus to stocks, recall how money grows in the bank at a risk-free rate r, which governs the relative growth rate of the balance B:

(5)          \frac{dB}{B}=r dt

For the price S of a stock, in addition to a nonrandom growth rate \mu, one considers a random component, a multiple of Brownian motion:

(6)          \frac{dS}{S}=\mu dt+\sigma dz

These two coefficients, the mean growth rate \mu and the so-called volatility \sigma, are considered the two most important characteristics of a stock. In general, a higher growth rate entails higher volatility and risk. Probably the most important principle from investment mathematics, called diversification, mandates buying many uncorrelated stocks with high \mu and \sigma with the expectation that their random fluctuations will tend to cancel and thus entail much less risk than any of them individually.
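The cancellation behind diversification is the same √n effect seen with coin tosses. Here is a rough sketch under strong simplifying assumptions: each stock's one-period return is drawn independently from a normal distribution with the same mean and volatility, and the portfolio is equally weighted. The numbers (10% growth, 30% volatility, 25 stocks) are hypothetical.

```python
import random
import statistics

def portfolio_return(n_stocks, mu, sigma, rng):
    """One-period return of an equal-weight portfolio of independent stocks."""
    return statistics.fmean([rng.gauss(mu, sigma) for _ in range(n_stocks)])

rng = random.Random(3)      # fixed seed so the run is repeatable
mu, sigma = 0.10, 0.30      # hypothetical growth rate and volatility
trials = 4000               # illustrative size (assumption)

single = [portfolio_return(1, mu, sigma, rng) for _ in range(trials)]
basket = [portfolio_return(25, mu, sigma, rng) for _ in range(trials)]

std_single = statistics.pstdev(single)   # should be close to sigma = 0.30
std_basket = statistics.pstdev(basket)   # should be close to sigma / sqrt(25) = 0.06
mean_basket = statistics.fmean(basket)   # should stay close to mu = 0.10

print(std_single, std_basket, mean_basket)
```

Under these assumptions the basket keeps the same mean growth while its standard deviation drops by a factor of about √25 = 5. Real stocks are correlated, so the reduction in practice is smaller.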

The hard part of investment analysis comes in treating more complicated financial instruments. A call is the right to buy for example 100 shares of Sears at $95/share six months from now. The challenge facing Black, Scholes, and Merton was to figure out what such a call should be worth. The value C(S,t) of such a call varies in time and depends on how the price of the stock varies. Even though the current price of Sears is $91, the call option is worth something, because the price may go above $95. If the price stays at $91, the value of the call will gradually decay over the six months to 0, but if the price rises, the value of the call may rise. It will never fall below 0, because it is just an option to buy, not an obligation to buy.

The key to evaluating the call is to note that it can be instantaneously replicated by some linear combination G = uS − vB of buying the stock and borrowing money from the bank. The call should have the same price as uS − vB. If, for example, the call had a higher price, one could go into the business of selling calls, buying replications, and making a risk-free profit. The opportunity for such “arbitrage” keeps market prices coherent. So the difficult problem of pricing the call seems to be reduced to the easier problem of pricing stocks.

The difficulty is that the coefficients u and v vary in time, depending in part on the price of the stock. Such “dynamic arbitrage” was a revolutionary idea. It means that instead of classical probability theory, you really need the random or stochastic calculus and differential equations we introduced above.

To replicate the call, the evolving linear combination G = uS − vB must satisfy certain conditions. First of all, you need to borrow more money to buy more stock, i.e., funds to increase u must come from corresponding increases in v, so that S du = B dv. Therefore,

dG = udS-vdB+Sdu-Bdv=udS -vdB

      =(u\mu S-vrB)dt+u\sigma S dz

by (5) and (6). Meanwhile, by Ito’s Lemma (4),

dC=(\frac{\partial C}{\partial t}+\frac{\partial C}{\partial S}\mu S+\frac{1}{2}\frac{\partial ^2C}{\partial S^2}\sigma^2 S^2)dt+\frac{\partial C}{\partial S}\sigma S dz ,

with the extra term ∂C/∂t because here C(S,t) also depends explicitly on t. For G to replicate C, dG must equal dC. Equality of the dz terms means that u = ∂C/∂S. Consequently, vB = uS – G = (∂C/∂S)S – C. Equality of the dt terms means that

\frac{\partial C}{\partial S}\mu S-(\frac{\partial C}{\partial S}S-C)r=\frac{\partial C}{\partial t}+\frac{\partial C}{\partial S}\mu S+\frac{1}{2}\frac{\partial ^2C}{\partial S^2}\sigma^2 S^2.

Canceling the \mu S terms yields the celebrated Black-Scholes differential equation for the value of the call option:

(7)     \frac{\partial C}{\partial t}+\frac{\partial C}{\partial S}rS+\frac{1}{2}\frac{\partial ^2C}{\partial S^2}\sigma^2 S^2=rC.

Here again the interesting feature is the appearance of the second derivative of C, multiplied by the squared volatility \sigma^2. By great good fortune, it happens that for r and \sigma constant, this differential equation has an exact, analytic solution, although the formula is a bit complicated (google “Black-Scholes” and see for yourself). It was discovered because it is essentially the same as the solution to the heat equation in physics. The main drawback is that the volatility \sigma is hard to estimate. For variable interest rates r, relatively easy to estimate by the prices of short- and long-term bonds, one can solve the differential equation numerically.
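For readers who would rather see the formula than google it, here is a sketch of the standard closed-form value of a European call on a stock paying no dividends, together with a simulated run of the replicating portfolio G = uS − vB with u = ∂C/∂S. The stock price $91, strike $95, and six months come from the Sears example; the interest rate, volatility, growth rate, step count, and all function names are assumptions made up for illustration.

```python
import math
import random

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def d1(S, K, r, sigma, tau):
    """Helper term of the closed-form solution; tau is the time left to expiry."""
    return (math.log(S / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))

def bs_call(S, K, r, sigma, tau):
    """Call value C for stock price S, strike K, and time tau left to expiry."""
    if tau <= 0.0:
        return max(S - K, 0.0)
    x = d1(S, K, r, sigma, tau)
    return S * norm_cdf(x) - K * math.exp(-r * tau) * norm_cdf(x - sigma * math.sqrt(tau))

def bs_delta(S, K, r, sigma, tau):
    """u = dC/dS, the number of shares held in the replicating portfolio."""
    return norm_cdf(d1(S, K, r, sigma, tau))

def replicate(S0, K, r, sigma, mu, T, n_steps, rng):
    """Run G = uS - vB along one simulated stock path; return (final G, call payoff)."""
    dt = T / n_steps
    S = S0
    G = bs_call(S0, K, r, sigma, T)           # start with the price of the call
    for i in range(n_steps):
        tau = T - i * dt
        u = bs_delta(S, K, r, sigma, tau)     # rebalance the stock holding
        debt = u * S - G                      # vB = uS - G, borrowed from the bank
        S += S * (mu * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0))  # step of (6)
        G = u * S - debt * math.exp(r * dt)   # the debt grows at the risk-free rate (5)
    return G, max(S - K, 0.0)

# Sears example: stock at $91, right to buy at $95 in six months.
# r, sigma, and mu below are hypothetical values, not taken from the text.
K, r, sigma, mu, T = 95.0, 0.05, 0.30, 0.12, 0.5
price = bs_call(91.0, K, r, sigma, T)
G_T, payoff = replicate(91.0, K, r, sigma, mu, T, 2000, random.Random(4))

print("call value per share:", price)
print("replicating portfolio at expiry:", G_T, "call payoff:", payoff)
```

The portfolio should end close to the call's payoff, with a small gap because rebalancing happens at discrete times rather than continuously. Note that mu enters the simulated path but not the price, reflecting the cancellation of the \mu S terms in the derivation.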

Merton’s landmark paper after Black and Scholes appeared in 1973. In 1994 Merton, Scholes, and others started a hedge fund, Long-Term Capital Management (LTCM), which was soon earning 40% a year. In 1997 Merton and Scholes won the Nobel Prize in Economics for their work (and Black received posthumous recognition). The very next year the LTCM fund crashed, losing $4.6 billion. In an extraordinary move, the Federal Reserve intervened to rescue the fund and prevent international financial repercussions.