Build the Heston Model from scratch in Python — Part III : Monte Carlo Pricing

9 min readApr 13, 2021

In this section, we review pricing of complex/exotic options using a Monte Carlo implementation of the Heston model.

We have seen how to calibrate the Heston model in the previous article. We will use the same parameters we obtained from that section.

I’ll use the simplest scheme available but it should serve as a template to more advanced schemes. The idea is to get the ball rolling and use what we do here as a template for more efficient schemes.

This section is a key component of options pricing. Many of the options today are in fact priced using Monte Carlo. This is because there are no clean pricing functions that exist for more complex options.

Black Scholes SDE
Heston SDE
Discretization Schemes
Intuition behind the need for Monte Carlo
Euler Scheme
Implementation in Python
Full Code

Black Scholes SDE

The assumption in the Black Scholes model is that the underlying asset moving through time t, which we call 𝑆𝑡 is driven by the stochastic differential equation:

In very basic terms, you can think of 𝑑𝑆𝑡 as the derivative of 𝑆𝑡. This means that the changes in 𝑆𝑡 are governed by the equation above.

The 𝜇(𝑆𝑡,𝑡)𝑑𝑡 is dependent on dt which is the change in time. That is 𝜇(𝑆𝑡,𝑡) can be a function or a number that changes with t . This is a function that whose output we are certain of given a specific input. This is called the deterministic part of this equation. There is no randomness within the function.

To make this even more explicit, in the Black Scholes model, the 𝜇(𝑆𝑡,𝑡) function is actually the risk free rate we defined as r multiplied by 𝑆𝑡.

So essentially we can replace 𝜇(𝑆𝑡,𝑡) with 𝑟𝑆𝑡 to have a bit of simplicity.

On the other hand, 𝜎(𝑆𝑡,𝑡)𝑑𝑊𝑡 is dependent on 𝑑𝑊𝑡. In very simplified terms, 𝑑𝑊𝑡 is a normal(gaussian) distribution whose mean is 0 and whose variance changes with every step.

Also, there is no correlation between 𝑑𝑊𝑡 and 𝑑𝑊t+dt when we move from time 𝑡 to 𝑡+𝑑𝑡. This means that the gaussians are independent across time.

Note that we can also define 𝜎(𝑆𝑡,𝑡) as 𝜎.𝑆𝑡 in the Black Scholes model. Where 𝜎 is what we refer to as the implicit volatility.

So in reality, our equation is defined as:

You can see from this that the 𝑆𝑡+𝑑𝑡−𝑆𝑡 is essentially the first representation in an integral.

Heston SDE

If you have worked with the Black Scholes model, you know that the implicit volatility is a key component in pricing options.

However, under the Black Scholes model the assumption is made that this volatility does not have variations caused by other market effects.(You can see that 𝜎(𝑆𝑡,𝑡) is just a value 𝜎 multiplied by 𝑆𝑡.)

However, we know that volatility actually changes depending on market conditions. This is the primary premise behind the Heston model.

The assumption is made that the volatility moves in a different fashion from the market. Typically, what we see in equities is that the volatility of options increases as prices decrease and vice versa.

This means that the prices and volatilities are negatively correlated. You can observe this by heuristically by comparing the VIX to the S&P 500.

The highest points on the VIX are the 2008 crisis and the 2020 Covid pandemic.

So if this is the case, then the 𝜎(𝑆𝑡,𝑡) function is not deterministic. It is in fact random and we can therefore assign another stochastic differential equation to the volatility to have two equations.

We begin by noting that instead of having an equation on 𝜎(𝑆𝑡,𝑡), we can have an equation on:

which we can equate intuitively to the variance.

So our final equations will be:

and because there’s a relationship between the volatility and the underlying, we can set a correlation parameter between the two equations.

This can only be done between the 𝑑𝑊𝑡 elements because they’re the only sources of randomness in our model. As we said, the other parts are deterministic.

Therefore:

And here we see our parameters appear: {𝜎,𝜃,𝜅,𝜈, ρ}.

Discretization Schemes

In this section, we implement the integral SDE in a discrete form.

Much like we did with the Heston probability function, we intend to break down the integral into manageable discrete pieces that we can sum together to obtain the modeled underlying level at the expiry of the option.

Because of the randomness produced by the two 𝑑𝑊𝑡, we will generate a large number of trajectories and calculate the mean.

This mean should represent our expectation of the future price of the underlying. This reasoning can be intuited from the pricing formulas where we have probabilities as elements of the pricing function.

Intuiting the need for Monte Carlo Pricing

We start with the payoff of a call option. At expiry, we know that the call will either give us 0 if we’re below the strike or 𝑆𝑇−𝐾. Here 𝑇 represents the maturity.

So in effect, the value of our option today is the expected value of 𝑚𝑎𝑥(𝑆𝑇−𝐾,0) discounted by the risk free rate because we’re in a risk neutral setup.

So our call should be worth: 𝐸[𝑚𝑎𝑥(𝑆𝑇−𝐾,0)] discounted by the rate r where 𝐸 is just the sign for the expectation.

But regardless of the probability, if 𝑆𝑇−𝐾 is negative, our option is worth 0.

So we can just focus on the scenario where 𝑆𝑇−𝐾. This leaves us with 𝐸[𝑆𝑇−𝐾] and we already know 𝐾 so our option, heuristically should be worth (𝐸[𝑆𝑇]−𝐾) discounted by r.

Now if we generate enough random 𝑆𝑇 using our two random equations, the mean of all these values should give us 𝐸[𝑆𝑇]. The details and proofs come after the intuition but this is the general framework.

So for a complex payoff, we can replace the 𝑚𝑎𝑥 function with a generic 𝑓 function which takes ST and K as inputs.

We can try and find E[𝑓(𝑆𝑇, 𝐾)] by essentially simulating possible values of 𝑆𝑇 and applying the function 𝑓(𝑆T, K) to each scenario created.

The mean of these should give us an approximation of the risk neutral value of the option.

Euler Scheme

We start off with our 2 equations in integral form:

Because the first equation needs the 𝛼(S,u) as an input, we start by simulating the 𝛼t before filling the value we obtain into the equation of 𝑆𝑡+𝑑𝑡

Because 𝑑𝑡 is so small, we can essentially remove the integral from the expression:

to remain with: 𝜃(𝜅−𝛼𝑢)𝑑𝑡

Similarly, because of some special properties of 𝑑𝑊𝑡, we can approximate the second integral:

We can see that the leap is not too big. However, providing these proofs is far from trivial.

Because of 𝑑𝑊𝑡 once again, we can show 𝑑𝑊𝑡+𝑑𝑡−𝑑𝑊𝑡 to be a normal distribution of mean 0 and variance 𝑑𝑡.

So in essence:

where Z is just the normal distribution.

Approximation one (volatility equation)

So our first approximation is:

Approximation two (Equity equation)

The second approximation is:

Implementation in Python

To implement this, we will do the following:

Determine the number of time steps we want in each year eg dt could be daily,weekly, monthly or yearly
Determine the number of iterations we want e.g we could generate 𝑆𝑇 1000, 10000 or even a million times
Declare all the gaussians we will use beforehand. This helps speed up the code. In this case, we have X number of timesteps which is ourtimesteps x maturityand Y the number of iterations.
For each timestep, we calculate all 𝑣𝑡 at the same time. The numpy package allows us to do vectorized calculations quickly.
We use all the 𝑣𝑡 we obtain to calculate 𝑆𝑡
We subtract the mean of all 𝑆𝑇 from 𝐾 and discount it at the rate of r on a continuous basis to get the present value of our option.

Determine the number of time steps and iterations

We define the time steps we want to have per year. We set a single timestep per month. So the total number of timesteps will be the product of timeStepsPerYear and the maturity. You are free to play with these as you wish.

timeStepsPerYear = 12
iterations = 1000000
timesteps = T * timeStepsPerYear
dt = 1/timeStepsPerYear

Next we define containers to hold all the values of ST and 𝛼. So these will be of size (iterations, timesteps). The initial values of ST will be the stock price today and the initial value of 𝛼 should be the parameter 𝜎.

# Define the containers to hold values of St and Vt
S_t = np.zeros((timesteps, iterations))
V_t = np.zeros((timesteps, iterations))# Assign first value of all Vt to sigma
V_t[0,:] = sigma
S_t[0, :] = St

Declare correlated gaussians

With ρ as our correlation, we declare a correlated set of gaussians using the multivariate_normal function in numpy.

# Use Cholesky decomposition to
means = [0,0]
stdevs = [1, 1]
covs = [[stdevs[0]**2          , stdevs[0]*stdevs[1]*rho], 
            [stdevs[0]*stdevs[1]*rho,           stdevs[1]**2]]Z = np.random.multivariate_normal(means, covs, 
                 (iterations, timesteps)).T
Z1 = Z[0]
Z2 = Z[1]

Calculate St and Vt for all timesteps

We apply our discretization schemes. Numpy allows us to run all the computations for each iterations at one time. The result is that we only have one loop to run.

As you can see, each V_t[i,:]is dependent on the previous value V_t[i-1, :] and the same applies to S_t[i,:]. Similarly, we see that the values V_t are calculated first then plugged into S_t.

for i in range(1, timesteps):
    # Use Z2 to calculate Vt
    V_t[i,:] = np.maximum(V_t[i-1,:] + 
                kappa * (theta - V_t[i-1,:])* dt + 
                volvol *  np.sqrt(V_t[i-1,:] * dt) * Z2[i,:],0)
        
    # Use all V_t calculated to find the value of S_t
    S_t[i,:] = S_t[i-1,:] + r * S_t[i,:] * dt +
                 np.sqrt(V_t[i,:] * dt) * S_t[i-1,:] * Z1[i,:]

Full Code

Check out GitHub for jupyter notebooks and .py files that you can use directly!

# implementation of MC
import numpy as np
def MCHeston(St, K, r, T, 
              sigma, kappa, theta, volvol, rho, 
              iterations, timeStepsPerYear):
    timeStepsPerYear = 12
    iterations = 1000000
    timesteps = T * timeStepsPerYear
    dt = 1/timeStepsPerYear    # Define the containers to hold values of St and Vt
    S_t = np.zeros((timesteps, iterations))
    V_t = np.zeros((timesteps, iterations))    # Assign first value of all Vt to sigma
    V_t[0,:] = sigma
    S_t[0, :] = St    # Use Cholesky decomposition to
    means = [0,0]
    stdevs = [1/3, 1/3]
    covs = [[stdevs[0]**2          , stdevs[0]*stdevs[1]*rho], 
            [stdevs[0]*stdevs[1]*rho,           stdevs[1]**2]]    Z = np.random.multivariate_normal(means, 
                    covs, (iterations, timesteps)).T
    Z1 = Z[0]
    Z2 = Z[1]    for i in range(1, timesteps):
        # Use Z2 to calculate Vt
        V_t[i,:] = np.maximum(V_t[i-1,:] + 
                kappa * (theta - V_t[i-1,:])* dt + 
                volvol *  np.sqrt(V_t[i-1,:] * dt) * Z2[i,:],0)
        
        # Use all V_t calculated to find the value of S_t
        S_t[i,:] = S_t[i-1,:] + r * S_t[i,:] * dt +
                 np.sqrt(V_t[i,:] * dt) * S_t[i-1,:] * Z1[i,:]    return np.mean(S_t[timesteps-1, :]- K)