A Monte Carlo simulation is a method that allows for the generation of future potential outcomes of a given event. In this case, we are trying to model the price pattern of a given stock or portfolio of assets a predefined amount of days into the future. With Python, R, and other programming languages, we can generate thousands of outcomes on the potential price pattern of a stock. We are much better off running Monte Carlo Simulations with programming languages rather than Excel. Excel simply does not provide the computational necessary to efficiently and quickly run thousands of trials. You could use Excel if you want, but it is extraordinarily inefficient.
Let’s get a better understanding of the background of Monte Carlo simulations specifically being applied to the stock market. The reality of the matter is that Monte Carlo simulations aren’t just used to generate future stock prices. In fact, they are often used to estimate risk! For example, Value at Risk is often calculated using a Monte Carlo method, where we would be using theoretical future data rather than historical data. We could include a method in the class we will build below to estimate Value at Risk over a given confidence interval with our generated stock prices. The key takeaway from Monte Carlo simulations is the fact that there is some sort of random variable involved. The stock market is a perfect application of a model that uses a type of Monte Carlo simulation due to the level of statistical noise within the markets. We are trying to model the probability of different outcomes, simple as that.
In the script I wrote, the intention was to design a class that could use different Monte Carlo methods, and different graphical outputs of the simulation with descriptive statistics as well. Note we can use a single ticker or portfolio of assets in the simulation. We also have multiple models we can use.
Model 1 – Simulation using daily volatility:
Pretty straightforward. The random “shock” element here will be determined by the historical volatility of a stock over a given timeframe.
Model 2 – Simulation using Brownian Motion:
Here, we are assuming the stock(s) will “drift” a certain amount over time. We can calculate this with the average daily return in our series as well as the variance of the returns. This post will not cover the background of the simulation techniques extensively. We are trying to focus on implementing the models in a program.
Disclaimer: You will find that most of the figures used here demonstrate different simulations of the same company. Spyder (the IDE I use) would only let me interact with one chart at a time.
To start, we need to import the various dependencies that will allow for this script to run as well as creating the class and initial variables we will need. In this case, the variables are the start and end date for the time series data for the asset or portfolio of assets.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
import pandas_datareader.data as web import pandas as pd import datetime import numpy as np import math from matplotlib import style import matplotlib.pyplot as plt import matplotlib.mlab as mlab #See Styles #print(plt.style.available) class monte_carlo: def __init__(self, start, end): self.start = start self.end = end |
Next, we need to obtain our data. We can do this using pandas’ remote data access feature. We will make two methods within our class, one for a single asset, and one for a portfolio of assets.
1 2 3 4 5 6 7 8 9 10 |
def get_asset(self, symbol): #Dates start = self.start end = self.end prices = web.DataReader(symbol, 'google',start, end)['Close'] returns = prices.pct_change() self.returns = returns self.prices = prices |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 |
def get_portfolio(self, symbols, weights): start = self.start end = self.end #Get Price Data df = web.DataReader(symbols, 'google',start, end)['Close'] #Percent Change returns = df.pct_change() returns += 1 #Define dollar amount in each asset port_val = returns * weights port_val['Portfolio Value'] = port_val.sum(axis=1) #Portfolio Dollar Values prices = port_val['Portfolio Value'] #Portfolio Returns returns = port_val['Portfolio Value'].pct_change() returns = returns.replace([np.inf, -np.inf], np.nan) self.returns = returns self.prices = prices |
Next, we can create two types of simulations here. We will create a basic simulation that only uses the asset’s daily volatility, while the other simulation will be using the concept of Geometric Brownian Motion.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 |
def monte_carlo_sim(self, num_simulations, predicted_days): returns = self.returns prices = self.prices last_price = prices[-1] simulation_df = pd.DataFrame() #Create Each Simulation as a Column in df for x in range(num_simulations): count = 0 daily_vol = returns.std() price_series = [] #Append Start Value price = last_price * (1 + np.random.normal(0, daily_vol)) price_series.append(price) #Series for Preditcted Days for i in range(predicted_days): if count == 251: break price = price_series[count] * (1 + np.random.normal(0, daily_vol)) price_series.append(price) count += 1 simulation_df[x] = price_series self.simulation_df = simulation_df self.predicted_days = predicted_days |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 |
def brownian_motion(self, num_simulations, predicted_days): returns = self.returns prices = self.prices last_price = prices[-1] #Note we are assuming drift here simulation_df = pd.DataFrame() #Create Each Simulation as a Column in df for x in range(num_simulations): #Inputs count = 0 avg_daily_ret = returns.mean() variance = returns.var() daily_vol = returns.std() daily_drift = avg_daily_ret - (variance/2) drift = daily_drift - 0.5 * daily_vol ** 2 #Append Start Value prices = [] shock = drift + daily_vol * np.random.normal() last_price * math.exp(shock) prices.append(last_price) for i in range(predicted_days): if count == 251: break shock = drift + daily_vol * np.random.normal() price = prices[count] * math.exp(shock) prices.append(price) count += 1 simulation_df[x] = prices self.simulation_df = simulation_df self.predicted_days = predicted_days |
Next, we want to visualize our simulations! We can create a line graph and histogram to visualize our data with Matplotlib.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
def line_graph(self): prices = self.prices predicted_days = self.predicted_days simulation_df = self.simulation_df last_price = prices[-1] fig = plt.figure() style.use('bmh') title = "Monte Carlo Simulation: " + str(predicted_days) + " Days" plt.plot(simulation_df) fig.suptitle(title,fontsize=18, fontweight='bold') plt.xlabel('Day') plt.ylabel('Price ($USD)') plt.grid(True,color='grey') plt.axhline(y=last_price, color='r', linestyle='-') plt.show() |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 |
def histogram(self): simulation_df = self.simulation_df ser = simulation_df.iloc[-1, :] x = ser mu = ser.mean() sigma = ser.std() num_bins = 20 # the histogram of the data n, bins, patches = plt.hist(x, num_bins, normed=1, facecolor='blue', alpha=0.5) # add a 'best fit' line y = mlab.normpdf(bins, mu, sigma) plt.plot(bins, y, 'r--') plt.xlabel('Price') plt.ylabel('Probability') plt.title(r'Histogram of Speculated Stock Prices', fontsize=18, fontweight='bold') # Tweak spacing to prevent clipping of ylabel plt.subplots_adjust(left=0.15) plt.show() |
For the heck of it, we can actually calculate our theoretical Value at Risk with the given portfolio and simulation outputs. The Value at Risk is simply the difference between the current price and specified price at a given confidence interval.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 |
def VaR(self): simulation_df = self.simulation_df prices = self.prices last_price = prices[-1] price_array = simulation_df.iloc[-1, :] price_array = sorted(price_array, key=int) var = np.percentile(price_array, 1) val_at_risk = last_price - var print "Value at Risk: ", val_at_risk #Histogram fit = stats.norm.pdf(price_array, np.mean(price_array), np.std(price_array)) plt.plot(price_array,fit,'-o') plt.hist(price_array,normed=True) plt.xlabel('Price') plt.ylabel('Probability') plt.title(r'Histogram of Speculated Stock Prices', fontsize=18, fontweight='bold') plt.axvline(x=var, color='r', linestyle='--', label='Price at Confidence Interval: ' + str(round(var, 2))) plt.axvline(x=last_price, color='k', linestyle='--', label = 'Current Stock Price: ' + str(round(last_price, 2))) plt.legend(loc="upper right") plt.show() |
Next, we want to analyze the outputted simulations. We can do this in a variety of ways.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 |
def key_stats(self): simulation_df = self.simulation_df print '#------------------Simulation Stats------------------#' count = 1 for column in simulation_df: print "Simulation", count, "Mean Price: ", simulation_df[column].mean() print "Simulation", count, "Median Price: ", simulation_df[column].median() count += 1 print '\n' print '#----------------------Last Price Stats--------------------#' print "Mean Price: ", np.mean(simulation_df.iloc[-1,:]) print "Maximum Price: ",np.max(simulation_df.iloc[-1,:]) print "Minimum Price: ", np.min(simulation_df.iloc[-1,:]) print "Standard Deviation: ",np.std(simulation_df.iloc[-1,:]) print '\n' print '#----------------------Descriptive Stats-------------------#' price_array = simulation_df.iloc[-1, :] print price_array.describe() print '\n' print '#--------------Annual Expected Returns for Trials-----------#' count = 1 future_returns = simulation_df.pct_change() for column in future_returns: print "Simulation", count, "Annual Expected Return", "{0:.2f}%".format((future_returns[column].mean() * 252) * 100) print "Simulation", count, "Total Return", "{0:.2f}%".format((future_returns[column].iloc[1] / future_returns[column].iloc[-1] - 1) * 100) count += 1 print '\n' #Create Column For Average Daily Price Across All Trials simulation_df['Average'] = simulation_df.mean(axis=1) ser = simulation_df['Average'] print '#----------------------Percentiles--------------------------------#' percentile_list = [5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95] for per in percentile_list: print "{}th Percentile: ".format(per), np.percentile(price_array, per) print '\n' print '#-----------------Calculate Probabilities-------------------------#' print "Probability price is between 30 and 40: ", "{0:.2f}%".format((float(len(price_array[(price_array> 30) & (price_array<- 40)])) / float(len(price_array)) * 100)) print "Probability price is > 45: ", "{0:.2f}%".format((float(len(price_array[price_array > 45])) / float(len(price_array)))* 100) |
Finally, to run the script:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
if __name__== "__main__": start = datetime.datetime(2017, 1, 3) end = datetime.datetime(2017, 10, 4) sim = monte_carlo(start, end) #symbols = ['AAPL', 'KO', 'HD', 'PM'] #weights = [1000,1000,2000,3000] #sim.get_portfolio(symbols, weights) sim.get_asset('AAPL') sim.brownian_motion(1000, 200) sim.line_graph() sim.VaR() |
Below are the statistics outputted for a simulation consisting of 10 trials with AT&T Stock (Ticker: T)
#——————Simulation Stats——————#
Simulation 1 Mean Price: 31.4382622414
Simulation 1 Median Price: 31.9897921901
Simulation 2 Mean Price: 35.7671671325
Simulation 2 Median Price: 35.9623238149
Simulation 3 Mean Price: 35.6989735203
Simulation 3 Median Price: 35.6337173263
Simulation 4 Mean Price: 33.1080272815
Simulation 4 Median Price: 32.4359524255
Simulation 5 Mean Price: 38.2088369838
Simulation 5 Median Price: 36.0823949931
Simulation 6 Mean Price: 34.8515405456
Simulation 6 Median Price: 34.7751886371
Simulation 7 Mean Price: 35.6820401892
Simulation 7 Median Price: 34.8281466864
Simulation 8 Mean Price: 39.682759584
Simulation 8 Median Price: 37.2205632288
Simulation 9 Mean Price: 35.5834596902
Simulation 9 Median Price: 35.5221844294
Simulation 10 Mean Price: 38.1580107637
Simulation 10 Median Price: 39.228093444
#———————-Last Price Stats——————–#
Mean Price: 36.5581328863
Maximum Price: 45.7485017495
Minimum Price: 25.9968452
Standard Deviation: 6.21986513961
#———————-Descriptive Stats——————-#
count 10.000000
mean 36.558133
std 6.556314
min 25.996845
25% 33.827175
50% 36.277416
75% 40.346060
max 45.748502
Name: 200, dtype: float64
#————–Annual Expected Returns for Trials———–#
Simulation 1 Annual Expected Return -34.42%
Simulation 1 Total Return 638.40%
Simulation 2 Annual Expected Return 15.94%
Simulation 2 Total Return 1508.27%
Simulation 3 Annual Expected Return 4.92%
Simulation 3 Total Return -275.24%
Simulation 4 Annual Expected Return -25.92%
Simulation 4 Total Return -205.23%
Simulation 5 Annual Expected Return 34.57%
Simulation 5 Total Return -471.27%
Simulation 6 Annual Expected Return 8.90%
Simulation 6 Total Return -31.48%
Simulation 7 Annual Expected Return 1.70%
Simulation 7 Total Return -111.73%
Simulation 8 Annual Expected Return 35.25%
Simulation 8 Total Return -898.73%
Simulation 9 Annual Expected Return -2.43%
Simulation 9 Total Return -100.06%
Simulation 10 Annual Expected Return 21.52%
Simulation 10 Total Return 101.22%
#———————-Percentiles——————————–#
5th Percentile: 26.8433098784
10th Percentile: 27.6897745569
15th Percentile: 29.8815727367
20th Percentile: 32.457751917
25th Percentile: 33.8271746117
30th Percentile: 34.2311921181
35th Percentile: 34.6550694547
40th Percentile: 35.1186664518
45th Percentile: 35.6054189947
50th Percentile: 36.2774159036
55th Percentile: 36.9494128126
60th Percentile: 37.9284581889
65th Percentile: 38.9458846236
70th Percentile: 39.7094402515
75th Percentile: 40.3460604761
80th Percentile: 41.6228406181
85th Percentile: 43.6998206569
90th Percentile: 45.3585747934
95th Percentile: 45.5535382715
#—————–Calculate Probabilities————————-#
Probability price is between 30 and 40: 50.00%
Probability price is > 45: 20.00%
Hi! This was a really good read! However, I am having a bit of trouble running it on sypder (python 3.6 ). I’m unsure what is wrong but there seems to be an issue with all the print statements for me. Any help would be greatly appreciated. Thanks again!
Email me some screenshots of the error messages you are getting and I can help you out!
Thank you for the explanation and the code! This is excellent. I have one doubt in the brownian motion function. In line 26 of the function, last_price * math.exp(shock) is not assigned to any variable. I was wondering if it is a part of the np.random.normal just before it or if it should be assigned to some other variable? It would be helpful if you could clarify this doubt. Thank you.