Some basic and advanced examples
At the moment, the Python Quant Platform comprises the following components and features:
rpy2 and IPython Notebook
In the left panel of the platform, the current working path is shown in black, and the current folder and file structure is shown as purple links. Note that this panel displays only IPython Notebook files. You can navigate the folder structure by clicking on a link. Clicking on the double dots ".." takes you one level up in the structure. Clicking the refresh button right next to the double dots updates the folder/file structure. Clicking on a file link opens the IPython Notebook file.
You will find a link to open a new notebook at the top of the left panel. With IPython notebooks, like this one, you can code Python interactively and do data/financial analytics.
print ("Hello Quant World.")
Hello Quant World.
# simple calculations
3 + 4 * 2
11
# working with NumPy arrays
import numpy as np
rn = np.random.standard_normal(100)
rn[:10]
array([ 1.58301042, -0.95741018, -0.05121091, -0.13169825, -0.35603553, 1.3760845 , 1.47297158, 0.87653332, 0.04641067, -0.23808384])
# plotting
import matplotlib.pyplot as plt
%matplotlib inline
plt.plot(rn.cumsum())
plt.grid(True)
IPython Notebook as a system shell.
!ls -n
total 864
-rw-r-----@ 1 501 20 440610 Mar 1 10:14 Python_in_the_Browser.ipynb
drwxr-xr-x@ 8 501 20 272 Mar 1 09:06 Python_in_the_Browser.key
!mkdir test
!ls
Python_in_the_Browser.ipynb test Python_in_the_Browser.key
!rm -r test
IPython Notebook as a media integrator. Here is a talk by Yves about "Interactive Analytics of Large Financial Data Sets" with Python & IPython.
from IPython.display import YouTubeVideo
YouTubeVideo(id="XyqlduIcc2g", width=700, height=400)
Combining the pandas library with IPython Notebook makes for a powerful financial analytics environment.
import pandas as pd
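import pandas_datareader.data as web
# pandas.io.data was removed from pandas; pandas-datareader is its successor
AAPL = web.DataReader('AAPL', data_source='google')
# reads data from Google Finance (this source has since been discontinued;
# current pandas-datareader versions need another supported source, e.g. 'stooq')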
AAPL['42d'] = AAPL['Close'].rolling(window=42).mean()
AAPL['252d'] = AAPL['Close'].rolling(window=252).mean()
# 42d and 252d trends
AAPL[['Close', '42d', '252d']].plot(figsize=(10, 5))
<matplotlib.axes._subplots.AxesSubplot at 0x107ce4490>
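The trend calculation can also be tried without an Internet connection. The following is a minimal sketch that assumes a synthetic price series (the random walk, the seed and the name `prices` are illustrative assumptions, not part of the original analysis) and applies the same 42-day and 252-day rolling means.

```python
import numpy as np
import pandas as pd

# Synthetic daily closing prices (an assumption standing in for downloaded data)
rng = np.random.default_rng(0)
dates = pd.date_range('2014-01-01', periods=300, freq='B')
close = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.01, len(dates)))),
                  index=dates, name='Close')
prices = close.to_frame()

# Rolling means over the last 42 and 252 observations
prices['42d'] = prices['Close'].rolling(window=42).mean()
prices['252d'] = prices['Close'].rolling(window=252).mean()

# The first (window - 1) values are NaN because the window is not yet full
print(prices[['Close', '42d', '252d']].tail())
```

With 300 observations, the 252-day trend only becomes available for the last 49 rows, which is why longer histories are preferable for this kind of plot.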
We analyze the statistical correlation between the EURO STOXX 50 stock index and the VSTOXX volatility index.
First the EURO STOXX 50 data.
import pandas as pd
cols = ['Date', 'SX5P', 'SX5E', 'SXXP', 'SXXE',
        'SXXF', 'SXXA', 'DK5F', 'DKXF', 'DEL']
es_url = 'http://www.stoxx.com/download/historical_values/hbrbcpe.txt'
try:
    es = pd.read_csv(es_url,  # filename
                     header=None,  # ignore column names
                     index_col=0,  # index column (dates)
                     parse_dates=True,  # parse these dates
                     dayfirst=True,  # format of dates
                     skiprows=4,  # ignore these rows
                     sep=';',  # data separator
                     names=cols)  # use these column names
    # deleting the helper column
    del es['DEL']
except:
    # read stored data if there is no Internet connection
    es = pd.HDFStore('data/SX5E.h5', 'r')['SX5E']
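To see what the individual read_csv options do, the same call can be applied to a small in-memory sample. This is only a sketch: the sample text, the column names A and B and the two header lines are made up for illustration and do not reproduce the real file.

```python
import io
import pandas as pd

# Made-up sample: two header lines, semicolon-separated, day-first dates,
# and a trailing semicolon that creates an empty helper column
raw = """some header line
another header line
31.12.1998;100.0;200.0;
04.01.1999;101.5;198.0;
05.01.1999;102.0;199.5;
"""

sample_cols = ['Date', 'A', 'B', 'DEL']
sample = pd.read_csv(io.StringIO(raw),
                     header=None,       # the file has no usable column names
                     index_col=0,       # first column becomes the index
                     parse_dates=True,  # parse the index as dates
                     dayfirst=True,     # 04.01.1999 is 4 January, not 1 April
                     skiprows=2,        # skip the two header lines
                     sep=';',           # data separator
                     names=sample_cols) # use these column names
del sample['DEL']                       # drop the empty helper column
print(sample)
```

The helper column exists only because each data line ends with a separator; deleting it leaves the actual data columns.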
Second, the VSTOXX data.
vs_url = 'http://www.stoxx.com/download/historical_values/h_vstoxx.txt'
try:
    vs = pd.read_csv(vs_url,  # filename
                     index_col=0,  # index column (dates)
                     parse_dates=True,  # parse date information
                     dayfirst=True,  # day before month
                     header=2)  # header/column names
except:
    # read stored data if there is no Internet connection
    vs = pd.HDFStore('data/V2TX.h5', 'r')['V2TX']
Bridging to R from within IPython Notebook and pushing Python data to the R run-time.
%load_ext rpy2.ipython
import numpy as np
# log returns for the major indices' time series data
datv = pd.DataFrame({'SX5E' : es['SX5E'], 'V2TX': vs['V2TX']}).dropna()
rets = np.log(datv / datv.shift(1)).dropna()
ES = rets['SX5E'].values
VS = rets['V2TX'].values
%Rpush ES VS
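The log return calculation used above can be checked on a handful of numbers. The following sketch uses made-up index levels (an assumption for illustration) and shows one convenient property: log returns add up to the log of the total growth factor.

```python
import numpy as np
import pandas as pd

# Made-up index levels, for illustration only
levels = pd.Series([100.0, 102.0, 99.0, 101.0])

# Log return at t is log(level_t / level_{t-1}); the first value is NaN and dropped
log_rets = np.log(levels / levels.shift(1)).dropna()

# Summing log returns gives the log of (last level / first level)
total = log_rets.sum()
print(total, np.log(levels.iloc[-1] / levels.iloc[0]))
```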
Plotting with R in IPython Notebook.
%R plot(ES, VS, pch=19, col='blue'); grid(); title("Log returns ES50 & VSTOXX")
Linear regression with R.
%R c = coef(lm(VS~ES))
<FloatVector - Python:0x10c0b1638 / R:0x109846f50> [0.000005, -2.787833]
%R plot(ES, VS, pch=19, col='blue'); grid(); abline(c, col='red', lwd=5)
Pulling data from R to Python
%Rpull c
import matplotlib.pyplot as plt
%matplotlib inline
plt.figure(figsize=(9, 6))
plt.plot(ES, VS, 'b.')
plt.plot(ES, c[0] + c[1] * ES, 'r', lw=3)
plt.grid(); plt.xlabel('ES'); plt.ylabel('VS')
<matplotlib.text.Text at 0x10c72e1d0>
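The same kind of regression can also be run in Python alone. The sketch below is an alternative to the R call, not the method used above: it assumes synthetic return data (the slope of -2.8 and the noise levels are illustrative assumptions) and uses np.polyfit, reordering the result to match the (intercept, slope) order of R's coef(lm(VS~ES)).

```python
import numpy as np

# Synthetic returns with a negative relationship (assumed values, not real data)
rng = np.random.default_rng(42)
es_ret = rng.normal(0.0, 0.015, 500)
vs_ret = -2.8 * es_ret + rng.normal(0.0, 0.03, 500)

# Degree-1 least-squares fit; polyfit returns [slope, intercept]
slope, intercept = np.polyfit(es_ret, vs_ret, deg=1)

# Reorder to (intercept, slope), the order R's coef() uses
coefs = np.array([intercept, slope])
print(coefs)
```

With the coefficients in this order, the regression line can be plotted exactly as before via coefs[0] + coefs[1] * es_ret.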
We now apply vector autoregression (VAR) to the financial time series data at hand.
First, we resample the data set to monthly frequency.
datf = datv.resample('1M').last()
# resampling to monthly data
# datf = datf / datf.iloc[0] * 100
# uncomment for normalized starting values
# datf = np.log(datf / datf.shift(1)).dropna()
# uncomment for log return based analysis
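A small synthetic example shows what the monthly resampling does. This is a sketch under assumptions: the business-day series is made up, and a MonthEnd offset object is passed instead of a frequency string so that it does not depend on which string aliases the installed pandas version accepts.

```python
import numpy as np
import pandas as pd

# Made-up business-day series covering three months
idx = pd.date_range('1999-01-01', '1999-03-31', freq='B')
daily = pd.DataFrame({'level': np.arange(len(idx), dtype=float)}, index=idx)

# Keep the last observation of each month; rows are labeled with the month end
monthly = daily.resample(pd.offsets.MonthEnd()).last()
print(monthly)
```

Note that the label is the calendar month end even when that day is not a trading day: 31 January 1999 was a Sunday, so the first row carries the value observed on the last business day of that month.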
The starting values of the time series data we use.
datf.head()
Date | SX5E | V2TX |
---|---|---|
1999-01-31 | 3547.15 | 37.6140 |
1999-02-28 | 3484.24 | 30.7952 |
1999-03-31 | 3559.86 | 25.2455 |
1999-04-30 | 3757.87 | 23.7370 |
1999-05-31 | 3629.46 | 24.9911 |
We use the VAR class of the statsmodels library.
from statsmodels.tsa.api import VAR
model = VAR(datf)
lags = 5
# number of lags used for fitting
results = model.fit(maxlags=lags, ic='bic')
# model fitted to data
The summary statistics of the model fit.
results.summary()
  Summary of Regression Results
==================================
Model:                         VAR
Method:                        OLS
Date:           Sun, 01, Mar, 2015
Time:                     10:14:55
--------------------------------------------------------------------
No. of Equations:         2.00000    BIC:                    13.0480
Nobs:                     193.000    HQIC:                   12.9876
Log likelihood:          -1791.05    FPE:                    419379.
AIC:                      12.9465    Det(Omega_mle):         406639.
--------------------------------------------------------------------
Results for equation SX5E
==========================================================================
             coefficient       std. error           t-stat            prob
--------------------------------------------------------------------------
const         106.324009        77.243632            1.376           0.170
L1.SX5E         0.972930         0.016956           57.379           0.000
L1.V2TX        -0.710521         1.463409           -0.486           0.628
==========================================================================
Results for equation V2TX
==========================================================================
             coefficient       std. error           t-stat            prob
--------------------------------------------------------------------------
const           3.066736         2.246539            1.365           0.174
L1.SX5E         0.000267         0.000493            0.542           0.589
L1.V2TX         0.838820         0.042562           19.708           0.000
==========================================================================
Correlation matrix of residuals
            SX5E      V2TX
SX5E    1.000000 -0.694541
V2TX   -0.694541  1.000000
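The summary reports only first-lag terms (L1), so the selected model has the form y_t = c + A y_{t-1} + e_t. As a rough illustration of what such a fit involves, the sketch below estimates this equation by ordinary least squares with NumPy only. It is not the statsmodels implementation; the function name fit_var1 and the synthetic data are assumptions made for the example.

```python
import numpy as np

def fit_var1(data):
    """OLS fit of y_t = c + A y_{t-1} + e_t for data of shape (T, k)."""
    y = data[1:]                                       # targets y_t
    X = np.column_stack([np.ones(len(y)), data[:-1]])  # regressors [1, y_{t-1}]
    B, *_ = np.linalg.lstsq(X, y, rcond=None)          # shape (k + 1, k)
    c = B[0]                                           # constants, one per equation
    A = B[1:].T                 # A[i, j]: effect of lagged variable j on equation i
    resid = y - X @ B                                  # residuals e_t
    return c, A, resid

# Synthetic stationary two-variable process with known (assumed) parameters
rng = np.random.default_rng(1)
A_true = np.array([[0.9, 0.0], [0.1, 0.5]])
c_true = np.array([1.0, 0.5])
data = np.zeros((2000, 2))
for t in range(1, len(data)):
    data[t] = c_true + A_true @ data[t - 1] + rng.normal(0, 0.1, 2)

c_hat, A_hat, resid = fit_var1(data)
print(A_hat)                       # estimated lag matrix
print(np.corrcoef(resid.T))        # correlation matrix of residuals
```

Each row of A_hat corresponds to one "Results for equation" block in the summary, and the residual correlation matrix is the counterpart of its last table.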
Historical data and forecasts.
results.plot_forecast(50, figsize=(8, 8), offset='M')
# historical/input data and
# forecasts given model fit
Integrated simulation of the future development of the financial instrument prices/levels.
results.plotsim(paths=5, steps=50, prior=False,
                initial_values=datf.iloc[-1].values,
                figsize=(8, 8), offset='M')
# simulated paths given fitted model
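The idea behind such simulated paths can be sketched with NumPy alone. The function simulate_var1 below is a hypothetical helper, not the plotting routine used above. The constants and lag coefficients are rounded from the summary of the model fit; the residual standard deviations (150 and 4) and the starting values are assumptions chosen only to make the example run, combined with the reported residual correlation of about -0.69.

```python
import numpy as np

def simulate_var1(c, A, cov, y0, steps, paths, seed=None):
    """Simulate paths of y_t = c + A y_{t-1} + e_t with e_t ~ N(0, cov)."""
    rng = np.random.default_rng(seed)
    k = len(y0)
    out = np.empty((paths, steps + 1, k))
    out[:, 0] = y0                                    # all paths share the start
    shocks = rng.multivariate_normal(np.zeros(k), cov, size=(paths, steps))
    for t in range(steps):
        # row-wise version of A @ y_{t} for every path at once
        out[:, t + 1] = c + out[:, t] @ A.T + shocks[:, t]
    return out

# Coefficients rounded from the summary; covariance and start are assumptions
c = np.array([106.32, 3.07])
A = np.array([[0.973, -0.711],
              [0.000267, 0.839]])
sd = np.array([150.0, 4.0])                           # assumed residual std. devs
corr = np.array([[1.0, -0.69], [-0.69, 1.0]])
cov = corr * np.outer(sd, sd)
y0 = np.array([3500.0, 25.0])                         # assumed starting values

sims = simulate_var1(c, A, cov, y0, steps=50, paths=5, seed=0)
print(sims.shape)                                     # (paths, steps + 1, variables)
```

Drawing the shocks jointly from a correlated normal distribution is what makes the simulation "integrated": index and volatility paths move against each other, as the negative residual correlation suggests.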
Technically, the platform can be deployed in almost any Linux-based hardware environment:
Deployment is possible with or without Docker containers. White labeling is easily possible.
Recent, selected use cases of the platform.
Please contact us if you have any questions or want to get involved in our Python community events.
Python Quant Platform | http://quant-platform.com
Derivatives Analytics with Python | Derivatives Analytics @ Wiley Finance
Python for Finance | Python for Finance @ O'Reilly