The Python Quants

Python in the Browser

Some basic and advanced examples

Dr. Yves J. Hilpisch

The Python Quants GmbH

analytics@pythonquants.com

www.pythonquants.com

Platform Components & Features

At the moment, the Python Quant Platform comprises the following components and features:

  • IPython Notebook: interactive data and financial analytics in the browser with full Python integration (cf. the IPython home page).
  • Anaconda Python Distribution: complete Python stack for financial, scientific and data analytics workflows and applications (cf. the Anaconda page); you can easily switch between Python 2.7 and 3.4.
  • R Stack: for statistical analyses, integrated via rpy2 and IPython Notebook.
  • DX Analytics: our Python library for advanced financial and derivatives analytics based on Monte Carlo simulation.
  • File Manager: a GUI-based file manager to upload, download, copy, remove and rename files on the platform.
  • Chat/Forum: a simple chat/forum application for sharing thoughts, documents and more.
  • Collaboration: user/group administration as well as file sharing via public folders.
  • Linux Server: the platform runs on Linux servers to which you have full shell access.
  • Deployment: the platform is cloud-based and therefore easily scalable; it can also be deployed on your own servers (via Docker containers).

PQP Overview

IPython Notebook

In the left panel of the platform, you find the current working path (in black) as well as the current folder and file structure (as purple links). Note that this panel displays only IPython Notebook files. You navigate the folder structure by clicking on a link. Clicking on the double dots ".." takes you one level up. Clicking the refresh button right next to the double dots updates the folder/file structure. Clicking on a file link opens the IPython Notebook file.

Basic Approach

You find a link to open a new notebook at the top of the left panel. With IPython notebooks, such as this one, you can code Python interactively and do data/financial analytics.

In [1]:
print("Hello Quant World.")
Hello Quant World.

In [2]:
# simple calculations
3 + 4 * 2
Out[2]:
11
In [3]:
# working with NumPy arrays
import numpy as np
rn = np.random.standard_normal(100)
rn[:10]
Out[3]:
array([ 1.58301042, -0.95741018, -0.05121091, -0.13169825, -0.35603553,
        1.3760845 ,  1.47297158,  0.87653332,  0.04641067, -0.23808384])
In [4]:
# plotting
import matplotlib.pyplot as plt
%matplotlib inline
plt.plot(rn.cumsum())
plt.grid(True)
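
The random numbers drawn above differ on every run. The following minimal sketch shows how the same random walk could be made reproducible. It assumes a NumPy version that provides `np.random.default_rng`; the seed value 42 and the names `rn_seeded`, `walk` and `walk_again` are arbitrary choices for illustration.

```python
import numpy as np

# A seeded generator yields the same draws on every run (seed chosen arbitrarily).
rng = np.random.default_rng(42)
rn_seeded = rng.standard_normal(100)

# Cumulative sum of the draws, i.e. the kind of random walk plotted above.
walk = rn_seeded.cumsum()

# Re-creating the generator with the same seed reproduces the walk exactly.
walk_again = np.random.default_rng(42).standard_normal(100).cumsum()
print(np.array_equal(walk, walk_again))
```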

IPython Notebook as a system shell.

In [5]:
!ls -n
total 864
-rw-r-----@ 1 501  20  440610 Mar  1 10:14 Python_in_the_Browser.ipynb
drwxr-xr-x@ 8 501  20     272 Mar  1 09:06 Python_in_the_Browser.key

In [6]:
!mkdir test
In [7]:
!ls
Python_in_the_Browser.ipynb test
Python_in_the_Browser.key

In [8]:
!rm -r test
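
The shell commands above can also be expressed in plain Python, which may be preferable when a notebook should not depend on a particular shell. This is a sketch using only the standard library; it works in a throwaway temporary directory, and the variable names are illustrative.

```python
import shutil
import tempfile
from pathlib import Path

# Work inside a throwaway directory so nothing in the notebook folder is touched.
base = Path(tempfile.mkdtemp())

test_dir = base / "test"
test_dir.mkdir()                                   # counterpart of `!mkdir test`
listing = sorted(p.name for p in base.iterdir())   # counterpart of `!ls`
print(listing)

shutil.rmtree(test_dir)                            # counterpart of `!rm -r test`
remaining = sorted(p.name for p in base.iterdir())
shutil.rmtree(base)                                # clean up the throwaway directory
```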

IPython Notebook as a media integrator. Here is a talk by Yves on "Interactive Analytics of Large Financial Data Sets" with Python & IPython.

In [9]:
from IPython.display import YouTubeVideo
In [10]:
YouTubeVideo(id="XyqlduIcc2g", width=700, height=400)
Out[10]:

Efficient Financial Analytics

Combining the pandas library with IPython Notebook makes for a powerful financial analytics environment.

In [11]:
import pandas as pd
import pandas.io.data as web
In [12]:
AAPL = web.DataReader('AAPL', data_source='google')
  # reads data from Google Finance
AAPL['42d'] = pd.rolling_mean(AAPL['Close'], 42)
AAPL['252d'] = pd.rolling_mean(AAPL['Close'], 252)
  # 42d and 252d trends
In [13]:
AAPL[['Close', '42d', '252d']].plot(figsize=(10, 5))
Out[13]:
<matplotlib.axes._subplots.AxesSubplot at 0x107ce4490>
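
The trend calculation can be tried without an Internet connection. The following is a minimal sketch on synthetic data; it assumes a pandas version that provides the `Series.rolling` method, and the price series is randomly generated, not market data.

```python
import numpy as np
import pandas as pd

# Synthetic closing prices (NOT market data): a seeded geometric random walk.
rng = np.random.default_rng(0)
dates = pd.date_range("2014-01-01", periods=300, freq="B")
close = pd.Series(100 * np.exp(rng.standard_normal(300).cumsum() * 0.01),
                  index=dates, name="Close")

prices = close.to_frame()
# Rolling means over 42 and 252 observations; the first window-1 values are NaN.
prices["42d"] = prices["Close"].rolling(window=42).mean()
prices["252d"] = prices["Close"].rolling(window=252).mean()

print(prices.tail())
```

A rolling mean is only defined once a full window of observations is available, which is why the 252d trend line starts about one trading year after the price series itself.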

Statistics with R

We analyze the statistical correlation between the EURO STOXX 50 stock index and the VSTOXX volatility index.

First, the EURO STOXX 50 data.

In [14]:
import pandas as pd
cols = ['Date', 'SX5P', 'SX5E', 'SXXP', 'SXXE',
        'SXXF', 'SXXA', 'DK5F', 'DKXF', 'DEL']
es_url = 'http://www.stoxx.com/download/historical_values/hbrbcpe.txt'
try:
    es = pd.read_csv(es_url,  # filename
                     header=None,  # ignore column names
                     index_col=0,  # index column (dates)
                     parse_dates=True,  # parse these dates
                     dayfirst=True,  # format of dates
                     skiprows=4,  # ignore these rows
                     sep=';',  # data separator
                     names=cols)  # use these column names

    # deleting the helper column
    del es['DEL']
except Exception:
    # read stored data if the download fails (e.g. no Internet connection)
    es = pd.HDFStore('data/SX5E.h5', 'r')['SX5E']
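
To make the `read_csv` parameters used above concrete, here is a small offline sketch. The sample text is invented for illustration and only mimics the layout that the parameters imply (four leading rows to skip, semicolon-separated values, day-first dates, a trailing separator that produces the helper column). It is not the content of the STOXX file, and `es_demo` is a hypothetical name.

```python
import io
import pandas as pd

cols = ['Date', 'SX5P', 'SX5E', 'SXXP', 'SXXE',
        'SXXF', 'SXXA', 'DK5F', 'DKXF', 'DEL']

# Invented sample in the assumed layout; the numbers are placeholders.
sample = """header line 1
header line 2
header line 3
header line 4
30.12.1998;1.0;2.0;3.0;4.0;5.0;6.0;7.0;8.0;
31.12.1998;1.5;2.5;3.5;4.5;5.5;6.5;7.5;8.5;
"""

es_demo = pd.read_csv(io.StringIO(sample),
                      header=None,       # do not take column names from the data
                      index_col=0,       # first column becomes the index
                      parse_dates=True,  # convert the index to timestamps
                      dayfirst=True,     # 30.12.1998 means 30 December
                      skiprows=4,        # skip the four leading text rows
                      sep=';',           # semicolon-separated values
                      names=cols)        # assign the column names

# The trailing semicolon creates an empty helper column, which is dropped.
del es_demo['DEL']
print(es_demo)
```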

Second, the VSTOXX data.

In [15]:
vs_url = 'http://www.stoxx.com/download/historical_values/h_vstoxx.txt'

try:
    vs = pd.read_csv(vs_url,  # filename
                     index_col=0,  # index column (dates)
                     parse_dates=True,  # parse date information
                     dayfirst=True, # day before month
                     header=2)  # header/column names
except Exception:
    # read stored data if the download fails (e.g. no Internet connection)
    vs = pd.HDFStore('data/V2TX.h5', 'r')['V2TX']

Bridging to R from within IPython Notebook and pushing Python data to the R run-time.

In [16]:
%load_ext rpy2.ipython
In [17]:
import numpy as np
# log returns for the major indices' time series data
datv = pd.DataFrame({'SX5E' : es['SX5E'], 'V2TX': vs['V2TX']}).dropna()
rets = np.log(datv / datv.shift(1)).dropna()
ES = rets['SX5E'].values
VS = rets['V2TX'].values
In [18]:
%Rpush ES VS
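
The expression `np.log(datv / datv.shift(1))` above computes log returns, i.e. the natural logarithm of the ratio of consecutive index levels. A minimal worked sketch of the same calculation with plain NumPy follows; the levels are hypothetical numbers chosen only for illustration.

```python
import numpy as np

# Hypothetical index levels, chosen only to illustrate the calculation.
levels = np.array([100.0, 102.0, 99.0, 103.5])

# Log return between consecutive observations: ln(p_t / p_{t-1}).
log_rets = np.log(levels[1:] / levels[:-1])

# Log returns are additive: their sum equals the log return over the whole period.
total = np.log(levels[-1] / levels[0])
print(log_rets, log_rets.sum(), total)
```

The first observation has no predecessor, which is why the notebook cell calls `dropna()` after the shift.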

Plotting with R in IPython Notebook.

In [19]:
%R plot(ES, VS, pch=19, col='blue'); grid(); title("Log returns ES50 & VSTOXX")

Linear regression with R.

In [20]:
%R c = coef(lm(VS~ES))
Out[20]:
<FloatVector - Python:0x10c0b1638 / R:0x109846f50>
[0.000005, -2.787833]
In [21]:
%R plot(ES, VS, pch=19, col='blue'); grid(); abline(c, col='red', lwd=5)

Pulling data from R to Python.

In [22]:
%Rpull c
In [23]:
import matplotlib.pyplot as plt
%matplotlib inline
plt.figure(figsize=(9, 6))
plt.plot(ES, VS, 'b.')
plt.plot(ES, c[0] + c[1] * ES, 'r', lw=3)
plt.grid(); plt.xlabel('ES'); plt.ylabel('VS')
Out[23]:
<matplotlib.text.Text at 0x10c72e1d0>
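
For comparison, a simple linear regression of this kind can also be estimated directly in Python. The sketch below uses `np.polyfit` on synthetic data (not the index returns), so the resulting numbers are unrelated to the R output above. Note the different ordering of the coefficients.

```python
import numpy as np

# Synthetic data (not the index returns): y depends linearly on x plus noise.
rng = np.random.default_rng(1)
x = rng.standard_normal(500) * 0.01
y = 0.001 - 2.5 * x + rng.standard_normal(500) * 0.001

# np.polyfit returns the highest-degree coefficient first: (slope, intercept).
slope, intercept = np.polyfit(x, y, deg=1)

# R's coef(lm(y ~ x)) uses the opposite order: (intercept, slope).
coef_r_order = np.array([intercept, slope])
fitted = coef_r_order[0] + coef_r_order[1] * x
print(coef_r_order)
```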

Vector Auto Regression

Let us apply vector autoregression (VAR) to the financial time series data we have:

  • EURO STOXX 50
  • VSTOXX

Let us resample the data set.

In [24]:
datf = datv.resample('1M', how='last')
  # resampling to monthly data
# datf = datf / datf.iloc[0] * 100
  # uncomment for normalized starting values
# datf = np.log(datf / datf.shift(1)).dropna()
  # uncomment for log return based analysis

The starting values of the time series data we use.

In [25]:
datf.head()
Out[25]:
               SX5E     V2TX
Date
1999-01-31  3547.15  37.6140
1999-02-28  3484.24  30.7952
1999-03-31  3559.86  25.2455
1999-04-30  3757.87  23.7370
1999-05-31  3629.46  24.9911
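
The two commented-out alternatives in the resampling cell above transform the monthly levels before model fitting. The following sketch applies both to the first three rows displayed above. The name `datf_demo` is hypothetical, and `.iloc` is used for positional row access, which current pandas versions provide.

```python
import numpy as np
import pandas as pd

# First three monthly rows as displayed by datf.head() in this notebook.
datf_demo = pd.DataFrame(
    {'SX5E': [3547.15, 3484.24, 3559.86],
     'V2TX': [37.6140, 30.7952, 25.2455]},
    index=pd.to_datetime(['1999-01-31', '1999-02-28', '1999-03-31']))

# Option 1: rescale both series so that each starts at 100.
normalized = datf_demo / datf_demo.iloc[0] * 100

# Option 2: monthly log returns; the first row has no predecessor and is dropped.
log_rets = np.log(datf_demo / datf_demo.shift(1)).dropna()

print(normalized.round(2))
print(log_rets.round(4))
```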

We use the VAR class of the statsmodels library.

In [26]:
from statsmodels.tsa.api import VAR
model = VAR(datf)
In [27]:
lags = 5
  # number of lags used for fitting
results = model.fit(maxlags=lags, ic='bic')
  # model fitted to data

The summary statistics of the model fit.

In [28]:
results.summary()
Out[28]:
  Summary of Regression Results   
==================================
Model:                         VAR
Method:                        OLS
Date:           Sun, 01, Mar, 2015
Time:                     10:14:55
--------------------------------------------------------------------
No. of Equations:         2.00000    BIC:                    13.0480
Nobs:                     193.000    HQIC:                   12.9876
Log likelihood:          -1791.05    FPE:                    419379.
AIC:                      12.9465    Det(Omega_mle):         406639.
--------------------------------------------------------------------
Results for equation SX5E
==========================================================================
             coefficient       std. error           t-stat            prob
--------------------------------------------------------------------------
const         106.324009        77.243632            1.376           0.170
L1.SX5E         0.972930         0.016956           57.379           0.000
L1.V2TX        -0.710521         1.463409           -0.486           0.628
==========================================================================

Results for equation V2TX
==========================================================================
             coefficient       std. error           t-stat            prob
--------------------------------------------------------------------------
const           3.066736         2.246539            1.365           0.174
L1.SX5E         0.000267         0.000493            0.542           0.589
L1.V2TX         0.838820         0.042562           19.708           0.000
==========================================================================

Correlation matrix of residuals
            SX5E      V2TX
SX5E    1.000000 -0.694541
V2TX   -0.694541  1.000000
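
Each equation in the summary contains a constant and one lag of each variable, i.e. the fitted model is a VAR(1) estimated by OLS. To illustrate what such an estimation involves, the following sketch simulates a VAR(1) process and recovers its coefficients with one least-squares fit per equation. It is not the statsmodels implementation; the parameters `c_true` and `A_true`, the noise scale and the seed are hypothetical values chosen for illustration.

```python
import numpy as np

# Hypothetical VAR(1) parameters, chosen for illustration only.
c_true = np.array([1.0, 2.0])
A_true = np.array([[0.9, 0.0],
                   [0.1, 0.5]])

# Simulate y_t = c + A y_{t-1} + e_t with seeded Gaussian noise.
rng = np.random.default_rng(0)
n = 5000
y = np.empty((n, 2))
y[0] = np.linalg.solve(np.eye(2) - A_true, c_true)  # start at the long-run mean
for t in range(1, n):
    y[t] = c_true + A_true @ y[t - 1] + 0.1 * rng.standard_normal(2)

# Regress y_t on a constant and y_{t-1}; lstsq fits both equations at once.
X = np.column_stack([np.ones(n - 1), y[:-1]])
coefs, *_ = np.linalg.lstsq(X, y[1:], rcond=None)

c_hat = coefs[0]      # estimated constants
A_hat = coefs[1:].T   # row i holds the lag coefficients of equation i
print(A_hat.round(2))
```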


Historical data and forecasts.

In [29]:
results.plot_forecast(50, figsize=(8, 8), offset='M')
  # historical/input data and
  # forecasts given model fit
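
Point forecasts of a VAR(1) model are obtained by applying the fitted equations repeatedly. The sketch below does this with the coefficients copied from the summary output above. As the last observations of the data set are not shown here, it starts, purely for illustration, from the 1999-01-31 row displayed earlier, so it does not reproduce the plotted forecast. The helper `point_forecast` is hypothetical and omits the standard error bands.

```python
import numpy as np

# Coefficients copied from the summary output above.
const = np.array([106.324009, 3.066736])
A = np.array([[0.972930, -0.710521],    # equation SX5E: L1.SX5E, L1.V2TX
              [0.000267,  0.838820]])   # equation V2TX: L1.SX5E, L1.V2TX

def point_forecast(last, steps):
    """Iterate y_{t+1} = const + A @ y_t for the given number of steps."""
    path = []
    current = np.asarray(last, dtype=float)
    for _ in range(steps):
        current = const + A @ current
        path.append(current)
    return np.array(path)

# Illustrative starting point: the 1999-01-31 row shown by datf.head().
start = [3547.15, 37.6140]
forecast = point_forecast(start, steps=3)
print(forecast.round(2))
```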

Integrated simulation of the future development of the financial instrument prices/levels.

In [30]:
results.plotsim(paths=5, steps=50, prior=False,
                initial_values=datf.iloc[-1].values,
                figsize=(8, 8), offset='M')
  # simulated paths given fitted model

Deployment Scenarios of Python Quant Platform

  • Getting Started with Python in high schools, universities, training courses, corporations, institutions, etc.
  • Browser-based Data Analytics at universities, companies, financial institutions, etc.
  • Browser-based Financial Analytics based on interactive exploration, (automated) workflows, applications
  • Browser-based Usage of Infrastructure for desktops, servers, grid, cloud, etc.
  • Browser-based Python Development for all kinds of Python projects

Technically, the platform can be deployed in almost any Linux-based hardware environment:

  • dedicated servers like the trial server (4 cores, 16 GB RAM)
  • cloud instances like a DigitalOcean droplet (from the smallest one with 1 core, 512 MB RAM)
  • hosted or on-premises behind your firewalls

The platform can be deployed with or without Docker containers. White labeling is also easily possible.

Selected recent use cases of the platform:

  • Provision of IPython Notebooks from "Python for Finance" (O'Reilly) book & for Plotly
  • Use during NumPy + pandas Training for NYC-based Hedge Fund
  • Use during Python for Quant Finance Training in London
  • Hosting of IPython Notebooks for German derivatives exchange Eurex
  • Interactive Collaboration with South African client during Python project

Contact us

Please contact us if you have any questions or want to get involved in our Python community events.