**— A Machine Learning Perspective**

*WORK IN PROGRESS*

Dr Yves J Hilpisch

CEO The Python Quants | The AI Machine

- Predictability Defined
- Why Does it Matter?
- Prediction Methods
- Correlation vs. Causation
- Finance History
- Efficient Markets
- Passive Investing
- Mathematics in Science
- Unreasonable Effectiveness of Data
- Artificial Intelligence
- ML-Based Applications
- Conclusions

Let $t=0$ denote the current point in **time** and let $t \in \{\ldots, -3, -2, -1, 0, 1, 2, 3, \ldots \}$ denote the relevant previous, current and future points in time.

Let $P_t$ denote the **price** of a financial instrument at time $t$. We have

$$P_t, \quad t \in \{\ldots, -3, -2, -1, 0, 1, 2, 3, \ldots \}$$

or for short

$$P \in \{\ldots, P_{-2}, P_{-1}, P_{0}, P_{1}, \ldots \}.$$

From Wikipedia (https://en.wikipedia.org/wiki/Markov_chain):

A Markov chain or Markov process is a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event.

- for a Markovian process, the distribution of price $P_t$ only depends on $P_{t-1}$
- for a non-Markovian process, the distribution of price $P_t$ can depend on the history of prices $P_{t-1}, P_{t-2}, P_{t-3}, \ldots$
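The distinction can be illustrated with a small simulation. The following is a minimal sketch, not part of the original analysis: it simulates a two-state Markovian process of up (1) and down (0) moves, where `p_repeat` is a hypothetical persistence parameter. For such a process, the frequency of an up move given the previous move should not depend on the move two steps back.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical persistence parameter: probability that the direction
# at time t repeats the direction at time t-1.
p_repeat = 0.6
n = 200_000

d = np.empty(n, dtype=int)
d[0] = rng.integers(0, 2)
repeat = rng.random(n) < p_repeat
for t in range(1, n):
    d[t] = d[t - 1] if repeat[t] else 1 - d[t - 1]

# Align the series: two steps back, one step back, current.
prev2, prev1, curr = d[:-2], d[1:-1], d[2:]

# Frequency of an up move given that the previous move was up,
# split by the direction two steps back.
f_up_up = curr[(prev1 == 1) & (prev2 == 1)].mean()
f_down_up = curr[(prev1 == 1) & (prev2 == 0)].mean()

print(round(f_up_up, 3), round(f_down_up, 3))
```

Both frequencies should be close to `p_repeat`. For a non-Markovian process they could differ, because the older history would carry additional information.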

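To simplify things, let's assume that only the **direction** $d_t$ of the change in price at time $t$ is relevant, encoded as

$$d_t = \begin{cases} 1 & \text{if } P_t > P_{t-1} \\ 0 & \text{otherwise.} \end{cases}$$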

This gives the process of directional changes $d \in \{\ldots, d_{-2}, d_{-1}, d_{0}, d_{1}, \ldots \}$. The problem we want to focus on is whether we can predict the future directional change in price $d_1$ when we know the history of directional changes $d_0, d_{-1}, d_{-2}, d_{-3}, \ldots$. Taking into account, say, five historical directional changes, a total of $2^5=32$ patterns can emerge.
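As a minimal sketch, directional changes can be derived from a price series as follows. The prices are made up for illustration, and the encoding (1 for an increase, 0 otherwise, so that an unchanged price counts as "not up") is an assumption consistent with the binary values used in the rest of this section.

```python
import numpy as np

# Hypothetical price path, for illustration only.
prices = np.array([1.10, 1.12, 1.11, 1.11, 1.15, 1.14])

# Assumed encoding: 1 if the price rose from t-1 to t, 0 otherwise
# (an unchanged price is counted as "not up").
directions = np.where(np.diff(prices) > 0, 1, 0)

print(directions)
```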

In [1]:

```
from numpy.random import default_rng
```

In [2]:

```
rng = default_rng()
```

In [3]:

```
rng.integers(0, 2, 5)
```

Out[3]:

array([1, 0, 1, 0, 0])

In [4]:

```
rng.integers(0, 2, 5)
```

Out[4]:

array([1, 0, 0, 1, 0])
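The two random draws above are just two of the 32 possible patterns. As a small sketch using the standard library, all patterns of five binary directional changes can be enumerated explicitly:

```python
from itertools import product

# All possible histories of five binary directional changes.
patterns = list(product([0, 1], repeat=5))

print(len(patterns))   # number of distinct patterns
print(patterns[:3])    # first few patterns
```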

We are now interested in estimating — by whatever means — **probabilities** for $d_1 = 1$ and $d_1 = 0$ for any historical pattern $h = (d_0, d_{-1}, d_{-2}, d_{-3}, d_{-4})$.

Formally, the following **conditional probabilities** are of interest:

$$\mathbb{P}(d_1 = 1 \mid h) \quad \text{and} \quad \mathbb{P}(d_1 = 0 \mid h)$$

Such a problem is typically called a **binary classification problem** because, given a history $h$, one of two classes is to be picked, namely the one with the higher estimated probability. Classification is a special type of **pattern recognition** and belongs to **supervised learning** in machine learning (ML).
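One simple way to estimate such probabilities is to count relative frequencies. The following is a sketch under stated assumptions, not the method used later in this document: the direction series is synthetic (each move repeats the previous one with an assumed probability of 0.6), and the estimates are in-sample.

```python
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(100)

# Synthetic direction series with mild persistence (hypothetical data):
# the direction switches whenever the corresponding entry of `flips` is True.
n = 50_000
flips = rng.random(n) >= 0.6
d = (np.cumsum(flips) % 2).tolist()

lags = 5
counts = defaultdict(lambda: [0, 0])   # history -> [count of 0, count of 1]
for t in range(lags, n):
    # History with the most recent direction first: (d_{t-1}, ..., d_{t-5}).
    h = tuple(d[t - lags:t][::-1])
    counts[h][d[t]] += 1

# Estimated conditional probability of an up move for each history.
p_up = {h: c[1] / (c[0] + c[1]) for h, c in counts.items()}

# Classification rule: pick the class with the higher estimated probability.
predict = {h: int(p >= 0.5) for h, p in p_up.items()}

h = (1, 0, 1, 0, 0)
print(p_up[h], predict[h])
```

Because the synthetic process tends to repeat its last move, the estimated probability of an up move should be above 0.5 whenever the most recent direction in $h$ is 1. On real data, the estimates should be evaluated out-of-sample.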

From Wikipedia (https://en.wikipedia.org/wiki/Predictability):

Predictability is the degree to which a correct prediction or forecast of a system's state can be made either qualitatively or quantitatively.

From Google (https://developers.google.com/machine-learning/crash-course/classification/accuracy):

Accuracy is one metric for evaluating classification models. Informally, accuracy is the fraction of predictions our model got right.

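Formally, we have for the **accuracy**:

$$\text{accuracy} = \frac{\text{number of correct predictions}}{\text{total number of predictions}}$$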

**We say that a process of directional price changes is predictable if the accuracy of the predictions is significantly higher than 50%.**
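What "significantly higher than 50%" means depends on the number of predictions. The following sketch computes the accuracy and a one-sided p-value against pure coin flipping. The function names are illustrative, and the normal approximation to the binomial distribution is an assumption that is only reasonable for a large number of predictions.

```python
import math

def accuracy(predictions, outcomes):
    """Fraction of predictions that match the realized outcomes."""
    correct = sum(p == o for p, o in zip(predictions, outcomes))
    return correct / len(outcomes)

def p_value_vs_coin_flip(acc, n):
    """One-sided p-value for accuracy > 50%, using a normal
    approximation to the binomial distribution (large n assumed)."""
    z = (acc - 0.5) / math.sqrt(0.25 / n)
    return 0.5 * math.erfc(z / math.sqrt(2))

# Hypothetical predictions and realized directions.
preds = [1, 0, 1, 1, 0, 1, 0, 0]
outs = [1, 0, 0, 1, 0, 1, 1, 0]
acc = accuracy(preds, outs)
print(acc)

# An accuracy of 55% is weak evidence with 100 predictions,
# but much stronger evidence with 1,000 predictions.
print(p_value_vs_coin_flip(0.55, 100))
print(p_value_vs_coin_flip(0.55, 1000))
```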

Being able to predict financial prices, or directional changes in financial markets, consistently well can be considered the *holy grail* of finance.

Why? Quite simply, because it would be a sure path to tremendous riches.

A simulation analysis with real-world data shall illustrate the point.

The following example is taken from chapter 14 of Hilpisch (2020): *Artificial Intelligence in Finance*, O'Reilly. First, some imports and the data preparation.

In [5]:

```
import random
import numpy as np
import pandas as pd
import cufflinks as cf
cf.set_config_file(offline=True, theme='ggplot')
```

In [6]:

```
url = 'https://hilpisch.com/aiif_eikon_eod_data.csv'
```

In [7]:

```
raw = pd.read_csv(url, index_col=0, parse_dates=True)
```

We pick the **EUR/USD** currency pair for the analysis with data for 2015 to 2019. `bull` stands for a long-only strategy.

In [8]:

```
symbol = 'EUR='
```

In [9]:

```
raw['bull'] = np.log(raw[symbol] / raw[symbol].shift(1))
```

In [10]:

```
data = pd.DataFrame(raw['bull']).loc['2015-01-01':]
```

In [11]:

```
data.dropna(inplace=True)
```

In addition to the `bull` strategy, a `random` one, going long and short randomly, is generated. Also a `bear` strategy, going short only, is added.

In [12]:

```
rng = default_rng(100)
```

In [13]:

```
data['random'] = rng.choice([-1, 1], len(data)) * data['bull']
```

In [14]:

```
data['bear'] = -data['bull']
```

Assume now that a prediction model gets the top `t` share (for example, 0.1 for 10%) of both the positive *and* negative movements (days) correct, that is, the days with the largest gains and the largest losses. Assume further that for the rest it is no better than a random strategy. The following code creates the return series for such strategies.

In [15]:

```
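def top(t):
    # Sort returns so the largest losses come first and the largest gains last.
    top = pd.DataFrame(data['bull'])
    top.columns = ['top']
    top = top.sort_values('top')
    n = int(len(data) * t)
    # Correct prediction: the return is captured with a positive sign ...
    top['top'] = abs(top['top'])
    # ... except for the middle part, which is traded with a random sign.
    middle = top['top'].iloc[n:-n].to_numpy()
    top.iloc[n:-n, 0] = rng.choice([-1, 1], len(middle)) * middle
    data[f'{int(t * 100)}_top'] = top.sort_index()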
```

In [16]:

```
for t in [0.1, 0.15]:
    top(t)
```

Assume now that the prediction model gets a share `ratio` of all directional movements correct and that it is no better than a random strategy for the rest.

In [17]:

```
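def afi(ratio):
    # 1 marks a day with a correct prediction, 0 a day with a random position.
    correct = rng.binomial(1, ratio, len(data))
    random = rng.choice([-1, 1], len(data))
    # Correct days earn the absolute return, the others a randomly signed one.
    strat = np.where(correct, abs(data['bull']), random * data['bull'])
    data[f'{int(ratio * 100)}_afi'] = strat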
```

In [18]:

```
for ratio in [0.51, 0.6, 0.75, 0.9]:
    afi(ratio)
```
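Note that on the days that are not marked as correct, the random position still has the right sign about half of the time. The share of days with a correctly signed position in such a strategy is therefore roughly `ratio + (1 - ratio) / 2`. The following sketch checks this with synthetic, normally distributed returns (an assumption, not the EUR/USD data):

```python
import numpy as np

rng = np.random.default_rng(100)
n = 100_000
returns = rng.normal(0, 0.005, n)   # hypothetical daily log returns

hits = {}
for ratio in [0.51, 0.6, 0.75, 0.9]:
    correct = rng.binomial(1, ratio, n)
    sign = rng.choice([-1, 1], n)
    strat = np.where(correct, np.abs(returns), sign * returns)
    # Share of days on which the strategy return is positive.
    hits[ratio] = (strat > 0).mean()
    print(ratio, round(hits[ratio], 3), round(ratio + (1 - ratio) / 2, 3))
```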

The following shows the first rows of the return series of the different strategies.

In [19]:

```
print(data.head())
```

This translates into the following gross performance values. Throughout, zero transaction costs are assumed.

In [20]:

```
data.sum().apply(np.exp)
```

Out[20]:

bull       0.926676
random     1.245684
bear       1.079126
10_top    12.322428
15_top    23.343766
51_afi    18.627366
60_afi    16.950628
75_afi    43.611802
90_afi    90.892721
dtype: float64
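Summing log returns and applying the exponential function gives the gross performance because log returns are additive over time: $\exp\left(\sum_t \log(P_t / P_{t-1})\right) = P_T / P_0$. A small sketch with made-up prices illustrates the identity:

```python
import numpy as np

prices = np.array([1.20, 1.18, 1.21, 1.19, 1.25])   # hypothetical prices
log_returns = np.log(prices[1:] / prices[:-1])

# Gross performance from summed log returns ...
gross_from_returns = np.exp(log_returns.sum())
# ... equals the ratio of the final price to the initial price.
gross_from_prices = prices[-1] / prices[0]

print(gross_from_returns, gross_from_prices)
```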

Plotted over time, the gross performance of the strategies looks as follows, again assuming zero transaction costs.

In [21]:

```
data.cumsum().apply(np.exp).iplot(colorscale='reds')
```

From Wikipedia (https://en.wikipedia.org/wiki/Stock_market_prediction):

Stock market prediction is the act of trying to determine the future value of a company stock or other financial instrument traded on an exchange. The successful prediction of a stock's future price could yield significant profit. The efficient-market hypothesis suggests that stock prices reflect all currently available information and any price changes that are not based on newly revealed information thus are inherently unpredictable. Others disagree and those with this viewpoint possess myriad methods and technologies which purportedly allow them to gain future price information.

Over time, financial practitioners have used a plethora of methods in their effort to predict financial markets:

- fundamental analysis (value/growth)
- technical analysis (trends, support, breakout)
- statistical methods (regression, Bayesian models)
- time series modeling (ARIMA, ARCH, GARCH)
- artificial intelligence (ML, DL, RL)

Technical analysis methods are still popular and intuitively appealing. One of the simplest examples is the use of two simple moving averages (SMAs) to decide whether to go long or short a financial instrument.

In [22]:

```
data = pd.DataFrame(raw[symbol])
```

In [23]:

```
data['SMA1'] = data[symbol].rolling(42).mean()
```

In [24]:

```
data['SMA2'] = data[symbol].rolling(252).mean()
```

In [25]:

```
data['POSITION'] = np.where(data['SMA1'] > data['SMA2'], 1, -1)
```

The idea here is to go long when the shorter/faster SMA is above the longer/slower SMA and short otherwise.

In [26]:

```
data.iloc[252:].iplot(secondary_y='POSITION')
```
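To judge such a strategy, the positions need to be turned into strategy returns. The following is a minimal sketch under stated assumptions: it uses a synthetic random-walk price series instead of the EUR/USD data, and the column names `RETURNS` and `STRATEGY` are illustrative. The position is shifted by one day so that the signal from day $t-1$ earns the return of day $t$, which avoids foresight bias.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(100)

# Synthetic price series standing in for the real data (an assumption).
idx = pd.bdate_range('2015-01-01', periods=1000)
price = pd.Series(1.1 * np.exp(rng.normal(0, 0.005, len(idx)).cumsum()),
                  index=idx)

df = pd.DataFrame({'price': price})
df['SMA1'] = df['price'].rolling(42).mean()
df['SMA2'] = df['price'].rolling(252).mean()
df.dropna(inplace=True)   # drop rows without a full SMA2 window

df['POSITION'] = np.where(df['SMA1'] > df['SMA2'], 1, -1)
df['RETURNS'] = np.log(df['price'] / df['price'].shift(1))
# The position known at the close of day t-1 earns the return of day t.
df['STRATEGY'] = df['POSITION'].shift(1) * df['RETURNS']

# Gross performance of the passive investment and the SMA strategy.
print(df[['RETURNS', 'STRATEGY']].sum().apply(np.exp))
```

Since the synthetic prices follow a random walk, no systematic outperformance should be expected here. The sketch only illustrates the mechanics of the backtest.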