We present a straightforward systematic investing strategy that uses hidden Markov regimes to switch between risky equities and cash. We consider it part of the momentum and trend class of strategies because we assume that regimes are persistent. The model can easily be implemented using ETFs.
We compare the strategy to a simple trend-following system (200SMA) and simple buy-and-hold on the S&P500 index. The strategy outperforms both with a Sharpe ratio of ~0.8 and a very conservative drawdown of only 13%.
Trend following as an investment strategy has been around for ages. Invest in an asset (e.g. SPY) when it trades above its 200-day simple moving average; otherwise stay out. This remarkably simple rule tends to dramatically reduce the risk associated with a plain buy-and-hold strategy. The next figure shows the equity curves of both strategies had you started investing in 2000. The trend-following strategy did a great job of reducing drawdowns and volatility.
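The rule is simple enough to sketch in a few lines of plain Python. This is a minimal illustration rather than the backtest behind the figure: the function names, the toy prices and the 3-day window in the example are all assumptions made for readability (the strategy in the text uses a 200-day window), and trading costs are ignored.

```python
def sma_signal(prices, window=200):
    """Return 0/1 positions: 1 when the price is above its trailing SMA.

    Until `window` prices are available there is no average yet, so we
    stay in cash (0). That warm-up choice is an assumption of this sketch.
    """
    signal = []
    running_sum = 0.0
    for i, price in enumerate(prices):
        running_sum += price
        if i >= window:
            running_sum -= prices[i - window]  # drop the price leaving the window
        if i + 1 < window:
            signal.append(0)
        else:
            sma = running_sum / window
            signal.append(1 if price > sma else 0)
    return signal


def strategy_returns(prices, signal):
    """Daily strategy returns, trading on the previous day's signal.

    Using signal[t - 1] for the return from t - 1 to t avoids look-ahead.
    """
    returns = []
    for t in range(1, len(prices)):
        asset_return = prices[t] / prices[t - 1] - 1.0
        returns.append(signal[t - 1] * asset_return)
    return returns


# Toy example with a 3-day window (hypothetical prices, not market data).
prices = [100, 101, 102, 103, 99, 95, 96, 100, 104]
signal = sma_signal(prices, window=3)
rets = strategy_returns(prices, signal)
print(signal)
```

Note that the position for a given day is taken from the previous day's signal; acting on the same day's close would quietly use information that was not yet available.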
Trend-following can be effective for various reasons:
However, it’s no silver bullet. When there is no recession or consistent downtrend, the strategy is typically not fast enough to re-invest and could trail behind the classic buy-and-hold as seen in Figure 1 during the last few years.
We could partially alleviate this problem by using different moving averages based on e.g. market volatility, somewhat similar to what was done in this fast and slow momentum system. However, this would leave us with more arbitrary hyper-parameters for the investing strategy, increasing the risk of overfitting.
Instead, let’s look at hidden Markov models as an alternative to the beloved trend-following strategy from before.
Markov models (as the name suggests) model Markov processes (also known as Markov chains). These are stochastic models describing a sequence of possible events in which the probability of the next event depends only on the current event. For example, let’s assume there are two possible events (or states): sunny and cloudy weather. The next figure illustrates this Markov process, where the lines indicate the transition probabilities of going from one state to another. For instance, if today is cloudy (C), there is an 80% probability that it remains cloudy and a 20% probability that we have sunny weather (S) tomorrow. Notice how transitions only depend on the state we’re currently in.
How do we arrive at the model above? We could track the weather each day for a year and infer the transition probabilities ourselves. That is, we end up with a sequence of observed weather states C-C-C-S-S-S-S-C-C-... and calculate the probabilities of staying in S (or C) and of transitioning into the other state.
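Counting transitions like this takes only a few lines. The sketch below uses just the nine days spelled out above; the function name is made up for illustration, and with so little data the estimates will naturally differ from the 80%/20% shown in the figure.

```python
from collections import Counter


def estimate_transitions(sequence):
    """Estimate P(next state | current state) by counting consecutive pairs."""
    pair_counts = Counter(zip(sequence, sequence[1:]))
    # The last observation has no successor, so it is excluded from the totals.
    from_counts = Counter(sequence[:-1])
    return {
        (current, nxt): count / from_counts[current]
        for (current, nxt), count in pair_counts.items()
    }


observed = list("CCCSSSSCC")  # the first nine days from the text
probs = estimate_transitions(observed)
for (current, nxt), p in sorted(probs.items()):
    print(f"P({nxt} tomorrow | {current} today) = {p:.2f}")
```

With a full year of observations the same counting would converge towards the true transition probabilities of the process.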
Things become trickier when we’re dealing with hidden Markov models. Here, we are still interested in modeling the same type of Markov process, but the states are no longer directly observable. In terms of our weather example from above: we are blind and cannot look at the sky every day to collect the data needed to compute the probabilities.
So how do we figure out whether it is sunny or cloudy today? We’re going to assume we can still find an observable sequence of values whose behavior depends on our hidden process. So even though we may be blind, thanks to our other senses, we can go out each day and note down the perceived temperature (e.g. in Celsius: 10-8-9-18-20-17-19-10-7). The assumption here is that different weather states will emit different temperatures. This is illustrated in the next figure.
After collecting our sequence of temperatures, we can use algorithms (e.g. Baum-Welch) that try to determine which Markov process was most likely to have produced the observed sequence. Put another way, they automatically find the transition probabilities and the emission (i.e. temperature) distributions for each weather state that were most likely to produce the temperature sequence we observed. Afterward, we can use the obtained model to recover our original sequence of weather states (C-C-C-S-S-S-S-C-C-...) using just the temperatures, and determine what weather state we’re in today.
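To make the last step concrete, here is a small sketch of decoding the weather states from the temperatures once a model is available. It uses the Viterbi algorithm, a standard way of finding the most likely state sequence; it does not perform the Baum-Welch fitting itself. Apart from the 80%/20% cloudy transitions taken from the text, every parameter below (the sunny transitions, the start probabilities, and the Gaussian temperature means and standard deviations) is a made-up assumption for illustration.

```python
import math


def gaussian_logpdf(x, mean, std):
    """Log-density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def viterbi(observations, states, log_start, log_trans, emission_params):
    """Most likely hidden state sequence for a Gaussian-emission HMM."""
    # best[s] = log-probability of the best path so far that ends in state s
    best = {
        s: log_start[s] + gaussian_logpdf(observations[0], *emission_params[s])
        for s in states
    }
    backpointers = []
    for x in observations[1:]:
        new_best, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[p] + log_trans[p][s])
            pointers[s] = prev
            new_best[s] = (
                best[prev] + log_trans[prev][s]
                + gaussian_logpdf(x, *emission_params[s])
            )
        best = new_best
        backpointers.append(pointers)
    # Trace the best path backwards from the most likely final state.
    state = max(states, key=lambda s: best[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return path[::-1]


states = ["C", "S"]
log_start = {"C": math.log(0.5), "S": math.log(0.5)}       # assumed
log_trans = {
    "C": {"C": math.log(0.8), "S": math.log(0.2)},         # from the text
    "S": {"C": math.log(0.2), "S": math.log(0.8)},         # assumed
}
emission_params = {"C": (9.0, 2.0), "S": (18.0, 2.0)}      # assumed (mean, std) in Celsius

temperatures = [10, 8, 9, 18, 20, 17, 19, 10, 7]
path = viterbi(temperatures, states, log_start, log_trans, emission_params)
print("-".join(path))
```

In this toy setting the two temperature distributions barely overlap, so decoding is easy. Real data is rarely this clean.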
This all might sound a bit too magical, and it is. In practice, this turns out to be a hard problem to solve (e.g. because the observed sequence is noisy) and we end up with imperfect Markov models. Moreover, unless you can tap into specific domain expertise, it’s hard to know a priori how many hidden states there are.
You might have guessed it by now, but we can use this methodology on financial data as well. A popular use case is figuring out whether we’re currently in a bearish (cloudy) or bullish (sunny) market regime. Sadly, this is not always clear to us (the actual regimes are hidden). Luckily HMMs can help us infer these regimes using asset returns (temperatures) that we assume to be dependent on the hidden market regimes.
We fit our HMM on S&P500 (adjusted close) returns from 1980 to 1999. It is then used out-of-sample (Jan 2000-May 2023) and never re-fitted. Our model uses two hidden states (conceptually representing bear and bull regimes) with Gaussian emissions, each with a full covariance matrix. Special care was taken not to introduce data leakage. All backtesting was done using Vectorbt.
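How leakage was avoided is not spelled out here, so the following is only one plausible approach, not necessarily the one used for the results in this post. With the parameters frozen after the 1999 fit, the regime probability for each day can be computed with the forward (filtering) recursion, which only looks at returns up to and including that day, and the resulting position is applied to the next day. All numbers below (transition matrix, emission means and standard deviations, returns) are hypothetical placeholders, not the fitted model.

```python
import math


def normal_pdf(x, mean, std):
    """Density of a normal distribution."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))


def filtered_bull_probability(returns, start, trans, emissions):
    """P(bull on day t | returns up to day t), with state 0 = bear, 1 = bull.

    This is the normalised forward recursion: no future returns are used.
    """
    probs = []
    belief = list(start)
    for t, r in enumerate(returns):
        if t > 0:
            # Predict: push yesterday's belief through the transition matrix.
            belief = [sum(belief[i] * trans[i][j] for i in range(2)) for j in range(2)]
        # Update: weight each state by the likelihood of today's return.
        weighted = [belief[j] * normal_pdf(r, *emissions[j]) for j in range(2)]
        total = sum(weighted)
        belief = [w / total for w in weighted]
        probs.append(belief[1])
    return probs


# Hypothetical parameters standing in for a model fitted on 1980-1999 data.
start = [0.5, 0.5]
trans = [[0.97, 0.03],   # bear -> bear, bear -> bull
         [0.01, 0.99]]   # bull -> bear, bull -> bull
emissions = [(-0.001, 0.02), (0.0007, 0.007)]  # (mean, std) of daily returns

daily_returns = [0.004, -0.003, 0.006, -0.035, 0.028, -0.041, 0.002]  # toy data
probs = filtered_bull_probability(daily_returns, start, trans, emissions)

# Hold equities on day t + 1 only if day t looked bullish; start in cash.
positions = [0] + [1 if p > 0.5 else 0 for p in probs[:-1]]
print([round(p, 3) for p in probs])
print(positions)
```

Decoding the whole out-of-sample period in one pass with a smoothing or Viterbi-style method would, by contrast, let later returns influence earlier regime labels, which is exactly the kind of leakage to avoid in a backtest.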
The strategy is straightforward:
The next figure contrasts the equity curve of the HMM strategy to the 200SMA and Buy&Hold strategy from before.
A clear winner emerges from the figure above. The hidden Markov model is able to detect favorable (and troublesome) investment regimes much better than the simple trend-following one and therefore does much better. The following table summarises the performance measures:
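For reference, the two headline measures quoted for this strategy, the Sharpe ratio and the drawdown, can be computed from a series of daily returns along the following lines. This is a generic sketch and not Vectorbt's implementation; the annualisation factor of 252 trading days and the zero risk-free rate are assumptions, and the example returns are invented.

```python
import math


def sharpe_ratio(daily_returns, periods_per_year=252, risk_free_daily=0.0):
    """Annualised Sharpe ratio: mean excess return over its sample standard deviation."""
    excess = [r - risk_free_daily for r in daily_returns]
    mean = sum(excess) / len(excess)
    variance = sum((r - mean) ** 2 for r in excess) / (len(excess) - 1)
    return mean / math.sqrt(variance) * math.sqrt(periods_per_year)


def max_drawdown(daily_returns):
    """Largest peak-to-trough loss of the compounded equity curve, as a fraction."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in daily_returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst


example_returns = [0.10, -0.20, 0.05]  # invented, exaggerated for readability
dd = max_drawdown(example_returns)
sr = sharpe_ratio([0.01, 0.02, 0.03])
print(f"max drawdown: {dd:.1%}")
```

In the example, the equity curve peaks at 1.10 and falls to 0.88 before recovering slightly, so the maximum drawdown is 20%.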
According to our HMM, we’re currently in a bullish regime. The next figure displays the regimes from 2020 onwards (1 being bullish and 0 being bearish). The second graph shows the different return distributions for each of the learned regimes.
HMMs look very promising as risk-on/risk-off signals in investment strategies. However, we do want to note that past performance is not indicative of future performance. Moreover, it is unclear when the model should be retrained, and whether returns will be adequately summarised with only two different states going into the future.
We also want to note that we ran the simple trend-following strategy on daily data instead of monthly data (which is more popular). The 200SMA causes lots of noisy trading signals on daily data, which could’ve contributed negatively to its performance.