How we build a model that supplements stock market investors’ intuition with big data and AI
Alternative data and Machine Learning are rapidly gaining traction in the investment industry. However, AI and Machine Learning are still not able to beat the market or most regular investors in a real-life environment, even though the internet is full of articles predicting stock prices with 99% accuracy (overfitting is no joke). One can safely say there are domains where humans outperform AI and domains where it is the other way around. This also goes for investing. But why use only human input or only AI input when you can combine them?
In this article I’ll explain how we built a model that uses both human intuition and AI to analyse the stock market, and give some examples of how it works in practice. More elaborate examples will follow in subsequent articles.
Where humans have the edge over AI— developing investment narratives
Academics are becoming increasingly convinced that narratives play a key role in explaining stock price movements (e.g. Robert Shiller, Narrative Economics; Andrew Lo, Adaptive Markets). That is not so strange if you think about it: people have been storytellers for thousands of years and our brains are hardwired to build a narrative for everything we experience. The best stock pickers in the world use narratives to make their plays, only they call their narratives investment theses, scenarios or bets. The famous play George Soros made against the Bank of England was basically a bet on a narrative, namely the narrative that the Bank of England would not be able to keep supporting the British pound. Similarly, tensions in oil-producing countries are narratives that move the price of ExxonMobil, and narratives about what the Fed will do move the entire S&P 500. When you start to think about it, narratives really matter.
Humans are far better than AI at telling and recognizing stories. A five-year-old can tell a coherent story, whereas even the most sophisticated AI (e.g. GPT-3) still struggles with telling a random story, let alone recognizing one. Ask an investor what narrative will affect the stock price of Microsoft and he will instantly have some idea that it relates to tech, working from home (and therefore Covid developments) and cloud computing. AI does not have that intuition.
Where AI has the edge over humans — data and patterns
Humans can see the big picture, but AI can spot the details, and it can do so better and faster. I can write an AI script in one hour that can beat any human at the “Find Wally” game, even if they have trained at it for decades. Similarly, I can write (and have written) an AI script that is able to monitor any desired narrative. Where the average investor might be able to crunch several news sources per day, AI can monitor thousands within minutes. Moreover, we programmed our AI to pick up small but meaningful differences in the nuance of specific narratives that humans cannot detect. Additionally, AI can assess whether narratives are discussed in a positive or negative tone. In this way we can see whether negative sentiment is developing on oil, or positive sentiment on interest rates, or controversy around working from home, or fear of negative economic data, or an increase in bullish sentiment within the manufacturing sector, or on any other narrative one can think of.
Let humans find the narrative and let AI find the pattern
Having established all that, we set out to build a model where humans supply the narrative (whether that is an investment thesis, a stress scenario or the monitoring of a new hype sector) and AI finds the pattern. Below is a brief explanation of the steps taken to build a narrative model, followed by a case study.
Human input → Which narratives are suspected to drive price. Experienced investors generally have a good sense of what news or narratives drive price. We can feed this information into our model (see also the real-life example further on).
Narrative-investing.io AI input → deduce narrative time series from our alternative dataset (Financial Tweets, Financial headlines, Financial subreddits). This is done by:
- Using our customized algorithm to assign a narrative to each headline or Tweet
- Using a Machine Learning based sentiment/emotion classifier to determine, for each relevant headline (i.e. one that is related to the narrative), its sentiment, emotional value, etc.
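The two steps above can be sketched roughly as follows. This is a minimal illustration, not our production algorithm: the keyword lists, the tiny sentiment lexicon and all function names are hypothetical stand-ins for the customized narrative algorithm and the Machine Learning classifier.

```python
from collections import defaultdict
from datetime import date

# Hypothetical keyword lists per narrative (assumption for illustration only).
NARRATIVE_KEYWORDS = {
    "taiwan": {"taiwan", "taipei"},
    "consumer": {"consumer", "consumers", "retail"},
}
# Toy lexicon standing in for a trained sentiment/emotion classifier.
POSITIVE = {"surge", "strong", "growth", "record"}
NEGATIVE = {"tension", "fear", "weak", "drop"}


def tokenize(text):
    """Lowercase the text and strip basic punctuation from each word."""
    return [w.strip(".,!?:;\"'").lower() for w in text.split()]


def assign_narratives(headline):
    """Return every narrative whose keywords appear in the headline."""
    words = set(tokenize(headline))
    return [name for name, kws in NARRATIVE_KEYWORDS.items() if words & kws]


def sentiment_score(headline):
    """Count positive words minus negative words."""
    words = tokenize(headline)
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def build_time_series(items):
    """Aggregate (date, headline) pairs into daily exposure and mean sentiment."""
    buckets = defaultdict(lambda: defaultdict(list))
    for day, headline in items:
        score = sentiment_score(headline)
        for narrative in assign_narratives(headline):
            buckets[narrative][day].append(score)
    return {
        narrative: {
            day: {"exposure": len(scores), "sentiment": sum(scores) / len(scores)}
            for day, scores in sorted(days.items())
        }
        for narrative, days in buckets.items()
    }


# Made-up headlines, purely for demonstration.
headlines = [
    (date(2021, 3, 1), "Tension rises around Taiwan as drills begin"),
    (date(2021, 3, 1), "Consumer spending shows strong growth"),
    (date(2021, 3, 2), "Taiwan chip exports hit record"),
]
series = build_time_series(headlines)
```

Each narrative ends up with two daily series: exposure (how many headlines mention it) and sentiment (the average tone of those headlines). A real classifier would replace the word counting, but the shape of the output stays the same.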
Narrative-investing.io AI input → Analyse which narrative exposures are most important in explaining future price. We feed (lagged) price and our narrative time series to our model. The model is then trained to learn which narrative series are most important in explaining future price movements.
Narrative-investing.io AI input → Having established a model, we continuously monitor how narratives develop and how this would influence future price movements. We do this by setting up a dashboard with all relevant metrics.
We did this by leveraging our in-house dataset of news websites, financial tweets and Reddit data. As with all things in life, it helps to describe this with an example:
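To give a feel for what "most important in explaining future price" can mean, here is a deliberately simple sketch. It ranks narrative series by the absolute correlation between today's narrative value and the return a number of days later. This is a stand-in technique chosen for illustration, not the model we actually train, and the series names and numbers are made up.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def rank_narratives(narratives, returns, lag=1):
    """Rank narrative series by |corr(narrative[t], returns[t + lag])|."""
    if lag < 1:
        raise ValueError("lag must be at least 1 to look at future returns")
    scores = {}
    for name, values in narratives.items():
        # Pair each narrative value with the return `lag` periods later.
        scores[name] = pearson(values[:-lag], returns[lag:])
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Made-up daily returns and narrative series (hypothetical names and values).
returns = [0.0, 0.01, -0.02, 0.03, -0.01, 0.02]
narratives = {
    "consumer_exposure": [5, 5, 5, 5, 5, 5],    # constant, so no signal
    "taiwan_sentiment": [1, -2, 3, -1, 2, 0],   # leads returns by one day
}
ranked = rank_narratives(narratives, returns, lag=1)
```

In this toy data `taiwan_sentiment` comes out on top because it moves one day ahead of the returns. A trained model would also take lagged price into account and would capture interactions that a plain correlation misses.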
A real life example
The investor’s intuition: One of our earlier clients/followers is an investor in the semiconductor industry. Many years of experience taught him that the semiconductor industry is sensitive to various domains/narratives, of which he gave the following as input to our model:
- Taiwan → Taiwan is one of the main semiconductor-producing regions. Uncertainty about that region seems to affect the semiconductor market.
- China US tension → Related to the above: tension between the US and China increases tension with respect to Taiwan.
- Consumer → Consumers create demand for semiconductors. Positive narratives on consumers influence the semiconductor market.
- Tech → Tech is heavily dependent on semiconductors, and positive narratives on tech are reflected in the semiconductor market.
- Economic factors → Finally, all general economic narratives are also at play for the semiconductor business. So Fed policy, investor emotions and all other economy-related narratives are taken into account.
The narrative time series: From the ideas investors have, we can let AI determine narrative time series from our alternative dataset of millions of tweets and headlines. We plotted a subset of our narratives against the stock market returns of TSMC (a large semiconductor company, ticker TSM). Already you can see there is some relation.
Example of a narrative time series vs stock returns
Determining the most important narratives: From all the narratives supplied, we can let our model analyse which ones are most important for (future) returns.
Most important narratives for TSM returns
Here you see the top 10 most important narratives. For TSM this means that an increase in the exposure of consumer news is an important indicator for future price, almost as important as the sentiment of news on Taiwan (second row). This in turn is followed by the negative exposure of consumer headlines.
Monitoring narratives and running predictions: All these metrics (and many more) are then continuously updated on our dashboard. In this way the investor’s intuition can be quantified and monitored. Our engine and AI model are able to build time series for any narrative, so all of our dashboards are unique and tailor-made to the needs of the investor or client. It helps them leverage all their years of experience and intuition by means of our dataset and AI analyses.
Narrative-Investing.IO custom dashboard
This, in a nutshell, is how we tried to build a model that offers the best of both worlds: human intuition combined with the rigor and insight of AI. In a following piece we will dive into some more use cases (spoiler: scenario analyses, trend watching, theme investing). And in case you are interested in analysing your own intuition and narratives by means of a (free) customized example dashboard, go to www.narrative-investing.io and let us know. We are happy to help.