Quantitative Trading Is Going To Eat All Markets
Introduction
A lesser-known fact about sysls lore is that I actually started my career in finance as a discretionary trader. I ran an arbitrage book that earned a premium from collapsing commodity spreads between SGX and INE/DCE. While many of these strategies were repeatable structural trades that could be codified, there were a handful where discretionary trading was the only way to go.
For example, we discovered that rubber merchants in Singapore were highly incentivised to bid rubber futures in Singapore extremely aggressively during certain time periods, because their OTC contracts would reference the current futures prices. You could effectively come in and bet that prices were going to collapse within the next trading session, because there was no fundamental basis for the price shock — simply a temporary dislocation. It was not trivial to algorithmically discern when large price shocks were the result of shady business practices and when they were the result of an inflow of informed buying.
These experiences shaped my thinking on discretionary trading, and thus, unlike my more pure-blooded counterparts, I actually have a great deal of respect for discretionary traders. I understand that the alpha comes from being able to generalize small-sample events into a trade thesis — a feat that has been impossible for quantitative trading, until now.
In the rest of this article, I write about why we are structurally poised for another renaissance in quantitative trading, and why discretionary trading will soon cease to be able to compete meaningfully with systematic, quantitative processes.
What Is Quantitative Trading, Really?
This is not a technical exposition of quantitative trading, so I will keep the language simple and hopefully intuitive. At its core, quantitative trading is really just about finding patterns in the markets that we believe will continue to repeat in the future.
Essentially, it is finding an event A such that whenever you observe A, you can bet that returns will change in a predictable way, B. All the difficulty, all the modelling, all the sophistication come from quantifying and modelling A and understanding the A → B relationship.
A simple example of the above is that whenever you observe prices rising two days in a row, you can bet that on the third day they will mean-revert. This might play out with a probability of 50.01% and a magnitude of 2bps (hundredths of a percentage point). Because it has a (tiny) positive expectation, you can essentially bet on this a million times to extract this positive expectation and make real money.
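To make the arithmetic concrete, here is a back-of-the-envelope sketch using the toy numbers above (a symmetric 2bp move and a 50.01% hit rate); it ignores costs, sizing, and compounding entirely:

```python
# Expected value of a single bet on the toy pattern above: with
# probability p the 2bp reversal plays out, otherwise we lose 2bps.
p = 0.5001        # probability the third-day reversal happens
move = 0.0002     # 2bps expressed as a decimal return

ev_per_bet = p * move + (1 - p) * (-move)   # = (2p - 1) * move
print(f"EV per bet: {ev_per_bet * 1e4:.4f} bps")

# Repeat the bet a million times and the tiny edge aggregates into
# something worth harvesting (again, costs and sizing ignored).
n_bets = 1_000_000
print(f"EV over {n_bets:,} bets: {ev_per_bet * n_bets:.2%}")
```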
Of course, this is a simplification; professionals typically do not trade a single signal, because trading costs are too onerous if you attempt that. But at its core, that’s really all there is to quantitative trading: finding patterns A → B that you can bet on an indefinite number of times to harvest the positive expectation from the pattern.
When you model and find such a pattern, the quantified pattern you bet on is the “signal”. So you can essentially expect that whenever someone says they have found a “signal”, they have really found a pattern that they believe will repeat itself in the future with some positive expectation.
Why Hire PhDs If It’s So Simple, Then?
Because it’s not! It’s actually very difficult to model and identify any particular pattern. The above example would be easy to model and identify — anyone could observe how prices have changed over the past two days. But this is not generally true.
For example, suppose you hypothesize that another simple pattern exists. This pattern is: “whenever correlation between an instrument’s returns and its sector’s returns rises, then the instrument’s cross-sectional rank will rise.”
This pattern may be simple to understand but would not be trivial to model, because you’d need to gather many instruments, get their sectors, make sure your instrument set covers all the instruments found in all the sectors, calculate all instrument returns, calculate all sector returns, calculate their correlation, and of course, calculate an instrument’s cross-sectional rank within its sector.
A small increase in complexity leads to a large leap in the work needed to model and study the pattern, and therefore to many more ways things can go wrong.
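To give a flavour of what those steps look like in practice, here is a minimal pandas sketch. It assumes you already have a long-format DataFrame of daily instrument returns with a sector label attached; the column names and the 60-day window are illustrative choices, not a prescription:

```python
import pandas as pd

# Assumed input: a long-format DataFrame with columns
# ['date', 'instrument', 'sector', 'ret'], covering every instrument
# in every sector we care about. Column names are illustrative.
def sector_correlation_features(df: pd.DataFrame, window: int = 60) -> pd.DataFrame:
    df = df.sort_values(["instrument", "date"]).copy()

    # Equal-weighted sector return on each date.
    df["sector_ret"] = df.groupby(["date", "sector"])["ret"].transform("mean")

    # Rolling correlation of each instrument's returns with its sector's returns.
    def rolling_corr(g: pd.DataFrame) -> pd.Series:
        return g["ret"].rolling(window).corr(g["sector_ret"])

    df["corr"] = df.groupby("instrument", group_keys=False).apply(rolling_corr)

    # The event A: correlation rising versus its previous value.
    df["corr_rising"] = df.groupby("instrument")["corr"].diff() > 0

    # The outcome B we care about: the instrument's cross-sectional
    # rank within its sector on each date.
    df["xs_rank"] = df.groupby(["date", "sector"])["ret"].rank(pct=True)
    return df
```

Even this toy version has to get sorting, alignment, sector membership, and rolling windows right before you can even ask whether the pattern exists.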
How We “Typically” Find Signals
Covered in detail, this would fill countless articles on its own, so I will keep it extremely brief. You can stumble onto signals in one of two ways. It starts with idea generation: you either hypothesize an idea through inspiration, idealization, or observation, OR you data-mine (search) for ideas in a statistical fashion.
Once you find an idea worth testing, you need to give it structure by figuring out how you want to express the idea as a pattern, A → B. For example, a “mean-reversion” idea can be expressed as many possible different patterns: time-series mean-reversion, cross-sectional mean-reversion, mean-reversion in different dimensions, etc.
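To make “the same idea, different expressions” concrete, here is a minimal sketch of the first two variants. It assumes a wide DataFrame of daily returns with one column per instrument; the two-day lookback simply echoes the earlier toy example:

```python
import pandas as pd

# Assumed input: `returns` is a wide DataFrame of daily returns,
# indexed by date, with one column per instrument.
def ts_mean_reversion(returns: pd.DataFrame, lookback: int = 2) -> pd.DataFrame:
    # Time-series expression: bet against each instrument's own recent
    # move; the more it has risen over the lookback, the more we fade it.
    return -returns.rolling(lookback).sum()

def xs_mean_reversion(returns: pd.DataFrame, lookback: int = 2) -> pd.DataFrame:
    # Cross-sectional expression: bet against each instrument's recent
    # move relative to its peers on the same date.
    recent = returns.rolling(lookback).sum()
    return -recent.sub(recent.mean(axis=1), axis=0)
```

Same underlying idea, two different patterns, and they can behave very differently once tested.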
Then, you need to validate the signal by running some statistical tests. The laziest form of this is backtesting, and for simplicity we will assume that’s all it takes. So you backtest the signal, and if it does well, you assume it’s good and that it will generalize.
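A deliberately lazy version of that validation step might look like the sketch below: lag the signal so it never peeks ahead, trade it against the next day’s returns, and use an annualised Sharpe ratio as the crude “does it do well?” statistic. No costs, no risk model, exactly the laziness described above:

```python
import numpy as np
import pandas as pd

def lazy_backtest(signal: pd.DataFrame, returns: pd.DataFrame) -> float:
    # Use yesterday's signal to trade today's return, so the test never
    # peeks ahead; scale weights so their absolute values sum to one.
    weights = signal.shift(1)
    weights = weights.div(weights.abs().sum(axis=1), axis=0)

    daily_pnl = (weights * returns).sum(axis=1)

    # Annualised Sharpe ratio of the resulting daily P&L.
    return float(np.sqrt(252) * daily_pnl.mean() / daily_pnl.std())
```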
You then repeat this “idea generation → expression → backtesting” pipeline until you have a giant compendium of signals.
The (Old) Limits Of Quantitative Trading
Quantitative trading has really eaten much of the public markets, and for good reason. It introduces some rigor, and more importantly a lot of discipline, into a field that is plagued by an extremely low signal-to-noise ratio.
Yet there are also quite clearly limits to quantitative trading: quantitative strategies typically amount to only around 15%–50% of the business at the largest pod shops, with discretionary long/short and global macro being the other large segments.
Quantitative trading’s difficulty in claiming a larger share of the pie essentially comes down to two main problems:
First, it is quite difficult to quantify some events. The difficulty typically stems from data collection and/or the fuzziness of an event. Perhaps you want to create a signal that bets that the more management understands the business, the more likely future returns are to be positive. How do you quantify that event? An analyst might be able to assess it qualitatively by watching how management handles questions about the company, but turning that judgment into a number is far from straightforward.
Second, some patterns have an extremely small sample size, making it really difficult to “backtest” them and gain any kind of statistical confidence. An example that comes to mind would be a pharma company seeking approval of a novel drug. If it is truly a novel drug, it is an N=1 event; how do you model and gain statistical confidence in that?
For the most part, quantitative trading simply bypassed these problems by not trying to compete in areas plagued by them at all. You see almost zero meaningful quantitative trading in the pharma industry, as an example. It is an open secret that almost all quantitative strategies are ex-pharma.
It is also why quantitative trading in venture as an asset class simply does not exist: (unique) start-ups are in general N=1 events, and when a company is sufficiently early, quantifying founder traits is just about the only thing you can go on to make a meaningful bet, and that is a fuzzy and difficult affair.
Why These Limits Are Slowly Evaporating
In a single word: AI. In more words: with how good LLMs have gotten, we can essentially count on them to (1) turn almost any kind of unstructured, fuzzy information into a quantifiable metric, and (2) reason about “one-off” events in a higher-dimensional space where learnings can be “generalized”.
Natural Language Processing (NLP) as a means to transform fuzzy, unstructured documents into a quantifiable metric that can be used in a quantitative trading process is certainly not new. However, the lack of intelligence in these older techniques greatly limited what was previously possible.
NLP techniques were largely confined to shallow tasks, such as sentiment analysis of news headlines, counting frequencies of words, or classifying documents into buckets. They were useful, and continue to be used today, but a large part of these techniques collapsed rich information into single-dimensional, crude numerical proxies.
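For a sense of how crude that compression was, here is a caricature of the word-counting approach (the word lists are obviously made up); note how a headline containing both good and bad news collapses to a meaningless zero:

```python
# Collapse a rich headline into a single crude number by counting
# words from fixed sentiment lists.
POSITIVE = {"beat", "record", "growth", "upgrade"}
NEGATIVE = {"miss", "lawsuit", "downgrade", "recall"}

def headline_sentiment(headline: str) -> int:
    words = headline.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

print(headline_sentiment("record quarter despite supplier lawsuit"))  # prints 0
```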
LLMs Can Quantify Anything With Less Lossy Compression
Given the (rising) intelligence of LLMs, you can now feed an LLM transcripts of many historical earnings calls, have it evaluate management’s depth of understanding across a multi-dimensional rubric (e.g. specificity of answers, willingness to engage with difficult questions, consistency of narrative, etc.), and get back a structured, high-dimensional, quantifiable output.
Hordes of analysts (each with their own biases, inconsistencies, and limited bandwidth) that used to do this can now be replaced with a single LLM in a fraction of the time. You can do this at scale, across thousands of companies and hundreds of quarters of historical data, producing a clean panel dataset that can be plugged directly into a standard quantitative pipeline.
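A minimal sketch of what that pipeline can look like is below. Here call_llm is a hypothetical stand-in for whichever LLM client you actually use, and the rubric dimensions are just the illustrative ones from above:

```python
import json

# Illustrative rubric dimensions; a real rubric would be richer.
RUBRIC = [
    "specificity_of_answers",
    "engagement_with_difficult_questions",
    "consistency_of_narrative",
]

PROMPT_TEMPLATE = (
    "You are scoring an earnings call transcript. For each dimension in "
    "{dims}, return an integer from 1 to 10 reflecting management's depth "
    "of understanding. Respond with a single JSON object mapping dimension "
    "name to score.\n\nTranscript:\n{transcript}"
)

def score_transcript(transcript: str, call_llm) -> dict:
    # `call_llm` is a hypothetical callable (prompt -> response text);
    # swap in whichever provider or client you actually use.
    prompt = PROMPT_TEMPLATE.format(dims=RUBRIC, transcript=transcript)
    scores = json.loads(call_llm(prompt))
    return {dim: int(scores[dim]) for dim in RUBRIC}
```

Looped over thousands of companies and many quarters of transcripts, the rows this produces form exactly the kind of panel dataset, keyed by company and quarter, that a standard quantitative pipeline expects.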
The space of previously “unquantifiable” events that are now quantifiable is vast: regulatory filings, legal documents, product reviews, a host of management/founder statistics gleaned from social media, career history, and so on. The list goes on and on.
LLMs Can Generalize Any Patterns In A Higher-Dimensional Space
Humans are perhaps the greatest one-shot learning machines ever to exist. Sometime when we were younger, we touched something hot — perhaps a boiling kettle — and got burned for the first time. Tears aside, a lifetime association of avoiding hot things was immediately built. Whether it’s open flames, something that might look hot, or something that has been described to us as hot, we learn to be careful and avoid it all. Why? Because we did not pattern-match to just the kettle burning us; instead, we generalized, at a higher dimension, to the features of what hurt us — namely, the property of something being hot.
Classical machine learning models are traditionally extremely bad at doing this. They are unable to “reason in higher dimensions”, and thus often fail to generalize outside of the dimensions explicitly provided in the training set. Further, they are notoriously incapable of “one-shot learning”, and need thousands of samples to learn a behavior.
This is all very conceptual, so let’s try to solidify it.
Essentially, the central argument here is that the traditional quantitative paradigm requires statistical regularity. You need enough historical instances of “event A” to build confidence that “A → B” will generalize. It is why there are entire segments of markets that are “quant-proof” — because participants believe that the events are too idiosyncratic and the sample sizes are too small.
LLMs essentially allow us to reason in a higher-dimensional space where generalization happens at the level of features, not events. A novel drug approval is, yes, an N=1 event at the level of the specific drug. But it is not an N=1 event at the level of its features: the therapeutic area, the mechanism of action, the quality of the Phase II data, the composition of the advisory committee, the regulatory precedent for similar compounds, the strength of the sponsor’s prior interactions with the FDA, and so on. An LLM can decompose a “unique” event into a high-dimensional feature vector where each feature has been observed many times, in many combinations, across many analogous situations.
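As a rough sketch of what betting at the level of features rather than events can look like once an LLM has done the decomposition: the field names below are assumptions about what might be extracted, the features are kept numeric for simplicity, and the nearest-neighbour lookup is just one illustrative way to use the analogs:

```python
from dataclasses import dataclass, asdict

import numpy as np

# Illustrative decomposition of a "one-off" drug approval into features.
@dataclass
class ApprovalFeatures:
    phase2_effect_size: float      # strength of the Phase II data
    advisory_votes_for: float      # fraction of the advisory committee in favour
    regulatory_precedents: int     # similar compounds previously reviewed
    sponsor_prior_approvals: int   # sponsor's track record with the regulator

def analogous_outcome(event: ApprovalFeatures,
                      history: list[tuple[ApprovalFeatures, float]],
                      k: int = 10) -> float:
    # Estimate the outcome of an N=1 event by averaging the outcomes of
    # its k most similar historical analogs in feature space.
    x = np.array(list(asdict(event).values()), dtype=float)
    scored = sorted(
        (np.linalg.norm(x - np.array(list(asdict(f).values()), dtype=float)), outcome)
        for f, outcome in history
    )
    return float(np.mean([outcome for _, outcome in scored[:k]]))
```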
This line of reasoning is not new, and is in fact what really good discretionary traders have always done. They pattern-match against a rich internal library of analogous situations, weighting features they’ve learned matter.
The big unlock here is that LLMs are the first technology that can systematically replicate this pattern-matching at scale and with consistency.
The Coming Renaissance
If you put these two things together (everything can be quantified, and everything can be learnt), then asset classes and event types that were previously the exclusive domain of discretionary traders become, in principle, tractable for quantitative approaches.
The pie that quants can address is expanding dramatically. The fuzzy, qualitative edges that have always been the refuge of discretionary traders are being systematically colonized by quantitative processes powered by LLMs.
This is not to say discretionary trading will disappear overnight. There will always be situations where the combination of specific domain expertise, relationships, and human judgment will command a premium. But the margin of that premium is shrinking, and it is shrinking in a way that is structural rather than cyclical.
Discretionary traders who continue to compete on pattern recognition in publicly available information will find themselves increasingly outcompeted by quant firms that can do the same pattern recognition faster, more consistently, and across a wider cross-section. Given that the moat of discretionary trading is slowly eroding, I cannot imagine a future where the supermajority of all trading — across all markets, public and private — is anything other than quantitative and systematic.

