How to build a quantamental system for investment management

A quantamental system combines customized high-quality databases and statistical programming outlines in order to systematically investigate relations between market returns and plausible predictors. The term “quantamental” refers to a joint quantitative and fundamental approach to investing. The purpose of a quantamental system is to increase the information efficiency of investment managers, support the development of robust algorithmic trading strategies, and reduce the costs of quantitative research. Its main building blocks are [1] bespoke proprietary databases of “clean” high-quality data, [2] market research outlines that analyse the features of particular types of trades, [3] factor construction outlines that calculate plausible trading factors based on theoretical reasoning, [4] factor research outlines that explore the behaviour and predictive power of these trading factors, [5] backtest outlines that investigate the commercial prospects of factor-based strategies, and [6] trade generators that calculate positions of factor-based strategies.

The post is a very condensed guide to one particularly powerful structure of a quantamental system, based on over a decade of experience with building quantamental tools in the macro trading space.

The post ties in with the SRSV summary on quantitative methods to increase macro information efficiency.

What is a quantamental system?

A modern quantamental system is a collection of customized databases, statistical outlines (scripts of code), and special methods and classes, whose purpose is to display and analyse the quantitative relations between asset prices (or returns) and potential fundamental drivers. Code is typically written in programming languages that are suitable for data science, such as Python or R. Methods and classes enable the operation of specific repeated tasks (such as data analysis) in a simple and transparent manner. The term ‘fundamentals’ here means quantifiable and observable phenomena that have a theoretically founded and plausible impact on asset prices. Thus, ‘fundamentals’ are not just based on ‘fair value’, but can include trends in economic conditions, risk premia, implicit subsidies, endogenous market risks (such as positioning), and price distortions.

A quantamental system serves three main purposes:

The first purpose is information efficiency of the investment process, which bolsters the performance of human traders. In particular, a quantamental system condenses a vast array of quantitative information into a small manageable set, tailored to the style and mandate of a particular manager. The quantamental system can also ‘test’ trade ideas and strategies of portfolio managers, based on historical experience. Indeed, one of the most powerful sources of value-generating investment principles is the very combination of intuition of experienced traders and efficient quantitative investigation.
Second, a quantamental system is a powerful basis for algorithmic trading. The output is naturally precise and easily convertible into trading rules. Indeed, the theoretical foundation of quantamental trading rules often makes them more robust and reliable than purely quantitative strategies.
Third, a quantamental system holds great potential for cost savings. The costs of ad-hoc non-systematic research are widely underestimated. All too often managers ask ‘desk quants’ to ‘check’ the relation between some variable and market development. This case-by-case research is wasteful because it [1] repeats basic steps of data collection and data wrangling, [2] makes it hard to integrate findings with other projects or algorithmic rules, and [3] tends to lose know-how once the quant leaves the firm or moves on to other projects.

Importantly, with a quantamental system almost everyone in an investment management organization wins. Individual investment managers gain a relevant information advantage over their competitors to boost their personal track records. Researchers and economists can make their work and experience more directly ‘tradable’ and hence demonstrate their contribution to return generation. And shareholders of management companies gradually ‘collect’ the know-how of senior professionals in a system, thus supporting the long-term value of the fund.

The proprietary database

While a quantamental system relies on external databases, it typically intermediates external sources through a bespoke database that focuses on relevant information in a convenient form. The term ‘bespoke’ implies specialization. Wholesale databases, such as Refinitiv, Bloomberg or Macrobond, hold a vast array of information, but only a fraction of it is relevant and usable for a specific investment process.

A proprietary database typically provides four key services:

It selects the data that hold the most promise for value generation. This is not trivial. For example, the performance of fixed income markets across countries is often related to credit growth and lending conditions. However, there are many statistics related to credit conditions and selecting the ones that are most meaningful and conventionally followed requires knowledge of the economy and its financial system.
It wrangles the data. Many potentially relevant data series require considerable preparatory work. Before data are eligible for a specialized internal database they often require operations, such as ‘stitching’ (combining two data series representing a similar concept at different points in time), seasonal and other calendar-adjustment, treatment of missing observations or mismeasured observations, exclusion of invalid data and so forth.
It makes data time consistent. For trading strategies, non-market data, such as company and economic reports, must be recorded at the time they became available to the market and in the form they became available to the market. The first means that many reports need to be lagged beyond their period of reference. The second often requires distinguishing between revised and unrevised data.
It documents the meaning of the data. This is probably one of the most underestimated benefits of a good quantamental system. Many economic reports, in particular, are poorly understood. Market participants know their names, but are confused about their meaning. For example, a business survey for the month of July may not actually report the level of confidence in July, but the assessment of business affairs in June and in comparison with the same month a year ago. Ignorance about the meaning of data leads to misspecified trading rules and expensive judgment errors.
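The time-consistency requirement can be illustrated with a minimal sketch. It assumes a hypothetical list of data releases, each tagged with its publication date and its reference period, and looks up what the market could actually have known on a given day. The release data, the tuple layout and the function name `value_as_of` are illustrative assumptions, not part of any particular database.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical releases: (release_date, reference_period, value).
# The values are made up for illustration only.
releases = [
    (date(2023, 7, 14), "2023-06", 1.2),
    (date(2023, 8, 15), "2023-07", 0.9),
    (date(2023, 9, 14), "2023-08", 1.5),
]


def value_as_of(releases, as_of):
    """Return the latest (reference_period, value) published on or before as_of.

    Assumes releases are sorted by release date in ascending order.
    """
    release_dates = [r[0] for r in releases]
    idx = bisect_right(release_dates, as_of)
    if idx == 0:
        return None  # nothing had been published yet
    _, period, value = releases[idx - 1]
    return period, value


# On 1 August the market only knew the June figure, not the July one.
print(value_as_of(releases, date(2023, 8, 1)))
```

Indexing by release date rather than by reference period is what prevents a backtest from using information before it was published. Handling revisions would additionally require storing each vintage of a figure, which this sketch does not attempt.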

One essential part of a proprietary database is time series of generic returns for trading positions, including information on transaction costs. This could be, for example, generic returns of FX forward trades (rolled at specific maturities), interest rate curve positions or volatility-targeted futures positions and so forth. In practice, one focuses on generic returns of commonly traded types of positions of the investment manager. The quality of these return series is critical for evaluating fundamental trading factors (see below). Generic returns can be quite tedious to procure and to calculate. There is (at present and to my knowledge) no commercial provider of a large range of generic returns on derivatives and cash trading positions.

Market research outlines

Market research outlines are generic programming scripts that analyse and visualize the returns of a specific type of trading position, in order to understand its empirical distribution and its behaviour under certain circumstances. This is a preparatory step for strategy development and particularly important for developers that have no personal experience in trading the type of position.

Understanding the return profile of a class of trades is beneficial for three reasons:

It provides guidance for risk management. In particular, it shows the proclivity of a certain type of position to outliers or even illiquidity.
It assists in the development of a suitable strategy. For example, return history reveals if similar trades have been highly correlated or independent across currency areas. Also, we can learn if returns of a type of trading position have been highly dependent on global directional risk. And we can see if returns are one-sided. For example, long-volatility option returns are mostly negative, except for occasional outsized positive returns, making the timing of trading factors critical.
It improves the quality of backtests. One of the greatest benefits of studying returns is the discovery of incorrect and meaningless observations. This feeds back into the data wrangling for the internal database and often requires ‘blacklisting’ of certain periods and markets for the purpose of evaluating trading strategies.
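A market research outline of this kind might start with a simple return profile. The sketch below is only an illustration: the function name `return_profile` is hypothetical and the return series is made up to resemble the one-sided pattern of a long-volatility position described above.

```python
import statistics


def return_profile(returns):
    """Summarise the empirical distribution of a return series."""
    n = len(returns)
    mean = statistics.fmean(returns)
    sd = statistics.pstdev(returns)  # population standard deviation
    # Skewness: positive values indicate rare, large positive outliers.
    skew = sum((r - mean) ** 3 for r in returns) / (n * sd ** 3)
    return {
        "mean": mean,
        "stdev": sd,
        "skewness": skew,
        "share_negative": sum(r < 0 for r in returns) / n,
        "worst": min(returns),
        "best": max(returns),
    }


# Made-up monthly returns (%): mostly small losses, one outsized gain.
long_vol = [-1.0] * 9 + [12.0]
profile = return_profile(long_vol)
for name, value in profile.items():
    print(f"{name}: {value:.2f}")
```

A high share of negative periods combined with strongly positive skewness is the signature of a position whose value depends on timing. A fuller outline would add correlations across markets and with global directional risk.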

Factor construction outlines

Factor construction outlines calculate tradeable factors (the basis of signals) from available high-quality data based on theory and logical reasoning. It is unlikely that any of the ‘unconstructed’ series can directly serve as a trading factor. The construction of such factors should be an intermediate step rather than part of strategy or factor research in order to focus on quality and logical consistency. Otherwise, there is a great temptation to construct trading factors based on ‘trial and error’. One of the most harmful habits of strategy developers is to check out and mutate factors until they deliver some predictions of returns. The measurement of a ‘good factor’ should be its logical quality and apparent relevance. Application for trading strategies will then follow naturally.

Constructing factors and their components only requires a logically and mathematically consistent construction plan and some basic empirical ‘checkup’. The latter should make sure that the constructed factors behave roughly as expected and are not distorted by apparent miscalculations and data errors.

For the purpose of factor construction one can deploy the full arsenal of econometric weapons for extracting information from data, which is provided by many powerful statistical packages in Python or R. This means, for example, that we can condense many related series into one (dynamic factor models, principal components), focus on the unpredicted component of a time series, estimate the trend and variance of a time series, and so forth.
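As one elementary illustration of a construction step, the sketch below scales a raw series by its own history, using only observations that were available before each date. This is merely one possible normalisation, chosen here for simplicity. The function name `expanding_zscores` and the `min_obs` parameter are assumptions for the example.

```python
import statistics


def expanding_zscores(series, min_obs=3):
    """Z-score each value against the mean and stdev of earlier values only.

    Using only past observations avoids look-ahead bias: the score at time t
    never depends on data observed at t or later.
    """
    scores = []
    for t, x in enumerate(series):
        history = series[:t]
        if len(history) < min_obs:
            scores.append(None)  # not enough history yet
            continue
        mu = statistics.fmean(history)
        sd = statistics.stdev(history)
        scores.append((x - mu) / sd if sd > 0 else None)
    return scores


# Made-up raw indicator values for illustration.
raw = [1.0, 2.0, 3.0, 4.0]
print(expanding_zscores(raw))
```

The score can be read as the standardised deviation of the latest value from what a naive historical average would have suggested. More elaborate methods, such as dynamic factor models, follow the same principle of using only information available at the time.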

Factor research outlines

Factor research outlines are generic programming scripts that investigate the behaviour and predictive power of factors. This type of research can be divided into three research questions:

Does the trading factor look plausible? Often a careful study of a trading factor already tells us that the original reasoning when calculating the factor was flawed. This means we have to go back to the factor construction outline and make changes. This is not data mining but a valid basis for altering factors. For example, we may find an inflation trend factor is far too volatile to credibly represent a meaningful trend and may require additional smoothing. Or we may find that a specific factor for a directional strategy has a strong ‘long bias’, i.e. prescribes a long risk position 99% of the time, because we forgot to set a suitable threshold. In this way, factor research is an additional checkpoint for quality control.
Does the trading factor predict target returns? This is typically the core question researchers focus on. Many statistical tests and visualizations have been developed for the purpose. The important point is to ascertain plausibility and robustness. Plausibility means that strength and horizon of the relation should be in accordance with underlying theory. For example, price distortion measures should have a strong short-term relation with future returns, while macro trends should have a more subtle medium-term relation. Robustness means that the relationship holds across different time periods, for different countries (if returns are uncorrelated) and for different plausible versions of the factor.
Is the factor tradable? Stylized initial backtests inform on the likely return profile of trading a simple version of the factor. Typically, if the simple versions of the factor do not produce acceptable value then optimized versions will not do either, at least in live trading. Stylized backtests answer the following types of questions: Is the factor suitable for a stand-alone algorithmic strategy? Is the factor suitable for supporting another algorithmic strategy, such as improving trend following? Is the factor suitable as a ‘pointer’ for discretionary trading opportunities, for example if it signals rare opportunities that require more careful risk management?
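The first two questions can be illustrated with a small diagnostic sketch. It assumes that `factor[t]` is known at the end of period t and that `returns[t]` is earned over period t, so that the factor is compared with the return of the following period. The function names and the data are hypothetical.

```python
import statistics


def pearson(x, y):
    """Plain Pearson correlation of two equally long sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def factor_diagnostics(factor, returns):
    """Check long bias and the relation between factor and next-period return."""
    lagged_factor = factor[:-1]  # signal known at the end of period t
    next_returns = returns[1:]   # return earned over period t + 1
    hits = sum((f > 0) == (r > 0) for f, r in zip(lagged_factor, next_returns))
    return {
        "long_share": sum(f > 0 for f in factor) / len(factor),
        "predictive_corr": pearson(lagged_factor, next_returns),
        "hit_ratio": hits / len(next_returns),
    }


# Made-up data in which the factor leads returns by one period.
factor = [1.0, -1.0, 2.0, -2.0, 1.0]
returns = [0.0, 0.5, -0.4, 1.1, -0.9]
print(factor_diagnostics(factor, returns))
```

A `long_share` close to 1 would flag the long bias mentioned above. The correlation and hit ratio are only a starting point: robustness would require repeating them across sub-periods, countries and plausible factor versions.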

Backtest outlines

Backtest outlines are generic programming scripts that assess the prospects for ‘commercial success’ of a factor. This is very different from the marketing-oriented backtests shown in many presentations. The judgment over commercial prospects should not be based on a single or even optimized version of the factor. Rather, the backtest outline should investigate either a range of plausible versions of the factor or a range of plausible algorithms that optimize the factor version sequentially strictly out of sample and based on past experience.
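The idea of judging a range of plausible versions rather than a single optimized one can be sketched as follows. The version names, the return figures and the helper functions are made up for illustration, and monthly returns are assumed when annualising.

```python
import statistics


def sharpe(returns, periods_per_year=12):
    """Annualised Sharpe ratio of a periodic return series (no risk-free rate)."""
    return (statistics.fmean(returns) / statistics.stdev(returns)
            * periods_per_year ** 0.5)


def summarize_versions(version_returns):
    """Report the spread of Sharpe ratios across factor versions, not the best."""
    ratios = sorted(sharpe(r) for r in version_returns.values())
    return {
        "min": ratios[0],
        "median": statistics.median(ratios),
        "max": ratios[-1],
    }


# Hypothetical monthly returns of three plausible versions of one factor.
versions = {
    "lookback_3m": [0.02, -0.01, 0.03, 0.00, 0.01, -0.02],
    "lookback_6m": [0.01, 0.00, 0.02, -0.01, 0.01, -0.01],
    "lookback_12m": [0.00, -0.01, 0.01, 0.01, -0.02, 0.00],
}
summary = summarize_versions(versions)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Reporting the median and the range makes it visible when the appeal of a strategy rests on one fortunate parameter choice. It does not replace the sequential out-of-sample optimization described above.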

The performance of different versions of the principal strategy factor over different time periods can be the basis of Bayesian estimation of strategy performance parameters. Bayesian estimation does not just deliver a single parameter estimate, such as a long-term Sharpe ratio, but estimates the distribution of this parameter. Hence, it also informs about the uncertainty of the estimate.
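A deliberately simple way to illustrate this is a conjugate normal model: a normal prior on the long-term Sharpe ratio is updated with Sharpe estimates from different versions or periods. The sketch assumes that the estimates are independent with a known common standard error, which is a strong simplification since versions of the same factor are usually correlated. All numbers and names are hypothetical.

```python
from statistics import NormalDist


def posterior_sharpe(prior_mean, prior_sd, estimates, estimate_se):
    """Posterior of the long-term Sharpe ratio under a normal-normal model.

    Assumes each estimate is an independent noisy observation of the true
    Sharpe ratio with known standard error estimate_se.
    """
    prior_prec = 1 / prior_sd ** 2               # precision of the prior
    data_prec = len(estimates) / estimate_se ** 2  # precision of the data
    post_var = 1 / (prior_prec + data_prec)
    sample_mean = sum(estimates) / len(estimates)
    post_mean = post_var * (prior_prec * prior_mean + data_prec * sample_mean)
    return NormalDist(post_mean, post_var ** 0.5)


# Sceptical prior centred on zero, four made-up backtest Sharpe estimates.
post = posterior_sharpe(prior_mean=0.0, prior_sd=0.5,
                        estimates=[0.6, 0.4, 0.5, 0.5], estimate_se=0.5)
print(f"posterior mean: {post.mean:.2f}, posterior stdev: {post.stdev:.2f}")
print(f"probability that the true Sharpe ratio is negative: {post.cdf(0.0):.3f}")
```

The posterior mean is pulled from the sample average towards the sceptical prior, and the posterior standard deviation quantifies the remaining uncertainty. Accounting for correlation between versions would widen that distribution.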

Put simply, a good backtest outline delivers at least three different types of information about the strategy:

The first is the probable long-term performance in terms of return, volatility and seasonality. Here seasonality refers to longer periods of under- and outperformance of the strategy.
The second type of information is the uncertainty of long-term performance. This considers how strong empirical evidence and informed prior views actually are. It gives, for example, the probability that the strategy has only produced value accidentally in the past and may not perform ‘out of sample’.
The third type of information combines the former two and estimates the uncertainty of performance over a shorter business-relevant horizon. For example, this assesses the risk of negative return over a 1-3 year forward horizon (which is the longest most allocators are ready to wait for returns).
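The third type of information can be approximated with a short calculation. The sketch assumes that the annual Sharpe ratio is normally distributed with a given mean and standard deviation (the parameter uncertainty) and that returns are independent and normal across years. Under these assumptions the volatility-scaled return over T years has mean `sharpe_mean * T` and variance `sharpe_sd**2 * T**2 + T`. The function name and inputs are hypothetical.

```python
from statistics import NormalDist


def prob_negative_return(sharpe_mean, sharpe_sd, years):
    """Probability of a negative cumulative return over the given horizon.

    Combines uncertainty about the true Sharpe ratio (sharpe_sd) with the
    ordinary randomness of returns, assuming independent normal returns.
    """
    mean = sharpe_mean * years
    sd = (sharpe_sd ** 2 * years ** 2 + years) ** 0.5
    return NormalDist(mean, sd).cdf(0.0)


# Made-up inputs: expected Sharpe ratio of 0.4 with and without uncertainty.
for horizon in (1, 3):
    known = prob_negative_return(0.4, 0.0, horizon)
    uncertain = prob_negative_return(0.4, 0.05 ** 0.5, horizon)
    print(f"{horizon}y: known Sharpe {known:.3f}, uncertain Sharpe {uncertain:.3f}")
```

Parameter uncertainty raises the probability of a disappointing outcome relative to the case of a known Sharpe ratio, and its relative weight grows with the horizon because it does not diversify over time.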

Trade generator outlines

Trade generators are simply programs that calculate trading positions in real-time. They can be the basis of automated executions or simply propositions for discretionary trading.

Trade generator outlines are based on a single version of the trading factor, which may be chosen based on historic performance or theoretical plausibility. The actual trading signal is typically enhanced by risk-management parameters, such as volatility targeting or limits to concentration on single positions or overall leverage. Trade generators usually run their own updating backtests in order to monitor the consistency of theoretical and live PnL.
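The risk-management enhancement can be sketched as a function that turns a signal into a position. The sketch assumes a signal between -1 and 1, daily returns for estimating volatility, and hypothetical parameters `vol_target` and `max_leverage`. It illustrates volatility targeting with a leverage cap and is not a complete trade generator.

```python
import statistics


def target_position(signal, recent_returns, vol_target=0.10,
                    max_leverage=2.0, periods_per_year=252):
    """Scale a signal to a volatility target and cap the resulting leverage.

    signal: desired direction and conviction, assumed to lie in [-1, 1].
    recent_returns: recent periodic returns of one unit of the position.
    """
    period_vol = statistics.stdev(recent_returns)
    annual_vol = period_vol * periods_per_year ** 0.5
    if annual_vol == 0:
        return 0.0  # no usable volatility estimate, so take no position
    raw = signal * vol_target / annual_vol
    # Limit the position to the permitted leverage in either direction.
    return max(-max_leverage, min(max_leverage, raw))


# Made-up daily returns: a volatile market and a very quiet one.
volatile = [0.01, -0.01] * 10
quiet = [0.001, -0.001] * 10
print(target_position(1.0, volatile))  # scaled below one unit
print(target_position(1.0, quiet))     # capped at max_leverage
```

The cap matters most in quiet markets, where pure volatility targeting would otherwise prescribe very large positions. Concentration limits across several positions would need an additional step at the portfolio level.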

It is often helpful to make the output of the trade generator easily readable so that managers can double-check the plausibility of the signals. This not only helps spot errors in the code and in the data but also supports further development of the factor. The point is that seeing a trading factor in action can help us understand whether an algorithm does something evidently against the strategy’s intention and where there is the most obvious room for improvement.