Transaction costs are a key consideration for the development of trading strategies, and not just in final profitability checks. Indeed, disregard for trading costs at the design stage leads to excessive reliance on fleeting small-scale characteristics as return predictors. It also skews the conventional efficient frontier of portfolio choice towards risky trading strategies. A realistic implementable efficient frontier would penalize risky strategies for trading costs, which in turn depend on the size of the risky strategy allocation. This is particularly important for portfolios with large assets under management whose rebalancing has a large market impact. For the development of algorithmic trading strategies with machine learning, it can be highly beneficial to integrate transaction costs into the learning process. A recent paper proposes a “portfolio machine learning method” to that end.

The paragraphs below are excerpts from the paper. Headings, emphasis, and text in brackets have been added. The term “agnostic to transaction costs” in the original text has been replaced by “disregarding transaction costs”.

This post ties in with the summary on quantitative methods for macro information efficiency on this site, particularly the section on backtesting.

## The impact of transaction costs on portfolio choice

“A typical use of machine learning [for portfolio strategies] takes a two-step approach: First, find a function of characteristics that predicts gross returns; and, second, use the resulting forecasts to build portfolios. This typical approach [disregards] transaction costs and turnover, and the resultant investment strategies [in practice often] produce negative returns net of transaction costs…The __high transaction costs of portfolio strategies based on machine learning imply that these strategies are difficult to implement in practice__ and, more broadly, raise questions about the relevance and interpretation of the predictability documented.”

“__Portfolio choice methods and machine learning predictions [should be] evaluated based on the net-of-cost investment opportunities__ that they produce…Textbooks and real-world investors often depict their investment opportunities in terms of the achievable combinations of risk and expected return…The textbook version of the efficient frontier – without trading costs – is a straight line that is tangent to the hyperbola of risky investments. However, we propose that __investors should focus on what…the implementable efficient frontier, namely the efficient frontier net of trading costs__.”

“The textbook frontier is drawn in a frictionless setting that [disregards] trading costs, but real-world investors care about their net return…[The figure below] illustrates frontiers for various methods that we study…[It shows] the implementable efficient frontier for different portfolio methods with [assets under management] of USD10bn by 2020. The dashed lines show indifference curves. The dotted hyperbola is the mean-variance frontier of risky assets without trading costs, implemented by estimating risk and expected return separately, out-of-sample. The grey line is the Markowitz-machine learning efficient frontier before trading cost.”

“The __other [blue, green, red, and violet] lines [in the figure above]…show…the achievable combinations of risk and expected return, net of trading costs.__ Focusing first on the Markowitz portfolio, we see that its net-of-cost, implementable frontier immediately dives into negative expected return territory as soon as it moves away from a 100% risk-free allocation, as seen in the bottom curve.”

“The shape of the implementable efficient frontier may be surprising: Whereas the textbook frontier is a straight line when increasing the allocation to the risky securities while reducing the risk-free allocation (or applying leverage), the true implementable frontier bends down because larger positions incur larger transaction costs. Said differently, we show that the __net-of-cost Sharpe ratio declines along the implementable efficient frontier.__ The feature importance of this portfolio reveals the culprit: __excessive reliance on fleeting small-scale characteristics__ (e.g., 1-month reversal for small stocks), which bear high turnover, high trading costs, and result in poor net returns.”
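
The bending of the frontier can be illustrated with a toy calculation. The sketch below is not taken from the paper: it assumes a single risky portfolio with a hypothetical excess return `mu` and volatility `sigma`, and a hypothetical cost coefficient `cost_coef` under which costs (as a share of wealth) grow with the square of leverage.

```python
import numpy as np

def implementable_frontier(leverage, mu=0.10, sigma=0.20, cost_coef=0.02):
    """Return (volatility, gross return, net return) per leverage level.

    Illustrative assumptions: one risky portfolio with excess return `mu`
    and volatility `sigma`; trading costs as a share of wealth equal
    cost_coef * leverage**2.
    """
    leverage = np.asarray(leverage, dtype=float)
    vol = leverage * sigma                      # risk scales linearly with leverage
    gross = leverage * mu                       # so does the gross excess return
    net = gross - cost_coef * leverage**2       # costs grow faster than leverage
    return vol, gross, net

leverage = np.array([0.5, 1.0, 2.0, 4.0])
vol, gross, net = implementable_frontier(leverage)
gross_sharpe = gross / vol   # constant: the textbook straight line
net_sharpe = net / vol       # declining: the frontier bends down
print(gross_sharpe, net_sharpe)
```

Under these assumptions the gross Sharpe ratio is the same at every leverage level, while the net Sharpe ratio falls as leverage rises, which is the qualitative shape described in the excerpt.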

“While numerous studies use machine learning return forecasts to generate portfolios, their disregard of trading costs leads to excessive reliance on fleeting small-scale characteristics, resulting in poor net returns.”

*Note on prediction method and data:*
“As machine learning method…we use random features (RF) method of Rahimi and Recht (2007)…The RF method transforms the original features using random weights and a non-linear activation function…[For empirical evaluation] we use the dataset from Jensen et al. (2021), a publicly available dataset and replication code of stock returns and characteristics, with the underlying return data sourced from CRSP and accounting data from Compustat. We restrict our sample to US common stocks.”
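
For readers unfamiliar with random features, here is a minimal sketch of one common variant (random Fourier features with a cosine activation). The function name `random_fourier_features`, the `bandwidth` parameter, and the Gaussian weight distribution are assumptions for illustration; the paper's exact specification may differ.

```python
import numpy as np

def random_fourier_features(X, n_features=64, bandwidth=1.0, seed=0):
    """Map raw characteristics X (n_obs x n_inputs) to random non-linear features.

    Assumed variant: Gaussian random weights, uniform random phases,
    cosine activation. Weights are drawn once and never trained.
    """
    rng = np.random.default_rng(seed)
    n_inputs = X.shape[1]
    W = rng.normal(scale=1.0 / bandwidth, size=(n_inputs, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

# Hypothetical data: 5 stocks with 3 characteristics each.
X = np.random.default_rng(1).normal(size=(5, 3))
Z = random_fourier_features(X)
print(Z.shape)
```

A linear model (for example, ridge regression) fitted on `Z` is then non-linear in the original characteristics, which is what makes the approach both flexible and cheap to estimate.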

## Why portfolio size is critical for building trading strategies

“We are interested in deriving machine learning-driven portfolios that can be realistically implemented by market participants with a substantial fraction of aggregate assets under management, such as large pension funds or other professional asset managers.”

“__The implementable efficient frontier has a declining net Sharpe ratio__ – it is not a straight line with a constant Sharpe ratio as in the textbook frontier without trading costs. The declining net Sharpe ratio reflects that __investors cannot freely leverage their portfolio to the desired risk in the presence of trading costs__ – because more leveraged positions are larger and incur larger trading costs. Further, a larger investor faces larger trading costs, leading to a lower frontier.”

“[The figure below] draws the implementable efficient frontier using our [machine learning portfolio selection method explained below] at different levels of wealth or assets under management. Interestingly, while the textbook efficient frontier is the same for all investors, __the implementable efficient frontier depends on the investor’s size__ via the implied trading costs. Indeed, we see that larger investors face worse (i.e., lower) efficient frontiers that ‘cut into’ the hyperbola.”

“The top line shows the Portfolio-machine learning strategy when trading costs are nearly zero since the investor has assets under management near zero. This implementable frontier is obviously good due to the near-zero trading costs, but we note that such a sophisticated machine learning-based trading is hardly feasible for small investors in the real world. __The frontier at each wealth level shows that the set of optimal implementable portfolios is strictly worse for [larger] investors__. This degradation happens for two reasons. First, trading a larger portfolio simply incurs higher market impact cost. However, the investor can partly mitigate direct transaction costs by trading less, but this increases opportunity costs. Indeed, an investor with larger assets under management internalizes price impact from their trades, and this leads the investor to tilt away from highly predictive but costly-to-trade stocks and signals. __Large cost-aware investors opt to forego some predictability in order to hold trading costs at bay__.”

“__As the investor takes more risk, trading costs increase, but the investor compensates by trading more slowly__ toward a more stable aim. Likewise, an investor with larger [assets under management] has a lower trading speed because of larger market impact costs.”
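
Trading slowly toward an aim can be pictured as a partial-adjustment rule. The snippet below is a stylized illustration, not the paper's rule: the function `trade_toward_aim`, the scalar `speed`, and the constant aim portfolio are all assumptions made for simplicity.

```python
import numpy as np

def trade_toward_aim(start_weights, aim_weights, speed, n_periods):
    """Move a fixed fraction `speed` of the remaining gap to the aim each period."""
    w = np.asarray(start_weights, dtype=float)
    aim = np.asarray(aim_weights, dtype=float)
    path = [w.copy()]
    for _ in range(n_periods):
        w = w + speed * (aim - w)   # partial adjustment toward the aim
        path.append(w.copy())
    return np.array(path)

# A lower speed stands in for a larger investor or a riskier target.
fast = trade_toward_aim([0.0], [1.0], speed=0.5, n_periods=3)
slow = trade_toward_aim([0.0], [1.0], speed=0.1, n_periods=3)
print(fast[:, 0], slow[:, 0])
```

The slower trader pays less in market impact per period but spends longer away from the aim, which is the opportunity cost mentioned in the excerpts.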

“Naturally, a tiny investor holds a portfolio close to the Markowitz portfolio because of the low market impact costs. The limiting behavior as assets under management go to infinity is less obvious…The investor ultimately holds almost all wealth in the risk-free asset as trading a meaningful proportion of wealth in illiquid assets becomes too costly.”

## Competing techniques for considering trading costs

### Standard method

“[The standard method for incorporating transaction costs] __first uses machine learning to build expected returns disregarding trading costs__ [and] then in an auxiliary second-stage optimization takes transaction costs into account to build portfolios.”

“The investor faces transaction costs due to her market impact. [A] trade leads to a market impact that can be modeled as a multivariate version of ‘Kyle’s Lambda,’ which is symmetric and positive semi-definite such that __transaction costs are non-negative and may vary as a function of time and the state of the market__. The resulting __transaction cost is the product of the trade size and its market impact__…Specifically, we assume that the (expected and realized) market impact is 0.1%, when trading 1% of the daily dollar volume in a stock.”
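
The cost calculation can be made concrete with a small sketch. It assumes a diagonal impact matrix (no cross-stock impact) and takes the cost literally as trade size times impact, as the excerpt states; the function name `trading_cost_dollars` and the example numbers are hypothetical, and conventions such as an extra factor of one half may differ in the paper.

```python
import numpy as np

# 0.1% price impact when trading 1% of daily dollar volume: 0.001 / 0.01
IMPACT_PER_PARTICIPATION = 0.1

def trading_cost_dollars(trades, daily_dollar_volume):
    """Dollar cost of a vector of dollar trades under linear price impact."""
    trades = np.asarray(trades, dtype=float)
    volume = np.asarray(daily_dollar_volume, dtype=float)
    # Assumed diagonal 'Kyle's Lambda': impact per dollar traded in each stock.
    kyle_lambda = np.diag(IMPACT_PER_PARTICIPATION / volume)
    impact = kyle_lambda @ trades        # fractional price move per stock
    return float(trades @ impact)        # trade size times market impact

# Buy USD 1m of a stock with USD 100m volume, sell USD 2m of one with USD 50m.
cost = trading_cost_dollars([1e6, -2e6], [1e8, 5e7])
print(cost)
```

Because the impact matrix is positive semi-definite, the cost is non-negative for buys and sells alike.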

“The portfolio’s return naturally depends on the portfolio weights but also on the assets under management even though the return is measured in percent of assets under management. This is because [in the model] trading costs increase by the square of assets.”
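
A short sketch shows why the percentage return depends on size. It assumes the same linear-impact setup as the quoted calibration (0.1% impact at 1% of daily volume) with no cross-stock impact; the function name `cost_fraction_of_aum` and the example numbers are hypothetical.

```python
import numpy as np

def cost_fraction_of_aum(weight_changes, daily_dollar_volume, aum, impact_coef=0.1):
    """Trading cost as a fraction of assets under management (AUM)."""
    dw = np.asarray(weight_changes, dtype=float)
    volume = np.asarray(daily_dollar_volume, dtype=float)
    dollar_trades = aum * dw                                    # trades scale with AUM
    dollar_cost = np.sum(impact_coef * dollar_trades**2 / volume)  # cost scales with AUM squared
    return dollar_cost / aum                                    # fraction scales with AUM

dw = [0.01, -0.01]          # same rebalancing in weight terms
volume = [1e8, 5e7]
small = cost_fraction_of_aum(dw, volume, aum=1e9)
large = cost_fraction_of_aum(dw, volume, aum=1e10)
print(small, large)
```

For identical weight changes, a tenfold larger investor pays a hundredfold larger dollar cost and therefore a tenfold larger cost per unit of assets.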

### Portfolio machine learning method

“Our preferred approach __learns directly about portfolio weights, rather than the two-step procedure__ of first predicting returns and then constructing portfolios. We thus refer to this approach as ‘Portfolio-machine learning.’”

“Specifically, we show how to integrate the machine learning problem into a generalized version of the optimal portfolio selection framework…__The main thrust of our approach is to feed the objective function explicit knowledge of implementability__, so it knows to search for perhaps subtle but usable predictive patterns while __discarding more prominent but very costly predictive patterns__. That is, we develop a machine learning method designed to produce optimal portfolios while taking into account realistic frictions from transaction costs of the securities that it trades. Our solution also gives rise to __a new measure of ‘economic feature importance’ that captures which characteristics provide the most investment value in risk-adjusted terms and net of trading costs__.”

“Our approach builds transaction costs directly into the objective function, thus ensuring that the algorithm learns about usable predictability. One element of usable predictability is that it is relevant for large stocks with low transaction costs. Another important element is alpha decay, that is, how persistent a predictor is. __With transaction costs, whatever you buy today you will likely own for a while__ because the trading costs encourage you to only slowly enter or exit positions. Naturally then, understanding the expected return both now and…into the future is relevant. Empirically, __we find that the optimal machine learning predictor of near-term returns is indeed different from the optimal machine learning predictor of returns far into the future__.”

“While [previous methods] assume that expected [market] price changes are linear functions of a set of signals, we allow expected returns to be a fully general function of the signals, opening the door for flexible non-linear machine learning…__We integrate the estimation process into the optimization process via machine learning.__”

“The utility function depends on risk and expected returns net of trading costs, which gives rise to the implementable efficient frontier as illustrated in [the figures above]…The net Sharpe ratio declines along [the] implementable efficient frontier. This result means that __an investor cannot just maximize her Sharpe ratio net of trading costs and then choose her risk level__ – as she could in the standard mean-variance analysis. Instead, she __must directly maximize the return net of trading costs and risk, thus jointly considering risk, return, and trading costs__. Hence, our framework provides useful tools to evaluate the implementability of trading strategies in general – namely the concepts of the implementable efficient frontier.”
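
To see what jointly considering risk, return, and trading costs means in practice, consider a one-period toy objective. This is a simplified stand-in, not the paper's dynamic, learned objective: the function `net_utility`, the diagonal impact matrix, and every number below are assumptions for illustration.

```python
import numpy as np

def net_utility(weights, prev_weights, exp_returns, cov, kyle_lambda, aum, risk_aversion):
    """Expected return minus trading cost minus risk penalty (one period)."""
    w = np.asarray(weights, dtype=float)
    trade = w - np.asarray(prev_weights, dtype=float)
    expected = w @ exp_returns
    cost = aum * (trade @ kyle_lambda @ trade)       # cost as a fraction of AUM
    risk_penalty = 0.5 * risk_aversion * (w @ cov @ w)
    return expected - cost - risk_penalty

exp_returns = np.array([0.010, 0.008])               # stock A predicts better...
cov = np.diag([0.0025, 0.0025])
kyle_lambda = np.diag(0.1 / np.array([1e8, 1e9]))    # ...but is ten times less liquid
start = [0.0, 0.0]
buy_a, buy_b = [0.1, 0.0], [0.0, 0.1]

args = (exp_returns, cov, kyle_lambda)
small_a = net_utility(buy_a, start, *args, aum=1e3, risk_aversion=10)
small_b = net_utility(buy_b, start, *args, aum=1e3, risk_aversion=10)
large_a = net_utility(buy_a, start, *args, aum=1e8, risk_aversion=10)
large_b = net_utility(buy_b, start, *args, aum=1e8, risk_aversion=10)
print(small_a > small_b, large_a < large_b)
```

In this toy setting the tiny investor prefers the more predictable but illiquid stock, while the large investor prefers the liquid one: the ranking of portfolios changes with size once costs sit inside the objective.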

“Our Portfolio-machine learning method delivers out-of-sample net-of-cost returns that outperform a highly sophisticated alternative by roughly 20% in Sharpe ratio terms and 60% in utility terms. Further, the feature importance across signals changes when we take transaction costs into account. While naive methods are highly influenced by short-term reversal signals, our method seeks to optimally blend return predictability across multiple future horizons, especially for liquid stocks, which leads to value and quality earning the highest feature importance.”