Research Papers

Transaction cost analytics for corporate bonds

Received 18 Aug 2020
Accepted 11 Mar 2022
Published online: 12 Apr 2022

Electronic platforms have become increasingly popular for executing large corporate bond orders by asset managers, who in turn have to assess the quality of their executions via Transaction Cost Analysis (TCA). One of the challenges in TCA is to build a realistic benchmark for the expected transaction cost and to characterize the price impact of each individual trade given bond characteristics and market conditions. Taking the viewpoint of retail investors, this paper presents an analytical methodology for TCA of corporate bond trading. Our analysis is based on the Enhanced TRACE dataset and starts with estimating the initiator of a bond transaction, followed by estimating the bid-ask spread and the mid-price dynamics. With these estimates, the first part of our study identifies key features for corporate bonds and computes the expected average trading cost. This part is on the time scale of weekly transactions and is carried out by applying and comparing several regularized regression models. The second part of our study uses the estimated mid-price dynamics to investigate the amplitude of the price impact and the decay pattern of individual bond transactions. This part is on the time scale of each transaction of liquid corporate bonds and applies a transient impact model to estimate the price impact kernel using a non-parametric method. Our benchmark model allows for identifying abnormal transactions and for enhancing counter-party selection. A key discovery of our study is the price impact asymmetry between customer-buy orders and customer-sell orders.

1. Introduction

Corporate bonds are critical to firm finance and play an important part in asset management (Nagel 2016). Unlike equity shares, which are mostly traded via order books available in multilateral trading facilities, corporate bonds are mainly traded via bilateral mechanisms (Fermanian et al. 2015) due to the limited availability of electronic platforms (Linciano et al. 2014). Even with the introduction of the TRACE reporting system in the US in July 2002 and the establishment of MiFID 2 for electronic bond trading in Europe in January 2018, bond trading remains far less transparent than equity trading (Bessembinder and Maxwell 2008).

After the 2008 financial crisis, macroprudential regulation has required more transparency in corporate bond trading to reduce information asymmetry (Hendershott and Madhavan 2015) between intermediaries and their clients, together with an increase in capital requirements that in turn prevents banks from taking large inventories as before (Wilson et al. 2014). These lower inventories, combined with the requirement of more transparency, push banks and dealers towards flow-driven business via electronification (Harris 2015).

In this new trading environment, asset managers, lacking the pricing tools and private databases enjoyed by maker-dealers, have to assess the quality of corporate bond executions via Transaction Cost Analysis (TCA). Details of TCA are then shared with the portfolio managers of the investment firm to review market liquidity and for allocation and hedging purposes (Albanese and Tompaidis 2008).

TCA is difficult because bond trading lacks a natural benchmark (Collins and Fabozzi 1991), unlike equity trading where the bid-ask spread is an obvious and easy choice. Instead, TCA needs to break down the costs of a particular bond trade according to brokers from all possible execution venues in fragmented markets, including order books, requests-for-quotes, voice, dark pools, and block discovery mechanisms.

Our work. The goal of this paper is to establish a TCA benchmark in bond trading for retail investors. That is, we take the standpoint of an individual investor to evaluate the execution performance of each transaction.

Our TCA is based on the Enhanced TRACE dataset from 2015 to 2016. We perform TCA by estimating the bid-ask spread to measure the cost of illiquidity and by estimating the mid-price move to measure the impact of an individual trade.

Our analysis starts with a preliminary step of estimating the initiator of a bond transaction (section 3.1). The initiator, currently missing from the TRACE database, indicates whether a given transaction is buyer-initiated or seller-initiated. The estimated initiator of each trade enables us to estimate the bid-ask spread and the mid-price dynamics (section 3.2).

With this preliminary step, the first part of our study is to identify the most important features for corporate bonds and to compute the expected average trading cost (section 4). This study is on the time scale of weekly transactions, and is carried out via comparing several regularized regression models including the two-step Lasso, the Ridge regression, and the two-step Elastic Net regression. The response variable in the regression analysis is the estimated bid-ask spread from the preliminary analysis.

Our regression approach manages to select features for corporate bonds that are consistent with existing works, including the volatility of the bond price, the number of years from the issue date, and the activities of the bond characterized by the number of trades and the traded amount (in dollars) per week. In addition, the number of trades and the traded amounts are found to play two opposite roles: the larger the amount traded in dollars, the smaller the bid-ask spread; the more trades (for the same amount in dollars), the larger the bid-ask spread. It is worth mentioning that the $R^2$ value obtained from our regression analysis ranges from 0.50 to 0.60, whereas the $R^2$ in existing works via regressions varies from 0.05 to 0.20 in Hendershott and Madhavan (2015), 0.30 to 0.50 in Edwards et al. (2007), and 0.50 to 0.80 in Dick-Nielsen et al. (2012).

The second part of our TCA uses the estimated mid-price dynamics to investigate the amplitude of the price impact and the price decay pattern of individual bond transactions. This study (detailed in section 5) is on the time scale of each transaction of liquid corporate bonds. It is done by applying a transient impact model (TIM) to estimate the price impact kernel via a non-parametric method. The transient impact functions estimated in our study are found to share several important characteristics with those in the equity market:

  • a price jump when the trade occurs,

  • a price decay after the initial jump,

  • and the stabilization at a ‘permanent level’ higher than the initial price: this permanent impact can be interpreted as the informational content of the trade.

In addition, we discover an asymmetry in the amplitude of the initial price jump: buy-initiated transactions have a larger instantaneous impact than sell-initiated transactions on corporate bonds. Note that such an asymmetry, not present in the equity market, has also been reported in Hendershott and Madhavan (2015) and Ruzza (2016) for corporate bonds.

Existing works on TCA of corporate bonds. Empirical studies on transaction costs of corporate bonds are mostly from the post-TRACE era, as it was difficult for retail investors to obtain data in the pre-TRACE era. The TRACE trade reporting obligation started in the US in July 2002. Because of the exogenous shock from the entry of TRACE, a number of earlier works (Ruzza 2016, Goldstein et al. 2007, Bessembinder and Maxwell 2008) focused on the early years of its introduction in order to identify the cost effect of this transparency. Another family of post-TRACE studies examined the influence of adopting electronic and multilateral trading (Hendershott and Madhavan 2015) and of decreasing borrowing costs from 2004 to 2007 (Asquith et al. 2013).

All these studies reached similar conclusions that the trading costs of corporate bonds decreased on average over the last 20 years. The main proxy for transaction costs adopted in these works was the (expected) bid-ask spread (Glosten and Milgrom 1985, Edwards et al. 2007). Their main statistical approach was ordinary least squares (OLS) regression to account for bond-specific or context-driven variations. The explanatory variables in these studies (Eom et al. 2004, Edwards et al. 2007, Goldstein et al. 2007, Dick-Nielsen et al. 2012) were the coupon, the maturity date, the number of years to maturity, the volatility, the risk-free rate, the expected recovery rate of the company, and the probability of default.¹ See table A1 in appendix 1 for a summary of the datasets and the years of bonds studied in these empirical analyses.

Existing works on asymmetric price impact. The price impact asymmetry between customer-initiated buy orders and customer-initiated sell orders in the corporate bond market has been documented in the literature. For example, this asymmetry is reflected in the regression of table IV of Hendershott and Madhavan (2015), since the coefficients of the buy and sell orders are not of the same amplitude for over-the-counter (OTC) trades (but not for electronic trades). Such an asymmetry was also reported in figure 15 of Mizrach (2015) and table 1 of Ruzza (2016). The former plotted the yearly average price change after five trades from 2003 to 2015, with the impact of buys around 25% larger than the impact of sells. The latter suggested that the average difference between the price of a transaction and the average price of the day is 56bp to 33bp for institutional buyers and −25bp to −21bp for institutional sellers on TRACE data from 2004 to 2012.

Our work differs from existing studies of average transaction costs with OLS, as we apply regularized regression models to select features within a broader class of candidate features. Unlike previous studies on the price impact asymmetry from a static point of view, we investigate the price impact curve of individual trades and analyze its asymmetry in a dynamic setting, characterizing both the amplitude of the price impact and its decay via TIM. Moreover, the analysis and methodology presented in this paper are general and can be applied to conduct TCA on other datasets including Standard TRACE.

2. Data processing and bond selection

Enhanced TRACE. TRACE, an acronym for the Trade Reporting and Compliance Engine, is the FINRA-developed mechanism that facilitates the mandatory reporting of over-the-counter secondary market transactions in eligible fixed income securities. The TRACE database contains some useful (though limited) information and has been used for empirical studies by Dick-Nielsen (2014) and Harris (2015).

The main difficulty working with TRACE is the lack of information on the liquidity offer. For example, there are neither quotes, nor bid prices, nor ask prices. Instead, only final transactions are recorded, together with the type of the transaction: dealer-to-dealer, dealer-to-customer, or customer-to-customer. Therefore, besides TRACE, we also rely on Thomson Reuters to retrieve information on the bonds traded, such as the amount issued, the coupon rate, the sector information, rating information, and Libor and Overnight Indexed Swap rate. In addition, we obtain the outstanding amount through the Mergent Fixed Income Securities Database (FISD).

There are two types of TRACE datasets, Standard TRACE dataset and Enhanced TRACE dataset. Both TRACE datasets contain corporate bond transactions. The difference is that transactions are available on Standard TRACE with a delay of two weeks, with the volume of the transaction capped at 1MM for high yield bonds and 5MM for investment grade bonds. The Enhanced TRACE dataset has uncapped volumes, with transactions available with a delay of six months. There is a separate dataset provided by FINRA for monthly price, return, coupon, and yield information for all corporate bonds traded since July 2002.

Our study is primarily based on the Enhanced TRACE dataset. We use the non-truncated transaction volumes on Enhanced TRACE along with other information from FISD and Thomson Reuters to construct the estimation of the bid-ask spread (section 3). It is worth noting that the Enhanced TRACE dataset and the Standard TRACE dataset yield insignificant differences in terms of the estimation of the expected bid-ask spread, as shown in section A.6.

Data processing. The data used in our study is from January 1, 2015 to December 31, 2016, obtained from Wharton WRDS. During this period, there are 34 809 405 original trade reports, 390 193 reports of trade cancellations (approximately 1.1% of all original trade reports), 497 249 corrected trade reports (about 1.4%), and 28 005 reports of trade reversals. Trade reversals are transactions that have been changed more than 20 days after they were initially recorded. Occasionally there are multiple correction records for the same original trade, and cancel records that cancel previously corrected trades. There are 54 885 CUSIP²-days spread over 656 calendar days, many of which are weekends and holidays. The CUSIP-days are computed by counting all the trade days over all the CUSIP bonds.

In particular, for each transaction of a bond, one can recover from Enhanced TRACE the following information:

  • $t_k^b$: the timestamp for the $k$th transaction of bond $b$;

  • $P_k^b$: the price of the $k$th transaction of bond $b$;

  • $V_k^b$: the volume of the $k$th transaction of bond $b$;

  • the side of the dealer-to-customer transaction: customer buy order or customer sell order.

The data cleaning procedure combines the approaches of Dick-Nielsen (2014) and Harris (2015), as detailed in appendix A.2. In total, about 17.50% of the reports are filtered out from the original Enhanced TRACE dataset. Among the remaining 28 719 813 records, 14 071 375 (49%) are dealer-to-customer trades and the other 14 648 438 (51%) are trades between dealers. These statistics are summarized in table A4.

After the data cleaning, appropriate bond selection is necessary to facilitate the analysis of transaction costs, for both the regression and price impact analysis.

Bond selection for regression analysis. There are two types of bonds for regression analysis, investment grade bonds and high yield bonds, which are picked from the standard universe of U.S. corporate bonds. The investment grade bonds are selected from the components of iShares iBoxx Investment Grade Corporate Bond ETF, and the high yield bonds from the components of iShares iBoxx High Yield Corporate Bond ETF. There are 1033 current holdings of the former, among which 538 bonds have more than one transaction recorded in Enhanced TRACE during the period from Jan 1, 2015 to Dec 31, 2016. There are 1575 current holdings of the latter, 1485 of which have transaction records during the same period. Moreover, there are 30 bonds that belong to both iShares iBoxx High Yield Corporate Bond ETF and iShares iBoxx Investment Grade Corporate Bond ETF. The rating levels of all these 30 bonds have been adjusted since issuance. Hence, there are a total of 538 + 1485 − 30 = 1993 bonds for the regression analysis. These selected bonds account for 31.05% of the total 14 071 375 customer-to-dealer reports from all bonds. Table A4 reports this selection as ‘Selection LR’ and table 1 reports the statistics of these selected bonds.

Table 1. Description of selected 1993 bonds for regression (dealer-customer trades).

Note that our analysis throughout the paper focuses on trades between customers and dealers, which are statistically different from trades between dealers. The study for the latter requires the initiator analysis for dealer-to-dealer trades, which is infeasible to estimate given the current information from the database.

Bond selection for price impact analysis. Given that the calculation of price impact curves requires a higher trading frequency, out of all the 1993 bonds for regression analysis, the top-200 traded bonds (in terms of number of transactions) are selected to compute price impact curves. See the ‘Selection for PI’ step in table A4 and the statistical summary of these 200 bonds in table 2. Tables 1 and 2 show that these top-200 bonds account for 32% of the total number of transactions among the 1993 bonds. Note that among the 30 bonds with a rating level adjustment, 13 belong to the top-200 traded bonds.

Table 2. Description of the selected 200 bonds for price impact analysis (dealer-customer trades).

3. Preliminary analysis

Two key components for TCA are the bid-ask spread and the mid-price dynamics, for which it is necessary to identify the riskless principal trades (RPTs) and the initiator of a transaction.

3.1. RPT and initiator

In the TRACE database, the information on the initiator, i.e. whether a given trade is buyer-initiated or seller-initiated, is missing. In fact, there is a substantial fraction of transactions between dealers and customers (Harris 2015) where the dealer has found two clients and placed herself between them. These transactions are called riskless principal trades (RPTs) since the dealer does not take any inventory risk by matching two clients. Consequently, it is not possible to recover the initiator of an RPT because there is no information on which of the two clients has initiated the trade.

Our first step is therefore to identify these RPTs. See tables A2(a,b) for the statistics of RPTs and non-RPTs. Tables A3(a,b) report the potential RPTs and non-RPT dealer-customer trades of the top-200 traded bonds. See also appendix A.1 for a detailed literature review on RPT.³

After identifying and removing all the potential RPTs, we consider the transaction initiated by the client. We define the sign of the transaction $\epsilon_k^b$ as $+1$ (i.e. ‘buy’) if a client buys from a dealer and as $-1$ (i.e. ‘sell’) if a client sells to a dealer. When it is not possible to determine the sign of a trade, as in the above RPT case, we assign $\epsilon_k^b$ to be zero.⁴

Figure 1 reports the auto-correlation of the order signs up to lag 20. It suggests that, with a high confidence level, there is a persistent positive auto-correlation among order signs which decays very slowly. In comparison, Bouchaud et al. (2009) showed that the sign of market orders on equity markets is strongly correlated in time.

Figure 1. Auto-correlation of the signs.
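As an illustration of how such a curve can be computed, the following Python sketch evaluates the sample auto-correlation of a sign series for a range of lags. The function name `sign_autocorrelation` and the toy series are assumptions made for this example, and the estimator (biased sample auto-covariance divided by the sample variance) is a standard choice, not necessarily the one behind figure 1. Trades with sign zero (potential RPTs) are assumed to have been removed beforehand.

```python
def sign_autocorrelation(signs, max_lag=20):
    """Sample auto-correlation of a trade-sign series for lags 1..max_lag."""
    n = len(signs)
    mean = sum(signs) / n
    var = sum((s - mean) ** 2 for s in signs) / n
    if var == 0:
        raise ValueError("sign series is constant")
    acf = []
    for lag in range(1, max_lag + 1):
        # Biased auto-covariance estimate: divide by n, not by n - lag
        cov = sum((signs[i] - mean) * (signs[i + lag] - mean)
                  for i in range(n - lag)) / n
        acf.append(cov / var)
    return acf


# Hypothetical toy series: runs of four customer buys, then four customer sells
toy_signs = [1, 1, 1, 1, -1, -1, -1, -1] * 25
acf = sign_autocorrelation(toy_signs, max_lag=3)
```

On this toy series the lag-1 value is clearly positive because buys tend to follow buys, which is the qualitative pattern described above.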

3.2. Bid-ask spread and mid-price estimation

After identifying the initiator of each trade, we now analyze the two essential building blocks for TCA: the bid-ask spread and the mid-price dynamics.

To start, let us find two consecutive trades that have opposite signs, $\epsilon_{k+1}^b = -\epsilon_k^b$ with $\epsilon_k^b \neq 0$, and are sufficiently close in time (i.e. $|t_{k+1}^b - t_k^b| < \Delta t$). Let us then define the estimate of the bid-ask spread in absolute value as:
$$\psi_{k+1}^b := (P_{k+1}^b - P_k^b)\,\epsilon_{k+1}^b, \tag{1}$$
with $P_k^b$ the price of the $k$th transaction of bond $b$. We next estimate the mid-price at $t_k$ as:
$$M_k^b := P_k^b - \epsilon_k^b\,\frac{\psi_{k+1}^b}{2}, \tag{2}$$
which, since $\epsilon_k^b\,\epsilon_{k+1}^b = -1$, is the average $(P_k^b + P_{k+1}^b)/2$ of the two trade prices, and define the bid-ask spread in basis points (relative value) by
$$s_{k+1}^b := \frac{\psi_{k+1}^b}{M_k^b} \times 10000. \tag{3}$$
Note that $\Delta t$ is chosen to be five minutes, largely because of the low trading frequency of the corporate bond market.⁵ Consequently, only 15.6% of the transactions are used to calculate the bid-ask spread among the bonds selected in section 2.
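The estimation step above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the `Trade` container, the function name `spread_estimates`, the use of seconds for timestamps, and the sample prices are all hypothetical. The mid-price is taken as the midpoint of the two trade prices, that is, the earlier trade price moved by half the spread against the sign of that trade.

```python
from dataclasses import dataclass


@dataclass
class Trade:
    time: float   # seconds since an arbitrary reference (assumption)
    price: float
    sign: int     # +1 customer buy, -1 customer sell, 0 unknown (e.g. RPT)


def spread_estimates(trades, max_gap=300.0):
    """Return (index of later trade, psi, mid, spread in bp) for each usable pair."""
    out = []
    for k in range(len(trades) - 1):
        first, second = trades[k], trades[k + 1]
        # Keep only consecutive trades with opposite, non-zero signs
        if first.sign == 0 or second.sign != -first.sign:
            continue
        # Keep only pairs closer in time than max_gap (five minutes by default)
        if abs(second.time - first.time) >= max_gap:
            continue
        psi = (second.price - first.price) * second.sign   # absolute spread
        mid = first.price - first.sign * psi / 2.0         # midpoint of the two prices
        spread_bp = psi / mid * 10000.0                    # relative spread in bp
        out.append((k + 1, psi, mid, spread_bp))
    return out


trades = [
    Trade(0.0, 99.90, -1),     # customer sells at the bid
    Trade(120.0, 100.10, +1),  # customer buys at the ask two minutes later
    Trade(200.0, 100.12, +1),  # same sign as previous trade: pair skipped
    Trade(1000.0, 99.88, -1),  # opposite sign but too far apart: pair skipped
]
estimates = spread_estimates(trades)
```

Only the first pair qualifies here, giving a spread of about 0.20 in price terms around a mid-price of about 100, i.e. roughly 20 basis points.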

We next check the reliability and stationarity of the estimated bid-ask spread.

Reliability of the estimated spread. We compare the estimated bid-ask spread with the one computed using bid and ask quotes provided by Composite Bloomberg Bond Trader (CBBT) for those bonds that are available in both the CBBT and Enhanced TRACE data sets. CBBT is a composite price based on the most relevant executable quotations on FIT, Bloomberg's Fixed Income Trading platform. The CBBT pricing source provides average bid-ask prices based on executable quotes listed on Bloomberg's trading platform. Fermanian et al. (2016) used the CBBT data as a measure of bond liquidity. We only have access to quote price data from Bloomberg CBBT from June 1, 2015 to May 31, 2016 (12 months) for 2361 investment grade bonds that belong to the iboxxIG universe, among which we have identified 1401 bonds with records in both the Bloomberg CBBT database and the Enhanced TRACE subset.

Figure 2 plots the empirical distribution of the spread from CBBT and of the estimated spread from Enhanced TRACE for two arbitrarily chosen bonds, whose statistics are reported in table 3. It is noticeable (and expected) that the CBBT spreads are larger than those estimated from real trades available in Enhanced TRACE. As Fermanian et al. (2016) pointed out, CBBT bid-ask spread estimates are based on quotes, and not on real transactions. As a consequence, they include quotes that are not attractive enough (i.e. not small enough) to trigger a transaction. Since the bid-ask spread is the first component of implicit transaction costs, trades tend to occur when the quoted spread is smaller than its average.

Figure 2. Empirical distributions of the spread (equation (3)). (a) US375558BG78 and (b) US126650CJ78.

Table 3. Spread comparison.

Stationarity of the bid-ask spread. We next check the consistency of the two approaches via a stationarity test on the ratio of the two estimates: the CBBT bid-ask spread and our trades-based estimates. Denote by $s_{b,w}^{\mathrm{CBBT}}$ the average spread for bond $b$ over the period $w$ taken from Bloomberg CBBT and by $s_{b,w}^{\mathrm{TRACE}}$ the average of the estimated bid-ask spread for bond $b$ in period $w$ from Enhanced TRACE (see equation (3)). Similarly, define $R_{b,w} = s_{b,w}^{\mathrm{CBBT}} / s_{b,w}^{\mathrm{TRACE}}$ as the ratio between these two spreads for bond $b$ over the period $w$.

First of all, note that the empirical estimates of the bid-ask spread using Enhanced TRACE transactions are smaller than the CBBT ones: expressed as TRACE over CBBT, the average ratio is between 0.9 and 1 and its median is between 0.7 and 1, as summarized in table 4.

Table 4. Statistics of the ratios.

We also split the sample period into 11 groups of two consecutive months and check whether this ratio is stationary from one month to the next within each group. We use two tests for the stationarity of $R_{b,w}$. The first is the one-way ANOVA test and the second is the Kruskal–Wallis H-test. The former tests the stationarity of the mean and the latter tests the stationarity of the median. The mathematical formulations and definitions of the ANOVA test and the Kruskal–Wallis H-test are provided in appendix A.3.
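For readers who want to reproduce this kind of check, here is a hedged, standard-library-only Python sketch for the two-sample case. It assumes that each comparison contrasts the cross-section of ratios in one month with that of the following month; the function names and the sample ratios are hypothetical, and the appendix formulation may differ (for instance through a tie correction, which is omitted here). With two groups, the Kruskal–Wallis statistic is referred to a chi-square distribution with one degree of freedom, whose survival function is `erfc(sqrt(h / 2))`. Only the ANOVA F statistic is computed, since its p-value needs an F distribution that the standard library does not provide.

```python
import math


def average_ranks(values):
    """Ranks 1..N, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def kruskal_wallis_two_groups(x, y):
    """H statistic (no tie correction) and chi-square(1) p-value."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    ranks = average_ranks(list(x) + list(y))
    r1, r2 = sum(ranks[:n1]), sum(ranks[n1:])
    h = 12.0 / (n * (n + 1)) * (r1 ** 2 / n1 + r2 ** 2 / n2) - 3.0 * (n + 1)
    p_value = math.erfc(math.sqrt(max(h, 0.0) / 2.0))
    return h, p_value


def anova_f_two_groups(x, y):
    """One-way ANOVA F statistic with 1 and n1 + n2 - 2 degrees of freedom."""
    n1, n2 = len(x), len(y)
    m1, m2 = sum(x) / n1, sum(y) / n2
    m = (sum(x) + sum(y)) / (n1 + n2)
    between = n1 * (m1 - m) ** 2 + n2 * (m2 - m) ** 2
    within = sum((v - m1) ** 2 for v in x) + sum((v - m2) ** 2 for v in y)
    return between / (within / (n1 + n2 - 2))


# Hypothetical ratios R_{b,w} for four bonds in two consecutive months
ratios_month_a = [0.90, 1.00, 1.10, 0.95]
ratios_month_b = [0.92, 1.05, 0.98, 1.02]
h_stat, p_value = kruskal_wallis_two_groups(ratios_month_a, ratios_month_b)
f_stat = anova_f_two_groups(ratios_month_a, ratios_month_b)
stationary_at_99 = p_value > 0.01   # cannot reject the null at the 99% level
```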

Table 5 summarizes the results of both the ANOVA test and the Kruskal–Wallis H-test. With a 99% confidence level, we accept (cannot reject) the null hypothesis (i.e. the ratios are stationary over time) in both the ANOVA test and the Kruskal–Wallis test for 7 of the total 11 comparisons. We will thus use this estimated bid-ask spread in all subsequent analyses because it can be computed over years of data using Enhanced TRACE, whereas CBBT is costly to obtain and linked to a private procedure owned by Bloomberg. Nevertheless, these stationarity tests imply that a large investor using CBBT estimates could rely on the methodology presented hereafter and apply a ratio to interpret our results in terms of ‘units’ of CBBT.

Table 5. Results of ANOVA and Kruskal–Wallis H-tests.

We have found similar results in terms of reliability and stationarity for the estimated mid-price dynamics, with details skipped here to avoid repetition.

4. Regularized regression analysis for bid-ask spread

In this section, we use regularized regressions to identify the key features that drive the bid-ask spread, which provides the estimated cost for investors who need to move from one side (e.g. the buy side) to the other side (e.g. the sell side).

We compare the OLS benchmark with several regularized regression models, namely the two-step Lasso, Ridge, and two-step Elastic Net regressions, using a K-fold cross-validation method to identify the most significant features and the associated parameters for these models.

As described in section 2, a total of 1993 bonds are selected for this regression analysis, yielding a total of 152 408 (weekly) samples processed from the Enhanced TRACE dataset between January 2015 and December 2016. Our regression analysis is performed on a weekly basis, with the weekly average bid-ask spread computed according to equation (3) serving as the response variable.

4.1. Review of methodologies

We start by reviewing the necessary notations and steps for the regression analysis that will be used throughout the paper.

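OLS. OLS assumes that the regression function is linear. That is, given $Y := (y_1, y_2, \ldots, y_n) \in \mathbb{R}^n$ the vector of $n$ observations of the response variable, and $X := (\mathbf{1}, x_1, \ldots, x_{w-1})$ with the all-ones vector $\mathbf{1} \in \mathbb{R}^n$ and covariates $x_i \in \mathbb{R}^n$ ($i = 1, 2, \ldots, w-1$), OLS is to find:
$$\hat{\theta} := \arg\min_{\theta \in \mathbb{R}^w} \left\{ \|Y - X\theta\|_2^2 \right\}. \tag{4}$$
In an OLS, $R^2$ is used to measure the goodness of fit of the model, and the p-value associated with each coefficient indicates the statistical significance of the corresponding feature.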

Two-step Lasso. The first step is to use Lasso regression to select the covariates by solving the following optimization problem:
$$\min_{\theta \in \mathbb{R}^w} \left\{ \frac{1}{N}\|Y - X\theta\|_2^2 + \lambda \sum_{j=1}^{w-1} |\theta_j| \right\}. \tag{5}$$
Here a fixed constant $\lambda$, called the tunable hyperparameter, controls both the size and the number of coefficients: a higher value of $\lambda$ leads to a smaller number of covariates in the linear model. In the second step, an OLS with only the selected covariates is applied (Belloni and Chernozhukov 2013). That is, given the Lasso estimator $\hat{\theta}_l^{\lambda}$ in (5), the subsequent OLS refitting is to find $\bar{\theta}_l^{\lambda}$ such that:
$$\bar{\theta}_l^{\lambda} \in \arg\min_{\mathrm{supp}[\theta] = \mathrm{supp}[\hat{\theta}_l^{\lambda}]} \left\{ \|Y - X\theta\|_2 \right\}. \tag{6}$$
$\bar{\theta}_l^{\lambda}$ is thus called the estimator for the LSLasso (least-squares Lasso), also known as post-Lasso. This two-step Lasso estimation procedure has been shown to produce a smaller bias than Lasso for a range of models (Belloni and Chernozhukov 2013, Lederer 2013, Chételat et al. 2017).
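To make the two steps concrete, the sketch below solves the Lasso problem by cyclic coordinate descent in plain Python and then refits least squares on the selected support. It is an illustration under stated assumptions, not the implementation used in the paper: the solver, the function names, the iteration count, and the toy design matrix are all hypothetical, column 0 of `X` is taken to be the unpenalized intercept, and the intercept is always kept in the refit. The refit reuses the same routine with a zero penalty, which converges to the least-squares solution on the selected columns when they are not collinear.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator: sign(z) * max(|z| - t, 0)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def coordinate_descent(X, y, lam, active=None, n_iter=200):
    """Minimise (1/N)||y - X theta||^2 + lam * sum_{j>=1} |theta_j| over active columns."""
    n, w = len(X), len(X[0])
    cols = range(w) if active is None else sorted(active)
    theta = [0.0] * w
    resid = list(y)  # residual y - X theta for the starting point theta = 0
    for _ in range(n_iter):
        for j in cols:
            xj = [row[j] for row in X]
            norm = sum(v * v for v in xj)
            if norm == 0.0:
                continue
            # Inner product of column j with the residual that excludes column j
            rho = sum(a * r for a, r in zip(xj, resid)) + norm * theta[j]
            thresh = 0.0 if j == 0 else n * lam / 2.0  # intercept is not penalised
            new = soft_threshold(rho, thresh) / norm
            delta = new - theta[j]
            for i in range(n):
                resid[i] -= xj[i] * delta
            theta[j] = new
    return theta


def two_step_lasso(X, y, lam):
    """Step 1: Lasso selection. Step 2: least-squares refit on the selected support."""
    theta_lasso = coordinate_descent(X, y, lam)
    support = {0} | {j for j, t in enumerate(theta_lasso) if t != 0.0}
    theta_refit = coordinate_descent(X, y, 0.0, active=support)
    return theta_lasso, theta_refit


# Toy design: intercept, one relevant covariate, one irrelevant covariate
X = [[1.0, (-1.0) ** i, 1.0 if i % 4 >= 2 else -1.0] for i in range(40)]
y = [1.0 + 2.0 * row[1] for row in X]
theta_lasso, theta_refit = two_step_lasso(X, y, lam=0.5)
```

In this toy example the Lasso step drops the irrelevant covariate and shrinks the relevant coefficient from 2 to 1.75, while the refit restores it to 2, which illustrates the bias reduction mentioned above.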

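Ridge regression. The penalty term in the Ridge regression is of the $L_2$ norm. That is, for a fixed hyperparameter $\lambda$, Ridge regression is to solve for:
$$\hat{\theta}_r^{\lambda} \in \arg\min_{\theta \in \mathbb{R}^w} \left\{ \frac{1}{N}\|Y - X\theta\|_2^2 + \lambda \sum_{j=1}^{w-1} \theta_j^2 \right\}. \tag{7}$$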
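For reference, the Ridge problem (7) admits a closed-form solution. The short derivation below is a sketch under two assumptions that are not stated explicitly in the text: the first component of $\theta$ is the unpenalized intercept (as the sum starting at $j=1$ suggests), and $N$ is the number of observations. The matrix $D$ is notation introduced here for convenience.

```latex
\begin{aligned}
J(\theta) &= \tfrac{1}{N}\,\|Y - X\theta\|_2^2 + \lambda\,\theta^\top D\,\theta,
  \qquad D = \mathrm{diag}(0, 1, \ldots, 1) \in \mathbb{R}^{w \times w},\\
\nabla J(\theta) &= -\tfrac{2}{N}\,X^\top (Y - X\theta) + 2\lambda D\theta = 0
  \;\Longrightarrow\; \left(X^\top X + N\lambda D\right)\theta = X^\top Y,\\
\hat{\theta}_r^{\lambda} &= \left(X^\top X + N\lambda D\right)^{-1} X^\top Y .
\end{aligned}
```

For $\lambda > 0$ the matrix $X^\top X + N\lambda D$ is invertible because $X$ contains the all-ones column, so the solution is unique even when the covariates are collinear.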

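Two-step Elastic Net regression. Elastic Net (EN) regression, introduced in Zou and Hastie (2005), is a hybrid of Lasso and Ridge. That is, for a fixed hyperparameter $(\lambda, \alpha)$ with $\alpha \in [0, 1]$, EN is to solve for:
$$\hat{\theta}_e^{\lambda} \in \arg\min_{\theta \in \mathbb{R}^w} \left\{ \frac{1}{N}\|Y - X\theta\|_2^2 + \alpha\lambda \sum_{j=1}^{w-1} |\theta_j| + (1-\alpha)\lambda \sum_{j=1}^{w-1} \theta_j^2 \right\}. \tag{8}$$
Note that the Lasso regression is recovered from equation (8) by taking $\alpha = 1$ and the Ridge regression is recovered by taking $\alpha = 0$. Similar to the two-step Lasso, the second step of the two-step Elastic Net is to fit an OLS with only the selected covariates. That is, given the EN estimator $\hat{\theta}_e^{\lambda}$ in (8), the subsequent OLS refitting is to find $\bar{\theta}_e^{\lambda}$ such that:
$$\bar{\theta}_e^{\lambda} \in \arg\min_{\mathrm{supp}[\theta] = \mathrm{supp}[\hat{\theta}_e^{\lambda}]} \left\{ \|Y - X\theta\|_2 \right\}. \tag{9}$$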

Cross-validation. In all three regularized regression models, the selection of hyperparameters is by the standard K-fold cross-validation approach to improve the predictive power of the model. That is, the dataset is randomly divided into K subsets. Each time, one of the K subsets is used as the validation set and the remaining K−1 subsets form a training set. In this approach, every data point is in a validation set exactly once and in a training set K−1 times. The variance of the resulting estimate is reduced as K increases.
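The K-fold partition described above can be sketched with the Python standard library alone. The function name `k_fold_indices`, the fixed seed, and the interleaved fold assignment are choices made for this example.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Return K (train, validation) index lists from a random partition."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into K folds of (nearly) equal size
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        validation = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        splits.append((train, validation))
    return splits


splits = k_fold_indices(n_samples=10, k=5)
```

Each index ends up in exactly one validation set and in K−1 training sets, matching the description above.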

Out-of-sample test. With the hyperparameters selected from the cross-validation step, the coefficients for a regression model are estimated using the training and validation datasets. The performance of the regression model is then evaluated with the test dataset. For financial applications, the time period of the test dataset needs to be after those of the training and validation datasets to ensure the information adaptiveness.

Given a test dataset $(\tilde{X}, \tilde{Y})$ of size $m$ with $\tilde{Y} = (\tilde{y}_1, \ldots, \tilde{y}_m)$, the following relative error function is used as the criterion to measure the performance:
$$\text{Relative err} = \frac{1}{m} \sum_{j=1}^{m} \frac{|\tilde{y}_j - \hat{y}_j|}{|\tilde{y}_j|}, \tag{10}$$
where $\tilde{y}_j$ is the true label and $\hat{y}_j$ is the label predicted from the regression model for test sample $j$ ($j = 1, 2, \ldots, m$).
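The relative error criterion is straightforward to compute. The following Python sketch uses a hypothetical function name and made-up spread values in basis points; it assumes that no true label is zero, which holds when the response is a strictly positive spread.

```python
def relative_error(y_true, y_pred):
    """Mean of |y_true - y_pred| / |y_true| over the test samples."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical weekly spreads (bp): true values and model predictions
err = relative_error([50.0, 100.0, 200.0], [55.0, 90.0, 200.0])
```

Here the three per-sample errors are 10%, 10%, and 0%, so the criterion is about 6.7%.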

4.2. Features for regression analysis

The features in the regression consist of two categories. One category concerns bond information, including time to maturity date, time since issued date, coupon rate, amount outstanding, and duration. The other category focuses on trade information, including average transaction price, volatility, proportion of customer-buys (sells), LIBOR-OIS rate, and the 5-year treasury rate during the given week. More specifically, we consider:

  • Volatility: calculated from the trade price. For bond $b$, assume there are $n$ trades in week $w$. Recall $P_j^b$ as the trade price of the $j$th transaction ($j = 0, 1, 2, \ldots, n$) of bond $b$. Denote the log return $r_i^b = \log(P_i^b / P_{i-1}^b)$ ($i = 1, 2, \ldots, n$) and the average return $\bar{r}^b = \sum_{i=1}^{n} r_i^b / n$. Then the volatility in week $w$ is: $\sigma^b = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (r_i^b - \bar{r}^b)^2} \cdot 100$. Notice that $n$ may vary from bond to bond and from week to week.

  • Number of trading days: the number of days that bond b is traded during the week.

  • Log(zero trade days): the log of the number of days that bond b is not traded during the week.

  • Proportion of buy/sell number: estimated by counting the number of customer-buy orders and the number of customer-sell orders in week w and calculating the proportion of buys and sells for each bond b.

  • Proportion of buy/sell volume: estimated by taking the total volume (in dollars) for customer-buy orders and customer-sell orders in week w and calculating the proportion of buys and sells for each bond b.

  • Trading activity: the log of the number of trades in the week.

  • Total volume: the weekly total trading volume in dollars of both customer-dealer trades and dealer-dealer trades.

  • Average price: the weekly average trade price in dollars.

  • Coupon: annual coupon payments paid by the issuer relative to the bond's face or par value. The coupon rate is the yield the bond paid on its issue date. This yield changes as the value of the bond changes, thus giving the bond's yield to maturity.

  • Duration: an approximation of a bond's price sensitivity to changes in interest rates, which is defined as: $D^b = \sum_t \frac{PV(C_t^b)}{\sum_{t'} PV(C_{t'}^b)} \times t$ for bond $b$, where $C_t^b$ is the cash flow on date $t$, $PV(C_t^b)$ is its present value (evaluated at the bond's yield), and $\sum_t PV(C_t^b)$ is the total present value of the cash flows, which is equal to the bond's current price.

  • Years to maturity: the time to maturity date calculated in years.

  • Years since issuance: the time since issued date counted in years.

  • Amount outstanding: the principal amount outstanding of a bond; sometimes referred to as the notional amount.

  • Turnover: the volume of bonds traded relative to the total volume of outstanding bonds. The inverse of the turnover can be interpreted as the average holding time of the bond. For instance, a turnover of one implies an average holding time of about two weeks.

  • LIBOR-OIS rate: the London inter-bank offered rate (LIBOR) is the rate at which banks indicate they are willing to lend to other banks for a specified term of the loan. The overnight indexed swap (OIS) rate is the rate on a derivative contract on the overnight rate. The LIBOR-OIS spread is assumed to be a measure of the health of banks because it reflects the default risk associated with lending to other banks. In this analysis, the 1-month LIBOR-OIS rate is used to indicate the bank health condition over time.

  • Indicator of high yield (HY) or investment grade (IG) bond: indicator of the bond.

  • Indicator of different sectors: including nine different sectors such as basic materials sector (S1), communications sector (S2), consumer & cyclical sector (S3), consumer & non-cyclical sector (S4), energy sector (S5), financial sector (S6), industrial sector (S7), technology sector (S8), and utilities sector (S9).
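Two of the features listed above, volatility and duration, involve a small computation that can be sketched as follows. The function names, the sample inputs, and the annual compounding convention used for discounting are assumptions made for this illustration; the volatility is taken to be the sample standard deviation of log returns scaled by 100.

```python
import math


def weekly_volatility(prices):
    """Sample standard deviation of log returns, times 100, from prices P_0..P_n."""
    returns = [math.log(prices[i] / prices[i - 1]) for i in range(1, len(prices))]
    n = len(returns)
    if n < 2:
        raise ValueError("at least three prices are needed")
    mean = sum(returns) / n
    return math.sqrt(sum((r - mean) ** 2 for r in returns) / (n - 1)) * 100.0


def duration(cash_flows, yield_rate):
    """Present-value-weighted average payment time.

    cash_flows is a list of (time in years, amount); discounting uses annual
    compounding at yield_rate, which is an assumption of this sketch.
    """
    present_values = [(t, c / (1.0 + yield_rate) ** t) for t, c in cash_flows]
    price = sum(pv for _, pv in present_values)
    return sum(t * pv for t, pv in present_values) / price


vol = weekly_volatility([100.0, 100.5, 99.8, 100.2])
# Hypothetical 3-year bond, 5% annual coupon, face value 100, yield 5%
dur = duration([(1.0, 5.0), (2.0, 5.0), (3.0, 105.0)], yield_rate=0.05)
```

For the hypothetical par bond above the duration is a little under its three-year maturity, and a zero-coupon bond would have a duration equal to its maturity.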

Table 6 provides descriptive statistics of the response variable and these features.

Table 6. Statistics of the response variable and the features.

Hyperparameter selection. For any of the regression models above, denote by $\mu$ its hyperparameter (for example, $\mu = (\lambda_e, \alpha)$ for EN). A log-scale grid of $m$ candidate values is used for $\mu$, and the training dataset is divided into $K$ folds for cross-validation. For each left-out fold $i$, $R_i^2(\mu)$ is computed with regression coefficients calculated using the other $K-1$ folds. Hence for each $\mu$, there is an empirical distribution $\widetilde{R^2}(\mu) = \{R_i^2(\mu),\ i = 1, 2, \ldots, K\}$. Denote by $\widehat{R^2}(\mu)$ and $\sigma_{R^2}(\mu)$ the mean and standard deviation of this empirical distribution, and define the confidence interval by:
$$I_1(\mu) = \left[ \widehat{R^2}(\mu) - \frac{\sigma_{R^2}(\mu)}{\sqrt{K}},\ \widehat{R^2}(\mu) + \frac{\sigma_{R^2}(\mu)}{\sqrt{K}} \right]. \tag{11}$$
Then $\mu$ is picked such that the number of elements of $\widetilde{R^2}(\mu)$ lying in $I_1(\mu)$ is maximized. Moreover, define:
$$I_2(\mu) = \left[ \widehat{R^2}(\mu) - \sigma_{R^2}(\mu),\ \widehat{R^2}(\mu) + \sigma_{R^2}(\mu) \right]. \tag{12}$$
Note that $I_2(\mu)$ in (12) is a relaxation of $I_1(\mu)$ in (11). When the number of $\{R_i^2(\mu)\}$ falling in $I_1(\mu)$ is not sensitive to $\mu$, one can compare the counts in $I_2(\mu)$ instead.
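A possible reading of this selection rule is sketched below in Python. Several details are assumptions of the sketch and are not confirmed by the text: the narrow interval uses the standard error (standard deviation divided by the square root of K), the standard deviation is the sample one, ties are broken in favour of the first candidate, and the function names and fold-wise scores are hypothetical.

```python
import math
import statistics


def count_in_interval(r2_scores, relaxed=False):
    """Number of fold-wise R^2 values inside the interval around their mean."""
    k = len(r2_scores)
    mean = statistics.mean(r2_scores)
    sd = statistics.stdev(r2_scores)            # sample standard deviation
    half_width = sd if relaxed else sd / math.sqrt(k)
    return sum(1 for r in r2_scores if mean - half_width <= r <= mean + half_width)


def select_hyperparameter(cv_scores, relaxed=False):
    """cv_scores maps each candidate hyperparameter to its K fold-wise R^2 values."""
    return max(cv_scores, key=lambda mu: count_in_interval(cv_scores[mu], relaxed))


# Hypothetical 5-fold R^2 values for two candidate penalties
cv_scores = {
    0.01: [0.50, 0.60, 0.40, 0.55, 0.45],
    0.10: [0.52, 0.53, 0.53, 0.52, 0.60],
}
best = select_hyperparameter(cv_scores)
```

With these made-up scores, the second candidate has two fold-wise values inside the narrow interval against one for the first candidate, so it is selected. Passing `relaxed=True` applies the wider interval instead.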

The training dataset consists of data from January 1, 2015 to December 31, 2016, and the test dataset (for out-of-sample performance) consists of data from January 1, 2017 to March 31, 2017.

4.3. Features identified by regression analysis

In this section, we present the results from the two-step Lasso, the Ridge, and the EN regressions, including the most significant features and associated parameters identified by these models.

4.3.1. Benchmark model: OLS

We first summarize the result from the benchmark method OLS. As seen in table 8, all but two of the estimated coefficients are statistically significant at any reasonable level of significance. The two exceptions are year to maturity and turnover. Moreover,

  • The coefficients of Prop number of buys and Prop number of sells have the same sign but different values. The coefficient of Prop number of buys is roughly one third of the coefficient of Prop number of sells. Similarly, the coefficients of Prop buy volume and Prop sell volume are both positive. The coefficient of Prop buy volume is roughly half of the coefficient of Prop sell volume. This shows the asymmetry between customer buy orders and customer sell orders. This is consistent with numerous studies, e.g. Fermanian et al. (2016), suggesting that dealers offer tighter quotes for larger trades than for smaller ones.

  • Avg price has a small effect on the bid-ask spread.

  • The indicators of different sectors have different coefficients, but the overall values are small.

  • The Log(total volume) coefficient is negative as expected. With a value of −21.4028, the estimated coefficient implies that an increase of 10 000 in trade size would turn a retail-size trade into a large institutional-size trade and would reduce the bid-ask spread by 100 basis points.

  • The Indicator of investment grade bonds coefficient is negative and the Indicator of high yield bonds coefficient is positive. This is consistent with the well-documented empirical findings: larger spreads for high yield bonds and smaller spreads for investment grade bonds.

Table 7. Mean relative error of out-of-sample test on data during January–March 2017.

Table 8. OLS regression: the impact on bid-ask spread.

4.3.2. Features identified from two-step Lasso

We then present the results from a class of two-step Lasso models parameterized by different $\mu=\lambda_l$ and discuss how to select the best $\lambda_l$ with cross-validation.

In this analysis, 20 different values of $\mu=\lambda_l$ are picked with a uniform partition of the range $[10^{-1},10^{3}]$ in the log scale. Note that the ranges of hyperparameters are different for two-step Lasso, Ridge, and two-step EN, as shown in the figures of cross-validation scores (figures A1, A2, and A3). Each range is selected according to the sensitivity of the model on a larger preliminary partition grid.
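As a quick check on the grid, the following sketch builds a log-uniform partition, assuming the Lasso range is $[10^{-1},10^{3}]$ and the Ridge range is $[10^{2},10^{8}]$ with 20 points each. The helper name `log_grid` is hypothetical. Under these assumptions the grids contain the penalty levels quoted in this section (1.13, 2.98 and 7.85 for Lasso; about $1.62\times10^{4}$, $6.95\times10^{4}$ and $1.27\times10^{6}$ for Ridge).

```python
def log_grid(low_exp, high_exp, num):
    """Return num values equally spaced in log10 between 10**low_exp and 10**high_exp."""
    step = (high_exp - low_exp) / (num - 1)
    return [10 ** (low_exp + i * step) for i in range(num)]


lasso_grid = log_grid(-1, 3, 20)  # candidate values of lambda_l
ridge_grid = log_grid(2, 8, 20)   # candidate values of lambda_r

# Grid points matching the penalty levels reported for Models L1-L3 and R1-R3.
print([round(lasso_grid[i], 2) for i in (5, 7, 9)])
print([f"{ridge_grid[i]:.2e}" for i in (7, 9, 13)])
```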

Figure A1 shows the 25%, 50%, and 75% percentiles of the out-of-sample $\tilde{R}^2$ for different $\lambda_l$ values. One can see that all three curves (25%, 50% and 75%) decrease fast before $\lambda_l^{*}=2.98$ and tend to be flat after $\lambda_l^{*}$. Also, the numbers of $\tilde{R}^2$ in both $I_1(\lambda_l)$ and $I_2(\lambda_l)$ are large when $\lambda_l=\lambda_l^{*}$. Hence, $\lambda_l^{*}$ is a good choice of the regularization level. Table A5 shows the values of $\lambda_l$ for which the number of $\tilde{R}^2$ in $I_1(\lambda_l)$ and $I_2(\lambda_l)$, respectively, is the largest. Table 9 shows the features selected from the first step of the two-step Lasso, with corresponding parameters $\lambda_l=1.13$, $2.98$, and $7.85$, respectively. It also shows the models from the OLS regression in the second step of the two-step Lasso. For instance, Model L2 of table 9, with $\lambda_l=2.98$, has four features:

$$\text{Bid-ask spread}=83.48\times\text{Volatility}+39.07\times\text{Trading activity}-19.45\times\text{Log(Total volume)}+0.23\times\text{Issued years}+86. \tag{13}$$

In addition, as seen from table 9:

Table 9. Two-step Lasso regression table: the impact on bid-ask spread (in bp).

  • The coefficient of Volatility is positive with value 83.48. This is consistent with existing theoretical and empirical studies in market microstructure in that a higher return volatility is predicted to lead to decreased liquidity (e.g. Stoll 1978).

  • The coefficient of Issued years is positive with value 0.23, which means a newly issued bond will have a small bid-ask spread. This is consistent with the work of Konstantinovsky et al. (2016), which argued that recent and large issues are cheaper to trade than seasoned and small ones.

  • The number of trades per day N and the trade volume V (in dollars) suggest a joint impact of order $\log(N/V)$ on the bid-ask spread. Section 4.4 provides a detailed analysis of this relationship.

  • Finally, $\lambda_l=7.85$ leads to the features of Volatility and Issued year in model L3. Compared to model L2 with four features and $R^2=51.5\%$, the $R^2$ of L3 drops to 43.22%. In our view, the model with four features, significantly reduced from the original 26 features, is preferable to L3.
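Model L2 in equation (13) can be read as a simple scoring rule for the expected spread. A minimal sketch, using the reported coefficients with the negative sign on Log(Total volume) that the OLS discussion implies; the function name is hypothetical, and the inputs are assumed to be expressed in the same units and scaling as the regression features of table 6:

```python
def expected_spread_bp(volatility, trading_activity, log_total_volume, issued_years):
    """Expected bid-ask spread (bp) from Model L2, equation (13)."""
    return (
        83.48 * volatility
        + 39.07 * trading_activity
        - 19.45 * log_total_volume  # larger traded volume lowers the spread
        + 0.23 * issued_years       # seasoned bonds are costlier to trade
        + 86.0                      # intercept
    )


# Marginal effect of one extra unit of Log(Total volume), other features fixed.
base = expected_spread_bp(0.5, 1.0, 2.0, 3.0)
more_volume = expected_spread_bp(0.5, 1.0, 3.0, 3.0)
print(round(more_volume - base, 2))
```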

It is worth noting that our findings are supported by previous studies. For instance, Chacko et al. (2005) found that credit quality, the age of a bond, the size of a bond issue, the original maturity value of a bond at issuance date, and provisions such as call, put, or convertible options all have strong impacts on liquidity. Choi and Huh (2019) showed that trade size and maturity date are important features for understanding the bid-ask spread.

4.3.3. Features identified from ridge

We next discuss the results from a group of Ridge regression methods parameterized by different λr values and analyze how to select the best λr with cross-validation.

In this analysis, 20 different values of $\lambda_r$ are chosen in the range $[10^{2},10^{8}]$ with uniform partition in the log scale. Figure A2 shows the 25%, 50% and 75% percentiles of the out-of-sample $\tilde{R}^2$ for different $\lambda_r$ values. One can see that all three curves (25%, 50%, and 75%) start to decrease at $\lambda_r=1.27\times10^{6}$. Hence $1.27\times10^{6}$ is a good choice for the regularization level.

Table A6 shows the values of $\lambda_r$ for which the number of $\tilde{R}^2$ in $I_1(\lambda_r)$ and $I_2(\lambda_r)$, respectively, is the largest. Table 10 shows the results of Ridge regressions with parameters $\lambda_r=1.62\times10^{4}$, $6.95\times10^{4}$, and $1.27\times10^{6}$.

Table 10. Ridge regression table: the impact on bid-ask spread (in bp).

The analysis by the Ridge regression is consistent with the findings from the two-step Lasso. In particular,

  • When $\lambda_r$ goes up, the coefficients of the following features go to 0 very fast: Indicator functions of different sectors, Proportion of buy (or sell) volumes (or numbers), Turnover, and Number of trading days. Note that from table 10 these features are also excluded from Model L3 and L4 of table 9, which means that results from these two approaches are consistent.

  • When $\lambda_r$ takes a large value $6.95\times10^{4}$, the Volatility, Issued years, Trading activity and Log(Total volume) are still significant. This is also consistent with the findings from Lasso in table 9.

  • Both two-step Lasso and Ridge regressions point to the significance of the time value and the special structure of bonds. The variable years since issuance is significant in two of the two-step Lasso models, L1 and L3, and all three Ridge regression models.

The difference between Lasso and Ridge regression: Avg price is not significant in any of the three two-step Lasso models, whereas it is significant in all three Ridge models. This inconsistency is expected because of the collinearity among features. When features are correlated, Lasso tends to select one feature from a group of correlated features, while Ridge tends to shrink the coefficients of the group towards the same value (Zou and Hastie 2005). Indeed, as shown in Models R1, R2 and R3, the coefficients of Prop number of buys and Prop number of sells have the same value but different signs; the coefficients of Prop buy volume and Prop sell volume also have the same value but different signs. Additionally, the reappearance of Avg price in Model EN3 is due to this group effect as well.

4.3.4. Features identified from two-step EN

Finally, we discuss the results from the two-step EN models parameterized by different λe values and analyze how to select the best λe with cross-validation.

Figure A3 shows the 25%, 50%, and 75% percentiles of the out-of-sample $\tilde{R}^2$ for different $\lambda_e$ values given $\alpha=0.2$, $0.5$ and $0.8$. When $(\alpha,\lambda_e)=(0.5,10^{3})$ and $(\alpha,\lambda_e)=(0.8,10^{3})$, more than 170 empirical $\tilde{R}^2$ fall into $I_2$. This is because the hyperparameter over-penalizes the model, so that all the coefficients and the empirical $\tilde{R}^2$ are nearly zero. Therefore, these sets of hyperparameters should be excluded.

Instead, the following parameters are selected for the analysis: $(\alpha,\lambda_e)=(0.5,0.774)$, $(\alpha,\lambda_e)=(0.8,2.15)$, and $(\alpha,\lambda_e)=(0.5,129)$. Parameter $(\alpha,\lambda_e)=(0.8,2.15)$ leads to the following set of features: Volatility, Number of trades, Log(total volume) and Issued year. This is consistent with the feature selection in the two-step Lasso model L2. Parameter $(\alpha,\lambda_e)=(0.5,129)$ leads to the features Volatility and Average price in model EN3. Compared with model EN2 with four features and $R^2=51.5\%$, the $R^2$ of EN3 drops to 42.62%. Similar to the argument for L3, model EN2 with four features is superior to EN3, in our view.

Through all three different regression models, the consensus is that Volatility and Issued years are important features.

4.3.5. Out-of-sample performance

It is well recognized that out-of-sample forecast performance is more trustworthy than in-sample performance (Alpaydin 2020). The latter can be more sensitive to outliers and data mining, while the former tends to better reflect the information available to the forecaster in ‘real time’.

In this subsection, we test the out-of-sample performance of all regression models on an unseen dataset covering January–March 2017. The distributions of the relative errors (defined in (10)) are provided in figure 3 and the means of the relative errors are recorded in table 7. For the OLS model, errors smaller than 0.3 account for more than 70% of the testing dataset. Similar results hold for the two-step LASSO and two-step EN, for which errors smaller than 0.3 account for more than 60% of the testing dataset. The errors are bigger for Ridge, as expected, since the coefficients are biased.

Figure 3. Out-of-sample test on data during January–March 2017. (a) OLS model. (b) Two-step LASSO Model L2 with parameter $\lambda_l=2.98$. (c) Ridge Model R3 with hyper-parameter $\lambda_r=1.27\times10^{6}$ and (d) Two-step EN Model EN1 with hyper-parameter $(\alpha,\lambda_e)=(0.5,0.77)$.

The value of $R^2$ in table 8 indicates that the OLS model can explain around 60% of the variance; hence its out-of-sample performance is acceptable. Similar results are obtained from the two-step EN model EN1 and the two-step LASSO model L2. Ridge regression model R3 has a larger mean relative error since the coefficients are biased by the L2 penalty term, which is also expected.

4.4. Summary: bid-ask spread of corporate bonds

Different linear regressions are performed for feature selection and for estimating the bid-ask spread using two kinds of variables: one describing the bond and the other characterizing the market. Tables 8–11 summarize the results, with comparisons to the benchmark OLS approach.

Table 11. Two-step EN regression table: the impact on bid-ask spread (in bp).

These regressions allow for computing an ‘expected bid-ask spread’ for a given week, which can be used as a benchmark cost for TCA. In particular,

  • Volatility is an important feature, as expected by both empirical observations and theory: the larger the volatility, the larger the bid-ask spread. It has been observed in practice that an increase of 5% in volatility, that is 1/2 of its standard deviation in our dataset, corresponds to an increase of the bid-ask spread by 25 basis points, which is around one third of its standard deviation.

  • Number of trades per day N and Traded volume V (in dollars) are both important variables (in log units), with coefficients suggesting that $\log(N/V)$ is the feature impacting the bid-ask spread in basis points (see note 6), implying that:

    1. for a given trading activity N, the larger the traded volume, the smaller the bid-ask spread (in basis points);

    2. for a given traded volume in dollars, the lower the average trade size (i.e. the more trades), the larger the bid-ask spread.

    Our result is compatible with the documented stylized fact for corporate bonds that small trades obtain a worse bid-ask spread than large trades (Fermanian et al. 2016).

  • The value of the coupon and the duration of the corporate bond play a small role in the formation of the bid-ask spread (both with a positive coefficient).

  • Last but not least, the Number of years to maturity and the Years since issuance are selected by our robust regressions. Keep in mind that these two variables are linked, via the maturity of the bond, through the relation: Year to maturity = Maturity − Years since issuance. Hence naturally, the coefficient of year to maturity is negative while the one of years since issuance is positive: the further away from the maturity, the smaller the bid-ask spread (in basis points).

    This could support the market folklore that there is only a short time period after the issuance when corporate bond trading is not too expensive on secondary markets.

Other variables appearing in the OLS are not robust enough to be selected by penalized regressions. Removing these 17 variables from the regression only reduces the R2 from around 0.55 to around 0.50, a minor price to pay for the increased robustness. It is worth noticing that the R2 of all these OLS and regularized regressions are around 50%, which is in line with the best results obtained in the literature: Dick-Nielsen et al. (2012) obtained R2 between 0.50 and 0.80, while the R2 of other studies were far below 0.50 (see section 1). The out-of-sample performance in section 4.3.5 further indicates the promise of the regression models.

In addition, the regression results (table A9 for OLS and table A8 for two-step LASSO) between the Enhanced TRACE and Standard TRACE datasets confirm similar performance in terms of R2 for these two datasets. Therefore, one could choose either dataset for the bid-ask spread estimation, depending on the time-lag and the accuracy of the trade volume.

5. Price impact analysis

After analyzing the average TCA on a weekly basis (section 4), we now move on to the second part of the TCA analysis. We focus on the individual trade, and study the amplitude of its price impact and the price impact decay after the transaction for liquid corporate bonds. The goal of the price impact analysis is to determine the necessary set of events to fit the mid-price dynamics and to understand how the impact of each type of event decays over time.

As a benchmark comparison, recall several stylized facts on equity markets from Bouchaud et al. (2009):

  • buy trades on average push the price up and sell trades on average drive the price down;

  • the impact curve as a function of the volume of the trade is strongly concave. In other words, large volumes impact the price only marginally more than small volumes;

  • the sign of market orders is strongly autocorrelated in time.

To see whether these facts hold for corporate bonds, we apply a transient price impact model to estimate the price impact amplitude and decay pattern. Since the trading frequency is much lower on corporate bonds than on equities, it is more appropriate to use the ‘event time’ in our transient impact model instead of the chronological time used in other non-parametric price impact models for equity markets (Biais et al. 2016, Van Kervel and Menkveld 2019).

Our first attempt is to naively model the mid-price by a single-event TIM (TIM1) (section 5.2). Using the signature plot as a metric for the goodness-of-fit shows that TIM1 is not sufficient to describe the mid-price movements for corporate bonds. Meanwhile, statistical evidence implies an asymmetry between the price impacts from customer-buy orders and customer-sell orders, as detailed in section 5.3.1. This statistical evidence motivates us to propose a TIM model with two types of events (TIM2) (section 5.3.2): customer-buy orders and customer-sell orders. The signature plot indicates good performance of this improved TIM framework.

5.1. Review: transient price impact models

Let us first review the classic transient impact model (TIM) following Bouchaud et al. (2009), expressed in ‘event time’. Assume $\Pi$ is the set of event types considered on the market and the mid-price $M_k^b$ of a corporate bond $b$ follows Eisler et al. (2012), Taranto et al. (2018) and Lehalle and Laruelle (2018):

$$M_k^b=\sum_{k'=-\infty}^{k}\left(\sum_{\pi\in\Pi}G_\pi^b(k-k')\,\mathbb{1}(\pi_{k'}^b=\pi)\,(V_{k'}^b)^{\alpha}\,\epsilon_{k'}^b+\eta_{k'}^b\right)+M_{-\infty}^b, \tag{14}$$

where $\epsilon_k^b\in\{-1,+1\}$ is the sign of the $k$th trade, the estimation of which has been detailed in section 3.1, $V_k^b$ is the volume of the $k$th trade, $\pi_k^b$ is the type of the $k$th trade, $\alpha$ is a power index, $G_\pi^b(\delta k)$ is a decaying kernel for events of type $\pi$, and $M_{-\infty}^b$ is an initial value for the mid-price. The noise $\eta^b$ is a random change of the fair price, independent of $\epsilon^b$ and assumed to be i.i.d.

Note that $G_\pi^b$ is typically an exponential or a power law (i.e. $G_\pi^b(\delta t)\sim\exp(-\lambda\,\delta t)$ or $(1+\delta t)^{-\gamma}$) (Eisler et al. 2012, Lehalle and Laruelle 2018, Taranto et al. 2018). $G_\pi^b(\delta k)$ can be interpreted as the response function per bond when $\alpha=1$, and as the response function per order when $\alpha=0$ and the volume is ignored. In equity markets, there has been empirical evidence showing that $\alpha\approx0.1$.
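To make the propagator dynamics concrete, here is a small simulation sketch of a single-event-type version of the model with an exponential kernel. It is illustrative only: the sum starts at the first simulated trade rather than in the infinite past, and the kernel parameters, the starting price and the function names are hypothetical choices.

```python
import math
import random


def exponential_kernel(lag, amplitude=1.0, decay=0.5):
    """Decaying kernel G(lag) = amplitude * exp(-decay * lag)."""
    return amplitude * math.exp(-decay * lag)


def simulate_mid_price(signs, volumes, kernel, alpha=0.0, noise_std=0.0,
                       m_start=100.0, seed=0):
    """Mid-price after each trade under a one-event-type transient impact model."""
    rng = random.Random(seed)
    prices = []
    noise_sum = 0.0
    for k in range(len(signs)):
        noise_sum += rng.gauss(0.0, noise_std)  # accumulated fair-price noise
        impact = sum(
            kernel(k - j) * volumes[j] ** alpha * signs[j]  # decayed past impact
            for j in range(k + 1)
        )
        prices.append(m_start + impact + noise_sum)
    return prices


# A customer buy followed by a customer sell, no noise, volumes ignored (alpha = 0).
prices = simulate_mid_price([+1, -1], [1e6, 1e6], exponential_kernel)
print(prices)
```

With $\alpha=0$ every trade has the same immediate impact $G(0)$, and what remains of the earlier buy when the sell arrives is governed entirely by the decay of the kernel.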

5.2. First attempt: single-event transient impact model

In this section, we show that the naive TIM1 model (i.e. equation (14) with one type of event) does not fit the price impact curves for corporate bonds.

To see this, note that the mid-price dynamics of bond $b$ under TIM1 are:

$$M_k^b=\sum_{k'=-\infty}^{k}\left(G^b(k-k')\,(V_{k'}^b)^{\alpha}\,\epsilon_{k'}^b+\eta_{k'}^b\right)+M_{-\infty}^b, \tag{15}$$

where $\epsilon_k^b\in\{-1,+1\}$ is the sign of the $k$th trade as estimated in section 3.1, $V_k^b$ is the volume of the $k$th trade, $\alpha$ is a power index, $G^b(\delta k)$ is a decaying kernel of the bond $b$ mid-price, and $M_{-\infty}^b$ is an initial value for the mid-price. The noise $\eta^b$ is a random change of the fair price, independent of $\epsilon^b$ and assumed to be i.i.d.

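Under (15), the change of the mid-price can be written as:

$$R_k^b(1):=M_{k+1}^b-M_k^b=G^b(0)\,(V_{k+1}^b)^{\alpha}\,\epsilon_{k+1}^b+\eta_{k+1}^b+\sum_{j\ge0}\underbrace{\left(G^b(j+1)-G^b(j)\right)}_{\Delta_1G^b(j)}(V_{k-j}^b)^{\alpha}\,\epsilon_{k-j}^b. \tag{16}$$

Consequently, with $S^b(l)=E[R_k^b(1)\,\epsilon_{k-l+1}^b]$ and $C^b(n)=E[(V_{t+n}^b)^{\alpha}\,\epsilon_{t+n}^b\,\epsilon_t^b]$, we obtain:

$$S^b(l)=G^b(0)\,C^b(l)+\sum_{j=0}^{+\infty}\Delta_1G^b(j)\,C^b(l-j-1). \tag{17}$$

If we only focus on the first $N$ transactions in the calculation of the response function, then (17) can be written in the following matrix format:

$$\underbrace{\begin{pmatrix}S^b(1)-G^b(0)C^b(1)\\ S^b(2)-G^b(0)C^b(2)\\ \vdots\\ S^b(L)-G^b(0)C^b(L)\end{pmatrix}}_{=:\bar{S}^b(L)}=\underbrace{\begin{bmatrix}C^b(0)&C^b(-1)&C^b(-2)&\cdots&C^b(-N+1)\\ C^b(1)&C^b(0)&C^b(-1)&\cdots&C^b(-N+2)\\ \vdots&&&&\vdots\\ C^b(L-1)&\cdots&&&C^b(L-N)\end{bmatrix}}_{=:\bar{C}^b(N,L)}\underbrace{\begin{pmatrix}\Delta_1G^b(0)\\ \Delta_1G^b(1)\\ \vdots\\ \Delta_1G^b(N-1)\end{pmatrix}}_{=:\bar{G}^b(N)}. \tag{18}$$

Note that it suffices to estimate $S^b(l)$ and $C^b(n)$ for different values of $l$ and $n$, and the initial value $G^b(0)$. Afterwards, an estimator for $\bar{G}^b(N)$ can be constructed using (18):

$$\hat{\bar{G}}^b(N)=\hat{\bar{C}}^b(N,L)^{-1}\,\hat{\bar{S}}^b(L), \tag{19}$$

with $\hat{\bar{C}}^b(N,L)$ and $\hat{\bar{S}}^b(L)$ the estimates of $\bar{C}^b(N,L)$ and $\bar{S}^b(L)$, respectively.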
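The inversion step can be sketched as follows for the square case $L=N$. The example is synthetic and only checks the algebra: it assumes the matrix entry in row $l$ and column $j$ is $C^b(l-j-1)$, builds a response function from a known kernel and a hypothetical sign correlation $C(n)=0.5^{|n|}$, and verifies that the kernel is recovered. A small Gaussian elimination keeps the sketch dependency-free; with NumPy one would more typically call `numpy.linalg.lstsq`, which also covers $L>N$. All function names are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def estimate_kernel(response, corr, g0, n_lags):
    """Recover [G(0), ..., G(n_lags)] from S(l), C(n) and G(0), with L = N = n_lags."""
    rhs = [response(l) - g0 * corr(l) for l in range(1, n_lags + 1)]
    matrix = [[corr(l - j - 1) for j in range(n_lags)] for l in range(1, n_lags + 1)]
    increments = solve_linear_system(matrix, rhs)  # Delta_1 G(0..N-1)
    kernel = [g0]
    for delta in increments:
        kernel.append(kernel[-1] + delta)          # cumulate increments into G
    return kernel


# Synthetic check: a known kernel and a hypothetical sign correlation.
true_kernel = [1.0, 0.6, 0.4, 0.3]
true_increments = [true_kernel[j + 1] - true_kernel[j] for j in range(3)]


def corr(n):
    return 0.5 ** abs(n)


def response(l):
    return true_kernel[0] * corr(l) + sum(
        d * corr(l - j - 1) for j, d in enumerate(true_increments)
    )


estimated = estimate_kernel(response, corr, g0=true_kernel[0], n_lags=3)
print([round(g, 6) for g in estimated])
```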

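To evaluate the model and quantify the price diffusion for different lags, define the signature plot (Bouchaud et al. 2009) as below:

$$D^b(l)=\frac{1}{l}\,E\left[(M_{t+l}^b-M_t^b)^2\right]. \tag{20}$$

For the TIM1 model, the approximated signature plot follows:

$$D_{\mathrm{TIM1}}^b(l)=\frac{1}{l}\sum_{0\le n<l}\left(G^b(l-n)\right)^2+\frac{1}{l}\sum_{n>0}\left(G^b(l+n)-G^b(n)\right)^2+2\,\Phi^b(l)+D_{\mathrm{const}}^b, \tag{21}$$

where $D_{\mathrm{const}}^b$ is some constant and $\Phi^b(l)$ is the correlation-induced contribution to the price diffusion:

$$\begin{aligned}l\,\Phi^b(l)={}&\sum_{0\le n<m<l}G^b(l-n)\,G^b(l-m)\,C^b(m-n)\\&+\sum_{0<n<m}\left[G^b(l+n)-G^b(n)\right]\left[G^b(l+m)-G^b(m)\right]C^b(m-n)\\&+\sum_{0\le n<l}\ \sum_{m>0}G^b(l-n)\left[G^b(l+m)-G^b(m)\right]C^b(m+n).\end{aligned}$$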
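The empirical counterpart of (20) is straightforward to compute from a mid-price series sampled in event time. A minimal sketch, replacing the expectation by a sample average over all available starting points; the function name and the two toy series are hypothetical:

```python
def signature_plot(mid_prices, max_lag):
    """Empirical D(l) = mean((M[t+l] - M[t])**2) / l for l = 1..max_lag."""
    result = {}
    for lag in range(1, max_lag + 1):
        diffs = [
            mid_prices[t + lag] - mid_prices[t]
            for t in range(len(mid_prices) - lag)
        ]
        result[lag] = sum(d * d for d in diffs) / len(diffs) / lag
    return result


# A trending series: D(l) grows with l (super-diffusive).
trend = [float(t) for t in range(10)]
# A bouncing series: D(l) collapses at even lags (strong mean reversion).
bounce = [float(t % 2) for t in range(10)]

print(signature_plot(trend, 3))
print(signature_plot(bounce, 2))
```

A flat $D(l)$ corresponds to diffusive prices; a decreasing one indicates mean reversion, for example from bid-ask bounce, and an increasing one indicates trending prices. This is why the shape of the curve is a useful goodness-of-fit target for the fitted propagator.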

Experiment set-up. We fit the TIM1 model (18)–(19) with L = N = 10 for the top-200 bonds (described in section 2.0.0.7). In order to facilitate the comparison of different bonds, we calculate the propagator functions for relative price changes with $\alpha=0$ (see note 7).

From figure 4, observe that:

  • unlike equity markets where the decay of the propagator function is slow for both large tick stocks and small tick stocks (Taranto et al. 2018), the decay of the propagator functions is fast in corporate bond markets and $G(l)\approx10$ (bp) when $l\approx10$ (figure 4(a)).

  • since the signature plot serves as a metric to evaluate the fitted models, we observe from figure 4(b) that DTIM1(l), the signature calculated from the TIM1 Model, does not fit well with the signature plot D(l) calculated from the data. It appears that DTIM1(l) overestimates the signatures for small l.

5.3. Modified TIM model and asymmetric price impact

5.3.1. Statistical evidence of asymmetric price impact

We next present some statistical evidence on the asymmetry of price impacts. This study then motivates us to consider a two-type event model treating customer-buy and customer-sell orders separately.

Figure 4. Fitted TIM1 model and the goodness-of-fit (aggregation over 200 bonds and α=0). (a) Propagator function G for TIM1 model and (b) Signature plots: data vs. TIM1 model.

To start, we adopt the one-sided spread to test whether there is any difference between the buy-side liquidity and the sell-side liquidity (Choi and Huh 2019):

$$\text{spread}_B=\frac{\text{traded price}-\text{reference price}}{\text{reference price}}\,\mathbb{1}(\text{buy order}),\qquad\text{spread}_S=\frac{\text{reference price}-\text{traded price}}{\text{reference price}}\,\mathbb{1}(\text{sell order}). \tag{22}$$

For each customer trade, we calculate its reference price as the volume-weighted average price of inter-dealer trades larger than $100 000 in the same bond-day, excluding inter-dealer trades executed within 15 minutes. $\text{spread}_B$ and $\text{spread}_S$ are calculated at the bond-day level by taking the volume-weighted average of trade-level spreads. The average buy spread is 44.52 (bp) and the average sell spread is 38.74 (bp) across all 1993 liquid bonds. Afterwards, we perform a t-test with the null hypothesis that the buy spread and the sell spread have the same sample mean. The null hypothesis is rejected with a p-value smaller than 1%, indicating that the buy spread is different from the sell spread. Meanwhile, we also perform t-tests for individual bonds: 1215 out of 1993 bonds have p-values smaller than 5%, indicating different buy spread and sell spread distributions.

The rank frequency plot of buy-spread ($\text{spread}_B$) and sell-spread ($\text{spread}_S$) is visualized in figure 5. Note that a rank-frequency distribution is a discrete form of a quantile function (inverse cumulative distribution) in reverse order, giving the size of the element at a given rank. From figure 5, we observe that the distributions of the buy-spread and sell-spread have different tail behaviors.
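The one-sided spreads and the comparison of their means can be sketched as below. The text does not state which variant of the t-test is used, so the unequal-variance (Welch) statistic here is an assumption, as are the function names and the toy numbers. Spreads are scaled to basis points.

```python
import math
import statistics


def one_sided_spread_bp(traded_price, reference_price, is_customer_buy):
    """One-sided spread of a customer trade, in basis points of the reference price."""
    if is_customer_buy:
        signed = traded_price - reference_price  # customers buy above the reference
    else:
        signed = reference_price - traded_price  # customers sell below the reference
    return 1e4 * signed / reference_price


def welch_t_statistic(sample_a, sample_b):
    """t-statistic for equal means, allowing unequal variances (assumed variant)."""
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    std_err = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (statistics.fmean(sample_a) - statistics.fmean(sample_b)) / std_err


buy_bp = one_sided_spread_bp(100.5, 100.0, is_customer_buy=True)
sell_bp = one_sided_spread_bp(99.6, 100.0, is_customer_buy=False)

# Hypothetical bond-day spreads (bp) for one bond.
buy_spreads = [44.0, 46.0, 45.0]
sell_spreads = [38.0, 40.0, 39.0]
t_stat = welch_t_statistic(buy_spreads, sell_spreads)
print(buy_bp, round(sell_bp, 6), round(t_stat, 3))
```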

5.3.2. Modified TIM models and estimation of asymmetric price impact

The preliminary analysis of price impact asymmetry from section 5.3.1 motivates us to propose a modified TIM model in which customer-buy orders and customer-sell orders are treated as different events in the calculation of propagator functions (Bouchaud et al. 2009, Eisler et al. 2012, Eisler and Bouchaud 2016, Taranto et al. 2018, Schneider and Lillo 2019). See also Jurksas et al. (2021) on liquidity spill-overs in the sovereign bond market, who estimate the price impact curves for buy and sell orders separately.

Figure 5. Rank frequency plot of buy-spread (spreadB) and sell-spread (spreadS) of all 1,993 bonds.

This model is inspired by Taranto et al. (2018), where events with small trades and large trades are treated differently. Here we assume that there are two types of events, $\Pi:=\{+1,-1\}$, with $+1$ denoting the customer-buy orders and $-1$ denoting the customer-sell orders. The calculation of the propagator function is similar to (19) and is detailed in appendix A.7. See also Lehalle and Laruelle (2018, appendix A.12) for a more detailed discussion.

Experiment set-up. We fit the TIM2 model (18)–(19) with L = N = 10 for the top-200 bonds (described in section 2.0.0.7). As before, we calculate the propagator functions for relative price changes in order to make different bonds comparable. In the estimation, we take $\alpha=0$; the qualitative results for small $\alpha$ are similar.

One can observe the following from figure 6. First, customer-buy orders have larger price impacts than customer-sell orders for the first few trade-times l = 1, 2 (figure 6(a)). Second, the decay of the propagator function for customer-buy orders is slightly faster than that for customer-sell orders (figure 6(a)). Moreover, comparing figures 4(b) and 6(b), we see that the TIM2 model fits the signature plot calculated from the data better. This implies that the TIM2 model, with customer-buy orders and customer-sell orders being treated differently, is better than the single-event model TIM1.

Figure 6. Fitted TIM2 model and the goodness-of-fit ($\alpha=0$ and aggregation over 200 selected bonds). (a) Propagator functions $G_{+1}$ and $G_{-1}$ of the TIM2 model and (b) Signature plots: data vs. TIM2 model.

Figure 7 suggests heterogeneity among bonds in terms of the size of the market impact, the different impacts between market-buy orders and market-sell orders, and the shape of the decay. In addition, for newly issued bonds (e.g. Wells Fargo 94974BFY1 4.1%) or bonds that are close to maturity (e.g. Transocean 893830AS8 6.0%), the difference between the customer-buy propagator function and the customer-sell propagator function is larger than for bonds that are in the middle of their life-time (e.g. Goldman Sachs 38141GGQ1 5.25%).

Figure 7. Heterogeneity among bonds (α=0). (a) Wells Fargo 94974BFY1 4.1% (Issued 2 years and 10 years to maturity). (b) Goldman Sachs 38141GGQ1 5.25% (Issued 5 years and 5 years to maturity) and (c) Transocean 893830AS8 6.0% (Issued 9 years and 2 years to maturity).

It is worth pointing out that price impact models based on no-arbitrage considerations in equity markets require the price impact to be symmetric (Huberman and Stanzl 2004, Gatheral 2010). It is intriguing to see the presence of an asymmetric price impact in the illiquid and fragmented bond market, suggesting possible arbitrage opportunities in the secondary OTC market for corporate bonds.

Summary. We propose to use propagator functions to measure the price impact of each single trade for corporate bonds that are liquid enough. Our analysis finds two characteristics of the price impact of corporate bonds:

  • The asymmetry between buying and selling trades. The mid price moves triggered by a trade on a corporate bond are larger for buying transactions than those for selling ones. In terms of TCA, it means that the asset manager has to respect such an asymmetry and take it into consideration during the evaluation of the counterparty dealers.

  • Decay in price impact curves, similar to the one identified in equity markets (Eisler et al. 2012, Taranto et al. 2018). The price impact curve consists of a jump corresponding to the adverse selection suffered by the dealer, followed by a decay stabilizing the price at the level of the permanent market impact.

6. Conclusion

This paper established a TCA benchmark in bond trading for retail investors and asset managers. It consists of (a) estimating the expected average cost on a weekly basis via regularized regression analysis and (b) investigating the amplitude of price impact and the price impact decay for each trade of liquid corporate bonds via the TIM model.

The most important features identified in the regression analysis are volatility, trading activity, log(total volume), and issued years. Meanwhile, asymmetry is discovered between buying and selling trades: mid-price moves triggered by a trade on corporate bonds are larger for buying than those for selling.

Our study suggests the following approach for TCA in practice:

  1. For all corporate bonds of interest, asset managers first compute an expected bid-ask spread given the characteristics of the bond and market conditions using one of the regression approaches proposed in section 4, and using either the Standard TRACE or the Enhanced TRACE datasets for bid-ask spread approximation.

  2. This reference bid-ask spread can be used to benchmark the bid-ask spread obtained while requesting for quotes from counterparties. It can also be used to score all the obtained trades during the week.

  3. Worst trades can be qualitatively evaluated using the average price impact curves obtained in section 5.3. More specifically, if a trade has a price impact larger than the curve shown in figure 7, then it can be identified as a ‘worst trade’ and the asset manager can conduct further analysis on the counterparty.
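The third step can be sketched as a simple screening rule. This is only one possible reading of the comparison: a trade is flagged when its realized impact exceeds the benchmark curve at any lag, with an optional tolerance. The function name, the tolerance parameter and all numbers are hypothetical.

```python
def flag_worst_trades(realized_impacts_bp, benchmark_curve_bp, tolerance_bp=0.0):
    """Return indices of trades whose impact path exceeds the benchmark at any lag."""
    flagged = []
    for idx, path in enumerate(realized_impacts_bp):
        exceeds = any(
            observed > reference + tolerance_bp
            for observed, reference in zip(path, benchmark_curve_bp)
        )
        if exceeds:
            flagged.append(idx)
    return flagged


# Hypothetical benchmark impact curve (bp) at lags 1, 2, 3 and two realized paths.
benchmark = [30.0, 20.0, 12.0]
realized = [
    [25.0, 18.0, 10.0],  # below the benchmark at every lag
    [45.0, 22.0, 11.0],  # above the benchmark at lags 1 and 2
]
worst = flag_worst_trades(realized, benchmark)
print(worst)
```

Given the asymmetry reported in section 5.3, separate benchmark curves for customer-buy and customer-sell trades would be a natural refinement of this screen.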

Disclosure statement

No potential conflict of interest was reported by the author(s).

Notes

1 Note that Biais and Green (2019) did not perform any linear regression, but relied on descriptive statistics, probably due to the lack of explanatory variables available during this period.

2 CUSIP stands for Committee on Uniform Securities Identification Procedures.

3 Our percentage of RPTs is lower than that reported in Harris (2015), partly because of different datasets with different time periods. Harris (2015) used the Standard TRACE dataset from April 1, 2014 to March 31, 2015, where the markers (‘1MM+’ and ‘5MM+’) for larger trades assign the same value to many large trades. Finally, we only count the RPTs for a subset of bonds whereas Harris (2015) estimated the RPTs for a larger set of bonds.

4 Note that there are other methods to estimate the sign of a trade when quote price is not available. See Holthausen et al. (1987) for the tick test and Lee and Ready (1991) for the inverse tick test.

5 $\Delta t<1$ minute is not realistic for the (illiquid) corporate bond market; meanwhile the choice of $\Delta t\ge10$ minutes is infeasible due to lack of data. Comparing results with $\Delta t=4$ minutes, $\Delta t=6$ minutes, and $\Delta t=5$ minutes leads to the choice of $\Delta t=5$ minutes.

6 The ridge regression suggests the feature being in the form of NV, with an additional term of the average price. Possible explanation: penalization in the Ridge regression tends to avoid large coefficients in the regression.

7 The results with small α are qualitatively the same.

References

  • Albanese, C. and Tompaidis, S., Small transaction cost asymptotics and dynamic hedging. Eur. J. Oper. Res., 2008, 185(3), 1404–1414.
  • Alpaydin, E., Introduction to Machine Learning, 2020 (MIT Press).
  • Asquith, P., Au, A.S., Covert, T. and Pathak, P.A., The market for borrowing corporate bonds. J. Financ. Econ., 2013, 107(1), 155–182.
  • Belloni, A. and Chernozhukov, V., Least squares after model selection in high-dimensional sparse models. Bernoulli, 2013, 19(2), 521–547.
  • Bessembinder, H. and Maxwell, W., Markets transparency and the corporate bond market. J. Econ. Perspect., 2008, 22(2), 217–234.
  • Bessembinder, H., Maxwell, W. and Venkataraman, K., Market transparency, liquidity externalities, and institutional trading costs in corporate bonds. J. Financ. Econ., 2006, 82(2), 251–288.
  • Biais, B. and Green, R., The microstructure of the bond market in the 20th century. Rev. Econ. Dyn., 2019, 33, 250–271.
  • Biais, B., Declerck, F. and Moinas, S., Who supplies liquidity, how and when? Technical report, BIS Working Paper, 2016.
  • Bouchaud, J.-P., Farmer, J.D. and Lillo, F., How markets slowly digest changes in supply and demand. In Handbook of Financial Markets: Dynamics and Evolution, pp. 57–160, 2009 (Elsevier).
  • Chacko, G., Mahanti, S., Mallik, G. and Subrahmanyam, M.G., The determinants of liquidity in the corporate bond markets: An application of latent liquidity, 2005.
  • Chakravarty, S. and Sarkar, A., Trading costs in three US bond markets. J. Fixed Income, 2003, 13(1), 39–48.
  • Chételat, D., Lederer, J. and Salmon, J., Optimal two-step prediction in regression. Electron. J. Stat., 2017, 11(1), 2519–2546.
  • Choi, J. and Huh, Y., Customer liquidity provision: Implications for corporate bond transaction costs, 2019. Available at SSRN 2848344.
  • Chordia, T., Sarkar, A. and Subrahmanyam, A., An empirical analysis of stock and bond market liquidity. Rev. Financ. Stud., 2005, 18(1), 85–129.
  • Collins, B.M. and Fabozzi, F.J., A methodology for measuring transaction costs. Financ. Anal. J., 1991, 47(2), 27–36.
  • Dick-Nielsen, J., How to clean Enhanced TRACE data, 2014. Available at SSRN 2337908.
  • Dick-Nielsen, J., Feldhütter, P. and Lando, D., Corporate bond liquidity before and after the onset of the subprime crisis. J. Financ. Econ., 2012, 103(3), 471–492.
  • Edwards, A.K., Harris, L.E. and Piwowar, M.S., Corporate bond market transaction costs and transparency. J. Finance, 2007, 62(3), 1421–1451.
  • Eisler, Z. and Bouchaud, J.-P., Price impact without order book: A study of the otc credit index market, 2016. Available at SSRN 2840166.
  • Eisler, Z., Bouchaud, J.-P. and Kockelkoren, J., The price impact of order book events: Market orders, limit orders and cancellations. Quant. Finance, 2012, 12(9), 1395–1419.
  • Eom, Y.H., Helwege, J. and Huang, J.-Z., Structural models of corporate bond pricing: An empirical analysis. Rev. Financ. Stud., 2004, 17(2), 499–544.
  • Fermanian, J.-D., Guéant, O. and Rachez, A., Agents' Behavior on Multi-Dealer-to-Client Bond Trading Platforms, 2015.
  • Fermanian, J.-D., Guéant, O. and Pu, J., The behavior of dealers and clients on the european corporate bond market: The case of multi-dealer-to-client platforms. Market Microstruct. Liquid., 2016, 2, 1750004.
  • Friewald, N. and Nagler, F., Dealer inventory and the cross-section of corporate bond returns. Social Science Research Network Working Paper Series, 2014.
  • Gatheral, J., No-dynamic-arbitrage and market impact. Quant. Finance, 2010, 10(7), 749759. [Taylor & Francis Online], [Web of Science ®][Google Scholar]
  • Glosten, L.R. and Milgrom, P.R., Bid, ask and transaction prices in a specialist market with heterogeneously informed traders. J. Financ. Econ., 1985, 14(1), 71100. [Crossref], [Web of Science ®][Google Scholar]
  • Goldstein, M.A., Hotchkiss, E.S. and Sirri, E.R., Transparency and liquidity: A controlled experiment on corporate bonds. Rev. Financ. Stud., 2007, 20(2), 235273. [Crossref], [Web of Science ®][Google Scholar]
  • Harris, L., Transaction costs, trade throughs, and riskless principal trading in corporate bond markets. Social Science Research Network Working Paper Series, 2015. [Google Scholar]
  • Hendershott, T. and Madhavan, A., Click or call? auction versus search in the over-the-counter market. J. Finance, 2015, 70(1), 419447. [Crossref], [Web of Science ®][Google Scholar]
  • Holthausen, R.W., Leftwich, R.W. and Mayers, D., The effect of large block transactions on security prices: A cross-sectional analysis. J. Financ. Econ., 1987, 19(2), 237267. [Crossref], [Web of Science ®][Google Scholar]
  • Huberman, G. and Stanzl, W., Price manipulation and quasi-arbitrage. Econometrica, 2004, 72(4), 12471275. [Crossref], [Web of Science ®][Google Scholar]
  • Jurksas, L., Teresiene, D. and Kanapickiene, R., Liquidity spill-overs in sovereign bond market: An intra-day study of trade shocks in calm and stressful market conditions. Economies, 2021, 9(1), 35. [Crossref], [Web of Science ®][Google Scholar]
  • Konstantinovsky, V., Ng, K.Y. and Phelps, B.D., Measuring bond-level liquidity. J. Portfolio Manage., 2016, 42(4), 116. [Crossref], [Web of Science ®][Google Scholar]
  • Lederer, J., Trust, but verify: Benefits and pitfalls of least-squares refitting in high dimensions. arXiv preprint arXiv:1306.0113, 2013. [Google Scholar]
  • Lee, C.M. and Ready, M.J., Inferring trade direction from intraday data. J. Finance, 1991, 46(2), 733746. [Crossref], [Web of Science ®][Google Scholar]
  • Lehalle, C.-A. and Laruelle, S., Market Microstructure in Practice, 2018 (World Scientific). [Crossref][Google Scholar]
  • Linciano, N., Fancello, F., Gentile, M. and Modena, M., The liquidity of dual-listed corporate bonds. empirical evidence from italian markets. Technical report, CONSOB. italy14bonds, 2014. [Google Scholar]
  • Mizrach, B., Analysis of corporate bond liquidity. Technical report, FINRA, 2015. [Google Scholar]
  • Nagel, J., Electronic trading in fixed income markets. Technical report, NBIS, 2016. [Google Scholar]
  • Ruzza, A., Agency issues in corporate bond trading. Technical report, SSRN, 2016. [Google Scholar]
  • Schneider, M. and Lillo, F., Cross-impact and no-dynamic-arbitrage. Quant. Finance, 2019, 19(1), 137154. [Taylor & Francis Online], [Web of Science ®][Google Scholar]
  • Schultz, P., Corporate bond trading costs: A peek behind the curtain. J. Finance, 2001, 56(2), 677698. [Crossref], [Web of Science ®][Google Scholar]
  • Stoll, H.R., The supply of dealer services in securities markets. J. Finance, 1978, 33(4), 11331151. [Crossref], [Web of Science ®][Google Scholar]
  • Taranto, D.E., Bormetti, G., Bouchaud, J. -P., Lillo, F. and Tóth, B., Linear models for the impact of order flow on prices. I. History dependent impact models. Quant. Finance, 2018, 18(6), 903915. [Taylor & Francis Online], [Web of Science ®][Google Scholar]
  • Van Kervel, V. and Menkveld, A., High-frequency trading around large institutional orders. J. Finance, 2019. [Crossref], [Web of Science ®][Google Scholar]
  • Wilson, D., Trivedi, K., Weisberger, N., Karoui, L., Timcenko, A., Ursua, J., Cole, G. and Yin, S., The state of play in the leveraged finance market: Ok for now. Technical Report 33, Global Economics Weekly, 2014. [Google Scholar]
  • Zou, H. and Hastie, T., Regularization and variable selection via the elastic net. J. R. Stat. Soc. Ser. B, 2005, 67(2), 301320. [Crossref][Google Scholar]
Appendices

Appendix 1. Literature review

Appendix 2. Data processing

A.1. Assigning a sign to a trade and identifying RPT

To estimate the sign of transactions, we first reproduce the essentials of the preprocessing in Harris (2015) that identifies riskless principal trades (RPTs). We identify potential RPTs as pairs of sequentially adjacent trades of the same size for which one trade is a customer trade. To find these trades in the Enhanced TRACE data, we first identify all size runs (sequences) of two or more trades of equal size. Next, for each size run, we consider which trades, if any, form a pair in a potential RPT. A pair of adjacent trades within a size run is a potential RPT if one of the two trades is a dealer trade with a customer, or if both trades are customer trades and the dealer both buys and sells. We identify the first such pair as a potential RPT, and then continue searching the size run for any additional pairs that do not involve trades already identified as being part of a potential RPT. Harris (2015) found that the RPT rate is above 42% and that 41% of customer trades appear to be RPTs. The RPT rate for our entire Enhanced TRACE data set is 23.9%. Moreover, table A2(a) shows that we found 21.8% RPTs.
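The pairing rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure of Harris (2015) verbatim: the record keys `size`, `is_customer` and `dealer_side` are hypothetical names, the input is assumed to be the time-ordered trades of a single bond, and "one trade is a dealer trade with a customer" is read as "exactly one of the two trades is a customer trade".

```python
from itertools import groupby

def find_potential_rpts(trades):
    """Return index pairs (i, i + 1) flagged as potential RPTs.

    `trades` is a time-ordered list of dicts for one bond with the
    hypothetical keys 'size', 'is_customer' (bool) and 'dealer_side'
    ('B' if the reporting dealer buys, 'S' if the dealer sells).
    """
    pairs = []
    start = 0
    # A size run is a maximal block of consecutive trades of equal size.
    for _, run in groupby(trades, key=lambda t: t["size"]):
        length = len(list(run))
        i = start
        while i + 1 < start + length:
            a, b = trades[i], trades[i + 1]
            # Assumed reading: exactly one leg is a customer trade.
            one_customer = a["is_customer"] != b["is_customer"]
            # Both legs are customer trades and the dealer buys and sells.
            both_customer_offsetting = (
                a["is_customer"] and b["is_customer"]
                and a["dealer_side"] != b["dealer_side"]
            )
            if one_customer or both_customer_offsetting:
                pairs.append((i, i + 1))
                i += 2  # both trades are used up; continue after the pair
            else:
                i += 1
        start += length
    return pairs
```

Advancing by two after a match is what keeps later pairs from reusing a trade that already belongs to a potential RPT.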

A.2. Data filtering

The data cleaning procedure combines the approaches in Dick-Nielsen (2014) and Harris (2015), with the following steps:

  1. Remove canceled trades and apply corrections to ensure that only trades that are actually settled are accounted for. After the removal of canceled trades and canceled corrections records, there are 32 931 539 trades.

  2. Remove the transactions reported by agents as both principal and agent in the dealer-to-dealer transactions report to FINRA (see Dick-Nielsen 2014). As a result, 2 095 934 (6.36%) of the reports are removed, with 30 835 605 reports remaining after this step.

  3. Remove the transactions on unusual trading days such as weekends and holidays. Thus 5753 (0.02%) records are removed, with 30 829 852 reports left after this step.

  4. Exclude all trade reports with an execution time outside of the normal 8:00AM to 5:15PM ET trading hours. Therefore 745 619 (2.4%) are removed, with 30 084 233 reports remaining after this step.

  5. Remove all irregular trades with sales condition codes that indicate late reports, late reports after market hours, weighted average price trades, or trades with special price flags. As a result, 583 157 (1.9%) reports are removed, with 29 501 076 reports left after this step.

  6. Remove trade reports with a price below 10. This price filter step excludes 217 321 (0.74%) of the remaining trades, with 29 283 755 reports left after this step.

  7. Select reports classified as corporate bonds in the dataset. Remove those reports with sub-product indicators such as Mortgage Backed Securities Transactions. Consequently 563 942 (1.94%) of the remaining reports are filtered out, with 28 719 813 reports left.
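As a quick consistency check on the bookkeeping above, the running totals can be recomputed from the removed counts alone. The sketch below uses only the figures quoted in the list; the step labels are shorthand introduced here, and each share is computed relative to the reports remaining before that step, which is an assumption about how the percentages were defined.

```python
# (shorthand step label, number of reports removed), taken from the list above
STEPS = [
    ("principal-and-agent reports", 2_095_934),
    ("unusual trading days", 5_753),
    ("outside 8:00AM-5:15PM ET", 745_619),
    ("irregular sales conditions", 583_157),
    ("price below 10", 217_321),
    ("non-corporate sub-products", 563_942),
]

def running_totals(initial, steps):
    """Return (label, removed, share in percent, remaining) for each step."""
    rows = []
    remaining = initial
    for label, removed in steps:
        share = 100.0 * removed / remaining  # relative to the pre-step count
        remaining -= removed
        rows.append((label, removed, round(share, 2), remaining))
    return rows

rows = running_totals(32_931_539, STEPS)
for label, removed, share, remaining in rows:
    print(f"{label:30s} removed={removed:>9,d} ({share:5.2f}%) left={remaining:,d}")
```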

Appendix 3. Statistical test and regression analysis

A.3. ANOVA test and Kruskal–Wallis H-test

Suppose there are $W$ groups of observations (in our example, $W = 6$). Group $w$, for $w = 1, 2, \ldots, W$, contains $n_w$ observations, denoted $y_{w,1}, \ldots, y_{w,n_w}$, and the total number of observations among all groups is $n = \sum_{w=1}^{W} n_w$. Denote by $\bar{y}_w = \frac{1}{n_w}\sum_{i=1}^{n_w} y_{w,i}$ the sample mean in group $w$ and by $\bar{y} = \frac{1}{n}\sum_{w=1}^{W}\sum_{i=1}^{n_w} y_{w,i}$ the sample mean of all observations.

One-way ANOVA test. A one-way ANOVA test is applied to samples from two or more groups, possibly with differing sizes. In a one-way ANOVA test, the formula for the F-ratio is $F = \mathrm{MSB}/\mathrm{MSW}$, where $\mathrm{MSB} = \frac{1}{W-1}\sum_{w=1}^{W} n_w(\bar{y}_w - \bar{y})^2$ is the between-group mean square value and $\mathrm{MSW} = \frac{1}{n-W}\sum_{w=1}^{W}\sum_{i=1}^{n_w}(y_{w,i} - \bar{y}_w)^2$ is the within-group mean square value.
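For concreteness, the F-ratio can be computed directly from its definition. The following is a minimal sketch that assumes the standard degrees of freedom, $W - 1$ between groups and $n - W$ within groups; the function name and the list-of-lists input layout are illustrative choices.

```python
def one_way_anova_f(groups):
    """F-ratio MSB / MSW for a list of groups (each a list of floats)."""
    n = sum(len(g) for g in groups)          # total number of observations
    w = len(groups)                          # number of groups
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares, weighted by group size.
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group sum of squares around each group mean.
    ssw = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    msb = ssb / (w - 1)
    msw = ssw / (n - w)
    return msb / msw
```

The function raises `ZeroDivisionError` when all observations within every group are identical, since the within-group mean square is then zero.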

A.4. KS test

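Denote by $F(x) = P(X_1 \le x)$ the cumulative distribution function (CDF) of the true underlying distribution of the data, and define the empirical CDF by $F_n(x) = P_n(X \le x) = \frac{1}{n}\sum_{i=1}^{n} I(X_i \le x)$. Then $\sup_{x\in\mathbb{R}} |F_n(x) - F(x)| \to 0$ almost surely as $n \to \infty$.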

Suppose that the first sample $X_1, X_2, \ldots, X_m$ of size $m$ has CDF $F(x)$ and the second sample $Y_1, Y_2, \ldots, Y_n$ of size $n$ has CDF $G(x)$. Suppose that one wants to test $H_0: F = G$ vs. $H_1: F \ne G$. Let $F_m(x)$ and $G_n(x)$ be their respective empirical CDFs. Then, under $H_0$, $D_{mn} = \left(\frac{mn}{m+n}\right)^{1/2} \sup_x |F_m(x) - G_n(x)|$ converges in distribution: $P(D_{mn} \le t) \to H(t) = 1 - 2\sum_{i=1}^{\infty} (-1)^{i-1} e^{-2i^2t^2}$, where $H(t)$ is the CDF of the KS distribution.
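The two-sample statistic and its limiting distribution are easy to evaluate numerically. The sketch below is illustrative only: it assumes the $\sqrt{mn/(m+n)}$ scaling and the standard Kolmogorov series with $t^2$ in the exponent, truncates the series at a user-chosen number of terms, and uses function names introduced here.

```python
from bisect import bisect_right
from math import exp, sqrt

def ks_two_sample(xs, ys):
    """Return (sup |F_m - G_n|, scaled statistic D_mn) for two samples."""
    xs_sorted, ys_sorted = sorted(xs), sorted(ys)
    m, n = len(xs_sorted), len(ys_sorted)
    sup = 0.0
    # Both empirical CDFs are right-continuous step functions, so the
    # supremum is attained at one of the pooled sample points.
    for v in xs_sorted + ys_sorted:
        fm = bisect_right(xs_sorted, v) / m
        gn = bisect_right(ys_sorted, v) / n
        sup = max(sup, abs(fm - gn))
    return sup, sqrt(m * n / (m + n)) * sup

def kolmogorov_cdf(t, terms=100):
    """Truncated series for H(t) = 1 - 2 * sum (-1)^(i-1) exp(-2 i^2 t^2)."""
    if t <= 0:
        return 0.0
    series = sum((-1) ** (i - 1) * exp(-2.0 * i * i * t * t)
                 for i in range(1, terms + 1))
    return 1.0 - 2.0 * series
```

An approximate p-value for large samples is then `1 - kolmogorov_cdf(d_mn)`, where `d_mn` is the second value returned by `ks_two_sample`.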

Kruskal–Wallis H-test. The Kruskal–Wallis H-test is a non-parametric version of ANOVA. The test works on two or more independent samples, which may have different sizes. The formula for the H-statistic is $H = \frac{12}{n(n+1)}\sum_{j=1}^{W} \frac{T_j^2}{n_j} - 3(n+1)$, where $T_j$ is the sum of ranks in the $j$th group.
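A minimal sketch of the H-statistic follows. It ranks the pooled observations, assigns average ranks to tied values (an assumption, since the formula above does not say how ties are handled), and applies no tie correction to the final statistic.

```python
def kruskal_wallis_h(groups):
    """H-statistic for a list of groups, without tie correction."""
    # Pool all observations, remembering the group each one came from.
    pooled = sorted((y, j) for j, g in enumerate(groups) for y in g)
    n = len(pooled)
    rank_sums = [0.0] * len(groups)
    i = 0
    while i < n:
        k = i
        # Extend k over a block of tied values.
        while k + 1 < n and pooled[k + 1][0] == pooled[i][0]:
            k += 1
        avg_rank = (i + k) / 2 + 1  # average of the ranks i+1, ..., k+1
        for t in range(i, k + 1):
            rank_sums[pooled[t][1]] += avg_rank
        i = k + 1
    weighted = sum(t * t / len(g) for t, g in zip(rank_sums, groups))
    return 12.0 / (n * (n + 1)) * weighted - 3 * (n + 1)
```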

A.5. Cross-validation results

Lasso

Figure A1. Cross-validation score for Lasso.

Table B1. List of empirical papers on transaction costs of corporate bonds.

Table C1. Statistics of selected 1,993 bonds for the BA-spread regression.

Table C2. Statistics of selected 200 bonds for the price impact analysis.

Table C3. Data filtering procedure.

Table D1. Number of $\tilde{R}^2$ in the confidence interval for Lasso.

A.6. Comparison between the enhanced TRACE and standard TRACE datasets

The comparison between the Enhanced TRACE and Standard TRACE datasets for OLS and two-step LASSO is summarized here.

Table A9 compares the OLS regression with data in Enhanced TRACE and Standard TRACE from the same period, January 1, 2015 to December 31, 2016. It shows similar $R^2$ values, with 55.4% for Enhanced TRACE and 54.4% for Standard TRACE. The relative difference between the regression coefficients is small except for the following features.

  • Prop Volume sell $ and Prop Volume buy $: For Enhanced TRACE, the coefficient of Prop Volume buy $ is 50% larger than that of Prop Volume sell $. This relationship is reversed for Standard TRACE. Further studies of the distribution of capped transactions show that 65% of these transactions are customer sell orders. This contributes to the changes of these two features in the Standard TRACE as well as the difference in regression coefficients. See figure A4.

  • Turnover and Years to maturity: The coefficients of these two features are much bigger for Standard TRACE. This difference is tolerable as neither of the two features is significant ($p \ge 0.1$).

Comparing the two-step LASSO model on these two datasets (see table A8) confirms similar conclusions: the outstanding features are consistent and the regression coefficients are compatible.

Appendix 4. Additional details of the price impact analysis

Figure A2. Cross-validation score for Ridge.

Table D2. Number of $\tilde{R}^2$ in the confidence interval for Ridge regression.

Figure A3. Cross-validation score for EN. (a) α=0.2. (b) α=0.5 and (c) α=0.8.

Table D3. Number of $\tilde{R}^2$ in the confidence interval for EN.

A.7. Estimation of TIM2 model

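For simplicity, we omit the subscript $b$ (for bond $b$) in the derivation here. Assume that there are two types of events $\Pi := \{+1, -1\}$, with $+1$ denoting customer-buy orders and $-1$ denoting customer-sell orders. In this case, the mid-price dynamics (14) leads to the following expression for mid-price changes:

$$R_k^{(1)} := M_{k+1} - M_k = \sum_{\pi\in\Pi} G_\pi(0)\, I(\pi_{k+1}=\pi)\, V_{k+1}^{\alpha}\, \epsilon_{k+1} + \eta_{k+1} + \sum_{j=0}^{\infty}\sum_{\pi\in\Pi} \underbrace{\big(G_\pi(j+1) - G_\pi(j)\big)}_{\Delta_1 G_\pi(j)}\, I(\pi_{k-j}=\pi)\, \epsilon_{k-j}\, V_{k-j}^{\alpha}. \tag{A1}$$

As a consequence, we can write the conditional response functions and the response correlation matrix as:

$$S_\pi(l) = E\big[R_k\, \epsilon_{k-l+1} \,\big|\, \pi_{k-l+1} = \pi\big] = \frac{E\big[R_k\, \epsilon_{k-l+1}\, I(\pi_{k-l+1} = \pi)\big]}{P(\pi)}, \tag{A2}$$

$$C_{\pi,\pi'}(n) = E\big[\epsilon_t\, \epsilon_{t+n}\, V_{t+n}^{\alpha} \,\big|\, \pi_t = \pi,\ \pi_{t+n} = \pi'\big] = \frac{E\big[\epsilon_t\, I(\pi_t = \pi)\, V_{t+n}^{\alpha}\, \epsilon_{t+n}\, I(\pi_{t+n} = \pi')\big]}{P(\pi_t = \pi,\ \pi_{t+n} = \pi')}, \tag{A3}$$

for $1 \le l \le L$ and $-N \le n \le L$. Then the response function can be written as, for $\pi = +1$ or $-1$:

$$S_\pi(l) = \sum_{\pi'\in\Pi} P(\pi'|\pi)\, G_{\pi'}(0)\, C_{\pi,\pi'}(l) + \sum_{j=0}^{+\infty}\sum_{\pi'\in\Pi} P(\pi'|\pi)\, \Delta_1 G_{\pi'}(j)\, C_{\pi,\pi'}(l-j-1). \tag{A4}$$

Denote $\tilde{C}_{\pi,\pi'}(l) = P(\pi_{t+l} = \pi' \,|\, \pi_t = \pi)\, C_{\pi,\pi'}(l)$ and $\tilde{S}_\pi(l) = S_\pi(l) - \sum_{\pi'\in\Pi} G_{\pi'}(0)\, \tilde{C}_{\pi,\pi'}(l)$. Truncating the sum over $j$ at $N - 1$ then gives:

$$\underbrace{\begin{pmatrix} \tilde{S}_{+1}(1) \\ \tilde{S}_{+1}(2) \\ \vdots \\ \tilde{S}_{+1}(L) \\ \tilde{S}_{-1}(1) \\ \tilde{S}_{-1}(2) \\ \vdots \\ \tilde{S}_{-1}(L) \end{pmatrix}}_{\bar{S}(L)} = \underbrace{\begin{bmatrix} \tilde{\mathbf{C}}_{+1,+1} & \tilde{\mathbf{C}}_{+1,-1} \\ \tilde{\mathbf{C}}_{-1,+1} & \tilde{\mathbf{C}}_{-1,-1} \end{bmatrix}}_{\bar{C}(N,L)} \underbrace{\begin{pmatrix} \Delta_1 G_{+1}(0) \\ \Delta_1 G_{+1}(1) \\ \vdots \\ \Delta_1 G_{+1}(N-1) \\ \Delta_1 G_{-1}(0) \\ \Delta_1 G_{-1}(1) \\ \vdots \\ \Delta_1 G_{-1}(N-1) \end{pmatrix}}_{\bar{G}(N)}, \tag{A5}$$

where each block $\tilde{\mathbf{C}}_{\pi,\pi'}$ is the $L \times N$ matrix

$$\tilde{\mathbf{C}}_{\pi,\pi'} = \begin{bmatrix} \tilde{C}_{\pi,\pi'}(0) & \tilde{C}_{\pi,\pi'}(-1) & \cdots & \tilde{C}_{\pi,\pi'}(-N+1) \\ \tilde{C}_{\pi,\pi'}(1) & \tilde{C}_{\pi,\pi'}(0) & \cdots & \tilde{C}_{\pi,\pi'}(-N+2) \\ \vdots & \vdots & & \vdots \\ \tilde{C}_{\pi,\pi'}(L-1) & \tilde{C}_{\pi,\pi'}(L-2) & \cdots & \tilde{C}_{\pi,\pi'}(-N+L) \end{bmatrix},$$

and where, since the trade sign $\epsilon_k$ coincides with the event type $\pi_k$, we have:

$$S_{+1}(l) = E\big[R_k \,\big|\, \epsilon_{k-l+1} = 1\big], \qquad S_{-1}(l) = -E\big[R_k \,\big|\, \epsilon_{k-l+1} = -1\big],$$

$$\tilde{C}_{+1,+1}(n) = \frac{P(\pi_t = 1,\ \pi_{t+n} = 1)\, E\big[V_{t+n}^{\alpha} \,\big|\, \pi_t = 1,\ \pi_{t+n} = 1\big]}{P(\pi_t = 1)},$$

$$\tilde{C}_{+1,-1}(n) = -\frac{P(\pi_t = 1,\ \pi_{t+n} = -1)\, E\big[V_{t+n}^{\alpha} \,\big|\, \pi_t = 1,\ \pi_{t+n} = -1\big]}{P(\pi_t = 1)},$$

$$\tilde{C}_{-1,+1}(n) = -\frac{P(\pi_t = -1,\ \pi_{t+n} = 1)\, E\big[V_{t+n}^{\alpha} \,\big|\, \pi_t = -1,\ \pi_{t+n} = 1\big]}{P(\pi_t = -1)},$$

$$\tilde{C}_{-1,-1}(n) = \frac{P(\pi_t = -1,\ \pi_{t+n} = -1)\, E\big[V_{t+n}^{\alpha} \,\big|\, \pi_t = -1,\ \pi_{t+n} = -1\big]}{P(\pi_t = -1)}.$$

The signature plot for the TIM2 model (Eisler et al. 2012) can be similarly defined as:

$$\begin{aligned} l\, D_{\mathrm{TIM2}}(l) ={} & \sum_{0 \le n < l}\ \sum_{\pi = \pm 1} G_\pi(l-n)^2\, P(\pi) + \sum_{n > 0}\ \sum_{\pi = \pm 1} \big[G_\pi(l+n) - G_\pi(n)\big]^2 P(\pi) \\ & + 2 \sum_{0 \le n < n' < l}\ \sum_{\pi,\pi' = \pm 1} G_\pi(l-n)\, G_{\pi'}(l-n')\, C_{\pi,\pi'}(n'-n) \\ & + 2 \sum_{0 < n < n'}\ \sum_{\pi,\pi' = \pm 1} \big[G_\pi(l+n) - G_\pi(n)\big] \big[G_{\pi'}(l+n') - G_{\pi'}(n')\big]\, C_{\pi,\pi'}(n-n') \\ & + 2 \sum_{0 \le n < l}\ \sum_{n' > 0}\ \sum_{\pi,\pi' = \pm 1} G_\pi(l-n) \big[G_{\pi'}(l+n') - G_{\pi'}(n')\big]\, C_{\pi',\pi}(n'+n). \end{aligned} \tag{A6}$$

Table D4. Two-step Lasso regression table: the impact on bid-ask spread (in bp).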
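The conditional response function (A2) and the response correlation (A3) can be estimated by plain sample averages. The sketch below is one possible reading under stated assumptions: the trade sign $\epsilon_k$ is taken to coincide with the event type $\pi_k \in \{+1, -1\}$, the inputs are equal-length Python lists indexed by trade, and the function names and argument layout are hypothetical rather than taken from the paper.

```python
def conditional_response(returns, signs, pi, lag):
    """Sample estimate of S_pi(lag) = E[R_k * eps_{k-lag+1} | pi_{k-lag+1} = pi].

    `returns[k]` plays the role of R_k and `signs[k]` of eps_k = pi_k;
    `lag` must be at least 1.
    """
    terms = [
        returns[k] * signs[k - lag + 1]
        for k in range(lag - 1, len(returns))
        if signs[k - lag + 1] == pi
    ]
    return sum(terms) / len(terms) if terms else float("nan")

def conditional_correlation(signs, volumes, alpha, pi, pi_prime, n):
    """Sample estimate of C_{pi,pi'}(n) = E[eps_t eps_{t+n} V_{t+n}^alpha | pi_t, pi_{t+n}]."""
    size = len(signs)
    terms = [
        signs[t] * signs[t + n] * volumes[t + n] ** alpha
        # keep both t and t + n inside the sample, for positive or negative n
        for t in range(max(0, -n), min(size, size - n))
        if signs[t] == pi and signs[t + n] == pi_prime
    ]
    return sum(terms) / len(terms) if terms else float("nan")
```

Both functions return `nan` when no observation satisfies the conditioning event, which makes empty cells visible instead of silently treating them as zero.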

Figure A4. Proportion of customer buy orders in dollars (mean 0.48 for Enhanced TRACE and mean 0.52 for Standard TRACE).

Table D5. OLS regression: Comparison between TRACE Enhanced and Standard TRACE.
