Evaluation of surrogacy in the multi-trial setting based on information theory: an extension to ordinal outcomes

ABSTRACT Summary: In clinical trials, surrogate outcomes are early measures of treatment effect that are used to predict treatment effect on a later primary outcome of interest: the primary outcome therefore does not need to be observed and trials can be shortened. Evaluating surrogates is a complex area as a given treatment can act through multiple pathways, some of which may circumvent the surrogate. One of the best established and practically sound approaches to surrogacy evaluation is based on information theory. We have extended this approach to the case of ordinal outcomes, which are used as primary outcomes in many medical areas. This extension provides researchers with the means of evaluating surrogates in this setting, which expands the usefulness of the information theory approach while also demonstrating its versatility.


Introduction
It is legitimate to use a surrogate in place of the true or primary outcome of interest in a clinical trial if it can be established that it informs on the treatment effect on the true outcome. In so doing, a clinical trial can be conducted with a smaller sample size and efficacious treatments can be made available to patients in a more timely fashion. However, the confidence placed in a "legitimate" surrogate can only be as strong as the means of establishing its validity. Baker and Kramer (2003) stated that where treatments work through multiple pathways (as is often the case) surrogacy assessment is difficult. Many different approaches for the evaluation of surrogates have been suggested (Frangakis and Rubin 2004; Molenberghs et al. 2008; Robins and Greenland 1992). For a systematic review of methods see Ensor et al. (2016). These approaches tend to examine whether surrogates are informative at both the individual patient level and the clinical trial level.
The work presented here aims to extend multi-trial information theory-based surrogate evaluation to the case of ordinal outcomes. In so doing we allow researchers to evaluate surrogates in areas where ordinal outcomes are used, for instance in stroke where the Oxford Handicap Scale (Bamford et al. 1989) is often measured.
Various existing surrogate evaluation approaches could be extended to the case of ordinal outcomes, including the direct and indirect effects, principal stratification and information theory approaches (Frangakis and Rubin 2004; Robins and Greenland 1992). Aside from quantitatively evaluating the potential surrogate, we would wish any such approach to have four main properties. An approach should be: (a) practically viable; (b) able to inform on the causal nature of relationships between the surrogate and true outcome; (c) able to identify the surrogate paradox, which occurs when there is a positive treatment effect on the surrogate and a positive relationship between the surrogate and true outcome, but a negative treatment effect on the true outcome; and (d) able to inform on the surrogate's transportability or predictive ability, a fundamental requirement of surrogacy whereby a surrogate evaluated in one trial can inform on the treatment effect on the true outcome in a new trial.
Pragmatic multi-trial approaches, including the meta-analytical (Buyse et al. 2000) and information theory approaches, are well-established methods that fulfil all the above criteria to a good standard. Therefore, we consider the multi-trial approaches to be the most appropriate for extension to ordinal outcomes. These approaches assess surrogacy at two levels: the individual patient and trial levels. In simple terms, correlation is an insufficient measure of surrogacy because it ignores treatment mechanisms of action and can lead to the surrogate paradox. In multi-trial approaches, the individual patient measure of surrogacy is essentially a correlation but treatment allocation is taken into account. At the trial level, multi-trial approaches provide a measure of the predictive ability of the surrogate to determine whether a surrogate could inform on the likely treatment effect on the primary outcome in a new trial, i.e. its transportability. This satisfies one of the primary aims of a valid surrogate. Combined, these measures provide a methodologically sound and practically useful assessment of surrogacy that goes beyond a simple measurement of correlation.
The multi-trial information theory approach has fewer computational and interpretational issues than earlier multi-trial approaches and provides consistent interpretation across settings (for example, ordinal or continuous outcomes). Given these strong methodological and practical advantages, we select the multi-trial information theory approach here for extension to the case of ordinal outcomes.
Most methodology developed for ordinal outcomes has been in early surrogate evaluation measures such as single trial studies (Molenberghs et al. 2001) or the multi-trial (meta-analytical) approach (Burzykowski et al. 2003;Molenberghs et al. 2002;Renard et al. 2002); none have been evaluated via simulation. Under the meta-analytical approach Renard et al. (2002) briefly outline a latent variable approach; Burzykowski et al. (2003) present methodology for an ordinal surrogate and time to event true outcome; and Alonso et al. (2002) investigate the setting where one of the surrogate or true outcome is ordinal and the other continuous. In contrast, our work provides a fully developed methodological extension to the ordinal case in the multi-trial setting, building on the established strengths of the information theory approach to surrogacy evaluation. This has been evaluated by an extensive simulation study incorporating many settings not investigated previously, including weak strengths of surrogacy; discordant strengths of surrogacy at trial and individual levels; ceiling effects for categorical outcomes; as well as an investigation of the impact of non-proportional odds.
In Section 2 we outline the information theory approach and how this can be extended to the case of ordinal outcomes. We cover the "binary-ordinal" setting where the surrogate is binary and the true outcome ordinal; the theory developed could also be applied to the ordinal-ordinal setting with some minor modifications. Section 3 presents a simulation study to evaluate the properties of the ordinal extension. Section 4 illustrates the method using a case study from the stroke clinical trial CLOTS3 (Dennis et al. 2015) and Section 5 discusses our methodology extension in the broader context.

Methods
In what follows, the surrogate is denoted $S$, treatment is $Z$ and the true outcome is $T$. There are $i = 1, 2, \ldots, N$ trials and $j = 1, 2, \ldots, n_i$ patients per trial. $N_T = \sum_i n_i$ is the total number of patients in all trials. The ordinal true outcome has $W$ ordered categories.
2.1. The information theory approach
Alonso and Molenberghs (2007) proposed an information theory surrogate evaluation measure based on the concepts of entropy and information theory by Shannon (1948). Information theory uses the central concept of entropy to measure the "information, choice and uncertainty" in a random variable.
In the discrete case, entropy can be represented as $H(Y) = -\sum_{b=1}^{m_y} p_b \log(p_b)$, where $Y$ is a discrete random variable with values $k_1, k_2, \ldots, k_{m_y}$ and probabilities $p_1, p_2, \ldots, p_{m_y}$, respectively. Conditional entropy, $H(Y|X)$, and joint entropy, $H(Y, X)$, can be straightforwardly defined. Differential entropy measures information in the continuous case, $h_d(Y) = -\int_{-\infty}^{\infty} f_y(y) \log\{f_y(y)\}\,dy$. A concept of fundamental importance is the mutual information. This is defined as $I(X, Y) = H(Y) - H(Y|X)$ and is interpreted as the amount of uncertainty in $Y$ removed if $X$ is known. Another useful concept for comparing random variables is the entropy power, obtained by maximising the entropy of a continuous random variable and defined for a univariate $Y$ as $EP(Y) = \frac{1}{2\pi e} e^{2 h_d(Y)}$. See Shannon (1948) for a full list of the properties of entropy and the mutual information. These concepts are useful in surrogate evaluation as, at the individual level, we are interested in the amount of information on $T$ (or 'treatment effects on $T$' at the trial level) covered by our knowledge of $S$ (or 'treatment effects on $S$' at the trial level).
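The discrete definitions above can be checked numerically. The following is a minimal sketch in Python; the two joint probability tables are hypothetical and chosen only so that both share the same margin for $Y$, one with dependence on $X$ and one without.

```python
import math

def entropy(probs):
    """Shannon entropy H(Y) = -sum_b p_b log(p_b), natural log; zero cells contribute 0."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def mutual_information(joint):
    """I(X, Y) = H(Y) - H(Y|X) for a joint probability table joint[x][y]."""
    p_x = [sum(row) for row in joint]
    p_y = [sum(col) for col in zip(*joint)]
    # H(Y|X) is the p(x)-weighted average of the entropy of Y within each level of X.
    h_y_given_x = sum(
        px * entropy([pxy / px for pxy in row])
        for px, row in zip(p_x, joint)
        if px > 0
    )
    return entropy(p_y) - h_y_given_x

# Hypothetical joint distribution: rows are a binary X, columns a three-level Y.
dependent = [[0.30, 0.15, 0.05],
             [0.05, 0.15, 0.30]]
# Same margin for Y, but X and Y independent, so knowing X removes no uncertainty.
independent = [[0.5 * q for q in (0.35, 0.30, 0.35)] for _ in range(2)]

mi_dependent = mutual_information(dependent)
mi_independent = mutual_information(independent)
print(f"I(X, Y) with dependence:    {mi_dependent:.4f}")
print(f"I(X, Y) under independence: {mi_independent:.4f}")
```

The mutual information is zero under independence and can never exceed $H(Y)$, which is the property that later bounds the individual-level measure for discrete outcomes.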
2.1.1. Individual level: information theory approach
At the individual level, Alonso and Molenberghs (2007) proposed an information theory surrogate evaluation measure:
$$R_h^2 = \frac{EP(T) - EP(T|S)}{EP(T)}, \qquad (1)$$
where $EP(T)$ is the entropy power of $T$ and $EP(T|S)$ is the entropy power of $T$ given $S$. This can be interpreted as the amount of uncertainty in the true outcome $T$ removed when $S$ is known. $R_h^2$ has useful properties: it is linked to the mutual information through $R_h^2 = 1 - e^{-2I(S,T)}$; it is invariant under bijective transformations of $S$ and $T$; and $R_h^2 = 0$ if and only if $T$ and $S$ are independent. To enable transportability of results for the information theory approach, Alonso and Molenberghs (2007) suggested a multi-trial framework for $R_h^2$, as shown in Equation (2):
$$R_h^2 = \sum_{i=1}^{N_q} \vartheta_i R_{hi}^2. \qquad (2)$$
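As a worked check of the measure and its link to the mutual information, consider the illustrative case (an assumption made here, not a setting taken from the source) in which $S$ and $T$ are jointly normal with correlation $\rho$ and $T$ has variance $\sigma_T^2$. Using the standard univariate entropy power $EP(Y) = e^{2h_d(Y)}/(2\pi e)$:

```latex
\begin{align*}
h_d(T) &= \tfrac{1}{2}\log\left(2\pi e\,\sigma_T^2\right)
  &&\Rightarrow\; EP(T) = \sigma_T^2, \\
h_d(T \mid S) &= \tfrac{1}{2}\log\left\{2\pi e\,\sigma_T^2(1-\rho^2)\right\}
  &&\Rightarrow\; EP(T \mid S) = \sigma_T^2(1-\rho^2), \\
R_h^2 &= \frac{EP(T) - EP(T \mid S)}{EP(T)} = \rho^2, \\
I(S,T) &= h_d(T) - h_d(T \mid S) = -\tfrac{1}{2}\log(1-\rho^2)
  &&\Rightarrow\; 1 - e^{-2I(S,T)} = \rho^2 .
\end{align*}
```

In this special case both routes recover the squared correlation, which is why $R_h^2$ can be read as a generalisation of the familiar $R^2$.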
For $N$ trials there are $N_q$ possible values of $R_{hi}^2$, the $R_h^2$ for the $i$th trial, since trials can be clustered depending, say, on $q$ different characteristics (e.g. centre, country, treating physician). There are many different choices for the set of unknown weights, $\vartheta_i$, in (2). These choices lead to an uncountable set of parameters, $\Omega_h$, each of which could act as a single meaningful measure of $R_h^2$ in the multi-trial setting:
$$\Omega_h = \left\{ \Phi_h : \Phi_h = \sum_{i=1}^{N_q} \vartheta_i R_{hi}^2 \right\}, \qquad (3)$$
where $\Phi_h$ are the parameters of the set $\Omega_h$. Alonso and Molenberghs (2007) highlighted the likelihood reduction factor (LRF) as a good candidate from $\Omega_h$ which provides a useful route to defining $\sum_{i=1}^{N_q} \vartheta_i R_{hi}^2$. The LRF is a measure of information gain that has been considered under an information theory framework by several authors (Brillinger 2004; Joe 1989; Kullback 1997; Linfoot 1957).
The LRF is particularly useful for surrogacy evaluation as it ranges in the unit interval and has a consistent interpretation across settings: this is a key point as previous approaches could not provide this. Furthermore, a high-dimensional integral may be needed in the calculation of $I(T, S)$, which the LRF avoids, and as we will expand on in Section 2.1.1.1 the LRF provides consistent estimation of $R_h^2$ (Alonso et al. 2016; Alonso and Molenberghs 2007). Finally, previous approaches to surrogacy assessment relied on computationally intensive joint models of $S$ and $T$, but the LRF assesses just the conditional model of $T|S$ and the marginal model of $T$ and hence avoids this issue.
2.1.1.1. LRF at the individual level: the continuous setting
The LRF was proposed by Alonso et al. (2006) based on the ideas of Kent (1983). At the individual level, the LRF is based on the amount of information gained about the true outcome after accounting for the surrogate, which was proposed as a general measure of correlation. Alonso et al. (2005) proposed modelling (4) and (5) for each trial $i$ (linear models are presented here, whereas generalised linear models were originally given):
$$T_{ij} = \mu_{Ti} + \beta_i Z_{ij} + \varepsilon_{Tij}, \qquad (4)$$
$$T_{ij} = \theta_{0i} + \theta_{1i} Z_{ij} + \theta_{2i} S_{ij} + \tilde{\varepsilon}_{Tij}, \qquad (5)$$
where $\theta_{0i}$ and $\mu_{Ti}$ are intercept parameters with and without adjustment for the surrogate; $\beta_i$ is the treatment effect parameter for the true outcome; $\theta_{1i}$ and $\theta_{2i}$ are treatment and surrogate parameters for the model with adjustment for the surrogate; and $\varepsilon_{Tij}$ and $\tilde{\varepsilon}_{Tij}$ are residual errors. The amount of information on the true outcome gained from the surrogate is calculated via the difference in the log-likelihood between (4) and (5), which is formally expressed as $G_i^2 = 2(LL_1 - LL_0)$ for each trial $i$. $LL_0$ is the log-likelihood for the unsaturated model, in this case (4), and $LL_1$ for the saturated model, (5), for trial $i$.
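The LRF is then calculated:
$$LRF = 1 - \frac{1}{N_q} \sum_{i=1}^{N_q} \exp\left(-\frac{G_i^2}{n_i}\right). \qquad (6)$$
To demonstrate the link between the LRF and $R_h^2$, consider the $i$th trial and the joint density function of $T$ and $S$, $f(T_{ij}, S_{ij} | \theta_i)$, where $\theta_{2i}$ represents the dependence between $S$ and $T$, $\hat{\theta}_i^*$ is the maximum likelihood estimator under the null hypothesis of independence ($\theta_{2i} = 0$), and $\hat{\theta}_i$ is the maximum likelihood estimator for the saturated model. We can express the mutual information as the expected log-ratio of the joint density to the joint density under independence. Using the estimator $\hat{I}(T, S) = G_i^2 / (2 n_i)$ (Brillinger 2004) in $R_h^2 = 1 - e^{-2I(S,T)}$ gives $1 - \exp(-G_i^2 / n_i)$ for trial $i$. For a full proof see the supplementary material of Alonso and Molenberghs (2007).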
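The calculation for the continuous setting can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes the LRF takes the averaged form $1 - \frac{1}{N}\sum_i \exp(-G_i^2/n_i)$ with $G_i^2 = 2(LL_1 - LL_0)$, fits the two linear models by least squares in plain Python, and uses hypothetical simulated data (the coefficients 0.5, 0.3 and `theta2` are invented for the example).

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def residual_ss(columns, y):
    """Residual sum of squares of y after least-squares projection onto the columns."""
    basis = []
    for col in columns:
        v = list(col)
        for q in basis:  # Gram-Schmidt: remove components along earlier columns
            c = dot(v, q) / dot(q, q)
            v = [vi - c * qi for vi, qi in zip(v, q)]
        basis.append(v)
    r = list(y)
    for q in basis:
        c = dot(r, q) / dot(q, q)
        r = [ri - c * qi for ri, qi in zip(r, q)]
    return dot(r, r)

def gaussian_loglik(columns, y):
    """Maximised Gaussian log-likelihood of a linear model with the given design columns."""
    n = len(y)
    sigma2 = residual_ss(columns, y) / n  # ML estimate of the residual variance
    return -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)

def lrf_individual(trials):
    """trials: list of (z, s, t) per trial. Returns 1 - mean_i exp(-G2_i / n_i)."""
    terms = []
    for z, s, t in trials:
        ones = [1.0] * len(t)
        ll0 = gaussian_loglik([ones, z], t)     # T on treatment only
        ll1 = gaussian_loglik([ones, z, s], t)  # T on treatment and surrogate
        g2 = 2 * (ll1 - ll0)
        terms.append(math.exp(-g2 / len(t)))
    return 1 - sum(terms) / len(terms)

def simulate(n_trials, n, theta2, rng):
    """Hypothetical data: S = 0.5 Z + e_S and T = 0.3 Z + theta2 * S + e_T."""
    trials = []
    for _ in range(n_trials):
        z = [float(rng.randint(0, 1)) for _ in range(n)]
        s = [0.5 * zi + rng.gauss(0, 1) for zi in z]
        t = [0.3 * zi + theta2 * si + rng.gauss(0, 1) for zi, si in zip(z, s)]
        trials.append((z, s, t))
    return trials

rng = random.Random(2020)
lrf_strong = lrf_individual(simulate(5, 200, 0.8, rng))
lrf_none = lrf_individual(simulate(5, 200, 0.0, rng))
print(f"LRF with theta2 = 0.8: {lrf_strong:.3f}")
print(f"LRF with theta2 = 0.0: {lrf_none:.3f}")
```

For Gaussian linear models $\exp(-G_i^2/n_i)$ reduces to the ratio of residual sums of squares, so each per-trial term is the familiar partial $R^2$ of the surrogate given treatment.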
2.1.1.2. LRF at the individual level: extension to the binary-ordinal setting
The LRF can be used to calculate $R_h^2$ for a binary surrogate and ordinal true outcome. At the individual level, the LRF can be applied in the binary-ordinal setting in the same manner as in the continuous case using (6), based in this case on the difference $G^2 = 2(LL_1 - LL_0)$ of the following proportional odds models:
$$\text{logit}\{P(T_{ij} \le w)\} = \mu_{Twi} + \beta_i Z_{ij}, \qquad (7)$$
$$\text{logit}\{P(T_{ij} \le w)\} = \theta_{0wi} + \theta_{1i} Z_{ij} + \theta_{2i} S_{ij}, \qquad (8)$$
where $w = 1, \ldots, W - 1$, and $W$ is the number of categories in the ordinal true outcome. For trial $i$, $\mu_{Twi}$ and $\theta_{0wi}$ are intercept parameters for each cut point of the ordinal true outcome, $\beta_i$ and $\theta_{1i}$ represent the treatment effect on the true outcome without and with adjustment for the surrogate, and $\theta_{2i}$ is the surrogate parameter. Again, the LRF is based on the amount of information gained on the true outcome after adjusting for the surrogate for each trial.
However, in the case of discrete outcomes and a family of conditional models, the LRF is bounded above by a number strictly less than one (Kent 1983). Alonso and Molenberghs (2007) showed that $R_h^2 \le 1 - e^{-2H(T)}$, where $H(T)$ represents the entropy of $T$. They also suggested that $H(T)$ can be approximated based on the log-likelihood of the intercept-only model of the true outcome ($\text{logit}\{P(T_{ij} \le w)\} = \theta_3$, where $\theta_3$ is the intercept parameter). Alonso and Molenberghs (2007) therefore proposed rescaling $R_h^2$ as calculated by the LRF in (6) by:
$$\hat{R}_h^2 = \frac{R_h^2}{1 - e^{-2H(T)}}. \qquad (9)$$
The LRF thus gives a consistent interpretation at the individual level for both the binary-ordinal and continuous settings.
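The rescaling step can be illustrated with a short sketch. It assumes that the LRF-based value is divided by the upper bound $1 - e^{-2H(T)}$, and that $H(T)$ is estimated by the empirical entropy of the observed categories (for a cut-point-only cumulative logit model with all categories observed, this equals minus the maximised log-likelihood per patient). The function names and the input value 0.30 are hypothetical.

```python
import math
from collections import Counter

def entropy_upper_bound(t_values):
    """Upper bound 1 - exp(-2 H(T)), with H(T) the empirical entropy of the outcome."""
    n = len(t_values)
    h = -sum((c / n) * math.log(c / n) for c in Counter(t_values).values())
    return 1 - math.exp(-2 * h)

def rescale_r2h(r2h, t_values):
    """Divide the LRF-based R2_h by its upper bound so that it can reach one."""
    return r2h / entropy_upper_bound(t_values)

# Hypothetical seven-category outcome with equally likely categories: H(T) = log 7.
t_uniform = [w for w in range(1, 8) for _ in range(10)]
bound = entropy_upper_bound(t_uniform)   # equals 1 - 1/49
rescaled = rescale_r2h(0.30, t_uniform)  # equals 0.30 * 49 / 48
print(f"upper bound: {bound:.4f}, rescaled R2_h: {rescaled:.4f}")
```

With seven well-populated categories the bound is close to one, so the rescaling changes little; it matters more when the outcome is concentrated in a few categories and $H(T)$ is small.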

2.1.2. Trial level: information theory approach
At the trial level, interest is in the relationship between treatment effects on the surrogate and treatment effects on the true outcome. Alonso and Molenberghs (2007) proposed a two-stage approach. At the first stage, the treatment effects for each trial on the surrogate and true outcome are obtained, $\alpha_i$ and $\beta_i$ respectively. This is done by regressing the surrogate and true outcome on treatment in separate models:
$$S_{ij} = \mu_{Si} + \alpha_i Z_{ij} + \varepsilon_{Sij}, \qquad (10)$$
$$T_{ij} = \mu_{Ti} + \beta_i Z_{ij} + \varepsilon_{Tij}, \qquad (11)$$
where $\mu_{Si}, \mu_{Ti}$ represent the intercepts and $\alpha_i, \beta_i$ the treatment effects for $S$ and $T$, respectively. Using the treatment effect estimates for $S$ and $T$ from these models, $\hat{\alpha}_i$ and $\hat{\beta}_i$ respectively, we calculate the information theory surrogacy measure $R_{ht}^2$, where the subscript $t$ indicates that we are now considering trial-level surrogacy, through:
$$R_{ht}^2 = \frac{EP(\beta) - EP(\beta|\alpha)}{EP(\beta)}, \qquad (12)$$
where $EP(\beta)$ is the entropy power of the distribution of treatment effect estimates on $T$ across the $N$ trials and $EP(\beta|\alpha)$ is the entropy power of the distribution of treatment effect estimates on $T$ given those on $S$. $R_{ht}^2$ can be interpreted as the amount of uncertainty in the treatment effect on $T$ removed through knowledge of the treatment effect on $S$.
2.1.2.1. The LRF at the trial level: the continuous setting
The LRF can be applied to calculate $R_{ht}^2$ in the continuous-continuous case. In order to do this, (10) and (11) are again modelled to obtain the estimates $\hat{\alpha}_i$, $\hat{\beta}_i$ and $\hat{\mu}_{Si}$. At the second stage, two further models of the treatment effect on the true outcome are required:
$$\hat{\beta}_i = \gamma_0 + \varepsilon_i, \qquad (13)$$
$$\hat{\beta}_i = \gamma_3 + \gamma_1 \hat{\mu}_{Si} + \gamma_2 \hat{\alpha}_i + \tilde{\varepsilon}_i, \qquad (14)$$
where $\gamma_3$ and $\gamma_0$ are the intercept parameters with and without adjustment for the surrogate treatment effects and $\gamma_1$ and $\gamma_2$ are the parameters for the surrogate intercept and treatment effect estimates provided from stage one. The difference in log-likelihood between these two models, $G^2$, can then be calculated and the LRF applied as in (15):
$$R_{ht}^2 = 1 - \exp\left(-\frac{G^2}{N}\right). \qquad (15)$$
In a similar fashion to the LRF at the individual level, it can be shown that the LRF is a consistent estimator of $R_{ht}^2$ (Alonso et al. 2016).
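The second stage can be sketched as follows, assuming stage one has already produced one estimate of the surrogate intercept, the surrogate treatment effect and the true-outcome treatment effect per trial. The sketch assumes the two stage-two models are an intercept-only regression of $\hat{\beta}_i$ and a regression of $\hat{\beta}_i$ on $\hat{\mu}_{Si}$ and $\hat{\alpha}_i$, and that the trial-level LRF is $1 - \exp(-G^2/N)$; the stage-one estimates below are simulated, hypothetical values.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def residual_ss(columns, y):
    """Residual sum of squares of y after least-squares projection onto the columns."""
    basis = []
    for col in columns:
        v = list(col)
        for q in basis:  # Gram-Schmidt orthogonalisation
            c = dot(v, q) / dot(q, q)
            v = [vi - c * qi for vi, qi in zip(v, q)]
        basis.append(v)
    r = list(y)
    for q in basis:
        c = dot(r, q) / dot(q, q)
        r = [ri - c * qi for ri, qi in zip(r, q)]
    return dot(r, r)

def r2_trial(mu_s_hat, alpha_hat, beta_hat):
    """Trial-level LRF from stage-one estimates, one value per trial."""
    n_trials = len(beta_hat)
    ones = [1.0] * n_trials
    rss0 = residual_ss([ones], beta_hat)                       # intercept only
    rss1 = residual_ss([ones, mu_s_hat, alpha_hat], beta_hat)  # adjusted model
    # For Gaussian models, G2 = 2 (LL1 - LL0) = N log(RSS0 / RSS1).
    g2 = n_trials * math.log(rss0 / rss1)
    return 1 - math.exp(-g2 / n_trials)

# Hypothetical stage-one estimates for 20 trials with a strong trial-level relationship.
rng = random.Random(11)
mu_s_hat = [0.50 + rng.gauss(0, 0.1) for _ in range(20)]
alpha_hat = [0.05 + rng.gauss(0, 0.2) for _ in range(20)]
beta_hat = [0.03 + 0.9 * (a - 0.05) + rng.gauss(0, 0.05) for a in alpha_hat]

r2_ht = r2_trial(mu_s_hat, alpha_hat, beta_hat)
print(f"trial-level R2_ht: {r2_ht:.3f}")
```

Because the second stage has only one data point per trial, the estimate rests on very few observations when the number of trials is small.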
2.1.2.2. The LRF at the trial level: extension to the binary-ordinal setting
In the binary-ordinal setting, the key difference in the approach is in the models used at the first stage. Here a generalised linear model and a proportional odds model are required for the surrogate and true outcome, respectively:
$$\text{logit}\{P(S_{ij} = 1)\} = \mu_{Si} + \alpha_i Z_{ij}, \qquad (16)$$
$$\text{logit}\{P(T_{ij} \le w)\} = \mu_{Twi} + \beta_i Z_{ij}, \qquad (17)$$
where $w = 1, \ldots, W - 1$, and $W$ is the number of categories in the ordinal true outcome, $\mu_{Twi}$ is the set of intercept parameters for each of the $W - 1$ cut points of the ordinal true outcome and all other parameters are analogous to the continuous case. The second stage models (13) and (14) can be fitted in the same manner as in the continuous setting using the parameters of (16) and (17), and the LRF applied as in (15). The LRF has a consistent interpretation at the trial level for the continuous-continuous and binary-ordinal settings, and it can easily be seen how this would be the case for other settings.
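To make the proportional odds structure concrete, the sketch below converts a set of cut points and a treatment effect into category probabilities, assuming the parameterisation logit $P(T \le w) = \mu_w + \beta Z$. The cut points and the value of `beta` are hypothetical; the sign convention is also an assumption, since software packages differ on it.

```python
import math

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def category_probs(cutpoints, eta):
    """P(T = w) for w = 1..W under logit P(T <= w) = cutpoints[w-1] + eta."""
    cumulative = [expit(c + eta) for c in cutpoints] + [1.0]
    return [cumulative[0]] + [
        cumulative[w] - cumulative[w - 1] for w in range(1, len(cumulative))
    ]

def cumulative_odds(probs, w):
    """Odds of T <= w given the category probabilities."""
    p = sum(probs[:w])
    return p / (1 - p)

cutpoints = [-2.0, -1.0, -0.3, 0.4, 1.2, 2.2]  # hypothetical, six cut points so W = 7
beta = 0.4                                     # hypothetical treatment log odds ratio

control = category_probs(cutpoints, 0.0)
treated = category_probs(cutpoints, beta)
odds_ratios = [
    cumulative_odds(treated, w) / cumulative_odds(control, w) for w in range(1, 7)
]
print([round(p, 3) for p in treated])
print([round(r, 3) for r in odds_ratios])
```

The cumulative odds ratio is the same at every cut point, which is the proportional odds assumption; a single $\beta_i$ per trial is therefore enough to carry the treatment effect into the second stage.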

Confidence intervals: all settings
A confidence interval for $R_{ht}^2$ based on the non-central $\chi^2$ distribution may be calculated as per Kent (1983), with limits determined by the non-centrality parameters $\gamma_{1:\alpha}$ and $\delta_{1:\alpha}$, where $\chi_1^2(\cdot)$ represents the non-central chi-squared distribution with 1 degree of freedom. The above holds for $R_{ht}^2$, which is based on a single $G^2$; $R_h^2$, on the other hand, has multiple $G_i^2$. Previous publications have computed non-parametric bootstrap confidence intervals in this setting and we follow that methodology (Alonso et al. 2006).
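A percentile bootstrap for the individual-level measure can be sketched as below. Two assumptions are made for illustration: whole trials are the resampling unit, and the per-trial $G_i^2$ values are reused rather than re-estimated in each resample (in practice the models would be re-fitted). The function name and the per-trial values are hypothetical.

```python
import math
import random

def lrf(trials):
    """LRF = 1 - mean_i exp(-G2_i / n_i) from (G2_i, n_i) pairs."""
    return 1 - sum(math.exp(-g2 / n) for g2, n in trials) / len(trials)

def percentile_bootstrap_ci(trials, estimator, n_boot=1000, level=0.95, seed=1):
    """Percentile interval from resampling trials with replacement."""
    rng = random.Random(seed)
    stats = sorted(
        estimator([rng.choice(trials) for _ in trials]) for _ in range(n_boot)
    )
    lower_idx = int((1 - level) / 2 * n_boot)
    upper_idx = int((1 + level) / 2 * n_boot) - 1
    return stats[lower_idx], stats[upper_idx]

# Hypothetical per-trial likelihood ratio statistics and trial sizes.
trials = [(35.2, 100), (28.9, 100), (41.0, 100), (22.4, 60),
          (19.8, 60), (110.5, 300), (97.3, 300), (30.1, 100)]

point = lrf(trials)
lower, upper = percentile_bootstrap_ci(trials, lrf)
print(f"R2_h = {point:.3f}, 95% bootstrap CI ({lower:.3f}, {upper:.3f})")
```

Resampling patients within trials is an alternative design; which unit is appropriate depends on whether between-trial or within-trial variability should drive the interval.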

Set-up
The practical worth of the approach is demonstrated via a thorough simulation study using R. Different scenarios were simulated to see how the $R^2$ measures perform when different numbers of trials and sizes of trial are available. We reported the median point estimate and median upper and lower confidence limits over 250 simulations for each scenario investigated. To set up the simulation we use the methodology of the precursor to the information theory approach, the meta-analytical approach, as conducted by many previous authors (e.g. Tilahun et al. 2008). The following normal joint mixed model gives the basis for the data generation:
$$S_{ij} = \mu_S + m_{Si} + (\alpha + a_i) Z_{ij} + \varepsilon_{Sij},$$
$$T_{ij} = \mu_T + m_{Ti} + (\beta + b_i) Z_{ij} + \varepsilon_{Tij},$$
where $(\mu_S, \mu_T)$ and $(\alpha, \beta)$ are fixed intercepts and treatment effects, respectively, and $(m_{Si}, m_{Ti})$ and $(a_i, b_i)$ are random intercepts and treatment effects for the $i$th trial, respectively. The residual errors follow $(\varepsilon_{Sij}, \varepsilon_{Tij})^T \sim N(0, \Sigma)$ and the random effects follow $(m_{Si}, m_{Ti}, a_i, b_i)^T \sim N(0, D)$. Specific values of $D$ and $\Sigma$ were chosen in line with Tilahun et al. (2008), as were the fixed intercept and treatment parameters for $S$ and $T$, which were set to $\mu_S = 0.50$, $\mu_T = 0.45$, $\alpha = 0.05$, and $\beta = 0.03$. Their values do not influence the true strength of surrogacy.
Four surrogacy scenarios were simulated: strong, with $R_{ht}^2 = \rho^2 = 0.90$ and $R_h^2 = \psi^2 = 0.64$; weak, with $R_{ht}^2 = \rho^2 = 0.30$ and $R_h^2 = \psi^2 = 0.30$; and two with discordant levels of surrogacy at trial and individual level, either $R_{ht}^2 = \rho^2 = 0.90$ and $R_h^2 = \psi^2 = 0.30$, or $R_{ht}^2 = \rho^2 = 0.30$ and $R_h^2 = \psi^2 = 0.64$. After simulating a continuous $S$ and $T$, these were dichotomised or categorised to represent a binary $S$ and ordinal $T$. $T$ was set to have seven categories and its distribution was simulated to follow what might be observed in the Oxford Handicap Scale (Van Swieten et al. 1988) investigated in the stroke case study (Section 4). We also investigate the setting where the ordinal outcome does not fulfil the proportional odds assumption, by changing, for one treatment arm, one of the quantiles at which the continuous $T$ is cut to generate the ordinal categorical $T$. Trial sizes were set to 60, 100, and 300 patients. There were 5, 10, 20 or 30 trials in each simulated data set. There were 250 data sets simulated for each scenario, a total of 15,000 simulations: 12,000 covering all 48 combinations of the strength of surrogacy (4), trial size (3) and number of trials (4), and a further 3,000 for the non-proportional odds setting with strong surrogacy across all 12 combinations of trial size and number of trials.
At the individual level in the discrete binary-ordinal case, information theory explores surrogacy at the observed rather than latent scale, and therefore the strength of surrogacy is expected to be lower than on the latent continuous level. This reflects reality, since for example binary measures often represent latent continuous variables and a binary surrogate would be expected to provide less information than a continuous one. Therefore, we expect the maximum surrogacy strength achievable in the observed binary-ordinal setting to be much lower than the 'true' strength of surrogacy set at the latent level. We investigated the individual-level surrogacy ceiling for a binary surrogate with an ordinal true outcome by further investigating the ideal scenario where $R_{ht}^2 = \rho^2 = 0.90$ and $R_h^2 = \psi^2 = 1$. In this case, 250 data sets were simulated for each scenario: a total of 3,000 simulations covering all combinations of the trial size (3) and number of trials (4) scenarios.
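The step from latent continuous outcomes to a binary surrogate and seven-category true outcome can be sketched for a single trial as follows. This is a simplified illustration, not the simulation code used in the study: the trial-specific effects are passed in directly rather than drawn from $D$, the residuals have unit variance with correlation $\psi$, and the dichotomisation threshold and cumulative category proportions are hypothetical.

```python
import random
from statistics import NormalDist

def simulate_trial(n, psi, mu_s, mu_t, alpha_i, beta_i, cum_props, rng):
    """Simulate latent (S, T), then return (Z, binary S, ordinal T in 1..W) per patient."""
    # Latent-scale cut points for T from standard normal quantiles (an assumption).
    cuts = [mu_t + NormalDist().inv_cdf(p) for p in cum_props]
    data = []
    for j in range(n):
        z = j % 2  # alternate allocation to give a balanced trial
        e_s = rng.gauss(0, 1)
        # Residuals with correlation psi, so the latent individual-level R2 is psi**2.
        e_t = psi * e_s + (1 - psi ** 2) ** 0.5 * rng.gauss(0, 1)
        s_latent = mu_s + alpha_i * z + e_s
        t_latent = mu_t + beta_i * z + e_t
        s_bin = int(s_latent > mu_s)                 # dichotomise the surrogate
        t_ord = 1 + sum(t_latent > c for c in cuts)  # categorise the true outcome
        data.append((z, s_bin, t_ord))
    return data

# Hypothetical cumulative proportions for a seven-category outcome.
cum_props = [0.10, 0.25, 0.40, 0.55, 0.70, 0.85]
rng = random.Random(7)
data = simulate_trial(300, 0.8, 0.50, 0.45, 0.05, 0.03, cum_props, rng)

mean_t_s1 = sum(t for _, s, t in data if s == 1) / sum(1 for _, s, _ in data if s == 1)
mean_t_s0 = sum(t for _, s, t in data if s == 0) / sum(1 for _, s, _ in data if s == 0)
print(f"mean T category when S = 1: {mean_t_s1:.2f}, when S = 0: {mean_t_s0:.2f}")
```

A non-proportional odds variant could be produced in the same framework by shifting one entry of `cuts` for one treatment arm only, as described in the text.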

Results
For strong surrogacy $\hat{R}_h^2$ converges to around 0.30 (Table 1) for larger numbers of trials and trial sizes; this is much lower than the 0.64 strength simulated on the latent continuous scale. Equally, for weak surrogacy $\hat{R}_h^2$ converges to around 0.13 (Table 2), which is again much lower than the strength of 0.30 simulated on the latent scale. Simulations for the 'perfect' surrogate with $R_h^2 = 1$ converge to around $\hat{R}_h^2 = 0.48$, the ceiling for this binary surrogate for an ordinal true outcome generated from a latent continuous measure (see Table 3).
Unlike individual-level surrogacy, trial-level surrogacy, $R_{ht}^2$, ought to report the same surrogacy strength at the latent and explicit scales. However, there appears to be some underestimation of $\hat{R}_{ht}^2$ for strong surrogacy even where trial sizes are large (Table 1). This is in line with results in the continuous-binary and binary-binary settings (Tilahun et al. 2008). Conversely, where surrogacy is set to be weak (Table 2) there is overestimation of $R_{ht}^2$ for small numbers of trials. Further examination showed this was due to overfitting to the resultant small number of data points (one for each trial) in the regression model used at the second stage of $R_{ht}^2$ modelling.
$\hat{R}_{ht}^2$ and $\hat{R}_h^2$ estimates where surrogacy strengths differ at trial and individual levels are similar to those where surrogacy strengths are consistent (see Table 4). Deviation from the proportional odds assumption also seems to have little impact on results at either level (see Table 5).
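The overfitting mechanism can be illustrated with a small experiment, separate from the simulation study reported here. It assumes a Gaussian second-stage regression with an intercept and two predictors, and generates stage-one estimates that are completely unrelated, so the true trial-level surrogacy is zero; all names and settings are hypothetical.

```python
import random
from statistics import median

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def residual_ss(columns, y):
    """Residual sum of squares of y after least-squares projection onto the columns."""
    basis = []
    for col in columns:
        v = list(col)
        for q in basis:  # Gram-Schmidt orthogonalisation
            c = dot(v, q) / dot(q, q)
            v = [vi - c * qi for vi, qi in zip(v, q)]
        basis.append(v)
    r = list(y)
    for q in basis:
        c = dot(r, q) / dot(q, q)
        r = [ri - c * qi for ri, qi in zip(r, q)]
    return dot(r, r)

def r2_second_stage(mu_s_hat, alpha_hat, beta_hat):
    """1 - exp(-G2 / N), which for Gaussian models equals 1 - RSS1 / RSS0."""
    ones = [1.0] * len(beta_hat)
    rss0 = residual_ss([ones], beta_hat)
    rss1 = residual_ss([ones, mu_s_hat, alpha_hat], beta_hat)
    return 1 - rss1 / rss0

def median_null_r2(n_trials, n_sims, rng):
    """Median estimate when the treatment effects on S and T are unrelated."""
    values = []
    for _ in range(n_sims):
        mu_s_hat = [rng.gauss(0, 1) for _ in range(n_trials)]
        alpha_hat = [rng.gauss(0, 1) for _ in range(n_trials)]
        beta_hat = [rng.gauss(0, 1) for _ in range(n_trials)]
        values.append(r2_second_stage(mu_s_hat, alpha_hat, beta_hat))
    return median(values)

rng = random.Random(3)
median_5 = median_null_r2(5, 500, rng)
median_30 = median_null_r2(30, 500, rng)
print(f"median estimate with 5 trials:  {median_5:.2f}")
print(f"median estimate with 30 trials: {median_30:.2f}")
```

With three regression parameters and only five data points, the fitted model absorbs much of the noise, so a sizeable estimate appears even when there is no relationship at all; the effect fades as the number of trials grows.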

Case study: CLOTS3
The case study, conducted using data from the randomised Clots in Legs Or sTockings after Stroke (CLOTS) 3 trial (Dennis et al. 2015), aimed to determine whether measures taken within 30 days of a stroke could be used as a surrogate in place of death and disability measured 6 months post stroke.
Venous thromboembolism encompasses two conditions: deep vein thrombosis (DVT), a blood clot in the deep veins of the legs; and pulmonary embolism (PE), where clots detach from the veins and cause blockages to the lungs. Venous thromboembolism can be serious enough to cause death or be so debilitating it hinders rehabilitation. Dennis et al. (2013) showed that 20-42% of stroke patients suffer a venous thromboembolism. This result reflects the fact that stroke patients are typically bedbound and often unable to move one side of their body. A primary measure of ongoing health and survival measured in patients 6 months post stroke is the Oxford Handicap Scale (OHS) (Van Swieten et al. 1988). This is an ordinal measure on a seven-point scale, ranging from no symptoms up to severe disability and death.
CLOTS3 was a 94 centre randomised clinical trial with 2,876 patients. It was conducted to investigate whether intermittent pneumatic compression (IPC) applied to the legs of acute stroke patients reduced the occurrence of DVT (Dennis et al. 2015). CLOTS3 showed that IPC reduced the odds of DVT by 30 days [OR 0.65 (95% CI 0.51-0.84; p = .001) after adjustment for baseline variables] and had a positive impact on survival at 6 months, HR 0.86 (0.74-0.99), p = .042.
Table 3. Simulation study results: Ceiling effect. True values on the latent continuous scale used to generate data are trial-level surrogacy $R_{ht}^2 = 0.90$, and individual-level surrogacy $R_h^2 = 1$ (at the individual level we expect strength of surrogacy in the binary-ordinal setting to be low due to loss of information from moving from continuous to categorical outcomes). 250 simulations were performed for each of the scenarios reported in the table. We present the number and size of trials simulated; the median $R^2$ of the 250 simulations; median lower and upper limits of the 95% confidence intervals for the 250 simulations.
We used this data set to assess whether the occurrence of DVT, PE or death within 30 days is a surrogate for OHS at 6 months. The information theory approach was applied to investigate this and we used study centres in place of trials (Abrahantes et al. 2004). The results shown in Table 6 and Figure 1 indicate that DVT is not a good surrogate for OHS, as $\hat{R}_h^2$ is 0.173 (95% CI 0.141, 0.188) and $\hat{R}_{ht}^2$ is 0.186 (95% CI 0.048, 0.374). While there is no established cut-off corresponding to a 'valid' surrogate, previous publications have suggested that surrogates that exceed 0.80 at both levels can be deemed valid, while surrogacy strength is poor if it is below 0.50 at either level (Alonso et al. 2016). Therefore, these results suggest a poor surrogate.
A sensitivity analysis was conducted to assess whether the potential bias witnessed in the simulation study (where underestimation increased with increased number of trials) may have influenced our results. We regrouped centres so there were fewer groups (results not shown), and we found that while some level of underestimation had taken place the point estimates were still comfortably under 0.50 and therefore our conclusions did not change. Further information on the case study is provided in Appendix A.

Discussion
The information theory approach of Alonso and Molenberghs (2007) has previously been extended to failure time outcomes (Pryseley et al. 2011); repeated measures (Alonso et al. 2006); a continuous surrogate and binary true outcome; and binary outcomes. This paper complements these by extending the methodology to the case of a binary surrogate and an ordinal true outcome.
A major strength of the work presented is the wide range of scenarios considered in the simulation study that evaluated the performance of the extended methodology measures of individual-level surrogacy, $R_h^2$, and trial-level surrogacy, $R_{ht}^2$, in the binary-ordinal context. Extending previous simulation studies in this area, we assessed weak strengths of surrogacy and discordant levels of surrogacy at trial and individual levels, and investigated the ceiling effect present when using binary and ordinal outcomes. We also completed the first assessment via simulation of the non-proportional odds scenario for ordinal outcomes. A further benefit was the opportunity to provide a clear answer to a question of clinical interest regarding deep vein thrombosis (DVT) as a potential surrogate for long-term outcome following stroke, the Oxford Handicap Scale (OHS), using data from the CLOTS3 (Dennis et al. 2015) randomised controlled trial.
Table 6. CLOTS3 case study results: Information theory surrogacy estimates for binary DVT surrogate and ordinal OHS true outcome; analysed using a modified information theory approach incorporating a penalized likelihood approach (Firth 1993).
As might have been expected, the simulation study showed that a binary surrogate is less informative than its latent counterpart at the individual level; the ceiling for the binary-ordinal setting is around half the strength of that simulated on the underlying continuum.
Some unexpected underestimation of $R_{ht}^2$ was observed; we speculate that this is due to inefficiencies in estimation through a combination of the use of a two-stage estimation approach and the involvement of discrete outcomes. Furthermore, overestimation of $R_{ht}^2$ occurred for weak surrogacy and small numbers of trials, due to overfitting at the second stage of modelling. Assessments of surrogates of this kind might lead researchers to believe incorrectly that they are valid. This is likely to be an issue regardless of the setting (binary-continuous, continuous-continuous etc.) and has not previously been identified. These two sources of bias at the trial level, overfitting and inefficiency, point to some practical issues with the two-stage modelling approach and require further investigation. Deviations from the proportional odds assumption or discordant surrogacy strength at trial and individual levels had little impact on $\hat{R}_{ht}^2$ or $\hat{R}_h^2$ results, with positive implications for the robustness of this surrogacy assessment approach.
In future work, it would be interesting to study the underestimation found in this work in more detail, perhaps in the context of contrasting settings, e.g. time-to-event or repeated measures. Nevertheless, if inefficiency is the root cause of underestimation the discrete case is likely to be the most severely affected. In the discrete outcome setting, estimation in the presence of separation (perfect agreement between two discrete outcomes) is an issue that might have a large impact on results. Our simulations did not consider small trial sizes where separation is likely to be a substantial issue; however, this important topic should be considered in more detail alongside potential solutions. Equally, it would be worth establishing if the overfitting witnessed in the case of weak surrogacy is systemic to all settings of the information theory approach.
Figure 1. CLOTS3 case study results: Graphical display of information theory surrogacy estimates for binary DVT surrogate and ordinal OHS true outcome; study centre size categorised by the terciles of centre size. The regression line represents the regression of the treatment effects on the true outcome on those for the surrogate. Analysed using a modified information theory approach incorporating a penalized likelihood approach (Firth 1993) to deal with the issue of sparse data.
Overall, results from the simulation show that the information theory approach works well in general in the binary-ordinal context, although some issues concerning the two-stage nature of the modelling approach for $R_{ht}^2$ have been identified. Methodologically we have seen that across settings the information theory approach is readily applied and provides a consistent interpretation. The methodological extensions reported here will enable researchers working in clinical areas where ordinal outcomes are important to investigate surrogacy. This work provides further confirmation that information theory is a practical and methodologically sound approach to surrogacy evaluation.