knitr::opts_chunk$set(echo = TRUE)
library(knitr)
library(kableExtra)
library(shiny)
library(shinyWidgets)
library(shinydashboard)
library(formattable)
library(ggthemes)
library(tidyverse)
library(fGarch)
library(scales)
library(DT)
library(plotly)

library(ggplot2) 
theme_set(theme_minimal(base_size=12))
theme_update(panel.background = element_rect(fill = "transparent", colour = NA),
             plot.background = element_rect(fill = "transparent", colour = NA))
opts_chunk$set(dev.args=list(bg="transparent"))

# ----------------------------------------------------------------------- #
#   Create Artificial Distributions
# ----------------------------------------------------------------------- #
set.seed(667)     # Setting seed to reproduce normalized values

l_skew  <- rsnorm(10000, mean = 100, sd = 800, xi = -9)       # Data for left skewed INB
no_skew <- rsnorm(10000, mean = 1000, sd = 800)               # Data for normally distributed INB
r_skew  <- rsnorm(10000, mean = 100, sd = 800, xi = 9)        # Data for right skewed INB


all_skew <- cbind(l_skew, no_skew, r_skew) %>% as_tibble()   # create tibble from all three vectors
names(all_skew) <- c("Left", "Normal", "Right")              # name the tibbles

# ----------------------------------------------------------------------- #
#   Calculating random cost and utility matrices to match distribution
# ----------------------------------------------------------------------- #
sim_vec <- paste("Sample: ", 1:10000)                   # creating row identifier (for table purposes)
drug_A_nmb <- rnorm(10000, mean = 2555322, sd = 300000) # create arbitrary drug_A_nmb for given WTP
drug_B_nmb <- all_skew + drug_A_nmb                     # matrix of all skews of drug_B_nmb for given WTP


nmb_list <- list()
i <- 1
for(i in 1:ncol(drug_B_nmb)){
  df <- cbind.data.frame(sim_vec, drug_A_nmb, drug_B_nmb[,i])
  names(df) <- c("Sample", "Drug A NMB", "Drug B NMB")
  
  nmb_list[[i]] <- df
}

Reasons for the post

In the Fall of 2019, while studying at The University of Washington’s CHOICE institute, I had the privilege of attending my first decision modeling workshop led by DARTH (Decision Analysis in R for Technologies in Health). Quick plug: if you ever have the chance to attend one of their workshops, I would highly recommend it. They’re a group of researchers who have banded together to form a collaborative group dedicated to open-source, transparent solutions to decision analysis in health.

On the third day of the workshop, they presented an idea known as the “expected loss” curve (ELC) as an addition to the cost-effectiveness acceptability curve (CEAC) and the cost-effectiveness acceptability frontier (CEAF). After reading their paper on the topic (see here), I was really struggling to understand why this wasn’t recommended by all the major CEA guidelines? In fact, almost all of them recommend a CEAC/CEAF but that only tells some of the story. So why not both?

My own conclusion is that the impetus behind the ELC is not always explicitly clear and, because I am a visual person, I thought visuals may help. I’m first going to start with some simple definitions, then provide a working example of the math, and finally wrap up with an interactive section to let you play around with the different cases and demonstrate with interactive visuals why the ELC is necessary to be a fully informed decision maker.

So, what is an Expected Loss Curve?

ELC vs. CEAC/CEAF

Cost-Effectiveness Acceptability Curves represent the proportion of samples, derived from a probabilistic sensitivity analysis (PSA), that have the highest net monetary benefit (NMB).

Cost-Effectiveness Acceptability Frontiers indicate which strategy on the CEAC is optimal, or has the greatest net monetary benefit (NMB).

These two will work in tandem to tell the decision maker:

  • What strategy is most likely to be cost-effective (Cost-Effectiveness Acceptability);
  • and what strategy at a given WTP is optimal regarding NMB.

But neither of these pieces of information explicitly describe to decision makers how much they may stand to lose from a given choice. Further, it is often confusing to readers why a choice may be labeled “optimal” when it is clearly less likely to be cost-effective.

Expected Loss Curves represent the expected financial loss at a given WTP when a sample from the same PSA does not have the highest net monetary benefit. This curve takes into account the likelihood of being cost-effective and the economic cost when a strategy is sub-optimal.

The ELC informs decision makers explicitly about the risks they are incurring when choosing a particular strategy. A graphical representation of the ELC describes the frontier mentioned above in economic terms rather than binary terms (“optimal” or “sub-optimal”) presented by the CEAF.

A simplistic example

Let us suppose that you want to compare two strategies (Strategy A and Strategy B) to see which will be optimal for your company. Your head statistician informs you that Strategy A will be cost-effective 70% of the time and in the 70 times out of 100 that it is cost-effective, you stand to gain $5 dollars each time (you lose $0 each time). She then proceeds to tell you that for every time you are wrong (30% of the time) you stand to lose $100. Your expected loss would be $30 (30% x $100 = $30). With that in mind, you also calculate the expected loss for Strategy B. Turns out it is only $7! ($7 is arbitrary for the sake of example).

In this example, Strategy B would be favored on the CEAF but the CEAC would have shown it to be less likely. Having the CEAF at least informs us what strategy is optimal, but we are still left with a relatively confusing picture of cost-effectiveness.

Instead, the statistitician could present the expected loss of each strategy, which would demonstrate the optimal strategy at this particular WTP and what you stand to lose economically, relative to the other strategies.

The ELC provides a much clearer picture of optimal strategies and the risks associated.

Math

The core of every PSA analysis is derived from the NMB of each strategy for a given iteration of the PSA:

\[NMB_{i,d} = (QALY_{i,d})\lambda - Cost_{i,d}\]

Where i represents a given iteration of the PSA, d is a given strategy, and \(\lambda\) representing the WTP for 1 additional unit of effectiveness.

Let us consider a hypothetical NMB matrix for “Drug A” and “Drug B”.


PSA Sample Drug A NMB Drug B NMB
1 $2,771,108 $2,769,940
2 $2,957,934 $2,955,832
3 $2,996,100 $3,005,927
4 $2,913,308 $2,912,642
5 $2,396,263 $2,394,946



From a given NMB matrix (like the one above), one can calculate the optimal strategy at each iteration, or the strategy with highest net benefit as:

\[NMB_{d^*,i} = argmax_{d,i\in [1,2,... D]}\]


Where the strategy with the highest net monetary benefit, \(NMB_{d*,i}\), for a given PSA sample, i, is the maximum value of the set of \(NMB\) values for a given iteration and all given strategies, D.



With our maximum values identified for each row, we can calculate the economic loss for each PSA sample and each strategy as:


\[L_{d,i} = NMB_{d^*,i} - NMB_{d,i} | d\in [1,2,... D]\]


Where \(L_{d,i}\) is the loss for a given strategy at a given iteration of the PSA. If the strategy happens to be the optimal strategy at that iteration, the matrix will take a value of 0 (i.e. the value will be subtracted from itself).

Applying this to our mock NMB matrix, we get:


PSA Sample Drug A Loss Drug B Loss
1 $0 $1,168
2 $0 $2,102
3 $7,827 $0
4 $0 $666
5 $0 $1,317



And then our expected loss for Drug A or Drug B is just the average of all of those losses…


\[\overline{EL}_{DrugA} = \frac{1}{N}\sum_{x = i}^{N} Drug A\]


Therefore, our expected losses for each would be:

Drug Expected Loss
A $1,565
B $1,051



And we would conclude that Drug B is optimal at this WTP given that it has a lower expected loss.

The expected loss of a given strategy is simply the average loss of that strategy when it is not the optimal decision. The optimal decision overall is considered to be that with the lowest expected loss.

But why does it matter?

Primarily, many guidelines (including the Second Panel on Cost-Effectiveness in Health and Medicine), recommend that uncertainty in a model should be displayed using a CEAC with the frontier plotted on the same graphic. This is still missing half the value of the sensitivity analysis! This method still tells us nothing about economic risks associated with each decision.

But aren’t they the same?

Now that we know what the expected loss is, it seems reasonable to assume that the expected loss would just be a reflection of the CEAC – and in most cases that is correct; however, it is crucial to understand that this isn’t always the case. It is not always true that the strategy most likely to be cost effective is, indeed, the optimal decision (that is, the decision with the lowest expected loss).

Allow me to explain…

Consider the following possible distributions of your incremental net benefit (INB), where the INB is just the difference between the net monetary benefit of Drug B (\(NMB_{d_B,i}\)) and Drug A (\(NMB_{d_A,i}\)) at each iteration (to continue with the strategies considered above):

\[INB = NMB_{d_B,i} - NMB_{d_A,i}\]

s <- all_skew %>% 
  gather(key = "Skew", value = "INB")

ggplot(s) +
  geom_density(aes(x = INB, fill = Skew), alpha = 2/3) +
  theme_hc() +
  labs(title = "Distribution of Incremental Net Monetary Benefit",
       subtitle = "for three hypothetical INB distributions of Drug B vs. Drug A") +
  scale_x_continuous(labels = dollar_format())



All three hypothetical distributions represent scenarios where Drug B is not 100% likely to be cost-effective (we know this because we can see some of each distribution being < $0, which means that at least one sample of the PSA favors the other Drug A). Of course, these distributions could be located anywhere on the x-axis but most often you’ll have something that looks like those pictured above (each strategy has some probability of being cost-effective).


We know that the mean will represent the optimal strategy per our definitions above. We also know that the median of the data above will represent the most probable to be cost-effective. So, in the special cases where the mean and median are on opposite sides of the $0 line, we will have a strategy that is most likely to be cost effective for one strategy but have the highest INB for another!


It is in this special case, the CEAC alone will give us misleading results and lead a decision maker to make a suboptimal decision. That is why it matters and that is why one should provide both to gather the most comprehensive information available.


Let’s check out the results from the above distributions:

el_matrix <- ce_matrix <- matrix(0,   # create matrix for storing proportion cost effectiveness and expected loss
                                 nrow = length(nmb_list),
                                 ncol = 2,
                                 dimnames = list(c("Left Skewed Data", "Normally Distributed Data", "Right Skewed Data"),
                                                 c("DrugA", "DrugB")))

i <- 1
for(i in 1:length(nmb_list)){
  nmb <- nmb_list[[i]][,-1]
  max_nmb <- max.col(nmb)                       # selecting column with highest nmb in each iteration
  prop_ce <- prop.table(table(max_nmb))                   # proportion in each column
  ce_matrix[i, as.numeric(names(prop_ce))] <- prop_ce  # filling in cost effective matrix created above with prop ce
  # ---- note regarding frontier: this is important alongside proportion ce
  # ---- because it gives an idea of data distribution
  # ---- in theory, one could be most cost effective on average, but have lower mean nmb
  
  loss_matrix <- nmb[cbind(1:10000,max_nmb)] - nmb  # calculating loss of sub-optimal strategy in each iteration                                       
  
  el_matrix[i,] <- colMeans(loss_matrix)              # average loss by strat for each wtp threshold
}

f <- rbind.data.frame(el_matrix, ce_matrix) %>% rownames_to_column(var = "Distribution")
html_table <- f %>% 
                mutate(
                    Val = 1:6,
                    DrugA = ifelse(DrugA<DrugB & Val < 4 | DrugA>DrugB & Val > 3,
                                  cell_spec(DrugA, "html", color = "green"),
                                  cell_spec(DrugA, "html", color = "red")),
                    DrugB = ifelse(DrugB>DrugA & Val < 4 | DrugB<DrugA & Val > 3,
                                  cell_spec(DrugB, "html", color = "red"),
                                  cell_spec(DrugB, "html", color = "green"))
                      ) %>% 
                select(Distribution, DrugA, DrugB) %>% 
                kable("html", escape = FALSE) %>% 
                kable_styling(bootstrap_options = c("striped", "hover")) %>% 
                pack_rows("Expected Loss", 1, 3) %>% 
                pack_rows("Proportion Cost Effective", 4,6)

# ------------------------- #
# NOTE: If you are reding this code, this function above didn't quite work (the ifelse statements would not output the exact colors I needed. So, I printed the html from the table above in my console, then edited it myself, then put it back into Rmd raw. Very hacky but it worked. )
# ------------------------- #
Distribution DrugA DrugB Mean Median
Expected Loss
Left Skewed Data $376.01 $284.55 $91.45 $255.85
Normally Distributed Data $1,032.44 $20.70 $,1011.74 913.04061
Right Skewed Data $361.19 $271.40 $89.79 -$75.47
Proportion Cost Effective
Left Skewed Data 0.3834 0.6166 $91.45 $255.85
Normally Distributed Data 0.0827 0.9173 $,1011.74 913.04061
Right Skewed Data 0.5373 0.4627 $89.79 -$75.47


Notice that when we consider the expected loss, Drug B is optimal (green) for all three scenarios; however, if we only consider the proportion of times a drug is cost-effective, it appears that in the right skewed data, Drug A is the best choice!

That is why these two should be compared together!

See for yourself!

Here you can adjust the distribution of INBs youreslf to observe the special distributions which lead to an incomplete CEAC.

When the mean and median have opposite signs, the distribution below will light up red and the EL and the CEAC will tell a different story! (recall that a lower expected loss is optimal)

Try moving the skew to “7”. Notice how the mean of the INB favors “Strategy B” while the median favors “Strategy A”? Observe, in this case, that B has the lower expected loss but is less likely to be cost effective.


knitr::include_app("https://brennanbeal.shinyapps.io/ExpectedLoss_Shiny/",
                   height = "1000px")



Divide and Conquer!

My goal of this post was for the reader to understand what the expected loss curve is, and how it can add value to the CEAC. You can now see why the CEAC should only stand alone when a decision maker is risk-neutral (that is, when a decision maker is indifferent to the potential magnitude of loss when they’re wrong). For other cases, it is imperitive that an ELC accompany.

So now that you (hopefully) have a solid understanding of what the expected loss curve is, and what it represents, building one should take no time at all!

Thanks

I want to give one last shout out to the DARTH group!

  • Their paper on expected losses (which I highly recommend reading) can be found here
  • Their website can be found here here
 

A work by Brennan T. Beal

brennanbeal@gmail.com