Check out our site adheRenceRX … Or check out our Github
The goal of adheRenceRX is to provide a slightly opinionated set of functions to allow researchers to assess medication adherence in the most flexible way possible. The goal was (is) to write piping-friendly verbs the “tidy” way to allow users to manipulate their data as they’d like without storing data multiple times into their environment. In tidy fashion, we aimed to create functions that did only one thing, ideally that thing is obviated by the name of the function! So, the package makes assessing adherence as flexible as possible with some key things left in the hands of the researcher. The final value is that functions without vectorised solutions (propagate_date()
and rank_episodes()
) are written with C++ allowing speed and performance when you’d rather do research than run a function for an hour!
This was a lot of fun to build but is still in production. If you find errors, or know things you’d like to see done differently, reach out!
You can install v1.0.0 from CRAN using
install.packages("adheRenceRX")
… Or you can install the dev version in you’re feeling wild!
devtools::install_github("btbeal/adheRenceRX", ref = "Dev")
Much of the inspiration for this package came from conversations with analysts who struggle to deal with the non-intuitive ways to deal with medication adherence calculations from pharmaceutical claims data.
Our package is built around suggestions from Canfield and colleagues
(2019) who note that overlapping fill dates should be pushed forward and never counted backwards, to assess adherence properly. For that reason, our package revolves around the first step of creating adjusted dates prior to any other calculation. Next, one can identify the gaps, rank episodes of care, and calculate pdc. The purpose of the package was to be as flexible as possible. So, there will be a lot left to be done by the researcher (on purpose!). For example, are there time periods you’re particularly concerned with? Patient filters? Other groupings (maybe episode of care?). Those are meant to be defined with dplyr verbs outside of our functions.
Our verbs to date are:
For the most part, our verbs assume that dates have been propagated forward and gaps have been properly identified. This is on purpose but is subject to change in the future.
More examples of use can be found on within each functions documentation; however, this should provide a decent idea of intended package workflow.
Take a look at our toy_claims
data loaded with the package. Notice that there is a patient identifier, a fill date, and a days supply. Often times these can be messy. Ideally, you’ll have cleaned this up before you need our package. The very next thing to be done is to adjust these fill dates so that they are not overlapping. We can do that with propagate_date()
.
library(adheRenceRX) library(dplyr) library(magrittr) # manipulate toy_claims, which has IDs based on the Canfield 2019 paper toy_claims %>% # filter for some interesting IDs filter(ID %in% c("B", "D")) %>% # Group by them (grouping not limited, of course) group_by(ID) %>% # propagate the dates forward within those groups propagate_date(.date_var = date, .days_supply_var = days_supply) #> # A tibble: 14 x 4 #> # Groups: ID [2] #> ID date days_supply adjusted_date #> <chr> <date> <dbl> <date> #> 1 B 2020-01-01 30 2020-01-01 #> 2 B 2020-01-31 30 2020-01-31 #> 3 B 2020-03-01 30 2020-03-01 #> 4 B 2020-05-30 60 2020-05-30 #> 5 B 2020-06-29 60 2020-07-29 #> 6 B 2020-07-29 30 2020-09-27 #> 7 B 2020-08-28 30 2020-10-27 #> 8 B 2020-09-27 30 2020-11-26 #> 9 D 2020-01-01 60 2020-01-01 #> 10 D 2020-01-31 60 2020-03-01 #> 11 D 2020-03-01 60 2020-04-30 #> 12 D 2020-05-30 30 2020-06-29 #> 13 D 2020-08-28 60 2020-08-28 #> 14 D 2020-09-27 30 2020-10-27
Notice that several rows have been pushed forward to account for overlaps in date. Also notice that the output changes the date and days supply variable to date
and days_supply
while adding an adjusted_date
variable. The adjusted_date
variable is used by some of the other functions so it is important to complete this step first.
To get a visual of what is happening here…
library(ggalt) library(tidyr) library(lubridate) library(ggtext) toy_claims %>% filter(ID %in% c("B", "D")) %>% group_by(ID) %>% propagate_date(date, days_supply) %>% identify_gaps() %>% pivot_longer(cols = c("date", "adjusted_date"), names_to = "type", values_to = "date") %>% mutate(type = if_else(type == "date", "Pre", "Post"), type = factor(type, levels = c("Pre", "Post"))) %>% group_by(ID, type) %>% mutate(end = date + days(days_supply), d_rank = rank(date)) %>% ggplot() + geom_dumbbell(aes(x = date, xend = end, y = d_rank, color = ID), dot_guide = TRUE) + ggthemes::theme_clean() + theme(text=element_text(size=16, family="Arial"), axis.title.y = element_blank(), legend.text=element_text(size=14, family="Arial"), legend.title=element_text(family = "Arial"), plot.caption = element_markdown(size = 10, family = "Arial", lineheight = 1.2, hjust = 0), plot.caption.position = "panel", panel.grid.major = element_blank(), panel.grid.minor = element_blank(), panel.grid.major.y = element_blank(), panel.spacing.x = unit(10, "mm") ) + facet_grid(ID ~ type) + guides(color = FALSE) + labs( y = "Fill Order", x = "Fill Date", caption = "<b>Figure 1</b> Two mock patient IDs, 'B' and 'D', pre and post fill-date adjustments<br> <b>Notes</b> The purpose of this function is to shift overlapping dates forward to account for <br> stockpiling. Note that for Patient 'D', this removes a gap. <br>" )
Notice how the overlapping dates are pushed forward - in the case of patient D, this removes what would have otherwise been counted as a gap between their third and fourth fill and accounts for stockpiling!
Once the dates have been adjusted, we can identify gaps in therapy with identify_gaps()
or summarise them with summarise_gaps()
. Note, we must identify our gaps in therapy to move forward because our calculations depend on those gaps as well as the episodes of care function. Again, this structure revolves around the one-function-one-task principle.
# The same code from above toy_claims %>% filter(ID %in% c("B", "D")) %>% group_by(ID) %>% propagate_date(.date_var = date, .days_supply_var = days_supply) %>% # But now we can identify gaps # Note that the identified gap is appended to the fill after # the gap occurs identify_gaps() #> # A tibble: 14 x 5 #> # Groups: ID [2] #> ID date days_supply adjusted_date gap #> <chr> <date> <dbl> <date> <dbl> #> 1 B 2020-01-01 30 2020-01-01 0 #> 2 B 2020-01-31 30 2020-01-31 0 #> 3 B 2020-03-01 30 2020-03-01 0 #> 4 B 2020-05-30 60 2020-05-30 60 #> 5 B 2020-06-29 60 2020-07-29 0 #> 6 B 2020-07-29 30 2020-09-27 0 #> 7 B 2020-08-28 30 2020-10-27 0 #> 8 B 2020-09-27 30 2020-11-26 0 #> 9 D 2020-01-01 60 2020-01-01 0 #> 10 D 2020-01-31 60 2020-03-01 0 #> 11 D 2020-03-01 60 2020-04-30 0 #> 12 D 2020-05-30 30 2020-06-29 0 #> 13 D 2020-08-28 60 2020-08-28 30 #> 14 D 2020-09-27 30 2020-10-27 0 # Or, we could just summarise them all: toy_claims %>% filter(ID %in% c("B", "D")) %>% group_by(ID) %>% propagate_date(.date_var = date, .days_supply_var = days_supply) %>% # Summarising gaps summarise_gaps() #> # A tibble: 2 x 2 #> ID Sum_Of_Gaps #> <chr> <dbl> #> 1 B 60 #> 2 D 30
With the gaps identified, we can check for episodes of care using our rank_episodes()
functions. Note that this function assumes that you’ve propagated your dates appropriately and identified all gaps. You can then tell our function what can be considered a permissible gap, and everything after a gap that large or more will be considered the next episode! This function is useful for medication persistence questions as well given that you can classify gaps larger than a given .permissable_gap
as a new episode and then group by those episodes. Let me show you.
# The same code from above toy_claims %>% filter(ID %in% c("B", "D")) %>% group_by(ID) %>% # Must propagate the date AND identify gaps in this workflow propagate_date(.date_var = date, .days_supply_var = days_supply) %>% identify_gaps() %>% # say that anything over a 10 day gap should count as the next episode rank_episodes(.permissible_gap = 10) #> # A tibble: 14 x 6 #> # Groups: ID [2] #> ID date days_supply adjusted_date gap episode #> <chr> <date> <dbl> <date> <dbl> <dbl> #> 1 B 2020-01-01 30 2020-01-01 0 1 #> 2 B 2020-01-31 30 2020-01-31 0 1 #> 3 B 2020-03-01 30 2020-03-01 0 1 #> 4 B 2020-05-30 60 2020-05-30 60 2 #> 5 B 2020-06-29 60 2020-07-29 0 2 #> 6 B 2020-07-29 30 2020-09-27 0 2 #> 7 B 2020-08-28 30 2020-10-27 0 2 #> 8 B 2020-09-27 30 2020-11-26 0 2 #> 9 D 2020-01-01 60 2020-01-01 0 1 #> 10 D 2020-01-31 60 2020-03-01 0 1 #> 11 D 2020-03-01 60 2020-04-30 0 1 #> 12 D 2020-05-30 30 2020-06-29 0 1 #> 13 D 2020-08-28 60 2020-08-28 30 2 #> 14 D 2020-09-27 30 2020-10-27 0 2
Finally, an actual adherence calculation. This is fairly straightforward since the bulk of the work has been done adjusting your dates and then appropriately identifying the gaps in therapy. Still, more functions = more fun!
toy_claims %>% group_by(ID) %>% propagate_date(.date_var = date, .days_supply_var = days_supply) %>% identify_gaps() %>% calculate_pdc() #> # A tibble: 3 x 4 #> ID total_gaps total_days adherence #> <chr> <dbl> <dbl> <dbl> #> 1 A 0 270 1 #> 2 B 60 330 0.818 #> 3 D 30 300 0.9
Importantly, the total number of days (our denominator) is calculated from the first fill to the last fill. This generates a more conservative estimate for patients. Of course, to get around this (and include the last supply towards adherence), one could simply append an extra date for each patient.
That’s all we have for now. Again, this package is meant to provide some helper functions with the meat of the project coming from our propagate_date()
and rank_episodes()
. Notbaly, those tasks can’t be accomplished with dplyr
alone (as they do not have vectorised solutions). For this reason, we’ve written some C++ functions to help you speed up the task!