\[ \usepackage{dsfont} \usepackage{xcolor} \require{mathtools} \definecolor{bayesred}{RGB}{147, 30, 24} \definecolor{bayesblue}{RGB}{32, 35, 91} \definecolor{bayesorange}{RGB}{218, 120, 1} \definecolor{grey}{RGB}{128, 128, 128} \definecolor{couleur1}{RGB}{0,163,137} \definecolor{couleur2}{RGB}{255,124,0} \definecolor{couleur3}{RGB}{0, 110, 158} \definecolor{coul1}{RGB}{255,37,0} \definecolor{coul2}{RGB}{242,173,0} \definecolor{col_neg}{RGB}{155, 191, 221} \definecolor{col_pos}{RGB}{255, 128, 106} {\color{bayesorange} P (\text{H} \mid \text{E})} = \frac {{\color{bayesred} P(\text{H})} \times {\color{bayesblue}P(\text{E} \mid \text{H})}} {\color{grey} {P(\text{E})}} \]
This online appendix provides R codes to apply the methods presented in the companion paper (Charpentier, Flachaire, and Gallic (2023)).
1 Data
To illustrate the methods, we rely on the Linked Birth/Infant Death Cohort Data (2013 cohort). The CSV files for the 2013 cohort were downloaded from the NBER collection of Birth Cohort Linked Birth and Infant Death Data of the National Vital Statistics System of the National Center for Health Statistics, on the NBER website. We formated the data (see ./births_stats.html).
Let us load the birth data.
library(tidyverse)
load("../data/births.RData")Then, we can only keep a subsample of variables to illustrate the method.
base <-
births %>%
mutate(
black_mother = mracerec == "Black",
nonnatural_delivery = rdmeth_rec != "Vaginal"
) %>%
select(
sex, dbwt, cig_rec, wtgain,
black_mother, nonnatural_delivery,
mracerec, rdmeth_rec
) %>%
mutate(
cig_rec = replace_na(cig_rec, "Unknown or not stated"),
black_mother = ifelse(black_mother, yes = "Yes", no = "No"),
is_girl = ifelse(sex == "Female", yes = "Yes", no = "No")
) %>%
labelled::set_variable_labels(
black_mother = "Is the mother Black?",
is_girl = "Is the newborn a girl?",
nonnatural_delivery = "Is the delivery method non-natural?"
) %>%
rename(birth_weight = dbwt)
base# A tibble: 3,940,764 × 9
sex birth_weight cig_rec wtgain black_mo…¹ nonna…² mrace…³ rdmet…⁴ is_girl
<fct> <dbl> <fct> <dbl> <chr> <lgl> <fct> <fct> <chr>
1 Male 3017 No 1 No FALSE "White" Vaginal No
2 Female 3367 No 18 No FALSE "White" Vaginal Yes
3 Male 4409 No 36 No FALSE "White" Vaginal No
4 Male 4047 No 25 No FALSE "White" Vaginal No
5 Male 3119 Yes 0 No FALSE "Ameri… Vaginal No
6 Female 4295 Yes 34 No FALSE "White" Vaginal Yes
7 Male 3160 No 47 No FALSE "Ameri… Vaginal No
8 Male 3711 No 39 No FALSE "White" Vaginal No
9 Male 3430 No NA No FALSE "White" Vaginal No
10 Female 3772 No 36 No FALSE "White" Vaginal Yes
# … with 3,940,754 more rows, and abbreviated variable names ¹black_mother,
# ²nonnatural_delivery, ³mracerec, ⁴rdmeth_rec
The variables that were kept are the following:
sex: sex of the newbornbirth_weight: Birth Weight (in Grams)cig_rec: mother smokes cigaretteswtgain: mother weight gainmracerec: mother’s raceblack_mother: is mother black?rdmeth_rec: delivery methodnonnatural_delivery: is the delivery method non-natural?
Then, let us discard individuals with missing values.
base <-
base %>%
filter(
!is.na(birth_weight),
!is.na(wtgain),
!is.na(nonnatural_delivery),
!is.na(black_mother)
)Let us define some colours for later use:
library(wesanderson)
colr1 <- wes_palette("Darjeeling1")
colr2 <- wes_palette("Darjeeling2")
couleur1 <- colr1[2]
couleur2 <- colr1[4]
couleur3 <- colr2[2]
coul1 <- "#882255"
coul2 <- "#DDCC77"We need to load some packages.
library(car)
library(transport)
library(splines)
library(mgcv)
library(expm)
library(tidyverse)2 Objectives
We would like to measure the effect of a treatment variable \(T\) on the probability of observing a binary outcome \(y\), depending on a binary treatment \(T\). The outcome is assumed to depend on another variable \(\boldsymbol{x}^m\) that is also influenced by the treatment; this variable is called a mediator.
Output (\(y\),
nonnatural_delivery): Probability of having a non-natural deliveryTreatment (\(T\)), either:
cig_rec: Whether the mother smokes \(\color{couleur2}{t=1}\) or not \(\color{couleur1}{t=0}\)black_mother: Whether the mother is Black \(\color{couleur2}{t=1}\) or not \(\color{couleur1}{t=0}\)sex: Whether the baby is a girl \(\color{couleur2}{t=1}\) or not \(\color{couleur1}{t=0}\)
Mediator (\(\boldsymbol{x}^m\)) is either:
birth_weight: Birth weight of the newbornwtgain: weight gain of the mother.
Consider two models, \(\color{couleur1}{\widehat{m}_0(x)}\) and \(\color{couleur2}{\widehat{m}_1(x)}\), that estimate, respectively, \(\color{couleur1}{\mathbb{E}[Y|X=x,T=0]}\) and \(\color{couleur2}{\mathbb{E}[Y|X=x,T=1]}\), \[ \text{SCATE}(x)=\color{couleur2}{\widehat{m}_1}\big(\widehat{\mathcal{T}}(x)\big)\color{black}{} - \color{couleur1}{\widehat{m}_0}\big(x\big) \] where \(\widehat{\mathcal{T}}(\cdot)\) is a transport function:
- if we consider a quantile-based matching, \(\widehat{\mathcal{T}}(x)= \color{couleur2}{\widehat{F}_1^{-1}}\color{black}{}\circ \color{couleur1}{\widehat{F}_0}(x)\), with \(\color{couleur1}{\widehat{F}_0}\) and \(\color{couleur2}{\widehat{F}_1}\) denoting the empirical distribution functions of \(x\) conditional on \(\color{couleur1}{t=0}\) and \(\color{couleur2}{t=1}\), respectively.
- assuming a Gaussian distribution of the mediator variable, we can consider \(\widehat{\mathcal{T}}(x) := \widehat{\mathcal{T}}_{\mathcal{N}}(x)= \color{couleur2}{\overline{x}_1}\color{black}{}+\color{couleur2}{s_1}\color{couleur1}{s_0^{-1}}\color{black}{} (x-\color{couleur1}{\overline{x}_0}\color{black}{})\), \(\color{couleur1}{\overline{x}_0}\) and \(\color{couleur2}{\overline{x}_1}\) being respectively the averages of \(x\) in the two sub-populations, and \(\color{couleur1}{s_0}\) and \(\color{couleur2}{s_1}\) the sample standard deviations.
To estimate the mutatis mutandis Sample Conditional Average Treatment Effect, the following are required:
- The two estimated models \(\color{couleur1}{\widehat{m}_0(x)}\) and \(\color{couleur2}{\widehat{m}_1(x)}\).
- A function to predict new values with these each of these two models.
- A transport method \(\mathcal{T}(\cdot)\).
We will compute the SCATE at the different values of \(\boldsymbol{x}\):
x_0_birth_weight <- seq(2000,4500, by=500)x_0_weight_gain <- seq(5,55, by=10)3 Models
We will consider here different models:
- GAM
- Local averages computed using kernels
For each model, we need to define two functions: a first one that estimates the model, and a second one that makes predictions based on the estimated model.
3.1 GAM (with cubic splines)
The function used to estimate both models \(\color{couleur1}{\hat{m}_0}\color{black}{}()\) and \(\color{couleur2}{\hat{m}_1}\color{black}{}()\)
library(splines)
#' Returns $\hat{m}_0()$ and $\hat{m}_1()$, using GAM
#' @param target name of the target variable
#' @param treatment_name name of the treatment variable (column in `data`)
#' @param x_m_name name of the mediator variable
#' @param data data frame with the observations
#' @param treatment_0 value for non treated
#' @param treatment_1 value for treated
#' @param df degrees of freedom (in the `bs()` function, deefault to `df=3`)
models_spline <- function(target, treatment_name, x_m_name, data, treatment_0, treatment_1, df = 3){
# \hat{m}_0()
reg_0 <-
bquote(glm(.(target)~bs(.(x_m_name), df = df), data=data,
family=binomial, subset = (.(treatment_name) == .(treatment_0))),
list(target = as.name(target), x_m_name=as.name(x_m_name),
treatment_name = as.name(treatment_name),
treatment_0 = treatment_0)) %>%
eval()
# \hat{m}_1()
reg_1 <- bquote(glm(.(target)~bs(.(x_m_name), df = df), data=data,
family=binomial, subset = (.(treatment_name) == .(treatment_1))),
list(target = as.name(target), x_m_name=as.name(x_m_name),
treatment_name = as.name(treatment_name),
treatment_1 = treatment_1)) %>%
eval()
list(reg_0 = reg_0, reg_1 = reg_1)
}Let us estimate these models.
target <- "nonnatural_delivery"
treatment_name <- "cig_rec"
x_m_name <- "birth_weight"
treatment_0 <- "No"
treatment_1 <- "Yes"
reg_univ_smoker_bw <-
models_spline(target = target,
treatment_name = treatment_name,
x_m_name = x_m_name,
data = base, treatment_0 = treatment_0, treatment_1 = treatment_1, df = 3)target <- "nonnatural_delivery"
treatment_name <- "cig_rec"
x_m_name <- "wtgain"
treatment_0 <- "No"
treatment_1 <- "Yes"
reg_univ_smoker_wg <-
models_spline(target = target,
treatment_name = treatment_name,
x_m_name = x_m_name,
data = base, treatment_0 = treatment_0, treatment_1 = treatment_1, df = 3)target <- "nonnatural_delivery"
treatment_name <- "black_mother"
x_m_name <- "birth_weight"
treatment_0 <- "No"
treatment_1 <- "Yes"
reg_univ_blackm_bw <-
models_spline(target = target,
treatment_name = treatment_name,
x_m_name = x_m_name,
data = base, treatment_0 = treatment_0, treatment_1 = treatment_1, df = 3)target <- "nonnatural_delivery"
treatment_name <- "black_mother"
x_m_name <- "wtgain"
treatment_0 <- "No"
treatment_1 <- "Yes"
reg_univ_blackm_wg <-
models_spline(target = target,
treatment_name = treatment_name,
x_m_name = x_m_name,
data = base, treatment_0 = treatment_0, treatment_1 = treatment_1, df = 3)target <- "nonnatural_delivery"
treatment_name <- "sex"
x_m_name <- "birth_weight"
treatment_0 <- "Male"
treatment_1 <- "Female"
reg_univ_sex_bw <-
models_spline(target = target,
treatment_name = treatment_name,
x_m_name = x_m_name,
data = base, treatment_0 = treatment_0, treatment_1 = treatment_1, df = 3)target <- "nonnatural_delivery"
treatment_name <- "sex"
x_m_name <- "wtgain"
treatment_0 <- "Male"
treatment_1 <- "Female"
reg_univ_sex_wg <-
models_spline(target = target,
treatment_name = treatment_name,
x_m_name = x_m_name,
data = base, treatment_0 = treatment_0, treatment_1 = treatment_1, df = 3)3.2 Kernels
Let us define a function to compute the local average of a single observation:
#' @param target name of the response variable
#' @param x_m_name name of the mediator variable
#' @param x single value which will be transported
#' @param h bandwidth
#' @param data data frame, subset of data with t=0 or t=1
pred_kernel_single <- function(target, x_m_name, x, h=10, data){
w <- dnorm(pull(data, x_m_name), x, h)
sum(w*pull(data, target))/sum(w)
}target <- "nonnatural_delivery"
treatment_name <- "cig_rec"
x_m_name <- "birth_weight"
treatment_0 <- "No"
treatment_1 <- "Yes"
mod_0_k_smoker_bw <- list(target = target, x_m_name = x_m_name, h= 50,
data = base[pull(base, treatment_name) == treatment_0,])
mod_1_k_smoker_bw <- list(target = target, x_m_name = x_m_name, h= 50,
data = base[pull(base, treatment_name) == treatment_1,])target <- "nonnatural_delivery"
treatment_name <- "cig_rec"
x_m_name <- "wtgain"
treatment_0 <- "No"
treatment_1 <- "Yes"
mod_0_k_smoker_wg <- list(target = target, x_m_name = x_m_name, h = 50,
data = base[pull(base, treatment_name) == treatment_0,])
mod_1_k_smoker_wg <- list(target = target, x_m_name = x_m_name, h = 50,
data = base[pull(base, treatment_name) == treatment_1,])target <- "nonnatural_delivery"
treatment_name <- "black_mother"
x_m_name <- "birth_weight"
treatment_0 <- "No"
treatment_1 <- "Yes"
mod_0_k_blackm_bw <- list(target = target, x_m_name = x_m_name, h= 50,
data = base[pull(base, treatment_name) == treatment_0,])
mod_1_k_blackm_bw <- list(target = target, x_m_name = x_m_name, h= 50,
data = base[pull(base, treatment_name) == treatment_1,])target <- "nonnatural_delivery"
treatment_name <- "black_mother"
x_m_name <- "wtgain"
treatment_0 <- "No"
treatment_1 <- "Yes"
mod_0_k_blackm_wg <- list(target = target, x_m_name = x_m_name, h= 50,
data = base[pull(base, treatment_name) == treatment_0,])
mod_1_k_blackm_wg <- list(target = target, x_m_name = x_m_name, h= 50,
data = base[pull(base, treatment_name) == treatment_1,])target <- "nonnatural_delivery"
treatment_name <- "sex"
x_m_name <- "birth_weight"
treatment_0 <- "Male"
treatment_1 <- "Female"
mod_0_k_sex_bw <- list(target = target, x_m_name = x_m_name, h= 50,
data = base[pull(base, treatment_name) == treatment_0,])
mod_1_k_sex_bw <- list(target = target, x_m_name = x_m_name, h= 50,
data = base[pull(base, treatment_name) == treatment_1,])target <- "nonnatural_delivery"
treatment_name <- "sex"
x_m_name <- "wtgain"
treatment_0 <- "Male"
treatment_1 <- "Female"
mod_0_k_sex_wg <- list(target = target, x_m_name = x_m_name, h= 50,
data = base[pull(base, treatment_name) == treatment_0,])
mod_1_k_sex_wg <- list(target = target, x_m_name = x_m_name, h= 50,
data = base[pull(base, treatment_name) == treatment_1,])4 Prediction Function
4.1 GAM
The prediction function for GAM:
#' @param object regression model (GAM)
#' @param newdata data frame in which to look for the mediator variable used to predict the target
model_spline_predict <- function(object, newdata){
predict(object, newdata = newdata, type="response")
}4.2 Kernels
Function used to compute the local average of a single observation:
#' @param target name of the response variable
#' @param x_m_name name of the mediator variable
#' @param x single value which will be transported
#' @param h bandwidth
#' @param data data frame, subset of data with t=0 or t=1
pred_kernel_single <- function(target, x_m_name, x, h=10, data){
w <- dnorm(pull(data, x_m_name), x, h)
sum(w*pull(data, target))/sum(w)
}And the prediction function:
#' @param object regression model (GAM)
#' @param newdata data frame in which to look for the mediator variable used to predict the target
pred_kernel <- function(object, newdata){
target <- object$target
x_m_name <- object$x_m_name
h <- object$h
data <- object$data
pull(newdata, 1) %>%
map_dbl(~pred_kernel_single(target = target, x_m_name = x_m_name, x = ., h = h,
data = data))
}5 Transport
Let us now turn to the transport of the mediator variable.
5.1 Quantile-based
Then, we define a function to transport the values of a mediator.
#' Quantile-based transport (Univariate)
#' @param x_0 vector of values to be transported
#' @param x_m_name name of the mediator variable
#' @param treatment_name name of the treatment variable (column in `data`)
#' @param treatment_0 value for non treated
#' @param treatment_1 value for treated
#' @param data data frame with both T=0 or T=1
transport_quantile <- function(x_0, x_m_name, treatment_name, treatment_0, treatment_1, data){
ind_0 <- pull(data, treatment_name) == treatment_0
x_val <- pull(data, x_m_name)
# Empirical Cumulative Distribution Function for values of the mediator variable for non-treated
Fn <- ecdf(x_val)
# Probability associated in the non-treated
u <- Fn(x_0)
# Transported_values
x_1 <- pull(data, x_m_name)[pull(data, treatment_name) == treatment_1]
x_t_quantile <- quantile(x_1, u)
list(x_0 = x_0, u = u, x_t_quantile = x_t_quantile)
}The transported values of \(\boldsymbol{x}\), i.e., \(\mathcal{T}(\boldsymbol{x})\):
target <- "nonnatural_delivery"
treatment_name <- "cig_rec"
x_m_name <- "birth_weight"
treatment_0 <- "No"
treatment_1 <- "Yes"
# Only at a few points
trans_q_smoker_bw <-
transport_quantile(x_0 = x_0_birth_weight, x_m_name = x_m_name,
treatment_name = treatment_name,
treatment_0 = treatment_0, treatment_1 = treatment_1, data = base)
# At more points
probs <- seq(0,1,length=601)
# Quantiles for non treated
x_0_probs_smoker_bw <- quantile(pull(base, x_m_name)[pull(base, treatment_name) == treatment_0], probs = probs)
# Transported values
trans_q_smoker_bw_probs <-
transport_quantile(x_0 = x_0_probs_smoker_bw, x_m_name = x_m_name,
treatment_name = treatment_name,
treatment_0 = treatment_0, treatment_1 = treatment_1, data = base)target <- "nonnatural_delivery"
treatment_name <- "cig_rec"
x_m_name <- "wtgain"
treatment_0 <- "No"
treatment_1 <- "Yes"
# Only at a few points
trans_q_smoker_wg <-
transport_quantile(x_0 = x_0_weight_gain, x_m_name = x_m_name,
treatment_name = treatment_name,
treatment_0 = treatment_0, treatment_1 = treatment_1, data = base)
# At more points
probs <- seq(0,1,length=601)
# Quantiles for non treated
x_0_probs_smoker_wg <- quantile(pull(base, x_m_name)[pull(base, treatment_name) == treatment_0], probs = probs)
# Transported values
trans_q_smoker_wg_probs <-
transport_quantile(x_0 = x_0_probs_smoker_wg, x_m_name = x_m_name,
treatment_name = treatment_name,
treatment_0 = treatment_0, treatment_1 = treatment_1, data = base)target <- "nonnatural_delivery"
treatment_name <- "black_mother"
x_m_name <- "birth_weight"
treatment_0 <- "No"
treatment_1 <- "Yes"
# Only at a few points
trans_q_blackm_bw <-
transport_quantile(x_0 = x_0_birth_weight, x_m_name = x_m_name,
treatment_name = treatment_name,
treatment_0 = treatment_0, treatment_1 = treatment_1, data = base)
# At more points
probs <- seq(0,1,length=601)
# Quantiles for non treated
x_0_probs_blackm_bw <- quantile(pull(base, x_m_name)[pull(base, treatment_name) == treatment_0], probs = probs)
# Transported values
trans_q_blackm_bw_probs <-
transport_quantile(x_0 = x_0_probs_blackm_bw, x_m_name = x_m_name,
treatment_name = treatment_name,
treatment_0 = treatment_0, treatment_1 = treatment_1, data = base)target <- "nonnatural_delivery"
treatment_name <- "black_mother"
x_m_name <- "wtgain"
treatment_0 <- "No"
treatment_1 <- "Yes"
# Only at a few points
trans_q_blackm_wg <-
transport_quantile(x_0 = x_0_weight_gain, x_m_name = x_m_name,
treatment_name = treatment_name,
treatment_0 = treatment_0, treatment_1 = treatment_1, data = base)
# At more points
probs <- seq(0,1,length=601)
# Quantiles for non treated
x_0_probs_blackm_wg <- quantile(pull(base, x_m_name)[pull(base, treatment_name) == treatment_0], probs = probs)
# Transported values
trans_q_blackm_wg_probs <-
transport_quantile(x_0 = x_0_probs_blackm_wg, x_m_name = x_m_name,
treatment_name = treatment_name,
treatment_0 = treatment_0, treatment_1 = treatment_1, data = base)target <- "nonnatural_delivery"
treatment_name <- "sex"
x_m_name <- "birth_weight"
treatment_0 <- "Male"
treatment_1 <- "Female"
# Only at a few points
trans_q_sex_bw <-
transport_quantile(x_0 = x_0_birth_weight, x_m_name = x_m_name,
treatment_name = treatment_name,
treatment_0 = treatment_0, treatment_1 = treatment_1, data = base)
# At more points
probs <- seq(0,1,length=601)
# Quantiles for non treated
x_0_probs_sex_bw <- quantile(pull(base, x_m_name)[pull(base, treatment_name) == treatment_0], probs = probs)
# Transported values
trans_q_sex_bw_probs <-
transport_quantile(x_0 = x_0_probs_sex_bw, x_m_name = x_m_name,
treatment_name = treatment_name,
treatment_0 = treatment_0, treatment_1 = treatment_1, data = base)target <- "nonnatural_delivery"
treatment_name <- "sex"
x_m_name <- "wtgain"
treatment_0 <- "Male"
treatment_1 <- "Female"
# Only at a few points
trans_q_sex_wg <-
transport_quantile(x_0 = x_0_weight_gain, x_m_name = x_m_name,
treatment_name = treatment_name,
treatment_0 = treatment_0, treatment_1 = treatment_1, data = base)
# At more points
probs <- seq(0,1,length=601)
# Quantiles for non treated
x_0_probs_sex_wg <- quantile(pull(base, x_m_name)[pull(base, treatment_name) == treatment_0], probs = probs)
# Transported values
trans_q_sex_wg_probs <-
transport_quantile(x_0 = x_0_probs_sex_wg, x_m_name = x_m_name,
treatment_name = treatment_name,
treatment_0 = treatment_0, treatment_1 = treatment_1, data = base)We can visualize the transported values from the control group to the treated group. To do so, let us define a function.
Show the R codes
#' @param x_0 vector of values
#' @param x_0_t vector of transported values
#' @param data data frame with the observations
#' @param treatment_name name of the treatment variable (column in `data`)
#' @param treatment_0 value for non treated
#' @param treatment_1 value for treated
#' @param x_m_name name of the mediator variable
#' @param distrib_plot_type type of visualization for the distribution in each subset ("density", "histogram", "none")
#' @param breaks breaks for histogram (if `distrib_plot_type` is not "none")
#' @param xlim limits on the x-axis
#' @param x_c_label label of the mediator variable
#' @param treated_label label of the treated variable
#' @param non_treated_label label of the non-treated variable
plot_transport_univariate <- function(x_0, x_0_t, data,
treatment_name, treatment_0, treatment_1, x_m_name,
distrib_plot_type = c("density", "histogram", "none"),
breaks = NULL,
xlim, x_c_label, treated_label, non_treated_label){
distrib_plot_type <- match.arg(distrib_plot_type)
# Subset of non treated
d_0 <- filter(data, !!sym(treatment_name) == treatment_0)
# Subset of treated
d_1 <- filter(data, !!sym(treatment_name) == treatment_1)
if(distrib_plot_type == "density"){
# Density for non-treated
dens_0 <- density(pull(d_0, x_m_name))
# Density for treated
dens_1 <- density(pull(d_1, x_m_name))
}else if(distrib_plot_type == "histogram"){
hist_0 <- hist(pull(d_0, x_m_name), breaks = breaks, plot = FALSE)
hist_1 <- hist(pull(d_1, x_m_name), breaks = breaks, plot = FALSE)
}
if(distrib_plot_type != "none"){
mat <- matrix(c(1,2,0,3), 2)
par(mfrow = c(2,2))
layout(mat, c(3.5,1), c(1,3))
par(mar = c(0.5, 4.5, 0.5, 0.5))
}
if(distrib_plot_type == "density"){
# Density of the mediator variable on the subset of non treated (green)
plot(dens_0$x, dens_0$y,
main="", axes=FALSE, xlab="", ylab="",
xlim = xlim, col="white")
polygon(dens_0$x, dens_0$y, col = couleur1, border = NA)
}else if(distrib_plot_type == "histogram"){
# Histogram of the mediator variable on the subset of non treated (green)
barplot(hist_0$counts[hist_0$breaks >= min(xlim) & hist_0$breaks <= max(xlim)],
width=5, space=0, main="",
axes=FALSE,xlab="", ylab="",
xlim=xlim, col=couleur1, border="white")
}
par(mar=c(4.5, 4.5, 0.5, 0.5))
# Optimal transport (quantile based)
plot(x_0, x_0_t,
col = couleur3, lwd = 2,
type = "l",
xlab = "",
ylab = "",
xlim = xlim, ylim = xlim)
abline(a = 0, b = 1, col = couleur3, lty = 2)
ylab_1 <- bquote(.(x_c_label) * " (" *phantom(.(non_treated_label)) * ")")
ylab_2 <- bquote(phantom(.(x_c_label)) * phantom(" (") *.(non_treated_label) * phantom(")"))
mtext(ylab_1, side=1,line=3, col = "black")
mtext(ylab_2, side=1,line=3, col = couleur1)
xlab_1 <- bquote(.(x_c_label) * " (" *phantom(.(treated_label)) * ")")
xlab_2 <- bquote(phantom(.(x_c_label)) * phantom(" (") *.(treated_label) *phantom(")"))
mtext(xlab_1, side=2,line=3, col = "black")
mtext(xlab_2, side=2,line=3, col = couleur2)
if(distrib_plot_type == "density"){
# Density of the mediator variable the subset of treated (orange)
par(mar = c(4.5, 0.5, 0.5, 0.5))
plot(dens_1$y, dens_1$x,
main="", axes=FALSE, xlab="", ylab="",
ylim = xlim, col="white")
polygon(dens_1$y, dens_1$x, col = couleur2, border = NA)
}else if(distrib_plot_type == "histogram"){
# Histogram of the mediator variable the subset of treated (orange)
par(mar = c(4.5, 0.5, 0.5, 0.5))
barplot(hist_1$counts[hist_1$breaks >= min(xlim) & hist_1$breaks <= max(xlim)],
width=5, space=0, main="",
axes=FALSE,xlab="", ylab="",
ylim=xlim, col=couleur2, border="white", horiz=TRUE)
}
}Show the R codes
plot_transport_univariate(x_0 = trans_q_smoker_bw_probs$x_0,
x_0_t = trans_q_smoker_bw_probs$x_t_quantile,
data = base,
treatment_name = "cig_rec",
treatment_0 = "No", treatment_1 = "Yes",
x_m_name = "birth_weight", distrib_plot_type = "density",
breaks = NULL, xlim = c(1800,4600),
x_c_label = "Weight of the baby",
treated_label = "Smoker mother", non_treated_label = "non-smoker mother")Show the R codes
plot_transport_univariate(x_0 = trans_q_smoker_wg_probs$x_0,
x_0_t = trans_q_smoker_wg_probs$x_t_quantile,
data = base,
treatment_name = "cig_rec",
treatment_0 = "No", treatment_1 = "Yes",
x_m_name = "wtgain", distrib_plot_type = "histogram",
breaks = seq(0,105,by=5), xlim = c(0,90),
x_c_label = "Weight of the baby",
treated_label = "Smoker mother", non_treated_label = "non-smoker mother")Show the R codes
plot_transport_univariate(x_0 = trans_q_blackm_bw_probs$x_0,
x_0_t = trans_q_blackm_bw_probs$x_t_quantile,
data = base,
treatment_name = "black_mother",
treatment_0 = "No", treatment_1 = "Yes",
x_m_name = "birth_weight", distrib_plot_type = "density",
breaks = NULL, xlim = c(1800,4600),
x_c_label = "Weight of the baby",
treated_label = "Black mother", non_treated_label = "non-Black mother")Show the R codes
plot_transport_univariate(x_0 = trans_q_blackm_wg_probs$x_0,
x_0_t = trans_q_blackm_wg_probs$x_t_quantile,
data = base,
treatment_name = "black_mother",
treatment_0 = "No", treatment_1 = "Yes",
x_m_name = "wtgain", distrib_plot_type = "histogram",
breaks = seq(0,105,by=5), xlim = c(0,90),
x_c_label = "Weight gain of the mother",
treated_label = "Black mother", non_treated_label = "non-Black mother")Show the R codes
plot_transport_univariate(x_0 = trans_q_sex_wg_probs$x_0,
x_0_t = trans_q_sex_wg_probs$x_t_quantile,
data = base,
treatment_name = "sex",
treatment_0 = "Male", treatment_1 = "Female",
x_m_name = "wtgain", distrib_plot_type = "histogram",
breaks = seq(0,105,by=5), xlim = c(0,90),
x_c_label = "Weight gain of the mother",
treated_label = "baby girl", non_treated_label = "baby boy")5.2 Gaussian Assumption
Now, let us consider the case in which the mediator is assumed to be Gaussian.
The transport function can be defined as follows:
#' @param x_0 vector of values to be transported
#' @param x_m_name name of the mediator variable
#' @param treatment_name name of the treatment variable (column in `data`)
#' @param treatment_0 value for non treated
#' @param treatment_1value for treated
#' @param data data frame with both T=0 or T=1
transport_univ_gaussian <- function(x_0, x_m_name, treatment_name, treatment_0, treatment_1, data){
x0 <- mean(pull(data, x_m_name)[pull(data, treatment_name) == treatment_0])
x1 <- mean(pull(data, x_m_name)[pull(data, treatment_name) == treatment_1])
s0 <- sd(pull(data, x_m_name)[pull(data, treatment_name) == treatment_0])
s1 <- sd(pull(data, x_m_name)[pull(data, treatment_name) == treatment_1])
u_N <- pnorm(x_0,x0,s0)
x_t_N <- qnorm(u_N, x1, s1)
list(x_0 = x_0, u_N = u_N, x_t_N = x_t_N)
}The transported values of \(\boldsymbol{x}\), i.e., \(\mathcal{T}_\mathcal{N}(\boldsymbol{x})\):
target <- "nonnatural_delivery"
treatment_name <- "cig_rec"
x_m_name <- "birth_weight"
treatment_0 <- "No"
treatment_1 <- "Yes"
# Only at a few points
trans_N_smoker_bw <-
transport_univ_gaussian(x_0 = x_0_birth_weight, x_m_name = x_m_name,
treatment_name = treatment_name,
treatment_0 = treatment_0, treatment_1 = treatment_1, data = base)
# At more points
trans_N_smoker_bw_probs <-
transport_univ_gaussian(x_0 = x_0_probs_smoker_bw, x_m_name = x_m_name,
treatment_name = treatment_name,
treatment_0 = treatment_0, treatment_1 = treatment_1, data = base)target <- "nonnatural_delivery"
treatment_name <- "cig_rec"
x_m_name <- "wtgain"
treatment_0 <- "No"
treatment_1 <- "Yes"
trans_N_smoker_wg <-
transport_univ_gaussian(x_0 = x_0_weight_gain, x_m_name = x_m_name,
treatment_name = treatment_name,
treatment_0 = treatment_0, treatment_1 = treatment_1, data = base)
# At more points
trans_N_smoker_wg_probs <-
transport_univ_gaussian(x_0 = x_0_probs_smoker_wg, x_m_name = x_m_name,
treatment_name = treatment_name,
treatment_0 = treatment_0, treatment_1 = treatment_1, data = base)target <- "nonnatural_delivery"
treatment_name <- "black_mother"
x_m_name <- "birth_weight"
treatment_0 <- "No"
treatment_1 <- "Yes"
trans_N_blackm_bw <-
transport_univ_gaussian(x_0 = x_0_birth_weight, x_m_name = x_m_name,
treatment_name = treatment_name,
treatment_0 = treatment_0, treatment_1 = treatment_1, data = base)
# At more points
trans_N_blackm_bw_probs <-
transport_univ_gaussian(x_0 = x_0_probs_blackm_bw, x_m_name = x_m_name,
treatment_name = treatment_name,
treatment_0 = treatment_0, treatment_1 = treatment_1, data = base)target <- "nonnatural_delivery"
treatment_name <- "black_mother"
x_m_name <- "wtgain"
treatment_0 <- "No"
treatment_1 <- "Yes"
trans_N_blackm_wg <-
transport_univ_gaussian(x_0 = x_0_weight_gain, x_m_name = x_m_name,
treatment_name = treatment_name,
treatment_0 = treatment_0, treatment_1 = treatment_1, data = base)
# At more points
trans_N_blackm_wg_probs <-
transport_univ_gaussian(x_0 = x_0_probs_blackm_wg, x_m_name = x_m_name,
treatment_name = treatment_name,
treatment_0 = treatment_0, treatment_1 = treatment_1, data = base)target <- "nonnatural_delivery"
treatment_name <- "sex"
x_m_name <- "birth_weight"
treatment_0 <- "Male"
treatment_1 <- "Female"
trans_N_sex_bw <-
transport_univ_gaussian(x_0 = x_0_birth_weight, x_m_name = x_m_name,
treatment_name = treatment_name,
treatment_0 = treatment_0, treatment_1 = treatment_1, data = base)
# At more points
trans_N_sex_bw_probs <-
transport_univ_gaussian(x_0 = x_0_probs_sex_bw, x_m_name = x_m_name,
treatment_name = treatment_name,
treatment_0 = treatment_0, treatment_1 = treatment_1, data = base)target <- "nonnatural_delivery"
treatment_name <- "sex"
x_m_name <- "wtgain"
treatment_0 <- "Male"
treatment_1 <- "Female"
trans_N_sex_wg <-
transport_univ_gaussian(x_0 = x_0_weight_gain, x_m_name = x_m_name,
treatment_name = treatment_name,
treatment_0 = treatment_0, treatment_1 = treatment_1, data = base)
# At more points
trans_N_sex_wg_probs <-
transport_univ_gaussian(x_0 = x_0_probs_sex_wg, x_m_name = x_m_name,
treatment_name = treatment_name,
treatment_0 = treatment_0, treatment_1 = treatment_1, data = base)We can visualize the transported values from the control group to the treated group.
Show the R codes
plot_transport_univariate(x_0 = trans_N_smoker_bw_probs$x_0,
x_0_t = trans_N_smoker_bw_probs$x_t_N,
data = base,
treatment_name = "cig_rec",
treatment_0 = "No", treatment_1 = "Yes",
x_m_name = "birth_weight", distrib_plot_type = "density",
breaks = NULL, xlim = c(1800,4600),
x_c_label = "Weight of the baby",
treated_label = "Smoker mother", non_treated_label = "non-smoker mother")Show the R codes
plot_transport_univariate(x_0 = trans_N_smoker_wg_probs$x_0,
x_0_t = trans_N_smoker_wg_probs$x_t_N,
data = base,
treatment_name = "cig_rec",
treatment_0 = "No", treatment_1 = "Yes",
x_m_name = "wtgain", distrib_plot_type = "histogram",
breaks = seq(0,105,by=5), xlim = c(0,90),
x_c_label = "Weight of the baby",
treated_label = "Smoker mother", non_treated_label = "non-smoker mother")Show the R codes
plot_transport_univariate(x_0 = trans_N_blackm_bw_probs$x_0,
x_0_t = trans_N_blackm_bw_probs$x_t_N,
data = base,
treatment_name = "black_mother",
treatment_0 = "No", treatment_1 = "Yes",
x_m_name = "birth_weight", distrib_plot_type = "density",
breaks = NULL, xlim = c(1800,4600),
x_c_label = "Weight of the baby",
treated_label = "Black mother", non_treated_label = "non-Black mother")Show the R codes
plot_transport_univariate(x_0 = trans_N_blackm_wg_probs$x_0,
x_0_t = trans_N_blackm_wg_probs$x_t_N,
data = base,
treatment_name = "black_mother",
treatment_0 = "No", treatment_1 = "Yes",
x_m_name = "wtgain", distrib_plot_type = "histogram",
breaks = seq(0,105,by=5), xlim = c(0,90),
x_c_label = "Weight gain of the mother",
treated_label = "Black mother", non_treated_label = "non-Black mother")Show the R codes
plot_transport_univariate(x_0 = trans_N_sex_bw_probs$x_0,
x_0_t = trans_N_sex_bw_probs$x_t_N,
data = base,
treatment_name = "sex",
treatment_0 = "Male", treatment_1 = "Female",
x_m_name = "birth_weight", distrib_plot_type = "density",
breaks = NULL, xlim = c(1800,4600),
x_c_label = "Weight of the baby",
treated_label = "baby girl", non_treated_label = "baby boy")Show the R codes
plot_transport_univariate(x_0 = trans_N_sex_wg_probs$x_0,
x_0_t = trans_N_sex_wg_probs$x_t_N,
data = base,
treatment_name = "sex",
treatment_0 = "Male", treatment_1 = "Female",
x_m_name = "wtgain", distrib_plot_type = "histogram",
breaks = seq(0,105,by=5), xlim = c(0,90),
x_c_label = "Weight gain of the mother",
treated_label = "baby girl", non_treated_label = "baby boy")6 Estimation
Now that the two models \(\color{couleur1}{\widehat{m}_0(x)}\) and \(\color{couleur2}{\widehat{m}_1(x)}\) are defined, we can compute the Mutatis Mutandis Sample Conditional Average Effect.
#' Computes the Sample Average Treatment Effect with and without transport
#' @param x_0 vector of values at which to compute $\hat{m}_0(x_0)$, $\hat{m}_1(x_0)$
#' @param x_t vector of values at which to compute $\hat{m}_1(\mathcal{T}(x_t))$
#' @param x_m_name vector of names of the mediator variables
#' @param mod_0 model $\hat{m}_0$
#' @param mod_1 model $\hat{m}_1$
#' @param pred_mod_0 prediction function for model $\hat{m}_0(\cdot)$
#' @param pred_mod_1 prediction function for model $\hat{m}_1(\cdot)$
#' @param return_x if `TRUE` (default) the mediator variables are returned in the table, as well as their transported values
sate <- function(x_0, x_t, x_m_names, mod_0, mod_1, pred_mod_0, pred_mod_1, return_x = TRUE){
if(is.vector(x_0)){
# Univariate case
new_data <- tibble(!!x_m_names := x_0)
}else{
new_data <- x_0
}
if(is.vector(x_t)){
# Univariate case
new_data_t <- tibble(!!x_m_names := x_t)
}else{
new_data_t <- x_t
}
# $\hat{m}_0(x_0)$
y_0 <- pred_mod_0(object = mod_0, newdata = new_data)
# $\hat{m}_1(x_0)$
y_1 <- pred_mod_1(object = mod_1, newdata = new_data)
# $\hat{m}_1(\mathcal{T}(x_0))$
y_1_t <- pred_mod_1(object = mod_1, newdata = new_data_t)
scate_tab <-
tibble(y_0 = y_0, y_1 = y_1, y_1_t = y_1_t,
CATE = y_1-y_0, SCATE = y_1_t-y_0)
if(return_x){
new_data_t <- new_data_t %>% rename_all(~paste0(.x, "_t"))
scate_tab <- bind_cols(new_data, new_data_t, scate_tab)
}
scate_tab
}6.1 GAM, with Quantile-based transport
Let us compute here the SCATE where the models \(\color{couleur1}{\widehat{m}_0(x)}\) and \(\color{couleur2}{\widehat{m}_1(x)}\) are GAM and where the transport method is quantile-based.
treatment_name <- "cig_rec"
x_m_name <- "birth_weight"
treatment_0 <- "No"
treatment_1 <- "Yes"
# Only at some points
cate_q_gam_univ_smoker_bw <-
sate(x_0 = x_0_birth_weight,
x_t = trans_q_smoker_bw$x_t_quantile,
x_m_names = x_m_name,
mod_0 = reg_univ_smoker_bw$reg_0,
mod_1 = reg_univ_smoker_bw$reg_1,
pred_mod_0 = model_spline_predict, pred_mod_1 = model_spline_predict)
# At more values
cate_q_gam_univ_smoker_bw_2 <-
sate(x_0 = trans_q_smoker_bw_probs$x_0,
x_t = trans_q_smoker_bw_probs$x_t_quantile,
x_m_names = x_m_name,
mod_0 = reg_univ_smoker_bw$reg_0,
mod_1 = reg_univ_smoker_bw$reg_1,
pred_mod_0 = model_spline_predict, pred_mod_1 = model_spline_predict)treatment_name <- "cig_rec"
x_m_name <- "wtgain"
treatment_0 <- "No"
treatment_1 <- "Yes"
# Only at some points
cate_q_gam_univ_smoker_wg <-
sate(x_0 = x_0_weight_gain,
x_t = trans_q_smoker_wg$x_t_quantile,
x_m_names = x_m_name,
mod_0 = reg_univ_smoker_wg$reg_0,
mod_1 = reg_univ_smoker_wg$reg_1,
pred_mod_0 = model_spline_predict, pred_mod_1 = model_spline_predict)
# At more values
cate_q_gam_univ_smoker_wg_2 <-
sate(x_0 = trans_q_smoker_wg_probs$x_0,
x_t = trans_q_smoker_wg_probs$x_t_quantile,
x_m_names = x_m_name,
mod_0 = reg_univ_smoker_wg$reg_0,
mod_1 = reg_univ_smoker_wg$reg_1,
pred_mod_0 = model_spline_predict, pred_mod_1 = model_spline_predict)treatment_name <- "black_mother"
x_m_name <- "birth_weight"
treatment_0 <- "No"
treatment_1 <- "Yes"
# Only at some points
cate_q_gam_univ_blackm_bw <-
sate(x_0 = x_0_birth_weight,
x_t = trans_q_blackm_bw$x_t_quantile,
x_m_names = x_m_name,
mod_0 = reg_univ_blackm_bw$reg_0,
mod_1 = reg_univ_blackm_bw$reg_1,
pred_mod_0 = model_spline_predict, pred_mod_1 = model_spline_predict)
# At more values
cate_q_gam_univ_blackm_bw_2 <-
sate(x_0 = trans_q_blackm_bw_probs$x_0,
x_t = trans_q_blackm_bw_probs$x_t_quantile,
x_m_names = x_m_name,
mod_0 = reg_univ_blackm_bw$reg_0,
mod_1 = reg_univ_blackm_bw$reg_1,
pred_mod_0 = model_spline_predict, pred_mod_1 = model_spline_predict)treatment_name <- "black_mother"
x_m_name <- "wtgain"
treatment_0 <- "No"
treatment_1 <- "Yes"
# Only at some points
cate_q_gam_univ_blackm_wg <-
sate(x_0 = x_0_weight_gain,
x_t = trans_N_blackm_wg$x_t_N,
x_m_names = x_m_name,
mod_0 = reg_univ_blackm_wg$reg_0,
mod_1 = reg_univ_blackm_wg$reg_1,
pred_mod_0 = model_spline_predict, pred_mod_1 = model_spline_predict)
# At more values
cate_q_gam_univ_blackm_wg_2 <-
sate(x_0 = trans_q_blackm_wg_probs$x_0,
x_t = trans_q_blackm_wg_probs$x_t_quantile,
x_m_names = x_m_name,
mod_0 = reg_univ_blackm_wg$reg_0,
mod_1 = reg_univ_blackm_wg$reg_1,
pred_mod_0 = model_spline_predict, pred_mod_1 = model_spline_predict)treatment_name <- "sex"
x_m_name <- "birth_weight"
treatment_0 <- "Male"
treatment_1 <- "Female"
# Only at some points
cate_q_gam_univ_sex_bw <-
sate(x_0 = x_0_birth_weight,
x_t = trans_q_sex_bw$x_t_quantile,
x_m_names = x_m_name,
mod_0 = reg_univ_sex_bw$reg_0,
mod_1 = reg_univ_sex_bw$reg_1,
pred_mod_0 = model_spline_predict, pred_mod_1 = model_spline_predict)
# At more values
cate_q_gam_univ_sex_bw_2 <-
sate(x_0 = trans_q_sex_bw_probs$x_0,
x_t = trans_q_sex_bw_probs$x_t_quantile,
x_m_names = x_m_name,
mod_0 = reg_univ_sex_bw$reg_0,
mod_1 = reg_univ_sex_bw$reg_1,
pred_mod_0 = model_spline_predict, pred_mod_1 = model_spline_predict)treatment_name <- "sex"
x_m_name <- "wtgain"
treatment_0 <- "Male"
treatment_1 <- "Female"
# Only at some points
cate_q_gam_univ_sex_wg <-
sate(x_0 = x_0_weight_gain,
x_t = trans_N_sex_wg$x_t_N,
x_m_names = x_m_name,
mod_0 = reg_univ_sex_wg$reg_0,
mod_1 = reg_univ_sex_wg$reg_1,
pred_mod_0 = model_spline_predict, pred_mod_1 = model_spline_predict)
# At more values
cate_q_gam_univ_sex_wg_2 <-
sate(x_0 = trans_q_sex_wg_probs$x_0,
x_t = trans_q_sex_wg_probs$x_t_quantile,
x_m_names = x_m_name,
mod_0 = reg_univ_sex_wg$reg_0,
mod_1 = reg_univ_sex_wg$reg_1,
pred_mod_0 = model_spline_predict, pred_mod_1 = model_spline_predict)6.2 GAM, with Gaussian assumption for transport
Let us compute here the SCATE where the models \(\color{couleur1}{\widehat{m}_0(x)}\) and \(\color{couleur2}{\widehat{m}_1(x)}\) are GAM and where the transport method is made under the Gaussian assumption.
treatment_name <- "cig_rec"
x_m_name <- "birth_weight"
treatment_0 <- "No"
treatment_1 <- "Yes"
# Only at a few points
cate_N_gam_univ_smoker_bw <-
sate(x_0 = x_0_birth_weight,
x_t = trans_N_smoker_bw$x_t_N,
x_m_names = x_m_name,
mod_0 = reg_univ_smoker_bw$reg_0,
mod_1 = reg_univ_smoker_bw$reg_1,
pred_mod_0 = model_spline_predict, pred_mod_1 = model_spline_predict)
# At more values
cate_N_gam_univ_smoker_bw_2 <-
sate(x_0 = x_0_probs_smoker_bw,
x_t = trans_N_smoker_bw_probs$x_t_N,
x_m_names = x_m_name,
mod_0 = reg_univ_smoker_bw$reg_0,
mod_1 = reg_univ_smoker_bw$reg_1,
pred_mod_0 = model_spline_predict, pred_mod_1 = model_spline_predict)treatment_name <- "cig_rec"
x_m_name <- "wtgain"
treatment_0 <- "No"
treatment_1 <- "Yes"
cate_N_gam_univ_smoker_wg <-
sate(x_0 = x_0_weight_gain,
x_t = trans_N_smoker_wg$x_t_N,
x_m_names = x_m_name,
mod_0 = reg_univ_smoker_wg$reg_0,
mod_1 = reg_univ_smoker_wg$reg_1,
pred_mod_0 = model_spline_predict, pred_mod_1 = model_spline_predict)
# At more values
cate_N_gam_univ_smoker_wg_2 <-
sate(x_0 = x_0_probs_smoker_wg,
x_t = trans_N_smoker_wg_probs$x_t_N,
x_m_names = x_m_name,
mod_0 = reg_univ_smoker_wg$reg_0,
mod_1 = reg_univ_smoker_wg$reg_1,
pred_mod_0 = model_spline_predict, pred_mod_1 = model_spline_predict)treatment_name <- "black_mother"
x_m_name <- "birth_weight"
treatment_0 <- "No"
treatment_1 <- "Yes"
cate_N_gam_univ_blackm_bw <-
sate(x_0 = x_0_birth_weight,
x_t = trans_N_blackm_bw$x_t_N,
x_m_names = x_m_name,
mod_0 = reg_univ_blackm_bw$reg_0,
mod_1 = reg_univ_blackm_bw$reg_1,
pred_mod_0 = model_spline_predict, pred_mod_1 = model_spline_predict)
# At more values
cate_N_gam_univ_blackm_bw_2 <-
sate(x_0 = x_0_probs_blackm_bw,
x_t = trans_N_blackm_bw_probs$x_t_N,
x_m_names = x_m_name,
mod_0 = reg_univ_blackm_bw$reg_0,
mod_1 = reg_univ_blackm_bw$reg_1,
pred_mod_0 = model_spline_predict, pred_mod_1 = model_spline_predict)treatment_name <- "black_mother"
x_m_name <- "wtgain"
treatment_0 <- "No"
treatment_1 <- "Yes"
cate_N_gam_univ_blackm_wg <-
sate(x_0 = x_0_weight_gain,
x_t = trans_N_blackm_wg$x_t_N,
x_m_names = x_m_name,
mod_0 = reg_univ_blackm_wg$reg_0,
mod_1 = reg_univ_blackm_wg$reg_1,
pred_mod_0 = model_spline_predict, pred_mod_1 = model_spline_predict)
# At more values
cate_N_gam_univ_blackm_wg_2 <-
sate(x_0 = x_0_probs_blackm_wg,
x_t = trans_N_blackm_wg_probs$x_t_N,
x_m_names = x_m_name,
mod_0 = reg_univ_blackm_wg$reg_0,
mod_1 = reg_univ_blackm_wg$reg_1,
pred_mod_0 = model_spline_predict, pred_mod_1 = model_spline_predict)treatment_name <- "sex"
x_m_name <- "birth_weight"
treatment_0 <- "Male"
treatment_1 <- "Female"
cate_N_gam_univ_sex_bw <-
sate(x_0 = x_0_birth_weight,
x_t = trans_N_sex_bw$x_t_N,
x_m_names = x_m_name,
mod_0 = reg_univ_sex_bw$reg_0,
mod_1 = reg_univ_sex_bw$reg_1,
pred_mod_0 = model_spline_predict, pred_mod_1 = model_spline_predict)
# At more values
cate_N_gam_univ_sex_bw_2 <-
sate(x_0 = x_0_probs_sex_bw,
x_t = trans_N_sex_bw_probs$x_t_N,
x_m_names = x_m_name,
mod_0 = reg_univ_sex_bw$reg_0,
mod_1 = reg_univ_sex_bw$reg_1,
pred_mod_0 = model_spline_predict, pred_mod_1 = model_spline_predict)treatment_name <- "sex"
x_m_name <- "wtgain"
treatment_0 <- "Male"
treatment_1 <- "Female"
cate_N_gam_univ_sex_wg <-
sate(x_0 = x_0_weight_gain,
x_t = trans_N_sex_wg$x_t_N,
x_m_names = x_m_name,
mod_0 = reg_univ_sex_wg$reg_0,
mod_1 = reg_univ_sex_wg$reg_1,
pred_mod_0 = model_spline_predict, pred_mod_1 = model_spline_predict)
# At more values
cate_N_gam_univ_sex_wg_2 <-
sate(x_0 = x_0_probs_sex_wg,
x_t = trans_N_sex_wg_probs$x_t_N,
x_m_names = x_m_name,
mod_0 = reg_univ_sex_wg$reg_0,
mod_1 = reg_univ_sex_wg$reg_1,
pred_mod_0 = model_spline_predict, pred_mod_1 = model_spline_predict)6.3 Kernel, with Gaussian assumption for transport
target <- "nonnatural_delivery"
treatment_name <- "cig_rec"
x_m_name <- "birth_weight"
treatment_0 <- "No"
treatment_1 <- "Yes"
cate_N_kernel_univ_smoker_bw <-
sate(x_0 = x_0_birth_weight,
x_t = trans_N_smoker_bw$x_t_N,
x_m_names = x_m_name,
mod_0 = mod_0_k_smoker_bw,
mod_1 = mod_1_k_smoker_bw,
pred_mod_0 = pred_kernel, pred_mod_1 = pred_kernel)target <- "nonnatural_delivery"
treatment_name <- "cig_rec"
x_m_name <- "wtgain"
treatment_0 <- "No"
treatment_1 <- "Yes"
cate_N_kernel_univ_smoker_wg <-
sate(x_0 = x_0_weight_gain,
x_t = trans_N_smoker_wg$x_t_N,
x_m_names = x_m_name,
mod_0 = mod_0_k_smoker_wg,
mod_1 = mod_1_k_smoker_wg,
pred_mod_0 = pred_kernel, pred_mod_1 = pred_kernel)target <- "nonnatural_delivery"
treatment_name <- "black_mother"
x_m_name <- "birth_weight"
treatment_0 <- "No"
treatment_1 <- "Yes"
cate_N_kernel_univ_blackm_bw <-
sate(x_0 = x_0_birth_weight,
x_t = trans_N_blackm_bw$x_t_N,
x_m_names = x_m_name,
mod_0 = mod_0_k_blackm_bw,
mod_1 = mod_1_k_blackm_bw,
pred_mod_0 = pred_kernel, pred_mod_1 = pred_kernel)target <- "nonnatural_delivery"
treatment_name <- "black_mother"
x_m_name <- "wtgain"
treatment_0 <- "No"
treatment_1 <- "Yes"
cate_N_kernel_univ_blackm_wg <-
sate(x_0 = x_0_weight_gain,
x_t = trans_N_blackm_wg$x_t_N,
x_m_names = x_m_name,
mod_0 = mod_0_k_blackm_wg,
mod_1 = mod_1_k_blackm_wg,
pred_mod_0 = pred_kernel, pred_mod_1 = pred_kernel)target <- "nonnatural_delivery"
treatment_name <- "sex"
x_m_name <- "birth_weight"
treatment_0 <- "Male"
treatment_1 <- "Female"
cate_N_kernel_univ_sex_bw <-
sate(x_0 = x_0_birth_weight,
x_t = trans_N_sex_bw$x_t_N,
x_m_names = x_m_name,
mod_0 = mod_0_k_sex_bw,
mod_1 = mod_1_k_sex_bw,
pred_mod_0 = pred_kernel, pred_mod_1 = pred_kernel)target <- "nonnatural_delivery"
treatment_name <- "sex"
x_m_name <- "wtgain"
treatment_0 <- "Male"
treatment_1 <- "Female"
cate_N_kernel_univ_sex_wg <-
sate(x_0 = x_0_weight_gain,
x_t = trans_N_sex_wg$x_t_N,
x_m_names = x_m_name,
mod_0 = mod_0_k_sex_wg,
mod_1 = mod_1_k_sex_wg,
pred_mod_0 = pred_kernel, pred_mod_1 = pred_kernel)7 Results
7.1 Sample Conditional Average Treatment Effect at some values of \(\boldsymbol{x}\)
Let us wrap up the results in a single output. To do so, we define a small function (only useful in this notebook).
print_table <- function(x_m_name, x_label, gam_q, gam_n, kernel_n, u){
gam_q %>%
rename(x = !!x_m_name, x_t = !!paste0(x_m_name, "_t")) %>%
select(x, x_t_quantile = x_t, CATE, SCATE) %>%
left_join(
gam_n %>%
rename(x = !!x_m_name, x_t_N = !!paste0(x_m_name, "_t")) %>%
select(x, x_t_N, SCATE_N_gam = SCATE)
) %>%
left_join(
kernel_n %>%
rename(x = !!x_m_name) %>%
select(x, SCATE_N_kernel = SCATE)
) %>%
mutate(u = u) %>%
mutate(across(.cols = c(u, CATE, SCATE, SCATE_N_gam, SCATE_N_kernel), .fns = ~scales::percent(., accuracy = .01))) %>%
mutate(across(.cols = c(x, x_t_quantile, x_t_N), .fns = ~scales::number(.))) %>%
select(x, u, CATE, x_t_quantile, SCATE, x_t_N, SCATE_N_gam, SCATE_N_kernel) %>%
rename("\\(\\text{CATE}_0(x)\\) (GAM)" = "CATE",
"\\(\\widehat{\\mathcal{T}}(x)\\)" = "x_t_quantile",
"\\(\\text{SCATE}(x)\\) (GAM)" = "SCATE",
"\\(\\widehat{\\mathcal{T}}_{\\mathcal{N}}(x)\\)" = "x_t_N",
"\\(\\text{SCATE}_{\\mathcal{N}}(x)\\) (GAM)" = "SCATE_N_gam",
"\\(\\text{SCATE}_{\\mathcal{N}}(x)\\) (kernel)" = "SCATE_N_kernel") %>%
pivot_longer(cols = -x) %>%
pivot_wider(names_from = x, values_from = value) %>%
rename(!!sym(x_label) := "name") %>%
knitr::kable(format = "html", escape=FALSE) %>%
kableExtra::kable_classic(full_width = F, html_font = "Cambria")
}Show the codes used to gather the results in a table
x_m_name <- "birth_weight" ; x_label <- "\\(x\\) (newborn's weight)"
print_table(x_m_name = x_m_name, x_label = x_label,
gam_q = cate_q_gam_univ_smoker_bw,
gam_n = cate_N_gam_univ_smoker_bw,
kernel_n = cate_N_kernel_univ_smoker_bw,
u = trans_q_smoker_bw$u)| \(x\) (newborn's weight) | 2 000 | 2 500 | 3 000 | 3 500 | 4 000 | 4 500 |
|---|---|---|---|---|---|---|
| u | 2.88% | 7.90% | 26.15% | 65.18% | 92.06% | 98.91% |
| \(\text{CATE}_0(x)\) (GAM) | -4.41% | -2.55% | -0.69% | 0.79% | 1.97% | 2.50% |
| \(\widehat{\mathcal{T}}(x)\) | 1 800 | 2 301 | 2 815 | 3 340 | 3 850 | 4 338 |
| \(\text{SCATE}(x)\) (GAM) | -0.55% | 0.78% | 1.01% | 0.49% | -0.77% | -4.36% |
| \(\widehat{\mathcal{T}}_{\mathcal{N}}(x)\) | 1 786 | 2 295 | 2 805 | 3 314 | 3 824 | 4 333 |
| \(\text{SCATE}_{\mathcal{N}}(x)\) (GAM) | -0.28% | 0.88% | 1.12% | 0.50% | -1.15% | -4.53% |
| \(\text{SCATE}_{\mathcal{N}}(x)\) (kernel) | -0.80% | 0.24% | 1.72% | 0.15% | -1.75% | -2.78% |
Show the codes used to gather the results in a table
x_m_name <- "wtgain" ; x_label <- "\\(x\\) (weight gain of the mother)"
print_table(x_m_name = x_m_name, x_label = x_label,
gam_q = cate_q_gam_univ_smoker_wg,
gam_n = cate_N_gam_univ_smoker_wg,
kernel_n = cate_N_kernel_univ_smoker_wg,
u = trans_q_smoker_wg$u)| \(x\) (weight gain of the mother) | 5 | 15 | 25 | 35 | 45 | 55 |
|---|---|---|---|---|---|---|
| u | 4.87% | 14.95% | 37.78% | 66.91% | 86.19% | 94.80% |
| \(\text{CATE}_0(x)\) (GAM) | 0.06% | 0.84% | 0.69% | -0.08% | -1.26% | -2.59% |
| \(\widehat{\mathcal{T}}(x)\) | 1 | 13 | 25 | 37 | 49 | 60 |
| \(\text{SCATE}(x)\) (GAM) | 1.60% | 1.21% | 0.69% | 0.14% | -0.38% | -1.08% |
| \(\widehat{\mathcal{T}}_{\mathcal{N}}(x)\) | 1 | 13 | 24 | 36 | 48 | 59 |
| \(\text{SCATE}_{\mathcal{N}}(x)\) (GAM) | 1.63% | 1.29% | 0.71% | 0.01% | -0.71% | -1.33% |
| \(\text{SCATE}_{\mathcal{N}}(x)\) (kernel) | 0.35% | 0.32% | 0.29% | 0.27% | 0.25% | 0.25% |
Show the codes used to gather the results in a table
x_m_name <- "birth_weight" ; x_label <- "\\(x\\) (newborn's weight)"
print_table(x_m_name = x_m_name, x_label = x_label,
gam_q = cate_q_gam_univ_blackm_bw,
gam_n = cate_N_gam_univ_blackm_bw,
kernel_n = cate_N_kernel_univ_blackm_bw,
u = trans_q_blackm_bw$u)| \(x\) (newborn's weight) | 2 000 | 2 500 | 3 000 | 3 500 | 4 000 | 4 500 |
|---|---|---|---|---|---|---|
| u | 2.88% | 7.90% | 26.15% | 65.18% | 92.06% | 98.91% |
| \(\text{CATE}_0(x)\) (GAM) | -1.33% | 0.37% | 1.89% | 4.09% | 8.46% | 15.18% |
| \(\widehat{\mathcal{T}}(x)\) | 1 550 | 2 268 | 2 835 | 3 345 | 3 856 | 4 366 |
| \(\text{SCATE}(x)\) (GAM) | 6.65% | 4.53% | 3.41% | 3.38% | 4.64% | 7.95% |
| \(\widehat{\mathcal{T}}_{\mathcal{N}}(x)\) | 1 652 | 2 205 | 2 758 | 3 311 | 3 864 | 4 417 |
| \(\text{SCATE}_{\mathcal{N}}(x)\) (GAM) | 5.03% | 5.73% | 4.31% | 3.33% | 4.82% | 10.63% |
| \(\text{SCATE}_{\mathcal{N}}(x)\) (kernel) | 7.34% | 7.07% | 3.53% | 3.62% | 4.77% | 7.70% |
Show the codes used to gather the results in a table
x_m_name <- "wtgain" ; x_label <- "\\(x\\) (weight gain of the mother)"
print_table(x_m_name = x_m_name, x_label = x_label,
gam_q = cate_q_gam_univ_blackm_wg,
gam_n = cate_N_gam_univ_blackm_wg,
kernel_n = cate_N_kernel_univ_blackm_wg,
u = trans_q_blackm_wg$u)| \(x\) (weight gain of the mother) | 5 | 15 | 25 | 35 | 45 | 55 |
|---|---|---|---|---|---|---|
| u | 4.87% | 14.95% | 37.78% | 66.91% | 86.19% | 94.80% |
| \(\text{CATE}_0(x)\) (GAM) | 2.10% | 3.72% | 4.34% | 4.24% | 3.58% | 2.52% |
| \(\widehat{\mathcal{T}}(x)\) | 0 | 11 | 23 | 34 | 45 | 57 |
| \(\text{SCATE}(x)\) (GAM) | 3.73% | 4.17% | 4.26% | 4.08% | 3.66% | 3.02% |
| \(\widehat{\mathcal{T}}_{\mathcal{N}}(x)\) | 0 | 11 | 23 | 34 | 45 | 57 |
| \(\text{SCATE}_{\mathcal{N}}(x)\) (GAM) | 3.73% | 4.17% | 4.26% | 4.08% | 3.66% | 3.02% |
| \(\text{SCATE}_{\mathcal{N}}(x)\) (kernel) | 3.87% | 3.88% | 3.89% | 3.90% | 3.92% | 3.94% |
Show the codes used to gather the results in a table
x_m_name <- "birth_weight" ; x_label <- "\\(x\\) (newborn's weight)"
print_table(x_m_name = x_m_name, x_label = x_label,
gam_q = cate_q_gam_univ_sex_bw,
gam_n = cate_N_gam_univ_sex_bw,
kernel_n = cate_N_kernel_univ_sex_bw,
u = trans_q_sex_bw$u)| \(x\) (newborn's weight) | 2 000 | 2 500 | 3 000 | 3 500 | 4 000 | 4 500 |
|---|---|---|---|---|---|---|
| u | 2.88% | 7.90% | 26.15% | 65.18% | 92.06% | 98.91% |
| \(\text{CATE}_0(x)\) (GAM) | -0.24% | -1.66% | -2.24% | -2.00% | -0.77% | 2.38% |
| \(\widehat{\mathcal{T}}(x)\) | 1 984 | 2 466 | 2 948 | 3 430 | 3 920 | 4 403 |
| \(\text{SCATE}(x)\) (GAM) | 0.14% | -0.97% | -1.66% | -2.08% | -2.34% | -2.32% |
| \(\widehat{\mathcal{T}}_{\mathcal{N}}(x)\) | 1 947 | 2 424 | 2 901 | 3 377 | 3 854 | 4 331 |
| \(\text{SCATE}_{\mathcal{N}}(x)\) (GAM) | 1.02% | -0.08% | -1.07% | -2.05% | -3.41% | -5.43% |
| \(\text{SCATE}_{\mathcal{N}}(x)\) (kernel) | 2.06% | -0.27% | -1.02% | -2.30% | -3.53% | -5.10% |
Show the codes used to gather the results in a table
x_m_name <- "wtgain" ; x_label <- "\\(x\\) (weight gain of the mother)"
print_table(x_m_name = x_m_name, x_label = x_label,
gam_q = cate_q_gam_univ_sex_wg,
gam_n = cate_N_gam_univ_sex_wg,
kernel_n = cate_N_kernel_univ_sex_wg,
u = trans_q_sex_wg$u)| \(x\) (weight gain of the mother) | 5 | 15 | 25 | 35 | 45 | 55 |
|---|---|---|---|---|---|---|
| u | 4.87% | 14.95% | 37.78% | 66.91% | 86.19% | 94.80% |
| \(\text{CATE}_0(x)\) (GAM) | -1.79% | -1.60% | -1.60% | -1.73% | -1.90% | -2.02% |
| \(\widehat{\mathcal{T}}(x)\) | 4.4 | 14.3 | 24.3 | 34.2 | 44.1 | 54.0 |
| \(\text{SCATE}(x)\) (GAM) | -1.52% | -1.47% | -1.61% | -1.87% | -2.17% | -2.41% |
| \(\widehat{\mathcal{T}}_{\mathcal{N}}(x)\) | 4.4 | 14.3 | 24.3 | 34.2 | 44.1 | 54.0 |
| \(\text{SCATE}_{\mathcal{N}}(x)\) (GAM) | -1.52% | -1.47% | -1.61% | -1.87% | -2.17% | -2.41% |
| \(\text{SCATE}_{\mathcal{N}}(x)\) (kernel) | -1.77% | -1.78% | -1.80% | -1.81% | -1.83% | -1.84% |
7.2 Sample Conditional Average Treatment Effect at more values of \(\boldsymbol{x}\)
7.2.1 GAM, with Quantile-based transport
Now, we can visualize how the predicted probability of the response variable evolves at different values of the mediator variable, depending on the treatment variable and whether or not the mediator’s values are transported. Let us first define a function that can be applied depending on the treatment and on the mediator variable.
# Plot the evolution of the SCATE
plot_evolution <- function(cate, trans_q_x0, trans_q_xt, x_m_name,
title, xlab, ylab,
lab_0, lab_1, lab_1_t, base,
treatment_name, treatment_0, treatment_1,
legend_x, legend_y, xlim, ylim){
x_0 <- pull(base, x_m_name)[pull(base, treatment_name) == treatment_0]
x_1 <- pull(base, x_m_name)[pull(base, treatment_name) == treatment_1]
# Estimated probability of the target, for non-treated
plot(pull(cate, x_m_name),
cate$y_0, type="l",
col = couleur1, main = "",
ylab = title, xlab = "",
axes = FALSE, ylim = ylim, xlim = xlim)
axis(1)
axis(2)
axis(3, at = trans_q_xt, label = trans_q_x0)
mtext(xlab,side=3,line=3)
mtext(ylab,side=1,line=3)
b <- .6
# Empirical quantiles of mediator among non-treated
a <- quantile(x_0, c(.05,(1:9)/10,.95))
# Legend for the quantile probabilities
L <- paste(c(5,(1:9)*10,95),"%",sep="")
segments(a,b,a,b+1)
i = c(1,2,3,5,7,9,10,11)
text(a[i],b,labels = L[i],pos=1)
# Estimated probability of y=1, for treated, without transport
lines(pull(cate, x_m_name), cate$y_1, col = scales::alpha(couleur2,.2))
# Estimated probability of y=1, for treated, with transport
lines(pull(cate, x_m_name), cate$y_1_t, col = couleur2)
legend(legend_x,legend_y, c(lab_0, lab_1, lab_1_t),
lty = 1, col =c (couleur1,couleur2, scales::alpha(couleur2,.2)),bty="n")
}Show the codes used to plot the results
plot_evolution(
cate = cate_q_gam_univ_smoker_bw_2,
trans_q_x0 = trans_q_smoker_bw$x_0,
trans_q_xt = trans_q_smoker_bw$x_t_quantile,
x_m_name = "birth_weight",
title = "Probability of non-natural birth",
xlab = "Weight of the baby (smoker mother)",
ylab = "Weight of the baby (non-smoker mother)",
lab_0 = "Non-Smoker",
lab_1 = "Smoker with transport (Quantile-based)",
lab_1_t = "Smoker (original)",
base = base,
treatment_name = "cig_rec",
treatment_0 = "No",
treatment_1 = "Yes",
legend_x = 1800,
legend_y = .4-0.01,
xlim = c(1800,4600),
ylim = c(.3,.6)
)Show the codes used to plot the results
plot_evolution(
cate = cate_q_gam_univ_smoker_wg_2,
trans_q_x0 = trans_q_smoker_wg$x_0,
trans_q_xt = trans_q_smoker_wg$x_t_quantile,
x_m_name = "wtgain",
title = "Probability of non-natural birth",
xlab = "Weight gain of the mother (smoker mother)",
ylab = "Weight gain of the mother (non-smoker mother)",
lab_0 = "Non-Smoker",
lab_1 = "Smoker with transport (Quantile-based)",
lab_1_t = "Smoker (original)",
base = base,
treatment_name = "cig_rec",
treatment_0 = "No",
treatment_1 = "Yes",
legend_x = -5,
legend_y = .55-0.01,
xlim = c(0,90),
ylim = c(.3,.6)
)Show the codes used to plot the results
plot_evolution(
cate = cate_q_gam_univ_blackm_bw_2,
trans_q_x0 = trans_q_blackm_bw$x_0,
trans_q_xt = trans_q_blackm_bw$x_t_quantile,
x_m_name = "birth_weight",
title = "Probability of non-natural birth",
xlab = "Weight of the baby (Black mother)",
ylab = "Weight of the baby (non-Black mother)",
lab_0 = "Non-Black",
lab_1 = "Black with transport (Quantile-based)",
lab_1_t = "Black (original)",
base = base,
treatment_name = "cig_rec",
treatment_0 = "No",
treatment_1 = "Yes",
legend_x = 1800,
legend_y = .4-0.01,
xlim = c(1800,4600),
ylim = c(.3,.6)
)Show the codes used to plot the results
plot_evolution(
cate = cate_q_gam_univ_blackm_wg_2,
trans_q_x0 = trans_q_blackm_wg$x_0,
trans_q_xt = trans_q_blackm_wg$x_t_quantile,
x_m_name = "wtgain",
title = "Probability of non-natural birth",
xlab = "Weight gain of the mother (Black mother)",
ylab = "Weight gain of the mother (non-Black mother)",
lab_0 = "Non-Black",
lab_1 = "Black with transport (Quantile-based)",
lab_1_t = "Black (original)",
base = base,
treatment_name = "cig_rec",
treatment_0 = "No",
treatment_1 = "Yes",
legend_x = -5,
legend_y = .55-0.01,
xlim = c(0,90),
ylim = c(.3,.6)
)Show the codes used to plot the results
plot_evolution(
cate = cate_q_gam_univ_sex_bw_2,
trans_q_x0 = trans_q_sex_bw$x_0,
trans_q_xt = trans_q_sex_bw$x_t_quantile,
x_m_name = "birth_weight",
title = "Probability of non-natural birth",
xlab = "Weight of the baby (baby girl)",
ylab = "Weight of the baby (baby boy)",
lab_0 = "Baby boy",
lab_1 = "Baby Girl with transport (Quantile-based)",
lab_1_t = "Baby Girl (original)",
base = base,
treatment_name = "cig_rec",
treatment_0 = "No",
treatment_1 = "Yes",
legend_x = 1800,
legend_y = .4-0.01,
xlim = c(1800,4600),
ylim = c(.3,.6)
)Show the codes used to plot the results
plot_evolution(
cate = cate_q_gam_univ_sex_wg_2,
trans_q_x0 = trans_q_sex_wg$x_0,
trans_q_xt = trans_q_sex_wg$x_t_quantile,
x_m_name = "wtgain",
title = "Probability of non-natural birth",
xlab = "Weight gain of the mother (baby girl)",
ylab = "Weight gain of the mother (baby boy)",
lab_0 = "Baby boy",
lab_1 = "Baby girl with transport (Quantile-based)",
lab_1_t = "Baby girl (original)",
base = base,
treatment_name = "cig_rec",
treatment_0 = "No",
treatment_1 = "Yes",
legend_x = -5,
legend_y = .55-0.01,
xlim = c(0,90),
ylim = c(.3,.6)
)Let us now visualize \(x\mapsto \text{CATE}(x)=\widehat{m}_1\big(\widehat{\mathcal{T}}(x)\big) - \widehat{m}_0\big(x\big)\) as a function of \(x\). Let us also plot a light curve showing \(\widehat{m}_1\big(x\big) - \widehat{m}_0\big(x\big)\).
# Plot the evolution of the SCATE
plot_scate_evol <- function(cate, trans_q_x0, trans_q_xt, x_m_name, treatment_name, treatment_0,
xlab_0, xlab_1, ylab, xlim, ylim, legend_x, legend_y, legend_lab_1, b = .06){
x_0 <- pull(base, x_m_name)[pull(base, treatment_name) == treatment_0]
# CATE with transport
plot(pull(cate, x_m_name), cate$SCATE, type = "l", lwd=2,
col = couleur3, main = "",
ylab = "Probability difference of non-natural birth",
xlab = "", axes = FALSE, ylim = ylim, xlim = xlim)
abline(h=0)
# Original CATE (without transport)
lines(pull(cate, x_m_name), cate$CATE, lwd=1, col = scales::alpha(couleur3,.25))
axis(1)
axis(2)
axis(3, at = trans_q_xt, label = trans_q_x0)
mtext(xlab_1,side=3,line=3)
mtext(xlab_0,side=1,line=3)
# Empirical quantiles of mediator among non-treated
a <- quantile(x_0, c(.05,(1:9)/10,.95))
# Legend for the quantile probabilities
L <- paste(c(5,(1:9)*10,95),"%",sep="")
segments(a,b,a,b+1)
i = c(1,2,3,5,7,9,10,11)
text(a[i],b,labels = L[i],pos=1)
legend(legend_x, legend_y, c(legend_lab_1,
"CATE (original)"),
lty = 1, lwd = c(2,1), col = scales::alpha(couleur3, c(1,.25)), bty = "n")
}Show the codes used to plot the results
plot_scate_evol(cate = cate_q_gam_univ_smoker_bw_2,
trans_q_x0 = trans_q_smoker_bw$x_0,
trans_q_xt = trans_q_smoker_bw$x_t_quantile,
x_m_name = "birth_weight",
treatment_name = "cig_rec",
treatment_0 = "No",
xlab_0 = "Weight of the baby (non-smoker mother)",
xlab_1 = "Weight of the baby (smoker mother)",
ylab = "Probability difference of non-natural birth",
xlim = c(1800,4600), ylim = c(-.06,.06),
legend_x = 1800-100, legend_y = .05,
legend_lab_1 = "SCATE with quantile-based transport")Show the codes used to plot the results
plot_scate_evol(cate = cate_q_gam_univ_smoker_wg_2,
trans_q_x0 = trans_q_smoker_wg$x_0,
trans_q_xt = trans_q_smoker_wg$x_t_quantile,
x_m_name = "wtgain",
treatment_name = "cig_rec",
treatment_0 = "No",
xlab_0 = "Weight gain of the mother (non-smoker mother)",
xlab_1 = "Weight gain of the mother (smoker mother)",
ylab = "Probability difference of non-natural birth",
xlim = c(0,90),
ylim = c(-.06,.06),
legend_x = -5, legend_y = -.03,
legend_lab_1 = "SCATE with quantile-based transport")Show the codes used to plot the results
plot_scate_evol(cate = cate_q_gam_univ_blackm_bw_2,
trans_q_x0 = trans_q_blackm_bw$x_0,
trans_q_xt = trans_q_blackm_bw$x_t_quantile,
x_m_name = "birth_weight",
treatment_name = "cig_rec",
treatment_0 = "No",
xlab_0 = "Weight of the baby (non-Black mother)",
xlab_1 = "Weight of the baby (Black mother)",
ylab = "Probability difference of non-natural birth",
xlim = c(1800,4600), ylim = c(-.1,.1),
legend_x = 1800-100, legend_y = -.02,
b = .1,
legend_lab_1 = "SCATE with quantile-based transport")Show the codes used to plot the results
plot_scate_evol(cate = cate_q_gam_univ_blackm_wg_2,
trans_q_x0 = trans_q_blackm_wg$x_0,
trans_q_xt = trans_q_blackm_wg$x_t_quantile,
x_m_name = "wtgain",
treatment_name = "cig_rec",
treatment_0 = "No",
xlab_0 = "Weight gain of the mother (non-Black mother)",
xlab_1 = "Weight gain of the mother (Black mother)",
ylab = "Probability difference of non-natural birth",
xlim = c(0,90),
ylim = c(-.06,.06),
legend_x = -5, legend_y = -.03,
b = .06,
legend_lab_1 = "SCATE with quantile-based transport")Show the codes used to plot the results
plot_scate_evol(cate = cate_q_gam_univ_sex_bw_2,
trans_q_x0 = trans_q_sex_bw$x_0,
trans_q_xt = trans_q_sex_bw$x_t_quantile,
x_m_name = "birth_weight",
treatment_name = "cig_rec",
treatment_0 = "No",
xlab_0 = "Weight of the baby (baby boy)",
xlab_1 = "Weight of the baby (baby girl)",
ylab = "Probability difference of non-natural birth",
xlim = c(1800,4600), ylim = c(-.06,.06),
legend_x = 1800-100, legend_y = -.02,
b = .06,
legend_lab_1 = "SCATE with quantile-based transport")Show the codes used to plot the results
plot_scate_evol(cate = cate_q_gam_univ_sex_wg_2,
trans_q_x0 = trans_q_sex_wg$x_0,
trans_q_xt = trans_q_sex_wg$x_t_quantile,
x_m_name = "wtgain",
treatment_name = "cig_rec",
treatment_0 = "No",
xlab_0 = "Weight gain of the mother (baby boy)",
xlab_1 = "Weight gain of the mother (baby girl)",
ylab = "Probability difference of non-natural birth",
xlim = c(0,90),
ylim = c(-.06,.06),
legend_x = -5, legend_y = -.03,
b = .06,
legend_lab_1 = "SCATE with quantile-based transport")7.2.2 GAM, with Gaussian assumption for transport
Let us turn to the Gaussian assumption for transport.
Show the codes used to plot the results
plot_evolution(
cate = cate_N_gam_univ_smoker_bw_2,
trans_q_x0 = trans_N_smoker_bw$x_0,
trans_q_xt = trans_N_smoker_bw$x_t_N,
x_m_name = "birth_weight",
title = "Probability of non-natural birth",
xlab = "Weight of the baby (smoker mother)",
ylab = "Weight of the baby (non-smoker mother)",
lab_0 = "Non-Smoker",
lab_1 = "Smoker with transport (Gaussian assumption)",
lab_1_t = "Smoker (original)",
base = base,
treatment_name = "cig_rec",
treatment_0 = "No",
treatment_1 = "Yes",
legend_x = 1800,
legend_y = .4-0.01,
xlim = c(1800,4600),
ylim = c(.3,.6)
)Show the codes used to plot the results
plot_evolution(
cate = cate_N_gam_univ_smoker_wg_2,
trans_q_x0 = trans_N_smoker_wg$x_0,
trans_q_xt = trans_N_smoker_wg$x_t_N,
x_m_name = "wtgain",
title = "Probability of non-natural birth",
xlab = "Weight gain of the mother (smoker mother)",
ylab = "Weight gain of the mother (non-smoker mother)",
lab_0 = "Non-Smoker",
lab_1 = "Smoker with transport (Gaussian assumption)",
lab_1_t = "Smoker (original)",
base = base,
treatment_name = "cig_rec",
treatment_0 = "No",
treatment_1 = "Yes",
legend_x = -5,
legend_y = .55-0.01,
xlim = c(0,90),
ylim = c(.3,.6)
)Show the codes used to plot the results
plot_evolution(
cate = cate_N_gam_univ_blackm_bw_2,
trans_q_x0 = trans_N_blackm_bw$x_0,
trans_q_xt = trans_N_blackm_bw$x_t_N,
x_m_name = "birth_weight",
title = "Probability of non-natural birth",
xlab = "Weight of the baby (Black mother)",
ylab = "Weight of the baby (non-Black mother)",
lab_0 = "Non-Black",
lab_1 = "Black with transport (Gaussian assumption)",
lab_1_t = "Black (original)",
base = base,
treatment_name = "cig_rec",
treatment_0 = "No",
treatment_1 = "Yes",
legend_x = 1800,
legend_y = .4-0.01,
xlim = c(1800,4600),
ylim = c(.3,.6)
)Show the codes used to plot the results
plot_evolution(
cate = cate_N_gam_univ_blackm_wg_2,
trans_q_x0 = trans_N_blackm_wg$x_0,
trans_q_xt = trans_N_blackm_wg$x_t_N,
x_m_name = "wtgain",
title = "Probability of non-natural birth",
xlab = "Weight gain of the mother (Black mother)",
ylab = "Weight gain of the mother (non-Black mother)",
lab_0 = "Non-Black",
lab_1 = "Black with transport (Gaussian assumption)",
lab_1_t = "Black (original)",
base = base,
treatment_name = "cig_rec",
treatment_0 = "No",
treatment_1 = "Yes",
legend_x = -5,
legend_y = .55-0.01,
xlim = c(0,90),
ylim = c(.3,.6)
)Show the codes used to plot the results
plot_evolution(
cate = cate_N_gam_univ_sex_bw_2,
trans_q_x0 = trans_N_sex_bw$x_0,
trans_q_xt = trans_N_sex_bw$x_t_N,
x_m_name = "birth_weight",
title = "Probability of non-natural birth",
xlab = "Weight of the baby (baby girl)",
ylab = "Weight of the baby (baby boy)",
lab_0 = "Baby boy",
lab_1 = "Baby Girl with transport (Gaussian assumption)",
lab_1_t = "Baby Girl (original)",
base = base,
treatment_name = "cig_rec",
treatment_0 = "No",
treatment_1 = "Yes",
legend_x = 1800,
legend_y = .4-0.01,
xlim = c(1800,4600),
ylim = c(.3,.6)
)Show the codes used to plot the results
plot_evolution(
cate = cate_N_gam_univ_sex_wg_2,
trans_q_x0 = trans_N_sex_wg$x_0,
trans_q_xt = trans_N_sex_wg$x_t_N,
x_m_name = "wtgain",
title = "Probability of non-natural birth",
xlab = "Weight gain of the mother (baby girl)",
ylab = "Weight gain of the mother (baby girl)",
lab_0 = "Baby boy",
lab_1 = "Baby girl with transport (Gaussian assumption)",
lab_1_t = "Baby girl (original)",
base = base,
treatment_name = "cig_rec",
treatment_0 = "No",
treatment_1 = "Yes",
legend_x = -5,
legend_y = .55-0.01,
xlim = c(0,90),
ylim = c(.3,.6)
)Let us now visualize \(x\mapsto \text{CATE}(x)=\widehat{m}_1\big(\widehat{\mathcal{T}}(x)\big) - \widehat{m}_0\big(x\big)\) as a function of \(x\). Let us also plot a light curve showing \(\widehat{m}_1\big(x\big) - \widehat{m}_0\big(x\big)\).
Show the codes used to plot the results
plot_scate_evol(cate = cate_N_gam_univ_smoker_bw_2,
trans_q_x0 = trans_N_smoker_bw$x_0,
trans_q_xt = trans_N_smoker_bw$x_t_N,
x_m_name = "birth_weight",
treatment_name = "cig_rec",
treatment_0 = "No",
xlab_0 = "Weight of the baby (non-smoker mother)",
xlab_1 = "Weight of the baby (smoker mother)",
ylab = "Probability difference of non-natural birth",
xlim = c(1800,4600), ylim = c(-.06,.06),
legend_x = 1800-100, legend_y = .05,
legend_lab_1 = "SCATE with Gaussian assumption for transport")Show the codes used to plot the results
plot_scate_evol(cate = cate_N_gam_univ_smoker_wg_2,
trans_q_x0 = trans_N_smoker_wg$x_0,
trans_q_xt = trans_N_smoker_wg$x_t_N,
x_m_name = "wtgain",
treatment_name = "cig_rec",
treatment_0 = "No",
xlab_0 = "Weight gain of the mother (non-smoker mother)",
xlab_1 = "Weight gain of the mother (smoker mother)",
ylab = "Probability difference of non-natural birth",
xlim = c(0,90),
ylim = c(-.06,.06),
legend_x = -5, legend_y = -.03,
legend_lab_1 = "SCATE with Gaussian assumption for transport")Show the codes used to plot the results
plot_scate_evol(cate = cate_N_gam_univ_blackm_bw_2,
trans_q_x0 = trans_N_blackm_bw$x_0,
trans_q_xt = trans_N_blackm_bw$x_t_N,
x_m_name = "birth_weight",
treatment_name = "cig_rec",
treatment_0 = "No",
xlab_0 = "Weight of the baby (non-Black mother)",
xlab_1 = "Weight of the baby (Black mother)",
ylab = "Probability difference of non-natural birth",
xlim = c(1800,4600), ylim = c(-.1,.1),
legend_x = 1800-100, legend_y = -.02,
b = .1,
legend_lab_1 = "SCATE with Gaussian assumption for transport")Show the codes used to plot the results
plot_scate_evol(cate = cate_N_gam_univ_blackm_wg_2,
trans_q_x0 = trans_N_blackm_wg$x_0,
trans_q_xt = trans_N_blackm_wg$x_t_N,
x_m_name = "wtgain",
treatment_name = "cig_rec",
treatment_0 = "No",
xlab_0 = "Weight gain of the mother (non-Black mother)",
xlab_1 = "Weight gain of the mother (Black mother)",
ylab = "Probability difference of non-natural birth",
xlim = c(0,90),
ylim = c(-.06,.06),
legend_x = -5, legend_y = -.03,
b = .06,
legend_lab_1 = "SCATE with Gaussian assumption for transport")Show the codes used to plot the results
plot_scate_evol(cate = cate_N_gam_univ_sex_bw_2,
trans_q_x0 = trans_N_sex_bw$x_0,
trans_q_xt = trans_N_sex_bw$x_t_N,
x_m_name = "birth_weight",
treatment_name = "cig_rec",
treatment_0 = "No",
xlab_0 = "Weight of the baby (baby boy)",
xlab_1 = "Weight of the baby (baby girl)",
ylab = "Probability difference of non-natural birth",
xlim = c(1800,4600), ylim = c(-.06,.06),
legend_x = 1800-100, legend_y = -.02,
b = .06,
legend_lab_1 = "SCATE with Gaussian assumption for transport")Show the codes used to plot the results
plot_scate_evol(cate = cate_N_gam_univ_sex_wg_2,
trans_q_x0 = trans_N_sex_wg$x_0,
trans_q_xt = trans_N_sex_wg$x_t_N,
x_m_name = "wtgain",
treatment_name = "cig_rec",
treatment_0 = "No",
xlab_0 = "Weight gain of the mother (baby boy)",
xlab_1 = "Weight gain of the mother (baby girl)",
ylab = "Probability difference of non-natural birth",
xlim = c(0,90),
ylim = c(-.06,.06),
legend_x = -5, legend_y = -.03,
b = .06,
legend_lab_1 = "SCATE with Gaussian assumption for transport")