Title: | Detect Subgroup with an Enhanced Treatment Effect |
---|---|
Description: | A test for the existence of a subgroup with an enhanced treatment effect, and a sample size calculation procedure for the subgroup detection test. |
Authors: | Ailin Fan and Shannon T. Holloway |
Maintainer: | Shannon T. Holloway <[email protected]> |
License: | GPL-3 |
Version: | 1.2 |
Built: | 2024-11-01 11:27:02 UTC |
Source: | https://github.com/cran/subdetect |
The package provides a test for the existence of a subgroup with an enhanced treatment effect and, if such a subgroup exists, an estimate of it. It also provides a sample size calculation procedure for this subgroup detection test.
subgroup_detect
: test for the existence of a subgroup and estimation of the subgroup.
sample_size
: estimate of the required sample size for the subgroup detection test.
Ailin Fan and Shannon T. Holloway
Maintainer: Shannon T. Holloway [email protected]
Ailin Fan, Rui Song, and Wenbin Lu (2016). Change-plane analysis for subgroup detection and sample size calculation. Journal of the American Statistical Association, in press.
Estimates the sample size required for the test of subgroup existence. Given a pre-specified significance level of the test (alpha), a desired power (power) at a treatment effect (tau), and other information about the data, the function estimates the sample size that achieves the desired power.
sample_size(outcome, theta0, sigma2, tau, N = 1000L, prob = 0.5, alpha = 0.05, power = 0.9, K = 1000L, M = 1000L, seed = NULL, precision = 0.01, ...)
outcome |
A formula object. The model of the indicator function. The model must include an intercept. Any lhs variable will be ignored. |
theta0 |
A named numeric vector. The true parameters of the indicator model. |
sigma2 |
A numeric object. The variance of the random error. |
tau |
A numeric object. The desired treatment effect. |
N |
An integer object. The number of random samples to draw. Default value is 1000. |
prob |
A numeric object. The probability of assigning individuals treatment 1. Default value is 0.5. |
alpha |
A numeric object. The significance level of the test. Default value is 0.05. |
power |
A numeric object. The desired power of the test. Default value is 0.9. |
K |
An integer object. The number of randomly sampled points on the surface of the unit ball. Default value is 1000. |
M |
An integer object. The number of resamplings of the perturbed test statistic. This sample is used to calculate the critical value of the test. Default and minimum values are 1000. |
seed |
An integer object or NULL. If an integer, the seed for random number generation, set at the onset of the calculation. If NULL, the current seed in the R environment is used. |
precision |
A numeric object. The precision tolerance for estimating the power from the calculated sample size. Specifically, the power of the sample size returned, P, will be (power - precision) < P < (power + precision). |
... |
For each covariate in outcome, the user must provide a list object indicating the distribution function to be sampled and any parameters to be set when calling that function. Each list contains "FUN", the function name of the random generator of the distribution, and any formal arguments of that function. For example: x1 = list("FUN" = rnorm, sd = 2.0, mean = 10.0) and x2 = list("FUN" = rbinom, size = 1, prob = 0.5). The number of points generated is determined by input N. Most distributions available through R's stats package can be used; the exceptions are rhyper, rsignrank, and rwilcox. Specifically, any random generator that passes the number of observations to generate through formal argument 'n' can be used. |
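The covariate specification format described for `...` can be illustrated with base R alone. The following is a minimal sketch, not the package's internal code: `draw_covariate()` is a hypothetical helper introduced here, and it simply assumes that each generator is called with `n = N` plus the remaining list elements.

```r
# Covariate specifications in the format described for `...`
specs <- list(
  x1 = list("FUN" = rnorm, sd = 2.0, mean = 10.0),
  x2 = list("FUN" = rbinom, size = 1, prob = 0.5)
)

N <- 1000L  # number of random samples to draw

# Hypothetical helper (not part of the package): call the generator named
# in "FUN", passing the sample count through formal argument 'n' and every
# other list element as an additional argument.
draw_covariate <- function(spec, n) {
  fun  <- spec[["FUN"]]
  args <- spec[names(spec) != "FUN"]
  do.call(fun, c(list(n = n), args))
}

set.seed(1234)
covariates <- as.data.frame(lapply(specs, draw_covariate, n = N))
str(covariates)
```

Generators such as rhyper, which take the number of observations through `nn` rather than `n`, would fail under this calling convention, which is consistent with the exceptions listed above.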
The sample size calculation is based on the asymptotic null and local alternative distributions of the test statistic. More details can be found in the reference paper.
The difference between the true baseline mean function and the posited mean function is set to zero when calculating the sample size.
Because the calculated sample size is based on simulated data following the null and local alternative distributions of the test statistic, the results can differ with different choices of M and K, as well as with different seeds. When the signal-to-noise ratio (tau relative to sigma2) is large, the calculated sample size can be more robust.
A list consisting of
n |
An integer object. The estimated sample size. |
power |
A numeric object. The estimated power. |
seed |
If seed was provided as input, the user-specified integer seed; otherwise, this element is not present. |
Ailin Fan, Rui Song, and Wenbin Lu (2016). Change-plane analysis for subgroup detection and sample size calculation. Journal of the American Statistical Association, in press.
model <- ~ x1
theta0 <- c("(Intercept)" = 0.0, "x1" = 1.0)
sample_size(outcome = model, theta0 = theta0, N = 1000, sigma2 = 0.25,
            tau = 0.25, K = 100, M = 1000,
            x1 = list(FUN = runif, min = -1, max = 1))
Tests for the existence of a subgroup with an enhanced treatment effect. The subgroup of interest is represented by $\theta^{T} X \ge 0$. The test returns a p-value for $H_0: \tau = 0$, where $\tau$ is the treatment effect in this subgroup. If $H_0$ is rejected, estimates for $\theta$ can be used to obtain the estimated subgroup.
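As a reading aid, the data-generating model used in the example for this function can be written as follows. This is a sketch inferred from the example code, not a statement of the general model in the reference paper; the symbols $\beta$, $A$, and $\epsilon$ are notation assumed here for the baseline coefficients, the binary treatment indicator, and the random error.

```latex
Y = \beta^{T} X + \tau \, A \, I\left(\theta^{T} X \ge 0\right) + \epsilon,
\qquad \epsilon \sim N(0, \sigma^{2})
```

Here $X$ includes a leading 1 for the intercept, so individuals with $\theta^{T} X \ge 0$ receive the additional effect $\tau$ when treated, and the remaining individuals receive no treatment effect.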
subgroup_detect(outcome, propen, data, K = 1000L, M = 1000L, seed = NULL)
outcome |
A formula object. The linear model for the outcome regression. The left-hand-side variable must be the response. R function lm will be used to estimate model parameters. The response must be continuous. |
propen |
A formula object. The model for the propensity score. The left-hand-side variable must be the treatment variable. R function glm will be used with input option family = binomial(link="logit") to estimate model parameters. The treatment must be binary. |
data |
A data.frame object. All covariates, treatment, and response variables. Note that the treatment must be binary and that the response must be continuous. |
K |
An integer object. The number of randomly sampled points on the surface of the unit ball. Default value is 1000. |
M |
An integer object. The number of resamplings of the perturbed test statistic. This sample is used to calculate the critical value of the test. Default and minimum values are 1000. |
seed |
An integer object or NULL. If integer, the seed for random number generation, set at the onset of the calculation. If NULL, current seed in R environment is used. |
In this function, the baseline model is fit with a linear model using least squares, and the propensity score model is fit with a logistic model using maximum likelihood. These settings cannot be changed by the user.
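The two fits described above correspond to standard base R calls. The sketch below shows what such fits look like on simulated data; the data set and the object names `outcome_fit` and `propen_fit` are illustrative assumptions, and the package may pass additional arguments internally.

```r
# Simulated data for illustration only
set.seed(42)
n  <- 200
x1 <- rbinom(n, size = 1, prob = 0.5)
x2 <- runif(n, min = -1, max = 1)
a  <- rbinom(n, size = 1, prob = 0.5)          # binary treatment
y  <- 1 + x1 + x2 + 0.5 * a * (x1 + x2 >= 0) + rnorm(n, sd = 0.5)
data <- data.frame(x1, x2, a, y)

# Baseline model: linear model fit by least squares
outcome_fit <- lm(y ~ x1 + x2, data = data)

# Propensity score model: logistic regression fit by maximum likelihood
propen_fit <- glm(a ~ x1 + x2, data = data,
                  family = binomial(link = "logit"))

coef(outcome_fit)
coef(propen_fit)
```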
A list consisting of
outcome |
An lm object. The object returned by the lm fit of the outcome. |
propen |
A glm object. The object returned by the glm fit of the propensity. |
p_value |
A numeric object. The p-value of the test. |
theta |
A named numeric vector. The change-plane parameter estimates for subgroup. |
prop |
A numeric object.
The proportion of sampled points on |
seed |
If seed was provided as input, the user-specified integer seed; otherwise, this element is not present. |
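Once estimates of the change-plane parameters are available, subgroup membership can be derived from them. The following is a minimal sketch under two stated assumptions: that the names of `theta` match the columns of the model matrix, and that the subgroup is the set of individuals whose linear combination of covariates is non-negative, as in the data-generating step of the package example. The `theta` values below are hypothetical stand-ins, not output from `subgroup_detect`.

```r
set.seed(7)
n <- 50
data <- data.frame(
  x1 = rbinom(n, size = 1, prob = 0.5),
  x2 = runif(n, min = -1, max = 1)
)

# Hypothetical estimate standing in for the `theta` element of the result
theta <- c("(Intercept)" = -0.15, "x1" = 0.30, "x2" = 0.94)

# Design matrix with an intercept column, ordered to match names(theta)
X <- model.matrix(~ x1 + x2, data = data)
X <- X[, names(theta), drop = FALSE]

# Individuals with a non-negative linear combination fall in the subgroup
in_subgroup <- drop(X %*% theta) >= 0
table(in_subgroup)
```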
Ailin Fan, Rui Song, and Wenbin Lu (2016). Change-plane analysis for subgroup detection and sample size calculation. Journal of the American Statistical Association, in press.
#set parameters
tau <- 0.5
theta_t <- c(-0.15, 0.3, sqrt(1 - (-0.15)^2 - (0.3)^2))
beta <- c(1, 1, 1)
sigma <- 0.5
n <- 50
p <- 2
#generate data
x1 <- rbinom(n, size = 1, prob = 0.5)
x2 <- runif(n, min = -1, max = 1)
X <- cbind(1, x1, x2)
a <- rbinom(n, 1, prob = 0.5)
y <- drop(X %*% beta) + tau * a * (drop(X %*% theta_t) >= 0) + rnorm(n, 0, sigma)
data <- data.frame(X[, 2:3], a, y)
subgroup_detect(outcome = y ~ x1 + x2, propen = a ~ x1 + x2, data = data)