Randomized Clinical Trials
%contrasttest
The %contrastTest macro conducts a heterogeneity test for comparing the exposure-disease associations obtained from separate subtype-specific analyses based on cohort or nested case-control studies. Specifically, the user runs separate Cox models (for cohort studies) or conditional logistic models (for nested case-control studies) for each subtype, and then tests the heterogeneity hypothesis using the outputs from the separate models. Alternatively, the user takes the estimates (and standard errors) from the literature and tests the heterogeneity hypothesis. In the subtype-specific analysis, the confounder-disease associations are allowed to differ among the subtypes.
Faculty: Donna Spiegelman, ScD
Download: %contrasttest package
Platform: SAS
Reference: doi.org (%contrasttest)
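As a rough illustration of testing heterogeneity from subtype-specific estimates and standard errors, the sketch below computes a Wald-type (Cochran Q) statistic in Python. It is not the %contrastTest macro and may differ from the macro's actual method: it assumes the subtype-specific estimates are independent, and the function names `heterogeneity_test` and `chi2_sf` are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        # Even df: finite Poisson-type sum.
        return math.exp(-y) * sum(y**j / math.factorial(j) for j in range(df // 2))
    # Odd df: erfc term plus half-integer gamma terms.
    n = (df - 1) // 2
    tail = math.erfc(math.sqrt(y))
    tail += math.exp(-y) * sum(
        y ** (j - 0.5) / math.gamma(j + 0.5) for j in range(1, n + 1)
    )
    return tail


def heterogeneity_test(estimates, std_errors):
    """Test H0: all subtype-specific log hazard (or odds) ratios are equal.

    Assumption: the estimates are independent, which is a simplification.
    """
    weights = [1.0 / se**2 for se in std_errors]
    pooled = sum(w * b for w, b in zip(weights, estimates)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, estimates))
    df = len(estimates) - 1
    return q, df, chi2_sf(q, df)


# Illustrative numbers only: two subtype-specific log hazard ratios.
q, df, p = heterogeneity_test([0.2, 0.5], [0.1, 0.1])
print(f"Q = {q:.2f}, df = {df}, p = {p:.4f}")
```

The inputs can come either from separately fitted models or from published estimates, matching the two workflows described for the macro.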
CoxBcv
Implements the bias-corrected sandwich variance estimators proposed by Wang et al. (2023, Biometrical Journal) for the analysis of cluster randomized trials with time-to-event outcomes using the marginal Cox model.
Faculty: Fan Li, PhD
Download: CRAN R / CoxBcv package
Platform: R
Reference: doi.org (CoxBcv)
CTMBR
This program constructs classification trees for multiple binary responses. In biomedical research, many diagnoses, such as depression and anxiety, are based on multiple items. This program makes it possible to conduct the analysis at the item level.
Faculty: Heping Zhang, PhD
Download: CTMBR package
Platform: Unix
Reference: doi.org (CTMBR)
CRTFASTGEEPWR
Randomized trials based on marginal models and multilevel correlation structures
Faculty: Fan Li, PhD
Download: CRTFASTGEEPWR package
Platform: SAS macro
Reference: doi.org (CRTFASTGEEPWR)
crt2power
Provides methods for powering cluster-randomized trials with two co-primary outcomes using five key design techniques. Includes functions for calculating required sample size and statistical power. For more details on methodology, see Li et al. (2020) <doi:10.1111/biom.13212>, Pocock et al. (1987) <doi:10.2307/2531989>, Vickerstaff et al. (2019) <doi:10.1186/s12874-019-0754-4>, and Yang et al. (2022) <doi:10.1111/biom.13692>.
Faculty: Donna Spiegelman, ScD
Download: CRAN R / crt2power package
Platform: R
Reference: doi.org (crt2power)
cvcrand
Constrained randomization by Raab and Butcher (2001) <doi:10.1002/1097-0258(20010215)20:3%3C351::AID-SIM797%3E3.0.CO;2-C> is suitable for cluster randomized trials (CRTs) with a small number of clusters (e.g., 20 or fewer). The constrained randomization procedure is based on the baseline values of specified cluster-level covariates. The intervention effect on the individual outcome can then be analyzed through the clustered permutation test introduced by Gail, et al. (1996) <doi:10.1002/(SICI)1097-0258(19960615)15:11%3C1069::AID-SIM220%3E3.0.CO;2-Q>. Motivated by Li, et al. (2016) <doi:10.1002/sim.7410>, the package performs constrained randomization on the baseline values of cluster-level covariates and a clustered permutation test on the individual-level outcomes for cluster randomized trials.
Faculty: Fan Li, PhD
Download: CRAN R / cvcrand package
Platform: R
Reference: r-project.org (cvcrand)
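The idea behind constrained randomization can be sketched in a few lines of Python: enumerate the possible allocations, score each one for covariate balance, keep the best-balanced fraction, and draw the final allocation at random from that constrained space. This is an illustrative sketch and not the cvcrand API. The function name, the `keep_fraction` parameter, and the balance score (sum of squared differences in standardized covariate means) are assumptions made for the example.

```python
import itertools
import random
import statistics


def constrained_randomization(covariates, n_treated, keep_fraction=0.1, seed=None):
    """covariates: one list of baseline covariate values per cluster."""
    n_clusters = len(covariates)
    n_cov = len(covariates[0])

    # Standardize each covariate so the balance score is scale-free.
    columns = list(zip(*covariates))
    means = [statistics.mean(col) for col in columns]
    sds = [statistics.pstdev(col) or 1.0 for col in columns]
    z = [[(row[k] - means[k]) / sds[k] for k in range(n_cov)] for row in covariates]

    def balance(treated):
        # Assumed score: squared difference in arm means, summed over covariates.
        treated = set(treated)
        control = [i for i in range(n_clusters) if i not in treated]
        score = 0.0
        for k in range(n_cov):
            mean_t = statistics.mean(z[i][k] for i in treated)
            mean_c = statistics.mean(z[i][k] for i in control)
            score += (mean_t - mean_c) ** 2
        return score

    # Full enumeration is feasible only for a small number of clusters.
    schemes = sorted(itertools.combinations(range(n_clusters), n_treated), key=balance)
    n_keep = max(1, int(len(schemes) * keep_fraction))
    constrained_space = schemes[:n_keep]

    rng = random.Random(seed)
    return rng.choice(constrained_space), constrained_space


# Illustrative data: four clusters, one baseline covariate each.
chosen, space = constrained_randomization([[1], [2], [3], [4]], 2, keep_fraction=0.34, seed=1)
print("treated clusters:", chosen, "| constrained space size:", len(space))
```

Full enumeration grows combinatorially, which is one reason the approach is described for trials with few clusters. For larger trials, a random sample of allocations would have to replace the enumeration.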
DIPM
An implementation by Chen, Li, and Zhang (2022) <doi:10.1093/bioadv/vbac041> of the Depth Importance in Precision Medicine (DIPM) method in Chen and Zhang (2022) <doi:10.1093/biostatistics/kxaa021> and Chen and Zhang (2020) <doi:10.1007/978-3-030-46161-4_16>. The DIPM method is a classification tree that searches for subgroups with especially poor or strong performance in a given treatment group.
Faculty: Heping Zhang, PhD
Download: CRAN R / DIPM package
Platform: R
Reference: doi.org (DIPM)
GenDID
Developing functions to implement the generalized Difference-in-Differences (DID) estimator approach to analysis of stepped wedge cluster-randomized trials and observational/quasi-experimental panel data.
Faculty: Lee Kennedy-Shaffer, PhD
Download: GitHub / GenDID Package
Platform: R
Reference: doi.org (GenDID)
ge_trend_v2
The program ge_trend_v2 calculates the power and minimum required sample size for case-control studies testing hypotheses about gene-environment interactions with a polytomous exposure variable. The program extends the original program ge_trend by allowing the main effect odds ratios for gene and exposure to vary within a user-specified interval under the alternative hypothesis.
Faculty: Donna Spiegelman, ScD
Download: ge_trend_v2 package
Platform: Fortran
Reference: doi.org (ge_trend_v2)
group_lapl
SGLS implements a penalization method for integrative analysis of multiple high-throughput cancer prognosis studies incorporating network structures. The method combines the group MCP penalty and a Laplacian penalty. The group MCP penalty is adopted for gene selection, and the Laplacian penalty is applied to smooth the differences between regression coefficients of tightly connected genes.
Faculty: Shuangge Steven Ma, PhD
Download: GitHub / group_lapl package
Platform: R
Reference: doi.org (group_lapl)
holcroft.f77
A user-friendly Fortran program is available from the second author, which calculates the optimal sampling fractions for all designs considered and the efficiencies of these designs relative to the optimal hybrid design for any scenario of interest.
Faculty: Donna Spiegelman, ScD
Download: holcroft.f77 package
Platform: Fortran
Reference: doi.org (holcroft.f77)
HTE-MMD-app
R Shiny App for finding the locally optimal and maximin cluster randomized trials assessing treatment effect modification
Faculty: Fan Li, PhD; Denise Esserman, PhD
Download: HTE-MMD-app package
Platform: R Shiny
Reference: doi.org (HTE-MMD-app)
H2x2Factorial
Implements the sample size methods for hierarchical 2x2 factorial trials under two choices of effect estimands and a series of hypothesis tests proposed in "Sample size calculation in hierarchical 2x2 factorial trials with unequal cluster sizes" (under review), and provides the table and plot generators for the sample size estimations.
Faculty: Denise Esserman, PhD; Fan Li, PhD
Download: CRAN R / H2x2Factorial package
Platform: R
IC-OLS
Design stepped wedge cluster randomized trials analyzed through generalized estimating equations under a misspecified working independence correlation structure.
Faculty: Fan Li, PhD
Download: IC-OLS package
Platform: R Shiny
Reference: doi.org (IC-OLS)
%metaanal
The %METAANAL macro is a SAS version 9 macro that produces the DerSimonian-Laird estimators for random effects or fixed effects models in pooled analysis or meta-analysis. It can be used to pull results from two or three of the Channing cohorts and test for between-studies heterogeneity.
Faculty: Donna Spiegelman, ScD
Download: %metaanal package
Platform: SAS
Reference: doi.org (%metaanal)
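For readers unfamiliar with the DerSimonian-Laird approach, the following Python sketch shows the standard moment-based calculation from study-specific estimates and standard errors. It is not the %METAANAL macro. The function name and the returned dictionary keys are hypothetical, and the input numbers are made up for illustration.

```python
import math


def dersimonian_laird(estimates, std_errors):
    """Fixed-effect and DerSimonian-Laird random-effects pooled estimates."""
    k = len(estimates)
    w = [1.0 / se**2 for se in std_errors]
    sum_w = sum(w)

    # Fixed-effect (inverse-variance) pooled estimate and Cochran's Q.
    fixed = sum(wi * b for wi, b in zip(w, estimates)) / sum_w
    q = sum(wi * (b - fixed) ** 2 for wi, b in zip(w, estimates))

    # Method-of-moments between-study variance, truncated at zero.
    c = sum_w - sum(wi**2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects weights add tau2 to each within-study variance.
    w_star = [1.0 / (se**2 + tau2) for se in std_errors]
    random_est = sum(wi * b for wi, b in zip(w_star, estimates)) / sum(w_star)

    return {
        "fixed": fixed,
        "fixed_se": math.sqrt(1.0 / sum_w),
        "q": q,
        "df": k - 1,
        "tau2": tau2,
        "random": random_est,
        "random_se": math.sqrt(1.0 / sum(w_star)),
    }


# Illustrative log relative risks from three hypothetical studies.
result = dersimonian_laird([0.1, 0.5, 0.9], [0.1, 0.1, 0.1])
for name, value in result.items():
    print(f"{name}: {value:.4f}" if isinstance(value, float) else f"{name}: {value}")
```

When the estimated between-study variance is zero, the random-effects result coincides with the fixed-effect result.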
%metadose
The %metadose macro is a SAS macro for meta-analysis of linear and nonlinear dose-response relationships. It is used when research reports studying the same dose-response relationship have different exposure or treatment levels. It is a two-step macro. First, for each study, it uses either the Greenland method (AJE, 1992) or the Hamling method (SIM, 2008) to get estimated cell counts of the 2x2 table adjusted for confounding, then it estimates the asymptotic correlation between the adjusted log odds ratio estimates for each exposure level relative to the referent level, from which we can get the estimated covariance matrix for these study-specific estimates. After this step, we get a single pooled estimate and its variance estimate across different exposure or treatment levels. Second, meta-analysis is performed for all the studies using the single study-specific trend estimate, in common units across studies. An option also exists to explore and graph non-linearity in the pooled results.
Faculty: Donna Spiegelman, ScD
Download: %metadose package
Platform: SAS
Reference: doi.org (%metadose)
%meta subtype trend
The %subtype_trend macro tests whether the exposure-subtype association has a trend across the ordinal cancer subtypes. The user runs separate Cox models (for cohort studies) or conditional logistic models (for nested case-control studies) for each subtype, and then tests the heterogeneity hypothesis using the outputs from the separate models. Alternatively, the user takes the estimates (and standard errors) from the literature and tests the heterogeneity hypothesis. In the subtype-specific analysis, the confounder-disease associations are allowed to differ among the subtypes.
Faculty: Donna Spiegelman, ScD
Download: %meta subtype trend package
Platform: SAS
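One simple way to test for a trend across ordinal subtypes is a weighted least squares regression of the subtype-specific estimates on ordinal scores, with inverse-variance weights and a Wald test on the slope. The Python sketch below illustrates that idea only. It is not the %subtype_trend macro and may differ from the macro's method: it assumes independent subtype-specific estimates and, by default, equally spaced scores 1, 2, ..., k. The function name is hypothetical.

```python
import math


def subtype_trend_test(estimates, std_errors, scores=None):
    """Wald test for a linear trend in estimates across ordered subtypes."""
    k = len(estimates)
    if scores is None:
        scores = list(range(1, k + 1))  # assumed equally spaced ordinal scores
    w = [1.0 / se**2 for se in std_errors]
    sum_w = sum(w)

    # Inverse-variance weighted means of the scores and the estimates.
    s_bar = sum(wi * s for wi, s in zip(w, scores)) / sum_w
    b_bar = sum(wi * b for wi, b in zip(w, estimates)) / sum_w

    sxx = sum(wi * (s - s_bar) ** 2 for wi, s in zip(w, scores))
    sxy = sum(wi * (s - s_bar) * (b - b_bar) for wi, s, b in zip(w, scores, estimates))

    slope = sxy / sxx
    slope_se = math.sqrt(1.0 / sxx)  # valid when the weights are 1 / variance
    z = slope / slope_se
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return slope, slope_se, z, p_two_sided


# Illustrative log hazard ratios for three ordered subtypes.
slope, se, z, p = subtype_trend_test([0.1, 0.3, 0.5], [0.1, 0.1, 0.1])
print(f"slope = {slope:.3f}, se = {se:.4f}, z = {z:.3f}, p = {p:.4f}")
```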
power_swgee
Stata module to compute power (under both a Z and t distribution) for cluster randomized stepped wedge designs.
Faculty: Fan Li, PhD
Download: power_swgee package
Platform: Stata module
Reference: doi.org (power_swgee)
PREGS
A conformal test of non-zero coefficient in linear models via permutation-augmentation.
Faculty: Leying Guan, PhD
Download: GitHub / PREGS package
Platform: R
Reference: doi.org (PREGS)
puddlr
puddlr is a general-purpose set of tools for the analysis of datasets with relatively few observations compared to the total number of features. These data sets are often called "shallow" and "wide", which is the inspiration for the "puddlr" name.
Faculty: Leying Guan, PhD
Download: GitHub / puddlr package
Platform: R
Reference: doi.org (puddlr)
ROOT
Randomized controlled trials (RCTs) serve as the cornerstone for understanding causal effects, yet extending inferences to target populations presents challenges due to effect heterogeneity and underrepresentation. Our paper addresses the critical issue of identifying and characterizing underrepresented subgroups in RCTs, proposing a novel framework for refining target populations to improve generalizability. We introduce an optimization-based approach, Rashomon Set of Optimal Trees (ROOT), to characterize underrepresented groups. ROOT optimizes the target subpopulation distribution by minimizing the variance of the target average treatment effect estimate, ensuring more precise treatment effect estimations. Notably, ROOT generates interpretable characteristics of the underrepresented population, aiding researchers in effective communication.
Faculty: Harsh Parikh, PhD
Download: GitHub / ROOT package
Platform: Python, R
Reference: arxiv.org (ROOT)
strat-crt-ss
Sample Size Calculations for Stratified IRTs and CRTs. The ui.R and server.R files create a Shiny app that can be used to find the sample size required for a target stratified IRT or CRT via a user-friendly interface. Sensitivity plots and data to the proportion of individuals in a given stratum are also available here.
Faculty: Lee Kennedy-Shaffer, PhD
Download: GitHub / strat-crt-ss Package
Platform: R, R Shiny
Reference: doi.org (strat-crt-ss)
STREE
Represents one of the most popular uses of tree-based methods. This program identifies prognostic factors that are predictive of survival outcome and time to an event of interest. It partitions a study sample into strata to reveal distinct patterns of survival among subgroups.
Faculty: Heping Zhang, PhD
Download: STREE package
Platform: Unix
SW-CRT-analysis
SW-CRT Analysis Methods.R implements the analysis methods for stepped wedge cluster randomized trials detailed in Kennedy-Shaffer et al. 2020. These include the novel methods from that paper (SC, CO, COSC, and ENS) as well as the versions of existing methods described in that article (MEM from Hussey & Hughes 2007, CPI from Hooper et al. 2016, the permutation test versions of these from Wang and De Gruttola 2017 and Ji et al. 2017, and NPWP from Thompson et al. 2018).
Faculty: Lee Kennedy-Shaffer, PhD
Download: GitHub / SW-CRT-analysis Package
Platform: R
Reference: doi.org (SW-CRT-analysis)
SW-CRT-outbreak
Randomized controlled trials are crucial for the evaluation of interventions such as vaccinations, but the design and analysis of these studies during infectious disease outbreaks is complicated by statistical, ethical, and logistical factors. Attempts to resolve these complexities have led to the proposal of a variety of trial designs, including individual randomization and several types of cluster randomization designs: parallel-arm, ring vaccination, and stepped wedge designs. Because of the strong time trends present in infectious disease incidence, however, methods generally used to analyze stepped wedge trials might not perform well in these settings. Using simulated outbreaks, we evaluated various designs and analysis methods, including recently proposed methods for analyzing stepped wedge trials, to determine the statistical properties of these methods. While new methods for analyzing stepped wedge trials can provide some improvement over previous methods, we find that they still lag behind parallel-arm cluster-randomized trials and individually randomized trials in achieving adequate power to detect intervention effects. We also find that these methods are highly sensitive to the weighting of effect estimates across time periods. Despite the value of new methods, stepped wedge trials still have statistical disadvantages compared with other trial designs in epidemic settings.
Faculty: Lee Kennedy-Shaffer, PhD
Download: GitHub / SW-CRT-outbreak Package
Platform: R
Reference: doi.org (SW-CRT-outbreak)
swdpwr
To meet the needs of statistical power calculation for stepped wedge cluster randomized trials, we developed this software. Different parameters can be specified by users for different scenarios, including: cross-sectional and cohort designs, binary and continuous outcomes, marginal (GEE) and conditional models (mixed effects model), three link functions (identity, log, logit links), with and without time effects (the default specification assumes no-time-effect) under exchangeable, nested exchangeable and block exchangeable correlation structures. Unequal numbers of clusters per sequence are also allowed.
Faculty: Fan Li, PhD; Xin Zhou, PhD; Donna Spiegelman, ScD
Download: CRAN R / swdpwr package
Platform: R; R Shiny
Reference: doi.org (swdpwr)
SW-IC-binary-count
R Shiny App to estimate the information content of the stepped wedge designs with binary or count outcomes.
Faculty: Fan Li, PhD
Download: SW-IC-binary-count package
Platform: R Shiny
Reference: doi.org (SW-IC-binary-count)
sample_size
R Shiny App for power calculation to detect treatment effect heterogeneity by a single binary effect modifier in a cluster randomized trial with binary outcomes.
Faculty: Fan Li, PhD
Download: sample_size package
Platform: R Shiny
Reference: doi.org (sample_size)
swcrtcalculator
R Shiny App and Stata module for finding the right power and sample size calculator for stepped wedge cluster randomized trials.
Faculty: Fan Li, PhD
Download: swcrtcalculator package
Platform: R Shiny
Reference: doi.org (swcrtcalculator)
SWCRT_3Level_DesignEffect
1: Function.R: including a function that implements the proposed model.
2: Simulation_settings.R: including codes for generating simulated data.
3: case_study.R: performing our method on the LUAD dataset and visualizing results.
Faculty: Fan Li, PhD; Kendra Plourde, PhD
Download: SWCRT_3Level_DesignEffect package
Platform: R Shiny
Reference: doi.org (SWCRT_3Level_DesignEffect)
%subtype_MultipleMarker
A meta-regression method that can utilize existing statistical software for mixed model analysis. This method can be used to assess whether the exposure-subtype associations are different across subtypes defined by one marker while controlling for other markers, and to evaluate whether the difference in exposure-subtype association across subtype defined by one marker depends on any other markers.
Faculty: Donna Spiegelman, ScD
Download: %subtype_MultipleMarker package
Platform: SAS
Reference: doi.org (%subtype_MultipleMarker)
%subtype
The %subtype macro examines whether the effects of the exposure(s) vary by subtype of a disease. It can be applied to data from cohort studies, nested or matched case-control studies, unmatched case-control studies, and case-case studies.
Faculty: Donna Spiegelman, ScD
Download: %subtype package
Platform: SAS
Reference: nih.gov (%subtype)
tcs
The identification of heterogeneity in effects between studies is a key issue in meta-analyses of observational studies, since it is critical for determining whether it is appropriate to pool the individual results into one summary measure. The result of a hypothesis test is often used as the decision criterion. In this paper, the authors use a large simulation study patterned from the key features of five published epidemiologic meta-analyses to investigate the type I error and statistical power of five previously proposed asymptotic homogeneity tests, a parametric bootstrap version of each of the tests, and tau2-bootstrap, a test proposed by the authors. The results show that the asymptotic DerSimonian and Laird Q statistic and the bootstrap versions of the other tests give the correct type I error under the null hypothesis but that all of the tests considered have low statistical power, especially when the number of studies included in the meta-analysis is small (<20). From the point of view of validity, power, and computational ease, the Q statistic is clearly the best choice. The authors found that the performance of all of the tests considered did not depend appreciably upon the value of the pooled odds ratio, both for size and for power. Because tests for heterogeneity will often be underpowered, random effects models can be used routinely, and heterogeneity can be quantified by means of R(I), the proportion of the total variance of the pooled effect measure due to between-study variance, and CV(B), the between-study coefficient of variation.
Faculty: Donna Spiegelman, ScD
Download: tcs package
Platform: Fortran
Reference: doi.org (tcs)
%table1
The %table1 macro computes indirectly standardized rates, means, or proportions. The results are automatically prepared, by level of a given exposure variable, in a formatted MS Word table. The table is intended for use in publications with minimal additional formatting or preparation. Table 1 of many papers is a breakdown of cohort characteristics by exposure category. In most instances, it is necessary to age-standardize the means or proportions of other potential confounders before displaying them by exposure category.
Faculty: Donna Spiegelman, ScD
Download: %table1 package
Platform: SAS