Randomized Clinical Trials

%contrasttest

The %contrastTest macro conducts a heterogeneity test comparing the exposure-disease associations obtained from separate subtype-specific analyses of cohort or nested case-control studies. Specifically, the user either runs separate Cox models (for cohort studies) or conditional logistic models (for nested case-control studies) for each subtype and then tests the heterogeneity hypothesis using the outputs of those models, or takes the estimates (and standard errors) from the literature and tests the heterogeneity hypothesis. In the subtype-specific analysis, the confounder-disease associations are allowed to differ among the subtypes.
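To make the idea of testing heterogeneity from published estimates and standard errors concrete, here is a minimal Python sketch of a Cochran-type Q test. It is not the %contrastTest macro itself: the function names are hypothetical, and the sketch assumes the subtype-specific estimates are independent, which may not hold when they come from the same cohort.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: closed-form finite sum.
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df: complementary error function plus a finite sum.
    total = math.erfc(math.sqrt(half))
    for j in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * half ** (j - 0.5) / math.gamma(j + 0.5)
    return total


def heterogeneity_test(estimates, std_errors):
    """Q test of equal log hazard (or odds) ratios across subtypes.

    Assumes independent estimates (an illustrative simplification).
    Returns (Q, degrees of freedom, p-value).
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * b for w, b in zip(weights, estimates)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, estimates))
    df = len(estimates) - 1
    return q, df, chi2_sf(q, df)
```

For example, `heterogeneity_test([0.5, 0.1], [0.1, 0.1])` compares two subtype-specific log hazard ratios with equal standard errors; a small p-value suggests the exposure-disease association differs by subtype.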

Faculty: Donna Spiegelman, ScD

Download: %contrasttest package

Platform: SAS

Reference: doi.org (%contrasttest)


CoxBcv

An implementation of the bias-corrected sandwich variance estimators for the analysis of cluster randomized trials with time-to-event outcomes using the marginal Cox model, proposed by Wang et al. (2023, Biometrical Journal).

Faculty: Fan Li, PhD

Download: Cran R / CoxBcv package

Platform: R

Reference: doi.org (CoxBcv)


CTMBR

This is a program to construct classification trees for multiple binary responses. In biomedical research, many diagnoses, such as depression and anxiety, are based on multiple items. This program makes it possible to conduct analysis at the item level.

Faculty: Heping Zhang, PhD

Download: CTMBR package

Platform: Unix

Reference: doi.org (CTMBR)


CRTFASTGEEPWR

Randomized trials based on marginal models and multilevel correlation structures

Faculty: Fan Li, PhD

Download: CRTFASTGEEPWR package

Platform: SAS macro

Reference: doi.org (CRTFASTGEEPWR)


crt2power

Provides methods for powering cluster-randomized trials with two co-primary outcomes using five key design techniques. Includes functions for calculating required sample size and statistical power. For more details on methodology, see Li et al. (2020) <doi:10.1111/biom.13212>, Pocock et al. (1987) <doi:10.2307/2531989>, Vickerstaff et al. (2019) <doi:10.1186/s12874-019-0754-4>, and Yang et al. (2022) <doi:10.1111/biom.13692>.

Faculty: Donna Spiegelman, ScD

Download: Cran R / crt2power package

Platform: R

Reference: doi.org (crt2power)


cvcrand

Constrained randomization by Raab and Butcher (2001) <doi:10.1002/1097-0258(20010215)20:3%3C351::AID-SIM797%3E3.0.CO;2-C> is suitable for cluster randomized trials (CRTs) with a small number of clusters (e.g., 20 or fewer). The procedure of constrained randomization is based on the baseline values of specified cluster-level covariates. The intervention effect on the individual outcome can then be analyzed through the clustered permutation test introduced by Gail, et al. (1996) <doi:10.1002/(SICI)1097-0258(19960615)15:11%3C1069::AID-SIM220%3E3.0.CO;2-Q>. Motivated by Li, et al. (2016) <doi:10.1002/sim.7410>, the package performs constrained randomization on the baseline values of cluster-level covariates and a clustered permutation test on the individual-level outcomes for cluster randomized trials.
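The constrained randomization procedure can be illustrated with a short Python sketch. This is not the cvcrand package or its API: the function names, the `cutoff` parameter, and the balance score (a sum of squared standardized differences in arm means) are assumptions chosen for illustration. The sketch enumerates every allocation, keeps the best-balanced fraction, and draws one allocation at random from that constrained space.

```python
import itertools
import random
import statistics


def balance_score(covariates, treated):
    """Sum over covariates of the squared standardized difference in arm means.

    covariates: one row of baseline values per cluster.
    treated: set of cluster indices assigned to the intervention arm.
    """
    n = len(covariates)
    control = [i for i in range(n) if i not in treated]
    score = 0.0
    for k in range(len(covariates[0])):
        column = [row[k] for row in covariates]
        sd = statistics.stdev(column)
        mean_treated = statistics.mean([column[i] for i in treated])
        mean_control = statistics.mean([column[i] for i in control])
        score += ((mean_treated - mean_control) / sd) ** 2
    return score


def constrained_randomization(covariates, n_treated, cutoff=0.1, seed=None):
    """Pick one allocation at random from the best-balanced `cutoff` fraction."""
    n = len(covariates)
    # Enumerate every way to choose the treated clusters.
    schemes = [frozenset(c) for c in itertools.combinations(range(n), n_treated)]
    # Order allocations from best to worst balance.
    ranked = sorted(schemes, key=lambda s: balance_score(covariates, s))
    keep = max(1, int(len(ranked) * cutoff))
    constrained_space = ranked[:keep]
    rng = random.Random(seed)
    return rng.choice(constrained_space), constrained_space
```

Full enumeration is only feasible for a small number of clusters, which matches the setting described above; with many clusters one would sample allocations instead of enumerating them.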

Faculty: Fan Li, PhD

Download: Cran R / cvcrand package

Platform: R

Reference: r-project.org (cvcrand)


DIPM

An implementation by Chen, Li, and Zhang (2022) <doi:10.1093/bioadv/vbac041> of the Depth Importance in Precision Medicine (DIPM) method in Chen and Zhang (2022) <doi:10.1093/biostatistics/kxaa021> and Chen and Zhang (2020) <doi:10.1007/978-3-030-46161-4_16>. The DIPM method is a classification tree that searches for subgroups with especially poor or strong performance in a given treatment group.

Faculty: Heping Zhang, PhD

Download: Cran R / DIPM package

Platform: R

Reference: doi.org (DIPM)


GenDID

Developing functions to implement the generalized Difference-in-Differences (DID) estimator approach to analysis of stepped wedge cluster-randomized trials and observational/quasi-experimental panel data.

Faculty: Lee Kennedy-Shaffer, PhD

Download: GitHub / GenDID Package

Platform: R

Reference: doi.org (GenDID)


ge_trend_v2

The program ge_trend_v2 is designed to calculate the power and minimum required sample size for case-control studies testing hypotheses about gene-environment interactions with a polytomous exposure variable. The program extends the original program ge_trend by allowing the investigator to let the main-effect odds ratios for gene and exposure vary within a user-specified interval under the alternative hypothesis.

Faculty: Donna Spiegelman, ScD

Download: ge_trend_v2 package

Platform: Fortran

Reference: doi.org (ge_trend_v2)


group_lapl

SGLS implements a penalization method for integrative analysis of multiple high-throughput cancer prognosis studies incorporating network structures. This method is based on a combination of the group MCP penalty and a Laplacian penalty. The group MCP penalty is adopted for gene selection, and the Laplacian penalty is applied to smooth the differences between regression coefficients of tightly connected genes.

Faculty: Shuangge Steven Ma, PhD

Download: GitHub / group_lapl package

Platform: R

Reference: doi.org (group_lapl)


holcroft.f77

A user-friendly Fortran program is available from the second author, which calculates the optimal sampling fractions for all designs considered and the efficiencies of these designs relative to the optimal hybrid design for any scenario of interest.

Faculty: Donna Spiegelman, ScD

Download: holcroft.f77 package

Platform: Fortran

Reference: doi.org (holcroft.f77)


HTE-MMD-app

R Shiny App for finding the locally optimal and maximin cluster randomized trials assessing treatment effect modification

Faculty: Fan Li, PhD; Denise Esserman, PhD

Download: HTE-MMD-app package

Platform: R Shiny

Reference: doi.org (HTE-MMD-app)


H2x2Factorial

Implements the sample size methods for hierarchical 2x2 factorial trials under two choices of effect estimands and a series of hypothesis tests proposed in "Sample size calculation in hierarchical 2x2 factorial trials with unequal cluster sizes" (under review), and provides the table and plot generators for the sample size estimations.

Faculty: Denise Esserman, PhD; Fan Li, PhD

Download: Cran R / H2x2Factorial Package

Platform: R


IC-OLS

Design stepped wedge cluster randomized trials analyzed through generalized estimating equations under a misspecified working independence correlation structure.

Faculty: Fan Li, PhD

Download: IC-OLS package

Platform: R Shiny

Reference: doi.org (IC-OLS)


%metaanal

The %METAANAL macro is a SAS version 9 macro that produces the DerSimonian-Laird estimators for random effects or fixed effects models in pooled analysis or meta-analysis. It can be used to pool results from two or three of the Channing cohorts and test for between-studies heterogeneity.
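As a worked illustration of the DerSimonian-Laird calculation, the following Python sketch computes the fixed-effects estimate, the Q statistic, the moment estimate of the between-study variance, and the random-effects pooled estimate. It is a minimal sketch, not the %METAANAL macro: the function name and the returned dictionary keys are hypothetical, and at least two studies are assumed.

```python
def dersimonian_laird(estimates, std_errors):
    """DerSimonian-Laird random-effects pooling of study-specific estimates.

    estimates: study-specific effects (e.g. log relative risks).
    std_errors: their standard errors.
    """
    k = len(estimates)
    # Fixed-effects (inverse-variance) weights and pooled estimate.
    w = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(w)
    fixed = sum(wi * b for wi, b in zip(w, estimates)) / sum_w
    # Q statistic for between-studies heterogeneity (k - 1 degrees of freedom).
    q = sum(wi * (b - fixed) ** 2 for wi, b in zip(w, estimates))
    # Moment estimator of the between-study variance, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)
    # Random-effects weights add tau2 to each within-study variance.
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    random_effects = sum(wi * b for wi, b in zip(w_star, estimates)) / sum(w_star)
    se_random = (1.0 / sum(w_star)) ** 0.5
    return {
        "fixed": fixed,
        "q": q,
        "tau2": tau2,
        "random": random_effects,
        "se_random": se_random,
    }
```

When the estimated between-study variance is zero, the random-effects and fixed-effects results coincide; otherwise the random-effects standard error is larger, reflecting the extra between-study variability.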

Faculty: Donna Spiegelman, ScD

Download: %metaanal package

Platform: SAS

Reference: doi.org (%metaanal)


%metadose

The %metadose macro is a SAS macro for meta-analysis of linear and nonlinear dose-response relationships. It is used when research reports studying the same dose-response relationship have different exposure or treatment levels. It is a two-step macro. First, for each study, it uses either the Greenland method (AJE, 1992) or the Hamling method (SIM, 2008) to get estimated cell counts of the 2x2 table adjusted for confounding; it then estimates the asymptotic correlation between the adjusted log odds ratio estimates for each exposure level relative to the referent level, from which we can get the estimated covariance matrix for these study-specific estimates. After this step, we get a single pooled estimate and its variance estimate across different exposure or treatment levels. Then, meta-analysis is performed for all the studies using the single study-specific trend estimate, in common units across studies. An option also exists to explore and graph non-linearity in the pooled results.

Faculty: Donna Spiegelman, ScD

Download: %metadose package

Platform: SAS

Reference: doi.org (%metadose)


%meta subtype trend

The %subtype_trend macro tests whether the exposure-subtype association has a trend across the ordinal cancer subtypes. The user either runs separate Cox models (for cohort studies) or conditional logistic models (for nested case-control studies) for each subtype and then tests the heterogeneity hypothesis using the outputs of those models, or takes the estimates (and standard errors) from the literature and tests the heterogeneity hypothesis. In the subtype-specific analysis, the confounder-disease associations are allowed to differ among the subtypes.

Faculty: Donna Spiegelman, ScD

Download: %meta subtype trend package

Platform: SAS


power_swgee

Stata module to compute power (under both a Z and t distribution) for cluster randomized stepped wedge designs.

Faculty: Fan Li, PhD

Download: power_swgee package

Platform: Stata module

Reference: doi.org (power_swgee)


PREGS

A conformal test of non-zero coefficient in linear models via permutation-augmentation.

Faculty: Leying Guan, PhD

Download: GitHub / PREGS package

Platform: R

Reference: doi.org (PREGS)


puddlr

puddlr is a general-purpose set of tools for the analysis of datasets with relatively few observations compared to the total number of features. These data sets are often called "shallow" and "wide", which is the inspiration for the "puddlr" name.

Faculty: Leying Guan, PhD

Download: GitHub / puddlr package

Platform: R

Reference: doi.org (puddlr)


ROOT

Randomized controlled trials (RCTs) serve as the cornerstone for understanding causal effects, yet extending inferences to target populations presents challenges due to effect heterogeneity and underrepresentation. Our paper addresses the critical issue of identifying and characterizing underrepresented subgroups in RCTs, proposing a novel framework for refining target populations to improve generalizability. We introduce an optimization-based approach, Rashomon Set of Optimal Trees (ROOT), to characterize underrepresented groups. ROOT optimizes the target subpopulation distribution by minimizing the variance of the target average treatment effect estimate, ensuring more precise treatment effect estimations. Notably, ROOT generates interpretable characteristics of the underrepresented population, aiding researchers in effective communication.

Faculty: Harsh Parikh, PhD

Download: GitHub / ROOT package

Platform: Python, R

Reference: arxiv.org (ROOT)


strat-crt-ss

Sample Size Calculations for Stratified IRTs and CRTs. The ui.R and server.R files create a Shiny app that can be used to find the sample size required for a target stratified IRT or CRT via a user-friendly interface. Sensitivity plots and data to the proportion of individuals in a given stratum are also available here.

Faculty: Lee Kennedy-Shaffer, PhD

Download: GitHub / strat-crt-ss Package

Platform: R, R Shiny

Reference: doi.org (strat-crt-ss)


STREE

Represents one of the most popular uses of tree-based methods. This program identifies prognostic factors that are predictive of survival outcome and time to an event of interest. It partitions a study sample into strata to reveal distinct patterns of survival among subgroups.

Faculty: Heping Zhang, PhD

Download: STREE package

Platform: Unix


SW-CRT-analysis

SW-CRT Analysis Methods.R implements the analysis methods for stepped wedge cluster randomized trials detailed in Kennedy-Shaffer et al. 2020. These include the novel methods from that paper (SC, CO, COSC, and ENS) as well as the versions of existing methods described in that article (MEM from Hussey & Hughes 2007, CPI from Hooper et al. 2016, the permutation test versions of these from Wang and De Gruttola 2017 and Ji et al. 2017, and NPWP from Thompson et al. 2018).

Faculty: Lee Kennedy-Shaffer, PhD

Download: GitHub / SW-CRT-analysis Package

Platform: R

Reference: doi.org (SW-CRT-analysis)


SW-CRT-outbreak

Randomized controlled trials are crucial for the evaluation of interventions such as vaccinations, but the design and analysis of these studies during infectious disease outbreaks is complicated by statistical, ethical, and logistical factors. Attempts to resolve these complexities have led to the proposal of a variety of trial designs, including individual randomization and several types of cluster randomization designs: parallel-arm, ring vaccination, and stepped wedge designs. Because of the strong time trends present in infectious disease incidence, however, methods generally used to analyze stepped wedge trials might not perform well in these settings. Using simulated outbreaks, we evaluated various designs and analysis methods, including recently proposed methods for analyzing stepped wedge trials, to determine the statistical properties of these methods. While new methods for analyzing stepped wedge trials can provide some improvement over previous methods, we find that they still lag behind parallel-arm cluster-randomized trials and individually randomized trials in achieving adequate power to detect intervention effects. We also find that these methods are highly sensitive to the weighting of effect estimates across time periods. Despite the value of new methods, stepped wedge trials still have statistical disadvantages compared with other trial designs in epidemic settings.

Faculty: Lee Kennedy-Shaffer, PhD

Download: GitHub / SW-CRT-outbreak Package

Platform: R

Reference: doi.org (SW-CRT-outbreak)


swdpwr

To meet the needs of statistical power calculation for stepped wedge cluster randomized trials, we developed this software. Different parameters can be specified by users for different scenarios, including: cross-sectional and cohort designs, binary and continuous outcomes, marginal (GEE) and conditional models (mixed effects model), three link functions (identity, log, logit links), with and without time effects (the default specification assumes no-time-effect) under exchangeable, nested exchangeable and block exchangeable correlation structures. Unequal numbers of clusters per sequence are also allowed.
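For the simplest of the scenarios listed above (cross-sectional design, continuous outcome, identity link, mixed effects model with an exchangeable correlation structure and period effects), power can be sketched with the closed-form variance of Hussey and Hughes (2007). The Python below is an illustration under those assumptions, not the swdpwr implementation; the function name and arguments are hypothetical, and a normal (Z) approximation is used.

```python
from statistics import NormalDist


def hussey_hughes_power(delta, design, n_per_cell, sigma2_e, tau2, alpha=0.05):
    """Approximate power for a stepped wedge trial (Hussey and Hughes, 2007).

    delta: treatment effect to detect.
    design: one row per cluster, each a list of 0/1 treatment indicators
            per period.
    n_per_cell: individuals sampled per cluster per period.
    sigma2_e: individual-level residual variance.
    tau2: between-cluster (random intercept) variance.
    Returns (power, variance of the treatment effect estimator).
    """
    n_clusters = len(design)
    n_periods = len(design[0])
    sigma2 = sigma2_e / n_per_cell  # variance of a cluster-period mean
    u = sum(sum(row) for row in design)
    # Squared column totals (number of treated clusters in each period).
    w = sum(sum(row[j] for row in design) ** 2 for j in range(n_periods))
    # Squared row totals (number of treated periods in each cluster).
    v = sum(sum(row) ** 2 for row in design)
    numerator = n_clusters * sigma2 * (sigma2 + n_periods * tau2)
    denominator = (n_clusters * u - w) * sigma2 + (
        u ** 2 + n_clusters * n_periods * u - n_periods * w - n_clusters * v
    ) * tau2
    variance = numerator / denominator
    z = NormalDist()
    power = z.cdf(abs(delta) / variance ** 0.5 - z.inv_cdf(1 - alpha / 2))
    return power, variance
```

A design is passed as a matrix of treatment indicators, for example `[[0, 1, 1], [0, 0, 1]]` for two clusters crossing over in successive periods; unequal numbers of clusters per sequence are handled by repeating rows.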

Faculty: Fan Li, PhD; Xin Zhou, PhD; Donna Spiegelman, ScD

Download: Cran R / swdpwr package

Platform: R; R Shiny

Reference: doi.org (swdpwr)


SW-IC-binary-count

R Shiny App to estimate the information content of the stepped wedge designs with binary or count outcomes.

Faculty: Fan Li, PhD

Download: SW-IC-binary-count package

Platform: R Shiny

Reference: doi.org (SW-IC-binary-count)


sample_size

R Shiny App for power calculation to detect treatment effect heterogeneity by a single binary effect modifier in a cluster randomized trial with binary outcomes.

Faculty: Fan Li, PhD

Download: sample_size package

Platform: R Shiny

Reference: doi.org (sample_size)


swcrtcalculator

R Shiny App and Stata module for finding the right power and sample size calculator for stepped wedge cluster randomized trials.

Faculty: Fan Li, PhD

Download: swcrtcalculator package

Platform: R Shiny

Reference: doi.org (swcrtcalculator)


SWCRT_3Level_DesignEffect

1: Function.R: including a function that implements the proposed model.

2: Simulation_settings.R: including codes for generating simulated data.

3: case_study.R: performing our method on the LUAD dataset and visualizing results.

Faculty: Fan Li, PhD; Kendra Plourde, PhD

Download: SWCRT_3Level_DesignEffect package

Platform: R Shiny

Reference: doi.org (SWCRT_3Level_DesignEffect)


%subtype_MultipleMarker

A meta-regression method that can utilize existing statistical software for mixed model analysis. This method can be used to assess whether the exposure-subtype associations are different across subtypes defined by one marker while controlling for other markers, and to evaluate whether the difference in exposure-subtype association across subtype defined by one marker depends on any other markers.

Faculty: Donna Spiegelman, ScD

Download: %subtype_MultipleMarker package

Platform: SAS

Reference: doi.org (%subtype_MultipleMarker)


%subtype

The %subtype macro examines whether the effects of the exposure(s) vary by subtype of a disease. It can be applied to data from cohort studies, nested or matched case-control studies, unmatched case-control studies, and case-case studies.

Faculty: Donna Spiegelman, ScD

Download: %subtype package

Platform: SAS

Reference: nih.gov (%subtype)


tcs

The identification of heterogeneity in effects between studies is a key issue in meta-analyses of observational studies, since it is critical for determining whether it is appropriate to pool the individual results into one summary measure. The result of a hypothesis test is often used as the decision criterion. In this paper, the authors use a large simulation study patterned from the key features of five published epidemiologic meta-analyses to investigate the type I error and statistical power of five previously proposed asymptotic homogeneity tests, a parametric bootstrap version of each of the tests, and tau2-bootstrap, a test proposed by the authors. The results show that the asymptotic DerSimonian and Laird Q statistic and the bootstrap versions of the other tests give the correct type I error under the null hypothesis but that all of the tests considered have low statistical power, especially when the number of studies included in the meta-analysis is small (<20). From the point of view of validity, power, and computational ease, the Q statistic is clearly the best choice. The authors found that the performance of all of the tests considered did not depend appreciably upon the value of the pooled odds ratio, both for size and for power. Because tests for heterogeneity will often be underpowered, random effects models can be used routinely, and heterogeneity can be quantified by means of R(I), the proportion of the total variance of the pooled effect measure due to between-study variance, and CV(B), the between-study coefficient of variation.

Faculty: Donna Spiegelman, ScD

Download: tcs package

Platform: Fortran

Reference: doi.org (tcs)


%table1

The %table1 macro computes indirectly standardized rates, means, or proportions. The results are automatically prepared, by level of a given exposure variable, in a formatted MS Word table. The table is intended for use in publications with minimal additional formatting and/or preparation required. Table1 of many papers is a breakdown of cohort characteristics by exposure categories. In most instances, it is necessary to age-standardize the means or proportions of other potential confounders before displaying them by exposure category.

Faculty: Donna Spiegelman, ScD

Download: %table1 package

Platform: SAS