In the SimTOST R package, which is designed for sample size estimation in bioequivalence studies, hypothesis testing is based on the Two One-Sided Tests (TOST) procedure (Sozu et al. 2015). In TOST, the equivalence test is framed as a comparison between the null hypothesis that ‘the new product is worse by a clinically relevant quantity’ and the alternative hypothesis that ‘the difference between products is too small to be clinically relevant’. This vignette focuses on a parallel design with 2 arms (treatments) and 5 primary endpoints.
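To make the TOST idea concrete, the following base-R sketch applies two one-sided t-tests to simulated log-scale data for a single endpoint. It is only an illustration under stated assumptions, not SimTOST's internal implementation: the helper name `tost_parallel`, the simulated data, and the sample size of 200 per arm are all hypothetical.

```r
# Minimal TOST sketch for one endpoint on the log scale (base R only).
# tost_parallel() is a hypothetical helper, not a SimTOST function.
tost_parallel <- function(x_t, x_r, lower = log(0.8), upper = log(1.25),
                          alpha = 0.05) {
  # First one-sided test: reject "difference <= lower margin"
  p_lower <- t.test(x_t, x_r, mu = lower, alternative = "greater",
                    var.equal = TRUE)$p.value
  # Second one-sided test: reject "difference >= upper margin"
  p_upper <- t.test(x_t, x_r, mu = upper, alternative = "less",
                    var.equal = TRUE)$p.value
  # Equivalence is declared only if both one-sided tests reject
  list(p_lower = p_lower, p_upper = p_upper,
       equivalent = max(p_lower, p_upper) < alpha)
}

set.seed(1)
x_r <- rnorm(200, mean = log(1.00), sd = 0.3)  # reference arm (simulated)
x_t <- rnorm(200, mean = log(1.05), sd = 0.3)  # test arm (simulated)
res <- tost_parallel(x_t, x_r)

# A test arm shifted far beyond the upper margin should not be equivalent
res_shifted <- tost_parallel(x_r + 1, x_r)
```

Declaring equivalence when the larger of the two p-values is below \(\alpha\) is the same as requiring both one-sided tests to reject at level \(\alpha\).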

In the following two examples, we demonstrate the use of SimTOST for parallel trial designs with data assumed to follow a normal distribution on the log scale. We start by loading the package.

library(SimTOST)

Multiple Independent Co-Primary Endpoints

Here, we consider a bioequivalence trial with 2 treatment arms and \(m=5\) endpoints. The sample size is calculated so that the test and reference products can be declared equivalent with respect to all 5 endpoints. The true ratio between the test and reference products is assumed to be 1.05 for every endpoint. The standard deviation of the log-transformed response variable is assumed to be \(\sigma = 0.3\), and all tests are assumed to be independent (\(\rho = 0\)). The equivalence limits are set at 0.80 and 1.25, the significance level at 0.05, and the target power at 0.8.
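As a rough plausibility check on this scenario, the sketch below uses a normal approximation to the power of a single TOST and exploits independence, under which all \(m\) tests succeed with probability equal to the single-endpoint power raised to the power \(m\). This is an assumption-laden shortcut (known variance, z- instead of t-quantiles), not the simulation-based method used by the package, and the function names are hypothetical.

```r
# Normal-approximation power of one TOST on the log scale
# (parallel design, n subjects per arm, symmetric margins +/- margin).
# tost_power_approx() and overall_power() are hypothetical helpers.
tost_power_approx <- function(n, delta, sigma, margin, alpha = 0.05) {
  se <- sigma * sqrt(2 / n)          # SE of the difference in means
  z  <- qnorm(1 - alpha)             # one-sided critical value
  p  <- pnorm((margin - delta) / se - z) +
        pnorm((delta + margin) / se - z) - 1
  max(p, 0)                          # the approximation can dip below 0
}

# With independent endpoints, all m tests succeed with probability power^m
overall_power <- function(n, m, ...) tost_power_approx(n, ...)^m

# Smallest n per arm reaching 80% overall power for the scenario above
n <- 2
while (overall_power(n, m = 5, delta = log(1.05), sigma = 0.3,
                     margin = log(1.25)) < 0.8) {
  n <- n + 1
}
n  # approximate number of subjects per arm
```

Because the approximation ignores the extra uncertainty from estimating the variance, it can come out slightly below a simulation-based answer.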

This example is adapted from Mielke et al. (2018), who employed a difference-of-means test on the log scale. The sample size calculation can be conducted using two approaches, both of which are illustrated below.

Approach 1: Using sampleSize_Mielke

In the first approach, we calculate the required sample size for 80% power using the sampleSize_Mielke() function. This method directly follows the approach described in Mielke et al. (2018), assuming a difference-of-means test on the log-transformed scale with specified parameters.

ssMielke <- sampleSize_Mielke(power = 0.8, Nmax = 1000, m = 5, k = 5, rho = 0, 
                              sigma = 0.3, true.diff = log(1.05), 
                              equi.tol = log(1.25), design = "parallel", 
                              alpha = 0.05, adjust = "no", seed = 1234, 
                              nsim = 10000)
ssMielke
#> power.a      SS 
#>  0.8196 68.0000

For 80% power, 68 subjects per arm (136 in total) would be required.

Approach 2: Using sampleSize

Alternatively, the sample size calculation can be performed using the sampleSize() function. This method assumes that effect sizes are normally distributed on the log scale and uses a difference-of-means test (ctype = "DOM") with user-specified values for mu_list and sigma_list. This method allows for greater flexibility than Approach 1 in specifying parameter distributions.

mu_r <- setNames(rep(log(1.00), 5), paste0("y", 1:5))
mu_t <- setNames(rep(log(1.05), 5), paste0("y", 1:5))
sigma <- setNames(rep(0.3, 5), paste0("y", 1:5))
lequi_lower <- setNames(rep(log(0.8), 5), paste0("y", 1:5))
lequi_upper <- setNames(rep(log(1.25), 5), paste0("y", 1:5))

ss <- sampleSize(power = 0.8, alpha = 0.05,
                 mu_list = list("R" = mu_r, "T" = mu_t),
                 sigma_list = list("R" = sigma, "T" = sigma),
                 list_comparator = list("T_vs_R" = c("R", "T")),
                 list_lequi.tol = list("T_vs_R" = lequi_lower),
                 list_uequi.tol = list("T_vs_R" = lequi_upper),
                 dtype = "parallel", ctype = "DOM", lognorm = FALSE, 
                 adjust = "no", ncores = 1, nsim = 10000, seed = 1234)
ss
#> Sample Size Calculation Results
#> -------------------------------------------------------------
#> Study Design: parallel trial targeting 80% power with a 5% type-I error.
#> 
#> Comparisons:
#>    R vs. T 
#>     - Endpoints Tested: y1, y2, y3, y4, y5 
#>       (multiple co-primary endpoints, m =  5 )
#> -------------------------------------------------------------
#>                  Parameter       Value
#>          Total Sample Size         136
#>             Achieved Power        80.1
#>  Power Confidence Interval 79.3 - 80.9
#> -------------------------------------------------------------

For 80% power, a total of 136 subjects would be required.
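The power confidence interval in the output reflects Monte Carlo error from the finite number of simulated trials. The reported bounds are consistent with a normal-approximation (Wald) interval for a binomial proportion, as the short sketch below shows. Whether SimTOST computes the interval in exactly this way is an assumption here.

```r
# Monte Carlo uncertainty of a simulated power estimate (Wald interval).
power_hat <- 0.801   # achieved power reported in the output above
nsim      <- 10000   # number of simulated trials used above

se_hat <- sqrt(power_hat * (1 - power_hat) / nsim)
ci     <- power_hat + c(-1, 1) * qnorm(0.975) * se_hat
round(100 * ci, 1)   # 79.3 80.9
```

Increasing `nsim` narrows this interval, at the cost of longer run time.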

Consider an alternative scenario in which the standard deviation of the log-transformed response variable is unknown, but the standard deviation on the original scale is known (\(\sigma = 1\)). The sampleSize() function can still be used in this case: provide all inputs on the raw scale, including the means, standard deviations, and equivalence bounds, and set ctype = "ROM" (ratio of means) and lognorm = TRUE.
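To see what moving between the raw and log scales involves, the sketch below applies the standard moment-matching relations for a log-normal variable. The raw-scale means of 1.00 and 1.05 are assumed for illustration (chosen to match the ratio used earlier), the helper `raw_to_log` is hypothetical, and whether SimTOST performs exactly this transformation internally is an assumption.

```r
# Moment matching for a log-normal variable X with raw-scale mean and SD:
#   Var(log X) = log(1 + sd^2 / mean^2)
#   E(log X)   = log(mean) - Var(log X) / 2
# raw_to_log() is a hypothetical helper, not a SimTOST function.
raw_to_log <- function(mean_raw, sd_raw) {
  var_log <- log(1 + (sd_raw / mean_raw)^2)
  c(mu_log = log(mean_raw) - var_log / 2, sigma_log = sqrt(var_log))
}

ref  <- raw_to_log(mean_raw = 1.00, sd_raw = 1)  # assumed reference mean
test <- raw_to_log(mean_raw = 1.05, sd_raw = 1)  # assumed test mean

# Back-transform as a sanity check: recovers the raw-scale mean
raw_mean_ref <- unname(exp(ref["mu_log"] + ref["sigma_log"]^2 / 2))
```

With a raw-scale mean of 1 and standard deviation of 1, the implied log-scale standard deviation is \(\sqrt{\log 2} \approx 0.83\), considerably larger than the 0.3 used in the first scenario, so a larger sample size should be expected.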

Multiple Correlated Co-Primary Endpoints

In the second example, we set \(k=m=5\), \(\sigma = 0.3\) and \(\rho = 0.8\). This example is also adapted from Mielke et al. (2018), who employed a difference-of-means test on the log scale. The sample size calculation can again be conducted using two approaches, both of which are illustrated below.
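The effect of correlation on the overall power can be illustrated with a small base-R simulation. The sketch below assumes a compound-symmetry covariance (the same \(\rho\) between every pair of endpoints), a known variance, and z-tests, and it draws the vector of mean differences directly. It is therefore a simplified stand-in, not a reproduction of either package function; `sim_power` is a hypothetical helper and the choice of 52 subjects per arm is only for illustration.

```r
# Simplified Monte Carlo power for m co-primary endpoints (known variance).
# sim_power() is a hypothetical helper, not a SimTOST function.
sim_power <- function(n, rho, m = 5, sigma = 0.3, delta = log(1.05),
                      margin = log(1.25), alpha = 0.05, nsim = 10000) {
  # Compound-symmetry covariance of the m endpoints
  Sigma <- sigma^2 * ((1 - rho) * diag(m) + rho * matrix(1, m, m))
  # Covariance of the vector of mean differences (n subjects per arm)
  V  <- (2 / n) * Sigma
  se <- sigma * sqrt(2 / n)
  # Each row is one simulated trial: mean differences for the m endpoints
  Z <- matrix(rnorm(nsim * m), nsim, m)
  D <- Z %*% chol(V) + delta
  # An endpoint passes TOST if both one-sided z-tests reject
  zcrit <- qnorm(1 - alpha)
  pass  <- ((D + margin) / se > zcrit) & ((D - margin) / se < -zcrit)
  # Overall success requires all m endpoints to pass
  mean(rowSums(pass) == m)
}

set.seed(1234)
p_ind  <- sim_power(n = 52, rho = 0)    # independent endpoints
p_corr <- sim_power(n = 52, rho = 0.8)  # highly correlated endpoints
c(independent = p_ind, correlated = p_corr)
```

When endpoints are positively correlated they tend to pass or fail together, so the probability that all of them pass at a given sample size is higher than under independence. This is why the correlated scenario needs fewer subjects.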

Approach 1: Using sampleSize_Mielke

ssMielke <- sampleSize_Mielke(power = 0.8, Nmax = 1000, m = 5, k = 5, rho = 0.8, 
                              sigma = 0.3, true.diff = log(1.05), 
                              equi.tol = log(1.25), design = "parallel", 
                              alpha = 0.05, adjust = "no", seed = 1234, 
                              nsim = 10000)
ssMielke
#> power.a      SS 
#>  0.8031 52.0000

For 80% power, 52 subjects per arm (104 in total) would be required.

Approach 2: Using sampleSize

Alternatively, the sample size calculation can be performed using the sampleSize() function. This method assumes that effect sizes are normally distributed on the log scale and uses a difference-of-means test (ctype = "DOM") with user-specified values for mu_list, sigma_list, and the correlation parameter rho.

mu_r <- setNames(rep(log(1.00), 5), paste0("y", 1:5))
mu_t <- setNames(rep(log(1.05), 5), paste0("y", 1:5))
sigma <- setNames(rep(0.3, 5), paste0("y", 1:5))
lequi_lower <- setNames(rep(log(0.8), 5), paste0("y", 1:5))
lequi_upper <- setNames(rep(log(1.25), 5), paste0("y", 1:5))

ss <- sampleSize(power = 0.8, alpha = 0.05,
                 mu_list = list("R" = mu_r, "T" = mu_t),
                 sigma_list = list("R" = sigma, "T" = sigma),
                 rho = 0.8, # high correlation between the endpoints
                 list_comparator = list("T_vs_R" = c("R", "T")),
                 list_lequi.tol = list("T_vs_R" = lequi_lower),
                 list_uequi.tol = list("T_vs_R" = lequi_upper),
                 dtype = "parallel", ctype = "DOM", lognorm = FALSE, 
                 adjust = "no", ncores = 1, k = 5, nsim = 10000, seed = 1234)
ss
#> Sample Size Calculation Results
#> -------------------------------------------------------------
#> Study Design: parallel trial targeting 80% power with a 5% type-I error.
#> 
#> Comparisons:
#>    R vs. T 
#>     - Endpoints Tested: y1, y2, y3, y4, y5 
#>       (multiple co-primary endpoints, m =  5 )
#> -------------------------------------------------------------
#>                  Parameter       Value
#>          Total Sample Size         108
#>             Achieved Power        80.5
#>  Power Confidence Interval 79.7 - 81.3
#> -------------------------------------------------------------

References

Mielke, Johanna, Byron Jones, Bernd Jilma, and Franz König. 2018. “Sample Size for Multiple Hypothesis Testing in Biosimilar Development.” Statistics in Biopharmaceutical Research 10 (1): 39–49. https://doi.org/10.1080/19466315.2017.1371071.

Sozu, Takashi, Tomoyuki Sugimoto, Toshimitsu Hamasaki, and Scott R. Evans. 2015. Sample Size Determination in Clinical Trials with Multiple Endpoints. SpringerBriefs in Statistics. Springer International Publishing. https://doi.org/10.1007/978-3-319-22005-5.
