G*Power: Difference in sample size for ANCOVA vs. repeated measures ANOVA in clinical trials

I want to conduct a randomized pre-post intervention study with two measurement points (1. baseline, 2. post-intervention) and two groups (treatment vs. control), and I am currently trying to work out the required sample size. I am assuming a medium-sized effect (f = 0.25), an alpha of .05 and a power of .80. Typically, ANCOVAs are used to analyse clinical trials (comparing posttest values while correcting for baseline values). For the given parameters, G*Power reports an N of 128 (64 per group). However, a repeated measures ANOVA approach can also be justified for analysing pretest-posttest data with two groups (https://doi.org/10.1177/0962280218789972 ; https://www.theanalysisfactor.com/pre-post-data-repeated-measures/).
In this case, the main effects of time and condition are not of interest; the condition × time interaction is what matters. But when I calculate the required sample size for this approach in G*Power, it reports an estimated total sample size of N = 34 (17 per group!). This feels like it can't be trusted. What am I missing here? Why is there such a big difference between the required sample sizes? What additional assumptions does the RM-ANOVA approach require to justify a difference this large? Sphericity shouldn't be an issue since I have only 2 measurement points.

[Screenshot: G*Power settings for ANCOVA]

[Screenshot: G*Power settings for RM-ANOVA]

asked Jul 19, 2021 at 13:53

1 Answer


This is a topic on which G*Power is easily and seriously misunderstood! Thank you for bringing this up.

At first, we thought the following explained the matter:

"When calculating the sample size for "ANOVA: Repeated measures, within-between interaction" G*Power assumes a so-called "double dissociation effect" (i.e. a positive effect in group A versus a similar negative effect in group B - see the image below). We think this is a flaw, since this assumption is counterintuitive and not clearly documented. As a result, the effect size is doubled: in your example this results in f = 2x0.25 = 0.5."

This seemed plausible, since entering half your value of f (i.e. using f = 0.125) gives the expected N of 128.
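The arithmetic behind that reading can be sketched in a few lines. This is only an illustration under assumptions that are not stated in the thread: that the default within-between procedure uses a noncentrality parameter of λ = f² · N · m / (1 − ρ), with m = 2 measurements and ρ = 0.5 as the correlation among repeated measures, and that the ANCOVA procedure uses λ = f² · N. The function names are hypothetical.

```python
import math


def lambda_ancova(f, n_total):
    """Noncentrality parameter, assuming lambda = f^2 * N."""
    return f ** 2 * n_total


def lambda_rm_interaction(f, n_total, m, rho):
    """Noncentrality parameter, assuming lambda = f^2 * N * m / (1 - rho)
    for the within-between interaction (assumed default specification)."""
    return f ** 2 * n_total * m / (1.0 - rho)


def equivalent_f(f, m, rho):
    """The f that would give the same lambda under lambda = f^2 * N."""
    return f * math.sqrt(m / (1.0 - rho))


# With m = 2 and rho = 0.5 the multiplier m / (1 - rho) is 4, so f doubles.
print(equivalent_f(0.25, 2, 0.5))                 # 0.5

# Halving f in the repeated measures formula matches ANCOVA with f = 0.25.
print(lambda_rm_interaction(0.125, 128, 2, 0.5))  # 8.0
print(lambda_ancova(0.25, 128))                   # 8.0
```

Under these assumptions the factor of two is specific to ρ = 0.5 and m = 2; a different correlation would give a different multiplier.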

However, Edgar Erdfelder, one of the founding fathers of G*Power, explained to us in private correspondence:

"The true reason for the apparent inconsistency is that the effect size in the ANCOVA procedure is defined as f = s_effect / s_error (as stated in the corresponding effect size dialog).

Since the error variance is given by s_error = s_withingroup * sqrt(1-rho^2), where rho is the mean R^2 of the covariates, you should get the expected value if you insert "=0.25/sqrt(1-rho^2)" in the field "Effect size f". With rho^2 = 0.5, as was used in the repeated measures analysis, this leads to "Total sample size = 65". In the main dialog G*Power accepts simple formulas as input, i.e. you would actually insert "=0.25/sqrt(1-0.5)".

Conversely, if you choose in the options dialog (press "Options" at the bottom of the window) "Effect size specification as in Cohen (1988) - recommended", you will get the same behaviour in the repeated measures procedure. Using an effect size f(V) = 0.25 leads to the same value as in the ANCOVA procedure, that is N = 128. In this notation, the f in the ANCOVA procedure is actually f(V).

So far, everything is consistent. The only actual flaw is that, in the ANCOVA procedure, G*Power shows Cohen's "effect size recommendations", which refer to an effect size f = s_effect / s_withingroup.

We are currently preparing a new version of G*Power, which is hopefully clearer and more explicit about these two types of effect size. Specifically, we will use the effect size f = s_effect / s_withingroup and rho^2 as input parameters in ANCOVA."
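The conversion described for the ANCOVA procedure can be written out as a short sketch. It only restates the formula quoted above, f = s_effect / s_error with s_error = s_withingroup * sqrt(1 - rho^2). The function name `ancova_effect_size` and its parameter names are hypothetical, chosen for illustration.

```python
import math


def ancova_effect_size(f_within, rho_sq):
    """Convert f = s_effect / s_withingroup into f = s_effect / s_error,
    using s_error = s_withingroup * sqrt(1 - rho_sq)."""
    if not 0.0 <= rho_sq < 1.0:
        raise ValueError("rho_sq must lie in [0, 1)")
    return f_within / math.sqrt(1.0 - rho_sq)


# Values from the thread: f = 0.25 and rho^2 = 0.5.
f_ancova = ancova_effect_size(0.25, 0.5)
print(round(f_ancova, 4))  # 0.3536
```

Entering roughly 0.3536 as "Effect size f" is equivalent to typing "=0.25/sqrt(1-0.5)" into the field. With rho_sq = 0 the two definitions of f coincide.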

Thank you, Edgar Erdfelder!

Tim van Balkom, Chris Vriend and Adriaan Hoogendoorn