Statistics refresher resources

Controlling for confounding
"Control of confounding in the analysis phase – an overview for clinicians"

In observational studies, confounding can be controlled for in the design and analysis phases. Using examples from large health care database studies, this article provides clinicians with an overview of standard methods in the analysis phase, such as stratification, standardization, multivariable regression analysis and propensity score (PS) methods, together with the more advanced high-dimensional propensity score (HD-PS) method.

Source: Kahlert J, Gribsholt SB, Gammelager H, Dekkers OM, Luta G. Control of confounding in the analysis phase – an overview for clinicians. Clin Epidemiol. 2017; 9:195-204
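
As a rough illustration of stratification, one of the standard methods listed above, the sketch below compares a crude odds ratio with a Mantel-Haenszel odds ratio pooled across strata of a confounder. The counts and the stratum names are hypothetical and are not taken from the article.

```python
# Hypothetical 2x2 counts per stratum of a confounder (here: age group), as
# (exposed cases, exposed non-cases, unexposed cases, unexposed non-cases).
strata = {
    "younger": (10, 90, 40, 360),
    "older": (120, 280, 30, 70),
}


def odds_ratio(a, b, c, d):
    """Odds ratio of a single 2x2 table."""
    return (a * d) / (b * c)


def crude_odds_ratio(strata):
    """Collapse all strata into one table, ignoring the confounder."""
    a, b, c, d = (sum(column) for column in zip(*strata.values()))
    return odds_ratio(a, b, c, d)


def mantel_haenszel_odds_ratio(strata):
    """Pool the stratum-specific tables with Mantel-Haenszel weights."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata.values())
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata.values())
    return numerator / denominator


crude = crude_odds_ratio(strata)
adjusted = mantel_haenszel_odds_ratio(strata)
print(f"crude OR = {crude:.2f}, stratified (MH) OR = {adjusted:.2f}")
```

In these made-up data the exposure has no association with the outcome within either age group, yet the crude estimate suggests one, because older people are both more often exposed and more often cases.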


p-value disease
"Sifting the evidence—what’s wrong with significance tests?"

This paper considers how the practice of significance testing emerged. An arbitrary division of results into “significant” or “non-significant” (according to the commonly used threshold of p = 0.05) was not the intention of the founders of statistical inference. P-values must be much smaller than 0.05 before they can be considered to provide strong evidence against the null hypothesis. Unfortunately, p-values are still commonly misunderstood. The most common misinterpretation is that the p-value is the probability that the null hypothesis is true and that a significant result thus means that the null hypothesis is very unlikely to be true. The misleading nature of this interpretation is shown by making two plausible assumptions.

Source: Sterne JAC, Smith GD. Sifting the evidence—what's wrong with significance tests? Another comment on the role of statistical methods. BMJ. 2001; 322:226
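
The kind of argument summarised above can be sketched with a short calculation. The two inputs, the proportion of tested hypotheses that reflect a real effect and the average power of the studies, are illustrative values chosen for this sketch, and the function name is hypothetical.

```python
def false_positive_share(prior_true_effect, power, alpha):
    """Share of "significant" results that come from true null hypotheses.

    prior_true_effect: assumed proportion of tested hypotheses with a real effect
    power: assumed probability of a significant result when the effect is real
    alpha: significance threshold
    """
    true_positives = prior_true_effect * power
    false_positives = (1 - prior_true_effect) * alpha
    return false_positives / (true_positives + false_positives)


# Illustrative assumptions: 10% of tested hypotheses are real effects, 50% power.
at_005 = false_positive_share(prior_true_effect=0.10, power=0.50, alpha=0.05)

# Simplification: power is held fixed here, although in practice it would drop
# when the threshold is made stricter.
at_0001 = false_positive_share(prior_true_effect=0.10, power=0.50, alpha=0.001)

print(f"alpha = 0.05:  {at_005:.0%} of significant results are false positives")
print(f"alpha = 0.001: {at_0001:.1%} of significant results are false positives")
```

Under these assumptions, close to half of the results that reach p < 0.05 arise from true null hypotheses, which is far from the "null hypothesis is very unlikely" reading of a significant p-value.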


Non-inferiority trials
"Challenges in the Design and Interpretation of Noninferiority Trials"

This article provides a framework for considering the features, including pitfalls, of noninferiority studies. Cardiovascular treatment trials are used as examples, although noninferiority trials can be conducted in many fields. These trials include studies designed for regulatory approval of new therapies and trials designed to compare established treatments. In addition, the application of noninferiority concepts and design to emerging areas of clinical investigation is considered.

Source: Mauri L, D’Agostino Sr RB. Challenges in the design and interpretation of noninferiority trials. N Engl J Med. 2017; 377(14):1357-1367
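
A minimal sketch of the basic noninferiority decision rule, assuming a binary adverse outcome, a risk-difference scale and a simple Wald confidence interval. The event counts, the margin and the function names are hypothetical and do not come from the article, which discusses many further design issues that this sketch ignores.

```python
from math import sqrt
from statistics import NormalDist


def risk_difference_ci(events_new, n_new, events_ref, n_ref, level=0.95):
    """Wald confidence interval for risk(new) - risk(reference)."""
    p_new = events_new / n_new
    p_ref = events_ref / n_ref
    diff = p_new - p_ref
    se = sqrt(p_new * (1 - p_new) / n_new + p_ref * (1 - p_ref) / n_ref)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return diff - z * se, diff + z * se


def is_noninferior(ci_upper, margin):
    """The outcome is adverse, so the new treatment is declared noninferior
    when the upper confidence limit of the excess risk is below the margin."""
    return ci_upper < margin


# Hypothetical trial: 50 adverse events among 1000 patients in each arm.
lower, upper = risk_difference_ci(50, 1000, 50, 1000)
print(f"risk difference CI: ({lower:.4f}, {upper:.4f})")
print("noninferior at a margin of 0.020:", is_noninferior(upper, margin=0.020))
print("noninferior at a margin of 0.015:", is_noninferior(upper, margin=0.015))
```

The same data lead to opposite conclusions under the two margins, which is why the choice of margin needs to be justified before the trial starts.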


Matched case-control studies
"Analysis of Matched Case-Control Studies"

There are two common misconceptions about case-control studies: that matching in itself eliminates (or controls) confounding by the matching factors, and that if matching has been performed, then a “matched analysis” is required. However, matching in a case-control study does not control for confounding by the matching factors; in fact it can introduce confounding by the matching factors even when it did not exist in the source population. Thus, a matched design may require controlling for the matching factors in the analysis. However, it is not the case that a matched design requires a matched analysis. Provided that there are no problems of sparse data, control for the matching factors can be obtained with no loss of validity and a possible increase in precision using a “standard” (unconditional) analysis, and a “matched” (conditional) analysis may not be required or appropriate. 

Source: Pearce N. Analysis of matched case-control studies. BMJ. 2016; 352:i969
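
To make the distinction between the analyses concrete, the sketch below computes three odds ratios from the same hypothetical 1:1 matched data: a crude estimate that ignores the matching factor, a "standard" estimate stratified on the matching factor (Mantel-Haenszel), and a "matched" estimate based on discordant pairs. The pair counts and function names are invented for illustration only.

```python
# Hypothetical 1:1 matched pairs, keyed by
# (matching stratum, case exposed?, control exposed?) -> number of pairs.
pair_counts = {
    ("young", 1, 1): 5, ("young", 1, 0): 20, ("young", 0, 1): 10, ("young", 0, 0): 65,
    ("old", 1, 1): 40, ("old", 1, 0): 30, ("old", 0, 1): 15, ("old", 0, 0): 15,
}


def conditional_odds_ratio(pair_counts):
    """Matched-pair estimate: ratio of the two kinds of discordant pairs."""
    case_only = sum(n for (_, case, ctrl), n in pair_counts.items() if case and not ctrl)
    control_only = sum(n for (_, case, ctrl), n in pair_counts.items() if ctrl and not case)
    return case_only / control_only


def tables_by_stratum(pair_counts):
    """Break the pairs apart into one 2x2 table per matching stratum, as
    [exposed cases, exposed controls, unexposed cases, unexposed controls]."""
    tables = {}
    for (stratum, case, ctrl), n in pair_counts.items():
        table = tables.setdefault(stratum, [0, 0, 0, 0])
        table[0] += n * case
        table[1] += n * ctrl
        table[2] += n * (1 - case)
        table[3] += n * (1 - ctrl)
    return tables


def mantel_haenszel_odds_ratio(tables):
    """Odds ratio pooled over strata with Mantel-Haenszel weights."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables.values())
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables.values())
    return numerator / denominator


tables = tables_by_stratum(pair_counts)
collapsed = {"all": [sum(column) for column in zip(*tables.values())]}

crude = mantel_haenszel_odds_ratio(collapsed)   # ignores the matching factor
stratified = mantel_haenszel_odds_ratio(tables) # controls for the matching factor
matched = conditional_odds_ratio(pair_counts)   # pair-based analysis
print(f"crude {crude:.2f}, stratified {stratified:.2f}, matched {matched:.2f}")
```

In these invented data the crude estimate sits closer to 1 than the other two, while the stratified estimate lands near the pair-based one. This is only an illustration of the mechanics, not evidence about how the estimates behave in general.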


Stepped wedge cluster randomised trials
"The Stepped Wedge Cluster Randomised Trial: Rationale, Design, Analysis, and Reporting"

The stepped wedge cluster randomised trial is a research study design that is increasingly being used in the evaluation of service delivery type interventions. The design involves random and sequential crossover of clusters from control to intervention until all clusters are exposed. It is a pragmatic study design that can reconcile the need for robust evaluations with political or logistical constraints. It is particularly suited to evaluations that do not rely on individual patient recruitment. As in all cluster trials, stepped wedge trials with individual recruitment and without concealment of allocation (or blinding of the intervention) are at risk of selection biases.

Source: Hemming K, Haines TP, Chilton PJ, Girling AJ, Lilford RJ. The stepped wedge cluster randomised trial: rationale, design, analysis, and reporting. BMJ. 2015; 350:h391
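
The "random and sequential crossover" described above can be sketched as a small schedule generator. The function name, the cluster labels and the choice of one all-control baseline period are assumptions made for this illustration, not details taken from the paper.

```python
import random


def stepped_wedge_schedule(clusters, n_steps, seed=None):
    """Randomly order the clusters, then switch them from control (0) to
    intervention (1) in sequential steps.

    Period 0 is an all-control baseline; by the last period every cluster
    is exposed. Returns {cluster: [status in period 0, 1, ..., n_steps]}.
    """
    rng = random.Random(seed)
    order = list(clusters)
    rng.shuffle(order)  # the random element of the design

    schedule = {}
    for position, cluster in enumerate(order):
        # Spread the clusters as evenly as possible over steps 1..n_steps.
        step = 1 + position * n_steps // len(order)
        schedule[cluster] = [0] * step + [1] * (n_steps + 1 - step)
    return schedule


# Hypothetical example: six clusters crossing over in three steps.
schedule = stepped_wedge_schedule(["A", "B", "C", "D", "E", "F"], n_steps=3, seed=1)
for cluster, periods in sorted(schedule.items()):
    print(cluster, periods)
```

Each printed row is one cluster's exposure over time; reading down a column gives the growing set of exposed clusters that produces the characteristic staircase pattern.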


Multiple testing
"Multiple testing: when is many too much?"

In almost all medical research, more than a single hypothesis is being tested or more than a single relation is being estimated. Testing multiple hypotheses (multiple testing) increases the risk of drawing a false-positive conclusion. This paper briefly discusses this phenomenon and illustrates methods to mitigate the risk of false-positive conclusions.

Source: Groenwold RHH, Goeman JJ, le Cessie S, Dekkers OM. Multiple testing: when is many too much? Eur J Endocrinol. 2021 Mar;184(3):E11-E14. doi: 10.1530/EJE-20-1375. PMID: 33300887.
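
The inflation of false-positive risk, and two common ways of mitigating it, can be sketched as follows. The family-wise error rate formula assumes independent tests, the p-values are invented, and the Bonferroni and Holm corrections are shown as familiar examples rather than as the specific methods the paper illustrates.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false positive among n_tests independent
    tests of true null hypotheses, each carried out at level alpha."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_reject(p_values, alpha=0.05):
    """Reject a hypothesis only if its p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def holm_reject(p_values, alpha=0.05):
    """Holm step-down procedure: test p-values from smallest to largest
    against alpha / (m - rank) and stop at the first non-rejection."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, index in enumerate(order):
        if p_values[index] <= alpha / (m - rank):
            reject[index] = True
        else:
            break
    return reject


# Hypothetical p-values from five tests within one study.
p_values = [0.001, 0.012, 0.03, 0.04, 0.2]
fwer_10 = familywise_error_rate(0.05, 10)
bonferroni = bonferroni_reject(p_values)
holm = holm_reject(p_values)

print(f"chance of at least one false positive in 10 tests: {fwer_10:.0%}")
print("Bonferroni:", bonferroni)
print("Holm:      ", holm)
```

With these p-values Holm rejects one hypothesis more than Bonferroni while still controlling the family-wise error rate, which is why it is often preferred when a simple correction is wanted.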