sandsynligvis.dk - Comparing baseline values in randomised controlled trials

Comparing baseline values in randomised controlled trials :: Draft

Claus Thorn Ekstrøm, 1 Aug 2018, 4 min

{% newthought “In many articles on” %} randomised controlled clinical trials (or just RCTs) it is not uncommon that

(randomised controlled trials, or RCTs) are the gold standard in clinical trials. In many articles on RCTs one often sees that

Many randomised controlled trial (RCT) papers report significance tests on baseline parameters measured just before or after randomisation to show that the groups are indeed similar. This is often part of a “baseline characteristics” table. However, significance tests measure the probability of getting the observed (or a stronger) difference by chance, don't they? And if the test is significant we conclude that there is a true difference, because a random difference of that extent would be unlikely. Does a significance test make any sense after randomisation, when we know that any difference must be due to chance?

One often sees p-values reported. Each experimental unit is measured twice, once at baseline before treatment starts and once afterwards, but scientific articles differ widely in how the baseline values, and other background variables, are used.

Articles on randomised controlled trials report

There are typically two places where the baseline values from an RCT are used.

Non-significant results cannot prove that patients were allocated randomly. The P-value is calculated assuming that homogeneity is true, pr(D|H), which is different from pr(H|D). In other words: the probability of Death-by-Hanging is not the probability of Hanging-as-cause-of-Death. Statistically significant baseline differences will occur even if the null hypothesis is true (that is, even if randomisation was carried out properly). Moher et al (2010) state that “significance tests assess the probability that observed baseline differences could have occurred by chance; however, we already know that any differences are caused by chance.” This formulation is somewhat awkward because significance tests do not calculate error probabilities. The P-value of a baseline homogeneity test gives us the probability of the observed differences, or larger differences, if the hypothesis is true. This is not an error rate.

When several baseline variables are tested, the proportion of trials with at least one significant result is much larger than the common 5% Type I error rate. If we take a thousand samples of 150 cases, divided over two groups, with ten items consisting of random numbers, then in about 40% of the samples one or more of the 10 t-tests of “pure noise” will be statistically significant (see Figure 1).
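The 40% figure can be checked with a small simulation. The following is only a sketch under stated assumptions, not the code behind the quoted figure: it assumes equal groups of 75, independent standard normal "items", a pooled-variance two-sample t-test, and a hard-coded, rounded critical value for 148 degrees of freedom. The function names and the seed are made up for illustration. For ten independent tests the expected share is 1 - 0.95^10, which is roughly 0.40.

```python
import math
import random

# Assumption: rounded two-sided 5% critical value of the t distribution
# with 148 degrees of freedom (two groups of 75). Only valid for that design.
T_CRIT_148 = 1.976
N_PER_GROUP = 75


def t_statistic(a, b):
    """Pooled-variance two-sample t statistic."""
    na, nb = len(a), len(b)
    mean_a, mean_b = sum(a) / na, sum(b) / nb
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((x - mean_b) ** 2 for x in b)
    pooled_var = (ss_a + ss_b) / (na + nb - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var * (1 / na + 1 / nb))


def noise_test_is_significant(rng):
    """Test one pure-noise baseline variable between two random groups."""
    a = [rng.gauss(0, 1) for _ in range(N_PER_GROUP)]
    b = [rng.gauss(0, 1) for _ in range(N_PER_GROUP)]
    return abs(t_statistic(a, b)) > T_CRIT_148


def simulate(n_trials=1000, n_vars=10, seed=2018):
    """Share of trials with at least one significant baseline test."""
    rng = random.Random(seed)
    hits = sum(
        any(noise_test_is_significant(rng) for _ in range(n_vars))
        for _ in range(n_trials)
    )
    return hits / n_trials


if __name__ == "__main__":
    print(f"simulated share:  {simulate():.3f}")
    print(f"1 - 0.95**10:     {1 - 0.95 ** 10:.3f}")
```

Real baseline variables are usually correlated, so the share in an actual trial will differ from the independent case, but the point stands: under proper randomisation some "significant" baseline differences are expected.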

Using baseline values to validate the randomisation

The big Table 1

http://www.statisticalmisses.nl/index.php/frequently-asked-questions/84-why-are-significance-tests-of-baseline-differences-a-very-bad-idea
http://jnnp.bmj.com/content/75/2/181
http://pubmedcentralcanada.ca/ptpicrender.fcgi?aid=277913&blobtype=html&lang=en-ca
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1116277/
https://pdfs.semanticscholar.org/e834/97a00c43c5924617ceaccc73840761d4564d.pdf

https://stats.stackexchange.com/questions/9654/does-significance-test-make-sense-to-compare-randomised-groups-at-baseline
https://stats.stackexchange.com/questions/58384/homogeneity-testing-of-baseline-characteristics-in-medical-trials
https://stats.stackexchange.com/questions/17724/how-should-one-control-for-group-and-individual-differences-in-pre-treatment-sco?rq=1

Using baseline values in the analysis of the treatment effect

Using baseline values

Let us start by establishing that there are no general, formal requirements for the layout or content of the “big” Table 1. In other words, the content of the table should match the story … and if the purpose is to present the collected data, then

http://www.sciencedirect.com/science/article/pii/S2221169115303671

Including other predictors

https://stats.stackexchange.com/questions/83277/using-control-variables-in-experiments

From a frequentist perspective, an unadjusted comparison based on the permutation distribution can always be justified following a (properly) randomized study. A similar justification can be made for inference based on common parametric distributions (e.g., the t distribution or F distribution) due to their similarity to the permutation distribution. In fact, adjusting for covariates, when they are selected based on post-hoc analyses, actually risks inflating the Type I error. Note that this justification has nothing to do with the degree of balance in the observed sample, or with the size of the sample (except that for small samples the permutation distribution will be more discrete, and less well approximated by the t or F distributions).
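To make the unadjusted permutation comparison concrete, here is a minimal sketch. It is an illustration under assumptions, not a method prescribed by the sources: the test statistic is the absolute difference in group means, the permutation distribution is approximated by random re-labelling rather than full enumeration, and the function name, number of permutations and seed are hypothetical choices.

```python
import random


def permutation_p_value(treated, control, n_permutations=5000, seed=1):
    """Monte Carlo permutation test for a difference in means (two-sided)."""
    rng = random.Random(seed)
    n_treated = len(treated)
    pooled = list(treated) + list(control)

    def abs_mean_diff(values):
        # Absolute difference between the first n_treated values and the rest.
        first, rest = values[:n_treated], values[n_treated:]
        return abs(sum(first) / len(first) - sum(rest) / len(rest))

    observed = abs_mean_diff(pooled)
    extreme = 0
    for _ in range(n_permutations):
        # Re-randomise the group labels, mimicking the trial's randomisation.
        rng.shuffle(pooled)
        if abs_mean_diff(pooled) >= observed - 1e-12:
            extreme += 1
    # Count the observed allocation itself so the p-value is never zero.
    return (extreme + 1) / (n_permutations + 1)


if __name__ == "__main__":
    # Made-up outcome values, purely for illustration.
    treated = [5.1, 6.3, 5.8, 7.0, 6.1, 6.6]
    control = [4.9, 5.2, 5.5, 4.7, 5.9, 5.0]
    print(f"p = {permutation_p_value(treated, control):.4f}")
```

The validity of this test rests only on the random allocation, which is why it does not depend on how well balanced the observed groups happen to be.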

That said, many people are aware that adjusting for covariates can increase precision in the linear model. Specifically, adjusting for covariates increases the precision of the estimated treatment effect when they are predictive of the outcome and not correlated with the treatment variable (as is true in the case of a randomized study). What is less well known, however, is that this does not automatically carry over to non-linear models. For example, Robinson and Jewell [1] show that in the case of logistic regression, controlling for covariates reduces the precision of the estimated treatment effect, even when they are predictive of the outcome. However, because the estimated treatment effect is also larger in the adjusted model, controlling for covariates predictive of the outcome does increase efficiency when testing the null hypothesis of no treatment effect following a randomized study.
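The precision gain in the linear model can be illustrated with a small simulation. This is a sketch under assumptions of my own choosing: a continuous outcome that depends linearly on one baseline value with slope 1, a true treatment effect of 0.5, unit-variance noise, and 50 patients per group. The adjusted estimate is the ordinary ANCOVA estimate with a common slope, written out in closed form. It covers only the linear case and does not reproduce the logistic regression result.

```python
import math
import random


def mean(values):
    return sum(values) / len(values)


def sd(values):
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def unadjusted_effect(y_t, y_c):
    """Plain difference in mean outcome between treated and control."""
    return mean(y_t) - mean(y_c)


def baseline_adjusted_effect(x_t, y_t, x_c, y_c):
    """ANCOVA estimate: least squares on treatment and baseline, common slope."""
    sxy = sxx = 0.0
    for xs, ys in ((x_t, y_t), (x_c, y_c)):
        mx, my = mean(xs), mean(ys)
        sxy += sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sxx += sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx  # pooled within-group slope
    return (mean(y_t) - mean(y_c)) - slope * (mean(x_t) - mean(x_c))


def simulate(n_trials=500, n_per_group=50, effect=0.5, slope=1.0, seed=2018):
    """Return the unadjusted and adjusted estimates from repeated trials."""
    rng = random.Random(seed)
    unadjusted, adjusted = [], []
    for _ in range(n_trials):
        x_t = [rng.gauss(0, 1) for _ in range(n_per_group)]
        x_c = [rng.gauss(0, 1) for _ in range(n_per_group)]
        y_t = [effect + slope * x + rng.gauss(0, 1) for x in x_t]
        y_c = [slope * x + rng.gauss(0, 1) for x in x_c]
        unadjusted.append(unadjusted_effect(y_t, y_c))
        adjusted.append(baseline_adjusted_effect(x_t, y_t, x_c, y_c))
    return unadjusted, adjusted


if __name__ == "__main__":
    unadjusted, adjusted = simulate()
    print(f"unadjusted: mean {mean(unadjusted):.3f}, sd {sd(unadjusted):.3f}")
    print(f"adjusted:   mean {mean(adjusted):.3f}, sd {sd(adjusted):.3f}")
```

Both estimators target the same treatment effect; the adjusted one varies less from trial to trial because the baseline value explains part of the outcome variance. Note that the covariate here is chosen in advance, not selected after looking at the data.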

[1] L. D. Robinson and N. P. Jewell. Some surprising results about covariate adjustment in logistic regression models. International Statistical Review, 59(2):227–240, 1991.