How to tell if drugs work

This was a talk at a departmental research seminar at UiB philosophy.

According to prevailing evidence-based medicine (EBM) guidelines, randomized controlled trials (RCTs) provide the best evidence for the efficacy of medical interventions and, when available, trump other types of evidence. Some philosophers argue that mechanistic evidence should be considered alongside RCTs (Russo & Williamson, 2007), while others defend EBM with minor qualifications (Howick, 2011; Howick, Glasziou, & Aronson, 2013). Yet others suggest that debating the merits of different types of evidence is irrelevant as long as our theories of evidence and causality idealize away bias in the evidence base (Holman, 2017; Stegenga, 2018). In reality, clinical evidence may be biased due to selective reporting, opportunistic data analyses that favor (commercially) desirable results, or outright fraud. On this point, Jacob Stegenga argues that clinical research is so rigged in favor of drug-based therapies that acquiring supportive evidence from the literature should add very little to our confidence that drugs really work (Stegenga, 2018). In the picture outlined by Stegenga, the objectivity of the clinical research literature is compromised, so that the probability of evidence favoring drug-based therapies is high whether or not the drugs work; at the same time, the history of medicine tells us that most drugs are not particularly effective. Put these two together (the probability of evidence of efficacy is high, but prior knowledge suggests most drugs are not efficacious) and a standard Bayesian theory of confirmation suggests that the confirmatory value of new evidence favoring drug-based therapies is low. In fact, according to Stegenga, so low that our default expectation should probably be that drugs do not work.
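
The Bayesian step in this argument can be made concrete with a small calculation. The following is only a sketch: the function name `posterior` and all the numbers (a prior of 0.1 that a drug works, and the likelihoods of favorable evidence) are illustrative assumptions, not figures from Stegenga or from the talk. It shows that when favorable evidence is almost as likely for ineffective drugs as for effective ones, observing it barely moves the prior.

```python
def posterior(prior, p_e_given_h, p_e_given_not_h):
    """Bayes' theorem: P(H | E) for hypothesis H and evidence E."""
    # Total probability of seeing favorable evidence at all.
    p_e = p_e_given_h * prior + p_e_given_not_h * (1 - prior)
    return p_e_given_h * prior / p_e


# All numbers below are hypothetical, chosen only for illustration.
PRIOR = 0.1  # assumed prior probability that a given drug works

# Biased literature: favorable evidence is likely even if the drug fails.
biased = posterior(PRIOR, p_e_given_h=0.95, p_e_given_not_h=0.80)

# Well-behaved literature: favorable evidence is rare if the drug fails.
unbiased = posterior(PRIOR, p_e_given_h=0.95, p_e_given_not_h=0.05)

print(f"posterior with biased literature:   {biased:.3f}")
print(f"posterior with unbiased literature: {unbiased:.3f}")
```

With these assumed inputs the posterior stays close to the prior in the biased case and rises well above one half in the unbiased case, so the whole weight of the argument rests on how large the probability of favorable evidence for ineffective drugs really is.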

This talk was an attempt to evaluate Stegenga’s argument, taking seriously the failures of objectivity in the creation of clinical evidence. The relevant notion of objectivity here concerns whole bodies of research literature, and can even be discontinuous with the objectivity of ground-level research activities: think of a case of extreme publication bias, where each study may be conducted impeccably while the published record as a whole is skewed. To measure this “metaobjectivity” and evaluate Stegenga’s argument, one should therefore analyse whole bodies of research literature agnostically, as opposed to studying examples of good or bad practices with hindsight, as in Stegenga (2018). I very crudely demonstrated one way of doing this. At the most basic level of research design, clinical trials are fairly homogeneous, and they typically report p-values for the estimated effects. This allows the detection of bias by studying the distribution of p-values in whole populations of clinical trial literature, rather than by providing proof of bad practice at the level of individual studies. The definition of the p-value entails that in a collection of sufficiently similarly conducted trials on drugs that actually work (respectively, do not work), the reported p-values should follow a right-skewed (respectively, flat) distribution, unless the literature is biased. To demonstrate, I presented plotted p-values from clinical trial data on depression and heart disease, scraped from […]. There are too many caveats to mention here, but the tentative results did not support Stegenga-style evidential pessimism with respect to these bodies of literature. Lastly, I free-associated about some implications for the philosophical debate about EBM, such as how EBM’s emphasis on RCTs may reflect not a commitment to this or that theory of causality, but rather an emphasis on standardization that allows one to reduce individual studies to data points in meta-analyses, which can detect failures of objectivity at the level of whole bodies of published evidence.
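
The claim about the shape of the p-value distribution can be checked with a small simulation. This is a sketch under stated assumptions, not the analysis presented in the talk: it uses simulated rather than scraped data, idealized two-arm trials with unit-variance outcomes analysed by a two-sided z-test, and hypothetical names and parameters throughout (`simulate_p_values`, `histogram`, an effect size of 0.5, 50 patients per arm).

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used to turn z into a p-value


def simulate_p_values(effect, n_per_arm, n_trials, rng):
    """Two-sided z-test p-values for idealized two-arm trials."""
    p_values = []
    for _ in range(n_trials):
        # With unit-variance outcomes, the standardized difference in arm
        # means is distributed as N(effect * sqrt(n_per_arm / 2), 1).
        z = rng.gauss(effect * (n_per_arm / 2) ** 0.5, 1.0)
        p_values.append(2 * (1 - STD_NORMAL.cdf(abs(z))))
    return p_values


def histogram(p_values, n_bins=10):
    """Share of p-values falling in each of n_bins equal bins on [0, 1]."""
    counts = [0] * n_bins
    for p in p_values:
        counts[min(int(p * n_bins), n_bins - 1)] += 1
    return [c / len(p_values) for c in counts]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
null_hist = histogram(simulate_p_values(0.0, 50, 20000, rng))
effect_hist = histogram(simulate_p_values(0.5, 50, 20000, rng))

for label, hist in (("no effect", null_hist), ("real effect", effect_hist)):
    print(label)
    for i, share in enumerate(hist):
        print(f"  {i / 10:.1f}-{(i + 1) / 10:.1f} {'#' * round(share * 50)}")
```

In the no-effect case the bars come out roughly equal (a flat distribution), while in the real-effect case most of the mass piles up near zero with a long tail to the right. A literature whose p-values follow neither shape, for instance one with a suspicious bump just below 0.05, would be a candidate for the kind of bias discussed above.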


Holman, B. (2017). Philosophers on drugs. Synthese, 1-28.

Howick, J. (2011). Exposing the vanities—and a qualified defense—of mechanistic reasoning in health care decision making. Philosophy of Science, 78(5), 926-940.

Howick, J., Glasziou, P., & Aronson, J. K. (2013). Can understanding mechanisms solve the problem of extrapolating from study to target populations (the problem of ‘external validity’)? Journal of the Royal Society of Medicine, 106(3), 81-86.