So there has been a big response to this paper press released by BMJ on behalf of the journal Acupuncture in Medicine. The response has been influenced by the usual characters – retired professors who are professional bloggers and vocal critics of anything in the realm of complementary medicine. They thrive on flexing their EBM muscles for a baying mob of fellow sceptics (see my ‘stereotypical mental image’ here). Their target in this instance is a relatively small trial on acupuncture for infantile colic. It deserved to be press released by virtue of being the largest to date in the field, but by no means because it gave a definitive answer to the question of the efficacy of acupuncture in the condition. We need to wait for an SR where the data from the 4 trials to date can be combined.
On this occasion I had the pleasure of joining a short segment on the Today programme on BBC Radio 4 led by John Humphrys. My opponent was David Colquhoun, who spent his short air-time complaining that the journal was even allowed to be published in the first place. Why would BBC Radio 4 invite a retired basic scientist and professional sceptic blogger to be interviewed alongside one of the journal editors – a clinician with expertise in acupuncture (WMA)? At no point was it made manifest that only one of the two had ever been in a position to try to help parents with a baby that they think cries excessively.
So what about the research itself? I have already said that the trial was not definitive, but it was not a bad trial. It suffered from under-recruitment, which meant that it was underpowered in terms of the statistical analysis. But it was prospectively registered, had ethical approval and the protocol was published. Primary and secondary outcomes were clearly defined, and the only change from the published protocol was to combine the two acupuncture groups in an attempt to improve the statistical power because of the under-recruitment. The fact that this decision was made after the trial had begun means that the results have to be considered speculative. For this reason the editors of Acupuncture in Medicine insisted on alteration of the language in which the conclusions were framed to reflect this level of uncertainty.
David Colquhoun has focussed on multiple statistical testing and p values. These are important considerations, and we could have insisted on more clarity in the paper. P values are a guide and the 0.05 level commonly adopted must be interpreted appropriately in the circumstances. In this paper there are no definitive conclusions, so the p values recorded are there to guide future hypothesis generation and trial design. There were over 50 p values reported in this paper, so by chance alone you would expect some to be below 0.05. If one is to claim statistical significance of an outcome at the 0.05 level, ie a 1:20 likelihood of such a result arising by chance alone when there is no real effect, you can only perform the test once. If you perform the test twice you must reduce the significance threshold to 0.025 if you want to claim statistical significance of one or other of the tests. So now we must come to the predefined outcomes. They were clearly stated, and the results of these are the only ones relevant to the conclusions of the paper. The primary outcome was the relative reduction in total crying time (TC) at 2 weeks. There were two significance tests at this point for relative TC. For a statistically significant result, the p values would need to be less than or equal to 0.025 – neither was this low, hence my comment on the Radio 4 Today programme that this was technically a negative trial (more correctly ‘not a positive trial’ – it failed to reject the null hypothesis, ie that the samples were drawn from the same population and the acupuncture intervention did not change the population treated). Finally to the secondary outcome – this was the number of infants in each group who continued to fulfil the criteria for colic at the end of each intervention week. There were four tests of significance, so we need to divide 0.05 by 4 to maintain the 1:20 chance of a random event, ie only draw conclusions regarding statistical significance if any of the tests resulted in a p value at or below 0.0125.
Two of the 4 tests were below this figure, so we say that the result is unlikely to have been chance alone in this case. With hindsight it might have been good to include this explanation in the paper itself, but as editors we must constantly balance how much we push authors to adjust their papers, and in this case the editor focussed on reducing the conclusions to being speculative rather than definitive. A significant result in a secondary outcome leads to a speculative conclusion that acupuncture ‘may’ be an effective treatment option… but further research will be needed etc…
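For readers who want to check the arithmetic, here is a minimal sketch in Python of the adjustment described above (usually called a Bonferroni correction). The function names are my own, the p values in the example are hypothetical placeholders and not figures from the trial, and the chance of at least one spurious result is calculated on the simplifying assumption that the tests are independent, which is unlikely to hold exactly for related outcomes.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test threshold that keeps the overall false positive rate at alpha."""
    return alpha / n_tests


def chance_of_any_false_positive(alpha: float, n_tests: int) -> float:
    """Probability of at least one p value below alpha when no effect exists.

    Assumes the tests are independent (a simplification).
    """
    return 1 - (1 - alpha) ** n_tests


def significant_after_correction(p_values, alpha=0.05):
    """Flag each p value as significant or not at the corrected threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p <= threshold for p in p_values]


# Two tests on the primary outcome, four on the secondary outcome.
primary_threshold = bonferroni_threshold(0.05, 2)    # 0.025
secondary_threshold = bonferroni_threshold(0.05, 4)  # 0.0125

# With 50 uncorrected tests, a p value below 0.05 somewhere is very likely.
risk_with_50_tests = chance_of_any_false_positive(0.05, 50)

# Hypothetical p values for illustration only (NOT taken from the paper).
example_flags = significant_after_correction([0.010, 0.030, 0.008, 0.200])

print(primary_threshold, secondary_threshold)
print(round(risk_with_50_tests, 2))
print(example_flags)
```

Dividing the threshold by the number of tests is the simplest and most conservative way of handling multiplicity. Less strict methods exist, but this is the one that matches the reasoning given here.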
Now a final word on the 3000 plus acupuncture trials that David Colquhoun mentions. His point is that there is no consistent evidence for acupuncture after over 3000 RCTs, so it clearly doesn’t work. He first quoted this figure in an editorial after discussing the largest, most statistically reliable meta-analysis to date – the Vickers et al IPDM. He admits that there is a small effect of acupuncture over sham, but follows the standard EBM mantra that it is too small to be clinically meaningful without ever considering the possibility that sham (gentle acupuncture plus context of acupuncture) can have clinically relevant effects when compared with conventional treatments. Perhaps now the best example of this is a network meta-analysis (NMA) using individual patient data (IPD), which clearly demonstrates benefits of sham acupuncture over usual care (a variety of best standard or usual care) in terms of health-related quality of life (HRQoL).
Key to abbreviations
BMJ – British Medical Journal (company)
EBM – evidence-based medicine
HRQoL – health-related quality of life
IPD – individual patient data
IPDM – individual patient data meta-analysis
MCID – minimal clinically important difference
NMA – network meta-analysis
SR – systematic review
VAS – visual analogue scale (usually a 100mm line)
1. Landgren K, Hallström I. Effect of minimal acupuncture for infantile colic: a multicentre, three-armed, single-blind, randomised controlled trial (ACU-COL). Acupunct Med 2017: acupmed-2016-011208. doi:10.1136/acupmed-2016-011208
2. Vickers AJ, Cronin AM, Maschino AC, et al. Acupuncture for chronic pain: individual patient data meta-analysis. Arch Intern Med 2012;172:1444–53. doi:10.1001/archinternmed.2012.3654
3. Saramago P, Woods B, Weatherly H, et al. Methods for network meta-analysis of continuous outcomes using individual patient data: a case study in acupuncture for chronic pain. BMC Med Res Methodol 2016;16:131. doi:10.1186/s12874-016-0224-1
Declaration of interests
I am the salaried medical director of the British Medical Acupuncture Society (BMAS), a membership organisation and charity established to stimulate and promote the use and scientific understanding of acupuncture as part of the practice of medicine for the public benefit.
I have a very modest private income from lecturing outside the UK, royalties from textbooks and a partnership teaching veterinary surgeons in Western veterinary acupuncture. I have no private income from clinical practice in acupuncture. My income is not directly affected by whether or not I recommend the intervention to patients or colleagues, or by whether or not it is recommended in national guidelines.
I have not chaired any NICE guideline development group with undeclared private income directly associated with the interventions under discussion. I have participated in a NICE GDG as an expert advisor discussing acupuncture.
I have used Western medical acupuncture in clinical practice following a chance observation as a medical officer in the Royal Air Force in 1989. My opinions are formed by data that spans the range of quality and reliability, much of which is in the public domain.
I have a logical mistrust of the motives of anyone who advertises an interest or hobby in being a ‘Skeptic’, as opposed to using appropriate scepticism within their primary profession, or indeed organisations that claim to promote generic ‘science’ as opposed to actually engaging in it.
This blog was edited on 26 January to remove some parts of the text to be in line with usual editorial standards.