In the world of non-randomised studies there is a bucketload of variants, and a common one that we see is the ‘before and after study’. This is, on the face of it, a sensible approach. Do your ‘thing’, then change stuff, and do the ‘other thing’. Monitor something important you hope to change, and if it does change you have some evidence of benefit.
Except for most before & after studies it’s not quite like that. They are rarely conducted prospectively. They usually happen because Something is going on (rising infection rates, loads of Kawasaki diagnoses, increasing did-not-attend (DNA) rates) and Something Must Be Done. So it is. And then the worrying thing falls back down again. This is then noted, the notes are trawled, and the Something That Was Done is given a point at which before was before, and after was after.
Can you see any flaws?
For a start, there’s random fluctuation, which is a particular problem with rare occurrences. After a blip, you’re likely to get things settling down (“regression to the mean” and all that). Then there’s the whole before / after divide: things are rarely Done in one clean movement, and there’s a period of fluff and movement. And there’s also the idea that when you start Doing Something, all sorts of other things happen too. (Take bare below the elbows. There’s little direct evidence that the movement of cloth from off the wrist makes a difference to infections. What it probably does is signal something: a desire to wash hands lots, to encourage others to do so, to isolate infections quickly and to move to improve things for all. It might be claimed that the Thing was rolling up of sleeves, but the sleeves probably didn’t do the dirty work.)
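The regression-to-the-mean point is easier to believe once you have watched it happen. Here is a toy simulation, not an analysis of any real data. The monthly rate of 2, the trigger of 5 cases, and the function names are all made up for illustration. The underlying rate never changes, yet the month after a blip still looks like an improvement.

```python
import math
import random


def poisson(rng, mean):
    """Draw one Poisson-distributed count (Knuth's multiplication method)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_blip_triggered_audits(true_rate=2.0, trigger=5, months=20000, seed=1):
    """Return (mean count in trigger months, mean count in the month after).

    The true rate is constant throughout, so any apparent 'improvement'
    is chance alone. All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    counts = [poisson(rng, true_rate) for _ in range(months)]
    before, after = [], []
    for this_month, next_month in zip(counts, counts[1:]):
        if this_month >= trigger:      # the blip that makes Something get Done
            before.append(this_month)
            after.append(next_month)   # nothing real has changed
    return sum(before) / len(before), sum(after) / len(after)


before_mean, after_mean = simulate_blip_triggered_audits()
print(f"before: {before_mean:.2f}  after: {after_mean:.2f}")
```

Because ‘before’ is only ever measured in months that were unusually bad, it is at least 5 by construction, while ‘after’ drifts back towards the true rate of about 2. No intervention required.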
And then there’s publication bias. If you did Something and it didn’t work, how likely are you to report it? For instance, if with VIP-induced diarrhoea you commenced octreotide and the poo settled, you’d probably write it up. But if you commenced racecadotril and next-to-nothing happened, no-one’s going to be encouraging you to submit an abstract to the Spring Meeting.
Now in all this scepticism there are some things you may be able to hold onto.
1) Really really big effects – things with greater than 5-fold change – are likely to be real (ish – may still be exaggerated)
2) Actions which are truly before / after, or other discontinuities (for example when a drug is banned from use), where the outcome is consistently monitored in both time frames, are more believable
3) A clear difference in the smoothed trends between the two periods, or a difference that persists when the period of change is removed from the analysis (taking out the ‘blip’ that might have driven change), is weakly more convincing
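The third point, taking the period of change out of the analysis, can be sketched in a few lines. This is a rough illustration only. The monthly counts are invented, `compare_periods` and its `washout` parameter are hypothetical names, and a real analysis would want something sturdier (an interrupted time series model, say) than a pair of means.

```python
from statistics import mean


def compare_periods(counts, change_index, washout=0):
    """Mean before and after a change point, skipping `washout` points each side."""
    before = counts[: max(change_index - washout, 0)]
    after = counts[change_index + washout:]
    if not before or not after:
        raise ValueError("washout leaves no data on one side of the change")
    return mean(before), mean(after)


# Invented monthly counts: steady around 2, a blip of 6 and 7, then steady again.
monthly = [2, 1, 3, 2, 2, 3, 1, 2, 6, 7, 2, 3, 1, 2, 2, 3, 2, 2]
change = 10  # hypothetical month index at which Something Was Done

naive = compare_periods(monthly, change)               # blip counted as 'before'
trimmed = compare_periods(monthly, change, washout=2)  # blip and settling-in dropped

print(f"naive:   before {naive[0]:.2f}, after {naive[1]:.2f}")
print(f"trimmed: before {trimmed[0]:.2f}, after {trimmed[1]:.2f}")
```

With the blip left in, ‘before’ looks worse than ‘after’. With the two months either side of the change removed, the two periods in this made-up series are the same, which is the pattern you would expect if the blip alone drove the apparent benefit.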
You may have really liked this study design before. But after?