28 Aug, 15 | by Bob Phillips
As it's summer time & thoughts of exciting summer camps expanding skills, of time spent catching up on missed opportunities, or indeed just of beer & strawberries, are filling our lives, it seems appropriate to go entirely left field and explore confidence intervals.
Confidence intervals describe – in terms of interpretation – the range of values where we think the real truth lies (x% of the time – where the ‘x%’ is the number that sits before the CI)*
They are a measure of the precision of our estimate. They can describe almost any quantity – the mean, odds ratios, test sensitivity …
Mostly we see 95% CI; this sort of corresponds to our desire for p-values of <0.05. We sometimes see 99% CI – even more sure the truth is in here – or occasionally 90% – wanting desperately to sell us a Thing, usually.
* You’ll note that a 95% CI means we are WRONG 5% OF THE TIME
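That "wrong 5% of the time" claim is easy to check by simulation. Here's a toy sketch (the true mean, SD and sample size are made-up numbers, and it uses the simple known-SD interval) showing that repeated 95% CIs contain the truth roughly 95% of the time:

```python
import math
import random

def ci_covers_truth(true_mean=50.0, sd=10.0, n=30, z=1.96):
    """Draw one sample and ask: does its 95% CI contain the true mean?"""
    sample = [random.gauss(true_mean, sd) for _ in range(n)]
    mean = sum(sample) / n
    half_width = z * sd / math.sqrt(n)  # known-SD interval, kept simple
    return mean - half_width <= true_mean <= mean + half_width

random.seed(1)
trials = 2000
coverage = sum(ci_covers_truth() for _ in range(trials)) / trials
print(f"CI contained the truth in {coverage:.1%} of {trials} repeats")
```

Run it and the coverage lands close to 95% – and so, as promised, misses the truth about 5% of the time.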
21 Aug, 15 | by Bob Phillips
The shortcut world of acronyms for critical appraisal was lacking one for diagnostic test accuracy – we have RAMbo for RCTs, FAST for systematic reviews, but what of the poor reader of studies evaluating a new test?
We know the basic idea – patients who are considered to potentially have the diagnosis in question receive both the test-under-evaluation and the as-good-as-we-can-get reference standard; these are assessed without looking at the results from the other one; and, if there are cut-offs, these are reproducible.
Wait! That’s it …
14 Aug, 15 | by Bob Phillips
There are a variety of clinical prediction rules in the world. If you’ve seen one – they always used to have a nomogram attached – it would take the answers to a few questions and come up with a ‘probability of bad thing happening’.
As we’ve mentioned previously, there’s an issue with deriving models and assuming they will work as well in every other place you try them (and have a read here and here if you want a fuller explanation from the E&P section). A new model needs to be ‘validated’ – that is, checked to see if it works.
Two bits could be looked at:
a) Calibration – does the ‘predicted probability’ from the model match(~ish) the observed probability in the new set
b) Discrimination – if you set a threshold (say, less than 10% chance = low risk), does the model have the same sort of sensitivity and specificity as the original data?
How close these need to be to claim effective validation is, as with many things, a matter of clinical judgement, but it's worth understanding what they are validating to make a sensible assessment.
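The two checks above can be sketched in a few lines of code. This is an illustration only – the risk bands, threshold and toy data are all invented – but it shows what "calibration" and "discrimination" actually compare:

```python
def calibration_by_band(preds, outcomes,
                        bands=((0.0, 0.1), (0.1, 0.3), (0.3, 1.0))):
    """Calibration: within each predicted-risk band, does the mean
    predicted probability match the observed event rate?"""
    rows = []
    for lo, hi in bands:
        idx = [i for i, p in enumerate(preds) if lo <= p < hi]
        if idx:
            mean_pred = sum(preds[i] for i in idx) / len(idx)
            observed = sum(outcomes[i] for i in idx) / len(idx)
            rows.append((lo, hi, mean_pred, observed))
    return rows

def sens_spec_at_threshold(preds, outcomes, threshold=0.1):
    """Discrimination at a threshold: sensitivity and specificity
    of calling 'high risk' when predicted probability >= threshold."""
    tp = sum(1 for p, y in zip(preds, outcomes) if p >= threshold and y)
    fn = sum(1 for p, y in zip(preds, outcomes) if p < threshold and y)
    tn = sum(1 for p, y in zip(preds, outcomes) if p < threshold and not y)
    fp = sum(1 for p, y in zip(preds, outcomes) if p >= threshold and not y)
    return tp / (tp + fn), tn / (tn + fp)

# Toy validation set: predicted probabilities and observed outcomes
preds = [0.05, 0.2, 0.6, 0.04, 0.5, 0.15]
outcomes = [0, 0, 1, 0, 1, 1]
rows = calibration_by_band(preds, outcomes)
sens, spec = sens_spec_at_threshold(preds, outcomes, threshold=0.1)
```

In a real validation you'd compare these numbers against those reported for the derivation cohort.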
7 Aug, 15 | by Bob Phillips
The basics of evidence based medicine are to ask a question, acquire a paper that might answer it, appraise the study, apply its results and assess performance.
The appraisal bit can be done a few different ways – but underneath nearly all of them sits a similarity of key concepts – it’s just the gloss that varies.
But how something looks is VERY important (says my 12 year old). So you might like the look of the pictorial GATE approach, or the simplicity and rapidity of the tiny acronyms like RAMbo, FAST and AVID. Or a more leisurely series of questions, as promoted by and freely available from the CASP team, may be what you want to use to bring your appraisals into the light.
These are study-type-specific, annotated checklists of about 10 questions that step you through the key elements of evaluating bias in clinical research studies. There are a wealth of online tools and courses about the checklists, and loads of people like them.
As our recent guest blogger might say, “Have a play!”
31 Jul, 15 | by Bob Phillips
Sometimes, we spot stuff that predicts how things will happen. Well, usually happen. These may be described as ‘risk’ factors – that is, factors which predict something will happen – or ‘prognostic’ factors – things that predict the outcome of a condition. There are a range of generalisations that are sometimes made from ‘predictive’ studies, and if you take an extremely non-medical example you may spot some of their weaknesses.
Say someone reports a study that shows a barking dog predicts a herd of small children in the kitchen. The study was done during daytime hours, in a family home on a suburban street. While the barking was a good predictor (85% of the time), it wasn’t perfect: sometimes there was a delivery driver at the door, though that was usually preceded by the sound of the van drawing up. The authors conclude that those wishing to protect the biscuits in their kitchen should use barking dogs to warn them.
28 Jul, 15 | by Ian Wacogne
I’ve spent quite a while trying to convince you that you really ought not to be writing a case report. But you’re in a bind. Firstly, you’ve got in mind a case report – or you’re under pressure to write a case report with (for) someone. And also, you’ve got to get published. So, what are you going to write instead?
24 Jul, 15 | by Bob Phillips
If you were cycling or driving, you’d probably know what the stopping rules were. Traffic not moving, big red sign, large goose with malevolent glare (Lincolnshire speciality).
What if you’re doing a clinical trial?
There are a variety of things that have been described; some of them are qualitative (SUSARs – suspected unexpected serious adverse reactions) and some statistical. The latter come with a set of maths that leads to reasons to discontinue, either for proven benefit or for futility.
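One of the simplest statistical versions is a Pocock-style group-sequential rule: use the same nominal p-value boundary at every planned interim look. A toy sketch (the interim p-values are invented; ~0.0158 is the standard Pocock constant for five looks at an overall alpha of 0.05):

```python
# Pocock-style boundary: same nominal p-value threshold at each look.
POCOCK_BOUNDARY_5_LOOKS = 0.0158  # five looks, overall alpha = 0.05

def stop_for_benefit(interim_p_values, boundary=POCOCK_BOUNDARY_5_LOOKS):
    """Return the (1-based) interim look at which the trial would stop
    for benefit, or None if no look crosses the boundary."""
    for look, p in enumerate(interim_p_values, start=1):
        if p < boundary:
            return look
    return None

stopped_at = stop_for_benefit([0.20, 0.04, 0.012])
print(stopped_at)  # 3 – the third look crosses the boundary
```

Note that 0.012 would look "significant" against a naive 0.05, but the whole point of the boundary is that repeated looks inflate the false-positive rate, so each individual look must clear a stiffer hurdle.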
21 Jul, 15 | by Bob Phillips
And there are lots of ways to do ‘synthesis’ of evidence within a systematic review. We’ve gone on – at length – about meta-analysis and described qualitative synthesis with meta-ethnography, but in a new paper in the Archives we see how a narrative combination of quantitative research studies with a qualitative framework to understand them can allow us to see where the trees lie in the wood [insert alternative forestry based metaphor if preferred].
This group of authors decided to examine the safety netting tools used after discharge from the paediatric emergency / urgent care department.
17 Jul, 15 | by Bob Phillips
The Bonferroni Correction is the simplest, the most understandable, and the most extreme way of correcting for multiple statistical tests.
You take your ‘significance’ level and divide by the number of tests you are doing.
So if you have set ‘significance’ at 0.05, and do 5 different statistical tests, then to be actually sure of your “rejection of the null hypothesis” (aka – it’s significant! it works!), you need the result to be
0.05 / 5 = 0.01 …. so …
p <0.01 before you can really call it a ‘significant’ result.
(It’s a bit harsh. It probably leads you into too many type II errors.)
(You can also apply it to confidence intervals: do five tests, and for the ‘real’ 95% confidence interval for each one, you need to calculate the 99% confidence interval.)
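The arithmetic is a one-liner, which is why Bonferroni remains so popular despite its harshness:

```python
def bonferroni(alpha, n_tests):
    """Per-test significance threshold under the Bonferroni correction."""
    return alpha / n_tests

threshold = bonferroni(0.05, 5)
print(threshold)        # 0.01 – call a test 'significant' only below this
print(1 - threshold)    # 0.99 – the matching confidence level per interval
```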
14 Jul, 15 | by Bob Phillips
We measure, monitor and assess lots of things in our jobs. We frequently try hard not to think about how poorly reproducible some things are – take breathlessness in children as discussed in a recent blog – and the whole literature is methodologically far weaker than that of intervention research. Sometimes we’d really like to assess something, but find it hard to work out exactly how to make that measurement: for example, what should we be measuring when looking at “time to antibiotics” in sepsis – door to ‘needle’ time? proportion less than one hour? first-fever-to-antibiotic duration?
Sometimes it can help to take a completely different idea, to think about what elements might be important.
Say – “home field advantage” in competitive team sports