Spirals of re-validation

In the UK, revalidation, to the eyes of anyone in a permanent post, brings thoughts of GMC-pleasing e-paperwork and the joy of yet more hours staring at a barely functional system to prove you are safe and sensible enough not to play with computers, but to auscultate bunnies (where necessary) and diagnose life-threatening disorders.

To our more statistically minded colleagues, it might bring about thoughts of repeating a validation – a process of assuring a prediction model, which may be prognostic, diagnostic or risk-based – to check it’s still valid, perhaps after calibration. Model, in this setting, just means fancy equation, or ‘maths machinery’. Sadly there are no tiny die-cast toys to actually play with.

Now the basic idea of validation is to check that a prediction model, for example a way of estimating the prognosis of a patient admitted to PICU, is still accurate, and that the factors in the model still work to discriminate between the high and low risk cases. It might be that if overall survival has improved, the model still works, but it needs its sights raising a little. (If you think of the model as producing a straight line on a graph, this can be thought of as raising the intercept.)

Alternatively, it might be that the factors which were hugely prognostic previously just aren’t as discriminating; we may have improved our ability to ventilate, control blood pressure and fix failing kidneys further than we could before. In these cases, the line is at the wrong angle, and needs re-sloping.
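For those who like to see the maths machinery turning, here is a minimal sketch of both kinds of adjustment. It assumes the "straight line" lives on the log-odds scale, which is the usual approach for risk models but not something any particular score is guaranteed to use. Everything in it is made up for illustration: the data are simulated, and the names (`recalibrate`, `TRUE_A`, `TRUE_B`) are hypothetical rather than taken from any real scoring system.

```python
import math
import random


def logit(p):
    """Turn a risk (0-1) into log-odds."""
    return math.log(p / (1.0 - p))


def expit(z):
    """Turn log-odds back into a risk (0-1)."""
    return 1.0 / (1.0 + math.exp(-z))


def recalibrate(old_risks, outcomes, fit_slope=True, iterations=25):
    """Fit new_logit = a + b * old_logit by Newton-Raphson.

    fit_slope=False keeps b at 1 and only shifts the intercept a;
    fit_slope=True re-estimates the intercept and the slope together.
    """
    x = [logit(p) for p in old_risks]
    a, b = 0.0, 1.0
    for _ in range(iterations):
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for xi, yi in zip(x, outcomes):
            p = expit(a + b * xi)
            w = p * (1.0 - p)
            g_a += yi - p            # gradient with respect to a
            g_b += (yi - p) * xi     # gradient with respect to b
            h_aa += w                # information matrix entries
            h_ab += w * xi
            h_bb += w * xi * xi
        if fit_slope:
            det = h_aa * h_bb - h_ab * h_ab
            a += (h_bb * g_a - h_ab * g_b) / det
            b += (h_aa * g_b - h_ab * g_a) / det
        else:
            a += g_a / h_aa
    return a, b


# Simulated cohort: an old model's predicted risks, and outcomes drawn from
# a "new era" in which risk is lower overall (TRUE_A < 0) and the old
# factors discriminate less well (TRUE_B < 1). These values are assumptions.
random.seed(1)
TRUE_A, TRUE_B = -0.5, 0.7
old_risks = [random.uniform(0.02, 0.60) for _ in range(20000)]
outcomes = [
    1 if random.random() < expit(TRUE_A + TRUE_B * logit(p)) else 0
    for p in old_risks
]

# Option 1: raise or lower the sights only (intercept shift, slope fixed at 1).
a_only, b_fixed = recalibrate(old_risks, outcomes, fit_slope=False)
new_intercept_only = [expit(a_only + logit(p)) for p in old_risks]

# Option 2: re-slope the line as well.
a, b = recalibrate(old_risks, outcomes, fit_slope=True)

print("intercept-only shift:", round(a_only, 3))
print("intercept and slope:", round(a, 3), round(b, 3))
```

A fitted slope well below 1 would suggest the old factors are not separating high and low risk patients as sharply as they used to; a slope near 1 with a non-zero intercept would suggest the model only needs its sights moving.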

These modifications of the model are re-calibrations. Now if you think about it, the first time the model was produced it was calibrated. And we needed to validate it. Now we have re-calibrated it, do we need to re-validate it? And if we tweak a bit more, do we need to re-assess our tweaking a bit … until we have spiralled into a madness of unending fiddlechecking?

Well … like most things in life … there’s a balance to be struck. Practically, we have to believe that we can use the model (if it’s any good), but remember that we need to keep making sure it works every now and then. So there are very clever folks who will periodically check calibration and shift things like the PIM (Paediatric Index of Mortality) score, producing era-specific modifications like PIM2 and PIM3. It has been suggested that some diagnostic models should be calibrated against local baseline rates of certain disease-states, in an attempt to strike a compromise between the maths and the medicine. For the user of evidence, the key parts seem to be, as always: check the methods are sound; assess the outcomes for relevance; make sure the study can apply to your group of patients; and fiddle sensibly to make it work.
