Guest Blog: The end of systematic reviews?

So, the title's intentionally provocative and NOT the brainchild of the post's author (@JRBTrip of @TRIPdatabase) … but Jon has provided us at the Archives with a paediatric-orientated version of the new TRIP rapid-review system. Read on to find out more, and comment / tweet us your thoughts … Bob Phillips for @ADC_BMJ

Trip Rapid Review system.

The Trip Database is a free, EBM-focussed search engine.  Think Google, but for those interested in high-quality evidence.  Amongst other content types (e.g. guidelines, CATs) we include systematic reviews, possibly the internet's largest collection of them.  Systematic reviews are a vital component of evidence-based healthcare.  However, they are not without problems, and I have written extensively on the topic.  One of the biggest is that they take far too long (nearly two years), which means that many are out of date when launched.  Also, given the huge workload, there are not enough of them, meaning many (?most) areas of clinical practice are not covered. It is from this thinking that Trip created its recently released rapid review system.  It's certainly not intended to replace systematic reviews; it's more of a proof of concept, a curio, something of interest!  I wanted to show that, with very little money and some imagination, you can produce something really interesting (it's certainly that) that might alter the perception of people involved in systematic review production (which it might).  But there are lots of issues with the system, the main one being that it hasn't been validated (although we've shown some positive early results).

So, what does it do and how does it work? It's really quite simple: we used machine learning to train the system to recognise when an abstract reports that the intervention has a positive effect and when it reports a negative effect.  This took an age, requiring over 500 abstracts to be added to the training system, each read and marked as being positive or negative. Scarily, most machine learning training tasks require thousands of examples, so there's still further learning/teaching required.  As such, the system occasionally has problems, and we then need a human to tell us the answer.
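To give a flavour of the kind of text classification described above, here is a minimal sketch of labelling abstracts as positive or negative from their word frequencies. This is a toy Naive Bayes classifier of my own devising, not Trip's actual model; the function names and the pos/neg labels are illustrative assumptions.

```python
# Toy illustration only: NOT Trip's real classifier. A Naive Bayes model
# that labels an abstract "pos" or "neg" from the words it contains.
from collections import Counter
import math

def train(examples):
    """examples: list of (abstract_text, label), label is 'pos' or 'neg'."""
    counts = {"pos": Counter(), "neg": Counter()}
    totals = Counter()
    for text, label in examples:
        counts[label].update(text.lower().split())
        totals[label] += 1
    vocab = set(counts["pos"]) | set(counts["neg"])
    return counts, totals, vocab

def classify(model, text):
    """Return the label with the higher (log) probability for this text."""
    counts, totals, vocab = model
    n = sum(totals.values())
    best, best_lp = None, -math.inf
    for label in ("pos", "neg"):
        lp = math.log(totals[label] / n)           # prior from label frequency
        size = sum(counts[label].values())
        for word in text.lower().split():
            # Laplace smoothing so unseen words don't zero the probability
            lp += math.log((counts[label][word] + 1) / (size + len(vocab)))
        if lp > best_lp:
            best, best_lp = label, lp
    return best
```

In practice a real system needs far more than two training examples, which is exactly the "thousands of examples" point made above.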

So, what does the system look like?  When you go to the Trip site, to the right of the search box are three buttons; Trip Rapid Review is the bottom one (underneath the PICO search button).  You simply click on that and two search boxes appear: one for the population of interest (e.g. childhood epilepsy) and the other for the intervention (e.g. melatonin).  Once you've entered these, you press 'Search' and Trip goes off and returns articles that match the search terms and are likely to be clinical trials.  The user then scans the returned trials, selects the ones that match their intention and presses the 'Analyse' button.  In less than ten seconds a score is shown.  The maximum score an intervention can get is +1 and the minimum is -1.

The score is generated based on a number of reasonable-sounding (to me) assumptions!  Each trial is given a score of +1 (if positive) or -1 (if negative), and we then adjust these scores based on sample size (we use machine reading for this).  If the trial is large (>1000 participants) we leave the score untouched.  If the trial is medium (100-999 participants) we reduce the score to +/-0.5, and if it's small (<100) the score becomes +/-0.25.  The adjustments are based on the broad principle that the smaller the trial, the more unreliable the results are.  We then add up all the scores and divide by the number of trials, giving an average score.
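The scoring rule above is simple enough to sketch in a few lines of code. This is my own illustrative reconstruction from the description, not Trip's actual implementation; the function names are made up.

```python
# Hypothetical sketch of Trip's scoring rule as described in the post
# (my own reconstruction, not Trip's code).

def trial_score(is_positive, participants):
    """Score one trial: +1 or -1, down-weighted by sample size."""
    sign = 1.0 if is_positive else -1.0
    if participants > 1000:        # large trial: full weight
        weight = 1.0
    elif participants >= 100:      # medium trial (100-999)
        weight = 0.5
    else:                          # small trial (<100)
        weight = 0.25
    return sign * weight

def rapid_review_score(trials):
    """Average the per-trial scores; the result lies in [-1, +1].

    trials: list of (is_positive, participants) tuples."""
    return sum(trial_score(pos, n) for pos, n in trials) / len(trials)
```

So, for example, one medium positive trial and one small negative trial would give (0.5 - 0.25) / 2 = 0.125, illustrating how a mixed evidence base lands near zero.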

The biggest issue we have is: what do the scores mean?  I'm fairly happy with the notion that the closer the score is to +1, the more likely the intervention is to be effective.  But at what point does it become effective, and at what point is the result unclear?  For instance, is a score of 0.2 good or bad?  I think much depends on the number of trials and their typical size, but these are all things that'll need to be worked through.

To illustrate all the above points I've created a screencast of me working through the example of melatonin in childhood epilepsy.  If a picture paints a thousand words, how many does a screencast?

In summary, what is the Trip Rapid Review system?

Well, I'd reiterate that it's not been validated and is therefore potentially dangerous.  But it's novel, it's free, it's quick, and it does provide a rapid overview of trials.  Perhaps it's more a 'ready reckoner' than a systematic review.  I still maintain it's a curio: something to get people thinking that we're not wedded to systematic reviews having to take years.

– Jon Brassey,
