Now, regression is a bad thing if we’re talking development. It might point to any number of really difficult-to-pronounce neurological conditions, or severe psychological trauma, or abuse/neglect. It’s not going to be good.
In statistics, it’s not quite the same. Regression is quite often a good thing. But what is it?
Well, in simple terms it’s drawing the best-fitting line through a plot of one thing against another – like shoe size and height.
And then using one to predict the other.
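If you like to see things in code, here is a minimal sketch of that idea in Python. The shoe sizes and heights are made up purely for illustration, and `fit_line` and `predict_height` are names invented for this example. It uses ordinary least squares, which is the usual (but not the only) way to choose the "best" line.

```python
# Made-up illustrative data: shoe size and height (cm) for six people.
shoe_sizes = [5, 6, 7, 8, 9, 10]
heights_cm = [158, 166, 169, 176, 179, 187]


def fit_line(xs, ys):
    """Ordinary least squares with one predictor: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, and how much x varies on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(shoe_sizes, heights_cm)


def predict_height(shoe_size):
    """Use the fitted line to predict height from shoe size."""
    return intercept + slope * shoe_size


print(f"height = {intercept:.1f} + {slope:.2f} x shoe size")
print(f"predicted height for shoe size 8.5: {predict_height(8.5):.1f} cm")
```

The slope is the bit people usually care about: it says how much taller, on average, you would expect someone to be for each extra shoe size.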
You can make it more complex by trying to predict if something is happening or not (which must be either ‘yes’ – usually 1 – or ‘no’ – usually 0). This is most often ‘logistic regression’ (but may also be ‘probit regression’ for reasons which are beyond the scope of this mini-blog), and plots the predictor against the probability of the outcome.
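A rough sketch of logistic regression with a single predictor might look like the following. Everything here is an assumption made for illustration: the data (hours of revision and whether someone passed) are invented, and the model is fitted with plain gradient descent to keep the code short, whereas statistical packages normally use more refined fitting routines.

```python
import math

# Made-up data: hours of revision, and whether the person passed (1) or not (0).
hours = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
passed = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]


def sigmoid(z):
    """Squash any number into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, learning_rate=0.1, steps=5000):
    """Fit P(y = 1) = sigmoid(b0 + b1 * x) by gradient descent on the log-loss."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad0 = 0.0
        grad1 = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(b0 + b1 * x) - y  # predicted probability minus outcome
            grad0 += error
            grad1 += error * x
        b0 -= learning_rate * grad0 / n
        b1 -= learning_rate * grad1 / n
    return b0, b1


b0, b1 = fit_logistic(hours, passed)


def predict_probability(x):
    """Predicted probability of a 'yes' for a given value of the predictor."""
    return sigmoid(b0 + b1 * x)


for h in (1, 3, 5):
    print(f"{h} hours -> P(pass) = {predict_probability(h):.2f}")
```

Note that the output is a probability rather than a flat yes or no, which is why the fitted curve is S-shaped rather than a straight line.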
And sometimes you can see if the relationship is different in different groups (e.g. first class vs. coach class, male vs. female when surviving the Titanic) by adding the group to the model as another predictor. This is then multivariable regression.
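One simple way to get a feel for a group difference is to fit a separate line in each group and compare the slopes. This is a stratified sketch with made-up numbers and hypothetical names, not a full multivariable analysis. For straight-line least squares it does give the same fitted lines as a single model containing the group and a group-by-predictor interaction, but it will not tell you whether the difference could just be chance.

```python
def fit_line(xs, ys):
    """Ordinary least squares with one predictor: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up data: the same predictor and outcome measured in two groups.
group_a = {"x": [1, 2, 3, 4, 5], "y": [2.1, 3.9, 6.2, 7.8, 10.1]}
group_b = {"x": [1, 2, 3, 4, 5], "y": [1.9, 2.6, 2.9, 3.6, 4.1]}

slope_a, intercept_a = fit_line(group_a["x"], group_a["y"])
slope_b, intercept_b = fit_line(group_b["x"], group_b["y"])

# A large gap between the slopes suggests the relationship differs by group.
slope_difference = slope_a - slope_b

print(f"group A slope: {slope_a:.2f}")
print(f"group B slope: {slope_b:.2f}")
print(f"difference in slopes: {slope_difference:.2f}")
```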
When you’re looking at how these things are used, you need to interpret them sensibly, and, like all stats, you need to know that it’s not just a data trawl and that the appropriate assumptions are met (which is a Masters module in itself …). But you should – hopefully – now be better able to know what’s going on.
PS – the last graph here is a great example of the discontinuity we mentioned in our post about before-and-after studies.