What is a correlation?
So what is a correlation? Having had a look at what research is and what researchers mean when a result is significant, in this post I want to have a look at what we call causal relationships and correlations. A causal relationship is where someone claims to have found that something causes something else, say smoking causes lung cancer, or bad managers cause high turnover rates in an organisation. A correlation is where some research finds that two (or more) things are related to each other.
What does it mean (and not mean) when a research study says it has found a correlation between two things, say the management style of a supervisor and employees' intention to leave or stay in the organisation?
The first thing to say straight off is that a correlation is not a cause. When a study states that it has found a correlation between two things, it only means it has discovered a relationship between those two things. It doesn’t even mean those two things are necessarily directly connected.
For example, imagine we find that every time a bell rings in Manchester, workers in a factory in Oxford stop work and leave the factory. The correlation is very strong. Every time the bell rings the workers stop work. The correlation is pretty near perfect and is certainly significant in research terms (see my last post). Looks good eh?
Well, that is until you realise that the bell rings at 5pm every day and 5pm just happens to be going home time for the factory workers. The only thing linking the bell ringing and the workers stopping work is the time. The two events aren’t actually directly connected. Indeed, if either the time the bell rings or the workers' finishing time were changed, the correlation would immediately break down.
So what we are saying in this case is that the relationship between the bell and the people stopping work is indirect. The two events (the bell and the work stopping) are not actually related to each other; rather, each has a direct relationship with the time (5pm). This is what is known as an indirect correlation. In this case time is what is known as an intervening variable.
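The bell example can be sketched in a few lines of Python. This is only an illustration under made-up assumptions: one observation per hour of a single day, a bell that rings at 5pm, and hypothetical helper names (`pearson`, `bell_rings`, `workers_leave`) that are not from any study. It shows the correlation being near perfect when both events share the same time, and falling to roughly zero when the finishing time moves to 6pm.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

HOURS = range(24)  # assumption: one observation per hour of a day

def bell_rings(hour, bell_hour=17):
    # 1 if the Manchester bell rings in this hour, else 0
    return 1 if hour == bell_hour else 0

def workers_leave(hour, finish_hour=17):
    # 1 if the Oxford workers leave in this hour, else 0
    return 1 if hour == finish_hour else 0

bell = [bell_rings(h) for h in HOURS]
leave_at_5 = [workers_leave(h) for h in HOURS]
leave_at_6 = [workers_leave(h, finish_hour=18) for h in HOURS]

same_time = pearson(bell, leave_at_5)  # both events driven by 5pm
shifted = pearson(bell, leave_at_6)    # finishing time moved to 6pm

print(f"bell vs leaving at 5pm: {same_time:.2f}")
print(f"bell vs leaving at 6pm: {shifted:.2f}")
```

Nothing about the bell itself changed between the two runs. Only the shared time did, which is why the relationship disappears.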
There are two basic types of correlation and two directions of correlation.
The two basic types of correlation
Direct correlation – this is where the two or more things (called variables) are directly connected in some way. Remember, a correlation does not necessarily mean causation. For example, there is a direct (and significant) correlation between the circumference of a circle and its area. The larger the circumference, the larger the area. Circumference > Area
Indirect correlation – this is where the two things are not directly related or connected to each other, but there is some third thing, or variable, that in turn has a direct relationship with the two variables in question. Bell > Time > Finishing work
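In this last case ‘time’ is what is known as an intervening, mediating or intermediate variable. (Where the third variable drives both of the others, as time does here, statisticians more commonly call it a confounding variable.)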
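The circle example given under direct correlation can be checked with a little arithmetic. The sketch below uses the standard formulas for a circle (C = 2πr and A = πr²); the function name `area_from_circumference` and the sample values are just for illustration.

```python
import math

def area_from_circumference(c):
    # C = 2*pi*r  =>  r = C / (2*pi)  =>  A = pi*r^2 = C^2 / (4*pi)
    return c ** 2 / (4 * math.pi)

circumferences = [1, 2, 5, 10, 20]  # arbitrary sample values
areas = [area_from_circumference(c) for c in circumferences]

for c, a in zip(circumferences, areas):
    print(f"circumference {c:>2} -> area {a:.2f}")
```

Here nothing sits between the two variables: once you know the circumference, the area is fixed.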
Whilst the case of the bell in Manchester and the workers in Oxford appears clear, more complex correlations (relationships) that at first appear to be direct may actually be indirect.
The weather and ice cream sales
For example consider the relationship between the weather and ice cream sales.
As temperatures rise ice cream sales increase. There is a correlation between the temperature and ice cream sales. At first sight this may appear to be a direct relationship. However ice cream sales don’t automatically occur as the temperature rises. At a simple level you need humans and the availability of ice cream. So if we go out into the middle of the Sahara desert and test the hypothesis that as temperatures rise so do ice cream sales we are going to find no relationship between the two variables. Mainly because there are no people and no ice creams. Even if there were a mountain of ice creams in the middle of the desert temperatures rising wouldn’t result in increased sales as there isn’t anyone to buy them.
So what at first might appear to be a direct relationship or correlation (temperature > ice cream sales) is actually an indirect one (temperature > humans > ice cream sales). Indeed the relationship is even more complex, because the intervening variable ‘humans’ on its own would most likely not be enough. It is probably more likely to be ‘humans with money’ or even ‘humans with money who like ice cream, who aren’t lactose intolerant and aren’t on a diet’. Things can get quite complex!
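To make the ice cream point concrete, here is a toy model in Python. Everything in it is an assumption made up for illustration: the function `ice_cream_sales`, the 15°C threshold, the rate of 0.05 sales per buyer per degree and the buyer counts are not real data. It simply shows sales rising with temperature where there are buyers, and staying flat where there are none.

```python
def ice_cream_sales(temperature_c, buyers, sales_per_buyer_per_degree=0.05):
    """Hypothetical model: sales grow with temperature only if there are buyers."""
    warm_degrees = max(0, temperature_c - 15)  # assumed: no extra sales below 15C
    return buyers * warm_degrees * sales_per_buyer_per_degree

temperatures = [15, 20, 25, 30, 35]

# A seaside town with 200 willing buyers versus the middle of the Sahara
seaside = [ice_cream_sales(t, buyers=200) for t in temperatures]
sahara = [ice_cream_sales(t, buyers=0) for t in temperatures]

for t, s, d in zip(temperatures, seaside, sahara):
    print(f"{t}C: seaside sales {s:.0f}, Sahara sales {d:.0f}")
```

The temperature rises identically in both places, but without the intervening variable (buyers) there is nothing for it to act on.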
In my next blog in this series I will look at the directions of correlation (not causality, remember).