Category Archives: Statistics

How to do a literature review


Review by Hernán Piñera, Attribution-ShareAlike Licence

This article by Will Hopkins gives some very practical advice about structuring a literature review. Although some parts of it refer to specific requirements of the journal he is writing for, there is a lot of good advice here that you won’t go wrong following. There is a very good section on assessing the quality of the published work and one on interpreting effects.

Hopkins WG (1999) How to write a literature review, Sportscience 3(1),

Power of your study

We have talked before about how the absence of evidence is not evidence of absence and how to plan sample sizes. This short video by Matt Asher, from the University of Toronto, takes an amusing look at power and sample size. As you can tell, it is written by a frustrated statistician!
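To make the link between power and sample size concrete, here is a minimal sketch in Python. It is not taken from the video. It assumes the simplest textbook case, two equal-sized groups compared on their means with a two-sided test, and it uses the normal approximation. The function names and the default values for alpha and power are illustrative choices only.

```python
import math
from statistics import NormalDist


def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-group comparison of means.

    effect_size is the standardised difference (difference in means / SD).
    This uses the normal approximation, so it slightly underestimates the
    n that a t-test would need.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


def approximate_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power for a given n per group (normal approximation).

    The tiny chance of a significant result in the wrong direction is ignored.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(effect_size * math.sqrt(n_per_group / 2) - z_alpha)


print(sample_size_per_group(0.5))            # n per group for a medium effect
print(round(approximate_power(0.5, 20), 2))  # power with only 20 per group
```

Under these assumptions a medium standardised effect of 0.5 needs about 63 animals or people per group for 80% power, whereas a study with 20 per group has only about a 35% chance of detecting it. A non-significant result from the smaller study would say very little about whether the effect exists.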

Chocolate survival


Chococo’s Chocolate Box of Chocolates by Rob Howard, Attribution-NonCommercial-NoDerivs License

Something for the holidays: an entertaining analysis of the survival of chocolates in hospitals. Enjoy and have a good break with friends and family.
Gajendragadkar RP et al (2013) The survival time of chocolates on hospital wards: covert observational study, BMJ 347 doi:10.1136/bmj.f7198

Thinking about causation


German Upsidedown by Jan-Willem Reusink, Attribution License

A colleague directed me to a recently published study which examined the relationship between neuter status and joint disease in German Shepherds. We had an interesting discussion about the merits of the research methods; however, the most significant flaw I saw in the study was in the conclusions the authors drew from their results.

The researchers found a statistical relationship between neutering dogs before one year of age and the occurrence of some types of joint disease. In their discussion they make it clear that they conclude that this association is causal, and that its direction is from neutering to joint disease. In other words they conclude that early neutering causes an increased risk of joint disease and they go on to propose possible mechanisms.

However, in concluding this the researchers fall into an easy trap. First, they assumed that the association they found was causal and, second, that it ran in one particular direction. Their study was designed to reduce the false detection of chance associations; however, they did not discuss the (very real) possibility of other reasons for the association. One possibility is a relationship between whether owners are likely to neuter early and whether they are likely to do things which increase the chances of joint disease occurring or being detected. As an example, dogs selected for work may be neutered because it is the organisation’s policy to do so, and the fact they are in work may increase the risk of joint disease or its detection, compared to pet dogs. This means that there is a confounding variable (whether the dog belongs to this organisation) which links neutering to joint disease statistically, even though neutering has no direct effect on joint disease. Such confounders can be accounted for in study designs, but they were not in this case, and therefore it is premature to conclude the relationship is causal.
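The working-dog example can be illustrated with a small simulation. This is a sketch with made-up numbers, not a re-analysis of the study: every probability below is a hypothetical value chosen for illustration, and neutering is deliberately given no effect on joint disease at all.

```python
import random


def simulate_dogs(n=50000, seed=1):
    """Simulate dogs where working status drives both neutering and disease."""
    rng = random.Random(seed)
    dogs = []
    for _ in range(n):
        working = rng.random() < 0.3                       # hypothetical: 30% working dogs
        # Working dogs are far more likely to be neutered early (policy).
        neutered_early = rng.random() < (0.9 if working else 0.3)
        # Disease risk depends ONLY on working status, never on neutering.
        joint_disease = rng.random() < (0.30 if working else 0.10)
        dogs.append((working, neutered_early, joint_disease))
    return dogs


def risk(dogs, neutered_early, working=None):
    """Proportion with joint disease in a neuter group, optionally within one stratum."""
    group = [d for d in dogs
             if d[1] == neutered_early and (working is None or d[0] == working)]
    return sum(d[2] for d in group) / len(group)


dogs = simulate_dogs()
crude_rr = risk(dogs, True) / risk(dogs, False)
rr_working = risk(dogs, True, working=True) / risk(dogs, False, working=True)
rr_pets = risk(dogs, True, working=False) / risk(dogs, False, working=False)

print(f"Crude risk ratio:        {crude_rr:.2f}")
print(f"Within working dogs:     {rr_working:.2f}")
print(f"Within pet dogs:         {rr_pets:.2f}")
```

With these invented inputs the crude comparison suggests that early-neutered dogs have nearly twice the risk of joint disease, yet within working dogs and within pet dogs the risk ratio is close to 1. Comparing like with like (stratifying on the confounder) makes the apparent effect disappear.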

Even if there were a reason to suspect a causal relationship, we need to be careful not to assume it is in one particular direction. This paper provides a good example of this. Although one could postulate (as the authors did) reasons why neutering might increase the risk of joint disease, one could also postulate reasons why joint disease might increase the chance of a dog being neutered. For example, dogs with poor conformation, visible as pups, may be neutered because they are not good for showing or breeding, and then go on to develop detectable joint disease later.

For a previous post about the difference between correlation and causation see here. And here is the reference and link to the article about neutered German Shepherds and joint disease:

Hart, B. L., Hart, L. A., Thigpen, A. P. and Willits, N. H. (2016). Neutering of German Shepherd Dogs: Associated joint disorders, cancers and urinary incontinence. Veterinary Medicine and Science 2: 191–199. doi:10.1002/vms3.34

Tables and figures


Cocoa On The Picnic Table by Robin Horn, Attribution-NonCommercial-NoDerivs License

This webpage from Bates College is entitled “Almost Everything You Wanted to Know About Making Tables and Figures.”

It’s a very appropriate title because the page has information about captions, referring to tables and figures in the text, where to place them in your text, and formatting them so they are clear and easy to read. It has lots of helpful examples.

For other posts that talk about tables and figures see here and here.

The importance of study design


Parrot Study by benji2505, Attribution-NonCommercial License

This post directs you to a research paper by Simmons et al (2011) that is freely available online, and is well worth reading for three reasons.

One reason is its amusement value. In this paper the authors “prove” that listening to Beatles music makes you approximately a year and a half younger than you were before. They do not discover, unfortunately, whether it is just something about the song “When I’m sixty-four” or whether it is any Beatles song. This is a pity. One can easily get sick of listening to that particular song.

Another reason is that it is very well written and it is interesting to see how the authors structure their writing. Take, for example, the section headed “Nonsolutions” on the 7th page. Note how the authors introduce the point they are making first, before going on to give the detail. It is very easy to follow this sort of writing and is a pleasure to read. Also note the frequent use of the first person “we” and “our”. This shortens and simplifies sentences and makes the writing more direct.

And lastly, the message of the paper is important. The authors show how easy it is to reach different conclusions by changing the research design as you go along. One take-home message is to spend sufficient time planning and ensure, in your research design, that you can justify every step you plan to take. The other take-home message is that you must record your steps and accurately report them when you write up your findings. Not keeping a research diary in which you record your steps and your thinking? You should be. See this previous post on research diaries.
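One kind of flexibility the paper's title refers to is flexibility in data collection. The sketch below illustrates a single version of it: testing after 20 observations per group and, if the result is not significant, collecting 10 more per group and testing again. This is my own simplified simulation, not the authors' code. It assumes normally distributed data with a known standard deviation of 1 (so a z-test can be used), and there is no real difference between the groups, so every "significant" result is a false positive.

```python
import math
import random
from statistics import NormalDist, fmean


def p_value(a, b):
    """Two-sided z-test for a difference in means, assuming known SD = 1."""
    se = math.sqrt(1 / len(a) + 1 / len(b))
    z = (fmean(a) - fmean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rates(n_sims=4000, seed=2):
    """Return (rate with fixed n = 20, rate when 10 more may be added)."""
    rng = random.Random(seed)
    fixed = flexible = 0
    for _ in range(n_sims):
        # Both groups come from the same population: the null is true.
        a = [rng.gauss(0, 1) for _ in range(20)]
        b = [rng.gauss(0, 1) for _ in range(20)]
        first_look = p_value(a, b) < 0.05
        fixed += first_look
        if not first_look:
            # Not significant yet, so collect 10 more per group and retest.
            a += [rng.gauss(0, 1) for _ in range(10)]
            b += [rng.gauss(0, 1) for _ in range(10)]
        flexible += first_look or p_value(a, b) < 0.05
    return fixed / n_sims, flexible / n_sims


fixed_rate, flexible_rate = false_positive_rates()
print(f"Planned sample size only: {fixed_rate:.3f}")
print(f"Peek, then add more data: {flexible_rate:.3f}")
```

The fixed design should give a false positive rate close to the nominal 5%, while the "peek and top up" design gives a noticeably higher one, even though each individual test looks perfectly respectable. Deciding the sample size in advance, and recording that decision, removes this problem.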

Simmons JP, Nelson LD and Simonsohn U. (2011) False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant. Psychological Science, 22(11), 1359–1366. doi:10.1177/0956797611417632. Available at SSRN:

Reading epidemiological reports


Outbreak! by Russell Ede, Attribution-NonCommercial License

Epidemiology is the study of disease in populations. Epidemiological studies can give us useful information about the likelihood of disease, the performance of tests, the efficacy and safety of treatments and the outcomes and prognosis. This article from the British Medical Journal provides a framework for evaluating epidemiological studies in terms of bias, chance, and evidence for causation.

Figure construction


Pig on the leash!, by Arek, Attribution-NonCommercial License

This post from Stefanie at APA Style Blog takes an amusing look at some of the mistakes people make when designing figures (that includes diagrams and graphs) for their academic papers. The eight issues she identifies are:

  1. Using different font styles and sizes
  2. Poor use of colour
  3. Using shadow and other text effects
  4. Including too much information
  5. Including figures for the sake of decoration
  6. Misrepresenting proportions
  7. Not using abbreviations when they would help
  8. Overusing abbreviations when they do not help

So it’s about finding a balance between letting your creativity loose and keeping it on a leash!

Check out Stefanie’s posting here.

Thinking about missing data


Missing by Anderson Mancini, Attribution License

It’s important not to ignore missing data, but researchers often do. You need to be on the lookout for potential problems caused by missing data when you read research or conduct it. How can missing data be a problem? It means that the sample that is left, the one you analyse, is no longer representative of the whole population. It’s only the ones that were not missing: those that chose to answer all the questions, the animals that enjoyed eating the food, the ones that were well enough to participate in the full trial. It also means that the sample is smaller than it might appear to be and therefore that real differences may not be detected.

In this news article Sarah Hoare explains her research which investigated the inherent bias in surveys of the wishes of patients at the end of their life. She found that conclusions reached from analyses that ignored missing data may have been flawed.

Many clinical trials suffer from problems with missing data, and weight loss trials are a common example. You need to think about this when reading the research or analysing your own results. Why do you think those that did not persist with the weight loss protocol stopped? Could it have been that the protocol did not work for them? If only the data from the patients who completed the trial were analysed, could this suggest the diet worked better than it really did? Yes.
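The weight loss example can be sketched as a small simulation. The numbers are entirely hypothetical: the diet is assumed to have no effect on average, and participants who did not lose weight are assumed to be much more likely to drop out. The dropout probabilities and the spread of weight changes are illustrative assumptions, not estimates from any real trial.

```python
import random
from statistics import fmean


def simulate_trial(n=5000, seed=3):
    """Return (mean change for everyone, mean change for completers, n completers)."""
    rng = random.Random(seed)
    all_changes, completer_changes = [], []
    for _ in range(n):
        # Weight change in kg; the true average effect of the diet is zero.
        change = rng.gauss(0, 3)
        all_changes.append(change)
        # Hypothetical dropout: people who did not lose weight mostly give up.
        p_dropout = 0.6 if change >= 0 else 0.1
        if rng.random() >= p_dropout:
            completer_changes.append(change)
    return fmean(all_changes), fmean(completer_changes), len(completer_changes)


everyone, completers, n_completers = simulate_trial()
print(f"Mean change, all participants: {everyone:+.2f} kg")
print(f"Mean change, completers only:  {completers:+.2f} kg (n = {n_completers})")
```

In this sketch the average change across everyone enrolled is close to zero, but the completers-only average shows a clear weight loss, because the people for whom the diet did not work are disproportionately missing from the analysis. The analysed sample is also a good deal smaller than the number enrolled.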