Making sense of numbers and words: Statistical methods

Peter Grimbeek


On getting it right or wrong

It is tempting to assume that data analytic work is done without errors. This is a trifle hopeful. In practice Murphy's law prevails: anything that can go wrong will go wrong.

Checks and balances commonly in place include examining the data diagnostically to ensure that values are within range and that distributions do not depart drastically from normality. The rationale is that results obtained with dodgy data are not worth reporting. Repeating the univariate and multivariate analyses a couple of times after these checks helps to ensure that the outcomes are reliable (reproducible).
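The diagnostic checks described above can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed procedure: the function names, the 1 to 5 response scale, and the sample values are all assumptions made for the example, and skewness is only one of several possible indicators of non-normality.

```python
import statistics

def out_of_range(values, low, high):
    """Return (index, value) pairs that fall outside the closed range [low, high]."""
    return [
        (i, v)
        for i, v in enumerate(values)
        if v is not None and not (low <= v <= high)
    ]

def skewness(values):
    """Moment-based (population) skewness; values near 0 suggest a symmetric distribution."""
    xs = [v for v in values if v is not None]
    mean = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    if sd == 0:
        return 0.0  # constant data: no spread, so no asymmetry to report
    return sum(((x - mean) / sd) ** 3 for x in xs) / len(xs)

# Hypothetical responses on an assumed 1-5 scale; the 9 stands in for a data-entry error.
responses = [3, 4, 2, 5, 9, 1, 4, 3]

flagged = out_of_range(responses, 1, 5)
print("Out-of-range entries (index, value):", flagged)
print("Skewness:", round(skewness(responses), 3))
```

Running checks like these before any modelling step, and again whenever the data file changes, makes it more likely that an impossible value is caught before it reaches a reported result.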

However, occasionally (and more than once is too often), the reported outcomes turn out on subsequent examination to be incorrect.

On one memorable occasion, I failed to notice the many occurrences of zeros as a response option in data used to examine the effect of a specific treatment. The effect of the treatment appeared to be statistically significant, an appearance that dissolved when the zeros were more properly treated as missing data.
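How much a miscoded zero can matter is easy to show with a small made-up example. The sketch below is not the original analysis: the data, group names, and helper functions are invented for illustration, and a simple difference in means stands in for whatever test was actually used.

```python
import statistics

def recode_zeros_as_missing(values):
    """Replace 0 with None, on the assumption that 0 marks 'no response' rather than a score."""
    return [None if v == 0 else v for v in values]

def mean_of_observed(values):
    """Mean of the non-missing values only."""
    observed = [v for v in values if v is not None]
    return statistics.fmean(observed)

# Invented scores: zeros in the control group are really non-responses.
control = [0, 0, 4, 5, 0, 3]
treated = [4, 5, 3, 4, 5, 3]

# Treating zeros as genuine scores drags the control mean down.
naive_diff = statistics.fmean(treated) - statistics.fmean(control)

# Treating zeros as missing removes the apparent treatment effect.
corrected_diff = mean_of_observed(treated) - mean_of_observed(
    recode_zeros_as_missing(control)
)

print("Difference with zeros kept as scores:", naive_diff)
print("Difference with zeros as missing:", corrected_diff)
```

In this invented data the apparent two-point advantage for the treated group disappears entirely once the zeros are recoded, which mirrors the kind of reversal described above.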

On another occasion, I used AMOS to examine the plausibility of a model that incorporated a subset of interrelated variables. After incorporating the outcomes into a journal paper, and while making editorially requested changes, I redid the analysis only to find the outcomes quite different. Two of the variable labels had been swapped, with the effect that the associations were significant but not as reported.