Making sense of numbers and words: Statistical methods



A statistical game: Categorical, ordinal or equal-interval?

The starting point for much of my current interest in the range of statistical methods for making sense of the world of numbers came in 1999 from interactions with two clients (Fiona Bryer and Wendi Beamish) at the Mount Gravatt campus of Griffith University. They'd asked me to review statistical analyses of dichotomous (e.g., No/Yes) and Likert-scale (e.g., strongly disagree, disagree, unsure, agree, strongly agree) items developed using the Delphi technique (working with groups of participants to the point of consensus about what is important or relevant and what isn't).

That dataset had previously been analysed by reporting correlations, means, and standard deviations per item, and by using t-tests to examine group differences on these categorical/ordinal items. That is, the analyses had assumed the items were equal-interval (the distance between 1 and 2 equivalent to the distance between 3 and 4, and so on) and met the assumptions required for parametric analyses (e.g., normally distributed responses), even though they generally did not. One outcome of this failure to take into account the nonparametric measurement properties of the responses was overly frequent and unreliable reports of statistically significant group differences.
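To make the equal-interval assumption concrete, here is a minimal sketch (hypothetical Likert data, Python). Two monotone recodings of the same 5-point responses preserve exactly the same ordinal information, yet they reverse the comparison of group means, which is one reason means and t-tests on ordinal codes can mislead.

```python
# Hypothetical 5-point Likert responses for two groups (illustrative only).
a = [3, 3, 3, 3, 3]            # group A: all "unsure"
b = [1, 5, 5, 1, 1]            # group B: polarised responses

# Two numeric codings of the same categories. Both are monotone
# (order-preserving), so both respect the ordinal structure equally well.
coding1 = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5}   # conventional equal-interval coding
coding2 = {1: 1, 2: 2, 3: 3, 4: 4, 5: 9}   # an equally legitimate ordinal coding

def mean(responses, coding):
    """Mean of the numerically coded responses."""
    return sum(coding[r] for r in responses) / len(responses)

print(mean(a, coding1), mean(b, coding1))  # 3.0 2.6 -> A "higher" than B
print(mean(a, coding2), mean(b, coding2))  # 3.0 4.2 -> B "higher" than A
```

Because the ranking of the group means depends on an arbitrary choice of spacing between category codes, any conclusion drawn from those means rests on the equal-interval assumption rather than on the data alone.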

Starting with this insight, we developed methods that involved collapsing across Likert-scale responses to produce dichotomous responses (e.g., the percentage in agreement: agree plus strongly agree). These could be described very readily (tables, figures), analysed very straightforwardly (e.g., contingency tables), and generated conservative (fewer) but more reliable reports of statistically significant group differences. Wendi Beamish has now published a book documenting these developments.
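As a sketch of that collapsing-and-tabulating approach (hypothetical data; the published analyses were more involved), the Python below dichotomises 5-point Likert responses into agree (4-5) versus not, cross-tabulates them by group, and applies a Pearson chi-square test. For a 2x2 table (df = 1) the p-value equals erfc(sqrt(chi2/2)), so no statistics library is needed.

```python
import math

# Hypothetical 5-point Likert responses from two groups (illustrative only).
group_a = [5, 4, 4, 2, 5, 3, 4, 5, 4, 2, 4, 5]
group_b = [2, 3, 1, 4, 2, 2, 3, 1, 2, 4, 3, 2]

def collapse(responses):
    """Collapse 5-point Likert responses to (agree, not agree) counts,
    where agreement is a response of 4 (agree) or 5 (strongly agree)."""
    agree = sum(1 for r in responses if r >= 4)
    return agree, len(responses) - agree

# 2x2 contingency table: rows = groups, columns = (agree, not agree).
table = [collapse(group_a), collapse(group_b)]

def chi_square_2x2(table):
    """Pearson chi-square for a 2x2 table, with the p-value from the
    df = 1 chi-square survival function, which equals erfc(sqrt(x/2))."""
    (a, b), (c, d) = table
    n = a + b + c + d
    rows = [a + b, c + d]              # row totals
    cols = [a + c, b + d]              # column totals
    chi2 = sum(
        (table[i][j] - rows[i] * cols[j] / n) ** 2 / (rows[i] * cols[j] / n)
        for i in range(2)
        for j in range(2)
    )
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

chi2, p = chi_square_2x2(table)
print(f"table = {table}, chi2 = {chi2:.2f}, p = {p:.4f}")
```

The dichotomised table is easy to report directly as percentages in agreement, and the chi-square test makes no distributional assumption beyond the counts themselves, which is the source of the more conservative but more trustworthy findings described above.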

As a result of that work, I've developed a broad interest in forms of statistical analysis that take into account the measurement properties of the variables concerned. With that in mind, I was happy to review a book by Michell (1999) that addressed these issues and did so very lucidly (the book, not the review). For the same reason, I was also happy later on to review a book by Bond & Fox (2001, 2007) about Rasch analytic methods (the publisher, Erlbaum, used extracts from this review on its website to sell the book), and more recently to review a book on new developments in categorical data analysis methods (e.g., nonparametric factor analysis).

It is for similar reasons that I was happy to come across a modern analogue of correspondence analysis (used to great effect by Pierre Bourdieu in his pioneering sociological work): SPSS Optimal Scaling, which utilises nonparametric factor-analytic methods to identify clusters of responses within demographic and other nonparametric sets of variables.

Finally, a suspicion that experimental design of the kind practised in psychology and other laboratory settings is sufficient but not necessary when pursuing trails of cause and effect in data sets has led to a long-term interest in structural equation modelling (SEM). The charm of SEM is that it provides a way to test hypotheses using survey and other data collected in semi-natural settings (though not without the usual reservations about the measurement properties of the variables concerned).