Making sense of numbers and words: Statistical methods

Peter Grimbeek

Steps in the analysis of survey data

Surveys can be used to collect information about a respondent's demographic profile (their life to this point) as well as their attitudes, beliefs, and knowledge.

What you don't get is information about what respondents actually do, as opposed to what they say they do, have done, or will do.

I analyse and report on this information as follows:

1. I report the demographic profile for the sample (gender, age, educational qualifications, etc.), item by item, noting which items vary and which are relatively constant (e.g., 90% or more of respondents giving the same response).
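As a rough illustration of step 1, the Python sketch below tabulates each demographic item and flags those where one response accounts for 90% or more of the sample. The records, the item names (`gender`, `age_group`, `employed`), and the `profile` helper are all invented for the example; the 0.90 cut-off is the figure mentioned above, treated here as an adjustable assumption.

```python
from collections import Counter

# Hypothetical sample of 10 respondents; names and values are illustrative only.
genders = ["F"] * 6 + ["M"] * 4
ages = ["18-29", "30-44", "45+", "18-29", "30-44",
        "45+", "18-29", "30-44", "45+", "18-29"]
employed = ["yes"] * 9 + ["no"]
sample = [
    {"gender": g, "age_group": a, "employed": e}
    for g, a, e in zip(genders, ages, employed)
]


def profile(records, threshold=0.90):
    """Return, per item, the percentage in each category and a near-constant flag."""
    report = {}
    for item in records[0]:
        counts = Counter(r[item] for r in records)
        n = sum(counts.values())
        percents = {value: 100 * c / n for value, c in counts.most_common()}
        modal_share = counts.most_common(1)[0][1] / n  # share of the commonest answer
        report[item] = {
            "percents": percents,
            "near_constant": modal_share >= threshold,
        }
    return report


report = profile(sample)
for item, info in report.items():
    flag = "relatively constant" if info["near_constant"] else "varies"
    print(item, info["percents"], flag)
```

Items flagged as relatively constant carry little information for later cross-tabulations, which is one reason to note them at this stage.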

2. I report significant associations between these variables via cross-tabulation, correlation coefficients, and optimal scaling.

Optimal scaling provides a two-dimensional spatial representation that is especially good at capturing complex patterns of association (for more information, see the relevant paper on the main website).
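The cross-tabulation part of step 2 can be sketched in plain Python as follows. The data and the helper names (`crosstab`, `chi_square`) are hypothetical, the p-value shortcut holds only for a 2 x 2 table (one degree of freedom), and optimal scaling is not attempted here; a statistics package would normally handle all of this.

```python
from collections import Counter
from math import erfc, sqrt

# Hypothetical paired observations: (gender, qualification) for 80 respondents.
pairs = (
    [("F", "degree")] * 30 + [("F", "no degree")] * 10
    + [("M", "degree")] * 10 + [("M", "no degree")] * 30
)


def crosstab(pairs):
    """Build a table of counts with sorted row and column labels."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    return rows, cols, [[counts[(r, c)] for c in cols] for r in rows]


def chi_square(table):
    """Pearson chi-square statistic for a table of observed counts."""
    n = sum(map(sum, table))
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


rows, cols, table = crosstab(pairs)
stat = chi_square(table)
n = len(pairs)
# Cramer's V: effect size for the association (k = smaller table dimension).
k = min(len(rows), len(cols))
cramers_v = sqrt(stat / (n * (k - 1)))
# For 1 degree of freedom only, the chi-square tail probability equals erfc(sqrt(x / 2)).
p_value = erfc(sqrt(stat / 2))

print(table, round(stat, 2), round(cramers_v, 2), p_value)
```

Reporting an effect size such as Cramer's V alongside the test statistic helps separate associations that are merely significant from those that are also substantial.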

3. I report responses to individual variables measuring attitudes, beliefs, and knowledge.

Typically, these variables employ Likert scale response categories (e.g., Strongly Disagree -> Strongly Agree).

Likert responses are ordinal, so I prefer to report, say, the percentage strongly agreeing (often using a graph).
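A minimal sketch of this kind of report, assuming a five-point scale with a "Neutral" midpoint (the text above names only the end points) and invented item names and responses. The text bar chart simply stands in for a proper graph.

```python
# Assumed five-point response scale; only the two end points come from the text.
LIKERT = ["Strongly Disagree", "Disagree", "Neutral", "Agree", "Strongly Agree"]

# Hypothetical responses to two items from 10 respondents each.
responses = {
    "item_1": ["Strongly Agree"] * 4 + ["Agree"] * 3 + ["Neutral"] * 2 + ["Disagree"],
    "item_2": ["Strongly Agree"] + ["Agree"] * 2 + ["Neutral"] * 3
              + ["Disagree"] * 3 + ["Strongly Disagree"],
}


def percent_in_category(answers, category="Strongly Agree"):
    """Percentage of answers falling in one response category."""
    return 100 * sum(a == category for a in answers) / len(answers)


summary = {item: percent_in_category(a) for item, a in responses.items()}

# Crude text bar chart: one '#' per 5 percentage points.
for item, pct in summary.items():
    print(f"{item:8s} {'#' * round(pct / 5):20s} {pct:.0f}%")
```

Reporting a category percentage avoids treating the response codes as if they were equally spaced numbers.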

4. I examine the extent to which variables with common Likert response categories form scales and subscales.

I do so using exploratory factor analysis (EFA) or confirmatory factor analysis (CFA; I'll provide more information on this another time). The choice depends on sample size (bigger is better) and on the credibility of the item-scale relationships: if you've just written the items and believe they form a scale, I'd prefer to treat that as a hypothesis to be tested.
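To give a feel for the exploratory side, here is a deliberately simplified sketch using NumPy. It extracts principal components from the item correlation matrix and keeps those with eigenvalues above 1, which is a common first pass but not a full EFA (no rotation, no communality estimation). The simulated data, the two-factor structure, and the eigenvalue cut-off are all assumptions made for the example.

```python
import numpy as np

# Simulated data (an assumption for illustration): 500 respondents, 7 items.
# Items 0-3 are driven by one latent factor, items 4-6 by another.
rng = np.random.default_rng(0)
n = 500
f1, f2 = rng.standard_normal((2, n))
items = np.column_stack(
    [0.8 * f1 + 0.6 * rng.standard_normal(n) for _ in range(4)]
    + [0.8 * f2 + 0.6 * rng.standard_normal(n) for _ in range(3)]
)

# Eigen-decomposition of the item correlation matrix.
corr = np.corrcoef(items, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)  # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]        # reorder to descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Keep components with eigenvalue > 1 (a rule of thumb, not a definitive test).
n_factors = int((eigvals > 1.0).sum())

# Unrotated component loadings: eigenvectors scaled by sqrt(eigenvalue).
loadings = eigvecs[:, :n_factors] * np.sqrt(eigvals[:n_factors])

# Assign each item to the component on which it loads most strongly.
assignment = np.abs(loadings).argmax(axis=1)

print("eigenvalues:", np.round(eigvals, 2))
print("components kept:", n_factors)
print("item -> component:", assignment)
```

With real survey items the pattern is rarely this clean, which is why sample size and a prior hypothesis about which items belong together matter when choosing between EFA and CFA.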

5. I generate scale scores based on the previous step.

Researchers tend to prefer average scores, and I will compute these where appropriate, but I also like to use EFA to save factor scores. Factor scores have the advantages of taking loadings into account, using all items, and producing a scale score with z-score properties (mean = 0, standard deviation = 1).
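The contrast between the two kinds of score can be sketched with NumPy as below. The responses and the loadings are invented, and the loading-weighted sum of standardised items is only a coarse stand-in for the factor scores a statistics package would save (which typically use a regression-based estimator).

```python
import numpy as np

# Hypothetical data: 6 respondents x 3 items, coded 1-5.
items = np.array(
    [[4, 5, 4],
     [2, 1, 2],
     [3, 3, 4],
     [5, 5, 5],
     [1, 2, 1],
     [3, 4, 3]],
    dtype=float,
)
# Assumed loadings, e.g. taken from an earlier factor analysis.
loadings = np.array([0.8, 0.7, 0.6])

# Option 1: simple average score per respondent (stays on the 1-5 metric).
mean_scores = items.mean(axis=1)

# Option 2: loading-weighted composite of standardised items,
# rescaled so the result has mean 0 and standard deviation 1.
z_items = (items - items.mean(axis=0)) / items.std(axis=0, ddof=1)
weighted = z_items @ loadings
factor_like = (weighted - weighted.mean()) / weighted.std(ddof=1)

print("mean scores:      ", np.round(mean_scores, 2))
print("factor-like score:", np.round(factor_like, 2))
```

Average scores are easier to explain because they stay on the original response metric, whereas the standardised score weights items by how strongly they relate to the scale.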

6. I use ANOVA, MANOVA, regression, or structural equation modelling (SEM) procedures to examine the associations between conceptually or empirically relevant aspects of the demographic profile and outcome scores (usually sub-scale or scale scores).
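As one small instance of step 6, the sketch below computes a one-way ANOVA by hand, comparing a scale score across three levels of a demographic variable. The group labels, the scores, and the `one_way_anova` helper are hypothetical; the p-value, which would come from the F distribution with the reported degrees of freedom, is left to a statistics package.

```python
from statistics import fmean

# Hypothetical scale scores grouped by highest educational qualification.
scores_by_group = {
    "school": [1.0, 2.0, 3.0],
    "diploma": [2.0, 3.0, 4.0],
    "degree": [5.0, 6.0, 7.0],
}


def one_way_anova(groups):
    """Return the F statistic, eta squared, and (between, within) degrees of freedom."""
    all_scores = [x for g in groups.values() for x in g]
    grand_mean = fmean(all_scores)
    # Between-groups sum of squares: group means around the grand mean.
    ss_between = sum(
        len(g) * (fmean(g) - grand_mean) ** 2 for g in groups.values()
    )
    # Within-groups sum of squares: scores around their own group mean.
    ss_within = sum((x - fmean(g)) ** 2 for g in groups.values() for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    eta_sq = ss_between / (ss_between + ss_within)  # proportion of variance explained
    return f_stat, eta_sq, (df_between, df_within)


f_stat, eta_sq, dfs = one_way_anova(scores_by_group)
print(f"F{dfs} = {f_stat:.2f}, eta squared = {eta_sq:.3f}")
```

MANOVA, regression, and SEM extend the same idea to several outcomes, continuous predictors, and latent variables respectively.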

More about all of this at another time.