Making sense of numbers and words: Statistical methods

Peter Grimbeek

Starting points for qualitative analysis

I'm currently developing brief spiels on manual, semi-automated, and fully automated text analysis.

The claim that qualitative data analysis can proceed without the primary step of identifying a set of categories seems downright illogical. This is a first step regardless of whether one is identifying themes and patterns, nodes, or concepts (differing language, same thing).

Manual text analysis is what qualitative researchers do when they explore the themes and patterns in selected vignettes, usually brief extracts from larger chunks of transcribed conversations, etc.

Semi-automated analyses could be said to include the use of MS Word, Excel, or NVivo software, with the latter qualifying most properly for this category.

With NVivo (QSR), large chunks of text and other qualitative materials (photos, videos, etc.) are imported into what is, in effect, an electronic database. A priori categories (termed nodes) are set up as tree nodes, and one of a number of documents (internal sources) is opened for coding purposes.

The researcher highlights words, phrases, or entire paragraphs, and links these to one or more of the tree nodes or creates additional (free) nodes.
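The coding step just described can be sketched as a simple data structure. This is only an illustration, assuming a plain dictionary that maps node names to coded passages; it is not NVivo's internal data model, and the function name `code_passage`, the node names, and the interview extracts are all hypothetical.

```python
def code_passage(nodes, source, passage, node_names):
    """Link one highlighted passage to one or more nodes.

    A node that does not exist yet is created on the fly,
    which plays the role of an additional (free) node.
    """
    for name in node_names:
        nodes.setdefault(name, []).append((source, passage))


# A priori categories, set up before coding begins (the tree nodes).
nodes = {"workload": [], "support": []}

# Hypothetical extracts from a hypothetical source document.
code_passage(nodes, "interview_01", "too many assignments", ["workload"])
code_passage(
    nodes,
    "interview_01",
    "my tutor helped me catch up",
    ["support", "catching up"],  # "catching up" is a new free node
)

for name, passages in nodes.items():
    print(name, len(passages))
```

The point of the sketch is that one passage can sit under several nodes at once, and that new nodes can emerge during coding rather than only beforehand.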

One big plus in this software (from my perspective) is the word frequency query option. It generates a thesaurus listing of all available words, including function words such as "I" and "am". Selected words can be saved as new nodes or merged into existing nodes, which means one can conduct generic analyses of multiple documents very quickly indeed.
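For readers who want to see what a word frequency count amounts to, here is a minimal sketch in Python. It is not how NVivo implements its query; the function name `word_frequencies`, the tokenising rule, and the two sample transcripts are assumptions made up for illustration.

```python
import re
from collections import Counter


def word_frequencies(documents):
    """Count how often each word occurs across a collection of documents."""
    counts = Counter()
    for text in documents:
        # Lowercase the text and keep runs of letters and apostrophes as words.
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts


# Hypothetical transcripts, invented for illustration only.
documents = [
    "I am happy with the course. The course is hard.",
    "I think the teacher is fair, and I am learning.",
]

frequencies = word_frequencies(documents)
for word, count in frequencies.most_common(5):
    print(word, count)
```

As in the NVivo listing, nothing is filtered out here, so words such as "i" and "the" sit at the top of the list alongside the more meaningful ones.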

A negative is that my copy of the current version (v.8) quickly seizes up. It doesn't seem able to sustain heavy-duty collection of words linked to nodes for very long at all. I assume that QSR will fix this.

Leximancer is my preference for fully automated text analysis. For the purposes of comparison with NVivo, one thing of interest is the convergence between NVivo's word frequency list and Leximancer's Ranked concept list. Imagine NVivo's list without the less clearly semantically relevant words, and you get the Leximancer list. This is comforting, as it suggests that the two search mechanisms are reliable and valid.
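The comparison can be sketched in a few lines of Python: drop the function words from a frequency list, then check how far the remainder overlaps with a concept list. This is only a rough illustration of the comparison, not a description of how Leximancer derives its concepts. The stop list, the word counts, the concept list, and the function names `content_word_ranking` and `overlap` are all hypothetical.

```python
from collections import Counter

# Small illustrative stop list; an assumption, not the list used by either product.
STOP_WORDS = {"i", "am", "the", "is", "a", "and", "with", "of", "to"}


def content_word_ranking(frequencies, stop_words=STOP_WORDS):
    """Rank words by frequency after dropping function words."""
    kept = Counter({w: n for w, n in frequencies.items() if w not in stop_words})
    return [word for word, _ in kept.most_common()]


def overlap(list_a, list_b):
    """Share of items the two lists have in common (Jaccard index)."""
    a, b = set(list_a), set(list_b)
    return len(a & b) / len(a | b) if a | b else 0.0


# Hypothetical word counts and a hypothetical ranked concept list.
frequencies = Counter(
    {"the": 12, "i": 9, "course": 7, "teacher": 5, "am": 4, "exam": 3}
)
concepts = ["course", "teacher", "assessment"]

ranking = content_word_ranking(frequencies)
print(ranking)
print(overlap(ranking, concepts))
```

An overlap close to 1 would correspond to the convergence described above; a low value would suggest that the two tools are picking up different things.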