Personality & Social Psychology Bulletin. 23.5 (May 1997): p526.
Inquirer is intended to tap clinical syndromes and psychodynamic themes (for examples of other
approaches to linguistic analysis, see McTavish & Pirro, 1990; Rajecki et al., 1994).
Guided by these earlier attempts, Pennebaker and Francis (in press) developed LIWC, a computer-
based technique that computes the percentage of words within various categories that writers or
speakers use in normal (i.e., nonclinical) speech samples. The program analyzes written or spoken
samples on a word-by-word basis. Each word is then compared against a file of words that is
divided into 49 dimensions, or dictionary scales. On the broadest level, these dictionary scales tap
into five general text dimensions: positive emotions, negative emotions, cognitive mechanisms,
content domains, and language composition.
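The word-by-word procedure just described can be sketched in a few lines of Python. This is only a minimal illustration, not the LIWC program itself: the toy dictionary, its three categories, the simple tokenizer, and the function name `category_percentages` are all assumptions introduced here for the example.

```python
import re
from collections import Counter

# Hypothetical toy dictionary (category -> words). The actual LIWC
# dictionary file is far larger and is not reproduced here.
TOY_DICTIONARY = {
    "negative_emotion": {"angry", "sad", "guilty"},
    "positive_emotion": {"happy", "calm"},
    "causation": {"because", "effect"},
}


def category_percentages(text, dictionary):
    """Return, for each category, the percentage of words in `text` that belong to it."""
    # Assumed tokenizer: lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    # Compare each word in the sample against every dictionary scale.
    for word in words:
        for category, entries in dictionary.items():
            if word in entries:
                counts[category] += 1
    total = len(words)
    if total == 0:
        # An empty sample has no words to score.
        return {category: 0.0 for category in dictionary}
    return {category: 100.0 * counts[category] / total for category in dictionary}
```

Calling `category_percentages("happy because sad angry", TOY_DICTIONARY)` scores a four-word sample; two of the four words fall in the toy negative emotion list, so that category receives 50 percent.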
The creation and selection of these primary LIWC categories were guided by recent research within
social, health, and clinical psychology. The categories of negative and positive emotion words were
based on the burgeoning literature on affect (e.g., Costa & McCrae, 1985; Watson & Pennebaker,
1989), mood, and emotion (e.g., Gross & Levinson, 1993), and tap dimensions such as anger,
depression, guilt, optimism, and serenity. Cognitive mechanisms involve words that reveal different
modes of thought. This dimension incorporates words that depict causal thinking, although not
specific styles of attribution. These include categories such as self-reflection (e.g., understand,
think; cf. Rogers, 1965; Pennebaker, 1989); discrepancy, or undoing (e.g., should, would, could; cf.
Davis, Lehman, Wortman, & Silver, 1995; Higgins, Vookles, & Tykocinski, 1992); causation (e.g.,
because, effect; cf. Peterson, Seligman, & Vaillant, 1988); and achievement, or striving (e.g.,
attempt, solve, achieve; cf. McClelland, 1976).
For the three primary dimensions just discussed, LIWC further measures a number of subordinate
categories. That is, in addition to counting all negative emotion words, LIWC also calculates
the number of words on five subscales of negative emotion that
specifically reflect anger, depression, paranoia, anxiety, and guilt. For example, words reflecting
anger contribute to both the word count that LIWC calculates for the global negative emotion scale
and the count made for the subordinate scale of anger. Similarly, five subordinate LIWC categories
are measured for the global category of positive emotion words, and eight are calculated for the
primary dimension of cognitive mechanisms.
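The double counting of subordinate and global scales can be illustrated with a short sketch. It assumes a hypothetical mapping from each subscale to its global scale and toy word lists. None of these names or entries come from the LIWC dictionary, and in the actual program a global scale may also contain words that belong to no subscale.

```python
import re
from collections import Counter

# Hypothetical subscale -> global scale mapping, for illustration only.
PARENT_SCALE = {
    "anger": "negative_emotion",
    "anxiety": "negative_emotion",
    "optimism": "positive_emotion",
}

# Hypothetical toy word lists for each subscale.
SUBSCALE_WORDS = {
    "anger": {"angry", "furious"},
    "anxiety": {"worried", "nervous"},
    "optimism": {"hopeful"},
}


def hierarchical_counts(text):
    """Count subscale words, crediting each hit to its global scale as well."""
    counts = Counter()
    for word in re.findall(r"[a-z']+", text.lower()):
        for subscale, entries in SUBSCALE_WORDS.items():
            if word in entries:
                counts[subscale] += 1                 # subordinate scale
                counts[PARENT_SCALE[subscale]] += 1   # global scale
    return counts
```

With this toy setup, a sample containing "angry" and "worried" adds one to each of the anger and anxiety subscales and two to the global negative emotion count.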
Because of our original interest in the relation of disclosure and language to health, LIWC was also
programmed to measure 12 specific content categories related to physical, psychological, and
emotional well-being, such as referents to physical symptoms, illness, religion, and death. Finally,
the LIWC program calculates a number of traditional language composition elements, such as use
of verb tense, article and preposition use, and use of negations.
The words comprising these various scales or directories were generated from a range of sources,
including dictionaries, a thesaurus, analyses of words used by participants writing about emotional
topics, and groups of judges. Once generated, the scales were validated by having independent
judges rate each word's appropriateness for that specific scale. A word (e.g., angry) was retained in
a scale (e.g., negative emotion words) only if there was very high agreement regarding its
appropriateness for inclusion (for details, see Francis & Pennebaker, 1993; Pennebaker & Francis,
1996).