PCL-R (Psychopathy Checklist Revised) Test: What’s Wrong with Psychological Tests

The PCL-R “test” is useless because it relies on input from pathological liars (psychopaths) and on witness testimonies decades after the events.

The second edition of the PCL-R test, originally designed by the controversial maverick Canadian criminologist Robert Hare in 1980 and again in 1991, contains 20 items designed to rate symptoms which are common among psychopaths in forensic populations (such as prison inmates or child molesters). It is designed to cover the major psychopathic traits and behaviours: callous, selfish, remorseless use of others (Factor 1), chronically unstable and antisocial lifestyle (Factor 2), interpersonal and affective deficits, an impulsive lifestyle and antisocial behaviour.

The twenty traits assessed by the PCL-R score are: glib and superficial charm; grandiose (exaggeratedly high) estimation of self; need for stimulation; pathological lying; cunning and manipulativeness; lack of remorse or guilt; shallow affect (superficial emotional responsiveness); callousness and lack of empathy; parasitic lifestyle; poor behavioral controls; sexual promiscuity; early behavior problems; lack of realistic long-term goals; impulsivity; irresponsibility; failure to accept responsibility for own actions; many short-term marital relationships; juvenile delinquency; revocation of conditional release; and criminal versatility.

Psychopaths score between 30 and 40. Normal people score between 0 and 5. But Hare himself was known to label as psychopaths people with a score as low as 13. The PCL-R is, therefore, an art rather than a science and leaves much to the personal impressions of those who administer it.
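
To make the arithmetic concrete: each of the twenty items is rated 0, 1, or 2, so totals run from 0 to 40. The Python sketch below sums a set of ratings and applies the bands quoted above. The function names, the sample ratings, and the label for totals that fall between the two bands are illustrative assumptions, not part of the PCL-R itself.

```python
from typing import Sequence

def pcl_r_total(ratings: Sequence[int]) -> int:
    """Sum twenty item ratings, each 0 (absent), 1 (partial), or 2 (present)."""
    if len(ratings) != 20:
        raise ValueError("the PCL-R has exactly 20 items")
    if any(r not in (0, 1, 2) for r in ratings):
        raise ValueError("each item is rated 0, 1, or 2")
    return sum(ratings)

def band(total: int) -> str:
    """Map a total to the bands quoted in the text (labels are illustrative)."""
    if total >= 30:
        return "psychopathic range (30-40)"
    if total <= 5:
        return "normal range (0-5)"
    return "in between: left to the rater's judgment"

# Hypothetical ratings: five items at 2, three at 1, twelve at 0.
ratings = [2] * 5 + [1] * 3 + [0] * 12
total = pcl_r_total(ratings)
print(total, band(total))
```

A total of 13, as in this made-up case, lands in neither band, which is exactly where the rater's discretion enters.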

The PCL-R is based on a structured interview and collateral data gathered from family, friends, and colleagues and from documents. The questions comprising the structured interview are so transparent and self-evident that it is easy to lie one’s way through the test and completely skew its results. Moreover, scoring by the diagnostician is highly subjective (which is why the DSM and the ICD stick to observable behaviours in their criteria for Antisocial or Dissocial Personality Disorder).

The hope is that information gathered outside the scope of the structured interview will serve to rectify such potential abuse, diagnostic bias, and manipulation by both testee and tester. The PCL-R, in other words, relies on the truthfulness of responses provided by notorious liars (psychopaths) and on the biased memories of multiple witnesses, all of them close to the psychopath and with an axe to grind.

The PCL-R is not the only bad apple in an otherwise healthy crop. Psychological tests are far from scientifically rigorous.
Personality assessment is perhaps more an art form than a science. In an attempt to render it as objective and standardized as possible, generations of clinicians came up with psychological tests and structured interviews. These are administered under similar conditions and use identical stimuli to elicit information from respondents. Thus, any disparity in the responses of the subjects can be, and is, attributed to the idiosyncrasies of their personalities.

Moreover, most tests restrict the repertory of permitted answers. “True” or “false” are the only allowed reactions to the questions in the Minnesota Multiphasic Personality Inventory II (MMPI-2), for instance. Scoring or keying the results is also an automatic process wherein all “true” responses get one or more points on one or more scales and all “false” responses get none.
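
The automatic keying described above can be sketched in a few lines of Python. The item numbers, scale names, and key below are invented for illustration and do not come from the MMPI-2. The sketch follows the simplified rule stated in the text, where only "true" answers earn points; real inventories may also key some items in the "false" direction, a refinement omitted here.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical key: item number -> scales that gain a point on a "true" answer.
SCALE_KEY: Dict[int, List[str]] = {
    1: ["A"],
    2: ["A", "B"],
    3: ["B"],
    4: ["C"],
}

def score(responses: Dict[int, bool]) -> Dict[str, int]:
    """Add one point to every scale keyed to each item answered "true"."""
    totals: Counter = Counter()
    for item, answer in responses.items():
        if answer:
            totals.update(SCALE_KEY.get(item, []))
    return dict(totals)

print(score({1: True, 2: True, 3: False, 4: False}))
```

No human judgment is involved at this stage: the same answers always yield the same scale scores.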

This limits the involvement of the diagnostician to the interpretation of the test results (the scale scores). Admittedly, interpretation is arguably more important than data gathering. Thus, inevitably biased human input cannot be, and is not, avoided in the process of personality assessment and evaluation. But its pernicious effect is somewhat reined in by the systematic and impartial nature of the underlying instruments (tests).

Still, rather than rely on one questionnaire and its interpretation, most practitioners administer to the same subject a battery of tests and structured interviews. These often vary in important aspects: their response formats, stimuli, procedures of administration, and scoring methodology. Moreover, in order to establish a test’s reliability, many diagnosticians administer it repeatedly over time to the same client. If the interpreted results are more or less the same, the test is said to be reliable.

The outcomes of various tests must fit in with each other. Put together, they must provide a consistent and coherent picture. If one test yields readings that are constantly at odds with the conclusions of other questionnaires or interviews, it may not be valid. In other words, it may not be measuring what it claims to be measuring.

Thus, a test quantifying one’s grandiosity must conform to the scores of tests which measure reluctance to admit failings or propensity to present a socially desirable and inflated facade (“False Self”). If a grandiosity test is positively related to irrelevant, conceptually independent traits, such as intelligence or depression, that relationship does not render it valid.
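
The reliability and validity checks described above are commonly expressed as correlations. The Python sketch below is one possible illustration: it computes a Pearson coefficient by hand and applies it to a retest of the same instrument, to a conceptually related measure, and to an unrelated one. All scores and variable names are invented, and no cutoff for an acceptable coefficient is given because the text does not state one.

```python
from math import sqrt
from typing import Sequence

def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for the same six subjects.
grandiosity = [12, 18, 9, 22, 15, 11]            # first administration
retest = [13, 17, 10, 21, 16, 10]                # same test, later date
desirability = [30, 41, 27, 45, 35, 29]          # conceptually related measure
intelligence = [100, 104, 102, 101, 97, 103]     # conceptually unrelated measure

print("test-retest:", round(pearson(grandiosity, retest), 2))
print("related measure:", round(pearson(grandiosity, desirability), 2))
print("unrelated measure:", round(pearson(grandiosity, intelligence), 2))
```

With these made-up numbers the first two coefficients are high and the third is close to zero, which is the pattern a reliable and valid grandiosity test would be expected to show.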

Most tests are either objective or projective. The psychologist George Kelly offered this tongue-in-cheek definition of both in a 1958 article titled “Man’s construction of his alternatives” (included in the book “The Assessment of Human Motives”, edited by G. Lindzey):

“When the subject is asked to guess what the examiner is thinking, we call it an objective test; when the examiner tries to guess what the subject is thinking, we call it a projective device.”

The scoring of objective tests is computerized (no human input). Examples of such standardized instruments include the MMPI-2, the California Psychological Inventory (CPI), and the Millon Clinical Multiaxial Inventory II. Of course, a human finally gleans the meaning of the data gathered by these questionnaires. Interpretation ultimately depends on the knowledge, training, experience, skills, and natural gifts of the therapist or diagnostician.

Projective tests are far less structured and thus a lot more ambiguous. As L. K. Frank observed in a 1939 article titled “Projective methods for the study of personality”:

“(The patient’s responses to such tests are projections of his) way of seeing life, his meanings, significances, patterns, and especially his feelings.”

In projective tests, the responses are not constrained and scoring is done exclusively by humans and involves judgment (and, thus, a modicum of bias). Clinicians rarely agree on the same interpretation and often use competing methods of scoring, yielding disparate results. The diagnostician’s personality comes into prominent play. The best known of these “tests” is the Rorschach set of inkblots.

Here are a few examples:

I. MMPI-2 Test

The MMPI (Minnesota Multiphasic Personality Inventory), composed by Hathaway (a psychologist) and McKinley (a physician), is the outcome of decades of research into personality disorders. The revised version, the MMPI-2, was published in 1989 but was received cautiously. The MMPI-2 changed the scoring method and some of the normative data. It was, therefore, hard to compare it to its much-hallowed (and oft-validated) predecessor.

The MMPI-2 is made up of 567 binary (true or false) items (questions). Each item requires the subject to respond: “This is true (or false) as applied to me”. There are no “correct” answers. The test booklet allows the diagnostician to provide a rough assessment of the patient (the “basic scales”) based on the first 370 queries (though it is recommended to administer all 567 of them).

Based on numerous studies, the items are arranged in scales. The responses are compared to answers provided by “control subjects”. The scales allow the diagnostician to identify traits and mental health problems based on these comparisons. In other words, there are no answers that are “typical of paranoid or narcissistic or antisocial patients”. There are only responses that deviate from an overall statistical pattern and conform to the reaction patterns of other patients with similar scores. The nature of the deviation determines the patient’s traits and tendencies – but not his or her diagnosis!

The interpreted outcomes of the MMPI-2 are phrased thus: “The test results place subject X in this group of patients who, statistically-speaking, reacted similarly. The test results also set subject X apart from these groups of people who, statistically-speaking, responded differently”. The test results would never say: “Subject X suffers from (this or that) mental health problem”.

There are three validity scales and ten clinical ones in the original MMPI-2, but other scholars derived hundreds of additional scales. For instance: to help in diagnosing personality disorders, most diagnosticians use either the MMPI-I with the Morey-Waugh-Blashfield scales in conjunction with the Wiggins content scales – or (more rarely) the MMPI-2 updated to include the Colligan-Morey-Offord scales.

The validity scales indicate whether the patient responded truthfully and accurately or was trying to manipulate the test. They pick up patterns. Some patients want to appear normal (or abnormal) and consistently choose what they believe are the “correct” answers. This kind of behavior triggers the validity scales. These are so sensitive that they can indicate whether the subject lost his or her place on the answer sheet and was responding randomly! The validity scales also alert the diagnostician to problems in reading comprehension and other inconsistencies in response patterns.

The clinical scales are dimensional (though not multiphasic as the test’s misleading name implies). They measure hypochondriasis, depression, hysteria, psychopathic deviation, masculinity-femininity, paranoia, psychasthenia, schizophrenia, hypomania, and social introversion. There are also scales for alcoholism, post-traumatic stress disorder, and personality disorders.

The interpretation of the MMPI-2 is now fully computerized. The computer is fed with the patients’ age, sex, educational level, and marital status and does the rest. Still, many scholars have criticized the scoring of the MMPI-2.

II. MCMI-III Test

The third edition of this popular test, the Millon Clinical Multiaxial Inventory (MCMI-III), was published in 1996. With 175 items, it is much shorter and simpler to administer and to interpret than the MMPI-2. The MCMI-III diagnoses personality disorders and Axis I disorders but not other mental health problems. The inventory is based on Millon’s suggested multiaxial model in which long-term characteristics and traits interact with clinical symptoms.

The questions in the MCMI-III reflect the diagnostic criteria of the DSM. Millon himself gives this example (Millon and Davis, Personality Disorders in Modern Life, 2000, pp. 83-84):

“… (T)he first criterion from the DSM-IV dependent personality disorder reads ‘Has difficulty making everyday decisions without an excessive amount of advice and reassurance from others,’ and its parallel MCMI-III item reads ‘People can easily change my ideas, even if I thought my mind was made up.'”

The MCMI-III consists of 24 clinical scales and 3 modifier scales. The modifier scales serve to identify Disclosure (a tendency to hide a pathology or to exaggerate it), Desirability (a bias towards socially desirable responses), and Debasement (endorsing only responses that are highly suggestive of pathology). Next come the Clinical Personality Patterns (scales), which represent mild to moderate pathologies of personality: Schizoid, Avoidant, Depressive, Dependent, Histrionic, Narcissistic, Antisocial, Aggressive (Sadistic), Compulsive, Negativistic, and Masochistic. Millon considers only the Schizotypal, Borderline, and Paranoid to be severe personality pathologies and dedicates the next three scales to them.

The last ten scales are dedicated to Axis I and other clinical syndromes: Anxiety Disorder, Somatoform Disorder, Bipolar Manic Disorder, Dysthymic Disorder, Alcohol Dependence, Drug Dependence, Posttraumatic Stress, Thought Disorder, Major Depression, and Delusional Disorder.

Scoring is easy and runs from 0 to 115 for each scale, with 85 and above signifying a pathology. The configuration of the results of all 24 scales provides serious and reliable insights into the tested subject.
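
As a rough illustration of the cutoff just described, the following Python sketch flags every scale whose score reaches 85. The scale names are taken from the lists above; the scores are invented and the function name is hypothetical.

```python
from typing import Dict, List

MAX_SCORE = 115   # upper bound per scale, as stated in the text
CUTOFF = 85       # scores at or above this are read as signifying pathology

def flagged_scales(scores: Dict[str, int]) -> List[str]:
    """Return the names of scales at or above the cutoff, in input order."""
    for name, value in scores.items():
        if not 0 <= value <= MAX_SCORE:
            raise ValueError(f"{name}: score {value} outside 0-{MAX_SCORE}")
    return [name for name, value in scores.items() if value >= CUTOFF]

# Hypothetical profile for a single subject.
profile = {"Narcissistic": 92, "Antisocial": 85, "Dependent": 40}
print(flagged_scales(profile))
```

The flagged list is only the starting point: it is the configuration across all scales that the diagnostician goes on to interpret.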

Critics of the MCMI-III point to its oversimplification of complex cognitive and emotional processes, its over-reliance on a model of human psychology and behavior that is far from proven and not in the mainstream (Millon’s multiaxial model), and its susceptibility to bias in the interpretative phase.

III. Rorschach Inkblot Test

The Swiss psychiatrist Hermann Rorschach developed a set of inkblots to test subjects in his clinical research. In a 1921 monograph (published in English in 1942 and 1951), Rorschach postulated that the blots evoke consistent and similar responses in groups of patients. Only ten of the original inkblots are currently in diagnostic use. It was John Exner who systematized the administration and scoring of the test, combining the best of several systems in use at the time (e.g., Beck, Klopfer, Rapaport, Singer).

The Rorschach inkblots are ambiguous forms, printed on 18 x 24 cm cards, in both black and white and color. Their very ambiguity provokes free associations in the test subject. The diagnostician stimulates the formation of these flights of fantasy by asking questions such as “What is this? What might this be?”. S/he then proceeds to record, verbatim, the patient’s responses as well as the inkblot’s spatial position and orientation. An example of such a record would read: “Card V upside down, child sitting on a porch and crying, waiting for his mother to return.”

Having gone through the entire deck, the examiner then proceeds to read aloud the responses while asking the patient to explain, in each and every case, why s/he chose to interpret the card the way s/he did. “What in card V prompted you to think of an abandoned child?”. At this phase, the patient is allowed to add details and expand upon his or her original answer. Again, everything is noted and the subject is asked to explain what in the card or in his or her previous response gave rise to the added details.

Scoring the Rorschach test is a demanding task. Inevitably, due to its “literary” nature, there is no uniform, automated scoring system.

IV. TAT Diagnostic Test

The Thematic Apperception Test (TAT) is similar to the Rorschach inkblot test. Subjects are shown pictures and asked to tell a story based on what they see. Both these projective assessment tools elicit important information about underlying psychological fears and needs. The TAT was developed in 1935 by Morgan and Murray. Ironically, it was initially used in a study of normal personalities done at the Harvard Psychological Clinic.

The test comprises 31 cards. One card is blank and the other thirty include blurred but emotionally powerful (or even disturbing) photographs and drawings. Originally, Murray came up with only 20 cards which he divided into three groups: B (to be shown to Boys Only), G (Girls Only) and M-or-F (both sexes).

The cards expound on universal themes. Card 2, for instance, depicts a country scene. A man is toiling in the background, tilling the field; a woman partly obscures him, carrying books; an old woman stands idly by and watches them both. Card 3BM is dominated by a couch against which is propped a little boy, his head resting on his right arm, a revolver by his side, on the floor.
