An Application of the Logistic Cognitive Diagnosis Model
Matthew Burke
mjburke@uncg.edu
UNC Greensboro
Dr. Robert Henson, UNCG
rahenson@uncg.edu
This study applies the Logistic Cognitive Diagnosis Model (LCDM) to data from
the grammar section of the 2003-2004 Examination for the Certificate of
Proficiency in English (ECPE), developed and scored by the English Language
Institute of the University of Michigan. Cognitive Diagnosis Models (CDMs) are
special cases of latent class models where classes are defined by mastery or
non-mastery of a set of skills required by test items. One distinct advantage
of CDMs over Classical Test Theory and Item Response Theory is that CDMs
provide diagnostic information regarding which skills required by a test have or
have not been mastered by each examinee. This allows for directed instruction
to address any shortcomings in test performance. In the age of accountability
ushered in by the No Child Left Behind Act (2001),
CDMs offer an opportunity for standardized tests to do more than simply rank
examinees in overall ability. The LCDM is a more general expression of several
specific CDMs (Reduced RUM, DINA, DINO, and the Compensatory RUM), and is
designed to function, in part, as a model selector. Results are discussed in
relation to previous analyses of this dataset using the Reduced RUM (Henson
& Templin, in press), and explanations of findings are interpreted in the
context of the LCDM. The results suggest promising outcomes for the application
of the LCDM to the field of language testing. Alternative approaches and
suggestions for future research are also proposed.
Objectives:
Cognitive Diagnosis Models (CDMs) are relative newcomers to the field of
testing and measurement, and much work remains to be done to assess their
utility. In contrast to Classical Test Theory and Item Response Theory, which
represent examinee ability as a single continuous measure, CDMs describe
examinee ability as the presence or absence of the skills required to correctly
respond to test items.
The primary advantage of a CDM approach over Classical Test Theory or Item
Response Theory is that it can provide diagnostic information to students and
teachers concerning which of the skills required by a test have been mastered.
Skills that have not been mastered can then be identified and addressed
specifically through directed instruction. CDMs
differ in the way they define the relationship between skill mastery and the
ensuing probability of a correct response to a test item. As such, it is
desirable to know which CDM may be the most appropriate in a given situation.
The purpose of the current study is to investigate the utility of a newly
defined CDM, the Logistic Cognitive Diagnosis Model (LCDM). For this study, the
LCDM was applied to a large-scale, standardized language test. The language
test under investigation is the Examination for the Certificate of Proficiency
in English (ECPE). Applying the model to real testing data makes it possible to
assess how well the LCDM functions as a model selector. The study also
examines the capability of the LCDM to provide diagnostic information about
which skills required by the test items have been mastered by the examinees.
Finally, the results are compared and contrasted with a previous analysis of
the ECPE data.
Theoretical Framework:
Cognitive Diagnosis Models represent examinee ability as a profile of dichotomous
skills, rather than a single continuous measure of ability. The attribute
mastery profile for each examinee is a vector of ones and zeroes representing
mastery/non-mastery of each of the skills required by test items. Additionally,
a test can be described by a Q-Matrix (Tatsuoka, 1985), which is a “blueprint”
indicating which particular skills, or attributes, are required by each item on
a test. The Q-Matrix is an attribute by test item matrix of ones and zeroes,
where ones indicate that an attribute is required by an item, and zeroes
indicate that it is not. The
Q-Matrix, in conjunction with an examinee’s abilities (as represented by the
attribute mastery profile), provides information concerning the individual’s
chances of answering items on the test correctly. CDMs have the desirable
quality of providing information as to the acquired skills of examinees, which
allows for focused instruction to address any non-mastered skills. In addition,
they provide information as to the appropriateness of the Q-Matrix
specification (Henson & Templin, 2006).
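As a minimal sketch of these two objects, the Python fragment below builds a small Q-Matrix and enumerates the possible attribute mastery profiles. The Q-Matrix shown is hypothetical (it is not the ECPE Q-Matrix), it is stored with one row per item for convenience, and the function names are illustrative assumptions.

```python
from itertools import product

# Hypothetical Q-Matrix for a 4-item, 3-skill test (NOT the ECPE Q-Matrix).
# Stored with one row per item and one column per skill:
# a 1 means the item requires that skill, a 0 means it does not.
Q = [
    [1, 0, 0],
    [0, 1, 0],
    [1, 1, 0],
    [0, 1, 1],
]

def all_profiles(n_skills):
    """Enumerate every attribute mastery profile: 2**n_skills vectors of 0/1."""
    return [tuple(p) for p in product([0, 1], repeat=n_skills)]

def missing_skills(profile, q_row):
    """Indices of skills an item requires that the examinee has not mastered."""
    return [k for k, (a, q) in enumerate(zip(profile, q_row)) if q == 1 and a == 0]

profiles = all_profiles(3)

# An illustrative examinee who has mastered skills 0 and 2 but not skill 1.
examinee = (1, 0, 1)
gaps = [missing_skills(examinee, row) for row in Q]
```

With three skills there are eight possible profiles, and `gaps` lists, item by item, the required skills this examinee lacks. That list is the kind of diagnostic information a CDM is meant to supply.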
The LCDM is a logistic expression of a CDM, and is designed to be a more
general form of four other CDMs. These other four models are the Deterministic
Input Noisy “And” Gate Model (DINA; Haertel, 1989), the Deterministic Input
Noisy “Or” Gate Model (DINO; Templin and Henson, 2006), the Reduced
Re-Parameterized Unified Model (Reduced RUM; Hartz, 2002), and the Compensatory
Re-Parameterized Unified Model (Compensatory RUM; Hartz, 2002). Each of these
four models has its own
definition of the relationship between which skills an examinee has mastered
and the probability of a correct response to an item. The DINA describes the
relationship between the Q-Matrix and the examinee attribute mastery profile in
an all-or-none fashion (Haertel, 1989). If the
examinee possesses all of the requisite abilities, they have a high probability
of answering the item correctly. Lacking any one of the requisite attributes
reduces the probability of a correct response to chance (or guessing) levels.
The DINO (Templin and Henson, 2006) states that mastering any one of the
required attributes for an item leads to a high probability of a correct
response. An examinee must lack ALL of the required skills to be reduced to
guessing at the correct answer. The Reduced RUM (Hartz, 2002) is defined such
that if all skills required by an item are mastered by an examinee, then the
probability of a correct response is high. For each required skill lacked by an
examinee, there is a proportional reduction in the probability of a correct
response. The Compensatory RUM (Hartz, 2002) states that mastery of each
additional required skill increases the probability of a correct response in an
additive fashion.
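The four response rules described above can be sketched side by side in Python. This is an illustration under stated assumptions, not the estimation code used in the study: the parameter names (slip, guess, pi_star, r_star, intercept, slopes) follow common usage in the CDM literature, the numeric values are invented, and the Compensatory RUM is assumed to be additive on the logit scale.

```python
import math

def dina(alpha, q, slip, guess):
    """All-or-none: high probability only if every required skill is mastered."""
    has_all = all(a == 1 for a, r in zip(alpha, q) if r == 1)
    return 1 - slip if has_all else guess

def dino(alpha, q, slip, guess):
    """Any-one: high probability if at least one required skill is mastered."""
    has_any = any(a == 1 for a, r in zip(alpha, q) if r == 1)
    return 1 - slip if has_any else guess

def reduced_rum(alpha, q, pi_star, r_star):
    """Start at pi_star and multiply by a penalty for each required skill lacked."""
    p = pi_star
    for a, r, penalty in zip(alpha, q, r_star):
        if r == 1 and a == 0:
            p *= penalty
    return p

def compensatory_rum(alpha, q, intercept, slopes):
    """Each mastered required skill adds its slope on the logit scale (assumed)."""
    logit = intercept + sum(s for a, r, s in zip(alpha, q, slopes) if r == 1 and a == 1)
    return 1 / (1 + math.exp(-logit))

# Hypothetical item requiring two skills, evaluated for every profile.
q = (1, 1)
for alpha in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    print(alpha,
          dina(alpha, q, slip=0.1, guess=0.2),
          dino(alpha, q, slip=0.1, guess=0.2),
          round(reduced_rum(alpha, q, pi_star=0.9, r_star=(0.5, 0.4)), 3),
          round(compensatory_rum(alpha, q, intercept=-1.0, slopes=(1.5, 1.5)), 3))
```

Comparing the rows shows where the models disagree: an examinee with exactly one of the two required skills is treated like a non-master by the DINA, like a full master by the DINO, and as an intermediate case by both RUM variants.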
The LCDM is designed in such a way that it can express each of these four
models via particular constraints on the coefficients to be estimated. When
left unconstrained, the pattern of coefficients obtained from the LCDM
indicates which of the four incorporated models most appropriately describes
the relationship between attribute mastery and the probability of a correct
response to a test item.
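To make the idea of constraints concrete, consider an item that requires two skills. Under the usual logit-linear parameterization (assumed here, since this proposal does not print the model equation), the log-odds of a correct response is lam0 + lam1*a1 + lam2*a2 + lam12*a1*a2, where a1 and a2 are the 0/1 mastery indicators. The sketch below shows coefficient patterns that mimic each of the four models. All numeric values and function names are hypothetical.

```python
import math

def lcdm_prob(alpha, lam0, lam1, lam2, lam12):
    """P(correct) for a two-skill item: intercept, two main effects, interaction."""
    a1, a2 = alpha
    logit = lam0 + lam1 * a1 + lam2 * a2 + lam12 * a1 * a2
    return 1 / (1 + math.exp(-logit))

def logit(p):
    return math.log(p / (1 - p))

PROFILES = [(0, 0), (1, 0), (0, 1), (1, 1)]

# DINA-like pattern: no main effects, only the interaction raises the logit.
dina_like = dict(lam0=-1.5, lam1=0.0, lam2=0.0, lam12=3.5)
# DINO-like pattern: equal main effects, cancelled by a negative interaction.
dino_like = dict(lam0=-1.5, lam1=3.5, lam2=3.5, lam12=-3.5)
# Compensatory-RUM-like pattern: main effects only, no interaction.
crum_like = dict(lam0=-1.5, lam1=1.5, lam2=1.5, lam12=0.0)

def rrum_to_lcdm(pi_star, r1, r2):
    """Coefficients that reproduce a two-skill Reduced RUM item exactly.

    The interaction is not free: it is determined by pi_star, r1 and r2.
    """
    lam0 = logit(pi_star * r1 * r2)          # neither skill mastered
    lam1 = logit(pi_star * r2) - lam0        # only skill 1 mastered
    lam2 = logit(pi_star * r1) - lam0        # only skill 2 mastered
    lam12 = logit(pi_star) - lam0 - lam1 - lam2
    return dict(lam0=lam0, lam1=lam1, lam2=lam2, lam12=lam12)

def table(params):
    """Rounded P(correct) for each profile, in the order given by PROFILES."""
    return [round(lcdm_prob(p, **params), 3) for p in PROFILES]

rrum_like = rrum_to_lcdm(pi_star=0.9, r1=0.5, r2=0.4)
```

Read in the other direction, this is the model-selection idea: if the unconstrained estimates for an item show main effects near zero and a large positive interaction, the item behaves like a DINA item, whereas an interaction near zero points toward the Compensatory RUM.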
Methods:
The LCDM is applied to data from a 2003 administration of the ECPE. A Markov
Chain Monte Carlo procedure is used to estimate the model coefficients. The
chains are of length 10,000, with a burn-in of 7,000 iterations. This length
allows for stable estimates to be produced. The chains are visually inspected
for convergence. Once the estimates are obtained, the observed score
distribution is compared with the model-predicted score distribution. To do
this, both distributions are plotted on the same graph and compared visually.
This acts as a check of the reasonableness of the model. Additionally, the
correlations between observed and model-predicted values are calculated.
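The post-estimation steps in this paragraph can be sketched as follows. This is an illustration with invented toy numbers, not the study's code or data: the helper names are hypothetical, the point estimate is assumed to be the mean of the draws kept after burn-in, and the plotting step is omitted. If the 7,000 burn-in iterations are counted within the 10,000, then 3,000 draws per chain would be retained.

```python
import math

def posterior_mean(chain, burn_in):
    """Average the draws that remain after discarding the first burn_in draws."""
    kept = chain[burn_in:]
    return sum(kept) / len(kept)

def score_distribution(total_scores, n_items):
    """Proportion of examinees at each total score from 0 to n_items."""
    counts = [0] * (n_items + 1)
    for s in total_scores:
        counts[s] += 1
    n = len(total_scores)
    return [c / n for c in counts]

def pearson(x, y):
    """Pearson correlation between two equal-length sequences.

    Assumes neither sequence is constant (no zero-variance guard).
    """
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

# Toy example: a short chain for one coefficient and a 3-item test.
toy_chain = [5.0, 4.0, 1.2, 1.0, 0.8, 1.1, 0.9]
estimate = posterior_mean(toy_chain, burn_in=2)

observed_scores = [0, 1, 1, 2, 2, 2, 3, 3]    # invented observed totals
predicted_scores = [0, 1, 2, 2, 2, 3, 3, 1]   # invented model-simulated totals
observed_dist = score_distribution(observed_scores, n_items=3)
predicted_dist = score_distribution(predicted_scores, n_items=3)
agreement = pearson(observed_dist, predicted_dist)
```

A correlation close to one between the two distributions, together with curves that overlap when plotted, would be read as evidence that the model reproduces the observed scores reasonably well.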
Beyond model fit, the LCDM produces attribute mastery profiles for the
examinees. These will be compared to a previous analysis of the ECPE using the
Reduced RUM. Item parameters based on the LCDM and the Reduced RUM will also be
compared to determine whether they are compatible.
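Because CDMs are latent class models, an examinee's attribute mastery profile can be obtained by weighing each candidate profile against the observed responses. The sketch below shows that calculation for fixed, already-estimated item probabilities. It is a simplified illustration, not the MCMC procedure used in the study: the item probabilities, the uniform prior, and the function names are all hypothetical.

```python
def profile_posterior(responses, p_correct, prior):
    """Posterior probability of each profile given a 0/1 response vector.

    p_correct[profile][j] is P(item j correct | profile); items are assumed
    to be conditionally independent given the profile.
    """
    weights = {}
    for profile, pr in prior.items():
        likelihood = 1.0
        for x, p in zip(responses, p_correct[profile]):
            likelihood *= p if x == 1 else 1 - p
        weights[profile] = pr * likelihood
    total = sum(weights.values())
    return {profile: w / total for profile, w in weights.items()}

def skill_mastery_probabilities(posterior, n_skills):
    """Marginal probability that each skill is mastered."""
    return [sum(pr for profile, pr in posterior.items() if profile[k] == 1)
            for k in range(n_skills)]

# Hypothetical 3-item, 2-skill test: item 0 needs skill 0, item 1 needs
# skill 1, item 2 needs both (probabilities are invented, DINA-like).
p_correct = {
    (0, 0): [0.2, 0.2, 0.2],
    (1, 0): [0.9, 0.2, 0.2],
    (0, 1): [0.2, 0.9, 0.2],
    (1, 1): [0.9, 0.9, 0.9],
}
prior = {profile: 0.25 for profile in p_correct}  # uniform prior (assumed)

posterior = profile_posterior([1, 0, 0], p_correct, prior)
best_profile = max(posterior, key=posterior.get)
mastery = skill_mastery_probabilities(posterior, n_skills=2)
```

For this response pattern (only the first item correct), the most probable profile is mastery of skill 0 alone. Profiles obtained this way from two models, such as the LCDM and the Reduced RUM, could then be compared examinee by examinee.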
Data Source:
These data are from a 2003 administration of the Examination for the
Certificate of Proficiency in English (ECPE). The ECPE is an advanced test of
English language ability developed by the English Language Institute at the
University of Michigan. This sample includes 2,922 examinees from countries in
Europe, Africa, and Asia.
Results:
Results are not currently available, but will be provided by the time of
presentation.