An Application of the Logistic Cognitive Diagnosis Model

Matthew Burke
mjburke@uncg.edu
UNC Greensboro
Dr. Robert Henson, UNCG
rahenson@uncg.edu

This study applies the Logistic Cognitive Diagnosis Model (LCDM) to data from the grammar section of the 2003-2004 Examination for the Certificate of Proficiency in English (ECPE), developed and scored by the English Language Institute of the University of Michigan. Cognitive Diagnosis Models (CDMs) are special cases of latent class models in which classes are defined by mastery or non-mastery of a set of skills required by test items. One distinct advantage of CDMs over Classical Test Theory and Item Response Theory is that CDMs provide diagnostic information about which skills required by a test have or have not been mastered by each examinee. This allows for directed instruction to address any shortcomings in test performance. In the age of accountability ushered in by the No Child Left Behind Act (2001), CDMs offer an opportunity for standardized tests to do more than simply rank examinees by overall ability. The LCDM is a more general expression of several specific CDMs (Reduced RUM, DINA, DINO, and the Compensatory RUM), and is designed to function, in part, as a model selector. Results are discussed in relation to a previous analysis of this dataset using the Reduced RUM (Henson & Templin, in press), and findings are interpreted in the context of the LCDM. The results suggest promising outcomes for the application of the LCDM to the field of language testing. Alternative approaches and suggestions for future research are also proposed.

Objectives: 
Cognitive Diagnosis Models (CDMs) are relative newcomers to the field of testing and measurement, and much work remains to assess their utility. In contrast to Classical Test Theory and Item Response Theory, which represent examinee ability as a single continuous measure, CDMs describe examinee ability as the presence or absence of the skills required to respond correctly to test items. The primary advantage of a CDM approach over Classical Test Theory or Item Response Theory is that it can provide diagnostic information to students and teachers about which of the skills required by a test have been mastered. This in turn identifies the skills that have not been mastered, allowing for directed instruction to address those skills specifically. CDMs differ in the way they define the relationship between skill mastery and the resulting probability of a correct response to a test item. It is therefore desirable to know which CDM is most appropriate in a given situation. The purpose of the current study is to investigate the utility of a newly defined CDM, the Logistic Cognitive Diagnosis Model (LCDM). For this study, the LCDM is applied to a large-scale, standardized language test, the Examination for the Certificate of Proficiency in English (ECPE). Applying the model to real testing data serves three aims. First, it assesses the capability of the LCDM to function as a model selector. Second, it addresses the capability of the LCDM to provide diagnostic information about which skills required by the test items have been mastered by the examinees. Third, it allows the current analysis to be compared and contrasted with a previous analysis of the ECPE data.

Theoretical Framework: 
Cognitive Diagnosis Models represent examinee ability as a profile of dichotomous skills, rather than a single continuous measure of ability. The attribute mastery profile for each examinee is a vector of ones and zeroes representing mastery or non-mastery of each of the skills required by test items. Additionally, a test can be described by a Q-Matrix (Tatsuoka, 1985), which is a “blueprint” indicating which particular skills are required by each item on a test. The Q-Matrix is an attribute by test item matrix of ones and zeroes, where ones indicate that an attribute is required by an item, and zeroes indicate that it is not. The Q-Matrix, in conjunction with an examinee’s abilities (as represented by the attribute mastery profile), provides information concerning the individual’s chances of answering items on the test correctly. CDMs have the desirable quality of providing information about the acquired skills of examinees, which allows for focused instruction to address any non-mastered skills. They also provide information about the appropriateness of the Q-Matrix specification (Henson & Templin, 2006).
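
As an illustration of how a Q-Matrix and an attribute mastery profile combine, the following minimal Python sketch lists, for each item, the required attributes that an examinee has not mastered. The matrix, the profile, and the function name missing_skills are hypothetical values chosen for illustration; they are not taken from the ECPE analysis.

```python
# Hypothetical Q-Matrix in attribute-by-item orientation:
# rows are attributes, columns are items; 1 = attribute required by the item.
Q = [
    [1, 0, 1, 1],  # attribute 0
    [0, 1, 1, 0],  # attribute 1
    [0, 0, 0, 1],  # attribute 2
]

# Hypothetical attribute mastery profile: 1 = mastered, 0 = not mastered.
alpha = [1, 0, 1]


def missing_skills(q_matrix, profile):
    """For each item, list the required attributes the examinee has not mastered."""
    n_attributes = len(q_matrix)
    n_items = len(q_matrix[0])
    return [
        [k for k in range(n_attributes) if q_matrix[k][j] == 1 and profile[k] == 0]
        for j in range(n_items)
    ]


missing = missing_skills(Q, alpha)
for item, skills in enumerate(missing):
    print(f"item {item}: unmastered required attributes = {skills}")
```

An empty list for an item means the examinee has mastered every attribute that item requires. How a non-empty list translates into a response probability depends on the particular CDM.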


The LCDM is a logistic expression of a CDM, and is designed to be a more general form of four other CDMs. These four models are the Deterministic Input Noisy “And” Gate Model (DINA; Haertel, 1989), the Deterministic Input Noisy “Or” Gate Model (DINO; Templin and Henson, 2006), the Reduced Re-Parameterized Unified Model (Reduced RUM; Hartz, 2002), and the Compensatory Re-Parameterized Unified Model (Compensatory RUM; Hartz, 2002). Each of these four models has its own definition of the relationship between the skills an examinee has mastered and the probability of a correct response to an item. The DINA describes the relationship between the Q-Matrix and the examinee attribute mastery profile in an all-or-none fashion (Haertel, 1989). If the examinee possesses all of the requisite abilities, they have a high probability of answering the item correctly. Lacking any one of the requisite attributes reduces the probability of a correct response to chance (or guessing) levels. The DINO (Templin and Henson, 2006) states that mastering any one of the required attributes for an item leads to a high probability of a correct response. An examinee must lack all of the required skills to be reduced to guessing at the correct answer. The Reduced RUM (Hartz, 2002) is defined such that if all skills required by an item are mastered by an examinee, then the probability of a correct response is high. For each required skill lacked by an examinee, there is a proportional reduction in the probability of a correct response. The Compensatory RUM (Hartz, 2002) states that mastery of each additional required skill increases the probability of a correct response in an additive fashion. The LCDM is designed in such a way that it can express each of these four models via particular constraints on the coefficients to be estimated. When left unconstrained, the pattern of coefficients obtained from the LCDM indicates which of the four incorporated models most appropriately describes the relationship between attribute mastery and the probability of a correct response to a test item.
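
The role of the constraints can be sketched for a single item that requires two attributes. The Python sketch below assumes a common presentation of the LCDM in which the log-odds of a correct response are an intercept plus a main effect for each mastered attribute plus an interaction term; this parameterization, the names lam0, lam1, lam2, and lam12, and all numeric values are illustrative assumptions rather than estimates or notation from this study. The Reduced RUM is omitted because its constraint on the interaction term is more involved.

```python
import math


def lcdm_prob(alpha1, alpha2, lam0, lam1, lam2, lam12):
    """Probability of a correct response for a two-attribute item.

    Assumed form: logit(P) = lam0 + lam1*a1 + lam2*a2 + lam12*a1*a2.
    """
    logit = lam0 + lam1 * alpha1 + lam2 * alpha2 + lam12 * alpha1 * alpha2
    return 1.0 / (1.0 + math.exp(-logit))


# The four possible mastery profiles for two attributes.
PROFILES = [(0, 0), (1, 0), (0, 1), (1, 1)]


def response_pattern(lam0, lam1, lam2, lam12):
    """Response probabilities (rounded) for each profile in PROFILES."""
    return [round(lcdm_prob(a1, a2, lam0, lam1, lam2, lam12), 3) for a1, a2 in PROFILES]


# DINA-like: main effects fixed at zero, so only mastering both attributes helps.
dina = response_pattern(lam0=-1.5, lam1=0.0, lam2=0.0, lam12=3.5)

# DINO-like: equal main effects and an interaction that cancels one of them,
# so mastering either attribute is as good as mastering both.
dino = response_pattern(lam0=-1.5, lam1=3.5, lam2=3.5, lam12=-3.5)

# Compensatory-RUM-like: interaction fixed at zero, so each mastered
# attribute adds to the log-odds independently.
compensatory = response_pattern(lam0=-1.5, lam1=1.5, lam2=1.5, lam12=0.0)

for name, pattern in [("DINA", dina), ("DINO", dino), ("Compensatory", compensatory)]:
    print(name, dict(zip(PROFILES, pattern)))
```

Under these assumptions, an unconstrained fit whose main effects are near zero with a large positive interaction would point toward DINA-like behavior for that item, while a near-zero interaction would point toward compensatory behavior.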

Methods: 
The LCDM is applied to data from a 2003 administration of the ECPE. A Markov Chain Monte Carlo procedure is used to estimate the model coefficients. The chains are of length 10,000, with a burn-in of 7,000 iterations, leaving 3,000 iterations for estimation. This length allows stable estimates to be produced. The chains are visually inspected for convergence. Once the estimates are obtained, the observed score distribution is compared with the model-predicted score distribution. To do this, both distributions are plotted together and compared visually. This acts as a check of the reasonableness of the model. Additionally, the correlations between observed and model-predicted values are calculated.
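
The burn-in and correlation steps can be sketched as follows. This is a minimal illustration only: the chain here is simulated noise standing in for the draws of one coefficient, and the function names summarize_chain and pearson are hypothetical. It does not reproduce the estimation procedure used in the study.

```python
import math
import random
import statistics

CHAIN_LENGTH = 10_000  # total iterations, as stated above
BURN_IN = 7_000        # iterations discarded before summarizing


def summarize_chain(chain, burn_in=BURN_IN):
    """Discard the burn-in draws and return (n_kept, mean, sd) of the rest."""
    kept = chain[burn_in:]
    return len(kept), statistics.mean(kept), statistics.stdev(kept)


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Simulated stand-in for the sampled values of a single coefficient.
random.seed(0)
chain = [random.gauss(1.0, 0.1) for _ in range(CHAIN_LENGTH)]

n_kept, post_mean, post_sd = summarize_chain(chain)
print(f"kept {n_kept} draws, mean {post_mean:.3f}, sd {post_sd:.3f}")

# Hypothetical observed and model-predicted values for a handful of items.
observed = [0.62, 0.71, 0.48, 0.85, 0.57]
predicted = [0.60, 0.74, 0.50, 0.83, 0.55]
print(f"correlation = {pearson(observed, predicted):.3f}")
```

The summary after burn-in serves as the point estimate for the coefficient, and the correlation gives a single-number complement to the visual comparison of the two distributions.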


Beyond model fit, the LCDM produces an attribute mastery profile for each examinee. These profiles will be compared to those from a previous analysis of the ECPE using the Reduced RUM. Item parameters based on the LCDM and the Reduced RUM will also be compared to determine whether they are compatible.

Data Source: 
These data are from a 2003 administration of the Examination for the Certificate of Proficiency in English (ECPE). The ECPE is an advanced test of English language ability developed by the English Language Institute at the University of Michigan. This sample includes 2,922 examinees from countries in Europe, Africa, and Asia.

Results: 
Results are not currently available, but will be provided by the time of presentation.