This function estimates a version of the hierarchical rater model (HRM) based on signal detection theory (HRM-SDT; DeCarlo, 2005; DeCarlo, Kim & Johnson, 2011; Robitzsch & Steinfeld, 2018). The model is estimated by means of an EM algorithm adapted from multilevel latent class analysis (Vermunt, 2008).
rm.sdt(dat, pid, rater, Qmatrix=NULL, theta.k=seq(-9, 9, len=30),
    est.a.item=FALSE, est.c.rater="n", est.d.rater="n", est.mean=FALSE,
    est.sigma=TRUE, skillspace="normal", tau.item.fixed=NULL, a.item.fixed=NULL,
    d.min=0.5, d.max=100, d.start=3, c.start=NULL, tau.start=NULL, sd.start=1,
    d.prior=c(3,100), c.prior=c(3,100), tau.prior=c(0,1000), a.prior=c(1,100),
    link_item="GPCM", max.increment=1, numdiff.parm=0.00001, maxdevchange=0.1,
    globconv=.001, maxiter=1000, msteps=4, mstepconv=0.001, optimizer="nlminb" )

# S3 method for rm.sdt
summary(object, file=NULL, ...)

# S3 method for rm.sdt
plot(x, ask=TRUE, ...)

# S3 method for rm.sdt
anova(object,...)

# S3 method for rm.sdt
logLik(object,...)

# S3 method for rm.sdt
IRT.factor.scores(object, type="EAP", ...)

# S3 method for rm.sdt
IRT.irfprob(object,...)

# S3 method for rm.sdt
IRT.likelihood(object,...)

# S3 method for rm.sdt
IRT.posterior(object,...)

# S3 method for rm.sdt
IRT.modelfit(object,...)

# S3 method for IRT.modelfit.rm.sdt
summary(object,...)
| Argument | Description |
|---|---|
| dat | Original data frame. Ratings on variables must be in rows, i.e. every row corresponds to a person-rater combination. |
| pid | Person identifier. |
| rater | Rater identifier. |
| Qmatrix | An optional Q-matrix. If this matrix is not provided, then by default the ordinary scoring of categories (from 0 to the maximum score of \(K\)) is used. |
| theta.k | A grid of theta values for the ability distribution. |
| est.a.item | Should item parameters \(a_i\) be estimated? |
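| est.c.rater | Type of estimation for item-rater parameters \(c_{ir}\) in the signal detection model. Options are 'n' (no estimation), 'e' (all parameters set equal to each other), 'i' (item-wise estimation), 'r' (rater-wise estimation) and 'a' (item- and rater-specific estimation). |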
| est.d.rater | Type of estimation of \(d\) parameters. Options are the same as in est.c.rater. |
| est.mean | Optional logical indicating whether the mean of the trait distribution should be estimated. |
| est.sigma | Optional logical indicating whether the standard deviation of the trait distribution should be estimated. |
| skillspace | Specified \(\theta\) distribution type. It can be "normal" or "discrete" (see Example 3). |
| tau.item.fixed | Optional matrix with three columns specifying fixed \(\tau\) parameters. The first two columns denote item and category indices, the third the fixed value. See Example 3. |
| a.item.fixed | Optional matrix with two columns specifying fixed \(a\) parameters. First column: Item index. Second column: Fixed \(a\) parameter. |
| d.min | Minimal \(d\) parameter to be estimated |
| d.max | Maximal \(d\) parameter to be estimated |
| d.start | Starting value(s) of \(d\) parameters |
| c.start | Starting values of \(c\) parameters |
| tau.start | Starting values of \(\tau\) parameters |
| sd.start | Starting value for trait standard deviation |
| d.prior | Normal prior \(N(M,S^2)\) for \(d\) parameters |
| c.prior | Normal prior for \(c\) parameters. The prior for parameter \(c_{irk}\) is defined as \(M \cdot ( k - 0.5) \) where \(M\) is c.prior[1]. |
| tau.prior | Normal prior for \(\tau\) parameters |
| a.prior | Normal prior for \(a\) parameters |
| link_item | Type of item response function for latent responses. Can be "GPCM" for the generalized partial credit model or "GRM" for the graded response model. |
| max.increment | Maximum increment of item parameters during estimation |
| numdiff.parm | Numerical differentiation step width |
| maxdevchange | Maximum relative deviance change as a convergence criterion |
| globconv | Maximum parameter change |
| maxiter | Maximum number of iterations |
| msteps | Maximum number of iterations during an M step |
| mstepconv | Convergence criterion in an M step |
| optimizer | Choice of optimization function in the M-step for item parameters. The default is "nlminb". |
| object | Object of class rm.sdt |
| file | Optional file name in which summary should be written. |
| x | Object of class rm.sdt |
| ask | Optional logical indicating whether a new plot should be asked for. |
| type | Factor score estimation method. Up to now, only type="EAP" is supported. |
| ... | Further arguments to be passed |
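The layout expected for dat, pid and rater can be illustrated with a small made-up data set. The sketch below is only a toy example: the column names idstud, rater, k1 and k2 and all values are hypothetical. It shows that each row holds the ratings one rater gave to one person.

```r
# Toy long-format rating data (all names and values are hypothetical):
# one row per person-rater combination, one column per rated item.
ratings <- data.frame(
  idstud = rep(1:4, each = 2),             # person identifier (pid)
  rater  = rep(c("R1", "R2"), times = 4),  # rater identifier
  k1     = c(0, 1, 2, 2, 1, 1, 3, 2),      # ratings on item 1, scored 0..K
  k2     = c(1, 1, 2, 3, 0, 1, 3, 3)       # ratings on item 2, scored 0..K
)

# In this toy data every person-rater pair appears exactly once.
n_dup <- sum(duplicated(ratings[, c("idstud", "rater")]))

# These pieces would then map onto the arguments dat, pid and rater.
dat_items <- ratings[, c("k1", "k2")]
pid       <- ratings$idstud
rater     <- ratings$rater
```

With real data, the item columns would be passed as dat and the two identifier columns as pid and rater, as in the examples at the end of this page.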
The specification of the model follows DeCarlo et al. (2011).
The second level models the ideal rating (latent response) \(\eta=0, ...,K\)
of person \(p\) on item \(i\). The option link_item='GPCM' follows the
generalized partial credit model
$$ P( \eta_{pi}=\eta | \theta_p ) \propto
\exp( a_{i} q_{i \eta } \theta_p - \tau_{i \eta } ) $$
The option link_item='GRM' employs the graded response model
$$ P( \eta_{pi}=\eta | \theta_p )=
\Psi( \tau_{i,\eta + 1} - a_i \theta_p ) - \Psi( \tau_{i,\eta} - a_i \theta_p ) $$
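The two link functions for the latent response can be sketched in a few lines of base R. This is an illustration under stated assumptions, not the package's internal code: the scores are taken as \(q_{i\eta}=\eta\), the GPCM uses \(\tau_{i0}=0\), \(\Psi\) is assumed to be the logistic distribution function, and the GRM boundary thresholds are set to \(-\infty\) and \(+\infty\). The function names gpcm_probs and grm_probs and the parameter values are hypothetical.

```r
# Latent-response probabilities P(eta = 0..K | theta) for a single item.
# Assumptions of this sketch: q = 0..K, tau_0 = 0 for the GPCM,
# Psi = logistic cdf and boundary thresholds -Inf / +Inf for the GRM.
gpcm_probs <- function(theta, a, tau, q = 0:length(tau)) {
  z <- a * q * theta - c(0, tau)   # a_i * q_eta * theta - tau_eta
  ez <- exp(z - max(z))            # subtract the maximum for numerical stability
  ez / sum(ez)                     # normalize over eta = 0..K
}

grm_probs <- function(theta, a, tau) {
  # Psi(tau_eta - a * theta) for eta = 0..K+1, including the boundaries
  cum <- stats::plogis(c(-Inf, tau, Inf) - a * theta)
  diff(cum)                        # differences of adjacent cumulative terms
}

p_gpcm <- gpcm_probs(theta = 0.5, a = 1, tau = c(-1, 0, 1.5))
p_grm  <- grm_probs(theta = 0.5, a = 1, tau = c(-1, 0, 1.5))
```

Both vectors contain \(K+1=4\) probabilities that sum to one. For the GRM the thresholds must be increasing, otherwise the differences can become negative.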
At the first level, the ratings \(X_{pir}\) of person \(p\) on item \(i\) by rater \(r\) are modeled by a signal detection model
$$ P( X_{pir} \le k | \eta_{pi} )= G( c_{irk} - d_{ir} \eta_{pi} ) $$
where \(G\) is the logistic distribution function and the categories are \(k=1,\ldots, K+1\). Because \(1 - G(x)=G(-x)\), the model can be equivalently written as
$$ P( X_{pir} > k | \eta_{pi} )= G( d_{ir} \eta_{pi} - c_{irk} ) $$
The thresholds \(c_{irk}\) can be further restricted to
\(c_{irk}=c_{k}\) (est.c.rater='e'),
\(c_{irk}=c_{ik}\) (est.c.rater='i') or
\(c_{irk}=c_{rk}\) (est.c.rater='r'). The same
holds for rater precision parameters \(d_{ir}\).
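The rater model can be made concrete by turning the cumulative probabilities into category probabilities for each latent response. The following is a minimal sketch, not the package's implementation: the helper sdt_probs is hypothetical, and the thresholds are placed at \(d \cdot (k - 0.5)\) purely for illustration.

```r
# Category probabilities P(X = k | eta), k = 1..K+1, for one item-rater pair,
# derived from P(X <= k | eta) = G(c_k - d * eta) with logistic G.
sdt_probs <- function(eta, c_thresh, d) {
  cum <- c(stats::plogis(c_thresh - d * eta), 1)  # P(X <= k), last one is 1
  diff(c(0, cum))                                 # P(X = k)
}

K <- 3
d <- 3                                  # hypothetical rater precision
c_thresh <- d * (seq_len(K) - 0.5)      # illustrative thresholds (assumption)

# Rows: latent response eta = 0..K, columns: observed category k = 1..K+1
probs <- t(sapply(0:K, sdt_probs, c_thresh = c_thresh, d = d))
dimnames(probs) <- list(paste0("eta=", 0:K), paste0("X=", 1:(K + 1)))
round(probs, 3)
```

Each row is a probability distribution over the observed categories. With these illustrative thresholds the most likely rating in each row is the category matching the latent response, and increasing d concentrates more mass on that category.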
A list with the following entries:
Deviance
Information criteria and number of parameters
Data frame with item parameters. The columns
N and M denote the number of observed ratings and the
observed mean of all ratings, respectively.
In addition to item parameters \(\tau_{ik}\) and \(a_i\), the mean
for the latent response (latM) is computed as
\(E( \eta_i )=\sum_p \sum_k P( \theta_p ) q_{ik} P( \eta_i=k | \theta_p ) \)
which provides an item parameter at the original metric of ratings. The latent standard
deviation (latSD) is computed in the same manner.
Data frame with rater parameters.
Transformed \(c\) parameters
(c_x.trans) are computed as \(c_{irk} / ( d_{ir} )\).
Data frame with person parameters: EAP and corresponding standard errors
EAP reliability
Mean of the trait distribution
Standard deviation of the trait distribution
Item parameters \(\tau_{ik}\)
Standard error of item parameters \(\tau_{ik}\)
Item slopes \(a_i\)
Standard error of item slopes \(a_i\)
Rater parameters \(c_{irk}\)
Standard error of rater severity parameter \(c_{irk}\)
Rater slope parameter \(d_{ir}\)
Standard error of rater slope parameter \(d_{ir}\)
Individual likelihood
Individual posterior distribution
Item probabilities at grid theta.k. Note that these
probabilities are calculated on the pseudo items \(i \times r\),
i.e. the interaction of item and rater.
Probabilities \(P( \eta_i=\eta | \theta )\) of latent item responses evaluated at theta grid \(\theta_p\).
Expected counts
Estimated trait distribution \(P(\theta_p)\).
Maximum number of categories
Processed data
Number of iterations
Further values
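The latent mean (latM) and latent standard deviation (latSD) reported for items can be sketched on a theta grid as follows. This is an illustration under assumptions rather than the package's code: a normal trait distribution discretized on the default theta.k grid, a GPCM item with hypothetical parameters and \(\tau_{i0}=0\), and latSD taken as the square root of the variance of the scores \(q_{ik}\) under the same weights.

```r
# Sketch of latM and latSD for one item on a theta grid.
theta_k <- seq(-9, 9, length.out = 30)   # grid as in the theta.k default
w <- stats::dnorm(theta_k)
w <- w / sum(w)                          # discretized P(theta_p)

a <- 1                                   # hypothetical item slope
tau <- c(-1, 0, 1.5)                     # hypothetical tau parameters
q <- 0:3                                 # ordinary scoring 0..K

# GPCM probabilities: rows = grid points, columns = eta = 0..K
z <- outer(theta_k, a * q) -
  matrix(c(0, tau), nrow = length(theta_k), ncol = length(q), byrow = TRUE)
P <- exp(z - apply(z, 1, max))           # row-wise stabilization
P <- P / rowSums(P)

# Expected latent response and its standard deviation
latM  <- sum(w * (P %*% q))
latSD <- sqrt(sum(w * (P %*% q^2)) - latM^2)
```

Here latM lies between 0 and \(K\) and is therefore on the original metric of the ratings.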
DeCarlo, L. T. (2005). A model of rater behavior in essay grading based on signal detection theory. Journal of Educational Measurement, 42, 53-76.
DeCarlo, L. T. (2010). Studies of a latent-class signal-detection model for constructed response scoring II: Incomplete and hierarchical designs. ETS Research Report ETS RR-10-08. Princeton NJ: ETS.
DeCarlo, L. T., Kim, Y., & Johnson, M. S. (2011). A hierarchical rater model for constructed responses, with a signal detection rater model. Journal of Educational Measurement, 48, 333-356.
Robitzsch, A., & Steinfeld, J. (2018). Item response models for human ratings: Overview, estimation methods, and implementation in R. Psychological Test and Assessment Modeling, 60(1), 101-139.
Vermunt, J. K. (2008). Latent class and finite mixture models for multilevel data sets. Statistical Methods in Medical Research, 17, 33-51.
The facets rater model can be estimated with rm.facets.
#############################################################################
# EXAMPLE 1: Hierarchical rater model (HRM-SDT) data.ratings1
#############################################################################

data(data.ratings1, package="sirt")
dat <- data.ratings1

if (FALSE) {
# Model 1: Partial Credit Model: no rater effects
mod1 <- sirt::rm.sdt( dat[, paste0( "k",1:5) ], rater=dat$rater,
            pid=dat$idstud, est.c.rater="n", d.start=100, est.d.rater="n" )
summary(mod1)

# Model 2: Generalized Partial Credit Model: no rater effects
mod2 <- sirt::rm.sdt( dat[, paste0( "k",1:5) ], rater=dat$rater,
            pid=dat$idstud, est.c.rater="n", est.d.rater="n",
            est.a.item=TRUE, d.start=100)
summary(mod2)

# Model 3: Equal effects in SDT
mod3 <- sirt::rm.sdt( dat[, paste0( "k",1:5) ], rater=dat$rater,
            pid=dat$idstud, est.c.rater="e", est.d.rater="e")
summary(mod3)

# Model 4: Rater effects in SDT
mod4 <- sirt::rm.sdt( dat[, paste0( "k",1:5) ], rater=dat$rater,
            pid=dat$idstud, est.c.rater="r", est.d.rater="r")
summary(mod4)

#############################################################################
# EXAMPLE 2: HRM-SDT data.ratings3
#############################################################################

data(data.ratings3, package="sirt")
dat <- data.ratings3
dat <- dat[ dat$rater < 814, ]
psych::describe(dat)

# Model 1: item- and rater-specific effects
mod1 <- sirt::rm.sdt( dat[, paste0( "crit",c(2:4)) ], rater=dat$rater,
            pid=dat$idstud, est.c.rater="a", est.d.rater="a" )
summary(mod1)
plot(mod1)

# Model 2: Differing number of categories per variable
mod2 <- sirt::rm.sdt( dat[, paste0( "crit",c(2:4,6)) ], rater=dat$rater,
            pid=dat$idstud, est.c.rater="a", est.d.rater="a")
summary(mod2)
plot(mod2)

#############################################################################
# EXAMPLE 3: Hierarchical rater model with discrete skill spaces
#############################################################################

data(data.ratings3, package="sirt")
dat <- data.ratings3
dat <- dat[ dat$rater < 814, ]
psych::describe(dat)

# Model 1: Discrete theta skill space with values of 0,1,2 and 3
mod1 <- sirt::rm.sdt( dat[, paste0( "crit",c(2:4)) ], theta.k=0:3, rater=dat$rater,
            pid=dat$idstud, est.c.rater="a", est.d.rater="a", skillspace="discrete" )
summary(mod1)
plot(mod1)

# Model 2: Modelling of one item by using a discrete skill space and
#          fixed item parameters

# fixed tau and a parameters
tau.item.fixed <- cbind( 1, 1:3, 100*cumsum( c( 0.5, 1.5, 2.5)) )
a.item.fixed <- cbind( 1, 100 )

# fit HRM-SDT
mod2 <- sirt::rm.sdt( dat[, "crit2", drop=FALSE], theta.k=0:3, rater=dat$rater,
            tau.item.fixed=tau.item.fixed, a.item.fixed=a.item.fixed, pid=dat$idstud,
            est.c.rater="a", est.d.rater="a", skillspace="discrete" )
summary(mod2)
plot(mod2)
}