What Predicts Variation in Reliability and Validity of Online Peer Assessment? A Large-Scale Cross-Context Study

Bibliographic Details
Published in: Journal of Computer Assisted Learning, Vol. 39, No. 6, pp. 2004-2024
Main Authors: Xiong, Yao; Schunn, Christian D.; Wu, Yong
Format: Journal Article
Language: English
Published: Oxford: Wiley (Wiley Subscription Services, Inc.), 01-12-2023
Description
Summary: Background: For peer assessment, reliability (i.e., consistency in ratings across peers) and validity (i.e., consistency of peer ratings with instructors or experts) are frequently examined in the research literature to address a central concern of instructors and students. Although average levels are generally promising, both reliability and validity can vary substantially from context to context. Meta-analyses have identified a few moderators related to peer assessment reliability/validity, but they have lacked the statistical power to systematically investigate many moderators or to disentangle correlated moderators.
Objectives: The current study fills this gap by addressing which variables influence peer assessment reliability/validity, using a large-scale, cross-context dataset from a shared online peer assessment platform.
Methods: Using multi-level structural equation models, we examined three categories of variables: (1) variables related to the context of peer assessment; (2) variables related to the peer assessment task itself; and (3) variables related to the rating rubrics of peer assessment.
Results and Conclusions: We found that the extent to which the assessed documents varied in quality on the given rubric played a central role in mediating the effects of different predictors on peer assessment reliability/validity. Other variables significantly associated with reliability and validity included Education Level, Language, Discipline, Average Ability of Peer Raters, Draft Number, Assignment Number, Class Size, Average Number of Raters, and Length of Rubric Description. The results provide guidance for practitioners on how to improve the reliability and validity of peer assessments.
ISSN: 0266-4909; 1365-2729
DOI: 10.1111/jcal.12861
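
As an illustrative sketch of the two constructs defined in the summary (this is not the study's estimator; the paper itself fits multi-level structural equation models): reliability is commonly operationalized as an intraclass correlation, ICC(1,k), over an n-documents-by-k-raters score matrix, and validity as the correlation between mean peer ratings and instructor ratings. The function names and toy data below are hypothetical.

import numpy as np

def peer_reliability(ratings):
    """Reliability as ICC(1,k): consistency of mean peer ratings
    across documents. `ratings` is an (n_documents x n_raters) array."""
    n, k = ratings.shape
    doc_means = ratings.mean(axis=1)
    grand_mean = ratings.mean()
    # Between-document and within-document mean squares (one-way ANOVA).
    ms_between = k * np.sum((doc_means - grand_mean) ** 2) / (n - 1)
    ms_within = np.sum((ratings - doc_means[:, None]) ** 2) / (n * (k - 1))
    return (ms_between - ms_within) / ms_between

def peer_validity(ratings, instructor_scores):
    """Validity as the Pearson correlation between the mean peer
    rating and the instructor's rating for each document."""
    doc_means = ratings.mean(axis=1)
    return np.corrcoef(doc_means, instructor_scores)[0, 1]

# Toy example: 5 documents, each scored by 4 peers, plus instructor scores.
rng = np.random.default_rng(0)
true_quality = rng.normal(size=5)
peer = true_quality[:, None] + rng.normal(scale=0.5, size=(5, 4))
instructor = true_quality + rng.normal(scale=0.3, size=5)
print(peer_reliability(peer), peer_validity(peer, instructor))

Note that both quantities shrink as the assessed documents become more uniform in true quality, which is consistent with the summary's finding that quality variation plays a central mediating role.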