Inter-rater variability and repeatability in the assessment of the Tanner–Whitehouse classification of hand radiographs for the estimation of bone age
Published in: | Skeletal radiology Vol. 53; no. 12; pp. 2635 - 2642 |
---|---|
Main Authors: | , , , , , , , , , , , , |
Format: | Journal Article |
Language: | English |
Published: | Berlin/Heidelberg, Springer Berlin Heidelberg, 01-12-2024, Springer Nature B.V |
Summary: | Objective
To determine which bones and which grades had the highest inter-rater variability when employing the Tanner–Whitehouse (T-W) method.
Materials and methods
Twenty-four radiologists were recruited and trained in the T-W classification of skeletal development. The consistency and skill of the radiologists in determining bone development status were assessed using 20 pediatric hand radiographs of children aged 1 to 18 years. Four radiologists had a poor concordance rate and were excluded. The remaining 20 radiologists undertook a repeat reading of the radiographs, and their results were analyzed by comparing them with the mean assessment of two senior experts as the reference standard. Concordance rate, scoring, and Kendall’s W were calculated to evaluate accuracy and consistency.
Results
Both the radius, ulna, and short finger (RUS) system (Kendall’s W = 0.833) and the carpal (C) system (Kendall’s W = 0.944) had excellent consistency, with the RUS system outperforming the C system in terms of scores. The repeatability analysis showed that the second rating test, performed after 2 months of further bone age assessment (BAA) practice, was more consistent and accurate than the first. The capitate had the lowest average concordance rate and scoring, as well as the lowest overall concordance rate for its D classification. Moreover, the G classifications of the seven carpal bones all had a concordance rate less than 0.6. The bones with lower Kendall’s W were likewise those with lower scores and concordance rates.
Conclusion
The D grade of the capitate showed the highest variation, and the use of the Tanner–Whitehouse 3rd edition (T-W3) to determine bone age (BA) was frequently inconsistent. A more comprehensive description focused on the inaccurately rated bones or grades, together with a modification of the T-W3 approach, would significantly advance BAA. |
---|---|
ISSN: | 0364-2348 1432-2161 |
DOI: | 10.1007/s00256-024-04664-w |
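
Note on the statistics named in the summary: the record does not give the authors' computation, so the following is only a minimal Python sketch of how a concordance rate and Kendall's W could be computed. It assumes that T-W grades are encoded as integers (for example A = 1, B = 2, and so on), that the concordance rate is the fraction of a rater's grades that exactly match the reference grades, and that Kendall's W uses the standard tie-corrected formula. The function names `average_ranks`, `kendalls_w`, and `concordance_rate` are hypothetical and are not taken from the article.

```python
from collections import Counter


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kendalls_w(ratings):
    """Tie-corrected Kendall's W.

    ratings[r][s] is the numeric grade rater r gave to subject s
    (assumed encoding, e.g. a radiograph or a bone).
    """
    m = len(ratings)       # number of raters
    n = len(ratings[0])    # number of rated subjects
    rank_rows = [average_ranks(row) for row in ratings]
    rank_sums = [sum(row[s] for row in rank_rows) for s in range(n)]
    mean_sum = m * (n + 1) / 2
    s_stat = sum((r - mean_sum) ** 2 for r in rank_sums)
    # Tie correction: sum of (t^3 - t) over each rater's groups of tied grades.
    tie_term = sum(
        sum(t ** 3 - t for t in Counter(row).values()) for row in ratings
    )
    denom = m ** 2 * (n ** 3 - n) - m * tie_term
    if denom == 0:
        raise ValueError("W is undefined when every rater gives a single grade")
    return 12 * s_stat / denom


def concordance_rate(rater_grades, reference_grades):
    """Fraction of grades that exactly match the reference (assumed definition)."""
    matches = sum(a == b for a, b in zip(rater_grades, reference_grades))
    return matches / len(reference_grades)
```

In this sketch, W ranges from 0 (no agreement in the rank ordering across raters) to 1 (identical rank ordering), while the concordance rate measures exact agreement with the reference; the two can differ, since raters may order cases consistently yet still miss the reference grade.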