Crash injury severity analysis using a two-layer Stacking framework

Bibliographic Details
Published in: Accident Analysis and Prevention, Vol. 122, pp. 226-238
Main Authors: Tang, Jinjun; Liang, Jian; Han, Chunyang; Li, Zhibin; Huang, Helai
Format: Journal Article
Language: English
Published: England: Elsevier Ltd, 01-01-2019
Description
Summary:
• A two-layer Stacking model is proposed to predict crash injury severity.
• The first layer combines three classification methods: RF, AdaBoost and GBDT.
• The second layer predicts crash injury severity based on a Logistic Regression model.
• Several traditional models are compared in binary and multi-class classification experiments.
• Prediction results show that the Stacking model achieves better performance.

Crash injury severity analysis helps traffic management agencies better understand the severity of crashes. This study proposes a two-layer Stacking framework to predict crash injury severity. The first layer integrates the advantages of three base classification methods: RF (Random Forests), AdaBoost (Adaptive Boosting), and GBDT (Gradient Boosting Decision Tree). The second layer completes the classification of crash injury severity with a Logistic Regression model. The data comprise 5538 crashes recorded at 326 freeway diverge areas. In model calibration, several parameters, including the number of trees in the three base classification methods, the learning rate, and the regularization coefficient, are optimized via a systematic grid search. In model validation, the performance of the Stacking model is compared with that of several traditional models, including the Support Vector Machine (SVM), Multi-Layer Perceptron (MLP) and Random Forests (RF), in multi-class classification experiments. The prediction results show that the Stacking model achieves superior performance as measured by two indicators: accuracy and recall. Furthermore, all factors used in severity prediction are grouped into categories according to their influence on the results, and a sensitivity analysis of several significant factors explores how variation in their values affects prediction accuracy.
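
The two-layer structure and grid-search calibration described in the summary can be sketched roughly as follows. This is a minimal illustration with scikit-learn, not the authors' implementation: the synthetic data, the three-class target, the parameter grid values, and the use of out-of-fold predictions to train the second layer are all assumptions, since the record gives neither the features nor the parameter ranges.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, recall_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the crash records (assumption: 3 severity classes).
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# First layer: the three base classifiers named in the summary.
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=30, random_state=0)),
    ("ada", AdaBoostClassifier(n_estimators=30, random_state=0)),
    ("gbdt", GradientBoostingClassifier(n_estimators=30, random_state=0)),
]

# Second layer: Logistic Regression fitted on cross-validated base predictions.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=3,
)

# Small illustrative grid: tree count, learning rate, regularization strength.
param_grid = {
    "rf__n_estimators": [20, 40],
    "gbdt__learning_rate": [0.05, 0.1],
    "final_estimator__C": [0.1, 1.0],
}
search = GridSearchCV(stack, param_grid, cv=3, scoring="accuracy")
search.fit(X_train, y_train)

# Evaluate with the two indicators mentioned in the summary.
y_pred = search.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
recall = recall_score(y_test, y_pred, average="macro")
print(f"best params: {search.best_params_}")
print(f"accuracy={accuracy:.3f}, macro recall={recall:.3f}")
```

Macro-averaged recall is used here only because the sketch has three classes; how the study averages recall across severity levels is not stated in this record.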
ISSN: 0001-4575, 1879-2057
DOI: 10.1016/j.aap.2018.10.016