Understanding Feature Contributions for Traffic Accident Severity Prediction
DOI:
https://doi.org/10.61173/j18kfn16
Keywords:
Traffic accident severity, Random forest, Feature contribution, UK road accident dataset
Abstract
Concern about traffic accidents worldwide has increased because of the heavy toll they take on people and the economy. To investigate the factors that affect the outcome of vehicle accidents in the UK, this study predicts accident severity using a Random Forest model and compares different feature sets covering temporal, contextual and interaction information. Following data preprocessing and class balancing using the Synthetic Minority Oversampling Technique (SMOTE), models built on five feature sets (baseline, temporal, context, interaction and extended) were trained and assessed over 10 iterations. The results show that overall accuracy remains around 0.78 and AUC around 0.60 across all models, indicating stable performance. However, the inclusion of contextual and interaction features slightly improves recall for fatal and serious crashes and reduces the dominance of a few strong predictors, leading to a more balanced feature contribution. These results suggest that richer feature design enhances Random Forest’s ability to capture complex crash patterns beyond the majority class of slight accidents.
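
The class-balancing step mentioned in the abstract can be illustrated with a minimal sketch of the interpolation idea behind SMOTE. This is not the authors' code and not the imbalanced-learn implementation: the function name `smote_oversample`, the two toy features and the class sizes are all hypothetical, chosen only to show how synthetic minority samples are placed between existing minority samples and their nearest neighbours.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    Simplified illustration of the SMOTE idea (hypothetical helper)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data (assumed, not from the paper): [speed limit, vehicles involved]
fatal = [[60.0, 2.0], [70.0, 1.0], [60.0, 3.0], [50.0, 2.0]]
n_slight = 20  # assumed size of the majority class

new_fatal = smote_oversample(fatal, n_new=n_slight - len(fatal), k=3)
balanced_fatal = fatal + new_fatal
print(len(balanced_fatal))  # 20
```

Because each synthetic point lies on the segment between two real minority samples, the oversampled class stays inside the region already occupied by the minority data. In a workflow like the one the abstract describes, oversampling of this kind would normally be applied to the training split only, so that evaluation data contain no synthetic records.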