Performance of the Gold Standard and Machine Learning in Predicting Vehicle Transactions
Logistic regression has long been the gold standard for choice modeling in the transportation field. Despite the rising popularity of machine learning (ML), few studies have applied it to predicting household vehicle transactions. To address this research gap, this paper presents a first application of ML to predicting household vehicle transaction decisions, leveraging a newly processed national panel data set. Model performance is reported for four ML models and the traditional multinomial logit (MNL) model. Instead of treating the gold standard and ML models as competitors, this paper uses ML tools to inform the MNL model-building process. We find that the two gradient-boosting-based methods, CatBoost and LightGBM, are the best-performing ML models, and that logistic models improved with SHAP interpretation tools can achieve performance similar to that of the best-performing ML methods.