Performance of the Gold Standard and Machine Learning in Predicting Vehicle Transactions

Publication Type

Conference Paper

Date Published

01/2022

Authors

Lazar, Alina, Ling Jin, Caitlin Brown, C. Anna Spurlock, Alex Sim, Kesheng Wu

DOI

10.1109/BigData52589.2021.9671286

Abstract

Logistic regression has long been the gold standard for choice modeling in the transportation field. Despite the rising popularity of machine learning (ML), it has rarely been applied to predicting household vehicle transactions. To address this research gap, this paper presents a first use case of applying ML to predict household vehicle transaction decisions, leveraging a newly processed national panel data set. Model performance is reported for four ML models and the traditional multinomial logit (MNL) model. Instead of treating the gold standard and the ML models as competitors, this paper tries to use ML tools to inform the MNL model-building process. We find that the two gradient-boosting-based methods, CatBoost and LightGBM, are the best-performing ML models, and that logistic models improved with SHAP interpretation tools can achieve performance similar to that of the best-performing ML methods.

Journal

2021 IEEE International Conference on Big Data (Big Data)

Year of Publication

2021