Marco Vriens (Marketing) and Chad Vidden (Mathematics & Statistics) co-authored the article "The benefits of the Shapley Value for key drivers analysis," which has been accepted for publication by Henry Stewart in Applied Marketing Analytics.

Linear regression (and other types of regression) is often used in what is referred to as
‘driver modelling’ in customer satisfaction studies. The goal of such research is often to
determine the relative importance of various sub-components of the product or service in
terms of predicting and explaining overall satisfaction. Driver modelling can also be used
to determine the drivers of value, likelihood to recommend, etc. A common problem is
that the independent variables are correlated, making it difficult to get a good estimate
of the importance of the ‘drivers’. This problem is well known under conditions of severe
multicollinearity, and alternatives like the Shapley-value approach have been proposed
to mitigate this issue. This paper shows that the Shapley-value approach may have benefits even in
conditions of mild collinearity. The study compares linear regression, random forests and
gradient boosting with the Shapley-value approach to regression and shows that the
Shapley-value results are more consistent with the bivariate correlations. However, Shapley-value
regression does come at the cost of a small decrease in k-fold cross-validation performance.
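To give a sense of the idea, below is a minimal sketch of one common formulation of Shapley-value regression: decomposing a model's R² across correlated drivers by averaging each driver's marginal contribution over all subsets of the other drivers. The synthetic data, variable names and use of scikit-learn are illustrative assumptions, not material from the paper itself.

```python
# Minimal sketch: Shapley-value decomposition of R^2 across correlated predictors.
# Data and coefficients are synthetic; this is an illustration, not the paper's code.
from itertools import combinations
from math import factorial

import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n, p = 500, 4
X = rng.normal(size=(n, p))
X[:, 1] += 0.6 * X[:, 0]  # induce mild collinearity between two drivers
y = 1.0 * X[:, 0] + 0.8 * X[:, 1] + 0.3 * X[:, 2] + rng.normal(size=n)

def r2(subset):
    """R^2 of an OLS fit using only the predictor columns in `subset`."""
    if not subset:
        return 0.0
    cols = list(subset)
    model = LinearRegression().fit(X[:, cols], y)
    return model.score(X[:, cols], y)

# Shapley value of driver i: weighted average of its marginal R^2 contribution
# over every subset S of the remaining drivers.
shapley = np.zeros(p)
for i in range(p):
    others = [j for j in range(p) if j != i]
    for k in range(len(others) + 1):
        for S in combinations(others, k):
            weight = factorial(len(S)) * factorial(p - len(S) - 1) / factorial(p)
            shapley[i] += weight * (r2(S + (i,)) - r2(S))

print("Shapley shares of R^2:", np.round(shapley, 3))
print("Total R^2 (all drivers):", round(r2(tuple(range(p))), 3))
```

The shares sum to the full-model R², which is what makes them attractive as importance scores when the drivers are correlated and individual regression coefficients become unstable.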
Submitted on: Jan. 11