Going from prediction to prescription: Machine learning’s next frontier?

This article first appeared in Data Science Briefings, the DataMiningApps newsletter. Subscribe now for free if you want to be the first to receive our feature articles, or follow us @DataMiningApps. Do you also wish to contribute to Data Science Briefings? Shoot us an e-mail over at briefings@dataminingapps.com and let’s get in touch!

Contributed by Christopher Bockel-Rickermann.

Key Takeaways

  • Predictive modeling answers "What will happen?"; prescriptive modeling answers "What should we do to make it happen?"
  • Traditional supervised learning excels at predictive modeling, but models trained on empirical data can be biased when used for prescription: we might want to act differently than we did in the past.
  • Training prescriptive models is challenging due to the “fundamental problem of causal inference”, but tailored methods can improve performance.

Introduction

The very first concept that most students of machine learning (ML) and analytics are introduced to is “supervised learning”. Supervised learning is an approach to ML in which models are trained on labeled data, matching some explanatory variables x to an outcome y. The result is a model f(x) into which we can feed unseen data to obtain a prediction of the associated outcome. In a business context, for example, we might use a supervised learning algorithm (e.g., linear regression, regression trees, or XGBoost) to train a model that predicts next year’s sales figures based on historical sales data and economic indicators, such as interest rates.
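
To make the above concrete, here is a minimal sketch of such a supervised learning workflow in Python. All data and feature names are invented for illustration, and a scikit-learn gradient boosting regressor stands in for XGBoost; a real model would of course be trained on actual historical data.

```python
# A minimal, hypothetical sketch of supervised learning for the sales example.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 500

# Made-up explanatory variables x: last year's sales, interest rate, marketing spend
X = np.column_stack([
    rng.normal(100, 20, n),  # last year's sales (k EUR)
    rng.normal(3, 1, n),     # interest rate (%)
    rng.normal(10, 3, n),    # marketing spend (k EUR)
])
# Made-up outcome y: next year's sales
y = 1.1 * X[:, 0] - 5.0 * X[:, 1] + 2.0 * X[:, 2] + rng.normal(0, 10, n)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
f = GradientBoostingRegressor().fit(X_train, y_train)

# The trained model f(x): feed it unseen data, get a prediction of the outcome
print("Predicted next year's sales:", f.predict(X_test[:3]))
print("Test R^2:", f.score(X_test, y_test))
```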

Supervised learning makes some important assumptions: 1) the explanatory variables are exogenous, that is, outside of our control, and 2) the relationships between the explanatory variables (the “features”) and the outcome remain consistent and are not influenced by our actions.

So, what if we are in a situation where we want to intervene on the environment to actively steer an outcome? Can a supervised learning algorithm help us to prescribe, e.g., the optimal amount of marketing spend per customer to maximize our sales?

Correlation versus Causation

When our goal is to know what will happen based on existing trends, we can rely on correlational models, trained in a supervised manner. These models help us prepare for future scenarios by identifying patterns in the data. However, when our objective is to change the outcome by taking specific actions, we need to understand “causation”, that is, the true relationships between actions and outcomes.

Predictive models might tell us that increasing marketing spend is correlated with higher sales, but they don’t tell us whether increasing the marketing spend causes the higher sales. A particular problem here is “confounding”: what if, historically, most marketing efforts were allocated to customers who generated a lot of revenue regardless of our efforts? Then a model might falsely conclude that high marketing spend results in higher sales, even if the actual impact of marketing spend is negligible.
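
The toy simulation below makes this confounding story explicit. All numbers are made up: an unobserved customer “affinity” drives both historical marketing spend and sales, and the true causal effect of spend on sales is deliberately set to zero, yet a naive regression of sales on spend still finds a strong positive relationship.

```python
# A toy simulation of confounding: spend has NO causal effect on sales,
# but both are driven by customer affinity (all numbers are invented).
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 10_000

affinity = rng.normal(size=n)                    # customer quality (the confounder)
spend = 5 + 2 * affinity + rng.normal(size=n)    # historical policy: spend more on good customers
sales = 50 + 10 * affinity + 0.0 * spend + rng.normal(size=n)  # true effect of spend is zero

# Naive predictive model: regress sales on spend only
naive = LinearRegression().fit(spend.reshape(-1, 1), sales)
print("Naive 'effect' of spend:", naive.coef_[0])      # strongly positive, yet spurious

# Adjusting for the confounder recovers the (zero) causal effect
adjusted = LinearRegression().fit(np.column_stack([spend, affinity]), sales)
print("Adjusted effect of spend:", adjusted.coef_[0])  # close to zero
```

In practice the confounder is rarely observed as cleanly as in this toy example, which is exactly what makes prescription from observational data hard.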

So, is ML worthless when it comes to prescribing optimal actions? Have no fear, research has us covered: the field of causal inference studies how to extract cause-and-effect relationships from data. However, many tools from the traditional causal inference literature are not suited for the large and wide datasets that we often face today. That is why the field of “causal ML” has emerged, researching algorithms that leverage concepts from both traditional causal inference and machine learning to allow for causal reasoning on complex data.

Personalized loan pricing – A potential application of causal ML

Let’s discuss a potential application of causal ML: Personalized loan pricing. 

When applying for a loan, a bank typically evaluates the application and proposes a prospective price, a “bid”, that the customer can accept or refuse. To optimize its bids in terms of, e.g., revenue generation, the bank must estimate the individual “bid response”, that is, the probability that a customer accepts or refuses a certain bid.

This is complicated, as the bank only has observational data at hand for modeling, collected under an established pricing policy that might not have been optimal. The data is hence confounded, and adopting supervised learning methods might yield biased estimates of bid responses.

Bockel-Rickermann et al. (2023) evaluate learning individual bid response models from observational data and show that supervised learning methods, such as random forests and neural networks, indeed incorporate biases. In contrast, tailored causal machine learning methods for estimating the effects of continuous-valued interventions, so-called “dose-response estimators”, can appropriately adjust for confounding and build better models.
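
As a hedged illustration (and not the estimator used in the referenced paper), the sketch below shows one simple way to approximate a dose-response curve from confounded observational data: regression adjustment, i.e., fitting an outcome model on the bid and the customer features, and then averaging its predicted acceptance probabilities over the whole customer base for each candidate bid. The data-generating process and feature names are invented.

```python
# A hedged sketch of dose-response estimation via regression adjustment.
# The data, features, and acceptance mechanism below are entirely made up.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
n = 5_000

risk = rng.normal(size=n)                        # observed customer feature (confounder)
bid = 3 + risk + rng.normal(scale=0.5, size=n)   # historical policy: higher bids for riskier customers
p_accept = 1 / (1 + np.exp(-(2 - 0.8 * bid + 0.5 * risk)))  # true bid response
accepted = rng.binomial(1, p_accept)

# Outcome model: acceptance as a function of bid and customer features
outcome_model = RandomForestClassifier(n_estimators=200, random_state=0)
outcome_model.fit(np.column_stack([bid, risk]), accepted)

# Dose-response curve: for each candidate bid, average the predicted acceptance
# probability over the entire customer base, adjusting for the observed confounder
for b in (2.0, 3.0, 4.0):
    X_b = np.column_stack([np.full(n, b), risk])
    rate = outcome_model.predict_proba(X_b)[:, 1].mean()
    print(f"bid={b:.1f}: estimated acceptance rate {rate:.2f}")
```

Note that regression adjustment of this kind only accounts for observed confounders; the referenced paper evaluates dedicated dose-response estimators for the loan pricing task.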

Most importantly, the study shows that traditional performance metrics, such as the Brier score, fail to reveal these biases in models.

Conclusion

This article provided a (very) brief introduction to causal inference, its relation to supervised learning, and the field of causal ML. We have discussed that correlation is not causation and that supervised learning is not able to distinguish one from the other. All of the above fields are still growing rapidly, and we are only at the beginning of their adoption in real life. Expect more to come!

If there is one thing to take away, it is that we should be careful not to just take any data and any method and expect the resulting model to be good for decision support. Don’t fall for spurious correlations or confounding biases. More important for a data scientist than ever is to carefully analyse the problem at hand and to understand an end user’s needs. Do we want to predict outcomes based on exogenous variables? Supervised learning might be an appropriate tool. Do we want to intervene in our environment and change the status quo? Causal ML might be better suited for the task.

References:

For a general introduction to causal inference:

  • Pearl, J., & Mackenzie, D. (2018). The book of why: The new science of cause and effect. Basic Books.

For an overview of causal ML and a seminal paper:

  • Kaddour, J., Lynch, A., Liu, Q., Kusner, M. J., & Silva, R. (2022). Causal machine learning: A survey and open problems. arXiv preprint arXiv:2206.15475.
  • Shalit, U., Johansson, F. D., & Sontag, D. (2017). Estimating individual treatment effect: generalization bounds and algorithms. In International conference on machine learning (pp. 3076-3085). PMLR.

For causal ML for bid response estimation:

  • Bockel-Rickermann, C., Verboven, S., Verdonck, T., & Verbeke, W. (2023). A Causal Perspective on Loan Pricing: Investigating the Impacts of Selection Bias on Identifying Bid-Response Functions. arXiv preprint arXiv:2309.03730.

For a large collection of entertaining spurious correlations: