Hierarchical forecasting: end-to-end or post-hoc?

This article first appeared in Data Science Briefings, the DataMiningApps newsletter.

Contributed by Boje Deforce

Key points

  • We discuss hierarchical forecasting as a method to enforce coherence among forecasts that are linked through a hierarchy
  • We analyze how the majority of hierarchical forecasting happens post-hoc, independently from the estimation process
  • We highlight a recent trend, powered by deep learning, which allows for information sharing during the estimation process, obtaining more effective hierarchical forecasts

Introduction

In today’s fast-paced world, organizations need accurate forecasts at various levels to make well-informed decisions. From companies managing supply chains to businesses forecasting sales across regions or farmers managing irrigation needs, forecasting is key to optimal planning. A powerful approach to achieve reliable predictions across multiple levels is hierarchical forecasting (Athanasopoulos et al., 2024).

What is Hierarchical Forecasting?

Hierarchical forecasting refers to forecasting methods that organize data into a hierarchy, in which the data is divided into levels or groups, and produce forecasts for each level. For example, the top level could represent the soil moisture for an entire field, while the levels below break that down into plots and then further into individual trees. Forecasts at the different levels are related, so hierarchical forecasting imposes reconciliation constraints on the separate forecasts to ensure they are coherent across levels, for example that lower-level forecasts aggregate to the forecast at the level above (Hyndman et al., 2011).
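To make the notion of coherence concrete, here is a minimal sketch in Python with NumPy. It assumes a hypothetical hierarchy (one field, two plots, two trees per plot) and a quantity that aggregates by summation, such as water demand; for an averaged quantity like soil moisture, the rows of the matrix would hold averaging weights instead. The names S, is_coherent and the numbers are illustrative and not taken from the cited papers.

```python
import numpy as np

# Hypothetical hierarchy: 1 field -> 2 plots -> 4 trees (2 per plot).
# Row order: field, plot_A, plot_B, tree_A1, tree_A2, tree_B1, tree_B2.
# Column order: the four bottom-level series (trees).
S = np.array([
    [1, 1, 1, 1],  # field  = sum of all trees
    [1, 1, 0, 0],  # plot_A = tree_A1 + tree_A2
    [0, 0, 1, 1],  # plot_B = tree_B1 + tree_B2
    [1, 0, 0, 0],  # tree_A1
    [0, 1, 0, 0],  # tree_A2
    [0, 0, 1, 0],  # tree_B1
    [0, 0, 0, 1],  # tree_B2
], dtype=float)


def is_coherent(y_all, S, tol=1e-8):
    """Return True if every level equals the aggregate of the bottom level."""
    n_bottom = S.shape[1]
    bottom = y_all[-n_bottom:]
    return bool(np.allclose(y_all, S @ bottom, atol=tol))


# Made-up base forecasts produced independently for each series.
# The field forecast (10.5) does not match the sum of the plots (10.0).
base = np.array([10.5, 5.0, 5.0, 2.0, 3.0, 2.5, 2.5])

# Aggregating the bottom level through S always yields a coherent vector.
coherent = S @ base[-4:]

print(is_coherent(base, S), is_coherent(coherent, S))
```

The matrix S encodes the whole hierarchy, so the same check works for any number of levels as long as the rows and columns are ordered consistently.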

Trends in Hierarchical Forecasting: from post-hoc to end-to-end

Over the last decade, the dominant way to reconcile the separate forecasts has been post-hoc reconciliation, in which reconciliation happens independently of the estimation process. In practice, one of the following approaches is typically considered: bottom-up (generate forecasts at the bottom level and aggregate to higher levels), top-down (generate forecasts at the highest level and subdivide to lower levels based on pre-defined proportions), or middle-out (a combination of bottom-up and top-down), with bottom-up generally performing best of the three in practice (Athanasopoulos et al., 2024). Wickramasuriya et al. (2019) introduced an optimal reconciliation approach that minimizes the total variance of the reconciled forecast errors, building on earlier work by Hyndman et al. (2011). Several recent machine learning (ML)-based reconciliation approaches are also listed in Spiliotis et al. (2021) and Athanasopoulos et al. (2024); many of them propose non-linear, data-driven reconciliation, but still in a post-hoc manner.
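The post-hoc approaches above can be sketched in a few lines of Python with NumPy. This is an illustration under stated assumptions, not the implementation from any of the cited papers: the three-series hierarchy (total, A, B), the base forecasts and the top-down proportions are made up, and the function names are hypothetical. The projection function uses the generalized least-squares form S (S' W^-1 S)^-1 S' W^-1 applied to the base forecasts; with the default identity W it reduces to an ordinary least-squares projection, whereas the approach of Wickramasuriya et al. (2019) relies on an estimate of the base forecast error covariance for W.

```python
import numpy as np

# Summing matrix for a hypothetical hierarchy: total = A + B.
# Row order: total, A, B. Column order: A, B (bottom level).
S = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])

# Made-up, incoherent base forecasts: 45 + 50 != 100.
base = np.array([100.0, 45.0, 50.0])


def bottom_up(base, S):
    """Keep the bottom-level forecasts and aggregate them upwards."""
    n_bottom = S.shape[1]
    return S @ base[-n_bottom:]


def top_down(base, S, proportions):
    """Split the top-level forecast using pre-defined proportions."""
    bottom = base[0] * np.asarray(proportions, dtype=float)
    return S @ bottom


def projection_reconcile(base, S, W=None):
    """Reconcile all base forecasts at once via a weighted projection.

    W stands in for the covariance of the base forecast errors.
    With W=None (identity) this is an ordinary least-squares projection.
    """
    if W is None:
        W = np.eye(S.shape[0])
    W_inv = np.linalg.inv(W)
    G = np.linalg.solve(S.T @ W_inv @ S, S.T @ W_inv)
    return S @ (G @ base)


bu = bottom_up(base, S)
td = top_down(base, S, proportions=[0.5, 0.5])
rec = projection_reconcile(base, S)
print(bu, td, rec)
```

Bottom-up discards the top-level forecast and top-down discards the bottom-level forecasts, while the projection uses all base forecasts and adjusts each of them. In every case the adjustment is applied after the base models have been fitted, which is what makes these approaches post-hoc.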

Recently, an interesting trend arose, powered by deep learning. The primary example is Rangapuram et al. (2021), who highlighted the typical post-hoc nature of hierarchical forecasting and questioned whether there are gains to be made from learning hierarchical forecasts “end-to-end”. End-to-end here means that the reconciliation constraints are taken into account during training/estimation. As a result, the learning process forces the algorithm to share information across the hierarchy. This information sharing has clear benefits, as shown in Rangapuram et al. (2021), where the end-to-end approach outperforms state-of-the-art post-hoc approaches like that of Wickramasuriya et al. (2019). Athanasopoulos et al. (2024) highlight several other end-to-end approaches inspired by Rangapuram et al. (2021) and indicate similar benefits.
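The idea of taking the constraints into account during training can be illustrated with a deliberately simplified sketch in Python with NumPy. It is not the model of Rangapuram et al. (2021), which is a deep probabilistic forecaster; here a plain linear model, synthetic data and a squared-error loss are assumed purely for illustration, and all names are hypothetical. The key point is that the projection onto the coherent subspace sits inside the forward pass, so the loss is computed on coherent forecasts and the gradient for each series depends on the errors of the others.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical hierarchy: total = A + B. Row order: total, A, B.
S = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])

# Orthogonal projection onto the coherent subspace (column space of S).
P = S @ np.linalg.solve(S.T @ S, S.T)

# Synthetic data: 3 features drive 2 bottom series; targets are coherent.
n, d = 200, 3
X = rng.normal(size=(n, d))
B_true = rng.normal(size=(d, 2))
Y_bottom = X @ B_true + 0.1 * rng.normal(size=(n, 2))
Y = Y_bottom @ S.T  # shape (n, 3): total, A, B

# One linear forecaster per series, trained jointly.
W = np.zeros((d, 3))


def forward(X, W):
    """Base forecasts followed by projection (P is symmetric)."""
    return (X @ W) @ P


def loss(X, Y, W):
    """Mean squared error of the projected, hence coherent, forecasts."""
    return float(np.mean((forward(X, W) - Y) ** 2))


initial_loss = loss(X, Y, W)

lr = 0.1
for _ in range(500):
    resid = forward(X, W) - Y
    # Gradient of the mean squared error with respect to W. The trailing P
    # mixes residuals across series, which is the information sharing.
    grad = 2.0 * X.T @ resid @ P / resid.size
    W -= lr * grad

final_loss = loss(X, Y, W)
forecasts = forward(X, W)
print(initial_loss, final_loss)
```

Because the projection is part of the model, the forecasts are coherent at every training step rather than being corrected afterwards. In a deep learning framework the same projection would be one more differentiable layer, with the gradient obtained by automatic differentiation rather than written by hand.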

While the status quo often still relies on post-hoc approaches (Athanasopoulos et al., 2024), clear benefits are available from adopting an end-to-end approach, which allows for information sharing during training. We believe this information-sharing capability has potential in hierarchical settings where individual items in a hierarchy contain useful information about the other items (e.g., in smart irrigation).