Define Hierarchy
Cross-sectional Hierarchy
There are three kinds of cross-sectional hierarchy: regular, grouped, or a mix of both (Hyndman & Athanasopoulos, 2021). The last kind is the most common in practice. In this tutorial, we break down the last kind of hierarchy and show how complex hierarchical structures can be constructed using new().
Almost every hierarchy can be seen as a combination of multiple regular hierarchies. Here, a regular hierarchy is one that can be represented by a tree, where the nodes in each level are unique. For example, a product hierarchy can be “Category” -> “Subcategory” -> “Item”. A geographical hierarchy is another classic example, as is a file system.
Mixing two or more regular hierarchies constructs a more complex but widely used hierarchy. Every “level” in this hierarchy is an interaction of levels taken from different regular hierarchies. Here is an example, again using the product case.
For “category”, the regular hierarchy can be total, category, subcategory, and item. Each product can also be sold at multiple locations. For “location”, the regular hierarchy can be total, state, region, and store.
Mixing these two hierarchies gives the following levels, 4 * 4 = 16 in total:
total = total (category) x total (location)
category = category x total (location)
subcategory = subcategory x total (location)
item = item x total (location)
state = total (category) x state
region = total (category) x region
store = total (category) x store
category x state
category x region
category x store
subcategory x state
subcategory x region
subcategory x store
item x state
item x region
item x store (bottom level)
Another example of this kind of hierarchy is a file system with tags, although tags usually have only one level besides the total level.
To construct the product hierarchy using new(), use the following statement:
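The 16 levels listed above are simply the Cartesian product of the levels of the two regular hierarchies. The following is a minimal sketch of that enumeration in plain Python; it does not use pyhts, and the helper name level_label is hypothetical, introduced here only for illustration.

```python
from itertools import product

# Each regular hierarchy is listed top-down, starting with its "total" level.
category_levels = ["total", "category", "subcategory", "item"]
location_levels = ["total", "state", "region", "store"]

# Every level of the mixed hierarchy is one (category level, location level) pair.
mixed_levels = list(product(category_levels, location_levels))


def level_label(cat, loc):
    """Readable label for a crossed level; the 'total' side is dropped when present."""
    if cat == "total" and loc == "total":
        return "total"
    if loc == "total":
        return cat
    if cat == "total":
        return loc
    return f"{cat} x {loc}"


labels = [level_label(cat, loc) for cat, loc in mixed_levels]
print(len(labels))  # 16
print(labels[-1])   # item x store (the bottom level)
```

Adding a third regular hierarchy with k levels (including its total) would multiply the number of crossed levels by k, which is why excluding some middle levels can be useful in practice.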
Hierarchy.new(df, structures=[('category', 'subcategory', 'item'), ('state', 'region', 'store')])
You can also specify excludes and includes to exclude some levels or to include only some levels, following the same rule.
- class pyhts.Hierarchy(s_mat, node_level, names, period, level_name=None)
Class for a hierarchy structure.
Attributes
- classmethod new(df, structures, excludes=None, includes=None, period=1)
Construct a hierarchy from a data table in which each row represents a unique bottom-level time series. This method is suitable for complex hierarchical structures.
Examples
>>> import pandas as pd
>>> from pyhts import Hierarchy
>>> df = pd.DataFrame({"City": ["A", "A", "B", "B"], "Store": ["Store1", "Store2", "Store3", "Store4"]})
>>> hierarchy = Hierarchy.new(df, [("City", "Store")])
>>> hierarchy.node_name
array(['total_total', 'City_A', 'City_B', 'Store_Store1', 'Store_Store2',
       'Store_Store3', 'Store_Store4'], dtype=object)
>>> hierarchy.s_mat
array([[1, 1, 1, 1],
       [1, 1, 0, 0],
       [0, 0, 1, 1],
       [1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]], dtype=int8)
>>> df = pd.DataFrame({"City": ["A", "A", "B", "B"], "Category": ["C1", "C2", "C1", "C2"]})
>>> hierarchy = Hierarchy.new(df, [("City",), ("Category",)])
>>> hierarchy.s_mat
array([[1, 1, 1, 1],
       [1, 1, 0, 0],
       [0, 0, 1, 1],
       [1, 0, 1, 0],
       [0, 1, 0, 1],
       [1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]], dtype=int8)
- Parameters
df (DataFrame) – DataFrame containing the keys that determine the hierarchical structure.
structures (List[Tuple[str, ...]]) – The structure of the hierarchy. It should be a list in which each element represents the hierarchical structure of one natural hierarchy. Each element should be a tuple of strings (column names in the DataFrame), ordered top-down. For example, (“category”, “sub-category”, “item”) can be a natural hierarchy.
excludes (Optional[List[Tuple[str, ...]]]) – middle levels excluded from the hierarchy.
includes (Optional[List[Tuple[str, ...]]]) – middle levels included in the hierarchy.
period (int) – frequency of the time series; 1 means non-seasonal data, 12 means monthly data.
- Return type
Hierarchy
- Returns
Hierarchy object.
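To make the meaning of s_mat concrete, here is a minimal pure-Python sketch that rebuilds the summing matrix of the second example above (City crossed with Category) by hand. The function summing_matrix and the "bottom_i" row names are hypothetical and written only for illustration; this is not how pyhts builds the matrix internally, and it only handles one-level hierarchies.

```python
# Bottom-level rows, mirroring the second docstring example.
rows = [
    {"City": "A", "Category": "C1"},
    {"City": "A", "Category": "C2"},
    {"City": "B", "Category": "C1"},
    {"City": "B", "Category": "C2"},
]


def summing_matrix(rows, columns):
    """Build a 0/1 summing matrix: total row, one row per key value, then identity rows."""
    n = len(rows)
    names = ["total"]
    s_mat = [[1] * n]
    for col in columns:
        # Keep key values in order of first appearance.
        values = list(dict.fromkeys(row[col] for row in rows))
        for value in values:
            names.append(f"{col}_{value}")
            s_mat.append([1 if row[col] == value else 0 for row in rows])
    # One identity row per bottom-level series.
    for i in range(n):
        names.append(f"bottom_{i}")
        s_mat.append([1 if j == i else 0 for j in range(n)])
    return names, s_mat


names, s_mat = summing_matrix(rows, ["City", "Category"])

# Multiplying the summing matrix by the bottom-level values yields every series.
bottom_values = [10, 20, 30, 40]
all_values = [sum(s * b for s, b in zip(row, bottom_values)) for row in s_mat]
for name, value in zip(names, all_values):
    print(name, value)
```

Each row of the matrix marks which bottom-level series add up to one node, so the first row (all ones) gives the total and the last four rows reproduce the bottom-level series themselves.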
Temporal Hierarchy
A temporal hierarchy is constructed from multiple levels of temporal aggregation of a single time series. pyhts provides TemporalHierarchy to construct temporal hierarchies. The methodology is known as THief (Athanasopoulos et al., 2017).
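As an illustration of what "multiple levels of temporal aggregation" means structurally, the sketch below builds a 0/1 summing matrix for one full cycle of bottom-level periods. The function temporal_summing_matrix is hypothetical and independent of pyhts; in particular, the row ordering (most aggregated level first) is an assumption made here, not something confirmed for the library.

```python
def temporal_summing_matrix(agg_periods):
    """0/1 matrix mapping one full cycle of bottom-level periods to every aggregate."""
    m = max(agg_periods)
    for k in agg_periods:
        if m % k != 0:
            raise ValueError(f"{k} is not a factor of {m}")
    rows = []
    # Most aggregated level first, bottom level (k = 1) last.
    for k in sorted(agg_periods, reverse=True):
        # Level k splits the cycle into m // k non-overlapping blocks of length k.
        for block in range(m // k):
            start = block * k
            rows.append([1 if start <= j < start + k else 0 for j in range(m)])
    return rows


# Quarterly data: annual (4), semi-annual (2) and quarterly (1) levels.
s_mat = temporal_summing_matrix([1, 2, 4])
for row in s_mat:
    print(row)
```

For quarterly data this gives 1 + 2 + 4 = 7 rows, the same shape as a two-level cross-sectional tree, which is why the same reconciliation machinery can be reused for temporal hierarchies.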
- class pyhts.TemporalHierarchy(s_mat, node_level, names, period, level_name)
Class for temporal hierarchy, constructed by multiple temporal aggregations.
Attributes
- classmethod new(agg_periods, forecast_frequency)
TemporalHierarchy constructor.
- Parameters
agg_periods (List[int]) – periods of the aggregation levels, i.e. how many bottom-level periods are aggregated at each level. To ensure a reasonable hierarchy, each element in agg_periods should be a factor of the maximum agg_period. For example, possible aggregation periods for a monthly time series are 2 (two months), 3 (a quarter), 4 (four months), 6 (half a year), and 12 (a year).
forecast_frequency (int) – frequency of the bottom-level series, corresponding to aggregation level 1 in agg_periods.
- Return type
- aggregate_ts(bts, levels=None)
Aggregate a time series to the levels of the temporal hierarchy.
- Parameters
bts (ndarray) – should be a univariate time series.
levels – which levels to aggregate; each should be one of the level_name values.
- Return type
dict
- Returns
a dict whose keys are level names (level_name) and whose values are the temporally aggregated time series.
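The aggregation itself can be sketched in a few lines of plain Python. This is only an illustration of non-overlapping temporal aggregation by summation, under the assumption that the series length is a multiple of every aggregation period; the function aggregate_series and its integer dict keys are hypothetical and do not reproduce the level_name keys returned by aggregate_ts.

```python
def aggregate_series(bts, agg_periods):
    """Sum consecutive, non-overlapping blocks of length k for each k in agg_periods."""
    out = {}
    for k in agg_periods:
        if len(bts) % k != 0:
            raise ValueError(f"series length {len(bts)} is not a multiple of {k}")
        out[k] = [sum(bts[i:i + k]) for i in range(0, len(bts), k)]
    return out


# Two years of toy monthly data: 1, 2, ..., 24.
monthly = list(range(1, 25))
levels = aggregate_series(monthly, [1, 3, 12])

print(levels[12])     # annual totals: [78, 222]
print(levels[3][:2])  # first two quarterly totals: [6, 15]
```

Every level sums to the same overall total, which is the coherence property that forecast reconciliation later enforces on the forecasts.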
References
[1] Hyndman, R. J., & Athanasopoulos, G. (2021). Forecasting: Principles and Practice (3rd ed.). OTexts. https://otexts.com/fpp3/
[2] Athanasopoulos, G., Hyndman, R. J., Kourentzes, N., & Petropoulos, F. (2017). Forecasting with temporal hierarchies. European Journal of Operational Research, 262(1), 60–74. https://doi.org/10.1016/j.ejor.2017.02.046