5.3 Hierarchical Indexing
Duration: 20 minutes
Pandas supports hierarchical indexing, also called multilevel or nested indexing, in which the rows of a DataFrame are labelled by two or more index levels at once. We again use the sales DataFrame from the previous sections to motivate the use of hierarchical indexing.
Let us first make the Customer ID column our index using set_index, and store the new DataFrame as sales1.
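As a minimal sketch of this step, the snippet below builds a tiny stand-in for the sales DataFrame and moves Customer ID into the index. The column names Customer ID and Transaction ID come from the text; the Amount column and all values are made up for illustration and are not the real data set.

```python
import pandas as pd

# Toy stand-in for the sales DataFrame (hypothetical values and Amount column).
sales = pd.DataFrame({
    "Customer ID": [101, 102, 101, 103, 102],
    "Transaction ID": [1, 2, 3, 4, 5],
    "Amount": [20.0, 35.5, 12.0, 50.0, 8.25],
})

# set_index returns a new DataFrame; the original sales is left unchanged.
sales1 = sales.set_index("Customer ID")
print(sales1)
```

Because set_index does not modify sales in place by default, the result has to be assigned to a new name such as sales1.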
Note that the index of sales1 is not unique: a customer may have bought many items.
As we discover below, there are 25 customers in total, each with many transactions.
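One way to check these claims is to ask the index itself whether it is unique and how many distinct labels it holds. The sketch below uses a small hypothetical DataFrame with three customers rather than the real data, so the counts it prints will not match the 25 customers mentioned above.

```python
import pandas as pd

# Toy stand-in for sales1 (hypothetical values and Amount column).
sales1 = pd.DataFrame({
    "Customer ID": [101, 102, 101, 103, 102],
    "Transaction ID": [1, 2, 3, 4, 5],
    "Amount": [20.0, 35.5, 12.0, 50.0, 8.25],
}).set_index("Customer ID")

# False whenever at least one customer appears on more than one row.
print(sales1.index.is_unique)

# Number of distinct customers in the index.
n_customers = sales1.index.nunique()
print(n_customers)

# Number of rows (transactions) per customer.
per_customer = sales1.index.value_counts()
print(per_customer)
```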
Under the principal index Customer ID, we now add a secondary index: Transaction ID.
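A secondary level can be added by calling set_index again with append=True, which keeps the existing index and places the new column beneath it. This is a sketch on made-up data, and the original section may have built sales2 differently, for example by passing both column names to set_index in a single call.

```python
import pandas as pd

# Toy stand-in for sales1 (hypothetical values and Amount column).
sales1 = pd.DataFrame({
    "Customer ID": [101, 102, 101, 103, 102],
    "Transaction ID": [1, 2, 3, 4, 5],
    "Amount": [20.0, 35.5, 12.0, 50.0, 8.25],
}).set_index("Customer ID")

# append=True keeps Customer ID as the outer level and adds
# Transaction ID as the inner level instead of replacing the index.
sales2 = sales1.set_index("Transaction ID", append=True)

print(type(sales2.index).__name__)
print(list(sales2.index.names))
```

Without append=True, set_index would discard Customer ID and leave Transaction ID as the only index.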
The sales2 DataFrame now has a MultiIndex. Using .sort_index(), one can sort the DataFrame by index to see the hierarchy.
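The effect of sorting can be sketched as follows, again on a small hypothetical DataFrame rather than the real sales data. After sorting, rows belonging to the same customer sit together, and selecting an outer label with .loc returns that customer's transactions.

```python
import pandas as pd

# Toy stand-in for sales2 (hypothetical values and Amount column).
sales2 = pd.DataFrame({
    "Customer ID": [101, 102, 101, 103, 102],
    "Transaction ID": [1, 2, 3, 4, 5],
    "Amount": [20.0, 35.5, 12.0, 50.0, 8.25],
}).set_index(["Customer ID", "Transaction ID"])

# Sort by the outer level first, then by the inner level.
sales2_sorted = sales2.sort_index()
print(sales2_sorted)

# Select every transaction of one customer via the outer index level;
# the result is indexed by Transaction ID alone.
customer_101 = sales2_sorted.loc[101]
print(customer_101)
```

Like set_index, sort_index returns a new DataFrame, so sales2 itself keeps its original row order.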