5.3 Hierarchical Indexing


Duration: 20 minutes

Pandas offers a very powerful concept called Hierarchical Indexing for multilevel (or nested) indexing. We again use the sales DataFrame from the previous sections to motivate the use of hierarchical indexing.

Let us first make the Customer ID column our index using set_index, and store the new DataFrame as sales1.
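The step above might look like the following minimal sketch. The real sales DataFrame is not reproduced here, so a tiny stand-in is built first; its values and the Amount column are assumptions made purely for illustration. Only the Customer ID and Transaction ID column names come from the text.

```python
import pandas as pd

# Hypothetical stand-in for the sales DataFrame (made-up values;
# the "Amount" column is an assumption, not from the original data).
sales = pd.DataFrame({
    "Customer ID": [103, 101, 103, 102, 101],
    "Transaction ID": [5001, 5002, 5003, 5004, 5005],
    "Amount": [20.0, 35.5, 12.25, 8.0, 40.0],
})

# set_index returns a new DataFrame; the original `sales` is left unchanged.
sales1 = sales.set_index("Customer ID")
print(sales1)
```

Because set_index does not modify sales in place by default, the original DataFrame remains available for later steps.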

Note that the index of sales1 is not unique—a customer may have bought many items.

As we discover below, there are 25 customers in total, each with many transactions.
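One way to check both observations (a non-unique index, and the number of distinct customers) is sketched below. It uses a small made-up stand-in for the sales data, so it reports 3 customers rather than the 25 in the real dataset.

```python
import pandas as pd

# Hypothetical stand-in for the sales DataFrame (made-up values).
sales = pd.DataFrame({
    "Customer ID": [103, 101, 103, 102, 101],
    "Transaction ID": [5001, 5002, 5003, 5004, 5005],
    "Amount": [20.0, 35.5, 12.25, 8.0, 40.0],
})
sales1 = sales.set_index("Customer ID")

# False here, because customer IDs repeat in the index.
print(sales1.index.is_unique)

# Number of distinct customers (3 in this toy data).
print(sales1.index.nunique())

# Number of transactions per customer.
print(sales1.index.value_counts())
```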

Under the principal index Customer ID, we now add a secondary index, Transaction ID, and store the result as sales2.
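A secondary index level can be added with set_index and append=True, which keeps the existing index and places the new level beneath it. The sketch below assumes this is how the second level is added; the toy data is again a made-up stand-in for the real sales DataFrame.

```python
import pandas as pd

# Hypothetical stand-in for the sales DataFrame (made-up values).
sales = pd.DataFrame({
    "Customer ID": [103, 101, 103, 102, 101],
    "Transaction ID": [5001, 5002, 5003, 5004, 5005],
    "Amount": [20.0, 35.5, 12.25, 8.0, 40.0],
})
sales1 = sales.set_index("Customer ID")

# append=True keeps "Customer ID" as the outer level and adds
# "Transaction ID" as the inner level, instead of replacing the index.
sales2 = sales1.set_index("Transaction ID", append=True)

print(type(sales2.index).__name__)  # MultiIndex
print(list(sales2.index.names))     # ['Customer ID', 'Transaction ID']
print(sales2.index.nlevels)         # 2
```

Without append=True, set_index would discard Customer ID from the index and use Transaction ID alone.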

The sales2 DataFrame now has a MultiIndex. Using .sort_index(), one can sort the DataFrame by index to see the hierarchy.
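The effect of sorting can be sketched as follows. Here the two-level index is built in a single step by passing a list of columns to set_index, which should give the same index levels as adding them one at a time; the data is a made-up stand-in for the real sales DataFrame.

```python
import pandas as pd

# Hypothetical stand-in for the sales DataFrame (made-up values).
sales = pd.DataFrame({
    "Customer ID": [103, 101, 103, 102, 101],
    "Transaction ID": [5001, 5002, 5003, 5004, 5005],
    "Amount": [20.0, 35.5, 12.25, 8.0, 40.0],
})

# Build both index levels at once: outer "Customer ID", inner "Transaction ID".
sales2 = sales.set_index(["Customer ID", "Transaction ID"])

# sort_index sorts by the outer level first, then by the inner level,
# so each customer's transactions end up grouped together.
sales2_sorted = sales2.sort_index()
print(sales2_sorted)
```

In the printed result, repeated outer labels are shown only once per group, which makes the nesting of transactions under each customer visible.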