
Pandas Cross Tab – pd.crosstab()

Pandas Crosstab usage and parameters

For all intents and purposes, you can think of Pandas Crosstab as the same thing as Pandas Pivot Table.

When to use pd.crosstab(): When you are starting with data that is not already in a DataFrame. This could be a list of lists or dictionaries.

When to use pd.pivot_table(): When you are starting with a DataFrame.

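pandas.crosstab(index=your_new_index,             # what your rows are grouped on
                columns=your_pivoted_columns,     # what your rows are split by
                values=your_new_values,           # optional: what gets aggregated
                aggfunc=your_aggregate_function)  # required whenever values is passed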

Pseudo Code: With your list of arrays, construct a pivot table and return a DataFrame
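To make the pseudo code concrete, here is a minimal sketch of the default behavior: a frequency count built from two arrays. The names and food choices are made-up sample data, and rownames/colnames are optional labels added here only to make the output easier to read.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one entry per order
names = np.array(["Bob", "Bob", "Sally", "Sally", "Sally", "Katie"])
foods = np.array(["Pizza", "Tacos", "Pizza", "Pizza", "Tacos", "Tacos"])

# With no values/aggfunc, crosstab counts how often each (name, food) pair occurs
counts = pd.crosstab(
    index=names,
    columns=foods,
    rownames=["name"],   # label for the row index
    colnames=["food"],   # label for the column index
)

print(counts)
```

Each cell holds the number of times that name and food appear together, so Sally and Pizza gives 2, while a pair that never occurs (Katie and Pizza) gives 0.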

Pandas Crosstab

In fact, Pandas Crosstab is so similar to Pandas Pivot Table that crosstab uses pivot table within its source code.
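One way to see the similarity is to build the same frequency table both ways. This is only a rough sketch of the idea, not a copy of what pandas does internally: the DataFrame, its column names, and the helper column n are all made up for illustration.

```python
import pandas as pd

# Hypothetical sample data, already in a DataFrame
df = pd.DataFrame({
    "name": ["Bob", "Bob", "Sally", "Sally", "Sally", "Katie"],
    "food": ["Pizza", "Tacos", "Pizza", "Pizza", "Tacos", "Tacos"],
})

# Route 1: crosstab on the two columns (default is a frequency count)
via_crosstab = pd.crosstab(index=df["name"], columns=df["food"])

# Route 2: pivot_table, summing a helper column of ones to get the same counts
df["n"] = 1
via_pivot = df.pivot_table(
    values="n", index="name", columns="food", aggfunc="sum", fill_value=0
)

print(via_crosstab)
print(via_pivot)
```

Both tables have the same rows, columns, and counts. When your data already lives in a DataFrame, the pivot_table route is usually the more direct one.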

Crosstab Main Parameters

  • index – This is what you want your new rows to be aggregated (or grouped) on. This will generally be the main subject of your data analysis. In the picture above, we chose index=list_1, which were people's names.
  • columns – This is the parameter you want your index to be split or cut by. This is generally a secondary column, or a spectrum of categories you want to further look into.
  • values – The values you want to summarize at the intersection of your index and columns. If you leave values out, you get a frequency count (how often each combination occurs). If you pass values, you must also specify an aggfunc, and your values will be summarized according to that function.
  • aggfunc – Short for aggregate function, this is how you will summarize your values. Oftentimes this will be sum or mean (average). Commonly used ones include min, max, mean, sum, and standard deviation, or you can specify your own function.
  • Other Parameters – For a list of other, lesser-used parameters, check out the official documentation.
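As a small illustration of how values and aggfunc work together, the sketch below totals and averages a made-up spend amount for each name and food combination. All of the data is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one entry per order
names = np.array(["Bob", "Bob", "Sally", "Sally", "Sally", "Katie"])
foods = np.array(["Pizza", "Tacos", "Pizza", "Pizza", "Tacos", "Tacos"])
spend = np.array([10, 8, 12, 14, 9, 7])

# values and aggfunc must be passed together
totals = pd.crosstab(
    index=names, columns=foods, values=spend, aggfunc="sum",
    rownames=["name"], colnames=["food"],
)
averages = pd.crosstab(
    index=names, columns=foods, values=spend, aggfunc="mean",
    rownames=["name"], colnames=["food"],
)

print(totals)
print(averages)
```

Sally's two Pizza orders (12 and 14) total 26 and average 13. Combinations with no data, such as Katie and Pizza, come back as NaN rather than 0 when values is used.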

Now the fun part: let’s take a look at a code sample.

Link to code

Official Documentation
