Pandas Cross Tab – pd.crosstab()
Pandas Crosstab usage and parameters
For all intents and purposes, you can think of Pandas Crosstab as the same thing as Pandas Pivot Table.
When to use pd.crosstab(): When you are starting with non-DataFrame based data. This could be plain lists, arrays, or Series (for example, the columns pulled out of a list of lists or a dictionary).
When to use pd.pivot_table(): When you are starting with a DataFrame.
Pseudo Code: With your list of arrays, construct a pivot table and return a DataFrame
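As a minimal sketch of the pseudo code above, assuming two made-up lists (the names `names` and `fruits` and their contents are illustrative only), pd.crosstab() counts how often each pair of values occurs and hands back a DataFrame:

```python
import pandas as pd

# Hypothetical sample data: one entry per observation
names = ["Bob", "Bob", "Sally", "Sally", "Sally"]
fruits = ["apple", "pear", "apple", "apple", "pear"]

# Plain lists carry no names, so label the row and column axes explicitly
counts = pd.crosstab(
    index=names,
    columns=fruits,
    rownames=["name"],
    colnames=["fruit"],
)

print(type(counts))  # a regular pandas DataFrame
print(counts)
```

Each cell holds the number of times that name and fruit appeared together, which is the default behavior when no values or aggfunc are passed.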
Pandas Crosstab
In fact, Pandas Crosstab is so similar to Pandas Pivot Table that crosstab uses pivot table within its source code.
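To make the similarity concrete, here is a rough sketch that builds the same frequency table both ways. The DataFrame, its column names, and the helper column `n` are made up for illustration, and this is not a copy of what pandas does internally:

```python
import pandas as pd

# Hypothetical sample data, this time already in a DataFrame
df = pd.DataFrame({
    "name": ["Bob", "Bob", "Sally", "Sally", "Sally"],
    "fruit": ["apple", "pear", "apple", "apple", "pear"],
})

# Route 1: crosstab counts occurrences by default
via_crosstab = pd.crosstab(index=df["name"], columns=df["fruit"])

# Route 2: pivot_table needs something to aggregate, so add a helper
# column of ones (an assumption for this sketch) and sum it per group
via_pivot = df.assign(n=1).pivot_table(
    index="name",
    columns="fruit",
    values="n",
    aggfunc="sum",
    fill_value=0,
)

# Compare the cell values of the two tables
same_counts = (via_crosstab.values == via_pivot.values).all()
print(same_counts)
```

Both routes produce the same counts; crosstab simply saves you the step of building a DataFrame and a column to aggregate.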
Crosstab Main Parameters
- index – This is what you want your new rows to be aggregated (or grouped) on. This will generally be the main subject of your data analysis. In the picture above, we chose index=list_1, which was people's names.
- columns – This is the parameter you want your index to be split or cut by. This is generally a secondary column, or a spectrum of categories you want to look into further.
- values – The values you want to show at the intersection of your index and columns. If you leave this out, you get a frequency count (how often each combination occurs). If you do pass values, you must also specify an aggfunc, and your values will be summarized according to that function.
- aggfunc – Short for aggregate function, this is how you will summarize your values. Often this will be sum or mean (average). Commonly used ones include min, max, mean, sum, standard deviation (std), or a function of your own.
- Other Parameters – For the other, lesser-used parameters, check out the official documentation.
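The values and aggfunc parameters work as a pair. A small sketch, assuming a made-up `amounts` list that lines up entry for entry with the hypothetical `names` and `fruits` lists:

```python
import pandas as pd

# Hypothetical sample data: the three lists are aligned by position
names = ["Bob", "Bob", "Sally", "Sally", "Sally"]
fruits = ["apple", "pear", "apple", "apple", "pear"]
amounts = [3, 1, 2, 4, 5]

# Sum the amounts for every name/fruit combination
totals = pd.crosstab(
    index=names,
    columns=fruits,
    values=amounts,
    aggfunc="sum",
    rownames=["name"],
    colnames=["fruit"],
)

# Same layout, but averaging instead of summing
averages = pd.crosstab(
    index=names,
    columns=fruits,
    values=amounts,
    aggfunc="mean",
    rownames=["name"],
    colnames=["fruit"],
)

print(totals)
print(averages)
```

Passing values without an aggfunc raises a ValueError, so the two always go together. Combinations that never occur in the data show up as NaN rather than 0 when an aggfunc is used.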
Now the fun part: let's take a look at a code sample.