Pandas: groupby and agg
Here's how to group your data by specific columns and apply aggregation functions to other columns in a Pandas DataFrame in Python.
Create the DataFrame with some example data
Let's start by making up some data.
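The snippet for this step isn't included here, so what follows is a minimal sketch reconstructed from the printed DataFrame. The column names and values come from that output; the variable name `df` is an assumption.

```python
import pandas as pd

# Example data: one row per unit entry, with the building that trains it,
# how many units there are, and the civilization they belong to.
df = pd.DataFrame(
    {
        "unit": ["archer", "militia", "pikemen", "pikemen"],
        "building": ["archery_range", "barracks", "barracks", "barracks"],
        "number_units": [1, 2, 3, 4],
        "civ": ["spanish", "spanish", "spanish", "huns"],
    }
)

print(df)
```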
If you run the above code, you should see a DataFrame that looks like this:
      unit       building  number_units      civ
0   archer  archery_range             1  spanish
1  militia       barracks             2  spanish
2  pikemen       barracks             3  spanish
3  pikemen       barracks             4     huns
Example 1: Groupby and sum specific columns
Let's say you want to total the number of units, with a separate total for each type of building.
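Here's one way that could look, as a sketch rather than the only option. It rebuilds the example data so it runs on its own. The name `units_per_building` is made up for illustration, and `as_index=False` is an assumption chosen because the result shown has a plain 0, 1 index.

```python
import pandas as pd

# Same example data as above, repeated so this snippet runs by itself.
df = pd.DataFrame(
    {
        "unit": ["archer", "militia", "pikemen", "pikemen"],
        "building": ["archery_range", "barracks", "barracks", "barracks"],
        "number_units": [1, 2, 3, 4],
        "civ": ["spanish", "spanish", "spanish", "huns"],
    }
)

# Group the rows by building, then sum number_units within each group.
# as_index=False keeps "building" as a regular column instead of the index.
units_per_building = df.groupby("building", as_index=False).agg(
    {"number_units": "sum"}
)

print(units_per_building)
```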
You should see this, where there is 1 unit from the archery range, and 9 units from the barracks.
        building  number_units
0  archery_range             1
1       barracks             9
Example 2: Groupby multiple columns
Or maybe you want to total the number of units by both building type and civilization.
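A sketch of how this might be written: pass a list of column names to `groupby` instead of a single name. The variable name `units_per_building_civ` is hypothetical, and the data is rebuilt so the snippet stands alone.

```python
import pandas as pd

# Same example data as above, repeated so this snippet runs by itself.
df = pd.DataFrame(
    {
        "unit": ["archer", "militia", "pikemen", "pikemen"],
        "building": ["archery_range", "barracks", "barracks", "barracks"],
        "number_units": [1, 2, 3, 4],
        "civ": ["spanish", "spanish", "spanish", "huns"],
    }
)

# Group by two columns at once: each unique (building, civ) pair is a group.
units_per_building_civ = df.groupby(["building", "civ"], as_index=False).agg(
    {"number_units": "sum"}
)

print(units_per_building_civ)
```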
This groups the rows by building type and civilization, then sums the unit count within each group.
        building      civ  number_units
0  archery_range  spanish             1
1       barracks     huns             4
2       barracks  spanish             5
Example 3: Groupby, sum and aggregate into a list
Nice nice. Okay for fun, let's do one more example. Here's how to aggregate the values into a list. Specifically, we'll return all the unit types as a list.
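One way to do it, sketched under the same assumptions as before: give `agg` a dictionary that maps each column to its own aggregation, using Python's built-in `list` for the `unit` column. The name `units_listed` is made up for illustration.

```python
import pandas as pd

# Same example data as above, repeated so this snippet runs by itself.
df = pd.DataFrame(
    {
        "unit": ["archer", "militia", "pikemen", "pikemen"],
        "building": ["archery_range", "barracks", "barracks", "barracks"],
        "number_units": [1, 2, 3, 4],
        "civ": ["spanish", "spanish", "spanish", "huns"],
    }
)

# Each column gets its own aggregation:
#   unit         -> collect the group's values into a Python list
#   number_units -> sum the group's values
units_listed = df.groupby(["building", "civ"], as_index=False).agg(
    {"unit": list, "number_units": "sum"}
)

print(units_listed)
```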
You can see we now have a list of the units under the unit column. Note that you can pass other aggregation functions to agg if needed, such as "mean", "min", or "max".
        building      civ                unit  number_units
0  archery_range  spanish            [archer]             1
1       barracks     huns           [pikemen]             4
2       barracks  spanish  [militia, pikemen]             5
There you go! Hopefully these examples help you use the groupby and agg functions in a
Pandas DataFrame in Python!