Pandas: groupby and agg


Here's how to group your data by specific columns and apply functions to aggregate columns in a Pandas DataFrame in Python.

Note: You can run (and even edit) this code in the browser thanks to PyScript!

Create the DataFrame with some example data

Let's start by making up some data.

Note: Run this block first to import pandas and create the data. If you get an error, refresh the page and run the blocks in order. The first block can take a few seconds to run.
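Here's a minimal sketch of that setup. The variable name df is just an assumption for these examples; the column names and values match the example data in this section.

```python
import pandas as pd

# Made-up example data. The name `df` is an assumption used for illustration.
df = pd.DataFrame(
    {
        "unit": ["archer", "militia", "pikemen", "pikemen"],
        "building": ["archery_range", "barracks", "barracks", "barracks"],
        "number_units": [1, 2, 3, 4],
        "civ": ["spanish", "spanish", "spanish", "huns"],
    }
)

print(df)
```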


If you run the above code, you should see a DataFrame that looks like this:

      unit       building  number_units      civ
0   archer  archery_range             1  spanish
1  militia       barracks             2  spanish
2  pikemen       barracks             3  spanish
3  pikemen       barracks             4     huns

Example 1: Groupby and sum specific columns

Let's say you want to total the number of units, but split that total by the type of building.
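One way to do that is sketched below: group by the building column, sum number_units with agg, then call reset_index so building comes back as a regular column. The names df and by_building are assumptions for illustration, and the data is rebuilt here so the block runs on its own.

```python
import pandas as pd

# Same made-up data as before; `df` is an assumed variable name.
df = pd.DataFrame(
    {
        "unit": ["archer", "militia", "pikemen", "pikemen"],
        "building": ["archery_range", "barracks", "barracks", "barracks"],
        "number_units": [1, 2, 3, 4],
        "civ": ["spanish", "spanish", "spanish", "huns"],
    }
)

# Group rows by building, then sum number_units within each group.
# reset_index() turns the group key back into an ordinary column.
by_building = df.groupby("building").agg({"number_units": "sum"}).reset_index()

print(by_building)
```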



You should see this, where there is 1 unit from the archery range, and 9 units from the barracks.

        building  number_units
0  archery_range             1
1       barracks             9

Example 2: Groupby multiple columns

Or maybe you want to total the number of units, split by both building type and civilization.
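A sketch of that, assuming the same df as before: pass a list of column names to groupby instead of a single name. The name by_building_civ is made up for this example.

```python
import pandas as pd

# Same made-up data as before; `df` is an assumed variable name.
df = pd.DataFrame(
    {
        "unit": ["archer", "militia", "pikemen", "pikemen"],
        "building": ["archery_range", "barracks", "barracks", "barracks"],
        "number_units": [1, 2, 3, 4],
        "civ": ["spanish", "spanish", "spanish", "huns"],
    }
)

# Passing a list of columns creates one group per (building, civ) pair.
by_building_civ = (
    df.groupby(["building", "civ"])
    .agg({"number_units": "sum"})
    .reset_index()
)

print(by_building_civ)
```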



This groups the rows by building type and civilization, and sums the unit count within each group.

        building      civ  number_units
0  archery_range  spanish             1
1       barracks     huns             4
2       barracks  spanish             5

Example 3: Groupby, sum and aggregate into a list

Nice nice. Okay for fun, let's do one more example. Here's how to aggregate the values into a list. Specifically, we'll return all the unit types as a list.
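Here's a sketch of one way to do it, again assuming the same df: the dictionary passed to agg maps each column to its own function, so unit is collected with Python's built-in list while number_units is summed. The name with_unit_lists is made up for this example.

```python
import pandas as pd

# Same made-up data as before; `df` is an assumed variable name.
df = pd.DataFrame(
    {
        "unit": ["archer", "militia", "pikemen", "pikemen"],
        "building": ["archery_range", "barracks", "barracks", "barracks"],
        "number_units": [1, 2, 3, 4],
        "civ": ["spanish", "spanish", "spanish", "huns"],
    }
)

# Each column gets its own aggregation:
#   unit         -> collected into a Python list
#   number_units -> summed
with_unit_lists = (
    df.groupby(["building", "civ"])
    .agg({"unit": list, "number_units": "sum"})
    .reset_index()
)

print(with_unit_lists)
```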



You can see we now have a list of the units under the unit column. Note that you can pass other functions to agg if needed, such as "mean", "max" or "count".

        building      civ                unit  number_units
0  archery_range  spanish            [archer]             1
1       barracks     huns           [pikemen]             4
2       barracks  spanish  [militia, pikemen]             5
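As one illustration of the other operations agg accepts, here's a sketch using named aggregation, where each keyword becomes an output column built from a (column, function) pair. The output column names total_units, largest_group and unit_types are made up for this example, as are df and summary.

```python
import pandas as pd

# Same made-up data as before; `df` is an assumed variable name.
df = pd.DataFrame(
    {
        "unit": ["archer", "militia", "pikemen", "pikemen"],
        "building": ["archery_range", "barracks", "barracks", "barracks"],
        "number_units": [1, 2, 3, 4],
        "civ": ["spanish", "spanish", "spanish", "huns"],
    }
)

# Named aggregation: new_column=(source_column, function).
summary = (
    df.groupby("building")
    .agg(
        total_units=("number_units", "sum"),    # sum of units per building
        largest_group=("number_units", "max"),  # biggest single row value
        unit_types=("unit", "nunique"),         # number of distinct unit names
    )
    .reset_index()
)

print(summary)
```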

There you go! Hopefully these examples help you use the groupby and agg functions in a Pandas DataFrame in Python!