Here’s how to group your data by specific columns and apply functions to other columns in a Pandas DataFrame in Python.
Create the DataFrame with some example data
import pandas as pd
# Make up some data.
data = [
    {'unit': 'archer', 'building': 'archery_range', 'number_units': 1, 'civ': 'spanish'},
    {'unit': 'militia', 'building': 'barracks', 'number_units': 2, 'civ': 'spanish'},
    {'unit': 'pikemen', 'building': 'barracks', 'number_units': 3, 'civ': 'spanish'},
    {'unit': 'pikemen', 'building': 'barracks', 'number_units': 4, 'civ': 'huns'},
]
# Create the DataFrame.
df = pd.DataFrame(data)
# View the DataFrame.
df
You should see a DataFrame that looks like this:
      unit       building  number_units      civ
0   archer  archery_range             1  spanish
1  militia       barracks             2  spanish
2  pikemen       barracks             3  spanish
3  pikemen       barracks             4     huns
Example 1: Groupby and sum specific columns
Let’s say you want to total the number of units, with a separate total for each type of building.
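The excerpt ends before it shows the code for this example, so here is a minimal sketch of one way to do it, assuming the same example data as above. The variable name units_per_building is made up for illustration, and the original post may use a different call.

```python
import pandas as pd

# Same example data as in the section above.
data = [
    {'unit': 'archer', 'building': 'archery_range', 'number_units': 1, 'civ': 'spanish'},
    {'unit': 'militia', 'building': 'barracks', 'number_units': 2, 'civ': 'spanish'},
    {'unit': 'pikemen', 'building': 'barracks', 'number_units': 3, 'civ': 'spanish'},
    {'unit': 'pikemen', 'building': 'barracks', 'number_units': 4, 'civ': 'huns'},
]
df = pd.DataFrame(data)

# Group the rows by building, then sum number_units within each group.
# as_index=False keeps 'building' as a regular column instead of the index.
# units_per_building is a hypothetical name, not taken from the original post.
units_per_building = df.groupby('building', as_index=False).agg({'number_units': 'sum'})

print(units_per_building)
```

With this data, the archery_range group totals 1 unit and the barracks group totals 9 (2 + 3 + 4). By default groupby sorts the result by the group key, so archery_range comes first.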