
Pandas Resample – pd.DataFrame.resample()

Resample time series data in pandas

Pandas Resample is an amazing function that does more than you think. This powerful tool will help you transform and clean up your time series data.

Pandas Resample will convert your time series data into different frequencies. Think of it like a group by function, but for time series data.

Example: Imagine you have a data point every 5 minutes from 10am – 11am. What if you wanted to translate your data into a data point every 20 minutes? Or every 1 minute?
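To make that example concrete, here is a minimal sketch. The data, the column name value, and the choice of sum() and ffill() are all assumptions made for illustration; any other aggregation or fill method could be swapped in.

```python
import pandas as pd

# Hypothetical data: one reading every 5 minutes from 10:00 to 11:00 (13 rows)
idx = pd.date_range("2024-01-01 10:00", "2024-01-01 11:00", freq="5min")
df = pd.DataFrame({"value": range(len(idx))}, index=idx)

# Fewer rows: one row per 20-minute bin, adding up the readings in each bin
every_20min = df.resample("20min").sum()

# More rows: one row per minute, carrying the last known reading forward
every_1min = df.resample("1min").ffill()

print(every_20min)
print(every_1min.head(7))
```

Going to 20 minutes needs an aggregation (several readings collapse into one), while going to 1 minute needs a fill rule (most of the new rows have no reading of their own).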

For a full range of frequencies to convert with, check out the official pandas table.

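Pandas DataFrame.resample() takes in a DataFrame with a datetime-like index (such as a DatetimeIndex) and, once you apply an aggregation like .mean() or .sum(), spits out data that has been converted to a new time frequency.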

pandas.DataFrame.resample(rule='new_frequency_to_convert_to')

Pseudo Code: Convert a DataFrame time range into a different time frequency.
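Here is a small runnable version of that pseudo code. The sales column and its numbers are made up for this sketch. Notice that resample() on its own only sets up the time bins; you get actual values back after calling an aggregation on it.

```python
import pandas as pd

# Hypothetical daily data: 6 days of sales
idx = pd.date_range("2024-01-01", periods=6, freq="D")
df = pd.DataFrame({"sales": [10, 20, 30, 40, 50, 60]}, index=idx)

# resample() defines the 2-day bins but does not compute anything yet
resampler = df.resample("2D")

# Apply an aggregation to get a new DataFrame at the new frequency
two_day_mean = resampler.mean()

# Several aggregations at once via agg()
two_day_stats = df.resample("2D").agg(["min", "max"])

print(two_day_mean)
print(two_day_stats)
```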

Pandas Resample

resample() is one of those functions that can be intimidating when you first look at the documentation. We suggest mastering the rule, closed, label, and convention parameters before anything else.

Resampling Terms

  • Up Sampling – Going from a longer time grain to a shorter one. Example: Going from yearly data to monthly data. It’s “up” because you’re going “up” in the number of bins you have.
  • Down Sampling – Going from a finer time grain to a coarser one. Example: Months to Years.
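A quick sketch of both directions, using made-up yearly numbers. The "YS" (year start) and "MS" (month start) frequency strings, the forward fill, and the sum are choices made for this illustration only.

```python
import pandas as pd

# Hypothetical yearly data, labeled at the start of each year
yearly = pd.Series(
    [100, 200],
    index=pd.date_range("2022-01-01", periods=2, freq="YS"),
)

# Up sampling: years -> months. 2 bins become 13 (Jan 2022 through Jan 2023),
# and the new monthly rows are filled with the last known yearly value.
monthly = yearly.resample("MS").ffill()

# Down sampling: months -> years. 13 bins collapse back into 2.
back_to_yearly = monthly.resample("YS").sum()

print(len(yearly), "->", len(monthly), "->", len(back_to_yearly))
```

Up sampling creates rows that had no data, so you need to decide how to fill them. Down sampling merges rows, so you need to decide how to aggregate them.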

Resample Main Parameters

  • rule – How you want to resample your data. Do you want to convert your time series into minute groups? 5 minute groups? You pick! Check out the official pandas documentation for frequencies to resample.
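  • axis (Default: 0) – Which axis do you want to resample along? This will almost always be rows. Set axis=1 if your columns are your time index. Note that pandas 2.0 deprecated this parameter in favor of transposing first with frame.T.resample(...).
  • closed (Default: None) – Which edge of each time bin should include its boundary value? The closed side of the bin interval is the side that is included; a data point sitting exactly on the other (open) edge falls into the neighboring bin. With the default, pandas uses left for most frequencies and right for month-end, quarter-end, year-end, and weekly ones. Check out the samples below.
  • label (Default: None) – How do you want your new bins to be labeled? By definition, a bin has two sides, the start (label='left') and the end (label='right'). The default follows the same rule as closed.
  • convention (Default: start) – Where do you put your data points when up sampling? Say you’re going from years to months. Do you want to put your yearly data points on the last month? Or the first month? This only applies when your index is a PeriodIndex.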
  • Other parameters – There are a few other parameters, but in our experience, they don’t get used often. Feel free to check them out.
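The closed and label parameters are easiest to understand side by side. Below is a minimal sketch with five made-up readings; the values and the 10-minute rule are assumptions chosen so the bin edges land exactly on data points.

```python
import pandas as pd

# Hypothetical readings at 10:00, 10:05, 10:10, 10:15, 10:20
idx = pd.date_range("2024-01-01 10:00", periods=5, freq="5min")
s = pd.Series([1, 2, 3, 4, 5], index=idx)

# Left-closed bins, labeled by their start:
# [10:00, 10:10), [10:10, 10:20), [10:20, 10:30)
left = s.resample("10min", closed="left", label="left").sum()

# Right-closed bins, labeled by their end:
# (09:50, 10:00], (10:00, 10:10], (10:10, 10:20]
right = s.resample("10min", closed="right", label="right").sum()

print(left)   # sums: 3, 7, 5
print(right)  # sums: 1, 5, 9
```

Both results carry the labels 10:00, 10:10, and 10:20, yet the sums differ. The reading at 10:10, for example, belongs to the bin starting at 10:10 when bins are left-closed, and to the bin ending at 10:10 when they are right-closed.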

Now for the fun part: let’s take a look at a code sample.

Link to code

Official Documentation
