Pandas Replace Values - pd.DataFrame.replace()
Replace values in pandas DataFrame
Want to replace values in your DataFrame with something else? No problem. That is where pandas replace comes in.
Pandas DataFrame.replace() is a small but powerful function that will replace (or swap) values in your DataFrame with another value. What starts as a simple function can quickly be expanded to cover most of your scenarios.
This function is very similar to setting a value via DataFrame.at, or via DataFrame.iloc/loc. However, with those you have to say where the value lives; in .replace(), pandas will do the searching for you.
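To make the difference concrete, here is a small sketch comparing the two approaches. The DataFrame and the variable names (`by_label`, `by_value`) are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data for illustration
df = pd.DataFrame({"A": [0, 1, 2], "B": [5, 0, 7]})

# Label-based assignment: you have to know (or compute) where the 0s are.
by_label = df.copy()
by_label.at[0, "A"] = 99                      # row label 0, column "A"
by_label.loc[by_label["B"] == 0, "B"] = 99    # boolean mask on column "B"

# replace(): pandas searches every cell for 0 and swaps it.
by_value = df.replace(0, 99)

print(by_label.equals(by_value))
```

Both routes end up with the same DataFrame here, but .replace() did not need any row or column coordinates.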
Beginner Pandas users will have fun doing simple replaces, but the kung-fu Pandas master will go 3 levels deep.
Pseudo code: Find current values within my DataFrame, then replace them with another value.
Pandas Replace
.replace() starts off easy, but quickly gets nuanced as you dig deeper. Here are the most common ways to use pandas replace.
Code | Plain Language |
---|---|
df.replace(0, 5) | Replace all of the 0s in your DataFrame with 5s |
df.replace([0, 1, 2, 3], 4) | Replace all the 0s, 1s, 2s, 3s in your DataFrame with 4s |
df.replace([0, 1, 2, 3], [4, 3, 2, 1]) | Replace all the 0s with 4s, 1s with 3s, 2s with 2s, and 3s with 1s. Note: if you pass two lists they both must be the same length |
df.replace({0: 10, 1: 100}) | Using a dict: replace 0s with 10s, and 1s with 100s. |
df.replace({'A': 0, 'B': 5}, 100) | Replace 0s in column 'A' with 100, and replace 5s in column 'B' with 100 |
df.replace({'C': {1: 100, 3: 300}}) | Using a nested dict: within column 'C', replace 1s with 100 and 3s with 300 |
df.replace(to_replace=r'^ba.$', value='new', regex=True) | Replace any string that matches the regex '^ba.$' with 'new' |
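Here is a quick sketch running the numeric calls from the table against a small DataFrame. The data and the variable names are assumptions chosen for illustration; the calls themselves are the ones listed above.

```python
import pandas as pd

# Hypothetical sample data for illustration
df = pd.DataFrame({
    "A": [0, 1, 2, 3],
    "B": [5, 0, 1, 5],
    "C": [1, 2, 3, 1],
})

scalar = df.replace(0, 5)                           # every 0 becomes 5
many_to_one = df.replace([0, 1, 2, 3], 4)           # 0, 1, 2 and 3 all become 4
pairwise = df.replace([0, 1, 2, 3], [4, 3, 2, 1])   # 0->4, 1->3, 2->2, 3->1
mapping = df.replace({0: 10, 1: 100})               # 0->10, 1->100 in every column
per_column = df.replace({"A": 0, "B": 5}, 100)      # 0s in "A" and 5s in "B" become 100
nested = df.replace({"C": {1: 100, 3: 300}})        # only column "C" is touched

print(pairwise)
print(per_column)
print(nested)
```

Each call returns a new DataFrame, so `df` itself is left unchanged and you can compare the results side by side.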
Replace Parameters
- to_replace: The value, list of values, or regex pattern that you would like to replace. If using a dict, you can also include the values that will do the replacing. There are a ton of details here, so we recommend referring to the official documentation for more.
- value: The values that will do the replacing. Note: This can also be None if you have a dict in your to_replace parameter.
- inplace (Default: False): If True, the operation is done in place (your current DataFrame is overwritten). If False, a new DataFrame is returned to you.
- limit: The max size you would like to forward or back fill. Example: You may want to fill from values that are 2-3 rows away, but do you really want to fill from values that are 30 rows away?
- regex (Default: False): Whether to interpret to_replace as a regular expression rather than a literal value.
- method: The fill method to use when to_replace is a scalar, list, or tuple and value is None. Note: method and limit are deprecated for replace() as of pandas 2.1.
  - pad/ffill: Take the value that comes before the one you're replacing, and fill it going forward.
  - bfill: Take the value that comes after the one you're replacing, and fill it going backward.
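The regex, value, and inplace parameters are easier to see in action. Below is a minimal sketch, assuming a small made-up DataFrame of strings; the variable names are illustrative only.

```python
import pandas as pd

# Hypothetical sample data for illustration
df = pd.DataFrame({"A": ["bat", "foo", "bait"], "B": ["abc", "bar", "xyz"]})

# regex=True: to_replace is treated as a regular expression.
# '^ba.$' matches three-character strings that start with "ba".
with_regex = df.replace(to_replace=r"^ba.$", value="new", regex=True)

# regex=False (the default): the same pattern is compared as a literal string,
# so nothing in this DataFrame matches.
without_regex = df.replace(to_replace=r"^ba.$", value="new")

# value left out: the dict passed as to_replace carries both old and new values.
from_dict = df.replace({"foo": "FOO"})

# inplace=True: df itself is modified instead of a new DataFrame being returned.
df.replace("abc", "ABC", inplace=True)

print(with_regex)
print(df)
```

The method and limit parameters are left out of this sketch on purpose, since they are deprecated in recent pandas versions.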
Here's a Jupyter notebook showing how to replace values in Pandas