Pandas: replace NaN with zeros • datagy (2023)

Working with missing data is an essential skill for any data analyst or data scientist! In many cases, you will want to replace missing data, or NaN values, with zeros. In this tutorial, you will learn how to use pandas to replace NaN values with zeros. This is a common skill that is part of better cleaning and transforming your data.

By the end of this tutorial, you will have learned the following:

  • How to use pandas to replace NaN values with zeros for a single column, multiple columns, and an entire DataFrame
  • How to use NumPy to replace NaN values in a Pandas DataFrame
  • How to replace NaN values in a Pandas DataFrame in place

Loading a sample pandas dataframe

To follow along with the tutorial, I've provided a sample Pandas DataFrame. To load the DataFrame, we import pandas with the alias pd and pass a dictionary to the DataFrame() constructor. Because we want to include some NaN values, we also import NumPy.

    # Load a sample Pandas DataFrame
    import pandas as pd
    import numpy as np

    # np.nan is used here because the np.NaN alias was removed in NumPy 2.0
    df = pd.DataFrame({
        'Col_A': [1, 2, 3, np.nan],
        'Col_B': [1, np.nan, 3, 4],
        'Col_C': [1, 2, np.nan, 4],
    })
    print(df)

    # Returns:
    #    Col_A  Col_B  Col_C
    # 0    1.0    1.0    1.0
    # 1    2.0    NaN    2.0
    # 2    3.0    3.0    NaN
    # 3    NaN    4.0    4.0

We can see that we have three columns, each containing missing data.
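Before filling anything, it can help to confirm where the missing values are. The following is a minimal sketch, assuming the same sample DataFrame as above; the variable names missing_per_column and total_missing are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Col_A': [1, 2, 3, np.nan],
    'Col_B': [1, np.nan, 3, 4],
    'Col_C': [1, 2, np.nan, 4],
})

# isna() marks each missing cell as True; sum() counts the True values per column
missing_per_column = df.isna().sum()
print(missing_per_column)

# Summing again gives the total number of missing cells in the DataFrame
total_missing = int(missing_per_column.sum())
print(total_missing)  # 3
```

Each column in the sample data reports one missing value, which matches the printed DataFrame.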

How to replace NaN values with zeros for a single column in pandas

To replace all missing values in a single column of a Pandas DataFrame with zeros, we can apply the .fillna() method to the column. The method lets you pass in a value that will be used to replace the missing data. In this case, we pass in the value 0.

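    # Replace NaN values with zeros for a single Pandas column
    import pandas as pd
    import numpy as np

    df = pd.DataFrame({
        'Col_A': [1, 2, 3, np.nan],
        'Col_B': [1, np.nan, 3, 4],
        'Col_C': [1, 2, np.nan, 4],
    })

    # Fill only Col_A and assign the result back to the same column
    df['Col_A'] = df['Col_A'].fillna(0)
    print(df)

    # Returns:
    #    Col_A  Col_B  Col_C
    # 0    1.0    1.0    1.0
    # 1    2.0    NaN    2.0
    # 2    3.0    3.0    NaN
    # 3    0.0    4.0    4.0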

In the code above, we reassign the column 'Col_A' to itself. When reassigning, we apply the .fillna() method, passing in 0 as the argument. The next section shows how to replace all missing values for multiple columns.

How to replace NaN values with zeros for multiple columns in pandas

To replace the NaN values of multiple columns in a Pandas DataFrame with zeros, we can apply the .fillna() method to multiple columns. To select multiple columns, we can pass a list of column labels into the selector. Let's see what this looks like:

    # Replace NaN values with zeros for two Pandas columns
    import pandas as pd
    import numpy as np

    df = pd.DataFrame({
        'Col_A': [1, 2, 3, np.nan],
        'Col_B': [1, np.nan, 3, 4],
        'Col_C': [1, 2, np.nan, 4],
    })

    # Select both columns with a list of labels, fill them, and assign back
    df[['Col_A', 'Col_B']] = df[['Col_A', 'Col_B']].fillna(0)
    print(df)

    # Returns:
    #    Col_A  Col_B  Col_C
    # 0    1.0    1.0    1.0
    # 1    2.0    0.0    2.0
    # 2    3.0    3.0    NaN
    # 3    0.0    4.0    4.0

In the code above, we select multiple columns by passing a list of column labels into the df[] selector. We can then apply the .fillna() method, passing in 0. This replaces all missing values in the selected columns with 0, while Col_C keeps its NaN value.
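As an alternative to selecting the columns first, the .fillna() method also accepts a dictionary that maps column labels to fill values. The sketch below assumes the same sample DataFrame; the name filled is illustrative, and each column could be given a different fill value if needed.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Col_A': [1, 2, 3, np.nan],
    'Col_B': [1, np.nan, 3, 4],
    'Col_C': [1, 2, np.nan, 4],
})

# Keys are column labels, values are the replacement for NaN in that column.
# Columns that are not listed (here Col_C) are left unchanged.
filled = df.fillna({'Col_A': 0, 'Col_B': 0})
print(filled)
```

This returns a new DataFrame, so the original df still contains its missing values unless the result is assigned back.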

How to replace NaN values with zeros for a Pandas DataFrame

The Pandas .fillna() method can also be applied to an entire DataFrame. In this case, the missing NaN values in every column are filled with the value passed into the method. This can be a useful approach when you want to fill missing values consistently across a DataFrame.

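    # Replace NaN values with zeros for an entire DataFrame
    import pandas as pd
    import numpy as np

    df = pd.DataFrame({
        'Col_A': [1, 2, 3, np.nan],
        'Col_B': [1, np.nan, 3, 4],
        'Col_C': [1, 2, np.nan, 4],
    })

    # Fill every column and reassign the result to df
    df = df.fillna(0)
    print(df)

    # Returns:
    #    Col_A  Col_B  Col_C
    # 0    1.0    1.0    1.0
    # 1    2.0    0.0    2.0
    # 2    3.0    3.0    0.0
    # 3    0.0    4.0    4.0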

In the code block above, we reassigned the DataFrame to itself and called the .fillna() method. We passed in the value 0 to replace all missing values with zeros.

How to replace NaN values with zeros in a Pandas DataFrame in place

Similarly, we can replace all NaN values in a Pandas DataFrame in place. This means we don't need to reassign the DataFrame to itself. This is mainly a matter of convenience: pandas may still create a copy internally, so an in-place call is not guaranteed to be faster or to use less memory.

    # Replace NaN values with zeros for a DataFrame in place
    import pandas as pd
    import numpy as np

    df = pd.DataFrame({
        'Col_A': [1, 2, 3, np.nan],
        'Col_B': [1, np.nan, 3, 4],
        'Col_C': [1, 2, np.nan, 4],
    })

    # Modify df directly instead of assigning the result back
    df.fillna(0, inplace=True)
    print(df)

    # Returns:
    #    Col_A  Col_B  Col_C
    # 0    1.0    1.0    1.0
    # 1    2.0    0.0    2.0
    # 2    3.0    3.0    0.0
    # 3    0.0    4.0    4.0

In the code above, we simply pass inplace=True as a keyword argument after the fill value. This directly modifies the DataFrame and replaces any missing values.
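The difference between the two calling styles can be sketched as follows, assuming the same sample DataFrame. The names filled_copy, missing_before, and missing_after are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Col_A': [1, 2, 3, np.nan],
    'Col_B': [1, np.nan, 3, 4],
    'Col_C': [1, 2, np.nan, 4],
})

# Without inplace=True, fillna returns a new DataFrame and leaves df untouched
filled_copy = df.fillna(0)
missing_before = int(df.isna().sum().sum())

# With inplace=True, df itself is modified; the call's return value is not used
df.fillna(0, inplace=True)
missing_after = int(df.isna().sum().sum())

print(missing_before, missing_after)  # 3 0
```

Both approaches end with the same data, so the choice is largely one of style.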

How to replace NaN values with zeros in pandas using NumPy for a column

Due to Pandas' close link with NumPy, we can also use NumPy's nan value together with the pandas .replace() method. We can apply the .replace() method directly to a Pandas Series (that is, a column). Note that .replace() is a pandas method rather than a NumPy function; NumPy only supplies the np.nan value being replaced. The .replace() method takes two parameters:

  1. The value to replace
  2. The value to replace it with

Let's see what this looks like:

    # Replace NaN values with zeros for a single Pandas column using np.nan
    import pandas as pd
    import numpy as np

    df = pd.DataFrame({
        'Col_A': [1, 2, 3, np.nan],
        'Col_B': [1, np.nan, 3, 4],
        'Col_C': [1, 2, np.nan, 4],
    })

    # Replace np.nan with 0 in Col_A only
    df['Col_A'] = df['Col_A'].replace(np.nan, 0)
    print(df)

    # Returns:
    #    Col_A  Col_B  Col_C
    # 0    1.0    1.0    1.0
    # 1    2.0    NaN    2.0
    # 2    3.0    3.0    NaN
    # 3    0.0    4.0    4.0

In the code above, we use the .replace() method with np.nan to replace all missing values in the column with the value 0.

How to replace NaN values with zeros in pandas using NumPy for a DataFrame

Similarly, we can use the .replace() method with np.nan to replace NaN values with zeros in an entire Pandas DataFrame. To do this, we can simply apply the .replace() method to the whole DataFrame, as shown below:

    # Replace NaN values with zeros for a DataFrame using np.nan
    import pandas as pd
    import numpy as np

    df = pd.DataFrame({
        'Col_A': [1, 2, 3, np.nan],
        'Col_B': [1, np.nan, 3, 4],
        'Col_C': [1, 2, np.nan, 4],
    })

    # Replace np.nan with 0 in every column
    df = df.replace(np.nan, 0)
    print(df)

    # Returns:
    #    Col_A  Col_B  Col_C
    # 0    1.0    1.0    1.0
    # 1    2.0    0.0    2.0
    # 2    3.0    3.0    0.0
    # 3    0.0    4.0    4.0

In the code above, we apply the .replace() method to the entire DataFrame to replace the missing values.
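For this sample data, the two approaches covered in this tutorial should give identical results. The sketch below checks that, assuming the same sample DataFrame; the names via_replace and via_fillna are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Col_A': [1, 2, 3, np.nan],
    'Col_B': [1, np.nan, 3, 4],
    'Col_C': [1, 2, np.nan, 4],
})

# Fill the missing values with each approach
via_replace = df.replace(np.nan, 0)
via_fillna = df.fillna(0)

# equals() compares both the values and the dtypes of the two DataFrames
print(via_replace.equals(via_fillna))  # True
```

When the only goal is filling missing values, fillna() states that intent more directly, while replace() is the more general tool for swapping any value for another.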

Conclusion

In this tutorial, you learned how to use pandas to replace NaN values with zeros. You learned how to do this for a single column, multiple columns, and an entire DataFrame using the pandas .fillna() method. Then, you learned how to use the .replace() method with NumPy's nan value to do the same for a single column and an entire DataFrame.

FAQs

How do you change a NaN to zero in Python?

You can use numpy.nan_to_num. Calling numpy.nan_to_num(x) replaces NaN with zero and infinity with large finite numbers.
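A minimal sketch of numpy.nan_to_num on a small array follows. The array contents and the names arr and cleaned are made up for illustration.

```python
import numpy as np

# A small array containing NaN and both infinities
arr = np.array([1.0, np.nan, np.inf, -np.inf])

# NaN becomes 0.0; +inf and -inf become the largest and smallest finite floats
cleaned = np.nan_to_num(arr)
print(cleaned)

# The replacement values can also be set explicitly with keyword arguments
capped = np.nan_to_num(arr, nan=0.0, posinf=100.0, neginf=-100.0)
print(capped)
```

The keyword form is useful when the default replacement of infinity by a very large number would distort later calculations.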

What is the best way to replace NaN values?

By using the replace() or fillna() methods, you can replace NaN values with a blank (empty) string in a Pandas DataFrame. NaN stands for Not a Number and is one of the common ways to represent a missing value in Python and Pandas.

How to ignore NaN values in pandas?

  1. Use the dropna() function to drop rows with NaN / None values in a pandas DataFrame. ...
  2. Alternatively, you can pass axis=0 explicitly to remove rows with NaN, for example df.dropna(axis=0). ...
  3. Use how='all' to remove only rows in which every value is NaN/None (data is missing for all elements in the row).
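The difference between the default behaviour and how='all' can be sketched as follows. The DataFrame here is a small made-up example, and the names any_dropped and all_dropped are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Col_A': [1, np.nan, np.nan],
    'Col_B': [1, 2, np.nan],
})

# Default (how='any'): drop every row that contains at least one NaN
any_dropped = df.dropna()
print(len(any_dropped))  # 1

# how='all': drop only rows in which every value is NaN
all_dropped = df.dropna(how='all')
print(len(all_dropped))  # 2
```

Only the first row is complete, and only the last row is entirely missing, which explains the two row counts.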

How do I remove NaN from pandas?

Use the dropna() method to remove NaN values from a Series

Using the dropna() method, we can remove the NaN values from a Series. Calling Series.dropna() on the original Series returns a new Series with all NaN (missing) values removed.

How do I remove NaN values from data?

How to Drop Rows with NaN Values in Pandas DataFrame
  1. Step 1: Create a DataFrame with NaN Values. Let's say that you have the following dataset: ...
  2. Step 2: Drop the Rows with NaN Values in Pandas DataFrame. To drop all the rows with the NaN values, you may use df. ...
  3. Step 3 (Optional): Reset the Index.
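The three steps above can be sketched as follows. The dataset is a made-up example (the original one is not shown), and the column name 'values' and variable name cleaned are hypothetical.

```python
import numpy as np
import pandas as pd

# Step 1: create a DataFrame with NaN values (illustrative data)
df = pd.DataFrame({'values': [1.0, np.nan, 3.0, np.nan, 5.0]})

# Step 2: drop the rows that contain NaN values
# Step 3: reset the index so it runs 0, 1, 2 again; drop=True discards the old index
cleaned = df.dropna().reset_index(drop=True)
print(cleaned)
```

Without the reset, the remaining rows would keep their original labels 0, 2, and 4.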

How do you avoid NaN values?

Here are 4 methods to avoid NaN values.
  1. Avoid #1: Mathematical operations with non-numeric string values. ...
  2. Avoid #2: Mathematical operations with functions. ...
  3. Avoid #3: Mathematical operations with objects. ...
  4. Avoid #4: Mathematical operations with falsy values. ...

How do I change NaN value in Python?

This can be done by using the fillna() method. The pandas Series.fillna() method replaces missing values (NaN or NA) with a specified value. The method finds all the NaN values and replaces them with the given replacement value.

How do you convert NaN values in Python?

To convert NaN values to another value, use the fillna() method in pandas or numpy.nan_to_num() in NumPy. The bin() and int() functions convert between decimal and binary representations of a number (int() takes the base, 2 for binary, as its second argument) and do not handle NaN.

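How do I convert NaN to number?

In JavaScript, NaN can be converted into a number by using the OR operator with NaN and any number; the OR operator returns that number. This idiom does not carry over to Python, where NaN is truthy, so an explicit check is needed instead.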
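In Python specifically, the OR idiom behaves differently because a float NaN counts as true. A minimal sketch using only the standard library shows this and one possible alternative; the names value and converted are illustrative.

```python
import math

value = float('nan')

# NaN is truthy in Python, so `or` returns the NaN itself rather than the fallback
print(value or 0)  # nan

# An explicit check with math.isnan() picks the fallback reliably
converted = 0 if math.isnan(value) else value
print(converted)  # 0
```

For arrays and DataFrames, the fillna() and numpy.nan_to_num() approaches apply the same substitution to every element at once.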

How do you avoid NaN values in Python?

Steps to Remove NaN from Dataframe using pandas dropna
  1. Step 1: Import all the necessary libraries. In our examples, we are using NumPy for placing NaN values and pandas for creating the DataFrame. ...
  2. Step 2: Create a Pandas Dataframe. ...
  3. Step 3: Remove the NaN values using dropna() method.

How do you remove NaN and replace it with something else in Python?

Pandas: How to Replace NaN Values with String
  1. Method 1: Replace NaN Values with String in Entire DataFrame df.fillna('', inplace=True)
  2. Method 2: Replace NaN Values with String in Specific Columns df[['col1', 'col2']] = df[['col1','col2']].fillna('')
  3. Method 3: Replace NaN Values with String in One Column df.col1 = df.

How do you replace NaN values with a column in Python?

To fill with the mean, use the mean() function. Calculate the mean of the column that contains NaN values, then use fillna() to fill the NaN values with that mean.
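Filling with the column mean can be sketched as follows, reusing Col_B from the tutorial's sample data. The name col_mean is illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Col_B': [1, np.nan, 3, 4]})

# mean() skips NaN by default, so this averages 1, 3 and 4
col_mean = df['Col_B'].mean()

# Fill the missing entry with the computed mean and assign back
df['Col_B'] = df['Col_B'].fillna(col_mean)
print(df)
```

The same pattern works with median() when the column contains outliers that would skew the mean.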

How does pandas handle NaN value?

The fillna() function of Pandas conveniently handles missing values. Using fillna(), missing values can be replaced by a specific value or an aggregate value such as the mean or median. Furthermore, missing values can be replaced with the value before or after them, which is useful for time-series datasets.
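Filling from the previous or next value can be sketched with the ffill() and bfill() methods. The small daily series below is made up for illustration, and the names forward and backward are illustrative.

```python
import numpy as np
import pandas as pd

# A short daily time series with a two-day gap
s = pd.Series(
    [1.0, np.nan, np.nan, 4.0],
    index=pd.date_range('2023-01-01', periods=4),
)

# Forward fill: carry the last known value forward
forward = s.ffill()
print(forward.tolist())  # [1.0, 1.0, 1.0, 4.0]

# Backward fill: pull the next known value backward
backward = s.bfill()
print(backward.tolist())  # [1.0, 4.0, 4.0, 4.0]
```

Forward filling is often the safer choice for time series, since it never uses information from the future.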

How to convert NaN to integer in pandas?

Using NumPy, you can check for a NaN value with isnan(value) and, if the value is NaN, convert it with nan_to_num(). The nan_to_num() function changes the NaN value to 0.0, which can then be converted into an integer.
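A minimal sketch of both routes to an integer result follows: one using pandas fillna() and one using NumPy's nan_to_num(). The sample values and the names as_int and via_numpy are illustrative.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0])

# pandas route: fill the NaN first, then cast to an integer dtype
as_int = s.fillna(0).astype(int)
print(as_int.tolist())  # [1, 0, 3]

# NumPy route: nan_to_num turns NaN into 0.0, which can then be cast
via_numpy = np.nan_to_num(s.to_numpy()).astype(int)
print(via_numpy.tolist())  # [1, 0, 3]
```

The fill step has to come first, because casting a float column that still contains NaN to a plain integer dtype raises an error.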

Is NaN bigger than zero?

NaN is neither less than nor greater than any number.
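This behaviour can be checked directly in Python with a short sketch using only the standard library. The name nan is just a local variable here.

```python
import math

nan = float('nan')

# Every ordering and equality comparison involving NaN evaluates to False
print(nan > 0, nan < 0, nan == 0, nan == nan)  # False False False False

# Because NaN is not equal to itself, use math.isnan() to detect it
is_missing = math.isnan(nan)
print(is_missing)  # True
```

This is why filtering with a comparison such as `value > 0` silently excludes NaN values instead of raising an error.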

What is the integer version of NaN?

NaN is a floating-point-only concept; there is no representation of it in the integers. Picking a canonical integer value to mean "invalid" wouldn't be a good solution, because it wouldn't replicate the properties of NaN, namely that ordering and equality comparisons between NaN and any other value, including itself, evaluate to false.

Article information

Author: Nicola Considine CPA

Last Updated: 14/07/2023
