Updated April 3, 2023
Introduction to Pandas DataFrame.where()
Searching a collection of data for specific items is one of the most common operations in software. In pandas, this is done with the where() method. where() checks each element of a pandas data structure, such as a Series or a DataFrame, against a given condition: elements that satisfy the condition are kept, and elements that do not are replaced with another value. The default replacement value is NaN.
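As a rough illustration of the "keep or replace" behavior described above, the sketch below contrasts where() with ordinary boolean indexing. The Series name and values are made up for this illustration and are not part of the article's examples.

```python
import pandas as pd

# Illustrative data (hypothetical, chosen only for this sketch).
scores = pd.Series([10, 20, 30, 40, 50, 60])

# where() keeps the original shape: entries that fail the condition become NaN.
kept_shape = scores.where(scores >= 50)

# Boolean indexing, by contrast, drops the entries that fail the condition.
filtered = scores[scores >= 50]

print(len(kept_shape))          # 6
print(len(filtered))            # 2
print(kept_shape.isna().sum())  # 4
```

The point of the comparison is that where() is a replacement operation rather than a row filter, so the result always has the same index as the input.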
Syntax and Parameters
The syntax of the method is as follows.
Syntax:
DataFrame.where(self, cond, other=nan, inplace=False, axis=None, level=None, errors='raise', try_cast=False)
The parameters are described below:
Parameter | Description |
cond | The condition to check. It can be a boolean Series or DataFrame, an array-like, or a callable. Where the condition is True, the original value is kept unchanged. Where it is False, the value is replaced with the corresponding value from other. If cond is a callable, it is computed on the Series or DataFrame and should return a boolean Series, DataFrame, or array. |
other | The replacement for the entries where cond is False. It can be a scalar, a Series, a DataFrame, or a callable. If other is a callable, it is computed on the Series or DataFrame and should return a scalar, Series, or DataFrame. The default is NaN. |
inplace | Boolean, default False. Determines whether the operation is performed in place on the data instead of returning a new object. |
axis | The alignment axis, if alignment is needed. (Value should be an int; the default is None.) |
level | The alignment level, if alignment is needed. (Value should be an int; the default is None.) |
try_cast | Boolean, default False. If set, pandas tries to cast the result back to the input type where possible. This parameter belongs to older pandas releases; it was deprecated during the 1.x series and is no longer accepted in pandas 2.0. |
errors | Either 'raise' (the default) or 'ignore'. Controls whether an exception is allowed to be raised or is suppressed. Like try_cast, this parameter belongs to older pandas releases; it was deprecated during the 1.x series and is no longer accepted in pandas 2.0. |
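The following is a minimal sketch of the different forms that cond and other can take, plus the axis argument. The DataFrame, the mask, and the row_fill Series are hypothetical values chosen for illustration. The sketch avoids try_cast and errors so that it does not depend on an older pandas release.

```python
import numpy as np
import pandas as pd

# Hypothetical data for this sketch.
df = pd.DataFrame({"A": [1, -2, 3], "B": [-4, 5, -6]})

# cond as a callable: it receives the DataFrame and returns a boolean DataFrame.
positives_or_nan = df.where(lambda frame: frame > 0)

# cond as a plain boolean array: it must have the same shape as df.
mask = np.array([[True, False], [False, True], [True, False]])
masked = df.where(mask, other=0)

# other as a callable: computed on df; here it flips the sign of failing entries.
absolute = df.where(df > 0, other=lambda frame: -frame)

# other as a Series aligned on the row index (axis=0): one fill value per row.
row_fill = pd.Series([100, 200, 300], index=df.index)
filled_by_row = df.where(df > 0, other=row_fill, axis=0)

print(positives_or_nan)
print(masked)
print(absolute)
print(filled_by_row)
```

In the last call, axis=0 tells pandas to line up row_fill with the rows of df, so a failing entry in row 1 is replaced with 200 regardless of its column.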
Examples of Pandas DataFrame.where()
The following examples show how where() behaves on a Series and on DataFrames.
Example #1
Code:
import pandas as pd
Core_Series = pd.Series([ 10, 20, 30, 40, 50, 60])
print(" THE CORE SERIES ")
print(Core_Series)
Filtered_Series = Core_Series.where(Core_Series >= 50)
print("")
print(" THE FILTERED SERIES ")
print(Filtered_Series)
Filtered_Series_with_replace = Core_Series.where(Core_Series < 50, other=0)
print("")
print(" THE FILTERED SERIES WITH REPLACE")
print(Filtered_Series_with_replace)
Output:
Code Explanation: The pandas library is imported and used to create a Series holding the values 10 to 60 in steps of 10. The where() method is then applied in two ways. The first call relies on the default replacement, so the values that fail the condition Core_Series >= 50 (10 to 40) become NaN and only 50 and 60 are kept. The second call passes other=0, so the values that fail the condition Core_Series < 50 (50 and 60) are replaced with 0. Both results are printed to the console.
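One side effect of the two replacement styles is worth a quick check: NaN is a floating-point value, so the default replacement changes the dtype of an integer Series, while an integer replacement does not. The sketch below reuses the values from Example #1 with shorter, hypothetical variable names.

```python
import pandas as pd

core = pd.Series([10, 20, 30, 40, 50, 60])

# Default replacement (NaN) forces an upcast, because NaN is a float.
with_nan = core.where(core >= 50)

# An integer replacement lets the result stay an integer Series.
with_zero = core.where(core < 50, other=0)

print(with_nan.dtype)      # float64
print(with_nan.tolist())   # [nan, nan, nan, nan, 50.0, 60.0]
print(with_zero.tolist())  # [10, 20, 30, 40, 0, 0]
```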
Example #2
Code:
import pandas as pd
Core_Dataframe = pd.DataFrame({'A': [1, 6, 11, 15, 21, 26],
                               'B': [2, 7, 12, 17, 22, 27],
                               'C': [3, 8, 13, 18, 23, 28],
                               'D': [4, 9, 14, 19, 24, 29],
                               'E': [5, 10, 15, 20, 25, 30]})
print(" THE CORE DATAFRAME ")
print(Core_Dataframe)
Filtered_Dataframe_1 = Core_Dataframe.where(Core_Dataframe >= 15,other=0)
print("")
print(" THE FILTERED DATAFRAME WITH REPLACE")
print(Filtered_Dataframe_1)
print(" THE FILTERED DATAFRAME WITHOUT REPLACE AND INPLACE AS TRUE")
print(" CORE DATAFRAME BEFORE INPLACE:")
print(Core_Dataframe)
Core_Dataframe.where(Core_Dataframe >= 15,axis=0,inplace=True)
print("")
print(" CORE DATAFRAME AFTER INPLACE:")
print(Core_Dataframe)
Output:
Code Explanation: The pandas library is imported and used to create a DataFrame of shape (6, 5): six rows and five columns labeled A to E, holding integers between 1 and 30. The where() method is again used in two ways. The core DataFrame is printed first. Then the values greater than or equal to 15 are kept in a new DataFrame, with every value that fails the condition replaced by 0, and that result is printed. The second call does not pass other, so failing values are replaced with the default NaN. It sets inplace=True and axis=0, so the change is applied directly to Core_Dataframe rather than being returned as a new DataFrame. All outputs are printed to the console.
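The difference between the two calls can be sketched on a smaller frame. The data below is hypothetical; float columns are used so that NaN fits into the frame without a dtype change during the in-place update.

```python
import pandas as pd

# Hypothetical float data for this sketch.
frame = pd.DataFrame({"A": [1.0, 15.0, 26.0], "B": [2.0, 17.0, 27.0]})

# Without inplace, where() returns a new DataFrame and leaves `frame` untouched.
replaced = frame.where(frame >= 15, other=0)
snapshot = frame.copy()  # copy of the original values, kept for comparison

# With inplace=True, where() returns None and modifies `frame` itself.
result = frame.where(frame >= 15, inplace=True)

print(result)  # None
print(replaced)
print(frame)
```

Because the in-place call returns None, assigning its result back to a variable (for example `frame = frame.where(..., inplace=True)`) would discard the DataFrame. Reassigning the result of a call without inplace is an equally valid way to update the variable.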
Example #3
Code:
import pandas as pd
Core_Dataframe = pd.DataFrame({'Emp_No': ['Emp1', 'Emp2', 'Emp3', 'Emp4'],
                               'Employee_Name': ['Arun', 'selva', 'rakesh', 'arjith'],
                               'Employee_dept': ['CAD', 'CAD', 'DEV', 'CAD']})
print(" THE CORE DATAFRAME ")
print(Core_Dataframe)
print("")
Condition = Core_Dataframe['Employee_dept'] == 'CAD'
Core_Dataframe.where(Condition,inplace=True)
print("")
print(" THE UPDATED CORE DATAFRAME ")
print(Core_Dataframe)
Output:
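Code Explanation: Here the input DataFrame holds employee information. The condition is a boolean Series (not a callable) that is True for the rows whose Employee_dept is 'CAD'. Because inplace=True is set, Core_Dataframe itself is updated: the rows that satisfy the condition are kept, and every column of the one row that does not (Emp3, in the DEV department) is replaced with NaN. The updated DataFrame is printed to the console.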
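For comparison, the same row-level condition can be written either as a boolean Series or as a callable. The sketch below uses a trimmed-down, hypothetical version of the employee data and skips inplace so that both results can be compared side by side.

```python
import pandas as pd

# Trimmed-down employee data (hypothetical, for this sketch only).
employees = pd.DataFrame({
    "Emp_No": ["Emp1", "Emp2", "Emp3", "Emp4"],
    "Employee_dept": ["CAD", "CAD", "DEV", "CAD"],
})

# A boolean Series: one True/False per row, applied to every column.
is_cad = employees["Employee_dept"] == "CAD"
by_series = employees.where(is_cad)

# The same condition written as a callable that receives the DataFrame.
by_callable = employees.where(lambda frame: frame["Employee_dept"] == "CAD")

print(by_series.equals(by_callable))        # True
print(by_series["Emp_No"].isna().tolist())  # [False, False, True, False]
```

The callable form is mainly useful in method chains, where the intermediate DataFrame has no variable name to build the condition from.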