Updated March 28, 2023
Introduction to Pandas DataFrame.apply()
When every value of the panda’s data structure needs to be manipulated or operated in some specific manner, then the pandas.apply() function can be used. The apply() method is used to apply some specific function to every value in the panda’s data structure. The Objects which is applied to the function start with index (axis=0) if it is a series or the DataFrame’s columns (axis=1). By default (result_type=None), The return of the function acts as the final return element. In this topic, we are going to learn about Pandas DataFrame.apply().
Syntax
Here is the syntax mentioned below
DataFrame.apply(self, function, axis=0, raw=False, result_type=None, args=(), return)
Parameter | Description |
Function | Denotes the function which needs to be applied for the panda’s data structure |
Axis | Denotes the columns or rows for which the function needs to be applied. {0 or ‘index,’ 1 or ‘columns’}, default 0
Axis along which the function is applied: 0 or ‘index’: apply the function to each column. 1 or ‘columns’: apply the function to each row. |
Raw | Determines how the data is passed, whether it is a series or a ndarray.
False: the data is passed as a series of dataframe True: The data is passed as a ndarray object. Notably, the process of making a Numpy reduction function achieves a better performance. |
Result_type | The result_type applies when the value of the axis used is 1, which represents the columns:
1) ‘expand’: Expand allows a list value to be converted vertically into a column 2) ‘reduce’: An exact opposite of the expand where a column value will be reduced into a list kind of item. 3) ‘broadcast’: Here, the actual shape of the dataframe will be considered as the outcomes. The core index values will be retained here. |
args | Represents all the arguments that are passed to the function |
Examples of Pandas DataFrame.apply()
Different examples are mentioned below:
Example #1
Code:
import pandas as pd
Core_Series = pd.Series([ 1, 6, 11, 15, 21, 26])
print(" THE CORE SERIES ")
print(Core_Series)
Lambda_Series = Core_Series.apply(lambda Value : Value * 10)
print("")
print(" THE LAMBDA SERIES ")
print(Lambda_Series)
Output:
Explanation: Here, the pandas library is initially imported, and the imported library is used for creating a series. The values in the series are formulated in such a way that they are a series of 1 to n. the apply() method is placed over this series with a lambda function. The lambda function is responsible for taking each value from the series and multiplying it by 10. So each and every item in the series gets multiplied by 10 at the end of this process. The resulting values are captured in a series called the lambda series and is printed on to the console. We can clearly notice the lambda series holding all values from the core series with a multiple of 10 on it.
Example #2
Code:
import pandas as pd
Core_Dataframe = pd.DataFrame({'A' : [ 1, 6, 11, 15, 21, 26],
'B' : [2, 7, 12, 17, 22, 27],
'C' : [3, 8, 13, 18, 23, 28],
'D' : [4, 9, 14, 19, 24, 29],
'E' : [5, 10, 15, 20, 25, 30]})
print(" THE CORE DATAFRAME ")
print(Core_Dataframe)
Lambda_Dataframe = Core_Dataframe.apply(lambda Value : Value * 10)
print("")
print(" THE LAMBDA DATAFRAME ")
print(Lambda_Dataframe)
Output:
Explanation: Here, the pandas library is initially imported, and the imported library is used for creating the dataframe, which is a shape(6,6). All of the columns in the dataframe are assigned with the headers, which are alphabetic. The values in the dataframe are formulated in such a way that they are a series of 1 to n. this dataframe is programmatically named here as the core dataframe. The apply() method is placed over this dataframe with a lambda function. The lambda function is responsible for taking each value from the dataframe and multiplying it by 10. So each and every item in the dataframe gets multiplied by 10 at the end of this process. The resulting values are captured in a dataframe called the lambda dataframe and is printed on to the console. We can clearly notice the lambda dataframe holding all values from the core dataframe with a multiple of 10 on it.
Example #3
Code:
import pandas as pd
def Value_range_check(value):
if value < 50: return "Low" elif value >= 50 and value < 100: return "Normal" elif value > 100:
return "High"
final_dataframe = pd.DataFrame([])
Core_Dataframe = pd.DataFrame({'A' : [ 1, 6, 11, 15, 21, 26],
'B' : [2, 7, 12, 17, 22, 27],
'C' : [3, 8, 13, 18, 23, 28],
'D' : [4, 9, 14, 19, 24, 29],
'E' : [5, 10, 15, 20, 25, 30]})
print(" THE CORE DATAFRAME ")
print(Core_Dataframe)
Lambda_Dataframe = Core_Dataframe.apply(lambda Value : Value * 10)
print("")
print(" THE LAMBDA DATAFRAME ")
print(Lambda_Dataframe)
Lambda_Dataframe_size = (Lambda_Dataframe.shape[0] * Lambda_Dataframe.shape[0])
print("")
print("OVERALL DATFRAME SIZE: ", Lambda_Dataframe_size)
for i in range(Lambda_Dataframe.shape[1]):
Value_checked_dataframe = Lambda_Dataframe.iloc[:,i].apply(Value_range_check)
final_dataframe.insert(i,i,Value_checked_dataframe)
print("")
print(" FINAL DATAFRAME ")
print(final_dataframe)
Output:
Explanation: The whole initial set of operations from the above example are repeated here again; more specifically in this example, the apply() method is used in two ways, as discussed above at the first instance, the apply() method is used upon a lambda function, but in the next instance it is applied over a normal function. This function is applied over the lambda dataframe. Here, the function is used to segregate every value attained in the lambda function under three different categories: low, normal, and high. For example, if a value falls within 50, it is named as low, then the values which fall under 50 to 100 are named as normal; lastly, the values which fall over 100 are named as high. For formulating the resultant series into a dataframe, every column in the lambda dataframe is passed into the apply function by using the iloc as a column reference. So the output returned will also be a column of values. For inserting each column into the dataframe, the insert() method is used. The final dataframe is printed onto the console.
Conclusion
The apply() method in pandas shows the flexibility of applying an operation over each and every value in the dataframe in a most flexible way. It also depicts the classified set of functionalities associated to this function.
Recommended Articles
This is a guide to Pandas DataFrame.apply(). Here we discuss the Syntax and Examples of Pandas DataFrame.apply() along with the codes, outputs, and explanations. You may also have a look at the following articles to learn more –