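Python - How to fill NaN values with the mean in Pandas?
To compute the mean, use the mean() function. Calculate the mean of the column that contains NaN values, then use fillna() to replace those NaN values with the mean.
Let us first import the required libraries -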
import pandas as pd
import numpy as np
Create a DataFrame with 2 columns and some NaN values. We have entered these NaN values using numpy's np.nan -
dataFrame = pd.DataFrame(
    {
        "Car": ['BMW', 'Lexus', 'Lexus', 'Mustang', 'Bentley', 'Mustang'],
        "Units": [100, 150, np.nan, 80, np.nan, np.nan]
    }
)
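Find the mean of the column that contains NaN values, i.e. the Units column here. mean() skips NaN by default, so only 100, 150 and 80 are counted; the mean is therefore (100 + 150 + 80) / 3 = 110 -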
meanVal = dataFrame['Units'].mean()
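As a rough check of the arithmetic above, the sketch below recomputes the mean by hand and contrasts it with skipna=False. The names units, mean_skipping_nan, manual_mean and mean_with_nan are illustrative only and do not appear elsewhere in this article.

```python
import numpy as np
import pandas as pd

# Same Units values as in the DataFrame above
units = pd.Series([100, 150, np.nan, 80, np.nan, np.nan], name="Units")

# Default behaviour: NaN entries are skipped (skipna=True)
mean_skipping_nan = units.mean()

# Manual check: sum of the non-NaN values divided by their count
manual_mean = units.sum() / units.count()

# With skipna=False a single NaN makes the whole result NaN
mean_with_nan = units.mean(skipna=False)

print(mean_skipping_nan)  # 110.0
print(manual_mean)        # 110.0
print(mean_with_nan)      # nan
```

Only the three non-NaN values take part in the default mean, which is why the result is 110 rather than NaN.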
Replace the NaN values with the mean of their column. The mean calculated above is 110, so the NaN values will be replaced with 110 -
dataFrame['Units'] = dataFrame['Units'].fillna(meanVal)
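When more than one numeric column has gaps, the same idea can probably be applied to all of them in one call. The following is a minimal sketch under that assumption: the Price column and the names df, column_means and filled are hypothetical and are not part of this article's data.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: "Price" is an illustrative column, not part of the article's data
df = pd.DataFrame({
    "Car": ["BMW", "Lexus", "Lexus", "Mustang"],
    "Units": [100, 150, np.nan, 80],
    "Price": [40.0, np.nan, 50.0, 60.0],
})

# Column-wise means of the numeric columns only (NaN is skipped)
column_means = df.mean(numeric_only=True)

# fillna accepts a Series: each column is filled with the value stored under its own label
filled = df.fillna(column_means)

print(filled)
```

Here fillna() returns a new DataFrame and leaves df untouched, so the original values remain available for comparison.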
Example
Following is the code -
import pandas as pd
import numpy as np

# Create DataFrame
dataFrame = pd.DataFrame(
    {
        "Car": ['BMW', 'Lexus', 'Lexus', 'Mustang', 'Bentley', 'Mustang'],
        "Units": [100, 150, np.nan, 80, np.nan, np.nan]
    }
)

print("DataFrame ...")
print(dataFrame)

# Find the mean of the column that contains NaN values, i.e. the Units column here
# The Units column has 100, 150 and 80; therefore the mean would be 110
meanVal = dataFrame['Units'].mean()

# Replace NaNs with the mean of the column where they are located
# The mean calculated above is 110, so NaN values will be replaced with 110
dataFrame['Units'] = dataFrame['Units'].fillna(meanVal)

print("\nUpdated Dataframe after filling NaN values with mean...")
print(dataFrame)
Output
This will produce the following output -
DataFrame ...
       Car  Units
0      BMW  100.0
1    Lexus  150.0
2    Lexus    NaN
3  Mustang   80.0
4  Bentley    NaN
5  Mustang    NaN

Updated Dataframe after filling NaN values with mean...
       Car  Units
0      BMW  100.0
1    Lexus  150.0
2    Lexus  110.0
3  Mustang   80.0
4  Bentley  110.0
5  Mustang  110.0
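One way to confirm the result shown above is to count the remaining NaN values and inspect the rows that were filled. This is only a sketch; was_missing, remaining_nan and filled_values are illustrative names that the article itself does not use.

```python
import numpy as np
import pandas as pd

dataFrame = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Lexus', 'Mustang', 'Bentley', 'Mustang'],
    "Units": [100, 150, np.nan, 80, np.nan, np.nan],
})

# Remember which rows were missing before filling
was_missing = dataFrame['Units'].isna()

meanVal = dataFrame['Units'].mean()
dataFrame['Units'] = dataFrame['Units'].fillna(meanVal)

# No NaN should remain, and every filled row should hold the mean
remaining_nan = int(dataFrame['Units'].isna().sum())
filled_values = dataFrame.loc[was_missing, 'Units'].tolist()

print(remaining_nan)  # 0
print(filled_values)  # [110.0, 110.0, 110.0]
```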