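Python Pandas - Fill missing column values with mode
The mode is the value that occurs most often in a set of values. To fill missing values in a column with its mode, pass the mode to the fillna() method. First, import the required libraries with their usual aliases: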
import pandas as pd
import numpy as np
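Create a DataFrame with 2 columns. The missing values are set with NumPy's np.nan:
dataFrame = pd.DataFrame(
    {
        "Car": ['BMW', 'Lexus', 'Lexus', 'Mustang', 'Bentley', 'Mustang'],
        "Units": [100, 150, np.nan, 80, np.nan, np.nan]
    }
)
Find the mode of the column that contains NaN, here the Units column, by calling mode() on it. mode() returns a Series, so [0] selects its first entry, which then replaces the NaN values: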
dataFrame.fillna(dataFrame['Units'].mode()[0], inplace=True)
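Calling fillna() on the whole DataFrame writes the value into every column that has gaps, not only Units. The following is a minimal sketch of restricting the fill to one column; the frame `df`, in which Car also has a missing entry, and the name `units_mode` are hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: unlike the tutorial's data, "Car" also has a missing entry
df = pd.DataFrame(
    {
        "Car": ["BMW", "Lexus", None, "Mustang"],
        "Units": [100, 150, np.nan, 150],
    }
)

# Mode of the Units column only (NaN is ignored by default)
units_mode = df["Units"].mode()[0]

# Assign the result back to the column so that "Car" is left untouched
df["Units"] = df["Units"].fillna(units_mode)

print(df)
```

Here the missing Car entry stays missing, whereas a whole-frame fillna() would have placed the numeric mode of Units into the Car column as well.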
Example
Following is the complete code:
import pandas as pd
import numpy as np

# Create the DataFrame
dataFrame = pd.DataFrame(
    {
        "Car": ['BMW', 'Lexus', 'Lexus', 'Mustang', 'Bentley', 'Mustang'],
        "Units": [100, 150, np.nan, 80, np.nan, np.nan]
    }
)
print("DataFrame ...")
print(dataFrame)

# Find the mode of the column that contains NaN, here the Units column,
# and replace the NaN values with it
dataFrame.fillna(dataFrame['Units'].mode()[0], inplace=True)
print("\nUpdated Dataframe after filling NaN values with mode...")
print(dataFrame)
Output
This will produce the following output:
DataFrame ...
       Car  Units
0      BMW  100.0
1    Lexus  150.0
2    Lexus    NaN
3  Mustang   80.0
4  Bentley    NaN
5  Mustang    NaN

Updated Dataframe after filling NaN values with mode...
       Car  Units
0      BMW  100.0
1    Lexus  150.0
2    Lexus   80.0
3  Mustang   80.0
4  Bentley   80.0
5  Mustang   80.0
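In this data each non-missing Units value occurs exactly once, so all three are tied for the mode. mode() returns the tied values in sorted order, which is why 80.0 is the value used above. The sketch below inspects the full mode result and then, as one possible variant, fills every column with its own first mode through DataFrame.mode(); the names `units_modes`, `first_modes` and `filled` are illustrative and not part of the original example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "Car": ["BMW", "Lexus", "Lexus", "Mustang", "Bentley", "Mustang"],
        "Units": [100, 150, np.nan, 80, np.nan, np.nan],
    }
)

# All modes of Units, sorted ascending; NaN is excluded by default
units_modes = df["Units"].mode()
print(units_modes.tolist())  # [80.0, 100.0, 150.0]

# DataFrame.mode() computes the modes of each column separately;
# its first row holds the first mode of every column
first_modes = df.mode().iloc[0]

# fillna() accepts a Series keyed by column name, so each column
# is filled with its own mode and the original frame is not modified
filled = df.fillna(first_modes)
print(filled)
```

When a tie like this occurs, picking the first entry is a convention rather than a statistically meaningful choice, so it may be worth checking the length of the mode result before relying on it.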