How to replace missing values with row means in an R data frame?
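When the columns of an R data frame measure similar characteristics, a missing value can reasonably be replaced with the mean of the other values in its row. The na.aggregate function from the zoo package does the replacement, but it works column by column (by default it fills each NA with its column mean), so we apply it to the transposed data frame and then transpose the result back.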
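To make the role of the transpose concrete, here is a minimal sketch on a small hand-made matrix. The matrix m and the names col_filled and row_filled are made up for this illustration; only na.aggregate and t come from zoo and base R.

```r
library(zoo)

# Hypothetical 3 x 3 matrix with one NA in column a and one in column b
m <- cbind(a = c(1, NA, 3),
           b = c(10, 20, NA),
           c = c(100, 200, 300))

# Default behaviour: each NA is replaced by the mean of its column
col_filled <- na.aggregate(m)
col_filled[2, "a"]   # 2, the mean of 1 and 3
col_filled[3, "b"]   # 15, the mean of 10 and 20

# Transpose, fill, transpose back: each NA is replaced by the mean of its row
row_filled <- t(na.aggregate(t(m)))
row_filled[2, "a"]   # 110, the mean of 20 and 200
row_filled[3, "b"]   # 151.5, the mean of 3 and 300
```

The same two calls to t() appear in the examples that follow, where the input is a data frame rather than a matrix.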
Example 1
Consider the following data frame:
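# sample() and rpois() draw random values, so your data will differ unless you call set.seed() first
x1 <- sample(c(NA, 1, 5), 20, replace = TRUE)
x2 <- sample(c(NA, 10, 25), 20, replace = TRUE)
x3 <- rpois(20, 5)
df1 <- data.frame(x1, x2, x3)
df1
Output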
   x1 x2 x3
1   5 10  4
2   1 NA  2
3  NA NA  5
4   5 NA  2
5   1 25  8
6   1 10  2
7   1 NA  4
8   5 NA  4
9   5 25  3
10  1 NA  5
11  1 NA  7
12  5 NA  6
13  1 25  4
14  5 NA  8
15  1 25  6
16 NA 10  6
17  5 10  5
18  5 25  8
19 NA 25  3
20 NA 25  5
Load the zoo package and replace the missing values with row means:
Example
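library(zoo)
# t() turns rows into columns so na.aggregate() fills each NA with its row mean; the outer t() restores the layout
df1[] <- t(na.aggregate(t(df1)))
df1
Output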
   x1   x2 x3
1   5 10.0  4
2   1  1.5  2
3   5  5.0  5
4   5  3.5  2
5   1 25.0  8
6   1 10.0  2
7   1  2.5  4
8   5  4.5  4
9   5 25.0  3
10  1  3.0  5
11  1  4.0  7
12  5  5.5  6
13  1 25.0  4
14  5  6.5  8
15  1 25.0  6
16  8 10.0  6
17  5 10.0  5
18  5 25.0  8
19 14 25.0  3
20 15 25.0  5
Example 2
y1 <- sample(c(NA, 525, 235, 401), 20, replace = TRUE)
y2 <- rnorm(20, 500, 51.24)
y3 <- sample(c(NA, 35, 47), 20, replace = TRUE)
df2 <- data.frame(y1, y2, y3)
df2
Output
    y1       y2 y3
1  525 555.4212 47
2  401 508.7781 47
3  401 488.3973 47
4   NA 546.6707 35
5  401 497.5346 47
6  235 460.7668 35
7   NA 495.0879 35
8  401 441.4254 47
9   NA 446.8322 47
10 235 484.8106 NA
11 235 517.4665 47
12  NA 450.1524 NA
13 525 485.2432 47
14 525 506.0650 35
15 525 470.7504 47
16  NA 370.8190 35
17 525 509.6385 35
18 525 471.0552 35
19 235 468.6052 35
20 401 472.6163 47
Replacing the missing values with row means:
Example
# zoo is already loaded from Example 1; otherwise run library(zoo) first
df2[] <- t(na.aggregate(t(df2)))
df2
Output
         y1       y2       y3
1  525.0000 555.4212  47.0000
2  401.0000 508.7781  47.0000
3  401.0000 488.3973  47.0000
4  290.8353 546.6707  35.0000
5  401.0000 497.5346  47.0000
6  235.0000 460.7668  35.0000
7  265.0440 495.0879  35.0000
8  401.0000 441.4254  47.0000
9  246.9161 446.8322  47.0000
10 235.0000 484.8106 359.9053
11 235.0000 517.4665  47.0000
12 450.1524 450.1524 450.1524
13 525.0000 485.2432  47.0000
14 525.0000 506.0650  35.0000
15 525.0000 470.7504  47.0000
16 202.9095 370.8190  35.0000
17 525.0000 509.6385  35.0000
18 525.0000 471.0552  35.0000
19 235.0000 468.6052  35.0000
20 401.0000 472.6163  47.0000
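If you would rather not depend on zoo, the same replacement can be sketched in base R with rowMeans. This is an alternative shown for comparison, not the approach used in the examples above. The helper name fill_row_means and the data frame demo are hypothetical; demo reuses rows 1, 3 and 19 of df1 from Example 1, and the helper assumes every column is numeric.

```r
# Hypothetical helper: replace each NA with the mean of the non-missing
# values in the same row. Assumes all columns are numeric.
fill_row_means <- function(df) {
  m <- as.matrix(df)
  row_mean <- rowMeans(m, na.rm = TRUE)    # one mean per row, ignoring NAs
  idx <- which(is.na(m), arr.ind = TRUE)   # (row, col) position of every NA
  m[idx] <- row_mean[idx[, "row"]]         # fill each NA with its row's mean
  df[] <- m                                # keep the data frame's names and shape
  df
}

# Rows 1, 3 and 19 of df1 from Example 1
demo <- data.frame(x1 = c(5, NA, NA),
                   x2 = c(10, NA, 25),
                   x3 = c(4, 5, 3))
filled <- fill_row_means(demo)
filled
```

For these three rows the result agrees with the output of Example 1: the second row becomes 5, 5, 5 and the third becomes 14, 25, 3. A row made up entirely of NA has no values to average, so rowMeans returns NaN for it and the row is filled with NaN.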