如何找到R数据帧的分组汇总统计信息?
为了比较不同的组,我们需要每个组的摘要统计信息。它有助于我们观察两组之间的差异。摘要统计信息提供最小值,第一四分位数,中位数,第三四分位数和最大值。因此,我们可以比较每个组的这些值。要找到R数据帧的逐组汇总统计信息,我们可以使用tapply函数。
示例
请看以下数据帧-
> set.seed(99)
> x1<-sample(1:100,50,replace=TRUE)
> x2<-rep(c("G1","G2","G3","G4","G5"),times=10)
> df<-data.frame(x1,x2)
> head(df,20)
x1 x2
1 48 G1
2 33 G2
3 44 G3
4 22 G4
5 99 G5
6 62 G1
7 98 G2
8 32 G3
9 13 G4
10 20 G5
11 100 G1
12 31 G2
13 68 G3
14 9 G4
15 82 G5
16 88 G1
17 30 G2
18 86 G3
19 84 G4
20 32 G5查找每个组的x1的摘要统计量-
> tapply(df$x1, df$x2, summary) $G1 Min. 1st Qu. Median Mean 3rd Qu. Max. 14.0 55.0 72.0 67.8 86.5 100.0 $G2 Min. 1st Qu. Median Mean 3rd Qu. Max. 4.0 31.5 60.5 52.4 69.5 98.0 $G3 Min. 1st Qu. Median Mean 3rd Qu. Max. 14.0 33.5 41.0 46.9 64.5 86.0 $G4 Min. 1st Qu. Median Mean 3rd Qu. Max. 9.00 23.75 53.00 53.30 82.75 97.00 $G5 Min. 1st Qu. Median Mean 3rd Qu. Max. 7.00 31.25 32.00 42.40 44.75 99.00
让我们再看一个例子-
> y1<-rep(c(letters[1:5]),times=5) > y2<-rep(c(14,25,13,12,41,52,44,28,17,30),times=c(2,5,3,3,1,5,1,2,2,1)) > df_y<-data.frame(y1,y2) > head(df_y,20) y1 y2 1 a 14 2 b 14 3 c 25 4 d 25 5 e 25 6 a 25 7 b 25 8 c 13 9 d 13 10 e 13 11 a 12 12 b 12 13 c 12 14 d 41 15 e 52 16 a 52 17 b 52 18 c 52 19 d 52 20 e 44 > tapply(df_y$y2, df_y$y1, summary) $a Min. 1st Qu. Median Mean 3rd Qu. Max. 12.0 14.0 25.0 26.2 28.0 52.0 $b Min. 1st Qu. Median Mean 3rd Qu. Max. 12.0 14.0 25.0 26.2 28.0 52.0 $c Min. 1st Qu. Median Mean 3rd Qu. Max. 12.0 13.0 17.0 23.8 25.0 52.0 $d Min. 1st Qu. Median Mean 3rd Qu. Max. 13.0 17.0 25.0 29.6 41.0 52.0 $e Min. 1st Qu. Median Mean 3rd Qu. Max. 13.0 25.0 30.0 32.8 44.0 52.0
热门推荐
10 小红书平安祝福语简短
11 生日祝福语大全女孩简短
12 收生日红包祝福语 简短
13 领证幽默祝福语简短
14 法考面试祝福语简短
15 老哥出门祝福语简短语
16 送灯祝福语简短独特
17 幼儿狗年祝福语大全简短
18 好听的元旦简短祝福语