Dear R users,

When I use aggregate with table as FUN, I get what I would call strange behaviour when the data contain numeric vectors and one value of such a vector does not occur in every group defined by the "by" variable:

---------------------------

> df <- data.frame(A=c(1,1,1,1,0,0,0,0),B=c(1,0,1,0,0,0,1,0),C=c(1,0,1,0,0,1,1,1))
> aggregate(df[1:2],list(df$C),table,simplify = TRUE,drop=TRUE)
  Group.1 A.0 A.1    B
1       0   1   2    3
2       1   3   2 2, 3

> table(df$C,df$B)

    0 1
  0 3 0
  1 2 3

---------------

As you can see, the B column of the aggregate result contains a comma-separated entry ("2, 3"), whereas a direct call to table gives the same result as if B were defined as a factor (I suppose this is because "non-factor arguments a are coerced via factor", according to the Details section of the table help). The aggregate result seems completely normal to me once I remember that aggregate first splits the data into subsets and then computes the table on each subset. But then I don't understand why it works differently with character vectors. Indeed, if I use character vectors, I get the same result as with factors:

------------------------

> df <- data.frame(A=factor(c("1","1","1","1","0","0","0","0")),B=factor(c("1","0","1","0","0","0","1","0")),C=factor(c("1","0","1","0","0","1","1","1")))
> aggregate(df[1:2],list(df$C),table,simplify = TRUE,drop=TRUE)
  Group.1 A.0 A.1 B.0 B.1
1       0   1   2   3   0
2       1   3   2   2   3

> df <- data.frame(A=factor(c(1,1,1,1,0,0,0,0)),B=factor(c(1,0,1,0,0,0,1,0)),C=factor(c(1,0,1,0,0,1,1,1)))
> aggregate(df[1:2],list(df$C),table,simplify = TRUE,drop=TRUE)
  Group.1 A.0 A.1 B.0 B.1
1       0   1   2   3   0
2       1   3   2   2   3

---------------------
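
The difference can be reproduced outside aggregate by tabulating each subset directly. The following is only a minimal sketch, assuming that aggregate does the equivalent of split() followed by FUN on each piece; the names num_tabs and fac_tabs are illustrative and not part of any API. According to the aggregate help, with simplify = TRUE the summaries are simplified to vectors or matrices only if they have a common length, otherwise a list of per-subset results is returned, which would explain the "2, 3" entry.

```r
# Same data as in the first example (numeric columns).
df <- data.frame(A = c(1, 1, 1, 1, 0, 0, 0, 0),
                 B = c(1, 0, 1, 0, 0, 0, 1, 0),
                 C = c(1, 0, 1, 0, 0, 1, 1, 1))

# Numeric B: table() coerces each subset to a factor on its own, so the
# levels are only the values that actually occur in that subset.
num_tabs <- lapply(split(df$B, df$C), table)
lengths(num_tabs)   # the two groups give tables of different lengths

# Factor B: the levels are fixed before splitting and are kept in every
# subset, so each subset is tabulated over the same levels (empty ones too).
fac_tabs <- lapply(split(factor(df$B), df$C), table)
lengths(fac_tabs)   # both groups give tables of the same length
```

With tables of unequal length there is no common shape to simplify to, while tables of equal length can be bound into the B.0 and B.1 columns.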

Could the aggregate help page say something about this behaviour, since the result does not fully match what the table help leads one to expect? Or would it be possible to get the same result regardless of the vector type? This post was rejected on the R-devel mailing list, so I am asking my question here, as suggested.
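
One possible way to get the same shape regardless of the vector type, shown here only as a sketch, is to tabulate over an explicit set of levels. The helper tab_fixed and its levels argument are hypothetical names introduced for this illustration; the sketch relies on aggregate passing extra arguments on to FUN.

```r
# Same numeric data as in the first example.
df <- data.frame(A = c(1, 1, 1, 1, 0, 0, 0, 0),
                 B = c(1, 0, 1, 0, 0, 0, 1, 0),
                 C = c(1, 0, 1, 0, 0, 1, 1, 1))

# Hypothetical helper: tabulate x over a fixed set of levels so that
# every subset yields a table of the same length.
tab_fixed <- function(x, levels) table(factor(x, levels = levels))

# 'levels' is forwarded to tab_fixed through the '...' of aggregate.
res <- aggregate(df[1:2], list(df$C), tab_fixed, levels = c(0, 1))
res
```

The levels have to be chosen by hand here (0 and 1), which is the price for not depending on which values happen to occur in each group.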


Best regards,
Alain Guillet

--
Alain Guillet
Statistician and Computer Scientist

SMCS - IMMAQ - Université catholique de Louvain
http://www.uclouvain.be/smcs

Bureau c.316
Voie du Roman Pays, 20 (bte L1.04.01)
B-1348 Louvain-la-Neuve
Belgium

Tel: +32 10 47 30 50

Directions: http://www.uclouvain.be/323631.html
