> On May 23, 2016, at 11:46 AM, Ernest Adrogué <e...@openmailbox.org> wrote:
> 
> Hello,
> 
> Currently it's possible to convert an object of class table to a data frame
> with as.data.frame.table(), but there's no ready-made function, AFAIK, to do
> the reverse operation, i.e. conversion of a data frame to a table.
> 
> Do you think it would be a good idea to add a data.frame method to
> as.table(), to allow such conversions?
> 
> The idea is that if `x' is a table and `y <- as.data.frame(x)', then the
> object returned by `as.table(y)' should be equal to `x'.
> 
> Below is a proof of concept
> 
> as.table.data.frame <- function(x, ..., response) {
>    if (missing(response))
>        resp <- ncol(x)
>    else {
>        resp <- match(response[1L], names(x))
>        if (is.na(resp))
>            stop('not found: ', response[1L])
>    }
>    x[-resp] <- lapply(x[-resp], as.factor)
>    if (any(do.call(table, x[-resp]) > 1L))
>        stop('repeated frequency value')
>    dn <- lapply(x[-resp], function(y) {
>        if (any(is.na(y))) c(levels(y), NA) else levels(y)
>    })
>    ind <- mapply(function(val, lev) match(val, lev), x[-resp], dn)
>    out <- array(dim=unlist(lapply(dn, length)), dimnames=dn)
>    out[ind] <- x[[resp]]
>    as.table(out)
> }
> 
> and a simple usage example:
> 
>> (y <- table(foo=c('a','a',NA,'b','b'), useNA='always'))
> foo
>   a    b <NA> 
>   2    2    1 
>> (yy <- as.data.frame(y))
>   foo Freq
> 1    a    2
> 2    b    2
> 3 <NA>    1
>> as.table(yy)
> foo
>   a    b <NA> 
>   2    2    1 
>> 
> 
> Any thoughts?

Hi,

I have not tried an exhaustive set of examples, but I believe that ?xtabs will 
get you most of the way there, with the exception of NA handling, which would 
require either nuanced use of ?addNA or preprocessing factors in the input data 
frame to possibly add NA as a factor level where needed.

BTW, from ?as.data.frame.table, see the reference there to xtabs() at the end:

"The as.data.frame method for objects inheriting from class "table" can be used 
to convert the array-based representation of a contingency table to a data 
frame containing the classifying factors and the corresponding entries (the 
latter as component named by responseName). This is the inverse of xtabs."


For example, from ?xtabs, using the ?UCBAdmissions dataset, which is a 3 
dimension table:

> str(UCBAdmissions)
 table [1:2, 1:2, 1:6] 512 313 89 19 353 207 17 8 120 205 ...
 - attr(*, "dimnames")=List of 3
  ..$ Admit : chr [1:2] "Admitted" "Rejected"
  ..$ Gender: chr [1:2] "Male" "Female"
  ..$ Dept  : chr [1:6] "A" "B" "C" "D" ...

DF <- as.data.frame(UCBAdmissions)

> str(DF)
'data.frame':   24 obs. of  4 variables:
 $ Admit : Factor w/ 2 levels "Admitted","Rejected": 1 2 1 2 1 2 1 2 1 2 ...
 $ Gender: Factor w/ 2 levels "Male","Female": 1 1 2 2 1 1 2 2 1 1 ...
 $ Dept  : Factor w/ 6 levels "A","B","C","D",..: 1 1 1 1 2 2 2 2 3 3 ...
 $ Freq  : num  512 313 89 19 353 207 17 8 120 205 ...


# Using your function
DF.Table <- as.table.data.frame(DF)

> str(DF.Table)
 table [1:2, 1:2, 1:6] 512 313 89 19 353 207 17 8 120 205 ...
 - attr(*, "dimnames")=List of 3
  ..$ Admit : chr [1:2] "Admitted" "Rejected"
  ..$ Gender: chr [1:2] "Male" "Female"
  ..$ Dept  : chr [1:6] "A" "B" "C" "D" ...


# Using xtabs()
DF.xtabs <- xtabs(Freq ~ ., data = DF)

> str(DF.xtabs)
 xtabs [1:2, 1:2, 1:6] 512 313 89 19 353 207 17 8 120 205 ...
 - attr(*, "dimnames")=List of 3
  ..$ Admit : chr [1:2] "Admitted" "Rejected"
  ..$ Gender: chr [1:2] "Male" "Female"
  ..$ Dept  : chr [1:6] "A" "B" "C" "D" ...
 - attr(*, "class")= chr [1:2] "xtabs" "table"
 - attr(*, "call")= language xtabs(formula = Freq ~ ., data = DF)


Note that DF.xtabs has additional attributes set as a result of the use of 
xtabs().

In the example that you provided above, you would need to use something along 
the lines of:

> xtabs(Freq ~ addNA(foo), data = yy)
addNA(foo)
   a    b <NA> 
   2    2    1 

so that xtabs() includes the NA level, or for a larger data frame with a lot of 
columns, pre-process the columns so that NA is included in the factor levels 
where you desire.

That latter issue with NA's and xtabs() BTW, has bitten a lot of people over 
the years, where the recommendation to use:

> xtabs(Freq ~ foo, data = yy, exclude = NULL, na.action = na.pass)
foo
a b 
2 2 

does not actually work as believed.

Regards,

Marc Schwartz

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Reply via email to