Jan -
    Here's one way:

tbl = data.frame(id=c(1,1,1,1,1,2,2,2,2,2),
                 
text=c('this','is','the','first','row','this','is','the','second','row'))
xtabs(~id+text,tbl)
  text
id  first is row second the this
  1     1  1   1      0   1    1
  2     0  1   1      1   1    1

It's a bit tricky to automatically get the column headings to be in the order you want. This comes close:
tbl$text = factor(tbl$text,levels=tbl$text[!duplicated(tbl$text)])
xtabs(~id+text,tbl)
  text
id  this is the first row second
  1    1  1   1     1   1      0
  2    1  1   1     0   1      1

Hope this helps.
                                        - Phil Spector
                                         Statistical Computing Facility
                                         Department of Statistics
                                         UC Berkeley
                                         spec...@stat.berkeley.edu



On Tue, 2 Mar 2010, Jan Hornych wrote:

Hi all,

I have a table in database that is very long and when simplified it has only
two columns in it (id, text). id is the row, and text is the column.
Technically the text is a term and and id is the document.
If simplifying this and assuming there is only one occurrence of the term
per the document. I shall be able to convert this into a binary matrix.
Table looks like this...

*ID** **Text*
------------
1 this
1 is
1 the
1 first
1 row
2 this
2 is
2 the
2 send
2 row
...


in R I would like to have it as

*id  this is the first second row*
------------------------------------------------
1     1  1   1     1          0     1
2     1  1   1     0          1     1

it would be simpler for me to do this transformation in R as I guess the
language is more handy as the SQL. The table in R have few dozen thousand of
columns and rows as well. I know how to read the data from database, but
just unsure if there is some suitable transformation available.

Thank you
Jan

        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to