On Sun, Feb 26, 2012 at 11:39:01AM -0800, mari681 wrote:
> SORRY!
>
> The data in MyTable are tagsets of photos, like this:
>
>           V1         V2       V3      V4      V5       V6        V7   V8
> 230    green nailpolish   barrym       0       0        0         0    0
> 231       ny      green brooklyn cleanup   clean  gowanus volunteer  gcc
> 232    green       saul  lecture       0       0        0         0    0
> 233    green     colors    cores  market colores marakesh   mercado malu
> 234       ny      green brooklyn cleanup   clean  gowanus volunteer  gcc
> 235    green       saul  lecture       0       0        0         0    0
> 236 portrait        pet    white   green     cat    canon    square  eos
>
>                          V9   V10  V11      V12 V13 V14 V15
> 230                       0     0    0        0   0   0   0
> 231 gowanuscanalconservancy     0    0        0   0   0   0
> 232                       0     0    0        0   0   0   0
> 233               malugreen maroc souk marrocos   0   0   0
> 234 gowanuscanalconservancy     0    0        0   0   0   0
> 235                       0     0    0        0   0   0   0
> 236                      is  eyes mark   taiwan  ii mk2  5d
>
>
> while data of MyVector is a list of tags (none of the columns in particular)
> whose frequency in MyTable has to be computed. Like this:
>
> [1] "life" "wood" "pink" "house" "green" "fall"

Hi.

Just to be sure, in all the previous solutions, "malugreen" is not
an occurrence of "green". Is this correct?

> MyTable has 21 millions rows and 15 columns, and the data is "character",
> they are words.

Do you use the argument stringsAsFactors=FALSE when reading the data
from a file? Otherwise, character data are converted to factors.
The discussed solutions work in both cases. However, if we prepare
simplified data for testing efficiency, we should use the same column
class as in the real situation.

> When I tried the loop my computer crashed in the meaning that it freezed
> (froze?) and didn't allow me to do anything. The morning after I forced it
> off and rebooted.

This does not look like the consequence of a computation that simply
takes too long. A possible cause is that the memory requirements are
too large. How much memory does the R process use after loading the
data? Try the gc() command after loading the data and compare the
result with the amount of memory available. On a Linux machine, it is
also possible to see the memory usage with the "top" command, in the
row where R is listed.

Petr.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
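
To make the points above concrete, here is a minimal sketch on toy data. It is not one of the solutions posted earlier in the thread. The small MyTable below is made up for illustration and only mimics the structure shown in the quoted post. It assumes that exact matching is what is wanted (so "malugreen" does not count as "green") and that the columns are character, as they would be when read with stringsAsFactors=FALSE.

```r
# Toy stand-in for MyTable (hypothetical values, same shape as in the post).
MyTable <- data.frame(
  V1 = c("green", "ny", "green", "portrait"),
  V2 = c("nailpolish", "green", "colors", "pet"),
  V3 = c("barrym", "brooklyn", "malugreen", "green"),
  stringsAsFactors = FALSE
)
MyVector <- c("life", "wood", "pink", "house", "green", "fall")

# Flatten all columns into one character vector; as.character() also
# covers the case where the columns were read as factors.
all_tags <- unlist(lapply(MyTable, as.character), use.names = FALSE)

# Exact matches only: factor() with fixed levels turns every tag not in
# MyVector into NA (dropped by table()) and keeps zero counts.
counts <- table(factor(all_tags, levels = MyVector))
print(counts)

# Memory check after loading the data, as suggested above.
print(gc())
print(object.size(MyTable), units = "Mb")
```

On the real 21-million-row table, the flattened vector is itself a large object, so it is worth comparing the gc() output with the available memory before running this.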