I'm having a really difficult time understanding what you're trying to get -- copying and pasting your code fails to run, and your question isn't clear, i.e.:
"For each phone call that BEGINS with the module which is denoted by 81 (i.e. of the form 81X,XXX), what is the expected number of modules in these calls?"

How does one calculate the expected number of "modules" in this module? What does that even mean?

Anyway, here's some code using your `data` data.frame that calculates the number of unique calls and other statistics on the "call id" within each module prefix. I'm using both data.table and plyr ... there are no for loops. You will want to do `whatever it is you really want to do` inside the "blocks" below.

## R code
library(data.table)

data <- transform(data, module.prefix=substring(modules, 1, 2))
## take a look at `data` now

## calculate "stuff" inside each module.prefix using data.table
xx <- data.table(data, key="module.prefix")
ans <- xx[, {
  ## the columns of the particular subset of your data.table
  ## are "injected" into the scope for this expression block
  ## which is where the `calls` variable below comes from
  tabled <- table(as.character(calls))
  ## you will want to return your own list of "stuff"
  list(unique.calls=length(tabled), min=min(tabled),
       median=as.numeric(median(tabled)), max=max(tabled))
}, by='module.prefix']

## with plyr
library(plyr)
ans <- ddply(data, "module.prefix", function(x) {
  ## `x` is a data.frame whose rows all share the same module.prefix
  ## do whatever you want with it here
  tabled <- table(as.character(x$calls))
  c(unique.calls=length(tabled), min=min(tabled),
    median=median(tabled), max=max(tabled))
})

You'll have to read up on the particulars of data.table and plyr. Both are really powerful packages ... you should get familiar with at least one. plyr is a bit more flexible in some ways. data.table is a bit more strict (cf.
the need for `as.numeric(median(tabled))`), but also tends to be (much) faster when working over large datasets.

HTH,
-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact