Hi,
This may also help:

someTags <- data.frame(tag_id = c(1, 2, 2, 3, 4, 5, 6, 6), lgth = 50*(1:8),
                       stage = factor(rep(".", 8), levels = c(".", "J")))
f2 <- function(x) {
    # use the argument x (not the global someTags) to build the condition
    needsChanging <- with(x, is.na(match(tag_id, tag_id[duplicated(tag_id)])) & lgth < 300)
    x$stage[needsChanging] <- "J"
    x
}
f2(someTags)
#  tag_id lgth stage
#1      1   50     J
#2      2  100     .
#3      2  150     .
#4      3  200     J
#5      4  250     J
#6      5  300     .
#7      6  350     .
#8      6  400     .

A.K.

----- Original Message -----
From: William Dunlap <wdun...@tibco.com>
To: Guillaume2883 <guillaume.bal....@gmail.com>; "r-help@r-project.org" <r-help@r-project.org>
Cc:
Sent: Friday, August 10, 2012 8:02 PM
Subject: Re: [R] vectorization condition counting

Your
    sum(tag_id==tag_id[i])==1
meaning tag_id[i] is the only entry with its value, may be vectorized by the sneaky idiom
    !(duplicated(tag_id,fromLast=FALSE) | duplicated(tag_id,fromLast=TRUE))
Hence f0() (with your code in a loop) and f1() are equivalent:

f0 <- function (tags) {
    for (i in seq_len(nrow(tags))) {
        if (sum(tags$tag_id == tags$tag_id[i]) == 1 & tags$lgth[i] < 300) {
            tags$stage[i] <- "J"
        }
    }
    tags
}
f1 <- function (tags) {
    # TRUE where tag_id is unique (not a duplicate from either direction) and lgth < 300
    needsChanging <- with(tags, !(duplicated(tag_id, fromLast = FALSE) |
        duplicated(tag_id, fromLast = TRUE)) & lgth < 300)
    tags$stage[needsChanging] <- "J"
    tags
}

E.g.,

> someTags <- data.frame(tag_id = c(1, 2, 2, 3, 4, 5, 6, 6), lgth = 50*(1:8),
+     stage=factor(rep(".",8), levels=c(".","J")))
> all.equal(f0(someTags), f1(someTags))
[1] TRUE
> f1(someTags)
  tag_id lgth stage
1      1   50     J
2      2  100     .
3      2  150     .
4      3  200     J
5      4  250     J
6      5  300     .
7      6  350     .
8      6  400     .

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Guillaume2883
> Sent: Friday, August 10, 2012 3:47 PM
> To: r-help@r-project.org
> Subject: [R] vectorization condition counting
>
> Hi all,
>
> I am working on a really big dataset and I would like to vectorize a
> condition in an if statement inside a loop to improve speed.
>
> The original loop with the condition is currently written as follows:
>
> if(sum(as.integer(tags$tag_id==tags$tag_id[i]))==1&tags$lgth[i]<300){
>
>     tags$stage[i]<-"J"
>
> }
>
> Do you have any ideas? I was unable to do it correctly.
> Thanking you in advance for your help,
>
> Guillaume
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/vectorization-condition-counting-tp4639992.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
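
For comparison, the original condition sum(tag_id == tag_id[i]) == 1 can also be read literally as "this tag_id occurs exactly once", so one more possible vectorization is to count occurrences directly with match() and tabulate(). This is only a sketch: the function name f3 and the intermediate names idx and n are made up for illustration and do not appear in the posts above; the test data is the same someTags used in the thread.

```r
# Hypothetical helper (f3 is not from the thread): mark rows whose tag_id
# occurs exactly once and whose lgth is below 300.
f3 <- function(tags) {
    idx <- match(tags$tag_id, unique(tags$tag_id))  # group index for each row
    n <- tabulate(idx)[idx]                         # how often each row's tag_id occurs
    needsChanging <- n == 1 & tags$lgth < 300
    tags$stage[needsChanging] <- "J"
    tags
}

someTags <- data.frame(tag_id = c(1, 2, 2, 3, 4, 5, 6, 6), lgth = 50*(1:8),
                       stage = factor(rep(".", 8), levels = c(".", "J")))
res <- f3(someTags)
print(res)
```

On this small example it should flag the same rows as the loop (rows 1, 4 and 5); whether it is faster than the duplicated() idiom on a really big dataset would need to be timed on the actual data.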