Hi Santiago,

Please keep the conversation on the list. Others may have better ideas.
I am still missing the reasoning. Merge seems to me to be the solution, but I am lost in your reasoning about what to keep and what to discard from the resulting object.

After the merge I have this:

result <- structure(list(Ring = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
.Label = c("6106933", "6134701", "6140497", "6140719", "6140756",
"6140855", "6143070", "6143090", "6143093", "6175711", "6175726",
"6175730", "6175769", "6175776", "6175784", "6188609", "6188705",
"6195159", "6195171", "6198153", "6198154", "6198156", "6198157",
"6198172"), class = "factor"), jul = c(15135, 15135, 15135, 15135,
15135, 15135, 15135, 15135, 15135, 15135, 15135, 15135, 15135, 15135,
15135, 15135, 15135, 15135, 15135, 15135, 15135, 15135, 15135, 15135),
timepos = structure(c(1307680575, 1307680740, 1307681040, 1307681340,
1307681640, 1307681940, 1307682240, 1307682540, 1307682780, 1307683080,
1307683380, 1307683680, 1307683980, 1307684280, 1307684397, 1307684424,
1307684484, 1307684490, 1307684580, 1307684880, 1307685180, 1307685243,
1307685321, 1307685336), class = c("POSIXct", "POSIXt"), tzone = "GMT"),
act = c(3822L, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
27L, 60L, 6L, 753L, NA, NA, NA, 78L, 15L, 18L), wd = c("dry", NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "wet", "dry", "wet", "dry",
NA, NA, NA, "wet", "dry", "wet")), .Names = c("Ring", "jul", "timepos",
"act", "wd"), row.names = c(NA, -24L), class = "data.frame")

> result
      Ring   jul             timepos  act   wd
1  6106933 15135 2011-06-10 04:36:15 3822  dry
2  6106933 15135 2011-06-10 04:39:00   NA <NA>
3  6106933 15135 2011-06-10 04:44:00   NA <NA>
4  6106933 15135 2011-06-10 04:49:00   NA <NA>
5  6106933 15135 2011-06-10 04:54:00   NA <NA>
6  6106933 15135 2011-06-10 04:59:00   NA <NA>
7  6106933 15135 2011-06-10 05:04:00   NA <NA>
8  6106933 15135 2011-06-10 05:09:00   NA <NA>
9  6106933 15135 2011-06-10 05:13:00   NA <NA>
10 6106933 15135 2011-06-10 05:18:00   NA <NA>
11 6106933 15135 2011-06-10 05:23:00   NA <NA>
12 6106933 15135 2011-06-10 05:28:00   NA <NA>
13 6106933 15135 2011-06-10 05:33:00   NA <NA>
14 6106933 15135 2011-06-10 05:38:00   NA <NA>
15 6106933 15135 2011-06-10 05:39:57   27  wet
16 6106933 15135 2011-06-10 05:40:24   60  dry
17 6106933 15135 2011-06-10 05:41:24    6  wet
18 6106933 15135 2011-06-10 05:41:30  753  dry
19 6106933 15135 2011-06-10 05:43:00   NA <NA>
20 6106933 15135 2011-06-10 05:48:00   NA <NA>
21 6106933 15135 2011-06-10 05:53:00   NA <NA>
22 6106933 15135 2011-06-10 05:54:03   78  wet
23 6106933 15135 2011-06-10 05:55:21   15  dry
24 6106933 15135 2011-06-10 05:55:36   18  wet

I understand you want to keep only the time values from the GPS data.frame. OK, this can be done in the last step. But I am a bit lost in the logic for discarding lines 15-18. Anyway, this may be what you want:

library(zoo)
result$wd <- na.locf(result$wd)
final <- result[is.na(result$act), ]

> final
      Ring   jul             timepos act  wd
2  6106933 15135 2011-06-10 04:39:00  NA dry
3  6106933 15135 2011-06-10 04:44:00  NA dry
4  6106933 15135 2011-06-10 04:49:00  NA dry
5  6106933 15135 2011-06-10 04:54:00  NA dry
6  6106933 15135 2011-06-10 04:59:00  NA dry
7  6106933 15135 2011-06-10 05:04:00  NA dry
8  6106933 15135 2011-06-10 05:09:00  NA dry
9  6106933 15135 2011-06-10 05:13:00  NA dry
10 6106933 15135 2011-06-10 05:18:00  NA dry
11 6106933 15135 2011-06-10 05:23:00  NA dry
12 6106933 15135 2011-06-10 05:28:00  NA dry
13 6106933 15135 2011-06-10 05:33:00  NA dry
14 6106933 15135 2011-06-10 05:38:00  NA dry
19 6106933 15135 2011-06-10 05:43:00  NA dry
20 6106933 15135 2011-06-10 05:48:00  NA dry
21 6106933 15135 2011-06-10 05:53:00  NA dry
>

Regards
Petr

From: Santiago Guallar [mailto:sgual...@yahoo.com]
Sent: Tuesday, July 09, 2013 10:02 PM
To: PIKAL Petr
Subject: Re: [R] spped up a function

Dear Petr,

I wanted the two data sets merged in such a way that the values of the 'wd' vector (from the intervals t of 'xact') are assigned to the corresponding intervals of 'GPS'.
If there is more than one value (i.e. if there is more than one interval of 'xact' for the corresponding interval of 'GPS'), then take the maximum (i.e. the value of the interval of 'xact' closest to the corresponding interval of 'GPS'). This is why the output of the particular sequence of the result I copied in the previous message contains only 'dry'.

Santi

From: PIKAL Petr <petr.pi...@precheza.cz>
To: Santiago Guallar <sgual...@yahoo.com>; r-help <r-help@r-project.org>
Sent: Tuesday, July 9, 2013 11:19 AM
Subject: RE: [R] spped up a function

Hi Santiago,

I am a bit confused about how your result is organised and why there are only 'dry' values regardless of the timepos values.

It is not necessary to attach files resulting from dput. Just copy the output into your mail and anybody can copy it directly into R.

Ring is a factor in xact but numeric in GPS:

> str(xact)
'data.frame':   8 obs. of  5 variables:
 $ Ring   : Factor w/ 24 levels "6106933","6134701",..: 1 1 1 1 1 1 1 1
 $ jul    : num  15135 15135 15135 15135 15135 ...
 $ timepos: POSIXct, format: "2011-06-10 04:36:15" "2011-06-10 05:39:57" ...
 $ act    : int  3822 27 60 6 753 78 15 18
 $ wd     : chr  "dry" "wet" "dry" "wet" ...
> str(GPS)
'data.frame':   16 obs. of  3 variables:
 $ Ring   : int  6106933 6106933 6106933 6106933 6106933 6106933 6106933 6106933 6106933 6106933 ...
 $ jul    : num  15135 15135 15135 15135 15135 ...
 $ timepos: POSIXct, format: "2011-06-10 04:39:00" "2011-06-10 04:44:00" ...

So I first changed it to a factor in both.
GPS$Ring <- factor(GPS$Ring)

After that I merged both data frames

result <- merge(xact, GPS, all = TRUE)

and here is the result:

dput(result)
structure(list(Ring = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
.Label = c("6106933", "6134701", "6140497", "6140719", "6140756",
"6140855", "6143070", "6143090", "6143093", "6175711", "6175726",
"6175730", "6175769", "6175776", "6175784", "6188609", "6188705",
"6195159", "6195171", "6198153", "6198154", "6198156", "6198157",
"6198172"), class = "factor"), jul = c(15135, 15135, 15135, 15135,
15135, 15135, 15135, 15135, 15135, 15135, 15135, 15135, 15135, 15135,
15135, 15135, 15135, 15135, 15135, 15135, 15135, 15135, 15135, 15135),
timepos = structure(c(1307680575, 1307680740, 1307681040, 1307681340,
1307681640, 1307681940, 1307682240, 1307682540, 1307682780, 1307683080,
1307683380, 1307683680, 1307683980, 1307684280, 1307684397, 1307684424,
1307684484, 1307684490, 1307684580, 1307684880, 1307685180, 1307685243,
1307685321, 1307685336), class = c("POSIXct", "POSIXt"), tzone = "GMT"),
act = c(3822L, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
27L, 60L, 6L, 753L, NA, NA, NA, 78L, 15L, 18L), wd = c("dry", NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "wet", "dry", "wet", "dry",
NA, NA, NA, "wet", "dry", "wet")), .Names = c("Ring", "jul", "timepos",
"act", "wd"), row.names = c(NA, -24L), class = "data.frame")

There are empty values in the act and wd columns. You can fill them, e.g., with the 'na.locf' function from the 'zoo' package.

> result$wd
 [1] "dry" NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
[13] NA    NA    "wet" "dry" "wet" "dry" NA    NA    NA    "wet" "dry" "wet"
> na.locf(result$wd)
 [1] "dry" "dry" "dry" "dry" "dry" "dry" "dry" "dry" "dry" "dry" "dry" "dry"
[13] "dry" "dry" "wet" "dry" "wet" "dry" "dry" "dry" "dry" "wet" "dry" "wet"
>

Is this what you want?
Regards
Petr

From: Santiago Guallar [mailto:sgual...@yahoo.com]
Sent: Tuesday, July 09, 2013 8:53 AM
To: PIKAL Petr; r-help
Subject: Re: [R] spped up a function

Hi Petr,

Yes, the function basically consists of merging two time series with different time intervals: one regular, 'GPS', and one irregular, 'xact' (the latter containing the binomial variable 'wd' that I want to add to 'GPS').

Apparently my attachments did not go through. Here you have the dputs you requested plus the desired result based on them:

head(xact)
   Ring   jul             timepos  act  wd
6106933 15135 2011-06-10 04:36:15 3822 dry
6106933 15135 2011-06-10 05:39:57   27 wet
6106933 15135 2011-06-10 05:40:24   60 dry
6106933 15135 2011-06-10 05:41:24    6 wet
6106933 15135 2011-06-10 05:41:30  753 dry
6106933 15135 2011-06-10 05:54:03   78 wet
6106933 15135 2011-06-10 05:55:21   15 dry
6106933 15135 2011-06-10 05:55:36   18 wet

head(GPS1, 16) and desired result (added column wd)
      Ring   jul             timepos  wd
5  6106933 15135 2011-06-10 04:39:00 dry
6  6106933 15135 2011-06-10 04:44:00 dry
7  6106933 15135 2011-06-10 04:49:00 dry
8  6106933 15135 2011-06-10 04:54:00 dry
9  6106933 15135 2011-06-10 04:59:00 dry
10 6106933 15135 2011-06-10 05:04:00 dry
11 6106933 15135 2011-06-10 05:09:00 dry
12 6106933 15135 2011-06-10 05:13:00 dry
13 6106933 15135 2011-06-10 05:18:00 dry
14 6106933 15135 2011-06-10 05:23:00 dry
15 6106933 15135 2011-06-10 05:28:00 dry
16 6106933 15135 2011-06-10 05:33:00 dry
17 6106933 15135 2011-06-10 05:38:00 dry
18 6106933 15135 2011-06-10 05:43:00 dry
19 6106933 15135 2011-06-10 05:48:00 dry
20 6106933 15135 2011-06-10 05:53:00 dry

Santi

________________________________
From: PIKAL Petr <petr.pi...@precheza.cz>
To: Santiago Guallar <sgual...@yahoo.com>; r-help <r-help@r-project.org>
Sent: Monday, July 8, 2013 11:34 AM
Subject: RE: [R] spped up a function

Hi,

It seems to me that you basically want merge, but I may be missing the point.
Try posting

dput(head(xact))
dput(head(GPS))

and what the desired result should be, based on those 2 datasets.

Regards
Petr

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of Santiago Guallar
> Sent: Tuesday, July 02, 2013 7:47 PM
> To: r-help
> Subject: [R] spped up a function
>
> Hi,
>
> I have written a function to assign the values of a certain variable
> 'wd' from a dataset to another dataset. Both contain data from the
> same time period but differ in the length of their time intervals:
> 'GPS' has regular 10-minute intervals whereas 'xact' has irregular
> intervals. I attached simplified text versions from write.table. You
> can also get a dput of 'xact' at this address:
> http://www.megafileupload.com/en/file/431569/xact-dput.html
> The original objects are large and the function takes almost one hour
> to finish.
> Here's the function:
>
> fxG= function(xact, GPS){
> l <- rep( 'A', nrow(GPS) )
> v <- unique(GPS$Ring) # the process is carried out for several individuals identified by 'Ring'
> for(k in 1:length(v) ){
> I = v[k]
> df <- xact[xact$Ring == I,]
> for(i in 1:nrow(GPS)){
> if(GPS[i,]$Ring== I){
> # the code runs along the whole data.frame for each i; it'd save time
> # to make it stop with the last record of each i instead
> u <- df$timepos <= GPS[i,]$timepos
> # fill vector l for each interval t from xact <= each interval from GPS
> # (take the max if there's > 1 interval)
> l[i] <- df[max( which(u == TRUE) ),]$wd
> } } }
> return(l)}
>
> vwd <- fxG(xact, GPS)
>
>
> My question is: how can I speed up (optimize) this function?
>
> Thank you for your help
>
> [[alternative HTML version deleted]]
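For the original speed question, one possible vectorised alternative to the double loop is sketched here. It is only a sketch and not code posted by anyone in this thread: the helper names assign_wd and to_time are made up for illustration, and it assumes that 'Ring' can be compared as character in both data frames and that 'timepos' is POSIXct in both. Base R's findInterval() returns, for each GPS time, the index of the last xact time that is less than or equal to it, which is the same index that max(which(u == TRUE)) finds inside the loop. The two small data frames are rebuilt from the dput values shown in the thread (the 'jul' and 'act' columns are left out because the lookup does not use them).

```r
# Hypothetical helper (name and structure are assumptions, not from the thread):
# for every GPS row, look up 'wd' of the last xact record at or before it.
assign_wd <- function(xact, GPS) {
  out <- rep(NA_character_, nrow(GPS))
  gps_ring  <- as.character(GPS$Ring)   # compare rings as character so that
  xact_ring <- as.character(xact$Ring)  # factor vs. integer does not matter
  for (id in unique(gps_ring)) {
    g <- which(gps_ring == id)
    x <- xact[xact_ring == id, , drop = FALSE]
    x <- x[order(x$timepos), , drop = FALSE]  # findInterval needs sorted breaks
    idx <- findInterval(as.numeric(GPS$timepos[g]), as.numeric(x$timepos))
    idx[idx == 0L] <- NA                # GPS time earlier than any xact record
    out[g] <- as.character(x$wd)[idx]
  }
  out
}

# Small helper to turn the epoch seconds from the dput into POSIXct (GMT).
to_time <- function(s) as.POSIXct(s, origin = "1970-01-01", tz = "GMT")

# Example data rebuilt from the dput values in the thread.
xact <- data.frame(
  Ring = factor(rep("6106933", 8)),
  timepos = to_time(c(1307680575, 1307684397, 1307684424, 1307684484,
                      1307684490, 1307685243, 1307685321, 1307685336)),
  wd = c("dry", "wet", "dry", "wet", "dry", "wet", "dry", "wet"),
  stringsAsFactors = FALSE
)
GPS <- data.frame(
  Ring = rep(6106933L, 16),
  timepos = to_time(c(1307680740, 1307681040, 1307681340, 1307681640,
                      1307681940, 1307682240, 1307682540, 1307682780,
                      1307683080, 1307683380, 1307683680, 1307683980,
                      1307684280, 1307684580, 1307684880, 1307685180))
)

vwd <- assign_wd(xact, GPS)
GPS$wd <- vwd
```

On this example every one of the 16 GPS rows gets 'dry', the same as the desired result posted earlier in the thread. The loop now runs once per individual rather than once per GPS row per individual, so it should scale much better on the large original objects, although that has not been timed here.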
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.