Time to do your own homework by working through an R tutorial or two. There are many on the web -- or see the Intro to R tutorial that ships with R.
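For readers following the thread, here is one way to act on the ?tapply and ?unique hint for the per-period location count asked about in the quoted message. It is only a sketch: it assumes the data frame tab posted in that message (columns time and S1), the names n_loc and result are illustrative, and it is one of several possible answers, not necessarily the one Bert had in mind.

```r
# Data as posted in the quoted message
tab <- read.table(text = "time S1 rep
1 1 1
1 2 1
1 2 2
2 1 1
2 1 2
2 1 3
2 1 4
3 1 1
3 2 1
3 3 1", header = TRUE)

# Number of distinct S1 values (locations) within each time period
n_loc <- tapply(tab$S1, tab$time, function(x) length(unique(x)))

# Reshape into the two-column layout asked for in the question
# (n_loc and result are illustrative names, not from the thread)
result <- data.frame(time = as.integer(names(n_loc)),
                     S1 = as.vector(n_loc))
print(result)
```

For the posted data this should give 2, 1 and 3 locations for times 1, 2 and 3, matching the table in the question. aggregate(S1 ~ time, data = tab, FUN = function(x) length(unique(x))) is an equivalent base R alternative.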
?tapply
?unique

is one of many answers to your query.

Cheers,
Bert

Bert Gunter

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
   -- Clifford Stoll


On Sat, Nov 21, 2015 at 11:52 AM, Ashta <sewa...@gmail.com> wrote:
> Hi Bert and all,
> I have a related question. In each time period there were different
> locations where the samples were collected (S1). I want to count the
> number of unique locations (S1) for each unique time period. So in
> time 1 the samples were collected from two locations, in time 2 only
> from one location, and in time 3 from three locations.
>
> tab <- read.table(textConnection(" time S1 rep
> 1 1 1
> 1 2 1
> 1 2 2
> 2 1 1
> 2 1 2
> 2 1 3
> 2 1 4
> 3 1 1
> 3 2 1
> 3 3 1 "),header = TRUE)
>
> What I want is
>
> time S1
>    1  2
>    2  1
>    3  3
>
> Thank you again.
>
> On Sat, Nov 21, 2015 at 1:30 PM, Ashta <sewa...@gmail.com> wrote:
>> Thank you Bert!
>>
>> What I want is at least 500 samples based on random sampling of time
>> periods. This allows samples collected in the same time period to be
>> included together.
>>
>> Your script is doing what I wanted to do!!
>>
>> Many thanks
>>
>> On Sat, Nov 21, 2015 at 1:15 PM, Bert Gunter <bgunter.4...@gmail.com> wrote:
>>> David's "solution" is incorrect. It can also fail to give you times
>>> with a total of 500 items to sample from in the time periods.
>>>
>>> It is not entirely clear what you want. The solution below gives you a
>>> random sample of time periods in which X1>0 and the total number of
>>> samples among them is >= 500. It does not give you the fewest number
>>> of periods that can do this. Is this what you want?
>>>
>>> tab[with(tab,{
>>>   rownums<- sample(seq_len(nrow(tab))[X1>0])
>>>   sz <- cumsum(X2[rownums])
>>>   rownums[c(TRUE,sz<500)]
>>> }),]
>>>
>>> Cheers,
>>> Bert
>>>
>>>
>>> Bert Gunter
>>>
>>> "Data is not information. Information is not knowledge. And knowledge
>>> is certainly not wisdom."
>>>    -- Clifford Stoll
>>>
>>>
>>> On Sat, Nov 21, 2015 at 10:56 AM, Ashta <sewa...@gmail.com> wrote:
>>>> Thank you David!
>>>>
>>>> I reran your script and it is giving me the first three time periods.
>>>> Is it doing random sampling?
>>>>
>>>> tab.fan
>>>>   time X1  X2
>>>> 2    2  5 230
>>>> 3    3  1 300
>>>> 5    5  2  10
>>>>
>>>> On Sat, Nov 21, 2015 at 12:20 PM, David L Carlson <dcarl...@tamu.edu>
>>>> wrote:
>>>>> Use dput() to send data to the list as it is more compact:
>>>>>
>>>>>> dput(tab)
>>>>> structure(list(time = 1:8, X1 = c(0L, 5L, 1L, 0L, 2L, 3L, 1L,
>>>>> 4L), X2 = c(251L, 230L, 300L, 25L, 10L, 101L, 300L, 185L)),
>>>>> .Names = c("time", "X1", "X2"), class = "data.frame",
>>>>> row.names = c(NA, -8L))
>>>>>
>>>>> You can just remove the lines with X1 = 0 since you don't want to use
>>>>> them.
>>>>>
>>>>>> tab.sub <- tab[tab$X1>0, ]
>>>>>
>>>>> Then the following gives you a sample:
>>>>>
>>>>>> tab.sub[cumsum(sample(tab.sub$X2))<=500, ]
>>>>>
>>>>> Note that your "solution" of times 6, 7, and 8 will never appear because
>>>>> the sum of the values is 586.
>>>>>
>>>>>
>>>>> David L. Carlson
>>>>> Department of Anthropology
>>>>> Texas A&M University
>>>>>
>>>>> -----Original Message-----
>>>>> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Ashta
>>>>> Sent: Saturday, November 21, 2015 11:53 AM
>>>>> To: R help <r-help@r-project.org>
>>>>> Subject: [R] Conditional Random selection
>>>>>
>>>>> Hi all,
>>>>>
>>>>> I have a data set that contains samples collected over time. In
>>>>> each time period the total number of samples is given (X2). The goal
>>>>> is to select 500 random samples. The selection should be based on
>>>>> time (select time periods until I reach 500 samples). Also, the time
>>>>> period should have a value greater than 0 for the X1 variable. X1 is
>>>>> an indicator variable.
>>>>>
>>>>> Select "time" periods until the sum of X2 is > 500, using only
>>>>> periods where X1 is > 0
>>>>>
>>>>> tab <- read.table(textConnection(" time X1 X2
>>>>> 1 0 251
>>>>> 2 5 230
>>>>> 3 1 300
>>>>> 4 0 25
>>>>> 5 2 10
>>>>> 6 3 101
>>>>> 7 1 300
>>>>> 8 4 185 "),header = TRUE)
>>>>>
>>>>> In the above example, samples from time 1 and 4 will not be selected
>>>>> (X1 is zero).
>>>>> So I could reach my target by selecting time 6, 7, and 8 or time 2 and
>>>>> 3 and so on.
>>>>>
>>>>> Can anyone help me do that?
>>>>>
>>>>> ______________________________________________
>>>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide
>>>>> http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.
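
The selection rule described in the original question (use only periods with X1 > 0, take them in random order, and stop once the running total of X2 reaches 500) might be sketched as below. This is an illustration under stated assumptions, not code from any poster: the helper name pick_periods and its target argument are hypothetical, the rule "keep the period that first reaches the target" is one reading of the question, and like the script posted earlier in the thread it does not try to find the fewest periods.

```r
# Data as posted in the original question
tab <- read.table(text = "time X1 X2
1 0 251
2 5 230
3 1 300
4 0 25
5 2 10
6 3 101
7 1 300
8 4 185", header = TRUE)

# Hypothetical helper: name and arguments are illustrative only
pick_periods <- function(d, target = 500) {
  eligible <- which(d$X1 > 0)              # row numbers with X1 > 0
  if (sum(d$X2[eligible]) < target)
    stop("eligible periods hold fewer samples than the target")
  # Shuffle the eligible rows; sample.int() on the length avoids the
  # special behaviour of sample() when given a single number
  shuffled <- eligible[sample.int(length(eligible))]
  running <- cumsum(d$X2[shuffled])        # running total of X2
  # Keep periods up to and including the first one reaching the target
  n_keep <- which(running >= target)[1]
  d[shuffled[seq_len(n_keep)], ]
}

set.seed(1)                                # only to make the example repeatable
sel <- pick_periods(tab)
print(sel)
print(sum(sel$X2))
```

Each call without set.seed() can return a different set of periods. The explicit check on the eligible total avoids returning rows of NA when fewer than target samples are available.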