Re: [R] Strange behavior when sampling rows of a data frame

2020-06-19 Thread Sébastien Lahaie
Thank you all for the responses; these are the insights I was hoping for. There are many ways to get this right, and I happened to run into one that has a glitch. I see from Luke's explanation how the strange output came about. Glad to hear that this bug/behavior is already known.

On Fri, Jun 19,

Re: [R] Strange behavior when sampling rows of a data frame

2020-06-19 Thread Daniel Nordlund
On 6/19/2020 5:49 AM, Sébastien Lahaie wrote:

I ran into some strange behavior in R when trying to assign a treatment to rows in a data frame. I'm wondering whether any R experts can explain what's going on. First, let's assign a treatment to 3 out of 10 rows as follows.

df <- data.frame(unit =

Re: [R] Strange behavior when sampling rows of a data frame

2020-06-19 Thread William Dunlap via R-help
It is a bug that has been present in R since at least R-2.14.0 (the oldest that I have installed on my laptop).

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Fri, Jun 19, 2020 at 10:37 AM Rui Barradas wrote:
> Hello,
>
> Thanks, I hadn't thought of that.
>
> But, why? Is it evaluated once

Re: [R] Strange behavior when sampling rows of a data frame

2020-06-19 Thread Rui Barradas
Hello,

Thanks, I hadn't thought of that. But why? Is it evaluated once before assignment and a second time when the assignment occurs? Tracing both sample and `[<-` shows 2 calls to sample.

trace(sample)
trace(`[<-`)
df[sample(nrow(df), 3),]$treated <- TRUE
trace: sample(nrow(df), 3)
tra

Re: [R] Strange behavior when sampling rows of a data frame

2020-06-19 Thread William Dunlap via R-help
The first subscript argument is getting evaluated twice.

> trace(sample)
> set.seed(2020); df[i<-sample(10,3), ]$Treated <- TRUE
trace: sample(10, 3)
trace: sample(10, 3)
> i
[1] 1 10 4
> set.seed(2020); sample(10,3)
trace: sample(10, 3)
[1] 7 6 8
> sample(10,3)
trace: sample(10, 3)
[1] 1 10 4
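The double evaluation shown in the trace can be observed without trace(), and avoided by drawing the random index once. The following is only a sketch: `n_calls` and `pick_rows` are hypothetical names introduced for illustration, and the evaluation count may differ between R versions, so no specific count is claimed.

```r
# Hypothetical counter to observe how often the index expression runs.
n_calls <- 0
pick_rows <- function(n, k) {
  n_calls <<- n_calls + 1   # side effect: record each evaluation
  sample(n, k)
}

df <- data.frame(unit = 1:10, treated = FALSE)

# Nested replacement: the index expression is needed both to extract
# the rows and to write them back, so it can be evaluated more than once.
df[pick_rows(nrow(df), 3), ]$treated <- TRUE
n_calls   # the trace output in this thread suggests more than one call

# Workaround: evaluate the random index once, then reuse the stored value.
df2 <- data.frame(unit = 1:10, treated = FALSE)
idx <- sample(nrow(df2), 3)
df2[idx, ]$treated <- TRUE
sum(df2$treated)   # exactly three rows are marked
```

Because `idx` is a plain variable, evaluating it again yields the same rows, so the extraction and the write-back agree.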

Re: [R] Strange behavior when sampling rows of a data frame

2020-06-19 Thread Rui Barradas
Hello,

I don't know why this happens, but it looks like a bug. The question is where: in `[<-.data.frame` or in `[<-.default`? A solution is to subset the column vector and assign into it:

set.seed(2020)
df2 <- data.frame(unit = 1:10)
df2$treated <- FALSE
df2$treated[sample(nrow(df2), 3)] <
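The archived message is cut off before the assignment completes. A minimal sketch of the column-vector approach it describes might look like the following; the right-hand side `TRUE` is an assumption based on the rest of the thread, not text recovered from the message.

```r
set.seed(2020)
df2 <- data.frame(unit = 1:10)
df2$treated <- FALSE

# Assumed completion: index the column vector directly, so the sampled
# index appears only once in the replacement expression.
df2$treated[sample(nrow(df2), 3)] <- TRUE

# Check that three units are treated and the unit column is untouched.
sum(df2$treated)
all(df2$unit == 1:10)
```

Here the random subscript is applied to the `treated` vector rather than to the rows of the data frame, so there is no inner row extraction that would need the same index a second time.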