Hi,

in addition to tapply and aggregate, split combined with an *apply function could also be used:
sapply(with(df, split(z, y)), mean)

Cheers
Petr

> -----Original Message-----
> From: R-help <r-help-boun...@r-project.org> On Behalf Of Luigi Marongiu
> Sent: Wednesday, November 17, 2021 2:21 PM
> To: r-help <r-help@r-project.org>
> Subject: [R] vectorization of loops in R
>
> Hello,
> I have a dataframe with 3 variables. I want to loop through it to get
> the mean value of the variable `z`, as follows:
> ```
> df = data.frame(x = c(rep(1,5), rep(2,5), rep(3,5)),
>                 y = rep(letters[1:5],3),
>                 z = rnorm(15),
>                 stringsAsFactors = FALSE)
> m = vector()
> for (i in unique(df$y)) {
>   s = df[df$y == i,]
>   m = append(m, mean(s$z))
> }
> names(m) = unique(df$y)
> > (m)
>          a          b          c          d          e
> -0.6355382 -0.4218053 -0.7256680 -0.8320783 -0.2587004
> ```
> The problem is that I have one million `y` values, so the work takes
> almost a day. I understand that vectorization will speed up the
> procedure. But how shall I write the procedure in vectorial terms?
> Thank you
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
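
A minimal sketch, using the toy data from the original post, of how the split/sapply one-liner compares with tapply, aggregate and the original loop. The `set.seed()` call and the `m_split`, `m_tapply` and `m_agg` names are additions for illustration only and do not appear in the thread. The loop rescans the whole of `df$y` and regrows `m` on every iteration, which is presumably where the time goes with a million groups. The grouped calls avoid that repeated work.

```r
# Same toy data as in the original post; set.seed() is added here only so
# that the numbers are reproducible.
set.seed(1)
df <- data.frame(x = c(rep(1, 5), rep(2, 5), rep(3, 5)),
                 y = rep(letters[1:5], 3),
                 z = rnorm(15),
                 stringsAsFactors = FALSE)

# Original loop from the post, kept as the reference result
m <- vector()
for (i in unique(df$y)) {
  s <- df[df$y == i, ]
  m <- append(m, mean(s$z))
}
names(m) <- unique(df$y)

# split + sapply: split z into one vector per level of y, then average each
m_split <- sapply(with(df, split(z, y)), mean)

# tapply: returns a 1-d array named by group
m_tapply <- tapply(df$z, df$y, mean)

# aggregate: returns a data frame with one row per group
m_agg <- aggregate(z ~ y, data = df, FUN = mean)

# Check that each variant agrees with the loop
all.equal(m, m_split)
all.equal(unname(m), as.vector(m_tapply))
all.equal(unname(m), m_agg$z)
```

Note that split, tapply and aggregate order the groups by sorted factor level, while the loop uses order of first appearance. The two coincide for this toy data but need not in general.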