Incidentally, if you intend to use these means for further analytical purposes, you may produce irreproducible conclusions: after all, a mean of 10 things is more *meaningful* than a mean of 2. However, that is a discussion too far for this list. Consult your local statistical resources if you need to go there and need help.
Cheers,
Bert

On Mon, Sep 16, 2024 at 11:02 AM Bert Gunter <bgunter.4...@gmail.com> wrote:
>
> It's NA *not* Na. Details matter.
>
> Ah, but note:
>
> > mean(c(NA,NA), na.rm = TRUE)
> [1] NaN
>
> So if that might happen, you'll have to write your own mean function,
> say mymean(), to do what you want. I leave that (simple) pleasure to
> you.
>
> -- Bert
>
> On Mon, Sep 16, 2024 at 8:05 AM Francesca <francesca.panco...@gmail.com>
> wrote:
> >
> > All' Na Is Na.
> >
> >
> > On Mon, 16 Sep 2024, 16:29 Bert Gunter <bgunter.4...@gmail.com> wrote:
> >>
> >> See the na.rm argument of ?mean
> >>
> >> But what happens if all values are NA?
> >>
> >> -- Bert
> >>
> >>
> >> On Mon, Sep 16, 2024 at 7:24 AM Francesca <francesca.panco...@gmail.com>
> >> wrote:
> >> >
> >> > Sorry for posting unreadable code. On my screen the dataset
> >> > looked correct.
> >> >
> >> >
> >> > I recreated my dataset, following your example:
> >> >
> >> > test<-data.frame(cbind(
> >> >   c( 8, 8, 5, 5, NA, NA, 1, 15, 20, 5, NA, 17,
> >> >      2, 5, 5, 2, 5, NA, 5, 10, 10, 5, 12, NA),
> >> >   c( 18, 5, 5, 5, NA, 9, 2, 2, 10, 7, 5, 19,
> >> >      NA, 10, NA, 4, NA, 8, NA, 5, 10, 3, 17, NA),
> >> >   c( 4, 3, 3, 2, 2, 4, 3, 3, 2, 4, 4, 3, 4, 4, 4, 2,
> >> >      2, 3, 2, 3, 3, 2, 2, 4),
> >> >   c( 3, 8, 1, 2, 4, 2, 7, 6, 3, 5, 1, 3, 8, 4, 7, 5,
> >> >      8, 5, 1, 2, 4, 7, 6, 6)))
> >> > colnames(test) <-c("cp1","cp2","role","groupid")
> >> >
> >> > What I have done so far is the following, which works:
> >> > test %>%
> >> >   group_by(groupid) %>%
> >> >   mutate(across(starts_with("cp"), list(mean = mean)))
> >> >
> >> > But the problem is with NA: every time the mean encounters an NA, it
> >> > creates NA for all group members.
> >> > I need the software to calculate the mean ignoring NA. So when the group
> >> > is made of three people, the mean of the three.
> >> > If the group is two values and an NA, calculate the mean of the two.
> >> >
> >> > My code works: it computes a mean for each group of three subjects and
> >> > puts the group mean in each member's position, in place of the
> >> > individual value.
> >> > But when an NA appears, the whole group gets NA.
> >> >
> >> > Perhaps there is a different way to obtain the same result.
> >> >
> >> >
> >> >
> >> > On Mon, 16 Sept 2024 at 11:35, Rui Barradas <ruipbarra...@sapo.pt> wrote:
> >> >
> >> > > At 08:28 on 16/09/2024, Francesca wrote:
> >> > > > Dear Contributors,
> >> > > > I hope someone has encountered a similar issue.
> >> > > >
> >> > > > I have this data set,
> >> > > >
> >> > > >     cp1  cp2  role  groupid
> >> > > >  1   10   13     4        5
> >> > > >  2    5   10     3        1
> >> > > >  3    7    7     4        6
> >> > > >  4   10    4     2        7
> >> > > >  5    5    8     3        2
> >> > > >  6    8    7     4        4
> >> > > >  7    8    8     4        7
> >> > > >  8   10   15     3        3
> >> > > >  9   15   10     2        2
> >> > > > 10    5    5     2        4
> >> > > > 11   20   20     2        5
> >> > > > 12    9   11     3        6
> >> > > > 13   10   13     4        3
> >> > > > 14   12    6     4        2
> >> > > > 15    7    4     4        1
> >> > > > 16   10    0     3        7
> >> > > > 17   20   15     3        8
> >> > > > 18   10    7     3        4
> >> > > > 19    8   13     3        5
> >> > > > 20   10    9     2        6
> >> > > >
> >> > > > I need to average by group, using the values of column groupid, and
> >> > > > create a twin dataset in which each individual value is replaced by
> >> > > > the mean of its group.
> >> > > > So, for example, for groupid 3 I calculate the mean (12+18)/2 and then
> >> > > > put it in the new data frame, in the same positions, in place of 12
> >> > > > and 18.
> >> > > > I found this solution, where db10_means is the output dataset and db10
> >> > > > is my initial data.
> >> > > >
> >> > > > db10_means<-db10 %>%
> >> > > >   group_by(groupid) %>%
> >> > > >   mutate(across(starts_with("cp"), list(mean = mean)))
> >> > > >
> >> > > > It works perfectly except for NA values: it gives NA to all group
> >> > > > members, even when the group is made of some NAs and some values.
> >> > > > So, when I have a group of two values and one NA, I would like those
> >> > > > with a value to get the mean, and those with NA to keep the NA.
> >> > > > Here the mean function does not have the na.rm=T option set, and it
> >> > > > appears that it cannot be added in this case. I am not even sure that
> >> > > > this would be enough to solve my problem.
> >> > > > Thanks for any help provided.
> >> > > >
> >> > > Hello,
> >> > >
> >> > > Your data is a mess; please don't post HTML, this is a plain-text-only
> >> > > list.
> >> > > Anyway, I managed to create a data frame by copying the data to a
> >> > > file named "rhelp.txt" and then running
> >> > >
> >> > >
> >> > > db10 <- scan(file = "rhelp.txt", what = character())
> >> > > header <- db10[1:4]
> >> > > db10 <- db10[-(1:4)] |> as.numeric()
> >> > > db10 <- matrix(db10, ncol = 4L, byrow = TRUE) |>
> >> > >    as.data.frame() |>
> >> > >    setNames(header)
> >> > >
> >> > > str(db10)
> >> > > #> 'data.frame':    25 obs. of  4 variables:
> >> > > #>  $ cp1    : num  1 5 3 7 10 5 2 4 8 10 ...
> >> > > #>  $ cp2    : num  10 2 1 4 4 5 6 4 4 15 ...
> >> > > #>  $ role   : num  13 5 3 6 2 8 8 7 7 3 ...
> >> > > #>  $ groupid: num  4 10 7 4 7 3 7 8 8 3 ...
> >> > >
> >> > >
> >> > > And here is the data in dput format.
> >> > >
> >> > >
> >> > > db10 <-
> >> > >    structure(list(
> >> > >      cp1 = c(1, 5, 3, 7, 10, 5, 2, 4, 8, 10, 9, 2,
> >> > >              2, 20, 9, 13, 3, 4, 4, 10, 17, 8, 3, 13, 10),
> >> > >      cp2 = c(10, 2, 1, 4, 4, 5, 6, 4, 4, 15, 15, 10,
> >> > >              4, 2, 11, 10, 14, 2, 4, 0, 20, 18, 4, 3, 9),
> >> > >      role = c(13, 5, 3, 6, 2, 8, 8, 7, 7, 3, 10, 5,
> >> > >               11, 5, 3, 13, 12, 15, 1, 3, 15, 10, 19, 5, 2),
> >> > >      groupid = c(4, 10, 7, 4, 7, 3, 7, 8, 8, 3, 2, 5,
> >> > >                  20, 12, 6, 4, 6, 7, 16, 7, 3, 7, 8, 20, 6)),
> >> > >      class = "data.frame", row.names = c(NA, -25L))
> >> > >
> >> > >
> >> > > As for the problem, I am not sure if you want summarise instead of
> >> > > mutate but here is a summarise solution.
> >> > >
> >> > >
> >> > > library(dplyr)
> >> > >
> >> > > db10 %>%
> >> > >    group_by(groupid) %>%
> >> > >    summarise(across(starts_with("cp"), ~ mean(.x, na.rm = TRUE)))
> >> > >
> >> > > # same result, summarise's new argument .by avoids the need to group_by
> >> > > db10 %>%
> >> > >    summarise(across(starts_with("cp"), ~ mean(.x, na.rm = TRUE)),
> >> > >              .by = groupid)
> >> > >
> >> > >
> >> > > Can you post the expected output too?
> >> > >
> >> > > Hope this helps,
> >> > >
> >> > > Rui Barradas
> >> > >
> >> >
> >> >
> >> > --
> >> >
> >> > Francesca
> >> >
> >> >
> >> > ----------------------------------
> >> >
> >> >         [[alternative HTML version deleted]]
> >> >
> >> > ______________________________________________
> >> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> > https://stat.ethz.ch/mailman/listinfo/r-help
> >> > PLEASE do read the posting guide
> >> > https://www.R-project.org/posting-guide.html
> >> > and provide commented, minimal, self-contained, reproducible code.
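
For readers who want a starting point, here is a minimal sketch of the mymean() idea raised in the thread: a mean that ignores NA but returns NA rather than NaN when every value in a group is missing. It is only one possible approach. It uses base R's ave() instead of dplyr so that it runs without extra packages, and the data frame `toy` and the column names `cp1_mean` and `cp1_mean_keep_na` are made up for illustration; they do not come from the original posts.

```r
# Hypothetical helper: like mean(x, na.rm = TRUE), but returns NA
# (instead of NaN) when all values are missing.
mymean <- function(x) {
  if (all(is.na(x))) NA_real_ else mean(x, na.rm = TRUE)
}

# Made-up data: group 1 has one NA, group 2 is all NA, group 3 is complete.
toy <- data.frame(
  cp1     = c(8, NA, 5, NA, NA, 2, 4),
  groupid = c(1, 1, 1, 2, 2, 3, 3)
)

# ave() returns a vector as long as its input, with each element replaced
# by FUN applied to that element's group.
toy$cp1_mean <- ave(toy$cp1, toy$groupid, FUN = mymean)

# Optional variant: keep NA wherever the individual value was missing.
toy$cp1_mean_keep_na <- ifelse(is.na(toy$cp1), NA_real_, toy$cp1_mean)

print(toy)
```

Assuming dplyr is loaded, the same helper could be dropped into the pipeline from the thread, for example mutate(across(starts_with("cp"), list(mean = mymean))) after group_by(groupid). Whether the NA positions should receive the group mean or stay NA depends on which of the two requirements described in the thread is wanted, which is why both columns are shown.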