Re: [R] Getting minimum value of a column according a factor column of a dataframe

Bill Dunlap Thu, 25 Aug 2022 09:08:53 -0700

The order of the rows returned by summarize is controlled by the levels of
the factors given to group_by.  If group_by is given character columns
instead of factors, it converts them to factors with the levels being the
sorted unique values of the character columns.  Convert your character
columns to factors with the desired levels to get the order you want.


E.g.,

> suppressPackageStartupMessages(library(dplyr))
> d <- data.frame(charGroup=rep(c("Small","Medium","Large"),3:1), x=101:106)
> d$factorGroup <- factor(d$charGroup, levels=c("Small","Medium","Large","X Large"))
> d
  charGroup   x factorGroup
1     Small 101       Small
2     Small 102       Small
3     Small 103       Small
4    Medium 104      Medium
5    Medium 105      Medium
6     Large 106       Large
> d |> group_by(charGroup) |> summarize(minX=min(x))
# A tibble: 3 × 2
  charGroup  minX
  <chr>     <int>
1 Large       106
2 Medium      104
3 Small       101
> d |> group_by(factorGroup) |> summarize(minX=min(x))
# A tibble: 3 × 2
  factorGroup  minX
  <fct>       <int>
1 Small         101
2 Medium        104
3 Large         106
> # .drop=FALSE outputs a row for each unused level as well
> d |> group_by(factorGroup, .drop=FALSE) |> summarize(minX=min(x))
# A tibble: 4 × 2
  factorGroup  minX
  <fct>       <dbl>
1 Small         101
2 Medium        104
3 Large         106
4 X Large       Inf
Warning message:
In min(x) : no non-missing arguments to min; returning Inf
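
To get the groups back in the order they first appear in the data (the question in the quoted message below), one option is to build the factor with levels=unique(...). The following is only a sketch in base R: the data frame `d2` and its values are made up for illustration and are not the poster's bilan2. It relies on split() returning its pieces in factor-level order.

```r
# Hypothetical data: the codes are deliberately not in sorted order.
d2 <- data.frame(Code = c("41009", "41003", "41009", "41003", "41017"),
                 Q    = c(0.22, 0.16, 0.218, 0.197, 0.24))

# Levels follow the order of first appearance instead of sorted order.
d2$Code <- factor(d2$Code, levels = unique(d2$Code))

# split() returns the pieces in level order, so the row with the
# smallest Q in each group comes out as 41009, 41003, 41017.
res <- do.call(rbind, lapply(split(d2, d2$Code),
                             function(g) g[which.min(g$Q), ]))
res
```

As in the example above, group_by() should follow the same factor levels, so the dplyr version keeps this order as well.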

-Bill

On Thu, Aug 25, 2022 at 4:05 AM javad bayat <[email protected]> wrote:

> Dear Rui;
> Thank you very much. Both of your code snippets worked correctly. Now I can
> see the whole row's values.
> But I found a problem in the results. When I run your code, the results
> are shown in a sorted table. I do not know why the results have been sorted
> by the "Code" column, smallest to largest. Is there any way to
> get the results in the same order as in the first data frame (bilan2)? I used
> your code as follows:
>
> > bilan3 <- lapply(split(bilan2, bilan2$Code), \(x) x[which.min(x$Q),])
> > bilan3 = data.frame(do.call(rbind, bilan3))
> Sincerely
>
>
>
> On Thu, Aug 25, 2022 at 11:52 AM Rui Barradas <[email protected]> wrote:
>
> > Hello,
> >
> > OK, what about
> >
> >
> > res <- lapply(split(df1, df1$Code), \(x) x[which.min(x$Q),])
> > do.call(rbind, res)
> > #         Code  Y  M  D     Q    N    O
> > #  41003 41003 81  1 19 0.160 7.17 2.50
> > #  41005 41005 79  8 17 0.210 5.50 7.20
> > #  41009 41009 79  2 21 0.218 5.56 4.04
> > #  41017 41017 79 10 20 0.240 5.30 7.10
> >
> >
> > A dplyr solution.
> >
> >
> >
> > suppressPackageStartupMessages(library(dplyr))
> >
> > df1 %>%
> >    group_by(Code) %>%
> >    slice_min(Q) %>%
> >    slice_head(n = 1)
> > #  # A tibble: 4 × 7
> > #  # Groups:   Code [4]
> > #    Code      Y     M     D     Q     N     O
> > #    <fct> <int> <int> <int> <dbl> <dbl> <dbl>
> > #  1 41003    81     1    19 0.16   7.17  2.5
> > #  2 41005    79     8    17 0.21   5.5   7.2
> > #  3 41009    79     2    21 0.218  5.56  4.04
> > #  4 41017    79    10    20 0.24   5.3   7.1
> >
> >
> >
> > Hope this helps,
> >
> > Rui Barradas
> >
> >
> > At 05:56 on 25/08/2022, javad bayat wrote:
> > > Dear all,
> > > Many thanks for your suggested methods and code, but unfortunately they
> > > did not give the desired results.
> > > All the code you provided is correct, but it did not return the whole row
> > > that contains the minimum of "Q".
> > > The code must result in 4 rows, with the minimum value of "Q" and other
> > > column values, as below:
> > >
> > >  Code  Y  M  D     Q    N    O
> > > 41003 81  1 19  0.16 7.17  2.5
> > > 41005 79  8 17  0.21  5.5  7.2
> > > 41009 79  2 21 0.218 5.56 4.04
> > > 41017 79 10 20  0.24  5.3  7.1
> > >
> > > Sincerely
> > >
> > > On Wed, Aug 24, 2022 at 9:24 PM Rui Barradas <[email protected]> wrote:
> > >
> > >> Hello,
> > >>
> > >> Here are two options, the 1st outputs a vector, the 2nd a data.frame.
> > >>
> > >>
> > >> x<-'41003 81 1 19 0.16 7.17 2.5
> > >> 41003 77 9 22 0.197 6.8 2.2
> > >> 41003 79 7 28 0.21 4.7 6.2
> > >> 41005 79 8 17 0.21 5.5 7.2
> > >> 41005 80 10 30 0.21 6.84 2.6
> > >> 41005 80 12 20 0.21 6.84 2.4
> > >> 41005 79 6 14 0.217 5.61 3.55
> > >> 41009 79 2 21 0.218 5.56 4.04
> > >> 41009 79 5 27 0.218 6.4 3.12
> > >> 41009 80 11 29 0.22 6.84 2.8
> > >> 41009 78 5 28 0.232 6 3.2
> > >> 41009 81 8 20 0.233 6.39 1.6
> > >> 41009 79 9 30 0.24 5.6 7.5
> > >> 41017 79 10 20 0.24 5.3 7.1
> > >> 41017 80 7 30 0.24 6.73 2.6'
> > >> df1 <- read.table(textConnection(x))
> > >> names(df1) <- scan(what = character(),
> > >>                      text = 'Code Y M D Q N O')
> > >> df1$Code <- factor(df1$Code)
> > >>
> > >> # 1st option
> > >> with(df1, tapply(Q, Code, min))
> > >> #  41003 41005 41009 41017
> > >> #  0.160 0.210 0.218 0.240
> > >>
> > >> # 2nd option
> > >> aggregate(Q ~ Code, df1, min)
> > >> #     Code     Q
> > >> #  1 41003 0.160
> > >> #  2 41005 0.210
> > >> #  3 41009 0.218
> > >> #  4 41017 0.240
> > >>
> > >>
> > >> Hope this helps,
> > >>
> > >> Rui Barradas
> > >>
> > >> At 08:44 on 24/08/2022, javad bayat wrote:
> > >>> Dear all;
> > >>> I am trying to get the minimum value of a column based on a factor column
> > >>> of the same data frame. My data frame is like the below:
> > >>> Code Y M D Q N O
> > >>> 41003 81 1 19 0.16 7.17 2.5
> > >>> 41003 77 9 22 0.197 6.8 2.2
> > >>> 41003 79 7 28 0.21 4.7 6.2
> > >>> 41005 79 8 17 0.21 5.5 7.2
> > >>> 41005 80 10 30 0.21 6.84 2.6
> > >>> 41005 80 12 20 0.21 6.84 2.4
> > >>> 41005 79 6 14 0.217 5.61 3.55
> > >>> 41009 79 2 21 0.218 5.56 4.04
> > >>> 41009 79 5 27 0.218 6.4 3.12
> > >>> 41009 80 11 29 0.22 6.84 2.8
> > >>> 41009 78 5 28 0.232 6 3.2
> > >>> 41009 81 8 20 0.233 6.39 1.6
> > >>> 41009 79 9 30 0.24 5.6 7.5
> > >>> 41017 79 10 20 0.24 5.3 7.1
> > >>> 41017 80 7 30 0.24 6.73 2.6
> > >>>
> > >>> I want to get the whole row that holds the minimum value of the "Q"
> > >>> column for each level of the "Code" column, which is a factor. Overall
> > >>> it will give me 4 rows, each with the minimum value of "Q". Below is the
> > >>> code that I used, but it did not give me what I wanted.
> > >>>
> > >>>> x[which(x$Q == min(x$Q)),]
> > >>>
> > >>> Sincerely
> > >>>
> > >>>
> > >>>
> > >>
> > >
> > >
> >
>
>
> --
> Best Regards
> Javad Bayat
> M.Sc. Environment Engineering
> Alternative Mail: [email protected]
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> [email protected] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

