Re: [R] highest and second highest value in row for each combination

Dennis Murphy Thu, 10 Feb 2011 18:37:59 -0800

Hi:

And yet another:  :)


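library(plyr)

# For one row x = (area, type, a, b, c, d): keep the two largest of a..d
top2 <- function(x) {
  y <- sort(x[-c(1, 2)], decreasing = TRUE)[1:2]  # drop area and type, sort the rest
  area <- rep(x[1], 2)
  type <- rep(x[2], 2)
  data.frame(area, type, value = y, variable = names(y))
}

# apply() returns a list of two-row data frames; ldply() stacks them
v <- ldply(apply(df, 1, top2), rbind)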

The last line combines the list creation and the do.call(rbind, ...) step
into a single call.
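
For comparison, the two steps that ldply() folds together can be sketched in
base R without plyr. This is only an illustration under a few assumptions: the
data are regenerated with the same seed used in the quoted messages, the name
`pieces` and the explicit lapply() loop over row indices are choices made for
this sketch, and the helper drops the element names so that the stacked result
gets plain sequential row names.

```r
# Same simulated data as in the quoted messages (df is a matrix, via cbind)
set.seed(19)
area <- c(rep(1, 10), rep(2, 10), rep(3, 10), rep(4, 10), rep(5, 10))
type <- c(rep(1:10, 5))
a <- rnorm(50)
b <- rnorm(50)
c <- rnorm(50)
d <- rnorm(50)
df <- cbind(area, type, a, b, c, d)

# Per-row helper: keep the two largest of a, b, c, d and their column names
top2 <- function(x) {
  y <- sort(x[-c(1, 2)], decreasing = TRUE)[1:2]
  data.frame(area = unname(x[1]), type = unname(x[2]),
             value = unname(y), variable = names(y))
}

# Step 1 (list creation): one two-row data frame per row of df
pieces <- lapply(seq_len(nrow(df)), function(i) top2(df[i, ]))

# Step 2 (do.call): stack the pieces into a single data frame
v <- do.call(rbind, pieces)
head(v)
```

The result has two rows per (area, type) combination, largest value first.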

HTH,
Dennis

On Thu, Feb 10, 2011 at 4:38 PM, jim holtman <jholt...@gmail.com> wrote:

> here is another way of doing it:
>
> > set.seed(19)
> >
> > area<-c(rep(1,10),rep(2,10),rep(3,10),rep(4,10),rep(5,10))
> > type<-c(rep(1:10,5))
> > a<-rnorm(50)
> > b<-rnorm(50)
> > c<-rnorm(50)
> > d<-rnorm(50)
> > df<-cbind(area,type,a,b,c,d)
> > df1 <- data.frame(df)
> > require(reshape2)
> > df.melt <- melt(df1, id=c('area', 'type'))
> > result <- do.call(rbind,
> +     lapply(split(df.melt, list(df.melt$area, df.melt$type), drop=TRUE),
> +            function(x){
> +         # get at most the first two if present
> +         head(x[order(x$value, decreasing=TRUE),], 2)
> +     })
> + )
> >
> > result
>         area type variable       value
> 1.1.51      1    1        b  1.70366970
> 1.1.101     1    1        c  0.79101298
> 2.1.161     2    1        d  1.56797593
> 2.1.61      2    1        b  0.79868725
> 3.1.21      3    1        a  1.42342348
> 3.1.121     3    1        c  0.44547975
> 4.1.131     4    1        c  1.72745545
> 4.1.31      4    1        a  1.50474144
> 5.1.141     5    1        c  1.72521942
> 5.1.191     5    1        d  0.52466470
>
>
> On Thu, Feb 10, 2011 at 12:55 PM, Phil Spector
> <spec...@stat.berkeley.edu> wrote:
> > Alain -
> >   Here's a reproducible data set:
> >
> > set.seed(19)
> > area<-c(rep(1,10),rep(2,10),rep(3,10),rep(4,10),rep(5,10))
> > type<-c(rep(1:10,5))
> > a<-rnorm(50)
> > b<-rnorm(50)
> > c<-rnorm(50)
> > d<-rnorm(50)
> > df<-cbind(area,type,a,b,c,d)
> >
> >   First I'll make a helper function to operate on one row of the data frame:
> >
> > get2 = function(x){
> >   y = x[-c(1,2)]
> >   oy = order(y,decreasing=TRUE)
> >   nms = colnames(df)[-c(1,2)]
> >   data.frame(area=rep(x[1],2),type=rep(x[2],2),
> >              max=y[oy[1:2]],colname=nms[oy[1:2]])
> > }
> >
> > Now I can use apply, do.call and rbind to get the answer:
> >
> >> answer = do.call(rbind,apply(df,1,get2))
> >> head(answer)
> >
> >   area type        max colname
> > b     1    1  1.7036697       b
> > c     1    1  0.7910130       c
> > c1    1    2  2.4576579       c
> > a     1    2  0.3885812       a
> > c2    1    3  1.2363598       c
> > a1    1    3 -0.3443333       a
> >
> > (My numbers differ from yours because you didn't specify
> > a seed for the random number generator)
> >
> > I'm not exactly sure how to form your column "combination", though.
> >
> >                                        - Phil Spector
> >                                         Statistical Computing Facility
> >                                         Department of Statistics
> >                                         UC Berkeley
> >                                         spec...@stat.berkeley.edu
> >
> >
> > On Thu, 10 Feb 2011, Alain D. wrote:
> >
> >> Dear R-List,
> >>
> >> I have a dataframe
> >>
> >> area<-c(rep(1,10),rep(2,10),rep(3,10),rep(4,10),rep(5,10))
> >> type<-c(rep(1:10,5))
> >> a<-rnorm(50)
> >> b<-rnorm(50)
> >> c<-rnorm(50)
> >> d<-rnorm(50)
> >> df<-cbind(area,type,a,b,c,d)
> >>
> >>
> >> df
> >>      area type              a            b           c           d
> >> [1,]    1    1     0.45608192  0.240378547  2.05208079 -1.18827462
> >> [2,]    1    2    -0.12119506 -0.028078577 -2.64323695 -0.83923441
> >> [3,]    1    3     0.09066133 -1.134069619  1.53344812 -0.15670239
> >> [4,]    1    4    -1.34505241  1.919941172 -1.02090099  0.75664358
> >> [5,]    1    5    -0.29279617 -0.314955019 -0.88809266  2.22282022
> >> [6,]    1    6    -0.59697893 -0.652937746  1.05132400 -0.02469151
> >> [7,]    1    7    -1.18199400  0.728165962 -1.51419348  0.65640976
> >> [8,]    1    8    -0.72925659  0.303514237  0.79758488  0.93444350
> >> [9,]    1    9    -1.60080508 -0.187562633  0.51288428 -0.55692877
> >> [10,]    1   10    0.54373268 -0.494994392  0.52902381  1.12938122
> >> [11,]    2    1    -1.29675664 -0.644990784 -2.44067511 -0.18489544
> >> [12,]    2    2     0.86330699  1.458038882  1.17514710  1.32896878
> >> [13,]    2    3     0.30069402  1.361211939  0.84757211  1.14502761
> >> ...
> >>
> >> Now I want to have for each combination of area and type the name and
> >> corresponding value of the two columns with the highest and second
> >> highest value a,b,c,d.
> >> In the above example it should be something like
> >>
> >> combination         max     colname
> >> 11                      2.05          c
> >> 11                      0.46          a
> >> 12                     -0.03          b
> >> 12                     -0.12          a
> >> ...
> >>
> >> (It might be arranged differently, though)
> >>
> >> Can anyone help?
> >>
> >> Thank you in advance!
> >>
> >> Alain
> >>
> >>        [[alternative HTML version deleted]]
> >>
> >> ______________________________________________
> >> R-help@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>
> >
> >
>
>
>
> --
> Jim Holtman
> Data Munger Guru
>
> What is the problem that you are trying to solve?
>
>
