# Some sample data:
library(Hmisc)
set.seed(33)
df <- data.frame(x = sample(letters[1:10], 1000, replace = TRUE),
                 y = rnorm(1000, mean = 20, sd = 15))

x is a category from a to j, say a geographical area, into which an observation 
y falls, y being a score.  Now, if I want to put my scores into quantiles 
(quintiles in this case) across the whole population of observations and then 
make a quintile variable, I do the following:

# Make a quintile variable
df <- within(df, z <- as.integer(cut2(y, quantile(y, probs = seq(0, 1, 0.2)))))

I'm using cut2 here as I want the extremes of my ranges to be included in the 
upper and lower bins.
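
To illustrate the concern about the extremes, here is a small sketch using base R's cut() rather than cut2(). The toy vector v is made up for illustration. With its default settings, cut() builds intervals that are open on the left, so the minimum value falls outside the first bin and comes back as NA unless include.lowest = TRUE is set.

```r
# Toy vector, made up for illustration (not the data above)
v <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
breaks <- quantile(v, probs = seq(0, 1, 0.2))

# Default cut(): intervals are (a, b], so the minimum is not covered
# by the first interval and is returned as NA
default_bins <- as.integer(cut(v, breaks))

# include.lowest = TRUE closes the first interval on the left,
# so the minimum lands in bin 1
closed_bins <- as.integer(cut(v, breaks, include.lowest = TRUE))
```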

So far so good, but I would also like another variable to indicate the quintile 
of the score within the areas indicated by the x variable, so all of the scores 
where x=a, binned into quintiles for area a, the same for scores in areas b, c 
and so on.

I see that I could put my quintile variable code into a function and then split 
my data frame by x, apply the function in each of the ten groups and stitch the 
whole thing back together again (not sure I could write it though), but is 
there a much simpler solution?
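
For reference, the split/apply/stitch idea described above might be sketched with base R's ave(), which applies a function within each level of a grouping variable and returns the results in the original row order. This is only a rough sketch under some assumptions: the helper name quintile_bin and the column name zx are made up for illustration, and it uses cut() with include.lowest = TRUE in place of cut2(), so the binning at interval edges may not match cut2() exactly.

```r
# Same sample data as at the top of the post
set.seed(33)
df <- data.frame(x = sample(letters[1:10], 1000, replace = TRUE),
                 y = rnorm(1000, mean = 20, sd = 15))

# Hypothetical helper: bin a numeric vector into quintiles 1..5.
# Assumes the quantile breaks are distinct; heavily tied data would
# make cut() fail on duplicated breaks.
quintile_bin <- function(v) {
  breaks <- quantile(v, probs = seq(0, 1, 0.2))
  as.integer(cut(v, breaks, include.lowest = TRUE))
}

# ave() splits y by x, applies quintile_bin to each group, and puts
# the results back in the original row order
df$zx <- ave(df$y, df$x, FUN = quintile_bin)
```

Something like table(df$x, df$zx) should then show roughly equal counts across the five bins within each area, if the sketch is doing what is intended.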

Thanks,

GavinR

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.