Hi,

Try:

?split()
source("http://www.openintro.org/stat/data/cdc.R")
str(cdc)
#'data.frame':  20000 obs. of  9 variables:
# $ genhlth : Factor w/ 5 levels "excellent","very good",..: 3 3 3 3 2 2 2 2 3 3 ...
# $ exerany : num  0 0 1 1 0 1 1 0 0 1 ...
# $ hlthplan: num  1 1 1 1 1 1 1 1 1 1 ...
# $ smoke100: num  0 1 1 0 0 0 0 0 1 0 ...
# $ height  : num  70 64 60 66 61 64 71 67 65 70 ...
# $ weight  : int  175 125 105 132 150 114 194 170 150 180 ...
# $ wtdesire: int  175 115 105 124 130 114 185 160 130 170 ...
# $ age     : int  77 33 49 42 55 55 31 45 27 44 ...
# $ gender  : Factor w/ 2 levels "m","f": 1 2 2 2 2 2 1 1 2 1 ...

cdc$genhlth <- as.character(cdc$genhlth)
cdclst1 <- split(cdc, cdc$genhlth)
lapply(cdclst1, head, 2)
#$excellent
#     genhlth exerany hlthplan smoke100 height weight wtdesire age gender
#11 excellent       1        1        1     69    186      175  46      m
#13 excellent       1        0        1     66    185      220  21      m
#
#$fair
#   genhlth exerany hlthplan smoke100 height weight wtdesire age gender
#12    fair       1        1        1     69    168      148  62      m
#15    fair       1        0        0     69    170      170  23      m
#
#$good
#  genhlth exerany hlthplan smoke100 height weight wtdesire age gender
#1    good       0        1        0     70    175      175  77      m
#2    good       0        1        1     64    125      115  33      f
#
#$poor
#   genhlth exerany hlthplan smoke100 height weight wtdesire age gender
#53    poor       1        1        1     62    140      130  64      f
#79    poor       1        1        0     63    142      120  52      f
#
#$`very good`
#    genhlth exerany hlthplan smoke100 height weight wtdesire age gender
#5 very good       0        1        0     61    150      130  55      f
#6 very good       1        1        0     64    114      114  55      f

sapply(cdclst1, nrow)
#excellent      fair      good      poor very good
#     4657      2019      5675       677      6972

cdcGood <- cdclst1[["good"]]
str(cdcGood)
#'data.frame':  5675 obs. of  9 variables:
# $ genhlth : chr  "good" "good" "good" "good" ...
# $ exerany : num  0 0 1 1 0 1 1 0 1 1 ...
# $ hlthplan: num  1 1 1 1 1 1 1 0 1 1 ...
# $ smoke100: num  0 1 1 0 1 0 1 1 1 1 ...
# $ height  : num  70 64 60 66 65 70 73 67 75 65 ...
# $ weight  : int  175 125 105 132 150 180 185 156 200 160 ...
# $ wtdesire: int  175 115 105 124 130 170 175 150 190 140 ...
# $ age     : int  77 33 49 42 27 44 79 47 43 54 ...
# $ gender  : Factor w/ 2 levels "m","f": 1 2 2 2 2 1 1 1 1 2 ...

A.K.


>Hi, I am trying to figure out how to subset a bunch of data. As an example I am
>using the cdc data from openintro.org.
>
>In the first column, named "genhlth", there are various options that a person
>could give as a response, for example "good", "very good" and "poor". Now what
>I would like to do is to separate the data so that everyone who answered good
>is stored in one variable and everyone who answered poor is in another variable.
>
>Now I know I could just do subset(cdc, cdc$genhlth == "poor") to get the poor,
>but I would really like code that would separate the data into each group,
>regardless of what the text or the number of groups is.
>
>Can anyone give me a hint?

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
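
P.S. For anyone who wants to try the split() idea without downloading the cdc data, here is a minimal sketch on a toy data frame. The data frame `survey`, its values, and the names `groups` and `env` are made up for illustration and are not part of the cdc data. The last step, which copies each group into its own variable with list2env(), is only one possible way to get "one variable per group"; keeping the list and indexing it by name is usually enough.

```r
# Toy stand-in for the cdc data (hypothetical values, not the real survey).
survey <- data.frame(
  genhlth = c("good", "poor", "very good", "good", "poor", "good"),
  weight  = c(175, 140, 150, 125, 142, 180),
  stringsAsFactors = FALSE
)

# split() returns a named list with one data frame per distinct value
# of the grouping column, however many distinct values there are.
groups <- split(survey, survey$genhlth)

names(groups)          # the group labels found in the data
sapply(groups, nrow)   # number of rows in each group
groups[["poor"]]       # pull out one group by its label

# Optional: copy each list element into its own variable in a new environment.
# Labels such as "very good" are not syntactic names, so make.names() is
# applied first (it turns "very good" into "very.good").
names(groups) <- make.names(names(groups))
env <- list2env(groups, envir = new.env())
ls(env)
```

Because the grouping comes from the data itself, the same lines should work unchanged if the column has different labels or a different number of groups.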