I'm not sure, but I think the cut() function would solve your problem.
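Here is a minimal sketch of what I mean, assuming the ages are in a plain numeric vector. The `age` vector below is made-up data standing in for mds[,11]. With right = FALSE the intervals are closed on the left, so the breaks 0, 5, 15, 30, 70, 80 reproduce your six classes. The Inf upper break is my own choice: unlike your function, it puts any age above 125 into "80+" instead of returning NA.

```r
# Made-up ages standing in for the real column mds[, 11] (assumption)
age <- c(3, 82, 40, 1, 22, 71, 4, 5, 0, 125)

# right = FALSE gives left-closed intervals: [0,5), [5,15), [15,30), ...
ageclass <- cut(age,
                breaks = c(0, 5, 15, 30, 70, 80, Inf),
                labels = c("0-4", "5-14", "15-29", "30-69", "70-79", "80+"),
                right  = FALSE)

# cut() returns a factor; tabulate it to check the class counts
print(table(ageclass))
```

On the real data this would be something like cut(mds[,11], ...) applied to the whole column at once, so no apply() or loop is needed. I suspect this also avoids the cause of the NAs: apply() first converts a data frame to a matrix, and if any column of mds is non-numeric, the numbers are formatted as padded strings (" 3" rather than "3"), which no longer match 0:4. I can't confirm that without seeing the other columns of mds.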
?cut

On Wed, 22 Apr 2009 14:56:10 -0400, "Alan Cohen" <coh...@smh.toronto.on.ca> wrote:
> Hi R users,
>
> I am trying to assign ages to age classes for a large data set (123,000
> records), and using a for-loop was too slow, so I wrote a function and used
> apply. However, the function does not properly assign the first two
> classes (the rest are fine). It appears that when age is one digit, it
> does not get assigned properly.
>
> I tried to provide a small-scale work-up (at the end of the email) but it
> does not reproduce the problem; the best I can do is to provide my code and
> the output below. As you can see, I've confirmed that age is numeric, that
> all values are integers, and that pieces of the code work independently.
> Any thoughts would be appreciated.
>
> To add to the mystery, depending which rows of my data set I select, I get
> different problems. mds[1:100,] gives the problem above, as do
> mds[100:200,] , mds[150:250,] and mds[10000:10100,]. However, with
> mds[200:300,], mds[250:350,] and mds[1000:1100,], only ages with 3 digits
> are correctly assigned - all ages <100 are returned as NA.
>
> I'm using R v 2.8.1 on Windows XP.
>
> Cheers,
> Alan Cohen
> Centre for Global Health Research,
> Toronto,ON
>
>> ageassign <- function(x){
> + y <- NA
> + if (x[11] %in% c(0:4)) {y <- "0-4"}
> + else if (x[11] %in% c(5:14)) {y <- "5-14" }
> + else if (x[11] %in% c(15:29)) {y <- "15-29" }
> + else if (x[11] %in% c(30:69)) {y <- "30-69"}
> + else if (x[11] %in% c(70:79)) {y <- "70-79"}
> + else if (x[11] %in% c(80:125)) {y <- "80+"}
> + return(y)
> + }
>> jj <- apply(mds[1:100,],1,FUN=ageassign)
>> jj
> 1 2 3 4 5 6 7 8 9
> 10 11 12 13
> NA "80+" "30-69" "30-69" "80+" NA "30-69" "30-69" "70-79"
> "15-29" "15-29" "30-69" "70-79"
> 14 15 16 17 18 19 20 21 22
> 23 24 25 26
> "80+" NA "30-69" "30-69" "30-69" "80+" "80+" "15-29" "70-79"
> "30-69" "70-79" "70-79" "30-69"
> 27 28 29 30 31 32 33 34 35
> 36 37 38 39
> "70-79" "80+" NA "80+" "70-79" NA "15-29" "15-29" NA
> NA "70-79" "30-69" "30-69"
> 40 41 42 43 44 45 46 47 48
> 49 50 51 52
> "70-79" "30-69" "30-69" "30-69" "70-79" "30-69" "30-69" "70-79" "15-29"
> "30-69" NA "15-29" "30-69"
> 53 54 55 56 57 58 59 60 61
> 62 63 64 65
> "30-69" NA "70-79" "30-69" "30-69" "30-69" "30-69" "15-29" "30-69"
> "30-69" "70-79" "30-69" NA
> 66 67 68 69 70 71 72 73 74
> 75 76 77 78
> "30-69" "30-69" "30-69" "30-69" "30-69" "80+" "30-69" "80+" "70-79"
> "30-69" "30-69" "30-69" NA
> 79 80 81 82 83 84 85 86 87
> 88 89 90 91
> "30-69" "30-69" "30-69" NA "80+" "30-69" "30-69" "30-69" NA
> "15-29" "30-69" "30-69" "30-69"
> 92 93 94 95 96 97 98 99 100
> "30-69" "30-69" "30-69" "30-69" "70-79" "30-69" "30-69" "30-69" "30-69"
>> mds[1:100,11]
> [1] 3 82 40 35 82 1 37 57 71 22 21 52 73 86 1 43 60 63 84 88 29 73 69
> 75 73 43 75 83 4 83 77 1 27
> [34] 15 1 6 76 51 45 71 54 64 69 70 48 38 74 26 37 4 18 63 59 8 78 63
> 67 62 50 21 66 69 75 57 4 50
> [67] 58 60 61 62 83 69 92 75 30 49 69 1 69 63 69 0 93 64 59 69 2 25 32
> 60 66 67 54 53 64 79 59 49 59
> [100] 64
>> table(mds[,11])
>
> 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
> 15 16 17 18 19
> 3123 6441 3856 2884 1968 1615 1386 1088 1098 721 943 681 511 380 426
> 835 571 555 719 653
> 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34
> 35 36 37 38 39
> 879 715 672 631 655 773 680 713 769 538 685 566 729 702 652
> 766 683 723 821 675
> 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54
> 55 56 57 58 59
> 774 650 908 892 784 925 781 1043 1161 924 1087 827 1261 1356 1297
> 1272 1277 1614 1831 1523
> 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74
> 75 76 77 78 79
> 1702 1251 1954 2157 1901 2090 1874 2705 3085 2529 2488 1777 2701 2586 2308
> 2020 1801 2269 2486 1856
> 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94
> 95 96 97 98 99
> 1762 1047 1413 1326 967 1013 753 870 884 531 601 277 364 301 193
> 288 149 174 169 470
> 100 101 102 103 104 105 106 107 108 114 115 117 118 120 125
> 15 2 5 7 2 4 1 1 2 1 1 2 2 2 1
>> mode(mds[,11])
> [1] "numeric"
>
>> mds[1,11] %in% c(0:4)
> [1] TRUE
>> if (mds[1,11] %in% c(0:4)) {y <- "0-4"}
>> y
> [1] "0-4"
>
>> xx <- matrix(trunc(runif(30,0,125)),15,2)
>> aassign <- function(x){
> + y <- NA
> + if (x[2] %in% c(0:4)) {y <- "0-4"}
> + else if (x[2] %in% c(5:14)) {y <- "5-14" }
> + else if (x[2] %in% c(15:29)) {y <- "15-29" }
> + else if (x[2] %in% c(30:69)) {y <- "30-69"}
> + else if (x[2] %in% c(70:79)) {y <- "70-79"}
> + else if (x[2] %in% c(80:125)) {y <- "80+"}
> + return(y)
> + }
>> jj <- apply(xx,1,FUN=aassign)
>> t(xx)
> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
> [,14] [,15]
> [1,] 23 98 107 94 76 103 106 40 66 11 109 101 96
> 37 18
> [2,] 11 57 58 91 43 123 103 77 4 79 64 10 8
> 105 76
>> jj
> [1] "5-14" "30-69" "30-69" "80+" "30-69" "80+" "80+" "70-79" "0-4"
> "70-79" "30-69" "5-14"
> [13] "5-14" "80+" "70-79"
>>
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.