Thanks for the feedback, this is really helpful. 4) Your solution works like a charm. I opted for reference by column number since I had so many.

    ynFields = c(12:22,58:229)
    ynLabel = c("No","Yes")
    X[ynFields] <- lapply(X[ynFields], factor, levels=0:1, labels=ynLabel)

    whenFields = c(24:56)
    whenLabel = c("Never","<=Week","<=3 months","<=year","regular use >=3 mo","user for unknown time")
    X[whenFields] <- lapply(X[whenFields], factor, levels=0:5, labels=whenLabel)

5) For the creation of new columns with names based off the old ones, I adjusted to:

    Z <- as.data.frame(X[24:56]>0)
    names(Z) <- sub("_when$", "_yn", names(Z))
    X <- cbind(X, Z)

instead of:

    Z <- as.data.frame(as.numeric(X[24:56]>0))
    names(Z) <- sub("_when$", "_yn", names(Z))
    X <- cbind(X, Z)

because R would literally name column 197 as "as.numeric(X[24:56]>0)" and leave the rest alone. As is, it does not put in the proper values (maybe because I had to drop the "as.numeric()" portion). I added:

    for (i in 24:56) {
      X[,i+173] <- as.numeric(X[,i] > 0)
    }

afterwards and get the values I want, but maybe a slight change to the previous code can eliminate my need for the for loop.

6) print() is exactly what I needed to get output from loops; that helps me greatly. I'm making more of a mess with the code at the moment, trying nasty things like:

    for (j in 3:5) {
      print(names(X[j]))
      for (i in 197:229) {
        print(names(X[i]))
        print(table(X[,j],X[,i]))
        #print(prop.test(table(X$race,X[,i])))
        print("--------------------------------")
      }
    }

My intent is to look at drug usage by demographic data (frequencies and chi-square tests). I sort of get that in a piecemeal way, but the output is quite nasty.

a) I commented out "prop.test(table(X$race,X[,i]))" because it works until it runs into a drug with no successes in a column, and then the program halts. My first instinct would be to add an if statement, but I bet R has something more elegant.

b) My final goal would be to get output similar to the following, for a sample size of, say, 50:

_____________________________________________________________________
variable            drug1      drug2      drug3    drug4    drug5    ...
drug n              (n=5)      (n=10)     (n=8)    (n=7)    (n=5)    (n=0)
                    no (%)     no (%)     no (%)   no (%)   no (%)   no (%)
_____________________________________________________________________
Gender              *          **
  Male              2 (0.04)   9 (0.18)   ...
  Female            3 (0.06)   1 (0.02)   ...
Ethnicity           ...
  Caucasian
  African American
  ...
Demographic
  Level1
  level2
  ...
  LevelN
_____________________________________________________________________
* p < 0.05   ** p < 0.01

I figure some of this would need to be done by hand, but the closer I can get, the better. At the moment, I plan on reading up on table and xtabs, and trying to find a way to skip the chi-square tests that would halt the run (maybe one of the apply functions works for this). If I can store the resulting p-values, then in the worst case I can output a custom table.

Thanks again for the help. The discussion helps me understand how to use R quite a bit better.
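
For reference, here is a rough sketch of the two ideas raised in 5) and 6a): building the 0/1 columns without a for loop, and storing p-values while skipping the tests that would otherwise stop the run. It runs on made-up toy data, not the real data set; the column names (gender, drug1_when, ...) and the helper safe_p are hypothetical and only there for illustration. It also assumes the "_when" columns are still numeric codes (0 to 5) at this point, not yet converted to factors.

```r
# Toy data standing in for X; names and values are invented for illustration.
set.seed(1)
X <- data.frame(
  gender     = factor(sample(c("Male", "Female"), 50, replace = TRUE)),
  drug1_when = sample(0:5, 50, replace = TRUE),
  drug2_when = sample(0:5, 50, replace = TRUE),
  drug3_when = 0L                       # a drug that nobody used
)

# 5) Indicator columns without a for loop: convert column by column with
#    lapply(), so the result keeps one column per drug and the old names.
whenCols <- grep("_when$", names(X))
Z <- as.data.frame(lapply(X[whenCols], function(col) as.numeric(col > 0)))
names(Z) <- sub("_when$", "_yn", names(Z))
X <- cbind(X, Z)

# 6a) Hypothetical helper: return the prop.test() p-value, or NA when the
#     table has fewer than two columns or the test raises an error.
safe_p <- function(group, used) {
  tab <- table(group, used)
  if (ncol(tab) < 2) return(NA_real_)
  tryCatch(suppressWarnings(prop.test(tab)$p.value),
           error = function(e) NA_real_)
}

ynCols <- grep("_yn$", names(X), value = TRUE)

# One p-value per drug, stored in a named vector for later use.
pvals <- sapply(X[ynCols], function(used) safe_p(X$gender, used))

# Number of users per gender and drug, as a gender-by-drug matrix.
counts <- sapply(X[ynCols], function(used) tapply(used, X$gender, sum))

print(counts)
print(pvals)
```

The stored p-values and counts could then be pasted together by hand into a table like the one in b). Whether prop.test or chisq.test is the right test for the real data is a separate question that this sketch does not settle.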