Adrienne - this solves the problem nicely. Thanks for your help.
David S. Herzberg, Ph.D.
Vice President, Research and Development
Western Psychological Services
12031 Wilshire Blvd.
Los Angeles, CA 90025-1251
Phone: (310)478-2061 x144
FAX: (310)478-7838
email: dav...@wpspublish.com

From: wootten.adrie...@gmail.com [mailto:wootten.adrie...@gmail.com] On Behalf Of Adrienne Wootten
Sent: Friday, October 22, 2010 9:09 AM
To: David Herzberg
Cc: r-help@r-project.org
Subject: Re: [R] Conditional looping over a set of variables in R

David,

Here I'm referring to your data as testmat, a matrix of 140 columns and 1500 rows, but the same or similar notation can be applied to data frames in R. If I understand correctly, you are looking for the first response (column) where you got a value of 1. I'm also assuming that since your missing values are characters, your two numeric values are characters as well. Keeping all this in mind, try something like this.

    first = rep(NA, nrow(testmat))  # your extra variable which will eventually contain the first correct response for each case (NA if there is none)
    for(i in 1:nrow(testmat)){
      c = 1
      # stop at the last column, so a case with no "1" does not index past the matrix
      while( c <= ncol(testmat) ){
        if( testmat[i,c] == "1" ){
          first[i] = c
          break  # will exit the while loop once it finds the first correct answer, and then jump to the next case
        } else {
          c = c+1  # proceed to the next column if not
        }
      }
    }

Hope this helps you out a bit.

Adrienne Wootten
NCSU

On Fri, Oct 22, 2010 at 11:33 AM, David Herzberg <dav...@wpspublish.com> wrote:

Here's the problem I'm trying to solve in R: I have a data frame that consists of about 1500 cases (rows) of data from kids who took a test of listening comprehension. The columns are their scores (1 = correct, 0 = incorrect, . = missing) on 140 test items. The items are numbered sequentially and are ordered by increasing difficulty as you go from left to right across the columns. I want R to go through the data and find the first correct response for each case.
Because of basal and ceiling rules, many cases have missing data on many items before the first correct response appears. For each case, I want R to evaluate the item responses sequentially starting with item 1. If the score is 0 or missing, proceed to the next item and evaluate it. If the score is 1, stop the operation for that case, record the item number of that first correct response in a new variable, proceed to the next case, and restart the operation.

In SPSS, this operation would be carried out with LOOP, VECTOR, and DO IF, as follows (assuming the data set is already loaded):

* DECLARE A NEW VARIABLE TO HOLD THE ITEM NUMBER OF THE FIRST CORRECT RESPONSE, SET IT EQUAL TO 0.
numeric LCfirst1.
comp LCfirst1 = 0.

* DECLARE A VECTOR TO HOLD THE 140 ITEM RESPONSE VARIABLES.
vector x=LC1a_score to LC140a_score.

* SET UP A LOOP THAT WILL RUN FROM 1 TO 140, AS LONG AS LCfirst1 = 0. "#i" IS AN INDEX VARIABLE THAT INCREASES BY 1 EACH TIME THE LOOP RUNS.
loop #i=1 to 140 if (LCfirst1 = 0).

* SET UP A CONDITIONAL TRANSFORMATION THAT IS EVALUATED FOR EACH ELEMENT OF THE VECTOR. THUS, WHEN #i = 1, THE EXPRESSION EVALUATES THE FIRST ELEMENT OF THE VECTOR (THAT IS, THE FIRST OF THE 140 ITEM RESPONSES). AS THE LOOP RUNS AND #i INCREASES, SUBSEQUENT VECTOR ELEMENTS ARE EVALUATED. THE do if STATEMENT RETAINS CONTROL AND KEEPS LOOPING THROUGH THE VECTOR UNTIL A '1' IS ENCOUNTERED.
+ do if x(#i) = 1.

* WHEN A '1' IS ENCOUNTERED, CONTROL PASSES TO THE NEXT STATEMENT, WHICH RECODES THE VALUE OF THAT VECTOR ELEMENT TO '99'.
+ comp x(#i) = 99.

* AND THEN CONTROL PASSES TO THE NEXT STATEMENT, WHICH RECODES THE VALUE OF LCfirst1 TO THE CURRENT INDEX VALUE, THUS CAPTURING THE ITEM NUMBER OF THE FIRST CORRECT RESPONSE FOR THAT CASE. CHANGING THE VALUE OF LCfirst1 ALSO CAUSES THE LOOP TO STOP EXECUTING FOR THAT CASE, AND THE PROGRAM MOVES TO THE NEXT CASE AND RESTARTS THE LOOP.
+ comp LCfirst1 = #i.
+ end if.
end loop.
exe.
After several hours of trying to translate this procedure to R, I'm stumped. I played around with creating a list to hold the item response variables (analogous to 'vector' in SPSS), but when I tried to use the list in an R procedure, I kept getting a warning along the lines of 'the list contains > 1 element, only the first element will be used'. So perhaps a list is not the appropriate class to 'hold' these variables?

It seems that some nested arrangement of 'for', 'while', and/or 'lapply' will allow me to recreate the operation described above? How do I set up the indexing operation analogous to 'loop #i' in SPSS?

Any help is appreciated, and I'm happy to provide more information if needed.

David S. Herzberg, Ph.D.

[[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
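
For anyone finding this thread later: the "first column equal to 1 in each row" operation discussed above can also be written without an explicit loop. The following is only a minimal sketch, not what either poster used. The toy matrix is made up for illustration (3 cases, 5 items), and it assumes, as in the reply above, that the scores are stored as the character values "1", "0" and ".".

```r
# Hypothetical toy data: 3 cases x 5 items, stored as character
# ("1" = correct, "0" = incorrect, "." = missing), as assumed in the thread.
testmat <- matrix(c(".", "0", "1", "1", "0",
                    ".", ".", ".", "0", "0",
                    "1", "0", "1", ".", "."),
                  nrow = 3, byrow = TRUE)

# For each row, which() returns the column positions equal to "1",
# and [1] keeps the first of them. A row with no "1" gives NA,
# because taking element 1 of an empty vector yields NA.
first <- apply(testmat, 1, function(row) which(row == "1")[1])
first
```

With this toy matrix, the first correct response is item 3 for case 1 and item 1 for case 3, and case 2 gets NA because it has no correct response. apply() converts a data frame to a matrix before running, so the same call may work on a data frame whose item columns all have the same type, but that is worth checking on the real data.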