Hi,

Maybe this helps. Since you want to match only rows 3 onwards against row 2 (P2), the corresponding values for rows 1 and 2 are set to NA.

dat1 <- read.table(text="
S.No AB001A AB0002A AB362
P1 -/- C/C A/A
P2 C/C C/C A/A
3 C/C C/C A/A
4 C/C C/C A/A
5 C/C C/C A/A
6 C/C C/C A/A
7 C/C C/C A/A
8 -/- -/- -/-
9 C/C C/C A/A
10 C/C C/C A/A
11 -/- C/C A/A
12 C/C C/C A/A
13 C/C C/C A/A
14 C/C C/C A/A
15 C/C -/- A/A
16 -/- C/C A/A
17 A/A A/C A/A
18 C/A A/A A/A
", sep="", header=TRUE, stringsAsFactors=FALSE)

# Compare every row with row 2 (P2), column by column; 1 = match, 0 = no match
dat2 <- cbind(dat1, (1 * mapply("==", dat1[,-1], dat1[2,-1])))
# The indicator columns duplicate the marker names, so add a "_1" suffix
names(dat2)[duplicated(names(dat2))] <- paste0(names(dat2)[duplicated(names(dat2))], "_1")

library(plyr)
# Number of matches per row and percentage of markers matched
dat3 <- mutate(dat2, SUM=rowSums(cbind(AB001A_1, AB0002A_1, AB362_1)), MATCH=(SUM/3)*100)
# Rows 1 and 2 (P1, P2) are not part of the comparison
dat3[1:2, 5:9] <- NA
res <- mutate(dat3, RANK=rank(MATCH, ties.method="min"))
head(res)
#  S.No AB001A AB0002A AB362 AB001A_1 AB0002A_1 AB362_1 SUM MATCH RANK
#1   P1    -/-     C/C   A/A       NA        NA      NA  NA    NA   17
#2   P2    C/C     C/C   A/A       NA        NA      NA  NA    NA   18
#3    3    C/C     C/C   A/A        1         1       1   3   100    7
#4    4    C/C     C/C   A/A        1         1       1   3   100    7
#5    5    C/C     C/C   A/A        1         1       1   3   100    7
#6    6    C/C     C/C   A/A        1         1       1   3   100    7

A.K.
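
In case plyr is not available, the same steps could be sketched in base R roughly as follows. This is only an illustration under assumptions: the small data frame `dat` and the names `markers`, `ref`, `hits` and `res2` are made up for the example and are not from the thread. It also ranks on the negated percentage, on the assumption that the best match should get rank 1 (for the "top ten matchers"), and keeps the ranks of the uncompared rows as NA.

```r
# Hypothetical small data set in the same layout as dat1 (assumption, not the real data)
dat <- data.frame(
  S.No    = c("P1", "P2", "3", "4", "5"),
  AB001A  = c("-/-", "C/C", "C/C", "-/-", "A/A"),
  AB0002A = c("C/C", "C/C", "C/C", "C/C", "A/C"),
  AB362   = c("A/A", "A/A", "A/A", "A/A", "A/A"),
  stringsAsFactors = FALSE
)

markers <- names(dat)[-1]           # genotype columns
ref     <- unlist(dat[2, markers])  # reference genotypes taken from row 2 (P2)

# 1 where a genotype equals the reference in the same column, else 0
hits <- sapply(markers, function(m) as.integer(dat[[m]] == ref[[m]]))
hits[1:2, ] <- NA                   # P1 and P2 are not compared
colnames(hits) <- paste0(markers, "_1")

res2 <- cbind(dat, hits)
res2$SUM   <- rowSums(hits)
res2$MATCH <- 100 * res2$SUM / length(markers)
# Rank on the negated percentage so the best match gets rank 1;
# na.last = "keep" leaves the rank of P1 and P2 as NA
res2$RANK  <- rank(-res2$MATCH, ties.method = "min", na.last = "keep")
```

With this toy data, rows 3, 4 and 5 match P2 on 3, 2 and 1 markers, so they get ranks 1, 2 and 3. Because `length(markers)` replaces the hard-coded 3, the sketch should carry over to a different number of marker columns.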
> Hi Arun,
> Thank you very much for your help in solving my problem.
>
> S.No AB001A AB0002A AB362 AB001A AB0002A AB362 SUM %Match Rank
> P1   -/-    C/C     A/A
> P2   C/C    C/C     A/A
> 3    C/C    C/C     A/A
> 4    C/C    C/C     A/A
> 5    C/C    C/C     A/A
> 6    C/C    C/C     A/A
> 7    C/C    C/C     A/A
> 8    -/-    -/-     -/-
> 9    C/C    C/C     A/A
> 10   C/C    C/C     A/A
> 11   -/-    C/C     A/A
> 12   C/C    C/C     A/A
> 13   C/C    C/C     A/A
> 14   C/C    C/C     A/A
> 16   C/C    -/-     A/A
>
> Actually, I want to match observations 3 to 16 with the value in P2 (i.e. 3 with P2, 4 with P2, 5 with P2, etc.). If they match, I would like to give
> the value 1 and store it in the corresponding dummy variable, i.e. AB001A, and I would like to do the same thing for the remaining variables, storing the results in their
> dummy variables. Finally, I want to sum all the matches (i.e. scores of 1) in each row, calculate the percentage of matches, and then rank. This is what I
> want; sorry for not expressing my problem in an understandable way.
>
> Hi to all bloggers,
> My data looks like this:
>
> S.No AB001A AB0002A AB362 VAR1 VAR2 VAR3 SUM %Match Rank
> 1    -/-    C/C     A/A
> 2    C/C    C/C     A/A
> 3    C/C    C/C     A/A
> 4    C/C    C/C     A/A
> 5    C/C    C/C     A/A
> 6    C/C    C/C     A/A
> 7    C/C    C/C     A/A
> 8    -/-    -/-     -/-
> 9    C/C    C/C     A/A
> 10   C/C    C/C     A/A
> 11   -/-    C/C     A/A
> 12   C/C    C/C     A/A
> 13   C/C    C/C     A/A
> 14   C/C    C/C     A/A
> 16   C/C    -/-     A/A
> 17   -/-    C/C     A/A
> 18   C/C    C/C     A/A
> 19   C/C    C/C     A/A
>
> I want to match obs 3 with obs 2. If it matches exactly, the score will be 1, else 0; the score will be stored in VAR1 for AB001A, in VAR2 for AB0002A, and in
> VAR3 for AB362. I then want to calculate the sum of all the 1's, the match percentage for each observation, and their rank (top ten matchers). I did this successfully in
> Excel, but it took me a lot of time: I used an IF condition like =if(A3=A$2,1,0), dragged it across all obs, and then computed the sum over all obs, their
> %match and rank. My question is: how can I do this in R? Can I use the match package for this, or will other packages help me? My data is very big, with
> 5,15,567 obs.
> Can anyone guide me on how to do this in SAS, because I want to reduce the time it takes to analyze my data.
> Thanking you,
> Regards,

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.