Dear R helpers,

I am working on a genetics project with two data sets, X and Y. Some IDs appear in both data sets, and others appear in only one of them. I merged them with "merge(x,y,by="ID",all=TRUE)". Data set Y contains a variable (a genotype) that is also in data set X, so when I merged X with Y the two variables were automatically renamed by appending .x and .y to the original variable name.

As the listing below shows, I would like to take whichever value is available (non-missing, non-NA) in X or Y as the final value for the genotype S3Allele1. I used the paste() function, but it converts <NA> to the character string "NA". Could you please tell me how to get just the genotype, without the NA pasted onto it? I checked the documentation for paste() and noticed that it applies as.character() to its vector arguments. I guess that is why I got "NA" as a string in the new variable I created (S3Allele1). Should I use some other function to avoid this problem? Any insight is appreciated!
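The behaviour described above can be reproduced with a small sketch. The data frames and their values are made up for illustration and are not the real data. The ifelse()/is.na() step is only one possible way to pick the non-missing value, and it assumes that X and Y never disagree when both are present.

```r
# Toy data: the IDs and genotypes below are hypothetical, for illustration only.
x <- data.frame(ID = c(10003, 10004, 10033),
                S3Allele1 = c("G", "A", NA),
                stringsAsFactors = FALSE)
y <- data.frame(ID = c(10033, 10038),
                S3Allele1 = c(NA, "A"),
                stringsAsFactors = FALSE)

# Full outer join on ID; the shared column is renamed with suffixes .x and .y.
m <- merge(x, y, by = "ID", all = TRUE)

# paste() coerces its arguments with as.character(), so a missing value
# ends up as the literal string "NA" in the result.
pasted <- paste(m$S3Allele1.x, m$S3Allele1.y)

# One possible alternative: use the X value unless it is missing, else the Y value.
# as.character() guards against the columns being factors.
ax <- as.character(m$S3Allele1.x)
ay <- as.character(m$S3Allele1.y)
m$S3Allele1 <- ifelse(is.na(ax), ay, ax)

print(m)
```

With this approach a row that is missing in both data sets stays a real NA rather than becoming a string.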
      ID S3Allele1.x S3Allele1.y S3Allele1
1  10003           G        <NA>      G NA
2  10004           A        <NA>      A NA
3  10005           A        <NA>      A NA
4  10006           A        <NA>      A NA
5  10007           G        <NA>      G NA
6  10008           A        <NA>      A NA
7  10009           A        <NA>      A NA
8  10010           A        <NA>      A NA
9  10011           A        <NA>      A NA
10 10013           A        <NA>      A NA
11 10014           A        <NA>      A NA
12 10015           A        <NA>      A NA
13 10016           A        <NA>      A NA
14 10017           A        <NA>      A NA
15 10018           A        <NA>      A NA
16 10019           G        <NA>      G NA
17 10020           A        <NA>      A NA
18 10021           G        <NA>      G NA
19 10022           A        <NA>      A NA
20 10023           G        <NA>      G NA
21 10024           G        <NA>      G NA
22 10025           G        <NA>      G NA
23 10027           G        <NA>      G NA
24 10028           G        <NA>      G NA
25 10029           G        <NA>      G NA
26 10031           G        <NA>      G NA
27 10032           A        <NA>      A NA
28 10033                    <NA>        NA
29 10035           A        <NA>      A NA
30 10037           A        <NA>      A NA
31 10038        <NA>           A      NA A
32 10039        <NA>           A      NA A

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.