On Mon, Nov 21, 2011 at 7:42 AM, set <asta...@hotmail.com> wrote:
> Hello R users,
>
> I'm trying to replace numerical values in a datamatrix with strings. R does
> this except for numbers under 10000 starting with a 9 (eg 98, 970, 9504
> etc). This is really weird and I wondered whether someone had encountered
> such a problem or knows the solution. I'm using the next script:
>
> test_1 <- read.table("5+ref_151111clusters3.csv", header = TRUE, sep = ",",
> colClasses = "numeric")
> test_1[test_1 > 94885 & test_1 <= 113835] = "KE3926OT"
> test_1[test_1 != 0 & test_1 <= 18954] = "I8456"
> test_1[test_1 > 75944 & test_1 <= 94885] = "KE3873"
> test_1[test_1 > 56951 & test_1 <= 75944] = "KE3870"
> test_1[test_1 > 37991 & test_1 <= 56951] = "Cyprus1"
> test_1[test_1 > 18954 & test_1 <= 37991] = "ref"
> write.table(test_1, file = "test_replace7.txt", quote = FALSE, sep="\t")

I think others have already hinted at the problem, but here it is once
again, more explicitly: your line

  test_1[test_1 > 94885 & test_1 <= 113835] = "KE3926OT"

converts the entire test_1 to character (or at least the columns in which
a replacement happens). Once the values are character, they are compared as
strings rather than as numbers, and you will find "strange" results:

  > a = "109"
  > b = "9"
  > a < b
  [1] TRUE

Note that when one side of a comparison is numeric and the other character,
the numeric is converted to character and then they are compared:

  > b = 9
  > class(a)
  [1] "character"
  > class(b)
  [1] "numeric"
  > a < b
  [1] TRUE

This is why your entries starting with 9 are "ignored": as character
strings, values such as "98", "970" and "9504" sort after every upper bound
in your script, so none of the later conditions matches them.

The solution is simple: create a test_2 initialized to test_1,

  test_2 = test_1

then replace elements in test_2 depending on test_1, for example

  test_2[test_1 > 94885 & test_1 <= 113835] = "KE3926OT"

This way test_1 remains numeric and the comparisons will work as you
expect.

HTH

Peter

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
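
P.S. For completeness, here is a minimal sketch of the copy-then-replace approach from the reply above, applied to all six ranges of the original script. The small data frame is a made-up stand-in for the CSV file, which is not available here, so its column names and values are assumptions chosen only to include the problem cases 98, 970 and 9504.

```r
# Hypothetical stand-in for the data read from the CSV file
test_1 <- data.frame(a = c(0, 98, 9504, 20000, 100000),
                     b = c(970, 40000, 60000, 80000, 0))

# Work on a copy: test_2 receives the labels, while test_1 stays numeric,
# so every comparison below is a numeric comparison
test_2 <- test_1
test_2[test_1 > 94885 & test_1 <= 113835] <- "KE3926OT"
test_2[test_1 != 0    & test_1 <= 18954]  <- "I8456"
test_2[test_1 > 75944 & test_1 <= 94885]  <- "KE3873"
test_2[test_1 > 56951 & test_1 <= 75944]  <- "KE3870"
test_2[test_1 > 37991 & test_1 <= 56951]  <- "Cyprus1"
test_2[test_1 > 18954 & test_1 <= 37991]  <- "ref"

print(test_2)
```

With this toy input, 98, 970 and 9504 should all end up as "I8456", and the zeros are left alone (as the string "0", since the columns of test_2 become character). Writing the result out would then use test_2 instead of test_1 in the write.table() call.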