Thanks! As I said, cute exercise.

Best,
Bert

On Fri, Jul 10, 2020 at 1:21 PM Fox, John <j...@mcmaster.ca> wrote:

> Dear Bert,
>
> Wouldn't you know it, but your contribution arrived just after I pressed
> "send" on my last message? So here's how your solution compares:
>
> > microbenchmark(John = John <- xn[x],
> +                Rich = Rich <- xn[match(x, xc)],
> +                Jeff = Jeff <- {
> +                    n <- as.integer( sub( "[a-i]$", "", x ) )
> +                    d <- match( sub( "^\\d+", "", x ), letters[1:9] )
> +                    d[ is.na( d ) ] <- 0
> +                    n + d / 10
> +                },
> +                David = David <- as.numeric(gsub("a", ".3",
> +                                            gsub("b", ".5",
> +                                                 gsub("c", ".7", x)))),
> +                Bert = Bert <- {
> +                    nums <- sub("[[:alpha:]]+","",x)
> +                    alph <- sub("\\d+","",x)
> +                    as.numeric(nums) + ifelse(alph == "",0, vals[alph])
> +                },
> +                times=1000L
> + )
> Unit: microseconds
>   expr       min         lq       mean    median         uq       max neval cld
>   John   261.739   373.9765   599.9411   536.571   569.3750  14489.48  1000 a
>   Rich   250.697   372.4450   542.3208   520.383   554.7215  10682.73  1000 a
>   Jeff 10879.223 13477.7665 15647.7856 15549.255 17516.7420 146155.28  1000 b
>  David 14337.510 18375.0100 20325.8796 20187.174 22161.0195  32575.31  1000 d
>   Bert 12344.506 15753.2510 18024.2757 17702.838 19973.0465  32043.80  1000 c
> > all.equal(John, Rich)
> [1] TRUE
> > all.equal(John, David)
> [1] "names for target but not for current"
> > all.equal(John, Jeff)
> [1] "names for target but not for current" "Mean relative difference: 0.1498243"
> > all.equal(John, Bert)
> [1] "names for target but not for current"
>
> To make the comparison fair, I moved the parts of the solutions that don't
> depend on the length of the data outside the benchmark. Your solution does
> have the virtue of providing the right answer.
>
> Best,
>  John
>
> > On Jul 10, 2020, at 3:54 PM, Bert Gunter <bgunter.4...@gmail.com> wrote:
> >
> > ... and continuing with this cute little thread...
> >
> > I found the OP's specification a little imprecise -- are your values
> > always a string that begins with "some sort" of numeric value followed
> > by "some sort" of alpha code? That is, could the numeric value be several
> > digits and the alpha code several letters? Probably not, and the existing
> > solutions you have been provided are almost certainly all you need. But
> > for fun, assuming this more general specification, here is a general way
> > to split your alphanumeric codes up into numeric and alpha parts and then
> > convert by using a couple of sub()'s.
> >
> > > set.seed(131)
> > > xc <- sample(c("1", "1a", "1b", "1c", "2", "2a", "2b", "2c"), 15, replace = TRUE)
> > > nums <- sub("[[:alpha:]]+","",xc) ## extract numeric part
> > > alph <- sub("\\d+","",xc) ## extract alpha part
> > > codes <- letters[1:3] ## whatever alpha codes are used
> > > vals <- setNames(c(.3,.5,.7), codes) ## whatever numeric values to convert codes to
> > > xnew <- as.numeric(nums) + ifelse(alph == "",0, vals[alph])
> > > data.frame (xc = xc, xnew = xnew)
> >    xc xnew
> > 1  1a  1.3
> > 2   2  2.0
> > 3  1c  1.7
> > 4  1c  1.7
> > 5  1b  1.5
> > 6  1a  1.3
> > 7   2  2.0
> > 8   2  2.0
> > 9  1a  1.3
> > 10 1a  1.3
> > 11 2c  2.7
> > 12 1b  1.5
> > 13 1b  1.5
> > 14  1  1.0
> > 15 1c  1.7
> >
> > Echoing others, no claim for optimality in any sense.
> >
> > Cheers,
> > Bert
> >
> >
> > On Fri, Jul 10, 2020 at 12:28 PM David Carlson <dcarl...@tamu.edu> wrote:
> > Here is a different approach:
> >
> > xc <- c("1", "1a", "1b", "1c", "2", "2a", "2b", "2c")
> > xn <- as.numeric(gsub("a", ".3", gsub("b", ".5", gsub("c", ".7", xc))))
> > xn
> > # [1] 1.0 1.3 1.5 1.7 2.0 2.3 2.5 2.7
> >
> > David L Carlson
> > Professor Emeritus of Anthropology
> > Texas A&M University
> >
> > On Fri, Jul 10, 2020 at 1:10 PM Fox, John <j...@mcmaster.ca> wrote:
> >
> > > Dear Jean-Louis,
> > >
> > > There must be many ways to do this.
> > > Here's one simple way (with no claim of optimality!):
> > >
> > > > xc <- c("1", "1a", "1b", "1c", "2", "2a", "2b", "2c")
> > > > xn <- c(1, 1.3, 1.5, 1.7, 2, 2.3, 2.5, 2.7)
> > > >
> > > > set.seed(123) # for reproducibility
> > > > x <- sample(xc, 20, replace=TRUE) # "data"
> > > >
> > > > names(xn) <- xc
> > > > z <- xn[x]
> > > >
> > > > data.frame(z, x)
> > >      z  x
> > > 1  2.5 2b
> > > 2  2.5 2b
> > > 3  1.5 1b
> > > 4  2.3 2a
> > > 5  1.5 1b
> > > 6  1.3 1a
> > > 7  1.3 1a
> > > 8  2.3 2a
> > > 9  1.5 1b
> > > 10 2.0  2
> > > 11 1.7 1c
> > > 12 2.3 2a
> > > 13 2.3 2a
> > > 14 1.0  1
> > > 15 1.3 1a
> > > 16 1.5 1b
> > > 17 2.7 2c
> > > 18 2.0  2
> > > 19 1.5 1b
> > > 20 1.5 1b
> > >
> > > I hope this helps,
> > >  John
> > >
> > > -----------------------------
> > > John Fox, Professor Emeritus
> > > McMaster University
> > > Hamilton, Ontario, Canada
> > > Web: http::/socserv.mcmaster.ca/jfox
> > >
> > > > On Jul 10, 2020, at 1:50 PM, Jean-Louis Abitbol <abit...@sent.com> wrote:
> > > >
> > > > Dear All
> > > >
> > > > I have a character vector, representing histology stages, such as for
> > > > example:
> > > > xc <- c("1", "1a", "1b", "1c", "2", "2a", "2b", "2c")
> > > >
> > > > and this goes on to 3, 3a etc in various order for each patient. I do
> > > > have of course a pre-established classification available which does
> > > > change according to the histology criteria under assessment.
> > > >
> > > > I would want to convert xc, for plotting reasons, to a numeric vector
> > > > such as
> > > >
> > > > xn <- c(1, 1.3, 1.5, 1.7, 2, 2.3, 2.5, 2.7)
> > > >
> > > > Unfortunately I have no clue on how to do that.
> > > >
> > > > Thanks for any help and apologies if I am missing the obvious way to
> > > > do it.
> > > >
> > > > JL
> > > > --
> > > > Verif30042020
> > > >
> > > > ______________________________________________
> > > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > > https://urldefense.com/v3/__https://stat.ethz.ch/mailman/listinfo/r-help__;!!KwNVnqRv!V7p9rtNSgBWmF3KJ3U_01fR7vP_I7y-OnWHiTFxwRZ6bVJ3-emOwkBtcU3rSW6I$
> > > > PLEASE do read the posting guide
> > > > https://urldefense.com/v3/__http://www.R-project.org/posting-guide.html__;!!KwNVnqRv!V7p9rtNSgBWmF3KJ3U_01fR7vP_I7y-OnWHiTFxwRZ6bVJ3-emOwkBtcg7nzsmk$
> > > > and provide commented, minimal, self-contained, reproducible code.
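
One possible way to combine the approaches in this thread, offered as a sketch and not as anyone's posted solution: split each code with sub() as in Bert's message, look the suffix up in a named vector as in John's, and parse only the distinct codes so the regex work does not grow with the length of the data. The function name stage_to_numeric and its vals argument are hypothetical. The sketch assumes every code is a run of digits followed by at most one known suffix, and that there are no NA values.

```r
# Hypothetical helper (not from the thread): regex split plus named-vector
# lookup, applied to the distinct codes only and then expanded with match().
stage_to_numeric <- function(x, vals = c(a = 0.3, b = 0.5, c = 0.7)) {
  u    <- unique(x)                                   # distinct codes
  num  <- as.numeric(sub("[[:alpha:]]+$", "", u))     # leading numeric part
  alph <- sub("^[[:digit:]]+", "", u)                 # trailing alpha part
  off  <- unname(vals[alph])                          # NA for "" and unknown suffixes
  off[!nzchar(alph)] <- 0                             # no suffix means no offset
  if (anyNA(off)) {
    stop("unknown suffix in: ", paste(u[is.na(off)], collapse = ", "))
  }
  (num + off)[match(x, u)]                            # back to the original length
}

xc <- c("1", "1a", "1b", "1c", "2", "2a", "2b", "2c")
stage_to_numeric(xc)
# [1] 1.0 1.3 1.5 1.7 2.0 2.3 2.5 2.7
```

Because the offsets come from the lookup table and not from letter positions, a different classification only needs a different vals. An unrecognised suffix stops with an error instead of silently producing a wrong number.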