Is there a generic binary search routine in a standard library which a) works for character vectors b) runs in O(log(N)) time?
I'm aware of findInterval(x,vec), but it is restricted to numeric vectors. I'm also aware of various hashing solutions (e.g. new.env(hash=TRUE) and fastmatch), but I need the greatest-lower-bound match in my application. findInterval is also slow for large N=length(vec) because of the O(N) checking it does, as Duncan Murdoch has pointed out<https://stat.ethz.ch/pipermail/r-help/2008-September/174584.html>: though its documentation says it runs in O(n * log(N)), it actually runs in O(n * log(N) + N), which is quite noticeable for largish N. But that is easy enough to work around by writing a variant of findInterval which calls find_interv_vec without checking. -s PS Yes, binary search is a one-liner in R, but I always prefer to use standard, fast native libraries when possible.... binarysearch <- function(val,tab,L,H) {while (H>=L) { M=L+(H-L) %/% 2; if (tab[M]>val) H<-M-1 else if (tab[M]<val) L<-M+1 else return(M)}; return(L-1)} [[alternative HTML version deleted]] ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.