Is there a generic binary search routine in a standard library which

   a) works for character vectors
   b) runs in O(log(N)) time?

I'm aware of findInterval(x,vec), but it is restricted to numeric vectors.

I'm also aware of various hashing solutions (e.g. new.env(hash=TRUE) and
fastmatch), but I need the greatest-lower-bound match in my application.

findInterval is also slow for large N=length(vec) because of the O(N)
checking it does, as Duncan Murdoch has pointed
out<https://stat.ethz.ch/pipermail/r-help/2008-September/174584.html>:
though
its documentation says it runs in O(n * log(N)), it actually runs in O(n *
log(N) + N), which is quite noticeable for largish N.  But that is easy
enough to work around by writing a variant of findInterval which calls
find_interv_vec without checking.

            -s

PS Yes, binary search is a one-liner in R, but I always prefer to use
standard, fast native libraries when possible....

binarysearch <- function(val,tab,L,H) {while (H>=L) { M=L+(H-L) %/% 2; if
(tab[M]>val) H<-M-1 else if (tab[M]<val) L<-M+1 else return(M)};
return(L-1)}

        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to