Try data.table:::sortedmatch, which is implemented in C. It requires it's input to be sorted (and doesn't check)
"Stavros Macrakis" <macra...@alum.mit.edu> wrote in message news:BANLkTi=j2lf5syxytv1dd4k9wr0zgk8...@mail.gmail.com... > Is there a generic binary search routine in a standard library which > > a) works for character vectors > b) runs in O(log(N)) time? > > I'm aware of findInterval(x,vec), but it is restricted to numeric vectors. > > I'm also aware of various hashing solutions (e.g. new.env(hash=TRUE) and > fastmatch), but I need the greatest-lower-bound match in my application. > > findInterval is also slow for large N=length(vec) because of the O(N) > checking it does, as Duncan Murdoch has pointed > out<https://stat.ethz.ch/pipermail/r-help/2008-September/174584.html>: > though > its documentation says it runs in O(n * log(N)), it actually runs in O(n * > log(N) + N), which is quite noticeable for largish N. But that is easy > enough to work around by writing a variant of findInterval which calls > find_interv_vec without checking. > > -s > > PS Yes, binary search is a one-liner in R, but I always prefer to use > standard, fast native libraries when possible.... > > binarysearch <- function(val,tab,L,H) {while (H>=L) { M=L+(H-L) %/% 2; if > (tab[M]>val) H<-M-1 else if (tab[M]<val) L<-M+1 else return(M)}; > return(L-1)} > > [[alternative HTML version deleted]] > ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.