FYI, I got this reply from Elias. I'm forwarding it to the list so that it gets archived correctly. Thank you, Elias, by the way.

On Wed, Nov 30, 2011 at 5:45 PM, Elias Levy <fearsome.lucid...@gmail.com> wrote:

> On Wed, Nov 30, 2011 at 6:01 AM, <riak-users-requ...@lists.basho.com> wrote:
>
>> From: Jeroen van Dijk <jeroentjevand...@gmail.com>
>>
>> The use case I'm talking about is when you are looking for a term that is
>> very common and thus will yield many results. My understanding of the
>> implementation of Riak [citation needed] is that the search is divided into
>> a few phases. The first one is collecting results for each term. After that
>> comes merging, sorting and limiting the result set. So for this particular
>> case collecting all results would be infeasible and would kill performance.
>> Even when a limit is set because limiting comes in a phase after collecting
>> and the merging of results.
>
> That's correct. We have similar issues. We've resorted to creating the
> equivalent of multicolumn indexes by joining certain fields together and
> indexing those. That is only possible because most of the data we want to
> index is structured or semi-structured. You'd have to determine whether
> such an approach is feasible for your purposes.
>
> We also found 2i to be faster than Search, at the expense of requiring our
> app to perform tokenization for some of the fields we want to index, but
> we've stuck with Search as we need composable queries, which 2i does not
> yet provide.
>
>> I've read here [1] that one can use search_fold to interrupt the collecting
>> phase when enough results are fetched. I would like to know if this is a
>> best/official practice and if it really solves the issue?
>
> Search_fold will only be useful if you plan on developing in Erlang and,
> if my understanding is correct, if you don't care about the order of the
> results (i.e. no scoring or field sorting). Actually, the results may be
> partially ordered, as the merge_index backend may store the postings sorted
> by the inverse of time.
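
For anyone finding this in the archive later: the "multicolumn index" idea Elias describes can be sketched roughly as below. This is only a plain Python illustration of the concept, not Riak or Riak Search code. The field names, the "|" separator, and the helper names composite_term and build_index are all made up for the example. The point is that one lookup on the joined term replaces intersecting two large per-term result sets.

```python
# Hypothetical sketch: join several field values into one indexable term,
# so a query on both fields becomes a single exact-match lookup.

def composite_term(doc, fields, sep="|"):
    """Join the lowercased values of the given fields into one term."""
    return sep.join(str(doc[f]).lower() for f in fields)


def build_index(docs, fields):
    """Map each composite term to the list of document keys that carry it."""
    index = {}
    for key, doc in docs.items():
        index.setdefault(composite_term(doc, fields), []).append(key)
    return index


# Made-up sample documents (structured data, as the approach requires).
docs = {
    "k1": {"status": "open", "country": "NL"},
    "k2": {"status": "open", "country": "US"},
    "k3": {"status": "closed", "country": "NL"},
}

index = build_index(docs, ["status", "country"])

# One lookup instead of collecting every "open" document and every "NL"
# document and merging the two sets afterwards.
matches = index.get("open|nl", [])
```

The trade-off Elias mentions still applies: this only works when you know in advance which field combinations you will query, which is why it suits structured or semi-structured data.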