FYI, I got this reply from Elias. I'm forwarding it to the list so it
will be archived correctly. (Thank you Elias, btw.)

On Wed, Nov 30, 2011 at 5:45 PM, Elias Levy <fearsome.lucid...@gmail.com> wrote:

> On Wed, Nov 30, 2011 at 6:01 AM, <riak-users-requ...@lists.basho.com> wrote:
>
>> From: Jeroen van Dijk <jeroentjevand...@gmail.com>
>>
>> The use case I'm talking about is when you are looking for a term that is
>> very common and thus will yield many results. My understanding of the
>> implementation of Riak is that the search is divided into a few phases.
>> The first one is collecting results for each term. After that comes
>> merging, sorting and limiting the result set. So for this particular case
>> collecting all results would be infeasible and would kill performance,
>> even when a limit is set, because limiting comes in a phase after the
>> collecting and merging of results.
>>
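To make the concern above concrete, here is a toy sketch of a collect-then-limit search. It is not Riak's code; the `index` dict, the `reads` counter and both function names are made up for illustration. It only shows that when the limit is applied after the collect phase, every posting for the term is still read.

```python
def postings_for(term, index, reads):
    """Yield doc ids for a term, counting every posting that is read."""
    for doc_id in index.get(term, []):
        reads[term] = reads.get(term, 0) + 1
        yield doc_id


def search_collect_then_limit(term, index, limit, reads):
    """Toy model of the phases described above (hypothetical, not Riak's API)."""
    collected = list(postings_for(term, index, reads))  # collect: reads everything
    collected.sort()                                    # merge/sort phase
    return collected[:limit]                            # limit is applied last


# A very common term that matches 100,000 documents.
index = {"common": list(range(100_000))}
reads = {}
top = search_collect_then_limit("common", index, 10, reads)
```

Only ten results come back, but `reads["common"]` ends up equal to the full size of the posting list, which is the cost being described.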
>
> That's correct.  We have similar issues.  We've resorted to creating the
> equivalent of multicolumn indexes by joining certain fields together and
> indexing those.  That is only possible because most of the data we want to
> index is structured or semi-structured.  You'd have to determine whether
> such an approach is feasible for your purposes.
>
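A minimal sketch of the "multicolumn index" idea described above, assuming the documents are plain dicts with structured fields. The field names, the `_` separator and the helper names are assumptions for illustration, not anything Riak provides. The point is that a query on the joined term touches only the documents matching both fields, instead of intersecting two large posting lists.

```python
def composite_term(doc, fields, sep="_"):
    """Join the values of several fields into a single indexable term."""
    return sep.join(str(doc[f]).lower() for f in fields)


def build_index(docs, fields):
    """Map each composite term to the ids of the documents that carry it."""
    index = {}
    for doc_id, doc in docs.items():
        index.setdefault(composite_term(doc, fields), []).append(doc_id)
    return index


# Hypothetical structured documents.
docs = {
    "d1": {"status": "open", "region": "EU"},
    "d2": {"status": "open", "region": "US"},
    "d3": {"status": "closed", "region": "EU"},
}

index = build_index(docs, ["status", "region"])
matches = index.get("open_eu", [])  # one lookup instead of status AND region
```

The trade-off is that each field combination you want to query has to be chosen and indexed up front, which is why this only works for structured or semi-structured data.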
> We also found 2i to be faster than Search, at the expense of requiring our
> app to perform tokenization for some of the fields we want to index, but
> we've stuck with Search as we need composable queries, which 2i does not
> yet provide.
>
>> I've read here [1] that one can use search_fold to interrupt the collecting
>> phase when enough results are fetched. I would like to know if this is a
>> best/official practice and if it really solves the issue?
>>
>
> Search_fold will only be useful if you plan on developing in Erlang and,
> if my understanding is correct, if you don't care about the order of the
> results (i.e. no scoring or field sorting).  Actually, the results may be
> partially ordered, as the merge_index backend may store the postings sorted
> by the inverse of time.
>
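The early-termination idea can be sketched as a fold whose callback decides when to stop. This is a Python analogy only: the real search_fold is Erlang, and `fold_until`, `take` and the sample posting order below are hypothetical. It shows both halves of the point above: reading stops once enough results are in, and the results come back in stored order rather than by score.

```python
def fold_until(postings, fun, acc):
    """Fold over postings; fun returns (keep_going, acc) and can stop early."""
    for posting in postings:
        keep_going, acc = fun(posting, acc)
        if not keep_going:
            break
    return acc


def take(n):
    """Build a fold callback that stops once n postings are gathered (n >= 1)."""
    def fun(posting, acc):
        acc.append(posting)
        return len(acc) < n, acc
    return fun


reads = [0]  # counts how many postings were actually read


def stored_order():
    """Postings in whatever order the backend stores them (assumed here)."""
    for doc_id in [42, 7, 99, 3, 58]:
        reads[0] += 1
        yield doc_id


first_three = fold_until(stored_order(), take(3), [])
```

Only three postings are read, and `first_three` reflects storage order, so any scoring or field sorting would still need the full result set.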
_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com