Having just gotten back from the Lucene Revolution convention, where I saw several hours of presentations that were essentially about how to configure large distributed Lucene applications, I find Riak Search REALLY interesting. A couple of questions:
I know that in Lucene and Solr, committing many newly indexed documents at once performs much better than committing them one at a time. Is there a similar performance cost to indexing one document at a time via the pre-commit hook, as opposed to indexing in bulk via the search-cmd program?

Also, the decision to partition the index by terms rather than by documents strikes me as the most interesting design choice in Riak Search. Could it lead to unbalanced node utilization during queries? For example, I'd like to build a large search application that enforces access control through the index, by adding a few extra clauses to the queries users generate, so a handful of terms would appear in almost every query. Would a query load like that leave a few nodes much more heavily utilized than the others?

Awesome, awesome work, I can't wait to try this out.

Greg
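P.S. To make the access-control scenario concrete, here is roughly the kind of query rewriting I have in mind; the field name, group names, and helper function are just a made-up sketch, not anything Riak Search provides:

```python
# Hypothetical sketch: wrap a user's query with access-control clauses
# before handing it to the search cluster. The acl_group field and the
# group names are invented for illustration.
def build_query(user_query, user_groups):
    # Restrict results to documents visible to any of the user's groups.
    acl_clause = " OR ".join("acl_group:%s" % g for g in user_groups)
    return "(%s) AND (%s)" % (user_query, acl_clause)

print(build_query("title:lucene AND body:search", ["staff", "eng"]))
# -> (title:lucene AND body:search) AND (acl_group:staff OR acl_group:eng)
```

Since the same few acl_group terms would show up in nearly every query, my worry is that the partitions owning those terms would see a disproportionate share of the query traffic.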