On Sun, Oct 14, 2012 at 12:33 AM, Pavel Kogan <pavel.ko...@cortica.com> wrote:

> 1) Does enabling search have any impact on read latency/throughput?
If you are reading and searching at the same time, there is a good chance it will. Search causes additional disk seeks.

> 2) Does enabling search have any impact on RAM usage?

Yes. The index engine behind Riak Search makes heavy use of Erlang ETS tables. Each partition has an in-memory buffer as well as an in-memory offset table for every segment. It also uses a temporary ETS table for every write to store posting data. In overload scenarios the ETS system limit can even become an issue.

> 3) In production we have no search enabled. What is the best way to
> enable search without stopping production? I thought about something like:
> 1) Enable search node after node.

You could change the app env dynamically, but that's only half the problem. The other half is starting the Riak Search application itself. I think application:start(merge_index) followed by application:start(riak_search) should work, but I'm not 100% sure and this has not been tested. You'll also want to edit every node's app.config so that the change is persistent.

> 2) Execute some night script that runs on all keys and overwrites them
> back with the proper MIME type.

Yes. You'll want to install the commit hook on the buckets you wish to index. Then you'll want to do a streaming list-keys or a bucket map-reduce and re-write the data.

> 4) If we see that the search overhead is something we can't handle, is
> there a simple way to disable it without stopping production?

I think the best course of action in this case would be to disable the commit hook. But you would have to keep track of anything written during that window and re-write it after re-installing the hook. If you don't, you'll have to re-index everything, because you won't know what you missed.

> 5) In what case would we need repair? It is said - on replica loss, but
> if I understand correctly we have 3 replicas on different nodes, don't we?
> If repair is needed, how difficult and how long would it be for a large
> cluster (about 100 nodes)?
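The node-by-node start sequence described above might look like the following from an attached console (`riak attach`). This is an untested sketch, per the caveat above; the `{riak_search, [{enabled, true}]}` setting is the Riak 1.x app.config flag.

```erlang
%% Untested sketch: enabling Riak Search on a live node (Riak 1.x era).
%% Run from `riak attach`, one node at a time.

%% Mirror the app.config setting {riak_search, [{enabled, true}]}
%% in the running node's environment:
application:set_env(riak_search, enabled, true).

%% Start the index engine, then Riak Search itself:
application:start(merge_index).
application:start(riak_search).
```

Remember to also set the flag in each node's app.config so the change survives a restart.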
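Installing the commit hook (and removing it again, for question 4) can be sketched with `riak_search_kv_hook` from the console; this assumes a Riak 1.x node, and `<<"mybucket">>` is a placeholder bucket name:

```erlang
%% Sketch, from `riak attach`; <<"mybucket">> is a placeholder name.

%% Install the search precommit hook so new writes to the bucket
%% are indexed:
riak_search_kv_hook:install(<<"mybucket">>).

%% To stop indexing later without stopping the node, remove the hook.
%% Anything written while the hook is off must be re-written after
%% re-installing it:
riak_search_kv_hook:uninstall(<<"mybucket">>).
```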
Repair is done on a per-partition basis; the number of nodes doesn't come into play. Repair is very specific in that it requires the adjacent partitions to be in a good, convergent state. If they aren't, repair isn't much help.

A lot of these entropy issues go away in Yokozuna. There, index repair is done automatically, in the background, in an efficient manner. There is no need to re-write data or run manual repair commands.

-Z
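For completeness, per-partition search-index repair in this era of Riak was triggered from an attached console; a hedged sketch (the partition id below is purely illustrative):

```erlang
%% Sketch, from `riak attach` on the node that owns the partition.
%% The partition id is an illustrative placeholder, not a real value.
Partition = 0,
riak_search_vnode:repair(Partition).
```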
_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com