Hi Ryan,

From the looks of the crash log, it seems that one of your merge index files may be corrupt (did you run out of disk space, or crash a node?).
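For what it's worth, the length prefix in the failing term from the crash log can be decoded by hand. A minimal Python sketch — the byte values are the ones from the crash log, everything else is illustrative:

```python
import struct

# First bytes of the binary from the crash log:
# 131 = external term format version, 109 = BINARY_EXT tag,
# then a 4-byte big-endian length prefix for the payload.
header = bytes([131, 109, 0, 0, 128, 40])

version, tag = header[0], header[1]
(declared_len,) = struct.unpack(">I", header[2:6])

print(version, tag, declared_len)  # 131 109 32808
```

A 46-byte binary obviously cannot hold a 32808-byte payload plus the 6-byte header, which is why binary_to_term gives up on it.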
At any rate, what seems to be happening is that the search vnode is in the middle of a handoff (presumably to the new machine), and while doing a full scan of the merge index segment files to transfer data, it encounters a bad file. What the crash log shows is that it tries to do binary_to_term(<<131,109,0,0,128,40,...>>) on a 46-byte binary, but the encoded stream says the data should be 128*256+40 = 32808 bytes long. So something is too short, which I would guess happened because the server either crashed or ran out of disk. From a casual inspection of the code, it doesn't look like merge indexes are resilient to a node crashing while it is writing to disk.

I don't know search intimately, but I have seen mention of problems before that were caused by "bad indexes", and the resolution seems to be to delete the merge index files (the search index in your case is /var/lib/riak/merge_index/159851741583067506678528028578343455274867621888), and then iterate over all values and re-write them. Bummer.

Perhaps someone from Basho can chime in and tell us (A) whether it seems plausible that the merge index segment files are indeed corrupt, and (B) if so, what the right way to recover from that is.

Kresten

On Jan 25, 2012, at 9:18 PM, Fisher, Ryan wrote:

Hello all,

We are hitting an issue with a riak 1.0.3 cluster when adding new nodes to the ring. Specifically, the handoff appears stuck and isn't making any progress. I have read a number of the threads on here and realize handoff will take a while, and I have also tried attaching to the console and doing a force_update along with force_handoffs. However, over 12 hours later the nodes haven't made any progress. After digging through the log files, it appears that the search merge_index could be my problem? Possibly the compaction isn't occurring properly?
We are running a riak 1.0.3 cluster for a research project, where we are using the python client for reads, writes, and queries of the cluster. With a small data set of 20k keys, things were humming along nicely. We then started to ramp up the number of objects and ended up at around 1M. At that point I added an additional node (with plans to expand to 8 nodes total). However, it appears that the partition handoff got stuck after performing the 'join' command on the 5th node I was adding. So currently it is a 4 + 1 node cluster with 4 GB of memory per node, running the bitcask backend with 'search' enabled on some of the buckets.

Specifically, I am using the 'out of the box' JSON encoding schema by simply setting the mime-type to "application/json" when I do the store from the python client. I'm wondering if enabling search and using the default JSON schema was too much data to index? Outside of increasing the linux file limit on the nodes, enabling 'search' (in the config file and with the pre-commit hook), and upping the ring_creation_size to 256 (before I started or added any nodes), there shouldn't be much else out of the ordinary going on. This was originally a riak 1.0 cluster which I have been upgrading via rolling upgrades as the bug-fix versions come out; currently all 4 + 1 nodes are on 1.0.3.

Here are the (I hope) relevant logs:

Riak error log: http://pastebin.com/99cdPdCk
Riak crash log: http://pastebin.com/07FRZkf2
Riak erlang log: http://pastebin.com/DvdasWyR

Does anyone have any ideas on how to 'unstick' the partition handoff? Or, maybe the bigger question: is indexing all of the incoming data (outside of the disk space requirements) a bad idea? Perhaps I need to write a custom schema that limits what gets indexed?
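For reference, the "enabling search in the config file" step mentioned above comes down to an app.config entry along these lines; this is a sketch of the 1.0.x config layout, so check it against your own app.config:

```erlang
%% app.config excerpt: enable the Riak Search subsystem on this node
{riak_search, [
    {enabled, true}
]}
```

The per-bucket pre-commit hook (riak_search_kv_hook) is then installed with something like `search-cmd install <bucket>`, or by setting the bucket's precommit property directly.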
I should mention that the search is a 'nice-to-have', but the data is structured in a way that we know the keys we need at lookup time (for the most part), and I can probably use m/r to query the rest… With that, I'm wondering: if it comes down to it, can search be easily 'undone' on the cluster? Maybe as simply as disabling the pre-commit hook, turning it off in the app.config, and then deleting the riak/merge_index directories on each node?

Thanks,
ryan

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

Mobile: +45 2343 4626 | Skype: krestenkrabthorup | Twitter: @drkrab
Trifork A/S | Margrethepladsen 4 | DK-8000 Aarhus C | Phone: +45 8732 8787 | www.trifork.com