Re: Severe problems when adding a new node

2011-11-09 Thread John Axel Eriksson
To illustrate ONE problem we have (another problem is that the data returned is sometimes garbage):

john@app-001:~$ curl -I http://localhost:8098/luwak/a5bbc21f0bcfcea4d51c4eedbc9ee5596b4cc6f1
HTTP/1.1 200 OK
Vary: Accept-Encoding
Transfer-Encoding: chunked
Server: MochiWeb/1.1 WebMachine/1.9.0

Re: Severe problems when adding a new node

2011-11-08 Thread John Axel Eriksson
Thanks for the emails detailing this issue, both private and to the list. I've got a question for the list on our situation: As stated, we upgraded from 0.14.2 to 1.0.1 and then added a new node to our cluster. This really messed things up and nodes started crashing. In the end I opted

Re: Severe problems when adding a new node

2011-10-28 Thread David Smith
Hi John, et al.

On Fri, Oct 28, 2011 at 5:03 PM, John Axel Eriksson wrote:
> I don't want to be too hard on you fine people of Basho and you provide a
> really great system in Riak and I understand what you're aiming for, but if
> anything as bad as this ever happens in the future you might w

Re: Severe problems when adding a new node

2011-10-28 Thread John Axel Eriksson
I've got the utmost respect for developers such as yourselves (Basho) and we've had great success using Riak - we have been using it in production since 0.11. We've had our share of problems with it during this whole time but none as big as this. I can't understand why this wasn't posted somewhere

Re: Severe problems when adding a new node

2011-10-28 Thread Kelly McLaughlin
John, It appears you've run into a race condition with adding and leaving nodes that's present in 1.0.1. The problem happens during handoff and can cause bitcask directories to be unexpectedly deleted. We have identified the issue and we are in the process of correcting it, testing, and generat

Re: Severe problems when adding a new node

2011-10-28 Thread Aphyr
I was waiting for Basho to write an official notice about this, but it's been three days and I really don't want anyone else to go through this shitshow. 1.0.1 contains a race condition which can cause vnodes to crash during partition drop. This crash will kill the entire riak process. On our