Would it work to change /usr/sbin/riak to delete stray .lock files on start?

On May 10, 2011, at 4:10 AM, Nico Meyer <nico.me...@adition.com> wrote:

> Hi again!
> I just encountered this problem again myself, so I was able to check my theory.
> So one of the bitcask.write.lock files contained this:
> 2272 /var/lib/riak/bitcask/1121816686466884466511812771987303177196838846464/1305008752.bitcask.data
> and sure enough 'ps axu' gives me:
> riak      2269  0.0  0.0  10624   396 ?        S    12:46   0:00 inet_gethost 4
> riak      2270  0.0  0.0  10624   432 ?        S    12:46   0:00 inet_gethost 4
> riak      2271  0.0  0.0  10624   432 ?        S    12:46   0:00 inet_gethost 4
> riak      2272  0.0  0.0  10624   384 ?        S    12:46   0:00 inet_gethost 4
> root      3139  0.0  0.0      0     0 ?        S    13:00   0:00 [flush-254:1]
> Cheers,
> Nico
On 10.05.2011 03:07, Gary William Flake wrote:
>> (Removing riak-users.)
>> This was on an Ubuntu 10.04 box.  Riaksearch was auto started in init.d but 
>> we occasionally start/stop the service as part of our application stack.  In 
>> this one case, we did a shutdown from an admin web console, which may not 
>> have called the proper shutdown procedures in init.d.  On restart, I noticed 
>> the issues and found the locked files.  Removing them did the trick.
>> -- GWF
On May 9, 2011, at 7:10 AM, David Smith wrote:
>>> Hmm...ok. Will have to ponder how we can fix that.
>>> Thanks!
>>> D.
On Mon, May 9, 2011 at 8:09 AM, Nico Meyer<nico.me...@adition.com>  wrote:
>>>> Hi Dave,
>>>> I believe the problem occurs if there happens to be another process with
>>>> the same PID as the old (now gone) riak node. This can happen if the
>>>> machine was rebooted since the riak node crashed or if the PIDs wrapped,
>>>> they are only two bytes after all.
>>>> os_pid_exists/1 only checks for ANY process with the PID from the
>>>> lockfile
>>>> (https://github.com/basho/bitcask/blob/master/src/bitcask_lockops.erl#L116).
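To illustrate the point above, here is a minimal sketch (not Bitcask's actual code) of a stricter stale-lock check: rather than asking whether ANY process has the PID from the lock file, it also asks whether that process looks like an Erlang VM. The assumptions are mine: the lock file holds "PID path" as in the example from this thread, the host is Linux with /proc/<pid>/comm available, and a live node's command name starts with "beam". The names parse_lock, read_comm and lock_is_stale are hypothetical.

```python
def parse_lock(contents):
    """Split lock-file contents of the form 'PID path' into (pid, path)."""
    parts = contents.split(None, 1)
    pid = int(parts[0])
    path = parts[1].strip() if len(parts) > 1 else ""
    return pid, path


def read_comm(pid):
    """Return the command name for a PID on Linux, or None if it is gone."""
    try:
        with open(f"/proc/{pid}/comm") as f:
            return f.read().strip()
    except OSError:
        return None


def lock_is_stale(contents, expected_prefix="beam", comm_reader=read_comm):
    """Treat the lock as stale if the PID is gone OR belongs to some
    unrelated process (assumption: a live node's name starts with 'beam')."""
    pid, _path = parse_lock(contents)
    comm = comm_reader(pid)
    return comm is None or not comm.startswith(expected_prefix)


if __name__ == "__main__":
    # Lock contents as reported in this thread; PID 2272 was an inet_gethost
    # process, so a name-aware check would call the lock stale.
    lock = ("2272 /var/lib/riak/bitcask/"
            "1121816686466884466511812771987303177196838846464/"
            "1305008752.bitcask.data")
    print(lock_is_stale(lock, comm_reader=lambda pid: "inet_gethost"))
```

This is still only a heuristic: an unrelated Erlang VM could reuse the PID, so a more careful check might also compare the process start time against the lock file's timestamp.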
On Monday, 09.05.2011, 07:06 -0600, David Smith wrote:
On Sat, May 7, 2011 at 9:25 AM, Gary William Flake<g...@flake.org>  wrote:
>>>>>> That was it, Nico.  Thanks.
>>>>>> I know we did a forced shutdown this week, which was probably the cause.
>>>>>> But I would have thought that riak would have taken care of its own
>>>>>> lock file bookkeeping on restarting.
>>>>> Bitcask does:
>>>>> https://github.com/basho/bitcask/blob/master/src/bitcask_lockops.erl#L46
>>>>> It's curious that the logic didn't handle the case. What platform/OS
>>>>> are you on? Are you using init scripts to restart on boot?
>>>>> Thanks,
>>>>> D.
>>> -- 
>>> Dave Smith
>>> Director, Engineering
>>> Basho Technologies, Inc.
>>> diz...@basho.com
