I had a thought that you could compare the lock file's ctime to the owning process's start time, but apparently the two won't be exactly equal. Just posting in case it sparks an idea for anyone else:

jesse@minerva:~> sudo !!
sudo cat /var/lib/riaksearch/bitcask/570899077082383952423314387779798054553098649600/bitcask.write.lock
[sudo] password for jesse: 
16510 /var/lib/riaksearch/bitcask/570899077082383952423314387779798054553098649600/1304623349.bitcask.data
jesse@minerva:~> ls -lc /var/lib/riaksearch/bitcask/570899077082383952423314387779798054553098649600/bitcask.write.lock
-rw------- 1 riak riak 107 2011-05-05 15:17 /var/lib/riaksearch/bitcask/570899077082383952423314387779798054553098649600/bitcask.write.lock
jesse@minerva:~> ps -eo pid,lstart | egrep ^16510
16510 Wed May  4 22:00:45 2011
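
That said, the ordering might still be usable: if the process that currently holds the PID started *after* the lock file was created, it can't be the writer that took the lock. A rough, untested sketch of that heuristic (the script itself is hypothetical, not part of riak; assumes GNU stat/date and procps):

#!/bin/bash
# check_lock.sh <lockfile> -- hypothetical stale-lock heuristic
lock=$1
pid=$(awk '{print $1; exit}' "$lock")   # first field of the lock file is the writer's PID

# No process with that PID at all -> certainly stale.
start=$(ps -o lstart= -p "$pid") || { echo "stale: PID $pid is gone"; exit 0; }

lock_s=$(stat -c %Z "$lock")            # lock file ctime, epoch seconds (GNU stat)
start_s=$(date -d "$start" +%s)         # process start time, epoch seconds (GNU date)

# A process that started after the lock was created cannot be the
# writer that took it, even though the PID matches.
if [ "$start_s" -gt "$lock_s" ]; then
    echo "stale: PID $pid started after the lock was created"
else
    echo "possibly live: PID $pid predates the lock"
fi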

On 05/10/2011 09:11 AM, Greg Nelson wrote:
Would it work to change /usr/sbin/riak to delete stray .lock files on start?
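
For instance, an untested sketch of such a cleanup, run before the node is launched (paths assume a default package install):

for lock in /var/lib/riak/bitcask/*/bitcask.write.lock; do
    [ -e "$lock" ] || continue
    pid=$(awk '{print $1; exit}' "$lock")
    # Keep the lock only if the PID still belongs to an Erlang VM;
    # testing PID existence alone is the very bug discussed below.
    if ! ps -p "$pid" -o comm= 2>/dev/null | grep -q '^beam'; then
        rm -f "$lock"
    fi
done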

Sent from my iPhone

On May 10, 2011, at 4:10 AM, Nico Meyer <nico.me...@adition.com> wrote:

Hi again!

I just encountered this problem again myself, so I was able to check my theory. One of the bitcask.write.lock files contained this:

2272 /var/lib/riak/bitcask/1121816686466884466511812771987303177196838846464/1305008752.bitcask.data

and sure enough 'ps axu' gives me:

riak      2269  0.0  0.0  10624   396 ?        S    12:46   0:00 inet_gethost 4
riak      2270  0.0  0.0  10624   432 ?        S    12:46   0:00 inet_gethost 4
riak      2271  0.0  0.0  10624   432 ?        S    12:46   0:00 inet_gethost 4
riak      2272  0.0  0.0  10624   384 ?        S    12:46   0:00 inet_gethost 4
root      3139  0.0  0.0      0     0 ?        S    13:00   0:00 [flush-254:1]
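
To repeat that lookup on any lock file (the <partition> path component below is just a placeholder):

pid=$(awk '{print $1; exit}' /var/lib/riak/bitcask/<partition>/bitcask.write.lock)
ps -o pid,user,lstart,comm -p "$pid"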

Cheers,
Nico

On 10.05.2011 at 03:07, Gary William Flake wrote:
(Removing riak-users.)

This was on an Ubuntu 10.04 box.  Riaksearch was auto-started from init.d, but we occasionally start/stop the service as part of our application stack.  In this one case, we did a shutdown from an admin web console, which may not have called the proper shutdown procedures in init.d.  On restart, I noticed the issues and found the stale lock files.  Removing them did the trick.
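
For reference, the cleanup amounted to something like this, with the node fully stopped first (removing a lock under a live writer would be unsafe; path per a default riaksearch layout):

find /var/lib/riaksearch/bitcask -name bitcask.write.lock -delete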

-- GWF

On May 9, 2011, at 7:10 AM, David Smith wrote:

Hmm...ok. Will have to ponder how we can fix that.

Thanks!

D.

On Mon, May 9, 2011 at 8:09 AM, Nico Meyer <nico.me...@adition.com> wrote:
Hi Dave,

I believe the problem occurs if there happens to be another process with
the same PID as the old (now gone) riak node. This can happen if the
machine was rebooted since the riak node crashed, or if the PIDs wrapped
around; they are only two bytes, after all.
os_pid_exists/1 only checks for ANY process with the PID from the
lockfile
(https://github.com/basho/bitcask/blob/master/src/bitcask_lockops.erl#L116).
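
In shell terms the current check amounts to the first test below; a stricter version would also look at what the process actually is (assuming the node shows up in ps as beam or beam.smp):

# What os_pid_exists/1 effectively asks: is there *any* process with this PID?
ps -p "$pid" > /dev/null && echo "lock treated as live"

# A stricter test would also confirm the PID still belongs to an Erlang VM:
ps -p "$pid" -o comm= | grep -q '^beam' && echo "lock really is live"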



On Monday, 09.05.2011 at 07:06 -0600, David Smith wrote:
On Sat, May 7, 2011 at 9:25 AM, Gary William Flake <g...@flake.org> wrote:
That was it, Nico.  Thanks.

I know we did a forced shutdown this week, which was probably the cause.  But I would have thought that riak would take care of its own lock file bookkeeping on restart.
Bitcask does:

https://github.com/basho/bitcask/blob/master/src/bitcask_lockops.erl#L46

It's curious that the logic didn't handle the case. What platform/OS
are you on? Are you using init scripts to restart on boot?

Thanks,

D.

-- 
Dave Smith
Director, Engineering
Basho Technologies, Inc.
diz...@basho.com
