Quick follow-up: As a bit of a hack, deleting all the .hint files prior to each start-up does resolve the errors, and immediately results in a whole lot of Bitcask merges happening. But that doesn't strike me as a good long-term fix.
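
For anyone wanting to reproduce the workaround, it amounts to roughly the sketch below. The function name `clean_bitcask_state` is made up for illustration, and `/var/lib/riak` is simply the data directory from my setup. It removes the stale lock files as before, plus every `*.hint` file. As far as I understand it, Bitcask can rebuild its key index by scanning the data files when a hint file is missing, so this costs start-up time rather than data, but treat that as an assumption and not a recommendation.

```shell
#!/bin/sh
# Hypothetical pre-start cleanup helper; the name and default path are
# illustrative only. Run it while Riak is NOT running.
clean_bitcask_state() {
    data_dir="${1:-/var/lib/riak}"

    # Stale lock files left behind when Riak is killed instead of stopped.
    find "$data_dir" -type f -name 'bitcask.*.lock' -delete

    # Hint files only; the *.bitcask.data files are left untouched.
    find "$data_dir" -type f -name '*.hint' -delete
}
```

In the container start-up script this would be called as `clean_bitcask_state /var/lib/riak` immediately before Riak is launched.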
On Fri, 23 Oct 2015 at 10:52 Toby Corkindale <t...@dryft.net> wrote:

> Hi Hector,
> You can see the Dockerfile here:
> https://gist.github.com/TJC/cb3184705bc0eacde885
>
> It's a work in progress, but also, not that involved.
>
> Ubuntu 14.04 is used as both the docker host, and the docker container.
> It's on the btrfs storage driver. (I've had too many issues with the other
> two)
> The Riak data directory is a volume, and is mounted to an external,
> persistent location. (Which is also btrfs)
>
> I suspect there's an issue around Riak shutting down uncleanly when the
> docker container is stopped.
> I have already had to add this to the start-up each time:
> find /var/lib/riak -name "bitcask.*.lock" -delete
>
> So it's clear that Riak is getting killed rather than shutting down
> cleanly; but even so, I'd hope that Riak would cope with that, rather than
> getting into a permanent state of throwing errors.
>
> Toby
>
>
> On Fri, 23 Oct 2015 at 00:01 Hector Castro <hectcas...@gmail.com> wrote:
>
>> Can't say I've paid enough attention to the logs in my single-machine
>> Riak within Docker setups to confirm.
>>
>> Do you have the container image definitions somewhere public? That may
>> help someone reproduce the issue. Also, did you ensure that the Riak
>> data directory is setup as a Docker volume?
>>
>> Other things that come to mind:
>>
>> - What OS is the Docker host running?
>> - What storage driver are you using for Docker?
>> - What file system is the Docker data directory using?
>>
>> --
>> Hector
>>
>>
>> On Thu, Oct 22, 2015 at 2:27 AM, Toby Corkindale <t...@dryft.net> wrote:
>> > Anyone?
>> >
>> > I note that after 24 hours (on a very lightly loaded test cluster) I'm
>> > still seeing these scroll by a lot - 600 an hour per node.
>> > Really curious to know if this is expected behaviour or if this is
>> > resulting from some kind of node corruption.
>> >
>> > Cheers
>> > Toby
>> >
>> >
>> >
>> > On Wed, 21 Oct 2015 at 12:23 Toby Corkindale <t...@dryft.net> wrote:
>> >>
>> >> Hi,
>> >> I've been working on getting Riak to run inside Docker containers - in a
>> >> multi-machine cluster. (Previous work I've seen has only run Riak as a
>> >> cluster all on the same machine.)
>> >> I thought I had it cracked, although I tripped up on the existing issue
>> >> with Riak and lockfiles[1]. But the nodes have been generating an awful
>> >> lot of errors like the below, and I wondered if anyone here can give me
>> >> an explanation? (And, is it a problem?)
>> >>
>> >> 2015-10-21 01:19:23.567 [error] <0.24495.0> Error folding keys for
>> >> "/var/lib/riak/bitcask.1h/228359630832953580969325755111919221821239459840/2.bitcask.data":
>> >> {incomplete_hint,4}
>> >>
>> >> 1: Related issues to the lockfiles --
>> >> I note that many are closed, but the problem still exists, and is
>> >> particularly triggered by using Docker and stopping/killing Riak more
>> >> violently than it likes.
>> >> https://github.com/basho/bitcask/issues/163 (closed)
>> >> https://github.com/basho/riak/issues/535 (open)
>> >> https://github.com/basho/bitcask/issues/167 (closed)
>> >> https://github.com/basho/bitcask/issues/99 (closed)
>> >
>> >
>> > _______________________________________________
>> > riak-users mailing list
>> > riak-users@lists.basho.com
>> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>> >