Thanks a lot for the explanation. If I understand it correctly, it's basically
back pressure from C*: it's telling me that it's overloaded and that I need
to back off.

I better start a few more nodes, I guess.
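
For the backing-off part, here is a minimal sketch of what a client-side retry
with exponential backoff and jitter could look like. It is written in Java only
to match the snippet quoted below; my app is JRuby, so treat it as an
illustration. OverloadedError, BackoffSketch, and the attempt/delay parameters
are all made-up names and values, not part of cql-rb or any other driver.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Supplier;

// Hypothetical stand-in for whatever error the driver surfaces when the
// coordinator answers "Too many in flight hints"; not a real driver class.
class OverloadedError extends RuntimeException {
    OverloadedError(String message) { super(message); }
}

final class BackoffSketch {
    // Upper bound on the sleep before retry number `attempt` (0-based):
    // baseMs * 2^attempt, capped at maxDelayMs.
    static long delayCapMs(int attempt, long baseMs, long maxDelayMs) {
        long cap = baseMs << Math.min(attempt, 20);
        return Math.min(cap, maxDelayMs);
    }

    // Runs the write; after each overload error, sleeps a random time up to
    // the current cap and tries again, giving up after maxAttempts tries.
    static <T> T withBackoff(Supplier<T> write, int maxAttempts, long baseMs, long maxDelayMs) {
        for (int attempt = 0; ; attempt++) {
            try {
                return write.get();
            } catch (OverloadedError e) {
                if (attempt + 1 >= maxAttempts) {
                    throw e;
                }
                long cap = delayCapMs(attempt, baseMs, maxDelayMs);
                try {
                    Thread.sleep(ThreadLocalRandom.current().nextLong(cap + 1));
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while backing off", ie);
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated write that is rejected twice before succeeding.
        String result = withBackoff(() -> {
            if (++calls[0] < 3) {
                throw new OverloadedError("Too many in flight hints: 2411");
            }
            return "ok";
        }, 5, 10, 1000);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

The random sleep is there so that many writers hitting the same error don't all
retry at the same instant.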

T#


On Thu, May 30, 2013 at 10:47 PM, Robert Coli <rc...@eventbrite.com> wrote:

> On Thu, May 30, 2013 at 8:24 AM, Theo Hultberg <t...@iconara.net> wrote:
> > I'm using Cassandra 1.2.4 on EC2 (3 x m1.large, this is a test cluster),
> > and my application is talking to it over the binary protocol (I'm using
> > JRuby and the cql-rb driver). I get this error quite frequently: "Too many
> > in flight hints: 2411" (the exact number varies)
> >
> > Has anyone any idea of what's causing it? I'm pushing the cluster quite
> > hard with writes (but no reads at all).
>
> The code that produces this message (below) sets the bound based on
> the number of available processors. It is a bound on the number of in
> progress hints. An in progress hint (for some reason redundantly
> referred to as "in flight") is a hint which has been submitted to the
> executor which will ultimately write it to local disk. If you get
> OverloadedException, this means that you were trying to write hints to
> this executor so fast that you risked OOM, so Cassandra refused to
> submit your hint to the hint executor and therefore (partially) failed
> your write.
>
> "
> private static volatile int maxHintsInProgress = 1024 * FBUtilities.getAvailableProcessors();
> [... snip ...]
>         for (InetAddress destination : targets)
>         {
>             // avoid OOMing due to excess hints.  we need to do this check even for "live" nodes, since we can
>             // still generate hints for those if it's overloaded or simply dead but not yet known-to-be-dead.
>             // The idea is that if we have over maxHintsInProgress hints in flight, this is probably due to
>             // a small number of nodes causing problems, so we should avoid shutting down writes completely to
>             // healthy nodes.  Any node with no hintsInProgress is considered healthy.
>             if (totalHintsInProgress.get() > maxHintsInProgress
>                 && (hintsInProgress.get(destination).get() > 0 && shouldHint(destination)))
>             {
>                 throw new OverloadedException("Too many in flight hints: " + totalHintsInProgress.get());
>             }
> "
>
> If Cassandra didn't return this exception, it might OOM while
> enqueueing your hints to be stored. Instead, it gives up on enqueueing
> a hint for the failed write. The solution is to reduce your write
> rate, ideally by enough that you don't even queue hints in the first
> place.
>
> =Rob
>
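
To check my reading of the condition quoted above, here is a toy model of it:
a write is rejected only when the global count is over the bound AND the
target node already has hints in progress, so nodes with nothing pending
still accept writes. This is a sketch, not the Cassandra code:
HintAdmissionSketch and its methods are invented names, destinations are plain
strings instead of InetAddress, and shouldHint(destination) is assumed to be
true.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Toy model of the quoted check; all names here are illustrative.
final class HintAdmissionSketch {
    private final int maxHintsInProgress;
    private final AtomicInteger totalHintsInProgress = new AtomicInteger();
    private final Map<String, AtomicInteger> hintsInProgress = new ConcurrentHashMap<>();

    HintAdmissionSketch(int maxHintsInProgress) {
        this.maxHintsInProgress = maxHintsInProgress;
    }

    private AtomicInteger counterFor(String destination) {
        return hintsInProgress.computeIfAbsent(destination, d -> new AtomicInteger());
    }

    // Same shape as the quoted condition, with shouldHint assumed true:
    // over the global bound AND this destination already has pending hints.
    boolean wouldReject(String destination) {
        return totalHintsInProgress.get() > maxHintsInProgress
            && counterFor(destination).get() > 0;
    }

    // A hint for this destination was handed to the (imaginary) hint executor.
    void hintQueued(String destination) {
        totalHintsInProgress.incrementAndGet();
        counterFor(destination).incrementAndGet();
    }

    // The hint was written to disk and is no longer in progress.
    void hintWritten(String destination) {
        totalHintsInProgress.decrementAndGet();
        counterFor(destination).decrementAndGet();
    }

    public static void main(String[] args) {
        HintAdmissionSketch sketch = new HintAdmissionSketch(2); // tiny bound for the demo
        for (int i = 0; i < 3; i++) {
            sketch.hintQueued("10.0.0.1"); // one struggling node accumulates hints
        }
        System.out.println(sketch.wouldReject("10.0.0.1")); // prints "true"
        System.out.println(sketch.wouldReject("10.0.0.2")); // prints "false"
    }
}
```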
