The broker learns about segments directly from historicals and tasks, even
though a PR was recently merged to keep published segments in memory in
brokers (https://github.com/apache/incubator-druid/pull/6901).
It probably makes sense to filter out segments in brokers too if they are
announced by historicals but are not in the metadata store.
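
A rough sketch of that filtering idea, in plain Python rather than Druid's
actual code. Every name here (Announcement, filter_announcements, the
server_type labels, the segment ids) is hypothetical and only illustrates
the rule: drop historical-announced segments that are missing from the
metadata store, but keep task-announced segments, which may not be
published yet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Announcement:
    segment_id: str   # hypothetical identifier, not Druid's real id format
    server_type: str  # "historical" or "task" (hypothetical labels)


def filter_announcements(announcements, published_ids):
    """Return the announcements a broker would keep under the proposed rule."""
    kept = []
    for a in announcements:
        # A historical should only serve segments already in the metadata store.
        if a.server_type == "historical" and a.segment_id not in published_ids:
            continue
        # Task segments are kept even if unpublished (stream ingestion case).
        kept.append(a)
    return kept


# Assumed example data: one published prod segment, one stray staging
# segment found in a historical's cache, one not-yet-published task segment.
published = {"prod_2019-02-01_v1"}
announced = [
    Announcement("prod_2019-02-01_v1", "historical"),
    Announcement("staging_2019-02-01_v2", "historical"),
    Announcement("stream_2019-03-01_v1", "task"),
]
visible = filter_announcements(announced, published)
```

With this rule the stray staging segment never reaches the broker's view,
so it could not overshadow the published one there.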

Jihoon

On Fri, Mar 1, 2019 at 1:24 PM David Glasser <glas...@apollographql.com>
wrote:

> That makes sense. Do the coordinator's decisions about which segments are
> 'used' affect the broker's choices for routing queries, or does the broker
> just learn about things directly from historicals/ingestion tasks (via...
> zookeeper?)
>
> --dave
>
> On Fri, Mar 1, 2019 at 1:15 PM Jihoon Son <ghoon...@gmail.com> wrote:
>
> > Hi Dave,
> >
> > I think the third option sounds like the most reasonable fix for this
> > issue, though the second option sounds useful in general.
> > And yes, it wouldn't be easy to refuse to announce unknown segments in
> > historicals.
> > I think it makes more sense to check only in the coordinator because it's
> > the only node that directly accesses the metadata store (other than the
> > overlord).
> > So, the coordinator could skip updating the "used" flag if the
> > overshadowing segments are not in the metadata store.
> > In stream ingestion, segments might not be in the metadata store until
> > they are published. However, this shouldn't be a problem because segments
> > are always appended in stream ingestion.
> >
> > Jihoon
> >
> > On Fri, Mar 1, 2019 at 12:49 AM David Glasser <glas...@apollographql.com>
> > wrote:
> >
> > > (I sent this message to druid-user last week and got no response. Since
> > > it is proposing improvements to Druid, I thought maybe it would be
> > > appropriate to resend here. Hope that's OK.)
> > >
> > > We had a big outage in our Druid cluster last week.  We run our Druid
> > > servers in Kubernetes, and our historicals use machine-local SSDs for
> > > their segment caches.  We made the unfortunate choice to have our
> > > production and staging historicals share the same pool of machines, and
> > > today got bit by this for the first time.
> > >
> > > A production historical started up on a machine whose segment cache
> > > contained segments from our staging cluster.  Our prod and staging
> > > clusters use the same names for data sources.
> > >
> > > This meant that these segments overshadowed production segments which
> > > happened to have lower versions.  Worse, when
> > > DruidCoordinatorCleanupOvershadowed kicked in, all of the production
> > > segments that were overshadowed got used=false set, and quickly got
> > > dropped from historicals. This ended up being the majority of our data.
> > > We eventually figured out what was going on and did a bunch of manual
> > > steps to clean up (turning off and clearing the cache of the two
> > > historicals that had staging segments on them, manually setting
> > > used=true for all entries in druid_segments, waiting a long long time
> > > for data to re-download), but figuring out what was going on was subtle
> > > (I was very lucky I had randomly decided to read a lot of the code
> > > about how the `used` column works and how versioned timelines are
> > > calculated just a few days before!).
> > >
> > > (We were also lucky that we had turned off coordinator automatic
> > > killing literally that morning!)
> > >
> > > I feel like Druid should have been able to protect me from this to some
> > > degree. (Yes, we are going to address the root cause by making it
> > > impossible for prod and staging to reuse each other's disks.) Some
> > > thoughts on changes that could have helped:
> > >
> > > - Is the Druid standard to prepend the "cluster" name to the data
> > > source name, so that conflicts like this are never possible?  We are
> > > certainly tempted to do this now but nobody ever told us to. If that's
> > > the standard, should it be documented?
> > >
> > > - Should clusters have an optional name/namespace, and DataSegments
> > > have that namespace recorded in them, and clusters refuse to handle
> > > segments they find that are from a different namespace? This would be
> > > like the common database setup where a single server/cluster has a set
> > > of databases, each of which has a set of tables.
> > >
> > > - Should historicals refuse to announce segments that don't exist in
> > > the druid_segments table, or should coordinators/brokers/etc refuse to
> > > pay attention to segments announced *by historicals* that don't exist
> > > in the druid_segments table?  I'm going to guess this is difficult to
> > > do in the historical because the historical probably doesn't actually
> > > talk to the SQL DB at all? But maybe it could be done by the
> > > coordinator and broker?
> > >
> > > --dave
> > >
> >
>
