The issue seems to be that it is already the next day when the event
arrives at the agent. You can either move the timestamp interceptor to the
first Flume agent in the pipeline, which narrows the time window in which
this can occur, or insert a timestamp header when you create the event (the
heade
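For reference, a minimal sketch of the interceptor option, assuming an agent named agent1 with a source named src1 (both names are placeholders, not taken from your setup):

```properties
# Hypothetical names: agent1, src1. Adjust to your first-tier agent.
agent1.sources.src1.interceptors = ts
agent1.sources.src1.interceptors.ts.type = timestamp
# Keep a timestamp header that is already set (for example by the
# application that created the event) instead of overwriting it.
agent1.sources.src1.interceptors.ts.preserveExisting = true
```

With preserveExisting set to true, a downstream agent that also runs the interceptor should leave the original timestamp alone, so the sink buckets by the time stamped at the first hop.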
Hi Hari,
Below is the config for one of our source-channel-sink combos. In the
Hadoop/Spark world, how do you then handle events that arrive late to
the bucket? That is, events for July 15 UTC end up in the July 16 bucket.
The ugly way I have been handling this to date is that for any query for
Can you send your config? There are a couple of params that allow the files
to be rolled faster: idleTimeout and rollInterval. I am assuming you are
using rollInterval already. idleTimeout closes a file once it has not been
written to for the configured time. That might help with the rolling.
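A rough sketch of how those two settings might look on an HDFS sink, assuming an agent agent1 and a sink hdfsSink (placeholder names), a made-up path, and illustrative values you would tune for your volume:

```properties
# Hypothetical names and path; values are examples only.
agent1.sinks.hdfsSink.type = hdfs
agent1.sinks.hdfsSink.hdfs.path = /flume/events/%Y-%m-%d
# Roll the current file every 10 minutes (value is in seconds).
agent1.sinks.hdfsSink.hdfs.rollInterval = 600
# Close a file that has received no writes for 60 seconds.
agent1.sinks.hdfsSink.hdfs.idleTimeout = 60
# Disable size- and count-based rolling so only the timers apply.
agent1.sinks.hdfsSink.hdfs.rollSize = 0
agent1.sinks.hdfsSink.hdfs.rollCount = 0
```

The date escapes in hdfs.path are resolved from the event's timestamp header, which is why where that header gets set matters for bucketing.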
Rememb