Re: Flume log rolling when you need to do rollups for multiple time zones

2014-07-29 Thread Hari Shreedharan

The issue seems to be that it is already the next day when the event arrives at the agent. You can either move the timestamp interceptor to the first Flume agent in the pipeline, which reduces the time window in which this can occur, or insert a timestamp header when you create the event (the heade
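
As a rough illustration of the first option, a timestamp interceptor could be attached to the source of the first agent along these lines. This is only a sketch: the agent and component names (agent1, src1, ts) are hypothetical and not taken from the thread. Setting preserveExisting to true is assumed to be what you want if the client already sets its own timestamp header.

```properties
# Hypothetical names: agent1 (agent), src1 (source), ts (interceptor).
# Stamp each event as early as possible, on the first agent in the pipeline.
agent1.sources.src1.interceptors = ts
agent1.sources.src1.interceptors.ts.type = timestamp
# Do not overwrite a "timestamp" header that was already set when the
# event was created (milliseconds since the epoch).
agent1.sources.src1.interceptors.ts.preserveExisting = true
```

With this in place, downstream agents should not add a second timestamp interceptor that overwrites the header, otherwise the bucket is again decided by arrival time.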

Re: Flume log rolling when you need to do rollups for multiple time zones

2014-07-29 Thread Gary Malouf

Hi Hari, Below is the config for one of our source-channel-sink combos. In the Hadoop/Spark world, how do you then handle the events that arrive late to the bucket? That is, events for July 15 UTC end up in the July 16 bucket. The ugly way I have been handling this to date is that for any query for

Re: Flume log rolling when you need to do rollups for multiple time zones

2014-07-29 Thread Hari Shreedharan

Can you send your config? There are a couple of params that allow the files to be rolled faster - idleTimeout and rollInterval. I am assuming you are using rollInterval already. idleTimeout will close a file when it is not written to for the configured time. That might help with the rolling. Rememb
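
For reference, the two parameters mentioned above might be set on an HDFS sink roughly as follows. This is a sketch under assumptions, not the poster's actual config: the names (agent1, hdfs1, ch1), the path, and all numeric values are made up for illustration. The date escapes in hdfs.path are resolved from the event's timestamp header.

```properties
# Hypothetical names: agent1 (agent), hdfs1 (sink), ch1 (channel).
agent1.sinks.hdfs1.type = hdfs
agent1.sinks.hdfs1.channel = ch1
# Daily bucket; the date is taken from each event's timestamp header.
agent1.sinks.hdfs1.hdfs.path = /flume/events/%Y-%m-%d
# Resolve the bucket date in UTC rather than the agent's local time zone.
agent1.sinks.hdfs1.hdfs.timeZone = UTC
# Roll the current file every 300 seconds (illustrative value).
agent1.sinks.hdfs1.hdfs.rollInterval = 300
# Disable size-based and count-based rolling so only time triggers a roll.
agent1.sinks.hdfs1.hdfs.rollSize = 0
agent1.sinks.hdfs1.hdfs.rollCount = 0
# Close a file that has received no writes for 60 seconds (illustrative value).
agent1.sinks.hdfs1.hdfs.idleTimeout = 60
```

Both rollInterval and idleTimeout are in seconds, and a value of 0 disables the corresponding behavior.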