Re: File Channel Backup Checkpoints are I/O Intensive

2014-07-02 Thread Hari Shreedharan
Thanks a lot. I will take a look at this tomorrow or early next week. On Wed, Jul 2, 2014 at 5:29 PM, Abraham Fine wrote: > Hari- > > I added the new tests and created a new revision to my patch. > > > https://issues.apache.org/jira/secure/attachment/12653728/compress_backup_checkpoint_new_test

Re: File Channel Backup Checkpoints are I/O Intensive

2014-07-02 Thread Abraham Fine
Hari- I added the new tests and created a new revision to my patch. https://issues.apache.org/jira/secure/attachment/12653728/compress_backup_checkpoint_new_tests.patch Thanks, Abe -- Abraham Fine | Software Engineer (516) 567-2535 BrightRoll, Inc. | Smart Video Advertising | www.brightroll.co

Re: File Channel Backup Checkpoints are I/O Intensive

2014-07-02 Thread Hari Shreedharan
Hi Abraham, In general, the patch looks good. Can you add a couple of tests:
* Original checkpoint is uncompressed, config changes to compress checkpoint - does the file channel restart from original checkpoint? Are new checkpoints compressed?
* Compressed checkpoint, config changes to not compre

Re: File Channel Backup Checkpoints are I/O Intensive

2014-07-02 Thread Abraham Fine
Hi Brock and Hari- I was just wondering if either of you had a chance to take a look at the patch and if there is anything I can do to improve it. Thanks, Abe -- Abraham Fine | Software Engineer (516) 567-2535 BrightRoll, Inc. | Smart Video Advertising | www.brightroll.com On Wed, Jun 11, 201

Re: File Channel Backup Checkpoints are I/O Intensive

2014-06-11 Thread Brock Noland
This is a great suggestion Abraham! On Wed, Jun 11, 2014 at 5:39 PM, Hari Shreedharan wrote: > Thanks. I will review it :) > > > Thanks, > Hari > > On Wednesday, June 11, 2014 at 5:00 PM, Abraham Fine wrote: > > I went ahead and created a JIRA and patch: > https://issues.apache.org/jira/browse

Re: File Channel Backup Checkpoints are I/O Intensive

2014-06-11 Thread Hari Shreedharan
Thanks. I will review it :) Thanks, Hari On Wednesday, June 11, 2014 at 5:00 PM, Abraham Fine wrote: > I went ahead and created a JIRA and patch: > https://issues.apache.org/jira/browse/FLUME-2401 > > The option is configurable with: > agentX.channels.ch1.compressBackupCheckpoint = true >

Re: File Channel Backup Checkpoints are I/O Intensive

2014-06-11 Thread Abraham Fine
I went ahead and created a JIRA and patch: https://issues.apache.org/jira/browse/FLUME-2401
The option is configurable with:
agentX.channels.ch1.compressBackupCheckpoint = true
As per your recommendation, I used snappy-java. I also considered the snappy and lz4 implementations in Hadoop IO but no
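
For readers following along, here is a rough sketch of how the proposed option might sit in a full File Channel configuration with the checkpoint and backup checkpoint on separate disks. The agent and channel names (agentX, ch1) come from the message above; the directory paths and capacity value are placeholders, and compressBackupCheckpoint only has an effect in a build that includes the FLUME-2401 patch.

```properties
# Illustrative only: paths and capacity are made-up placeholders.
agentX.channels = ch1
agentX.channels.ch1.type = file
agentX.channels.ch1.capacity = 100000000
agentX.channels.ch1.checkpointDir = /disk1/flume/checkpoint
agentX.channels.ch1.dataDirs = /disk1/flume/data

# Keep a second copy of the checkpoint on a different disk.
agentX.channels.ch1.useDualCheckpoints = true
agentX.channels.ch1.backupCheckpointDir = /disk2/flume/backup-checkpoint

# Option proposed in FLUME-2401 (requires the patch); compresses the backup copy.
agentX.channels.ch1.compressBackupCheckpoint = true
```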

Re: File Channel Backup Checkpoints are I/O Intensive

2014-06-09 Thread Hari Shreedharan
Hi Abraham, Compressing the backup checkpoint is very possible. The backup is rarely read: it is used only if the original one is corrupt on restart. So I think compressing it using something like Snappy would make sense (GZIP might hit performance). Can you try using snappy-java and

File Channel Backup Checkpoints are I/O Intensive

2014-06-09 Thread Abraham Fine
Hello- We are using Flume 1.4 with File Channel configured to use a very large capacity. We keep the checkpoint and backup checkpoint on separate disks. Normally the file channel is mostly empty (<<1% of capacity). For the checkpoint the disk I/O seems to be very reasonable due to the usage of a