Thanks a lot. I will take a look at this tomorrow or early next week.

On Wed, Jul 2, 2014 at 5:29 PM, Abraham Fine <a...@brightroll.com> wrote:

> Hari-
>
> I added the new tests and created a new revision to my patch.
>
> https://issues.apache.org/jira/secure/attachment/12653728/compress_backup_checkpoint_new_tests.patch
>
> Thanks,
> Abe
>
> --
> Abraham Fine | Software Engineer
> (516) 567-2535
> BrightRoll, Inc. | Smart Video Advertising | www.brightroll.com
>
> On Wed, Jul 2, 2014 at 4:32 PM, Hari Shreedharan <hshreedha...@cloudera.com> wrote:
>
>> Hi Abraham,
>>
>> In general, the patch looks good. Can you add a couple of tests -
>> * Original checkpoint is uncompressed, config changes to compress
>>   checkpoint - does the file channel restart from original checkpoint? are
>>   new checkpoints compressed?
>> * Compressed checkpoint, config changes to not compress checkpoint - does
>>   channel start up? are new checkpoints uncompressed?
>>
>> Hari
>>
>> On Wed, Jul 2, 2014 at 3:06 PM, Abraham Fine <a...@brightroll.com> wrote:
>>
>>> Hi Brock and Hari-
>>>
>>> I was just wondering if either of you had a chance to take a look at the
>>> patch and if there is anything I can do to improve it.
>>>
>>> Thanks,
>>> Abe
>>>
>>> --
>>> Abraham Fine | Software Engineer
>>> (516) 567-2535
>>> BrightRoll, Inc. | Smart Video Advertising | www.brightroll.com
>>>
>>> On Wed, Jun 11, 2014 at 6:48 PM, Brock Noland <br...@cloudera.com> wrote:
>>>
>>>> This is a great suggestion Abraham!
>>>>
>>>> On Wed, Jun 11, 2014 at 5:39 PM, Hari Shreedharan <hshreedha...@cloudera.com> wrote:
>>>>
>>>>> Thanks. I will review it :)
>>>>>
>>>>> Thanks,
>>>>> Hari
>>>>>
>>>>> On Wednesday, June 11, 2014 at 5:00 PM, Abraham Fine wrote:
>>>>>
>>>>> I went ahead and created a JIRA and patch:
>>>>> https://issues.apache.org/jira/browse/FLUME-2401
>>>>>
>>>>> The option is configurable with:
>>>>> agentX.channels.ch1.compressBackupCheckpoint = true
>>>>>
>>>>> As per your recommendation, I used snappy-java. I also considered the
>>>>> snappy and lz4 implementations in Hadoop IO but noticed that the
>>>>> Hadoop IO dependency was removed in
>>>>> https://issues.apache.org/jira/browse/FLUME-1285
>>>>>
>>>>> Thanks,
>>>>> Abe
>>>>> --
>>>>> Abraham Fine | Software Engineer
>>>>> (516) 567-2535
>>>>> BrightRoll, Inc. | Smart Video Advertising | www.brightroll.com
>>>>>
>>>>> On Mon, Jun 9, 2014 at 4:01 PM, Hari Shreedharan
>>>>> <hshreedha...@cloudera.com> wrote:
>>>>>
>>>>> Hi Abraham,
>>>>>
>>>>> Compressing the backup checkpoint is very possible. Since the backup is
>>>>> rarely read (only if the original one is corrupt on restarts), is it used.
>>>>> So I think compressing it using something like Snappy would make sense
>>>>> (GZIP might hit performance). Can you try using snappy-java and see if
>>>>> that gives good perf and reasonable compression?
>>>>>
>>>>> Patches are always welcome. I’d be glad to review and commit it. I would
>>>>> suggest making the compression optional via configuration so that anyone
>>>>> with smaller channels don’t end up using CPU for not much gain.
>>>>>
>>>>> Thanks,
>>>>> Hari
>>>>>
>>>>> On Monday, June 9, 2014 at 3:56 PM, Abraham Fine wrote:
>>>>>
>>>>> Hello-
>>>>>
>>>>> We are using Flume 1.4 with File Channel configured to use a very
>>>>> large capacity. We keep the checkpoint and backup checkpoint on
>>>>> separate disks.
>>>>>
>>>>> Normally the file channel is mostly empty (<<1% of capacity). For the
>>>>> checkpoint the disk I/O seems to be very reasonable due to the usage
>>>>> of a MappedByteBuffer.
>>>>>
>>>>> On the other hand, the backup checkpoint seems to be written to disk
>>>>> in its entirety over and over again, resulting in very high disk
>>>>> utilization.
>>>>>
>>>>> I noticed that, because the checkpoint file is mostly empty, it is
>>>>> very compressible. I was able to GZIP our checkpoint from 381M to
>>>>> 386K. I was wondering if it would be possible to always compress the
>>>>> backup checkpoint before writing it to disk.
>>>>>
>>>>> I would be happy to work on a patch to implement this functionality if
>>>>> there is interest.
>>>>>
>>>>> Thanks in Advance,
>>>>>
>>>>> --
>>>>> Abraham Fine | Software Engineer
>>>>> (516) 567-2535
>>>>> BrightRoll, Inc. | Smart Video Advertising | www.brightroll.com
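
The compressibility observation from the thread (a mostly empty checkpoint file shrinking dramatically under GZIP) can be illustrated with a minimal sketch. Everything named here is hypothetical and for illustration only: the class CheckpointCompressionSketch, its helper methods, and the zero-filled byte array standing in for a checkpoint file are assumptions, and none of this is code from the FLUME-2401 patch. It also uses the JDK's built-in GZIP streams rather than snappy-java, which the patch is said to use, purely so the example runs without a third-party dependency.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration; not the Flume File Channel implementation.
class CheckpointCompressionSketch {

    // Stand-in for a mostly empty checkpoint: all zeros except a few "used" slots.
    static byte[] mostlyEmptyCheckpoint(int sizeBytes, int usedSlots) {
        byte[] data = new byte[sizeBytes];
        for (int i = 0; i < usedSlots; i++) {
            data[i * 8] = (byte) (i + 1);
        }
        return data;
    }

    // Compress a buffer with the JDK's GZIP stream (the thread's patch uses Snappy instead).
    static byte[] gzip(byte[] input) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(input);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The GZIP stream is closed at this point, so the trailer has been written.
        return out.toByteArray();
    }

    // Decompress, as a restart from a compressed backup would need to do.
    static byte[] gunzip(byte[] compressed) {
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return gz.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] checkpoint = mostlyEmptyCheckpoint(8 * 1024 * 1024, 100);
        byte[] compressed = gzip(checkpoint);
        System.out.printf("raw=%d bytes, gzip=%d bytes%n", checkpoint.length, compressed.length);
        System.out.println("round trip ok: " + Arrays.equals(checkpoint, gunzip(compressed)));
    }
}
```

The sketch only shows why a sparse buffer compresses well and that it round-trips intact. It says nothing about the CPU cost trade-off between GZIP and Snappy raised in the thread, which would need to be measured on a real checkpoint.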