RE: Adding SSL peer cert info to AvroSource

2014-01-30 Thread Pritchard, Charles X. -ND
I need to put the CN from the cert into a variable; it's essentially an authenticated string the server knows to be valid (since it has been signed). I'd like to route messages to a directory based on that string, or otherwise send them to a fallback directory when the cert fails validation.
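
A minimal sketch of the extraction step, assuming the modified AvroSource exposes the JSSE SSLSession for the connection; the class and method names (PeerCnExtractor, peerCn) are hypothetical, not existing Flume code:

    import java.security.cert.X509Certificate;
    import javax.naming.ldap.LdapName;
    import javax.naming.ldap.Rdn;
    import javax.net.ssl.SSLSession;

    public class PeerCnExtractor {
        // Returns the CN from the peer's certificate, or null if the peer
        // is unverified, so callers can route to a fallback directory.
        public static String peerCn(SSLSession session) {
            try {
                X509Certificate cert =
                        (X509Certificate) session.getPeerCertificates()[0];
                LdapName dn = new LdapName(cert.getSubjectX500Principal().getName());
                for (Rdn rdn : dn.getRdns()) {
                    if ("CN".equalsIgnoreCase(rdn.getType())) {
                        return rdn.getValue().toString();
                    }
                }
            } catch (Exception e) {
                // SSLPeerUnverifiedException or a malformed DN: no trusted CN.
            }
            return null;
        }
    }

Setting the result as an event header would let a multiplexing channel selector, or header escaping in the sink path, do the per-CN routing, with the null case mapped to the fallback directory.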

Writing custom source

2014-01-30 Thread Chhaya Vishwakarma
Hi, I want to write my own custom source to handle application-specific logs (e.g. stack traces). What are the prerequisites for writing a custom source, and where should I start? I am doing this for the first time; any references for doing this? Regards, Chhaya Vishwakarma

RE: checkpoint lifecycle

2014-01-30 Thread Umesh Telang
Hi Hari, The capacity of the channel is 150,000,000. The other properties of the file channel are as below:

a1.channels.s3-file-channel.type = file
a1.channels.s3-file-channel.checkpointDir = /mnt/flume-file-channels/s3-file-channel/checkpoint
a1.channels.s3-file-channel.dataDirs = /mnt/flume-f

RE: checkpoint lifecycle

2014-01-30 Thread Brock Noland
How large is your heap? You will likely want two data directories per disk. Also, with a channel that large I strongly recommend using backup checkpoints. Additionally, https://issues.apache.org/jira/browse/FLUME-2155 will be very useful to you. On Jan 30, 2014 4:21 AM, "Umesh Telang" wrote:
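
For reference, backup checkpoints plus multiple data directories might look like the sketch below; the paths are placeholders, and useDualCheckpoints/backupCheckpointDir are the FileChannel properties documented in the Flume User Guide:

    a1.channels.s3-file-channel.type = file
    a1.channels.s3-file-channel.capacity = 150000000
    a1.channels.s3-file-channel.checkpointDir = /disk1/flume/checkpoint
    # Keep a second checkpoint copy so a crash mid-checkpoint does not
    # force a full replay of the data logs.
    a1.channels.s3-file-channel.useDualCheckpoints = true
    a1.channels.s3-file-channel.backupCheckpointDir = /disk2/flume/checkpoint-backup
    # Two data directories per physical disk, as suggested above.
    a1.channels.s3-file-channel.dataDirs = /disk1/flume/data1,/disk1/flume/data2,/disk2/flume/data1,/disk2/flume/data2

The backup checkpoint directory should sit on a different disk from the primary checkpoint, so one disk failure cannot take out both.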

Re: Writing custom source

2014-01-30 Thread Brock Noland
I am guessing you want to write a Spooling Directory Source deserializer: http://flume.apache.org/FlumeUserGuide.html#event-deserializers

On Thu, Jan 30, 2014 at 3:58 AM, Chhaya Vishwakarma <chhaya.vishwaka...@lntinfotech.com> wrote:
> Hi,
> I want to write my own custom source to handl
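
As a rough illustration of the interface involved, a deserializer skeleton could look like the following. This is only a sketch against the EventDeserializer API; the stack-trace grouping itself is left as a marked gap rather than guessed at:

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;
    import org.apache.flume.Context;
    import org.apache.flume.Event;
    import org.apache.flume.event.EventBuilder;
    import org.apache.flume.serialization.EventDeserializer;
    import org.apache.flume.serialization.ResettableInputStream;

    public class StackTraceDeserializer implements EventDeserializer {

        private final ResettableInputStream in;

        StackTraceDeserializer(Context context, ResettableInputStream in) {
            this.in = in;
        }

        @Override
        public Event readEvent() throws IOException {
            String line = readLine();
            // Grouping continuation lines (e.g. "\tat ..." stack-trace
            // frames) into the same event would be added here.
            return line == null ? null
                    : EventBuilder.withBody(line.getBytes("UTF-8"));
        }

        @Override
        public List<Event> readEvents(int numEvents) throws IOException {
            List<Event> events = new ArrayList<Event>();
            for (int i = 0; i < numEvents; i++) {
                Event event = readEvent();
                if (event == null) break;
                events.add(event);
            }
            return events;
        }

        @Override public void mark() throws IOException { in.mark(); }
        @Override public void reset() throws IOException { in.reset(); }
        @Override public void close() throws IOException { in.close(); }

        // Reads one line from the resettable stream, or null at EOF.
        private String readLine() throws IOException {
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = in.readChar()) != -1) {
                if (c == '\n') return sb.toString();
                sb.append((char) c);
            }
            return sb.length() == 0 ? null : sb.toString();
        }

        public static class Builder implements EventDeserializer.Builder {
            @Override
            public EventDeserializer build(Context context, ResettableInputStream in) {
                return new StackTraceDeserializer(context, in);
            }
        }
    }

The Spooling Directory Source's deserializer property would then point at the Builder's fully qualified class name.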

RE: checkpoint lifecycle

2014-01-30 Thread Umesh Telang
Hi Brock, Our heap size is 2GB. Thanks for the advice on data directories. Could you please let me know the heuristic for that? (e.g. 1 data directory per N-sized channel, where N is...) Thanks also for suggesting backup checkpoints - are these something that increases the integrity of Flume

Re: checkpoint lifecycle

2014-01-30 Thread Brock Noland
On Thu, Jan 30, 2014 at 8:16 AM, Umesh Telang wrote:
> Hi Brock,
> Our heap size is 2GB.
That is not enough heap for 150M events. It's 150 million * 32 bytes = 4.5GB + say 100-500MB for the rest of Flume.
> Thanks for the advice on data directories. Could you please let me know
> the h
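
To make the arithmetic explicit: 150,000,000 events x 32 bytes per in-heap pointer = 4.8 billion bytes, roughly 4.5 GiB, before adding the 100-500MB the rest of the agent needs. An illustrative flume-env.sh setting sized accordingly (the exact figure is an assumption):

    # FileChannel keeps a ~32-byte pointer per event on the heap, so a
    # 150M-event capacity needs ~4.5 GiB plus normal agent overhead.
    export JAVA_OPTS="-Xms5g -Xmx5g"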

RE: checkpoint lifecycle

2014-01-30 Thread Umesh Telang
Ah, ok. So 32 bytes is required for each pointer to an event. We'll amend our heap size accordingly. We may also be able to reduce our FileChannel size. We hadn't understood the implications of the capacity value of the FileChannel we have been using. Regarding the multiple data directories, I

Re: checkpoint lifecycle

2014-01-30 Thread Brock Noland
On Thu, Jan 30, 2014 at 9:29 AM, Umesh Telang wrote:
> Ah, ok. So 32 bytes is required for each pointer to an event.
Yep :)
> We'll amend our heap size accordingly. We may also be able to reduce our
> FileChannel size. We hadn't understood the implications of the capacity
> value of the File

RE: checkpoint lifecycle

2014-01-30 Thread Umesh Telang
Thanks very much, Brock, for all your help.

From: Brock Noland [br...@cloudera.com]
Sent: 30 January 2014 16:28
To: user@flume.apache.org
Subject: Re: checkpoint lifecycle

On Thu, Jan 30, 2014 at 9:29 AM, Umesh Telang <umesh.tel...@bbc.co.uk> wrote: Ah, o

hdfs.fileType = CompressedStream

2014-01-30 Thread Jimmy
I am running a few tests and would like to confirm whether this is true...

hdfs.codeC = gzip
hdfs.fileType = CompressedStream
hdfs.writeFormat = Text
hdfs.batchSize = 100

Now let's assume I have a large number of transactions and I roll the file every 10 minutes; it seems the tmp file stays at 0 bytes and flushes

Re: hdfs.fileType = CompressedStream

2014-01-30 Thread Jeff Lord
You are using gzip, so the files won't be splittable. You may be better off using snappy and sequence files.

On Thu, Jan 30, 2014 at 10:51 AM, Jimmy wrote:
> I am running a few tests and would like to confirm whether this is true...
> hdfs.codeC = gzip
> hdfs.fileType = CompressedStream
> hdfs.writ
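
An illustrative sink configuration along those lines; the agent and sink names (a1, s1) are placeholders, and Snappy must be installed on the cluster for this to work:

    a1.sinks.s1.hdfs.fileType = SequenceFile
    # SequenceFile block compression keeps the output splittable even
    # though a raw Snappy stream is not.
    a1.sinks.s1.hdfs.codeC = snappy
    a1.sinks.s1.hdfs.writeFormat = Writable
    a1.sinks.s1.hdfs.batchSize = 100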

Re: hdfs.fileType = CompressedStream

2014-01-30 Thread Jimmy
Snappy is not splittable either; combining it with sequence files gives an identical result - it bulk dumps the whole file into HDFS. I feel a bit uneasy keeping a 120MB (almost 1GB uncompressed) file open for one hour.

On Thu, Jan 30, 2014 at 1:59 PM, Jeff Lord wrote:
> You are using gzip so the fil
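
One workaround for the long-open file, assuming size-based rolling is acceptable for the downstream jobs: roll on size as well as time, so the compressed .tmp file is closed well before the hour is up (names and values are illustrative):

    a1.sinks.s1.hdfs.rollInterval = 3600
    # Also close the file once it reaches ~128 MB, whichever comes first;
    # 0 disables count-based rolling.
    a1.sinks.s1.hdfs.rollSize = 134217728
    a1.sinks.s1.hdfs.rollCount = 0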

Re: Adding SSL peer cert info to AvroSource

2014-01-30 Thread Mike Percy
I am not an expert in the JSSE API, so without specifics regarding the APIs you are trying to use, I don't think I can be of much help. From browsing around a little bit, it looks like we can simply have the server specify the CA certs that it respects, and the client will attempt to use one of the certs
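
For what it's worth, the standard JSSE setup for that looks roughly like the sketch below; this is generic JSSE code rather than anything from AvroSource, and the keystore paths and passwords are placeholders:

    import java.io.FileInputStream;
    import java.security.KeyStore;
    import javax.net.ssl.KeyManagerFactory;
    import javax.net.ssl.SSLContext;
    import javax.net.ssl.SSLEngine;
    import javax.net.ssl.TrustManagerFactory;

    public class MutualTlsContext {
        // Builds a server-side SSLEngine that requires client certs
        // signed by a CA in the given truststore.
        public static SSLEngine serverEngine() throws Exception {
            KeyStore ks = KeyStore.getInstance("JKS");
            ks.load(new FileInputStream("server-keystore.jks"),
                    "changeit".toCharArray());
            KeyManagerFactory kmf = KeyManagerFactory.getInstance(
                    KeyManagerFactory.getDefaultAlgorithm());
            kmf.init(ks, "changeit".toCharArray());

            KeyStore ts = KeyStore.getInstance("JKS");
            ts.load(new FileInputStream("ca-truststore.jks"),
                    "changeit".toCharArray());
            TrustManagerFactory tmf = TrustManagerFactory.getInstance(
                    TrustManagerFactory.getDefaultAlgorithm());
            tmf.init(ts);

            SSLContext ctx = SSLContext.getInstance("TLS");
            ctx.init(kmf.getKeyManagers(), tmf.getTrustManagers(), null);
            SSLEngine engine = ctx.createSSLEngine();
            engine.setUseClientMode(false);
            // Reject clients that cannot present a cert from a trusted CA.
            engine.setNeedClientAuth(true);
            return engine;
        }
    }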

Re: Transferring another server using flume

2014-01-30 Thread ed
Hi Burak, Unfortunately I don't have any experience with Scribe, so I can't provide any advice there. I briefly checked out the GitHub site for it, and it did not look like there is much (if any) activity on that project at this point. I think all of the Flume sources use a push model (rather than P

RE: Writing custom source

2014-01-30 Thread Chhaya Vishwakarma
Hi, Where can I get the code for the deserializer? I am not finding it on GitHub. And I want to write a custom source.

From: Brock Noland [mailto:br...@cloudera.com]
Sent: Thursday, January 30, 2014 7:45 PM
To: user@flume.apache.org
Subject: Re: Writing custom source

I am guessing you want to write a Spooling D
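
For a genuinely custom source, the usual starting point is AbstractSource. Below is a minimal pollable skeleton; the class name, the logPath property, and the body of process() are placeholders:

    import org.apache.flume.Context;
    import org.apache.flume.Event;
    import org.apache.flume.EventDeliveryException;
    import org.apache.flume.PollableSource;
    import org.apache.flume.conf.Configurable;
    import org.apache.flume.event.EventBuilder;
    import org.apache.flume.source.AbstractSource;

    public class MyLogSource extends AbstractSource
            implements Configurable, PollableSource {

        private String logPath;

        @Override
        public void configure(Context context) {
            // Reads properties from the agent config, e.g.
            // a1.sources.r1.logPath = /var/log/app
            logPath = context.getString("logPath", "/var/log/app");
        }

        @Override
        public Status process() throws EventDeliveryException {
            try {
                // Real logic would read from logPath; a fixed body stands in here.
                Event event = EventBuilder.withBody("example".getBytes("UTF-8"));
                getChannelProcessor().processEvent(event);
                return Status.READY;
            } catch (Exception e) {
                return Status.BACKOFF;
            }
        }
    }

It would be wired in by setting a1.sources.r1.type to the fully qualified class name, with the jar on Flume's classpath.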

Re: Writing custom source

2014-01-30 Thread Ashish
https://git-wip-us.apache.org/repos/asf?p=flume.git;a=tree;f=flume-ng-core/src/main/java/org/apache/flume/serialization;h=9ad9357a923f219255a9657399d8c8b5bf97ddcd;hb=HEAD

On Fri, Jan 31, 2014 at 11:48 AM, Chhaya Vishwakarma <chhaya.vishwaka...@lntinfotech.com> wrote:
> Hi
> Where can I get the

java.lang.OutOfMemoryError: Direct buffer memory on HDSF sink

2014-01-30 Thread Chen Wang
Hi Guys, My topology is like this: I have set up 2 flume nodes, from avro to hdfs:

StormAgent.sources = avro
StormAgent.channels = MemChannel
StormAgent.sinks = HDFS
StormAgent.sources.avro.type = avro
StormAgent.sources.avro.channels = MemChannel
StormAgent.sources.avro.bind = ip
StormAgent.sour
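
Direct buffer memory lives outside the heap (here it is most likely allocated by the Netty server behind the Avro source), so -Xmx alone does not govern it. One possible mitigation, stated as an assumption rather than a confirmed fix, is to set the cap explicitly in flume-env.sh and/or lower the source and sink batch sizes; the values below are illustrative:

    # Set an explicit ceiling for NIO direct buffers; by default the
    # limit is roughly the -Xmx value.
    export JAVA_OPTS="-Xmx2g -XX:MaxDirectMemorySize=1g"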