Flume/Avro schema manipulation

2014-08-29 Thread Ed Judge
I am looking for some good documentation that would explain how I can pull logs from a file and apply a certain schema to that log file format. I have gone through the exercise of sending a file using an Avro source/sink with the flume-ng avro-client to a flume-ng agent, but my understanding is tha

Re: Flume/Avro schema manipulation

2014-08-29 Thread Ed Judge
can apply > org.apache.flume.sink.solr.morphline.MorphlineInterceptor$Builder interceptor > and then apply to Avro morphline. > > Is that what you are looking for? > > Regards. > > > 2014-08-29 13:43 GMT+02:00 Ed Judge : > I am looking for some good documentation that would explain how I c

Avro source and sink

2014-09-02 Thread Ed Judge
Does anyone know of any good documentation that talks about the protocol/negotiation used between an Avro source and sink? Thanks, Ed

Re: Avro source and sink

2014-09-03 Thread Ed Judge
e.java > > https://github.com/apache/flume/blob/trunk/flume-ng-core/src/main/java/org/apache/flume/sink/AvroSink.java > > https://flume.apache.org/FlumeDeveloperGuide.html#transaction-interface > > -Jeff > > > > > > > On Tue, Sep 2, 2014 at 6:36 PM, Ed

Re: Avro source and sink

2014-09-04 Thread Ed Judge
omplish what I want to do? Just looking for some guidance. Thanks, Ed On Sep 4, 2014, at 4:44 AM, Ashish wrote: > Avro records shall have the schema embedded with them. Have a look at source, > that shall help a bit > > > On Wed, Sep 3, 2014 at 10:30 PM, Ed Judge wrote: > Th

Re: Avro source and sink

2014-09-08 Thread Ed Judge
s would be part of Event > body, rest would be same as Step 1-3 > > HTH ! > > > On Fri, Sep 5, 2014 at 12:58 AM, Ed Judge wrote: > Ok, I have looked over the source and it is making a little more sense. > > I think what I ultimately want to do is this: > >

Re: Avro source and sink

2014-09-09 Thread Ed Judge
hub.com/kite-sdk/kite-examples/tree/master/json > [8] > https://github.com/joey/kite-examples/tree/cdk-647-datasetsink-flume-log4j-appender > >> On Mon, Sep 8, 2014 at 12:00 PM, Ed Judge wrote: >> Thanks for the reply. My understanding of the current avro sink/source is >

HDFS sink to a remote HDFS node

2014-09-29 Thread Ed Judge
I am trying to run the flume-ng agent on one node with an HDFS sink pointing to an HDFS filesystem on another node. Is this possible? What packages/jar files are needed on the flume agent node for this to work? Secondary goal is to install only what is needed on the flume-ng node. # Describe

Re: HDFS sink to a remote HDFS node

2014-09-30 Thread Ed Judge
f-2.5 the flume-ng will fail to start > > > > > 2014-09-30 > shengyi.pan > From: Ed Judge > Sent: 2014-09-29 22:38 > Subject: HDFS sink to a remote HDFS node > To: "user@flume.apache.org" > Cc: > > I am trying to run the flume-ng agent on one nod

Re: HDFS sink to a remote HDFS node

2014-09-30 Thread Ed Judge
ommons Configuration is missing in classpath. > > Thanks, > Hari > > > On Tue, Sep 30, 2014 at 11:48 AM, Ed Judge wrote: > > Thank you. I am using hadoop 2.5 which I think uses protobuf-java-2.5.0.jar. > > I am getting the following error even after adding those 2 j

Re: HDFS sink to a remote HDFS node

2014-09-30 Thread Ed Judge
> > On Tue, Sep 30, 2014 at 11:48 AM, Ed Judge wrote: > > Thank you. I am using hadoop 2.5 which I think uses protobuf-java-2.5.0.jar. > > I am getting the following error even after adding those 2 jar files to my > flume-ng classpath: > > 30 Sep 201

Re: HDFS sink to a remote HDFS node

2014-09-30 Thread Ed Judge
u'd need to add the jars that hadoop itself depends on. Flume pulls it in > if Hadoop is installed on that machine, else you'd need to manually download > it and install it. If you are using Hadoop 2.x, install the RPM provided by > Bigtop. > > On Tue, Sep 30, 2014 at 12

Re: HDFS sink to a remote HDFS node

2014-10-01 Thread Ed Judge
es are failing as blocks are allocated to that > one. > > Thanks, > Hari > > > On Tue, Sep 30, 2014 at 7:33 PM, Ed Judge wrote: > > I’ve pulled over all of the Hadoop jar files for my flume instance to use. I > am seeing some slightly different errors

Re: HDFS sink to a remote HDFS node

2014-10-02 Thread Ed Judge
Hari > > > On Wed, Oct 1, 2014 at 6:04 AM, Ed Judge wrote: > > Looks like they are up. I see the following on one of the nodes but both > look generally the same (1 live datanode). > > [hadoop@localhost bin]$ hdfs dfsadmin -report > 14/10/01 12:51:56 WARN util.NativeCo

detecting when exec cat source is complete

2014-10-15 Thread Ed Judge
Is there a reliable mechanism to know when a flume source is complete (i.e. when it is done catting a file or when the flume source script exits) AND all of the events have been written to the sink? Thanks, -Ed

Flume stats

2014-10-23 Thread Ed Judge
Does anyone know if there are byte level statistics available with Flume? $ curl --fail --silent --show-error http://localhost:14001/metrics {"SOURCE.r1":{"OpenConnectionCount":"0","Type":"SOURCE","AppendBatchAcceptedCount":"0","AppendBatchReceivedCount":"0","EventAcceptedCount":"4480","StopTime":
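The curl output above suggests the metrics endpoint returns one flat JSON object per component, with counters as quoted numbers. A minimal sketch of reading a single counter from that JSON in Java follows. The class name `FlumeMetricReader`, the method names, and the abbreviated sample string are hypothetical, and the regex-based parsing assumes the flat shape seen in the snippet (no nested objects); a real JSON parser would be more robust.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Flume.
class FlumeMetricReader {

    /** Returns the named counter of one component, or empty if it is absent. */
    static OptionalLong readCounter(String json, String component, String counter) {
        // Assumption: each component maps to a flat object without nested braces.
        Pattern block = Pattern.compile(
                Pattern.quote("\"" + component + "\"") + "\\s*:\\s*\\{([^}]*)\\}");
        Matcher b = block.matcher(json);
        if (!b.find()) {
            return OptionalLong.empty();
        }
        // Assumption: counters are serialized as quoted digit strings, e.g. "4480".
        Pattern field = Pattern.compile(
                Pattern.quote("\"" + counter + "\"") + "\\s*:\\s*\"(\\d+)\"");
        Matcher f = field.matcher(b.group(1));
        return f.find() ? OptionalLong.of(Long.parseLong(f.group(1))) : OptionalLong.empty();
    }

    /** Fetches the raw metrics document from the given URL (requires Java 11+). */
    static String fetch(String url) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        // With a URL argument, query a live agent; otherwise use a sample
        // abbreviated from the curl output quoted in the message.
        String json = args.length > 0
                ? fetch(args[0])
                : "{\"SOURCE.r1\":{\"OpenConnectionCount\":\"0\",\"Type\":\"SOURCE\","
                  + "\"EventAcceptedCount\":\"4480\"}}";
        System.out.println(readCounter(json, "SOURCE.r1", "EventAcceptedCount"));
    }
}
```

Passing `http://localhost:14001/metrics` as the only argument would query the same endpoint as the curl command. The counters visible in the snippet are event counts; whether byte-level counters are exposed is the open question of this thread and is not assumed here.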

HDFS IO error

2014-10-30 Thread Ed Judge
I am running into the following problem. 30 Oct 2014 18:43:26,375 WARN [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.HDFSEventSink.process:463) - HDFS IO error java.io.IOException: Callable timed out after 1 ms on file: hdfs://localhost:9000/tmp/dm/dm-1-19.141

Re: HDFS IO error

2014-10-30 Thread Ed Judge
. There is a fix for this incorporated in > flume 1.5 (I haven't tested it yet), but if you are on anything older, the only way > to make this work is to restart the flume process > > On Oct 30, 2014 11:54 AM, "Ed Judge" wrote: > I am running into the following problem.

Re: HDFS IO error

2014-10-30 Thread Ed Judge
014, at 5:58 PM, Asim Zafir wrote: > > Ed, > > Are you saying you resolved the problem with 1.5.0 or you still have an issue? > > Thanks, > > Asim Zafir. > >> On Thu, Oct 30, 2014 at 1:47 PM, Ed Judge wrote: >> Thanks for the replies. We are us

Re: HDFS IO error

2014-11-03 Thread Ed Judge
taking more than 10 seconds? Thanks, Ed On Oct 30, 2014, at 9:14 PM, Ed Judge wrote: > I have been using 1.5 all along. I end up with a 0 length file which is a > little concerning. Not to mention that the timeout is adding 10 seconds to > the overall transfer. Is this normal or

starting flume from another Java process

2014-11-10 Thread Ed Judge
Has anyone had experience with starting/stopping flume from another Java process via ProcessBuilder? Seems like I am able to start it the first time (have it do a transfer and log messages) then destroy it. However, the next time I start it, it doesn’t do any transfer or writing to its log file.

Re: starting flume from another Java process

2014-11-11 Thread Ed Judge
FYI, this issue seems to be related to not handling Flume IO. My Java process was inheriting IO and not reading it causing flume to suspend. -Ed On Nov 10, 2014, at 7:27 PM, Ed Judge wrote: > Has anyone had experience with starting/stopping flume from another Java > proce
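A child process whose output pipe is never read can block once the pipe buffer fills, which matches the suspension described above. The following is a minimal sketch of one way to avoid that from a Java parent: start the child with `ProcessBuilder` and drain its merged output on a background thread. The class name `ChildOutputDrainer` and its methods are hypothetical, and the command to launch is supplied by the caller rather than taken from the thread.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Flume.
class ChildOutputDrainer {

    /** Reads the stream to EOF so the writer never blocks on a full pipe. */
    static String drain(InputStream in) throws IOException {
        StringBuilder out = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                out.append(line).append('\n');
            }
        }
        return out.toString();
    }

    /** Starts the command and drains its stdout/stderr on a daemon thread. */
    static Process startAndDrain(List<String> command) throws IOException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true); // merge stderr into stdout: one pipe to read
        Process child = builder.start();
        Thread drainer = new Thread(() -> {
            try {
                System.out.print(drain(child.getInputStream()));
            } catch (IOException e) {
                // The stream closes when the child is destroyed; nothing left to read.
            }
        }, "child-output-drainer");
        drainer.setDaemon(true); // do not keep the parent JVM alive for the drainer
        drainer.start();
        return child;
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: ChildOutputDrainer <command> [args...]");
            return;
        }
        Process child = startAndDrain(Arrays.asList(args));
        System.out.println("exit code: " + child.waitFor());
    }
}
```

The parent can later call `destroy()` on the returned `Process` and start a fresh one. If the child's output is not needed at all, redirecting it with `builder.redirectOutput(ProcessBuilder.Redirect.DISCARD)` (Java 9+) or to a log file avoids the extra thread.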