Thank you, Ashish. This looks pretty straightforward, and I will try it.

George
On Wed, Oct 30, 2013 at 6:53 PM, Ashish wrote:

> George,
>
> Just to get things working, you can use the UUID Interceptor:
> http://flume.apache.org/FlumeUserGuide.html#uuid-interceptor
>
> Set the headerName field value to rowKey and the code should work. I have
> not used this, but if it still doesn't work, let us know. I will quickly
> hack out a working example.
>
>
> On Thu, Oct 31, 2013 at 1:22 AM, George Pang wrote:
>
>> Thank you, but I am not so sure I can insert a header with the example in
>> this blog. I am missing a part of the whole picture.
>>
>> George
>>
>>
>> On Wed, Oct 30, 2013 at 6:56 AM, Brock Noland wrote:
>>
>>> I just googled and found this. Not sure if there is a better one.
>>>
>>> http://blog.cloudera.com/blog/2013/07/morphlines-the-easy-way-to-build-and-integrate-etl-apps-for-apache-hadoop/
>>>
>>>
>>> On Wed, Oct 30, 2013 at 12:34 AM, George Pang wrote:
>>>
>>>> Is there a tutorial for this topic out there?
>>>>
>>>> Thanks,
>>>>
>>>> George
>>>>
>>>>
>>>> On Tue, Oct 29, 2013 at 6:50 PM, George Pang wrote:
>>>>
>>>>> Hi Brock,
>>>>>
>>>>> The morphline command addValue looks like the one I need, but how can I
>>>>> add the event header key-value pair?
>>>>>
>>>>> Thank you,
>>>>>
>>>>> George
>>>>>
>>>>>
>>>>> On Tue, Oct 29, 2013 at 1:02 PM, George Pang wrote:
>>>>>
>>>>>> Hi Brock,
>>>>>>
>>>>>> Yes, I think the morphline interceptor should be what I am looking
>>>>>> for. I am studying it now.
>>>>>>
>>>>>> Thank you,
>>>>>>
>>>>>> George
>>>>>>
>>>>>>
>>>>>> On Tue, Oct 29, 2013 at 12:56 PM, Brock Noland wrote:
>>>>>>
>>>>>>> In a very simple demo you could use the static interceptor:
>>>>>>> http://flume.apache.org/FlumeUserGuide.html#static-interceptor
>>>>>>>
>>>>>>> but you probably want to use the morphline interceptor or a custom
>>>>>>> interceptor:
>>>>>>> http://flume.apache.org/FlumeUserGuide.html#morphline-interceptor
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Oct 29, 2013 at 2:52 PM, Hari Shreedharan wrote:
>>>>>>>
>>>>>>>> Nope. You need to insert it at some other location.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Hari
>>>>>>>>
>>>>>>>> On Tuesday, October 29, 2013 at 12:48 PM, George Pang wrote:
>>>>>>>>
>>>>>>>> Hi Hari,
>>>>>>>>
>>>>>>>> Is inserting a rowKey header into the event something I can do in
>>>>>>>> flume.conf? I tried to do that, but I am new to Flume.
>>>>>>>>
>>>>>>>> Thank you,
>>>>>>>>
>>>>>>>> George
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Oct 29, 2013 at 12:40 PM, Hari Shreedharan wrote:
>>>>>>>>
>>>>>>>> Did you insert a rowKey header into the event? If the header is
>>>>>>>> not there, you are obviously going to get null returned from
>>>>>>>> currentEvent.getHeaders().get("rowKey"). You need to insert the header
>>>>>>>> into the event at some point.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Hari
>>>>>>>>
>>>>>>>> On Tuesday, October 29, 2013 at 12:30 PM, George Pang wrote:
>>>>>>>>
>>>>>>>> Hi Ashish,
>>>>>>>>
>>>>>>>> Actually, it starts with the headers. The example code has
>>>>>>>> "String rowKeyStr = currentEvent.getHeaders().get("rowKey");" but no
>>>>>>>> such header is found. If I get rid of this line, the rest complains
>>>>>>>> that it is unable to deliver the event. But I checked the event, and
>>>>>>>> it is not null.
>>>>>>>>
>>>>>>>> I am trying to use Flume to save to HBase, and I use the example at
>>>>>>>> http://blog.cloudera.com/blog/2012/11/streaming-data-into-apache-hbase-using-apache-flume/
>>>>>>>> for a customized serializer.
>>>>>>>>
>>>>>>>> flume.conf:
>>>>>>>>
>>>>>>>> logger-agent.sources = Syslog-UDP
>>>>>>>> logger-agent.sinks = Syslog-HBase
>>>>>>>> logger-agent.channels = Syslog-HBase-Channel
>>>>>>>>
>>>>>>>> logger-agent.sources.Syslog-UDP.channels = Syslog-HBase-Channel
>>>>>>>> logger-agent.sinks.Syslog-HBase.channel = Syslog-HBase-Channel
>>>>>>>>
>>>>>>>> logger-agent.sources.Syslog-UDP.type = syslogudp
>>>>>>>> logger-agent.sources.Syslog-UDP.port = 5140
>>>>>>>> logger-agent.sources.Syslog-UDP.host = localhost
>>>>>>>>
>>>>>>>> logger-agent.sinks.Syslog-HBase.type = org.apache.flume.sink.hbase.AsyncHBaseSink
>>>>>>>> logger-agent.sinks.Syslog-HBase.table = syslog2
>>>>>>>> logger-agent.sinks.Syslog-HBase.columnFamily = cluster
>>>>>>>> logger-agent.sinks.Syslog-HBase.serializer.payloadColumn = dev
>>>>>>>> logger-agent.sinks.Syslog-HBase.serializer.incrementColumn = icol
>>>>>>>> logger-agent.sinks.Syslog-HBase.serializer.columns = forum,inbound,outbound
>>>>>>>> logger-agent.sinks.Syslog-HBase.batchSize = 5000
>>>>>>>> logger-agent.sinks.Syslog-HBase.serializer = org.apache.flume.sink.hbase.SimpleAsyncHbaseEventSerializer
>>>>>>>>
>>>>>>>> logger-agent.channels.Syslog-HBase-Channel.type = memory
>>>>>>>>
>>>>>>>>
>>>>>>>> Flume version: 1.4
>>>>>>>>
>>>>>>>> org.apache.flume.FlumeException: No row key found in headers!
>>>>>>>>     at com.ib.SplittingSerializer.setEvent(SplittingSerializer.java:43)
>>>>>>>>     at org.apache.flume.sink.hbase.AsyncHBaseSink.process(AsyncHBaseSink.java:184)
>>>>>>>>     at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
>>>>>>>>     at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>>>>>>>>     at java.lang.Thread.run(Thread.java:662)
>>>>>>>>
>>>>>>>> Thank you,
>>>>>>>>
>>>>>>>> George
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Oct 29, 2013 at 2:29 AM, Ashish wrote:
>>>>>>>>
>>>>>>>> George,
>>>>>>>>
>>>>>>>> Can you share more details about what you are trying to achieve? If
>>>>>>>> possible, please share the Flume version, agent configuration, and
>>>>>>>> exception stack trace.
>>>>>>>> You may also look at the HBase Sink for more info:
>>>>>>>> http://flume.apache.org/FlumeUserGuide.html#hbasesinks
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Oct 29, 2013 at 2:50 PM, George Pang wrote:
>>>>>>>>
>>>>>>>> I use the serializer example in this blog post:
>>>>>>>> http://blog.cloudera.com/blog/2012/11/streaming-data-into-apache-hbase-using-apache-flume/
>>>>>>>>
>>>>>>>> but got "Unable to deliver event. Exception follows.
>>>>>>>> java.lang.NullPointerException". From looking it up in forums, I think
>>>>>>>> it may be caused by an empty header. If so, how is a timestamp header
>>>>>>>> added? If not, what causes the event delivery to fail?
>>>>>>>>
>>>>>>>> Thank you,
>>>>>>>>
>>>>>>>> George
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> thanks
>>>>>>>> ashish
>>>>>>>>
>>>>>>>> Blog: http://www.ashishpaliwal.com/blog
>>>>>>>> My Photo Galleries: http://www.pbase.com/ashishpaliwal
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
>>>
>>
>
>
> --
> thanks
> ashish
>
> Blog: http://www.ashishpaliwal.com/blog
> My Photo Galleries: http://www.pbase.com/ashishpaliwal
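
For reference, the UUID Interceptor suggestion from the thread might look roughly like this when added to the flume.conf quoted above. This is an untested sketch, not a confirmed fix: the interceptor label "uuid" is an arbitrary name chosen here, the Builder class name is the one given in the Flume User Guide's UUID Interceptor section, and it assumes the jar that ships that class (the morphline Solr sink module) is on the agent's classpath.

```properties
# Attach an interceptor to the existing syslog source.
# "uuid" is just a label picked for this sketch.
logger-agent.sources.Syslog-UDP.interceptors = uuid

# UUID Interceptor: stamps each event with a generated unique ID header.
logger-agent.sources.Syslog-UDP.interceptors.uuid.type = org.apache.flume.sink.solr.morphline.UUIDInterceptor$Builder

# Write the ID into the header the serializer reads via
# currentEvent.getHeaders().get("rowKey").
logger-agent.sources.Syslog-UDP.interceptors.uuid.headerName = rowKey
```

The static interceptor mentioned earlier in the thread (type = static, with key = rowKey and a fixed value) would also make the header appear, but it puts the same value on every event, so it is probably only useful for a quick demo.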
