RE: regex_extractor NOT replacing the HDFS path variable

2016-02-17 Thread Sutanu Das
Hi Ian, It is working with your regex with the extra \. Wow Ian, big thank you. I’ll test some more stuff and report tomorrow. Thanks again Ian, huge help. From: iain wright [mailto:iainw...@gmail.com] Sent: Wednesday, February 17, 2016 9:27 PM To: user@flume.apache.org Subject: Re: regex_extractor NO

Control character stuffing when using Kafka Sink

2016-02-17 Thread Pravesh Bhardwaj
Greetings, I am trying to use Flume to read my source files (pipe-delimited text files) and feed them to Kafka. All of the plumbing seems to work fine and all the records are getting into Kafka successfully. However, Flume seems to add NUL and STX control characters at the start of each data lin

Re: regex_extractor NOT replacing the HDFS path variable

2016-02-17 Thread iain wright
Hi Sutanu, This is working out as well: multi-ale2-station.sources.source1.interceptors.i1.regex = host=(\\w+-\\d+-\\w+.attwifi.com) When in doubt, escape, I guess :p Cheers, -- Iain Wright This email message is confidential, intended only for the recipient(s) named above and may contain in
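
A minimal sketch of why the doubled backslashes help, assuming the agent config is read with java.util.Properties (an assumption here, not something stated in the thread). Properties silently drops a backslash that precedes an ordinary character, so \w in the file reaches the regex engine as a plain w. The class name, helper methods, and the sample hostname are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexEscapeDemo {
    // Parse one "key = value" line the way java.util.Properties does and return the value.
    public static String loadValue(String line, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(line));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return props.getProperty(key);
    }

    // Return capture group 1 if the regex is found in the input, otherwise null.
    public static String extract(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String sample = "host=ale-1-sa.attwifi.com"; // hypothetical event text

        // Java string literals double every backslash: "\\w" below is \w in the config file.
        String single = loadValue("regex = host=(\\w+-\\d+-\\w+.attwifi.com)", "regex");
        // "\\\\w" below is \\w in the config file, as in the working config.
        String doubled = loadValue("regex = host=(\\\\w+-\\\\d+-\\\\w+.attwifi.com)", "regex");

        System.out.println(single + " -> " + extract(single, sample));
        System.out.println(doubled + " -> " + extract(doubled, sample));
    }
}
```

With a single backslash the loaded pattern contains literal w and d characters and no longer matches the sample; with the doubled form the pattern keeps \w and \d and the capture group is populated.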

Re: regex_extractor NOT replacing the HDFS path variable

2016-02-17 Thread iain wright
It's definitely something to do with the regex or how Flume/Java is using it or pulling it in from config. Specifically, the \w+-\d+-\w+ isn't matching when used in the regex (but matches in regex testers). The below works if you don't mind being less strict about the contents of host when matching: mul

RE: regex_extractor NOT replacing the HDFS path variable

2016-02-17 Thread Sutanu Das
Thanks Ian, Here is the s.out, which is a text file of the Python script output. We run Hortonworks and we are on HDP 2.3; I think it is Flume 1.5. I look forward to your testing, thanks again Ian. From: iain wright [mailto:iainw...@gmail.com] Sent: Wednesday, February 17, 2016 8:06 PM To: user@f

Re: regex_extractor NOT replacing the HDFS path variable

2016-02-17 Thread iain wright
Hi Sutanu, Bummer. It's definitely supported; we use it for writing to S3 in the exact manner you intend to. If you want to run this to generate some data as it's presented to the source: /usr/local/bin/multi_ale2.py -f /etc/flume/ale_station_conf/m_s.cfg >> out.txt And throw it in a pastebin, or

RE: regex_extractor NOT replacing the HDFS path variable

2016-02-17 Thread Sutanu Das
Hi Ian, Yes, events are getting written there, but the regex_extractor variable is not getting substituted in the HDFS path. I’ve tried both hostname and the regex you advised, yet no luck. Is regex_extractor for the HDFS path of the sink even supported? 18 Feb 2016 00:58:40,855 INFO [SinkRunner-Poll

Re: regex_extractor NOT replacing the HDFS path variable

2016-02-17 Thread iain wright
Config looks sane. Are events being written to /prod/hadoop/smallsite/flume_ingest_ale2//%Y/%m/%d/%H? A couple of things that may be worth trying if you haven't yet: - Try host=(ale-\d+-\w+.attwifi.com) instead of .*host=(ale-\d+-\w+.attwifi.com).* - Try hostname or another header instead of host
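
A simplified model of the header substitution being discussed, not Flume's own code: it assumes a %{name} tag in the sink path is replaced by the event header of that name, and by an empty string when the header is missing. That assumption is consistent with the doubled slash in the path quoted above. The class name, method name, and sample hostname are hypothetical, and the time escapes (%Y and so on) are deliberately left untouched.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HeaderPathDemo {
    // Matches %{name} and captures the header name.
    private static final Pattern HEADER_TAG = Pattern.compile("%\\{([^}]+)\\}");

    // Replace each %{name} with the header value, or with "" when the header is absent.
    public static String expandHeaders(String path, Map<String, String> headers) {
        Matcher m = HEADER_TAG.matcher(path);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = headers.getOrDefault(m.group(1), "");
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String path = "/prod/hadoop/smallsite/flume_ingest_ale2/%{host}/%Y/%m/%d/%H";

        // No "host" header: the interceptor regex did not match, so nothing was extracted.
        Map<String, String> noHeader = new HashMap<>();
        System.out.println(expandHeaders(path, noHeader));

        // "host" header present: the interceptor regex matched (hypothetical value).
        Map<String, String> withHeader = new HashMap<>();
        withHeader.put("host", "ale-1-sa.attwifi.com");
        System.out.println(expandHeaders(path, withHeader));
    }
}
```

Under this model, an empty path segment is a sign that the interceptor never set the header, which points back at the regex rather than at the sink.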

regex_extractor NOT replacing the HDFS path variable

2016-02-17 Thread Sutanu Das
Hi Hari/Community, We are trying to replace the HDFS path with the regex_extractor interceptor, but apparently the variable is not getting replaced in the HDFS path in the HDFS Sink. We are trying to replace the HDFS path of the HDFS Sink with /prod/hadoop/smallsite/flume_ingest_ale2/%{host}/%Y/

Control characters problem with Kafka Sink

2016-02-17 Thread Pravesh Bhardwaj
Greetings, I am trying to use Flume to read my source files (pipe-delimited text files) and feed them to Kafka. All of the plumbing seems to work fine and all the records are getting into Kafka successfully. However, Flume seems to add NUL and STX control characters at the start of each data lin