Re: Flume bz2 issue while processing by a map reduce job

2012-11-02 Thread Mike Percy
Hi Jagadish, My understanding based on investigating this issue over the last couple of days is that MapReduce jobs will only read the first section of a concatenated bzip2 file. I believe you are correct that https://issues.apache.org/jira/browse/HADOOP-6852 is the only way to solve this issue, an
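The behaviour described above (a reader that stops after the first section of a concatenated bzip2 file) can be reproduced outside Hadoop. The following is a minimal sketch in plain Python, not Hadoop's codec; the event payloads are made up for illustration. It contrasts a single-stream decompressor with a multi-stream one:

```python
import bz2

# Two independently compressed bzip2 streams written back to back, which is
# what appending compressed chunks to one file produces.
first = bz2.compress(b"event-1\n")
second = bz2.compress(b"event-2\n")
concatenated = first + second

# A single-stream decompressor stops at the end of the first stream.
single = bz2.BZ2Decompressor()
head = single.decompress(concatenated)
leftover = single.unused_data  # the second stream, still compressed

# bz2.decompress() understands multiple concatenated streams.
everything = bz2.decompress(concatenated)
```

A reader built like `single` silently drops everything after the first stream unless it restarts decompression on `unused_data`.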

Re: SNMP Source

2012-11-10 Thread Mike Percy
Hi Simon, Nothing that I know of. Of course, contributions are welcome! :) Regards, Mike On Fri, Nov 9, 2012 at 3:04 AM, Simon Monecke wrote: > Hi, > > is there any solutions to receive SNMP-Logs with flume? > > Regards, > Simon >

Re: [ANNOUNCE] New Apache Flume committer - Patrick Wendell

2012-11-13 Thread Mike Percy
Patrick, welcome! Great to have you on board. Regards, Mike On Mon, Nov 12, 2012 at 1:04 PM, Hari Shreedharan wrote: > On behalf of the Apache Flume PMC, I am excited to welcome Patrick > Wendell as a committer on Flume! Patrick has contributed significantly to > the project, by adding new fea

Re: .tmp in hdfs sink

2012-11-15 Thread Mike Percy
Hi Mohit, this is a complicated issue. I've filed https://issues.apache.org/jira/browse/FLUME-1714 to track it. In short, it would require a non-trivial amount of work to implement this, and it would need to be done carefully. I agree that it would be better if Flume handled this case more gracefu

Re: .tmp in hdfs sink

2012-11-20 Thread Mike Percy
e limit? > > On Thu, Nov 15, 2012 at 8:14 PM, Mohit Anchlia wrote: > >> Thanks Mike it makes sense. Anyway I can help? >> >> >> On Thu, Nov 15, 2012 at 11:54 AM, Mike Percy wrote: >> >>> Hi Mohit, this is a complicated issue. I've filed >>

Re: .tmp in hdfs sink

2012-11-20 Thread Mike Percy
> > > > Thanks again for committing this change. Do you know when 1.3.0 is out? > I am > > currently using the snapshot version of 1.3.0 > > > > On Tue, Nov 20, 2012 at 11:16 AM, Mike Percy wrote: > >> > >> Mohit, > >> FLUME-1660 is now comm

Re: Netcat source stops processing data

2012-11-20 Thread Mike Percy
Rahul, A patch and a unit test to add this as an option would be greatly appreciated! There is already a JIRA open for this: https://issues.apache.org/jira/browse/FLUME-1713 Regards, Mike On Tue, Nov 20, 2012 at 3:20 PM, Rahul Ravindran wrote: > Pinging on this slightly old thread. > > I want

Re: Running multiple flume versions on the same box

2012-11-21 Thread Mike Percy
There are no system level singletons or hard-coded file paths or ports if that is what you mean. But in a production scenario, Flume should be resilient to failures since it will just buffer events in the channel at each agent. So why run simultaneous versions when doing minor version upgrades? (I

Re: A customer use case

2012-12-04 Thread Mike Percy
Hi Emile, On Tue, Dec 4, 2012 at 2:04 AM, Emile Kao wrote: > > 1. Which is the best way to implement such a scenario using Flume/ Hadoop? > You could use the file spooling client / source to stream these files back in the latest trunk and upcoming Flume 1.3.0 builds, along with hdfs sink. 2. Th

Re: [ANNOUNCE] Apache Flume 1.3.1 released

2013-01-04 Thread Mike Percy
Hari, Thanks for taking care of this release! Well done! Regards, Mike On Wed, Jan 2, 2013 at 3:53 PM, Hari Shreedharan wrote: > The Apache Flume team is pleased to announce the release of Flume > version 1.3.1. > > Flume is a distributed, reliable, and available service for efficiently > colle

New blog post on Flume performance tuning

2013-01-11 Thread Mike Percy
Hi folks, I just posted to the Apache blog on how to do performance tuning with Flume. I plan on following it up with a post about using the Flume monitoring capabilities while tuning. Feedback is welcome. https://blogs.apache.org/flume/entry/flume_performance_tuning_part_1 Regards, Mike

Re: New blog post on Flume performance tuning

2013-01-11 Thread Mike Percy
Thanks Brock! I've been working on this, off and on, for a while. :) On Fri, Jan 11, 2013 at 12:18 PM, Brock Noland wrote: > Nice post! > > On Fri, Jan 11, 2013 at 12:13 PM, Mike Percy wrote: > > Hi folks, > > I just posted to the Apache blog on how to do perfo

Re: New blog post on Flume performance tuning

2013-01-11 Thread Mike Percy
> Thank you so much Mike, for all the good work. > > > > Warm Regards, > > Tariq > > https://mtariq.jux.com/ > > > > > > On Sat, Jan 12, 2013 at 2:15 AM, Mike Percy wrote: > >> > >> Thanks Brock! I've been working on this, off and on

Re: Constant Traffic on port 35872

2013-01-16 Thread Mike Percy
I know next to nothing about Flume OG but if I had to guess I'd say it's either a heartbeat or metrics collection. Why do you want it to stop? On Wed, Jan 16, 2013 at 5:06 PM, James Stewart wrote: > Hello all, > > I’m using flume 0.9.4 – before anybody mentions it, we aren’t in a

Re: Constant Traffic on port 35872

2013-01-16 Thread Mike Percy
ss a WAN and with a lot of nodes it’s a significant > enough amount of data to be a problem. > > I don’t know much about Java, but could this be something to do with > Thrift? > > *From:* Mike Percy [mailto:mpe...@apache.org] > *Sent:* Th

Re: Uncaught Exception When Using Spooling Directory Source

2013-01-17 Thread Mike Percy
Hi Henry, The files must be immutable before putting them into the spooling directory. So if you copy them from a different file system then you can run into this issue. The right way to do it is to copy them to the same file system and then atomically move them into the spooling directory. Regard
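The copy-then-atomic-move procedure described above might look like the sketch below. It is only an illustration: the function name and the `incoming`, `staging` and `spool` directories are hypothetical, and it assumes `staging` is on the same file system as the spooling directory and is not itself watched by Flume.

```python
import os
import shutil
import tempfile


def deliver_to_spool(src_path, staging_dir, spool_dir):
    """Copy src_path into staging_dir, then rename it into spool_dir."""
    name = os.path.basename(src_path)
    staged = os.path.join(staging_dir, name)
    shutil.copyfile(src_path, staged)   # the slow copy happens outside the spool dir
    final = os.path.join(spool_dir, name)
    os.rename(staged, final)            # atomic on POSIX within one file system
    return final


# Usage with throwaway directories standing in for the real ones.
base = tempfile.mkdtemp()
incoming = os.path.join(base, "incoming")
staging = os.path.join(base, "staging")
spool = os.path.join(base, "spool")
for d in (incoming, staging, spool):
    os.mkdir(d)

log_path = os.path.join(incoming, "app.log")
with open(log_path, "w") as f:
    f.write("line 1\nline 2\n")

delivered = deliver_to_spool(log_path, staging, spool)
```

Because the file only appears in the spooling directory through the rename, the source never sees a partially written file.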

Re: Uncaught Exception When Using Spooling Directory Source

2013-01-17 Thread Mike Percy
give me some advice about how to design the architecture? Which > type of source and sink can fit? > > Thanks! > > > On Fri, Jan 18, 2013 at 2:05 PM, Mike Percy wrote: > >> Hi Henry, >> The files must be immutable before putting them into the spooling >> directo

Re: Uncaught Exception When Using Spooling Directory Source

2013-01-18 Thread Mike Percy
source with "tail -F" but that is much more >>> unreliable than the spooling file source. >>> >>> Regards, >>> Mike >>> >>> >>> On Thu, Jan 17, 2013 at 10:23 PM, Henry Ma wrote: >>> >>>> OK, thank you very m

Re: Can we treat a whole file as a Flume event?

2013-01-22 Thread Mike Percy
Check out the latest changes to SpoolingFileSource w.r.t. EventDeserializers on trunk. You can deserialize a whole file that way if you want. Whether that is a good idea depends on your use case, though. It's on trunk, lacking user docs for the latest changes but I will try to hammer out updated d

Re: Can we treat a whole file as a Flume event?

2013-01-22 Thread Mike Percy
is case the file being moved) as flume is not quite intended > for large events. Mike perhaps you can throw some light on that aspect ? > > > On Tue, Jan 22, 2013 at 12:17 AM, Mike Percy wrote: > >> Check out the latest changes to SpoolingFileSource w.r.t. >> EventDeseria

Re: what are the libraries needed for flume log4jappender

2013-01-23 Thread Mike Percy
What version of Flume are you using? Are you using Maven for your build? You should be able to get away with just flume-ng-core. On Wed, Jan 23, 2013 at 10:02 AM, yogender nerella wrote: > Hi, > > I would like to make my app directly write events to an flume agent. > > What are the libraries ne

Re: Can we treat a whole file as a Flume event?

2013-01-23 Thread Mike Percy
Jan 22, 2013 at 6:39 PM, Mike Percy wrote: > >> Hi Roshan, >> Yep in general I'd have concerns w.r.t. capacity planning and garbage >> collector behavior for large events. Flume holds at least one event batch >> in memory at once, depending on # of sources/

Re: what are the libraries needed for flume log4jappender

2013-01-23 Thread Mike Percy
as the same issue. > > Yogi > > > On Wed, Jan 23, 2013 at 11:36 AM, Mike Percy wrote: > >> What version of Flume are you using? Are you using Maven for your build? >> >> You should be able to get away with just flume-ng-core. >> >> >> On Wed, Jan 2

Re: what are the libraries needed for flume log4jappender

2013-01-23 Thread Mike Percy
at > org.apache.flume.api.RpcClientFactory.getDefaultInstance(RpcClientFactory.java:168) > at > org.apache.flume.api.RpcClientFactory.getDefaultInstance(RpcClientFactory.java:128) > at > org.apache.flume.clients.log4jappender.Log4jAppender.activateOptions(Log4jAppender.java:184) > > > Appreciate y

Re: Authentication - Avro Source, Sink, RpcClient

2013-01-23 Thread Mike Percy
I agree that AvroSource/Sink SASL and Kerberos auth would be really useful. It would need some work at the Avro level, though. There is also the possibility of doing the same thing on top of Thrift, in which case it would require a brand new source/sink/client implementation but it wouldn't requir

Re: what are the libraries needed for flume log4jappender

2013-01-23 Thread Mike Percy
eds flume-ng-sdk.jar file. > > In that case, if I want to ship flume log4jappender, should I have to ship > all these jar files in flume/lib directory? > > Yogi > > > On Wed, Jan 23, 2013 at 12:08 PM, Mike Percy wrote: > >> I don't use Eclipse but my understa

Re: Can we treat a whole file as a Flume event?

2013-01-23 Thread Mike Percy
Yep my bad, typo :) On Wed, Jan 23, 2013 at 1:04 PM, Roshan Naik wrote: > Thats SpoolDirectorySource.java .. i thought you referred to > SpoolingFileSource > earlier. i assume that was a typo ? > > > On Wed, Jan 23, 2013 at 11:53 AM, Mike Percy wrote: > >> >>

Re: Setting up flume to use ganglia results in a lot of error messages in /var/log/messages

2013-01-23 Thread Mike Percy
Not sure when or how it broke, as I know of people using it in production. There is a way to configure it for different versions of Ganglia, like 3.0, 3.1. Might be worth trying both values to see if it's a problem with one or the other: http://flume.apache.org/FlumeUserGuide.html#ganglia-reporting

Re: flume-cassandra

2013-01-23 Thread Mike Percy
Hi Sri, Cloudera originally created Flume, then contributed it to the Apache Software Foundation (ASF), and continues to invest heavily into it under the auspices of the ASF. The current generation of Flume is called Flume NG. I encourage you to use the latest "NG" generation of Flume (version 1.x)

Re: flume-cassandra

2013-01-23 Thread Mike Percy
lume collector with cassandra. If > any body tried it before please help me. > thank in advance. > > > On Thu, Jan 24, 2013 at 10:26 AM, Mike Percy wrote: > >> Hi Sri, >> Cloudera originally created Flume, then contributed it to the Apache >> Software Foundati

Re: Reliability in Flume

2013-01-23 Thread Mike Percy
Henry, Please see inline... On Wed, Jan 23, 2013 at 7:26 PM, Henry Ma wrote: > Dear Flume developers and users, > > I understand that Flume NG uses channel-based transactions to guarantee > reliable message delivery between agents. But in some extreme failure > scenes, will Flume keep total Reli

Re: flume-cassandra

2013-01-24 Thread Mike Percy
> On Thu, Jan 24, 2013 at 10:43 AM, Mike Percy wrote: > >> What do you mean by "collector"? >> >> >> On Wed, Jan 23, 2013 at 9:05 PM, Sri Ramya wrote: >> >>> Thank you very much. But I need a collector in my application, flume-ng >>>

Re: HDFS Test Failure

2013-01-25 Thread Mike Percy
Seems strange. Connor have you tried running "mvn clean install" and do you get the same results? Flume is weird because we push SNAPSHOT builds per commit so you have to install to avoid strange dependency issues sometimes. It's especially insidious to do mvn clean package. I don't know if it's

Re: Flume-NG : Spooling dir source : java.io.IOException: Stream closed

2013-01-27 Thread Mike Percy
bcc: cdh-u...@cloudera.org No version of CDH currently ships with Flume 1.3.1, so redirecting this question to the user@flume.apache.org user list. Regards, Mike On Sun, Jan 27, 2013 at 8:56 PM, NGuyen thi Kim Tuyen wrote: > I'm using Flume-Ng 1.3.1 . > > At 11:33:49 UTC+7, Monday, the 28th of

Re: Flume-NG 1.3.1 : Spooling dir source : java.io.IOException: Stream closed

2013-01-27 Thread Mike Percy
Hi Nguyễn, The spooling source only works on "done", immutable files. So they have to be atomically moved and they cannot be modified after being placed into the spooling directory. Regards, Mike On Sun, Jan 27, 2013 at 11:14 PM, NGuyen thi Kim Tuyen < tuyen03a...@gmail.com> wrote: > Hi , > > Pl

Re: SpoolDir marks item as completed, when sink fails

2013-02-01 Thread Mike Percy
Tzur, that is expected, because the data is committed by the source onto the channel. Sources and sinks are decoupled, they only interact via the channel, which buffers the data and serves to mitigate impedance mismatches. On Thu, Jan 31, 2013 at 2:35 PM, Tzur Turkenitz wrote: > Hello all, > >
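To make the decoupling concrete, here is a toy model in Python. It is a sketch only, not Flume's transaction API: `queue.Queue` stands in for the channel, and the names `source_ingest`, `sink_drain` and `broken_sink` are invented for the example. The source reports its input as completed as soon as the events are on the channel, whether or not any sink can deliver them.

```python
import queue

channel = queue.Queue(maxsize=100)      # stand-in for a Flume channel


def source_ingest(lines):
    """Put every line on the channel, then report the input as completed."""
    for line in lines:
        channel.put_nowait(line)        # raises queue.Full if the channel is full
    return "COMPLETED"                  # decided without consulting any sink


def sink_drain(deliver):
    """Take events one at a time; put an event back if delivery fails."""
    delivered = 0
    while not channel.empty():
        event = channel.get_nowait()
        try:
            deliver(event)
        except IOError:
            channel.put_nowait(event)   # crude rollback: goes to the back of the queue
            break
        delivered += 1
    return delivered


def broken_sink(event):
    raise IOError("destination unavailable")


status = source_ingest(["a", "b", "c"])
failed_count = sink_drain(broken_sink)  # sink is down: nothing is delivered
buffered = channel.qsize()              # events are still sitting in the channel

received = []
recovered_count = sink_drain(received.append)  # sink is back: channel drains
```

Note that this toy rollback changes event order; it is only meant to show that a failing sink leaves the data buffered rather than lost.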

Re: SpoolDir marks item as completed, when sink fails

2013-02-05 Thread Mike Percy
fore it crashed then a > "Replay" will be done to resend the whole data? > > Just trying to grasp the basics > > > > > On Fri, Feb 1, 2013 at 4:56 AM, Mike Percy > > > wrote: > >> Tzur, that is expected, because the data is committed by the sourc

Re: SpoolDir marks item as completed, when sink fails

2013-02-05 Thread Mike Percy
taken from the sink at the next opportunity. Regards Mike On Tuesday, February 5, 2013, Mike Percy wrote: > Tzur, > The source and sink are decoupled completely. The source will fill the > channel until there is no more work or the channel is full. So the data is > sitting buffered in

Re: Authentication - Avro Source, Sink, RpcClient

2013-02-05 Thread Mike Percy
> > Regards, > > Rudolf > > *From:* Mike Percy [mailto:mpe...@apache.org] > *Sent:* Wednesday, January 23, 2013 9:16 PM > *To:* user@flume.apache.org > *Subject

Re: Authentication - Avro Source, Sink, RpcClient

2013-02-05 Thread Mike Percy
> > This might not be the nicest way to implement authentication, but this way > it’s pretty much transparent to Flume. > > I think it would be pretty easy to implement some kind of encryption too > using the other org.apache.avro.ipc.RPCPlugin methods. > >

Re: Flume and JMX

2013-02-06 Thread Mike Percy
What exactly were you looking for? On Wed, Feb 6, 2013 at 7:35 AM, wrote: > I am looking to deploy and manage multiple Flume deployment through JMX. > Besides the JMXPollUtil does Flume have an hooks that would enable this? > > Thanks, > > Matt > > This messag

Re: Flume and JMX

2013-02-06 Thread Mike Percy
ing is if flume provides > JMX type services like your standard app containers like jboss, web sphere, > etc. If not can we make use of the mbeans that are already there? > > *From:* Mike Percy [mailto:mpe...@apache.org] > *Sent:* Wednesday, February 06

Re: Flume and JMX

2013-02-06 Thread Mike Percy
terested in helping out :) Regards, Mike On Wed, Feb 6, 2013 at 11:43 AM, wrote: > Deploy, delete, start, stop, update configuration, restart, etc > > *From:* Mike Percy [mailto:mpe...@apache.org] > *Sent:* Wednesday, February 06, 2013 2:38 PM > >

Re: Flume in Windows?

2013-02-06 Thread Mike Percy
I know of someone who does this. They wrote their own startup scripts and stuff. Regards Mike On Wed, Feb 6, 2013 at 5:34 AM, venkatramanan wrote: > Hi, > > Am new in apache flume. > > Is there any possible to run the flume agent in windows 7. > > please advise > > thanks, > Venkat N > >

Re: Analysis of Data

2013-02-07 Thread Mike Percy
Let's take this conversation further. What is missing? On Thu, Feb 7, 2013 at 2:39 AM, Inder Pall wrote: > flume is a platform to get events to the right sink (HDFS, local-file, > ) > analytics is not something which falls in it's territory > > - Inder > > > On Thu, Feb 7, 2013 at 3:22 PM,

Re: Flume NG and zookeeper

2013-02-07 Thread Mike Percy
Integrate in what way? On Thu, Feb 7, 2013 at 6:36 PM, 吳瑞琳 wrote: > Hi all, > > I am trying to integrate Flume NG and zookeeper. However, I did not find > any configuration about this in Flume NG. Could you please advise how to > deal with this? > > Thanks, > RL

Re: Analysis of Data

2013-02-07 Thread Mike Percy
ut because Flume can pipe data to downstream agents who can do the heavy processing, it seems to me that this requirement is easily fulfilled by Flume. Regards, Mike On Thu, Feb 7, 2013 at 4:29 PM, Mike Percy wrote: > >> Let's take this conversation further. What is missing?

Re: Analysis of Data

2013-02-07 Thread Mike Percy
Hi Steven, Thanks for chiming in! Please see my responses inline: On Thu, Feb 7, 2013 at 3:04 PM, Steven Yates wrote: > The only missing link within the Flume architecture I see in this > conversation is the actual channel's and brokers themselves which > orchestrate this lovely undertaking of da

Re: Flume NG and zookeeper

2013-02-07 Thread Mike Percy
> some Zookeeper node when it is up. > > Regards, > RL > > > 2013/2/8 Mike Percy > >> Integrate in what way? >> >> >> On Thu, Feb 7, 2013 at 6:36 PM, 吳瑞琳 wrote: >> >>> Hi all, >>> >>> I am trying to integrate Flume NG an

Re: Analysis of Data

2013-02-08 Thread Mike Percy
Nitin, Good to hear more of your thoughts. Please see inline. On Thu, Feb 7, 2013 at 8:55 PM, Nitin Pawar wrote: I can understand the idea of having data processed inside flume by > streaming it to another flume agent. But do we really need to re-engineer > something inside flume is what I am t

Re: Analysis of Data

2013-02-08 Thread Mike Percy
sink can continue. >> >> In another route, we can have it to sink to a processor source of flume >> which then converts the data and runs quick analysis on data in memory and >> update the global counters kind of things which then can be sink to live >> reporting syst

Re: How to load zip file into hdfs sink using flume-ng

2013-02-08 Thread Mike Percy
Actually it might be tricky to use the directory spooling source to read a compressed archive. It's possible, but you would definitely need to write your own deserializer. Flume is an event-oriented streaming system, it's not really optimized to be a plain file transfer mechanism like FTP. Regard

Re: It's better not to use thrift?

2013-02-20 Thread Mike Percy
Hari has done recent work on a modern Thrift RPC implementation. The existing impl. is there for legacy purposes and does not have a batch append() call so it turns out to be quite slow. Have you considered using the HTTP source? With decent batch sizes and keep-alive the performance might be fine

Re: Flume Ng replaying events when the source is idle

2013-03-04 Thread Mike Percy
Sagar, Just try "tail -F" on the same file over and over on the command line. It will display the last few lines. If you want to avoid this, try "tail -F -n 0 filename" and you should not see this. Every time you reload your configuration file, the specified command is re-executed by the source.

Re: Flume secure communication

2013-03-12 Thread Mike Percy
No network encryption support yet but there is a patch up at https://issues.apache.org/jira/browse/FLUME-997 for this functionality. You are welcome to take a look and provide any comments. Not sure what you mean by #2, you would have to share more about your requirements / use case. Regards, M

Re: Flume secure communication

2013-03-12 Thread Mike Percy
eouts/network errors >> >> Inder >> >> >> On Tuesday, March 12, 2013, Mike Percy wrote: >>> No network encryption support yet but there is a patch up at >>> https://issues.apache.org/jira/browse/FLUME-997 for this functionality. You >>> are

Re: Flume secure communication

2013-03-12 Thread Mike Percy
il recently, so nobody had worked on it. Regards, Mike On Tue, Mar 12, 2013 at 12:41 PM, Mike Percy wrote: > It's certainly possible to sniff the wire traffic using some tool like > WireShark. > > Regards, > Mike > > Sent from my iPhone > > On Mar 12, 2013, at 5:29

Re: Writing to HDFS from multiple HDFS agents (separate machines)

2013-03-14 Thread Mike Percy
Hi Gary, All the suggestions in this thread are good. Something else to consider is that adding multiple HDFS sinks pulling from the same channel is a recommended practice to maximize performance (competing consumers pattern). In that case, not only would it be a good idea to put the data into dire

Re: Parameters in Configuration File

2013-03-15 Thread Mike Percy
Hey Connor, Take a look at the discussion @ https://issues.apache.org/jira/browse/FLUME-1941 If you want to help work on this you are more than welcome to :) Regards, Mike On Thu, Mar 14, 2013 at 5:09 PM, Connor Woodson wrote: > Does the Flume configuration file support parameters, for instanc

Re: Parameters in Configuration File

2013-03-15 Thread Mike Percy
there is a lot of boilerplate logic in the > various implementations of Configurable.configure(Context) that is just > completely unnecessary. In addition, you wouldn't have to spend time > building something like property substitution that's been done many times > before. > >

Re: Writing to HDFS from multiple HDFS agents (separate machines)

2013-03-15 Thread Mike Percy
In my experience, 3-5 HDFS sinks will give optimal performance, but it's dependent on whether you use memory channel or file channel, your overall throughput, batch sizes, and event sizes. Regards, Mike On Thu, Mar 14, 2013 at 7:42 PM, Gary Malouf wrote: > Thanks for the pointer Mike. Any tho
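The competing consumers pattern mentioned in this thread can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Flume code: threads play the role of HDFS sinks, `queue.Queue` plays the channel, and `drain_with_sinks`, `sink_count` and `batch_size` are names invented for the example.

```python
import queue
import threading


def drain_with_sinks(events, sink_count, batch_size):
    """Drain one shared channel with several sinks, each taking batches."""
    channel = queue.Queue()
    for event in events:
        channel.put(event)

    # One list of written batches per sink (each sink would own its own files).
    written = [[] for _ in range(sink_count)]

    def sink(index):
        while True:
            batch = []
            try:
                while len(batch) < batch_size:
                    batch.append(channel.get_nowait())
            except queue.Empty:
                pass                      # channel drained; flush what we have
            if not batch:
                return
            written[index].append(batch)

    threads = [threading.Thread(target=sink, args=(i,)) for i in range(sink_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return written


result = drain_with_sinks(range(1000), sink_count=4, batch_size=100)
all_events = sorted(e for per_sink in result for batch in per_sink for e in batch)
```

Each event is taken by exactly one sink, so adding sinks spreads the work without duplicating data; how many sinks actually help depends on the factors listed above.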

Re: Simple HDFS Sink file rolling question please.

2013-03-25 Thread Mike Percy
Hi Chris, Check out hdfs.idleTimeout parameter. Maybe set it to 5 minutes (i.e. hdfs.idleTimeout = 300) or something. http://flume.apache.org/FlumeUserGuide.html Regards, Mike On Thu, Mar 21, 2013 at 1:21 PM, Chris Neal wrote: > Hi :) > > I have an ExecSource running a tail -F on a bunch of

Re: How do I search the past posts for a topic?

2013-04-03 Thread Mike Percy
Jayashree, I like to use the search-hadoop.com site provided by Sematext: http://search-hadoop.com/?q=&fc_project=Flume The logger is intended mainly for debugging. It will print data to the flume.log file itself. Regards, Mike On Sun, Mar 31, 2013 at 6:56 PM, JR wrote: > Hello, > >I wou

Re: Flume service error from cloudera manager

2013-04-19 Thread Mike Percy
(bcc: user@flume.apache.org) Hi Madhu, Thanks for reaching out. The appropriate support channel for Cloudera Manager is the cdh-u...@cloudera.org email list. I have redirected your question there. Regards, Mike On Thu, Apr 18, 2013 at 9:01 PM, Madhusudhan Reddy Munagala < madhu.munag...@gmail.c

Re: [NEW FEATURE] - FLUME-1687 - Solr Sink for Apache Flume Now In Beta

2013-04-22 Thread Mike Percy
Israel, Nice! I'll find some time to dig into your patch this week. Regards, Mike On Sat, Apr 20, 2013 at 10:43 AM, Israel Ekpo wrote: > Fellow Flume Users, > > I have just created an Apache Solr sink for Flume against version 1.3.1 > > This has been tested and it works fine. > > This sink is

Re: How to get a bad message out of the channel?

2013-05-10 Thread Mike Percy
Hook up a HDFS sink to them that doesn't use %Y, %m, etc in the configured path. HTH, Mike On May 10, 2013, at 11:00 AM, Matt Wise wrote: > Eek, this was worse than I thought. Turns out message continued to be added > to the channels, but no transactions could complete to take messages out of

Re: Expirience in using Apache Flume in OSGi environment

2013-05-20 Thread Mike Percy
Andrey, What is the use case? Can you provide more detail? Thanks, Mike On Sat, May 18, 2013 at 2:53 AM, Andrey Poltavtsev wrote: > Hi, > > I did not found in existing Apache Flume distribution | documentation > (User guide | Developers Guide) any information regarding using of Apache > Flume

Re: Flume 1.4 release

2013-05-21 Thread Mike Percy
Hi Rahul, I think end of June is a little tight, usually it takes a while to do a release and we have not discussed it lately. I'd say early July is more likely. Let me start a discussion. Regards, Mike On Tue, May 21, 2013 at 10:49 AM, Rahul Ravindran wrote: > Hi, > Is there a rough estim

Re: Expirience in using Apache Flume in OSGi environment

2013-05-22 Thread Mike Percy
Andrey, I don't know of anyone doing that and I'd be surprised if you didn't run into some issues. We try to avoid static instances but who knows. I would try to just use the Flume client API (SDK) in your app and deploy Flume as a normal daemon. Mike On Tue, May 21, 2013 at 5:31 AM, Andrey Polt

Re: Spooling fileSuffix attribute ignored

2013-05-22 Thread Mike Percy
Hi Phil, Nice approach. How is the spooling directory source working for you? Any thoughts on how it could be improved? Mike On Tue, May 21, 2013 at 8:17 AM, Phil Scala wrote: > Hi, > > Based on my use and understanding that setting “fileSuffix” is simply the > extension to the fil

Re: How to get a bad message out of the channel?

2013-05-22 Thread Mike Percy
'failsafe' path to write messages to when they are missing that kind of > data? > > --Matt > > On May 10, 2013, at 6:30 PM, Mike Percy wrote: > > > Hook up a HDFS sink to them that doesn't use %Y, %m, etc in the > configured path. > > > > HTH, >

Re: What does the file header mean ? Flume always add headers to file header

2013-05-22 Thread Mike Percy
You probably figured this out by now but those are Avro container files :) see http://avro.apache.org Regards Mike On Wed, May 15, 2013 at 3:06 AM, higkoohk wrote: > Maybe it make by 'tengine.sinks.hdfs4log.serializer = avro_event' , but > still don't know why and howto ... > > > 2013/5/15 h

Re: Setting Hadoop-specific settings for the HDFS plugin?

2013-05-22 Thread Mike Percy
You can do it in your hdfs-site.xml file which Flume will pull in when it detects Hadoop from the environment. Mike On Wed, May 15, 2013 at 9:22 AM, Matt Wise wrote: > How do I pass hadoop-specific configuration settings to the HDFS plugin in > Flume 1.3.0? Specifically, I need to modify the f

Re: AvroSource HTTP vs Netty with Python bindings..

2013-05-22 Thread Mike Percy
Yep still true. There is a Thrift source on trunk though, also consider the HTTP source for integration with Python. Mike On Wed, May 8, 2013 at 1:25 PM, Matt Wise wrote: > It seems like the current Python Avro package does not support the > Flume-NG AvroSource... Is this still true? > > > htt

Re: Spooling fileSuffix attribute ignored

2013-05-22 Thread Mike Percy
around. Thanks a lot. > > From: Mike Percy > Reply-To: Flume User List > Date: Wednesday, May 22, 2013 09:35 > To: Flume User List > Subject: Re: Spooling fileSuffix attribute ignored > > Hi Phil, > Nice approach. How is the spooling directory source working

Re: Checking channel size.

2013-05-22 Thread Mike Percy
You can attach to the process locally via JMX and pull the metric from there. I'm not sure how to do it via the command line though. Mike On Wed, May 22, 2013 at 12:54 AM, Pranav Sharma wrote: > Is there a way to check the size of a channel either programmatically or > using a command line? I'

Re: Checking channel size.

2013-05-22 Thread Mike Percy
On Wed, May 22, 2013 at 1:00 AM, Mike Percy wrote: >> You can attach to the process locally via JMX and pull the metric from >> there. I'm not sure how to do it via the command line though. >> >> Mike >> >> >> >> On Wed, May 22, 2013 at 12:

Re: Missing headers when using AVRO Sink/Source

2013-05-22 Thread Mike Percy
FYI there is a stock timestamp interceptor, if you want to use that. Mike On May 22, 2013, at 3:20 AM, ZORAIDA HIDALGO SANCHEZ wrote: > Dear all, > > I made a custom interceptor in order to insert the timestamp header that is > used by the HDFS sink. > Firstly, I run an example using SPOOLING

Re: How to get a bad message out of the channel?

2013-05-23 Thread Mike Percy
e event to an alternate channel where it can be handled > differently > > Anything other than "stop pulling data from the channel and let the > channel fill" > > --Matt > > On May 22, 2013, at 12:39 AM, Mike Percy wrote: > > Hi Matt, > Nope, there is cu

Re: Apache Flume meetup at Hadoop Summit

2013-06-25 Thread Mike Percy
This event is tonight! Hope to see many of you there. Mike On Tue, Jun 25, 2013 at 12:58 PM, Hari Shreedharan < hshreedha...@cloudera.com> wrote: > Hi all, > > I am sorry if this is a bit late, but I'd like to invite you all to the > Flume meetup at Hadoop Summit in San Jose, CA. Please see >

[ANNOUNCE] Apache Flume 1.4.0 released

2013-07-02 Thread Mike Percy
anov Jarek Jarcec Cecho Jeff Lord Joey Echeverria Jolly Chen Juhani Connolly Mark Grover Mike Percy Mubarak Seyed Nitin Verma Oliver B. Fischer Patrick Wendell Paul Chavez Pedro Urbina Escos Phil Scala Rahul Ravindran Ralph Goers Roman Shaposhnik Roshan Naik Sravya Tirukkovalur Steve Hoffman Ted Mal

Re: Block Under-replication detected. Rotating file.

2013-08-22 Thread Mike Percy
Are you sure your HDFS cluster is configured properly? How big is the cluster? It's complaining that your HDFS blocks are not replicated enough based on your configured replication factor, and tries to get a sufficiently replicated pipeline by closing the current file and opening a new one to wr

Re: Block Under-replication detected. Rotating file.

2013-08-22 Thread Mike Percy
unter for replication attempts, that explains it. > > Thanks. > > > > On Thu, Aug 22, 2013 at 1:13 PM, Mike Percy wrote: >> Are you sure your HDFS cluster is configured properly? How big is the >> cluster? >> >> It's complaining that you

Re: [ANNOUNCE] New Flume Committer - Wolfgang Hoschek

2013-09-24 Thread Mike Percy
Congrats Wolfgang, and welcome! Mike On Tue, Sep 24, 2013 at 3:46 PM, Jarek Jarcec Cecho wrote: > Congratulations Wolfgang, well done! > > Jarcec > > On Tue, Sep 24, 2013 at 03:39:12PM -0700, Hari Shreedharan wrote: > > On behalf of the Apache Flume PMC, I am excited to welcome Wolfgang > Hosch

Re: [ANNOUNCE] New Flume Committer - Roshan Naik

2013-09-24 Thread Mike Percy
Congrats Roshan, welcome! Mike On Tue, Sep 24, 2013 at 3:47 PM, Jarek Jarcec Cecho wrote: > Congratulations Roshan, well done! > > Jarcec > > On Tue, Sep 24, 2013 at 03:39:13PM -0700, Hari Shreedharan wrote: > > On behalf of the Apache Flume PMC, I am excited to welcome Roshan Naik > as a > > c

Flume user meetup @ Hadoop World NYC on Oct 29th (Tue)

2013-10-10 Thread Mike Percy
Hi all, We are hosting a Flume user meetup during Strata / Hadoop World in New York on Tuesday, Oct 29th @ 6:30PM at the Hilton, which is the conference venue. This is your chance to meet up with other users, committers and PMC members, bat around ideas, bounce problems off of each other, and explo

Re: Extra information being delivered via Flume

2013-10-10 Thread Mike Percy
Check out the latest trunk code... We just committed FLUME-1666 courtesy of Jeff Lord this week. Mike Sent from my iPhone > On Oct 10, 2013, at 11:56 AM, DSuiter RDX wrote: > > Hi all, > > We set up a pipeline to get rsyslog input from a remote server via TCP using > rsyslog remote TCP forw

Re: Extra information being delivered via Flume

2013-10-10 Thread Mike Percy
Or if that doesn't work try the Netcat source. Sent from my iPhone > On Oct 10, 2013, at 11:46 PM, Mike Percy wrote: > > Check out the latest trunk code... We just committed FLUME-1666 courtesy of > Jeff Lord this week. > > Mike > > Sent from my iPhone >

Re: Adding SSL peer cert info to AvroSource

2014-01-29 Thread Mike Percy
If it's using a signed cert then what do you need to put into the filter? You mean a list of allowed peers? If so then you could either try to piggyback on the IpFilter and make it accept hostnames, or yes add another filter config option such as hostFilter. Mike On Wed, Jan 29, 2014 at 12:23 PM

Re: Adding SSL peer cert info to AvroSource

2014-01-30 Thread Mike Percy
to a fallback directory on a failed cert. > > > > ____ > From: Mike Percy [mpe...@apache.org] > Sent: Wednesday, January 29, 2014 6:44 PM > To: user@flume.apache.org > Subject: Re: Adding SSL peer cert info to AvroSource > > If it's using a signed cert then what do you

Re: Adding SSL peer cert info to AvroSource

2014-02-07 Thread Mike Percy
SSL. I have not thought about it a lot but this came up on StackOverflow, maybe it's applicable here. http://stackoverflow.com/questions/9573894/set-up-netty-with-2-way-ssl-handsake-client-and-server-certificate Mike > > > -Charles > > > On Jan 30, 2014, at 7:21 PM, Mike P

Re: Adding SSL peer cert info to AvroSource

2014-02-07 Thread Mike Percy
On Fri, Feb 7, 2014 at 5:15 PM, Pritchard, Charles X. -ND < charles.x.pritchard@disney.com> wrote: > > I’m finding it a challenge to see where in the AvroSource class I could > actually push the data into Event headers. > All of those methods are stateless when it comes to the connection — they

Re: Adding SSL peer cert info to AvroSource

2014-02-07 Thread Mike Percy
g: any other servers in a distributed > flow are not going to be looking at the client SSL cert, of course, > wouldn’t make any sense. > Most of them aren’t using SSL either, as it’s within a trusted network at > that point. > Yeah, that was my point. :) Mike > > > -Char

Re: what will gracefully shut down flume?

2014-03-24 Thread Mike Percy
How are you starting Flume? What platform / environment are you running on? Did you write your own init scripts or are you using a vendor Hadoop distribution (i.e. Cloudera) or something else (i.e. directly using Bigtop)? On Linux, if you are writing your own init scripts then running Flume via

Re: Json over netcat source

2014-04-09 Thread Mike Percy
Not sure either but make sure you're using a compatible version of ElasticSearch. Sent from my iPhone > On Apr 9, 2014, at 9:43 PM, Hari Shreedharan > wrote: > > Then I really don't know what the issue is. Someone more familiar with > elastic search sink will need to look at it. > > Hari >

Re: Is Memory Channel data lost on process stop?

2014-05-28 Thread Mike Percy
The traditional memory channel does not store anything to disk on process shutdown, so you lose the data if you kill the process. However, a plain reconfiguration will not reinitialize or clear the channel as long as the channel name remains the same. The recoverable memory channel thing (FLUME-89

Re: [ANNOUNCE] Apache Flume 1.5.0 released

2014-05-28 Thread Mike Percy
Congrats! On Wed, May 21, 2014 at 5:52 PM, Jarek Jarcec Cecho wrote: > Nice, thank you for driving the release Hari, great work! > > Jarcec > > On Thu, May 22, 2014 at 05:21:40AM +0530, Mohammad Tariq wrote: > > +1 > > > > Big thanks to the entire Flume team for all their efforts :) > > > > > >

Re: one-to-many interceptor

2014-06-24 Thread Mike Percy
Hi Matt, If you can guarantee there are a certain # of events in a single "wrapper" event, or bound the limit, then you could potentially get away with this. However if you're not careful you could get stuck in an infinite fail-backoff-retry loop due to exceeding the (configurable) channel transact

Re: flume starting through service

2014-06-24 Thread Mike Percy
Can you please provide details on the errors you are seeing? What version of Flume? On Thu, Jun 19, 2014 at 12:43 AM, kishore alajangi < alajangikish...@gmail.com> wrote: > > The flume is writing to hdfs when I start flume manually through config > file like > > flume-ng agent -c /etc/flume-ng/c

Re: one-to-many interceptor

2014-06-24 Thread Mike Percy
our recommendation and avoid these shenanigans. But > it's nice to know that I _can_ do it this way if I need to. > > Cheers > -mt > > >> On Tue, Jun 24, 2014 at 7:45 PM, Mike Percy wrote: >> Hi Matt, >> If you can guarantee there are a certain # of eve
