Hi Simon,
Assuming you are using Flume NG you can just modify the config, add another
collector, and save the file. No need for a restart. The agent will check in
periodically for changes, AFAIK every 30 seconds.
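For example, adding a second collector could look something like this (sink
and channel names and the host are illustrative, not from your config):
agent.sinks = avroSink1 avroSink2
# new collector added alongside the existing one
agent.sinks.avroSink2.type = avro
agent.sinks.avroSink2.hostname = collector2.example.com
agent.sinks.avroSink2.port = 4545
agent.sinks.avroSink2.channel = ch1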
-Jeff
On Thu, Nov 29, 2012 at 12:43 AM, Simon Monecke wrote:
> Hi,
>
> i want to us
Andy,
The current stable release of flume is 1.3.0 and you can always check which
release is current on this page:
http://flume.apache.org/releases/index.html
In order to checkout this release from git you can issue the following
command:
git clone https://git-wip-us.apache.org/repos/asf/flume.g
Rahul,
As of Cloudera Manager 4.1.0 you have the ability to manage a flume
service, as well as report on various component metrics.
https://ccp.cloudera.com/display/ENT41DOC/Adding+Services#AddingServices-AddingFlume
https://ccp.cloudera.com/display/ENT41DOC/Flume+Metric+Details
-Jeff
On T
Felix,
In Flume 1.4 there is an embedded agent.
You can download and build trunk to get this functionality.
https://issues.apache.org/jira/browse/FLUME-1502
https://issues.apache.org/jira/secure/attachment/12560587/embedded-agent-3.pdf
-Jeff
On Thu, Jan 3, 2013 at 9:32 PM, F
Felix,
Try adding some heap using MAVEN_OPTS.
e.g.
export MAVEN_OPTS="-Xms512m -Xmx1024m -XX:PermSize=256m
-XX:MaxPermSize=512m"
Then try to build.
-Jeff
On Sat, Jan 5, 2013 at 2:55 AM, Felix.徐 wrote:
> Hi,
>
> I encountered with a problem while executing "mvn clean install
> -DskipTests":
>
Hi Bashkar,
1) Batch Size
1.a) When configured by client code using the flume-core-sdk, to send
events to flume avro source.
The flume client sdk has an appendBatch method. This will take a list of
events and send them to the source as a batch. This is the size of the
number of events to be pas
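As a rough sketch (host and port are placeholders), batching with the client
SDK looks something like this:
import java.util.ArrayList;
import java.util.List;
import org.apache.flume.Event;
import org.apache.flume.EventDeliveryException;
import org.apache.flume.api.RpcClient;
import org.apache.flume.api.RpcClientFactory;
import org.apache.flume.event.EventBuilder;

public class BatchSender {
  public static void main(String[] args) throws EventDeliveryException {
    // connect to the avro source (placeholder host/port)
    RpcClient client = RpcClientFactory.getDefaultInstance("collector.example.com", 41414);
    try {
      List<Event> batch = new ArrayList<Event>();
      for (int i = 0; i < 100; i++) {
        batch.add(EventBuilder.withBody(("event " + i).getBytes()));
      }
      client.appendBatch(batch); // all 100 events are sent as a single batch
    } finally {
      client.close();
    }
  }
}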
ve direct implications on the performance of flume nodes.
>
> thanks
> Bhaskar
>
>
> On Tue, Jan 8, 2013 at 9:40 PM, Jeff Lord wrote:
>
>> Hi Bashkar,
>>
>> 1) Batch Size
>> 1.a) When configured by client code using the flume-core-sdk , to send
>>
Hi Andrew,
You may try lowering transactionCapacity here.
The transactionCapacity should be set to the value of the largest batch
size that will be used to store or remove events from that channel. You
currently have it equal to the capacity of the channel. So essentially the
channel *could be* fi
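A sizing sketch, assuming a memory channel and a sink batch size of 1000
(names and numbers are illustrative):
agent.channels.ch1.type = memory
agent.channels.ch1.capacity = 100000
# transactionCapacity only needs to cover the largest batch, not the whole channel
agent.channels.ch1.transactionCapacity = 1000
agent.sinks.k1.hdfs.batchSize = 1000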
On Tue, Jan 22, 2013 at 2:51 AM, Alain B. wrote:
> My question is: will these 2 channels store their events in separate derby
> DB by default or do I need to configure my 2 jdbc-channels with specific
> properties in order to get 2 embedded derby DB started ?
>
By default they will use the same
Seshu,
It really is going to depend on your use case.
Though it sounds that you may need to run an agent on each of the source
machines.
Which source do you plan to use? It may also be the case that you can use
the flume rpc client to write data directly from your application to the
flume collecto
into HDFS.
> I can have a channel/collector machine where I install flume. I guess,
> my question is, do I need to install flume on the servers where the log
> messages lie and do I need to install flume in HDFS namenode too?
>
> Thanks,
> - Seshu
>
>
> On Wed, Feb 6, 2
The spooling directory source assumes that the files in the directory you
are spooling are immutable.
java.lang.IllegalStateException: File name has been re-used with
different files. Spooling assumption violated for
/var/log/testhbase/hbase_1.log.COMPLETED
This message is indicative that a
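For reference, a minimal spooling directory setup (channel name is
illustrative); rotate completed files into the directory rather than writing
to them in place:
agent.sources.spool1.type = spooldir
agent.sources.spool1.spoolDir = /var/log/testhbase
agent.sources.spool1.channels = ch1
# files must be fully written and closed before they land in spoolDir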
Madhu,
When the channel is full the source will no longer be able to accept
transactions and place them on the channel. It will not hang, and it will
begin accepting transactions again once the channel has availability. This
means the upstream sink or application will start to back up, which is by
design.
Noel,
What test did you perform?
Did you stop sink-2?
Currently you have set a higher priority for sink-2 so it will be the
default sink so long as it is up and running.
-Jeff
http://flume.apache.org/FlumeUserGuide.html#failover-sink-processor
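For comparison, a failover sink group where sink-2 is preferred looks roughly
like this (agent name and maxpenalty are illustrative):
agent.sinkgroups = g1
agent.sinkgroups.g1.sinks = sink-1 sink-2
agent.sinkgroups.g1.processor.type = failover
agent.sinkgroups.g1.processor.priority.sink-1 = 5
# higher number = higher priority, so sink-2 is used while it is healthy
agent.sinkgroups.g1.processor.priority.sink-2 = 10
agent.sinkgroups.g1.processor.maxpenalty = 10000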
On Tue, Feb 19, 2013 at 5:03 PM, Noel Duffy wrote:
r of assertions online that this can be done, but so far, I've not
> seen any examples of how to actually configure it.
>
> From: Jeff Lord [mailto:jl...@cloudera.com]
> Sent: Wednesday, 20 February 2013 2:17 p.m.
> To: user@flume.apache.org
> Subject: Re: Architecting Flume f
Daniel,
Flume was designed as a configurable pipeline for discrete events in order
to get them reliably from a source (e.g. web server application) -> to a
destination (e.g. into hdfs).
Flume provides the facility to write the same event to multiple
destinations (e.g. HDFS and Hbase or HDFS and Ca
Have you considered using the move command instead of copy?
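Something like the following (paths are examples); on the same filesystem a
move is atomic, so the source never sees a half-written file:
mv /data/staging/big_file.log /var/spool/flume/big_file.log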
On Tue, Feb 26, 2013 at 10:49 PM, 周梦想 wrote:
> Hello,
> I have a question using spooldir source.
>
> If I have a large file such as more than 100MB, when I copy this file to
> spooldir, the flume agent will find it immediately and begi
Hi Paul,
Would you kindly attach the logs from both tier 2 collectors where you
observe the sinks occasionally stepping on each other? Can you please
attach your flume config and note the version of flume-ng?
Best,
Jeff
On Sun, Mar 31, 2013 at 7:12 PM, JR wrote:
> Hi Paul,
>
> I apologize
Hi Jagadish,
Have you considered using a custom event serializer to modify your event?
It's possible to replicate your flow using two channels and then have one
sink that implements a custom serializer to modify the event.
-Jeff
On Tue, Apr 16, 2013 at 11:12 PM, Jagadish Bihani <
jagadish.bih...
Jagadish,
Here is an example of how to write a custom serializer.
https://github.com/apache/flume/blob/trunk/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MyCustomSerializer.java
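The general shape, as a sketch (the class and package names here are made
up), is to implement EventSerializer and expose a Builder:
import java.io.IOException;
import java.io.OutputStream;
import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.serialization.EventSerializer;

public class UpperCaseSerializer implements EventSerializer {
  private final OutputStream out;
  private UpperCaseSerializer(OutputStream out) { this.out = out; }
  public void afterCreate() {}
  public void afterReopen() {}
  public void write(Event event) throws IOException {
    // modify the event body on its way out
    out.write(new String(event.getBody()).toUpperCase().getBytes());
    out.write('\n');
  }
  public void flush() throws IOException { out.flush(); }
  public void beforeClose() {}
  public boolean supportsReopen() { return false; }
  public static class Builder implements EventSerializer.Builder {
    public EventSerializer build(Context context, OutputStream out) {
      return new UpperCaseSerializer(out);
    }
  }
}
Then point the sink at the Builder, e.g.:
agent.sinks.k1.serializer = com.example.UpperCaseSerializer$Builder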
-Jeff
On Fri, Apr 19, 2013 at 9:34 AM, Jeff Lord wrote:
> Hi Jagadish,
>
>
Hi Dave,
You are on the right track with thoughts here.
The best way to ensure all events are successfully delivered to Hbase as
well would be to use a separate channel for the hbase sink.
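A sketch of that layout (component names are illustrative); the default
replicating selector gives each channel its own copy of every event:
agent.sources.src1.channels = hdfsChannel hbaseChannel
agent.sources.src1.selector.type = replicating
agent.sinks.hdfsSink.channel = hdfsChannel
agent.sinks.hbaseSink.channel = hbaseChannel
# if hbase is down, events back up in hbaseChannel without blocking the hdfs flow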
-Jeff
On Mon, Apr 22, 2013 at 8:11 AM, David Quigley wrote:
> Hi,
>
> I am using flume to write events f
Mike Percy contributed a most excellent blog post on this topic.
Have you had a chance to read over this?
https://blogs.apache.org/flume/entry/flume_performance_tuning_part_1
"*
Tuning the batch size trades throughput vs. latency and duplication under
failure. With a small batch size, throughput
Vikas,
This message is normal and harmless.
2013-04-29 08:26:11,868 (conf-file-poller-0) [DEBUG -
org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:188)]
Checking file:conf/flume.conf for changes
If you change your log se
What version of flume are you running?
rpm -qa | grep flume
flume-ng version
Can you please post your full config? and the log file?
On Wed, May 8, 2013 at 8:34 PM, GuoWei wrote:
> Hi,
>
> Recently I met the following problem. When I Process event in my custom
> source .
>
> Channel closed [ch
Hi Ashish,
What version of flume are you running?
flume-ng version
-Jeff
On Fri, May 24, 2013 at 3:38 AM, Ashish Tadose
wrote:
> Hi All,
>
> We are facing this issue in production flume setup.
>
> Issue initiates when HDFS sink BucketWriter fails to append a batch for a
> file because of had
Hi Deepak,
1. When using the load balancing sink group the list of sinks will be
processed serially as opposed to in parallel.
2. The batch size on your source is very small.
agent.sources.1374869469492.batchSize = 1
You may try increasing that for better throughput (see the sketch below).
3. The AsyncHbaseSink is goi
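For point 2, something along these lines (the values are just starting
points to tune from, and the sink name is a placeholder):
agent.sources.1374869469492.batchSize = 1000
# match the downstream sink batch size to avoid tiny transactions
agent.sinks.sink1.batchSize = 1000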
Miguel,
These errors usually indicate that there is a problem on your HDFS cluster.
You should probably investigate the health of the cluster first.
-Jeff
On Wed, Aug 7, 2013 at 7:21 AM, Miguel Coelho dos Santos
wrote:
> Hi,
>
> we are using flume to write data to hdfs.
> Our hdfs sinks recentl
n what typically in unhealthy in the HDFS cluster
> when this error occurs?
>
> Miguel
> ____
> From: Jeff Lord [jl...@cloudera.com]
> Sent: 07 August 2013 19:31
> To: user@flume.apache.org
> Subject: Re: java.io.IOException: Bad res
Deepesh,
The FileChannel uses a fixed size checkpoint file so it is not possible to
set it to unlimited size (the checkpoint file is mmap-ed to a fixed size
buffer). To change the capacity of the channel, use the following procedure:
Shutdown the agent.
Delete all files in the file channel's chec
ersion of Flume are you running. Looks like you are hitting
>> https://issues.apache.org/jira/browse/FLUME-1918 as well due to an
>> unsupported channel size in a previous version. This was fixed in Flume
>> 1.4.0
>>
>>
>> Hari
>>
>>
>> Thanks,
>
Yes the file channel is designed to handle this and is what you should be
using.
You are also on the right track regarding sizing your file channel to
account for the number of events that could accumulate in the event that
your terminal sink is unable to complete transactions. With the amount of
d
So if you use trunk and set the keepFields property to true then the
Timestamp and Hostname will be preserved in the body of the event.
https://github.com/apache/flume/blob/trunk/flume-ng-doc/sphinx/FlumeUserGuide.rst#syslog-sources
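For example, on a syslog source (the source name is illustrative):
agent.sources.syslog1.type = syslogtcp
agent.sources.syslog1.keepFields = true
# with keepFields = true the timestamp and hostname stay in the event body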
On Fri, Oct 11, 2013 at 7:29 AM, David Sinclair <
dsincl...
Luu,
Have you tried using the spooling directory source?
-Jeff
On Mon, Oct 21, 2013 at 3:25 AM, Cuong Luu wrote:
> Hi all,
>
> I need to copy data in a local directory (hadoop server) into hdfs
> regularly and automatically. This is my flume config:
>
> agent.sources = execSource
> agent.chan
Jeremy,
The DataStream fileType will let you write text files.
CompressedStream will do just that.
SequenceFile will create sequence files as you have guessed and you can use
either Text or Writable (bytes) for your data here.
So flume is configurable out of the box with regards to the size of your
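For instance (the sink name is illustrative):
agent.sinks.hdfs1.hdfs.fileType = DataStream
# DataStream = plain text, CompressedStream = compressed (set hdfs.codeC too),
# SequenceFile = hadoop sequence files
agent.sinks.hdfs1.hdfs.writeFormat = Text
# writeFormat is Text or Writable and applies to sequence file records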
Devin,
FLUME-1666 added a keepFields property that will allow you to preserve the
timestamp and hostname in the body of the generated flume event.
That patch was committed to trunk a couple of weeks ago so if you use trunk
to build it should be available.
https://issues.apache.org/jira/browse/FLUM
; Thanks again!
>
> -- Jeremy
>
>
>
> On Thu, Oct 31, 2013 at 4:42 PM, Jeff Lord wrote:
>
>> Jeremy,
>>
>> Datastream fileType will let you write text files.
>> CompressedStream will do just that.
>> SequenceFile will create sequence files as you ha
Zookeeper should already be running on the hbase server.
If you are using standalone mode it is run within the same jvm as hbase.
On Fri, Nov 1, 2013 at 2:14 PM, George Pang wrote:
> Hi Ashish,
>
> Does it mean I have to install zookeeper too in the HBase box, in order to
> talk to Hbase from
It's fine to run in a VM.
Out of curiosity why are you running two agents on the machine though?
On Mon, Nov 25, 2013 at 1:54 PM, Brock Noland wrote:
> It the channel is full your clients will get a rejection notice.
>
> Capacity planning on the FC is a mix between event size, channel size,
> a
2 in each VM)
>
> Is running single agent per VM recommend ?
>
> -Ritesh
>
>
>
> On Nov 25, 2013, at 3:23 PM, Jeff Lord wrote:
>
> Its fine to run in a VM.
> Out of curiosity why are you running two agents on the machine though?
>
>
>
> On Mon, Nov 25, 201
Can you provide the logfile and config?
On Tue, Nov 26, 2013 at 12:20 PM, Cochran, David wrote:
> I've got a pretty good sized box collecting logs for a number of sources
> (about a dozen or so).
> Actually two instances were running on this box (one production and the
> other a testing environm
Sounds reasonable to allow this via a config property.
Can you please submit the Jira?
On Tue, Dec 3, 2013 at 7:24 AM, James Estes wrote:
> We're on flume 1.4.0. Hm. So looking at the code you are right…I'd not
> looked closely enough at the transaction behavior for the MemoryChannel.
> When
Can you post your entire config and log ?
On Thu, Dec 5, 2013 at 1:10 AM, Salih Kardan wrote:
> I have a problem with adding time-stamp to flume header. Here is a snipped
> from my conf file.
>
>
> agent.sources.avrosource.interceptors.addTimestamp.type =
> org.apache.flume.interceptor.Timesta
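For reference, a working timestamp interceptor config looks roughly like
this; note that if you use the full class name it must point at the Builder:
agent.sources.avrosource.interceptors = addTimestamp
agent.sources.avrosource.interceptors.addTimestamp.type = timestamp
# or the long form:
# agent.sources.avrosource.interceptors.addTimestamp.type = org.apache.flume.interceptor.TimestampInterceptor$Builder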
Hi Otis,
It makes sense for flume to support RELP protocol.
Will need to do some digging to determine whether it makes sense to have
its own unique source or we can bolt this onto the multiport tcp source as
a config switch. Unless someone on the list has any ideas?
Best,
Jeff
On Tue, Dec 17,
Monitoring and configuration are two separate things here.
Flume is typically monitored using either ganglia or http/json.
Both methods are documented here:
http://flume.apache.org/FlumeUserGuide.html#monitoring
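For example, HTTP/JSON reporting can be enabled at startup (agent name and
port are illustrative):
flume-ng agent -n agent1 -c conf -f conf/flume.conf \
  -Dflume.monitoring.type=http -Dflume.monitoring.port=34545
# then: curl http://localhost:34545/metrics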
As for configuration management and changes a common way of handling this
would be to
Chen,
Have you taken a look at this presentation on Planning and Deploying Flume
from ApacheCon?
http://archive.apachecon.com/na2013/presentations/27-Wednesday/Big_Data/11:45-Mastering_Sqoop_for_Data_Transfer_for_Big_Data-Arvind_Prabhakar/Arvind%20Prabhakar%20-%20Planning%20and%20Deploying%20Apac
and i am looking for a fault tolerant deployment of flume, that
> can read from this single data source and sink to hdfs in fault tolerant
> mode: when one node dies, another flume node can pick up and continue;
> Thanks,
> Chen
>
>
> On Thu, Jan 9, 2014 at 7:49 PM, Jeff Lord wr
Bean,
Can you please open a jira?
Thank You,
Jeff
On Fri, Jan 10, 2014 at 7:16 AM, Bean Edwards wrote:
> If I change the condition (allowableDiff > delta) to
> (allowableDiff < delta), it works fine. from line 103 of OrderSelector
>
>
> On Fri, Jan 10, 2014 at 11:13 PM, Bean Edward
Josh,
If you modify your config then the flume agent will see the config has
changed and reload any components that have been modified. Are you able to
provide the logs from the flume agent which occur following a modification
of the config? What command are you using to signal for a shutdown?
-J
Your config and anymore logfile context you can provide will help get you
an answer.
On Thu, Jan 16, 2014 at 10:29 AM, P lva wrote:
> Hello everyone,
>
> I'm trying to configure a jms source in flume agent, but i get this error
>
> Could not create initial context
> com.tibco.tibjms.naming.Tibj
Connection(TibjmsxCFImpl.java:253)
> at
> com.tibco.tibjms.TibjmsQueueConnectionFactory.createQueueConnection(TibjmsQueueConnectionFactory.java:87)
> at
> com.tibco.tibjms.naming.TibjmsContext$Messenger.request(TibjmsContext.java:325)
> at
> com.tibco.tibjms.nami
If you don't intend to roll based on # of events then you will want to set
rollCount to 0.
MyAgent.sinks.HDFS.hdfs.rollCount = 0
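A fuller rolling sketch, assuming you want roughly 128 MB files and no other
roll triggers:
MyAgent.sinks.HDFS.hdfs.rollCount = 0
MyAgent.sinks.HDFS.hdfs.rollInterval = 0
# roll only on size, at ~128 MB
MyAgent.sinks.HDFS.hdfs.rollSize = 134217728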
On Mon, Jan 20, 2014 at 12:35 PM, Jimmy wrote:
> Seems like the only reason is "too many files" issue, correct?
>
> File Crusher executed regularly might be better op
You are using gzip so the files won't be splittable.
You may be better off using snappy and sequence files.
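Something like (the sink name is illustrative):
agent.sinks.hdfs1.hdfs.fileType = SequenceFile
agent.sinks.hdfs1.hdfs.codeC = snappy
# snappy-compressed sequence files stay splittable for MapReduce, unlike gzip streams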
On Thu, Jan 30, 2014 at 10:51 AM, Jimmy wrote:
> I am running few tests and would like to confirm whether this is true...
>
> hdfs.codeC = gzip
> hdfs.fileType = CompressedStream
> hdfs.writ
Mayur,
The hdfs sink is going to keep trying to connect for maxRetries=10
Are you able to post the complete log? or at least another couple of
minutes ?
-Jeff
On Fri, Feb 7, 2014 at 1:32 AM, Mayur Gupta wrote:
> 1) The source is Avro client. The events are lost. The intent of the
> question
Gary,
I'm going to just quote the design doc here:
https://issues.apache.org/jira/secure/attachment/12560587/embedded-agent-3.pdf
1. A Flume Embedded agent would be useful to applications which send
data to a Flume agent acting as a "collector". Currently using the
RPCClient or HTTPSource, if th
Logs?
On Mon, Feb 17, 2014 at 5:51 AM, Kris Ogirri wrote:
> Dear Mailing Group,
>
> I am currently having issues with the Hbase sink function. I have developed
> an agent with a fanout channel setup ( single source, multiple channels,
> multiple sinks) sinking to a HDFS cluster and Hbase deploym
Have you tried using the FQCN of the connection factory?
On Monday, February 24, 2014, P lva wrote:
> If there is no connection factory called 'GenericConnectionFactory' the
> lookup fails and you get this.
>
>
>
> On Mon, Feb 24, 2014 at 1:29 PM, richard ross
> wrote:
>
> Thanks for the reply.
,
> Richard.
>
> On Feb 24, 2014, at 4:05 PM EST, Jeff Lord wrote:
>
> Have you tried using the fqcn of the connection factory?
>
> On Monday, February 24, 2014, P lva wrote:
>>
>> If there is no connection factory called 'GenericConnectionFactory' the
>>
> On Feb 24, 2014, at 6:09 PM EST, Jeff Lord wrote:
>
>> I think you can just drop the connectionFactory property from the
>> config altogether with activemq and it will work.
>>
>> On Mon, Feb 24, 2014 at 2:17 PM, Richard Ross
>> wrote:
>>> Thanks for these
Richard,
Flume does not enforce any guarantees on ordering of events.
-Jeff
On Wed, Feb 26, 2014 at 5:41 AM, richard ross
wrote:
> Hello:
>
> I am using Flume 1.4 with a JMS --> File Channel --> HDFS data pipeline, and
> was wondering if Flume can guarantee order of messages/events (i.e., the
>
It looks like you have not configured any properties for "rolling"
files on hdfs.
The default rollCount is 10 (events).
http://flume.apache.org/FlumeUserGuide.html#hdfs-sink
The flume hdfs sink can be configured to roll based on size, # of
events, or time.
hdfs.rollInterval (default 30): number of seconds to
Are you using the spooling directory source?
We added the ability to just set the basename of a file (without
absolute path) in FLUME-2056 ("Allow SpoolDir to pass just the filename
that is the source of an event").
On Fri, Feb 28, 2014 at 7:01 AM, Iván Fernández Perea
wrote:
> Hi,
>
> I'm a newbie
You can setup flume to use hdfs.proxyUser
https://cwiki.apache.org/confluence/display/FLUME/Flume+1.x+Secure+HDFS+Setup
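On a secure cluster the sink config would look something like this
(principal, keytab path, sink name, and user are placeholders):
agent.sinks.hdfs1.hdfs.kerberosPrincipal = flume/_HOST@EXAMPLE.COM
agent.sinks.hdfs1.hdfs.kerberosKeytab = /etc/flume-ng/conf/flume.keytab
# files are written as this user instead of the flume principal
agent.sinks.hdfs1.hdfs.proxyUser = weblogs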
On Thu, Mar 13, 2014 at 2:26 PM, Christopher Shannon
wrote:
> What if your sinks have to write out to destinations that have different
> users and different levels of authoriz
https://blogs.apache.org/flume/entry/apache_flume_filechannel
On Thu, Mar 20, 2014 at 12:21 AM, Bean Edwards wrote:
> i use filechannel,and monitor it from http response.i
> found ChannelFillPercentage
> will increase and never get back to 0% what happens? what's
> more,filechannel dataDirs al
Increase your batch sizes
On Thu, Mar 27, 2014 at 12:29 PM, Chris Schneider <
ch...@christopher-schneider.com> wrote:
> Thanks for all the great replies.
>
> My specific situation is a bit more complex than I let on initially.
>
> Flume running multiple agents will absolutely be able to scale to
Do you have the appropriate interceptors configured?
On Fri, Mar 28, 2014 at 12:28 PM, Ryan Suarez <
ryan.sua...@sheridancollege.ca> wrote:
> RTFM indicates I need the following sink properties:
>
> ---
> hadoop-t1.sinks.hdfs1.serializer = org.apache.flume.serialization.
> HeaderAndBodyTextEvent
mem1.type = memory
> hadoop-t1.channels.mem1.capacity = 1000
> hadoop-t1.channels.mem1.transactionCapacity = 100
>
> # Bind the source and sink to the channel
> hadoop-t1.sources.r1.channels = mem1
> hadoop-t1.sinks.s1.channel = mem1
>
>
>
> On 14-03-28 3:37 PM, Je
Mohit,
Are you using the memory channel? You mention you are getting an OOME but
you don't say what heap size you are setting on the flume jvm.
Don't run an agent on the namenode. Occasionally you will see folks
installing an agent on one of the datanodes in the cluster but it's not
typically reco
ctor nodes and even change their configurations.
>
> Absolutely Cloudera Manager can be used to install, manage, and monitor
your flume agents.
> So we are very much beginners in this field, any suggestions or
> recommendations are welcome. Thanks for your help :)
>
>
> Mohit
&
No. If you need to guarantee delivery of events please use a file channel.
https://blogs.apache.org/flume/entry/apache_flume_filechannel
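A minimal file channel config (paths and names are illustrative):
agent.channels.ch1.type = file
agent.channels.ch1.checkpointDir = /var/lib/flume/checkpoint
agent.channels.ch1.dataDirs = /var/lib/flume/data
# events are persisted to disk, so they survive an agent restart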
On Mon, Apr 7, 2014 at 8:38 AM, Christopher Shannon
wrote:
>
> On Apr 7, 2014 9:35 AM, "Jeff Lord" wrote:
> >
> >
> >
>
That would have to mean that the downstream agent is sending an
> ack to the upstream agent before it actually persists the event.
>
> On Apr 7, 2014 10:48 AM, "Jeff Lord" wrote:
> >
> > No. If you need to guarantee delivery of events please use a file
>
http://flume.apache.org/FlumeUserGuide.html#spooling-directory-source
On Wed, Apr 16, 2014 at 5:14 PM, Something Something <
mailinglist...@gmail.com> wrote:
> Hello,
>
> Needless to say I am newbie to Flume, but I've got a basic flow working in
> which I am importing a log file from my linux bo
tions about this?
>
>
> On Wed, Apr 16, 2014 at 5:16 PM, Jeff Lord wrote:
>
>> http://flume.apache.org/FlumeUserGuide.html#spooling-directory-source
>>
>>
>> On Wed, Apr 16, 2014 at 5:14 PM, Something Something <
>> mailinglist...@gmail.com> wrote:
r) is probably
>>> your best bet to ingest files from a remote machine that you only have read
>>> access to. But then again you're sorta stepping outside of the use case of
>>> flume at some level here as rsync is now basically a part of your flume
>>> topol
> Hi Jeff,
>
> On Thu, Apr 17, 2014 at 1:11 PM, Jeff Lord wrote:
>
>> Using the exec source with a tail -f is not considered a production
>> solution.
>> It mainly exists for testing purposes.
>>
>
> This statement surprised me. Is that the gener
Kushal,
Have you considered removing the sinks from the sinkGroup?
This will increase your concurrency for processing channel events by
allowing both sinks to read from the channel simultaneously. With a sink
group in place only one sink will read at a time.
Hope this helps.
-Jeff
On Fri, May
Have you looked at some of the test classes?
That may be a good way to see how you can accomplish this with straight
java.
https://github.com/apache/flume/blob/trunk/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSink.java
On Tue, May 20, 2014 at 6:46 AM, Ja
Can you try adding this line to your config?
tier1.sinks.sinkDHCP_Raw.serializer = text
Adam,
You are mostly correct. The one thing I might add that may help is to know
that the sink is consuming the events from the channel, writing them to the
next hop source and then committing the transaction, as opposed to the
channel pushing the events; the channel is a passive component. You
start() is called when the agent is started and the sink component is then
started.
Calling process() will take a batch of events off the channel and send to
the next hop or terminal location.
stop() is called when the agent is shutdown and the sink component
resources are unloaded.
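As a sketch of that lifecycle (everything here is illustrative, not a
production sink):
import org.apache.flume.Channel;
import org.apache.flume.Event;
import org.apache.flume.EventDeliveryException;
import org.apache.flume.Transaction;
import org.apache.flume.sink.AbstractSink;

public class LoggingSink extends AbstractSink {
  public Status process() throws EventDeliveryException {
    Channel ch = getChannel();
    Transaction txn = ch.getTransaction();
    txn.begin();
    try {
      Event event = ch.take(); // pull an event off the channel
      if (event == null) {
        txn.commit();
        return Status.BACKOFF; // channel empty, back off
      }
      System.out.println(new String(event.getBody())); // "deliver" it
      txn.commit(); // only now is the event gone from the channel
      return Status.READY;
    } catch (Exception e) {
      txn.rollback(); // the event stays on the channel for a retry
      throw new EventDeliveryException(e);
    } finally {
      txn.close();
    }
  }
}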
Have you seen t
ined very well. I'm new to Java (and flume) so maybe
> that's just me.
>
> Your explanation helps.
>
> --
> Sharninder
>
> On 21-Jul-2014, at 8:33 pm, Jeff Lord wrote:
>
> start() is called when the agent is started and the sink component is then
> star
I believe the way this works is that flume creates a meta directory to
track which file is being read.
In the event of a restart of the agent the entire file will be re-read
which will create some duplicate events.
https://github.com/apache/flume/blob/flume-1.5/flume-ng-core/src/main/java/org/apac
Also all of your sinks are pointing to the same host for the next hop.
So if the agent on that host is unavailable for some reason then failover
is pointless.
For testing this is ok; for production there is a better way.
On Wednesday, August 13, 2014, Hari Shreedharan
wrote:
> Each sink needs to ha
I think you want this to bind to slave2 or, even better, the appropriate ip:
tier2.sources.source2.bind = slave3
If that doesn't work please send the log snippet.
On Thursday, August 28, 2014, Blade Liu wrote:
> Hi folks,
>
> I ran into a configuration problem of setting up multi-tier avro age
Ed,
Did you take a look at the javadoc in the source?
Basically the source uses netty as a server and the sink is just an rpc
client.
If you read over the doc which is in the two links below and take a look at
the developer guide and still have questions just ask away and someone will
help to answ
Whether or not flume can handle 20k eps will depend on several factors.
The main ones being:
1. What is the avg size of event
2. What source will you be using
With that said I have seen a single flume agent handle well over 20k eps
using the multiport syslog source.
Here is a link to a presentati
You can also use a regex interceptor to extract hostname from the message
(assuming it's there) and put that in an event header. From there you can
route and create partitions with the header.
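A rough example (the regex here is a guess at a syslog layout, and the names
are placeholders; adjust for your format):
agent.sources.syslog1.interceptors = hostx
agent.sources.syslog1.interceptors.hostx.type = regex_extractor
agent.sources.syslog1.interceptors.hostx.regex = ^\\S+\\s+\\S+\\s+\\S+\\s+(\\S+)
agent.sources.syslog1.interceptors.hostx.serializers = s1
agent.sources.syslog1.interceptors.hostx.serializers.s1.name = hostname
# the header can then drive partitioning, e.g.:
# agent.sinks.hdfs1.hdfs.path = /flume/events/%{hostname}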
On Wednesday, October 15, 2014, Hari Shreedharan
wrote:
> The Multiport syslog source can add the port
>> that there would be a some random device which will not send their logs in
>> the proper format and my regex will break. This is the way I'll implement
>> it if I can't find anything better.
>>
>> Thanks,
>> Sharninder
>>
>>
>>
>>
Pal,
You can add more sinks to your config.
Don't put them in a sink group just have multiple sinks pulling from the
same channel. This should increase your throughput.
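For example, two HDFS sinks draining one channel (names are illustrative):
agent.sinks = hdfs1 hdfs2
agent.sinks.hdfs1.channel = ch1
agent.sinks.hdfs2.channel = ch1
# no sinkgroup: both sinks take from ch1 concurrently,
# and each event is delivered by exactly one of them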
Best,
Jeff
On Mon, Oct 20, 2014 at 3:49 AM, Pal Konyves wrote:
> Hi there,
>
> We would like to write lots of logs to HDFS v
y functional
> benefits?
>
> Thanks,
> Pal
>
> On Mon, Oct 20, 2014 at 3:22 PM, Jeff Lord wrote:
> > Pal,
> >
> > You can add more sinks to your config.
> > Don't put them in a sink group just have multiple sinks pulling from the
> > same channel.
I know this is not exactly what you are asking for but have you had a look
at the spillable memory channel.
https://flume.apache.org/FlumeUserGuide.html#spillable-memory-channel
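Roughly (capacities and paths are illustrative):
agent.channels.ch1.type = SPILLABLEMEMORY
agent.channels.ch1.memoryCapacity = 10000
agent.channels.ch1.overflowCapacity = 1000000
# overflow is backed by a file channel, so it also needs:
agent.channels.ch1.checkpointDir = /var/lib/flume/checkpoint
agent.channels.ch1.dataDirs = /var/lib/flume/data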
On Sun, Oct 19, 2014 at 1:38 AM, terreyshih wrote:
> In other words, I would like to explicitly drop the events if the
What about your flume config?
Did you try increasing the eventSize?
On Mon, Oct 27, 2014 at 11:30 AM, Mohit Durgapal
wrote:
> Hi,
>
> I am using rsyslog to send messages to flume nodes via AWS ELB. On flume
> nodes I am using the source type *syslogtcp* where the ELB forwards the
> messages.
Hi Traino,
The syslog multiport source should automatically build the event using the
hostname from the syslog message. From there you can just use the macro on
your hdfs sink to use the value of the hostname event header.
e.g.
agent.sinks.sink-1.hdfs.path = /user/flume/Syslog/%{host}/
Hope thi
Congrats Roshan
On Tue, Nov 4, 2014 at 2:31 PM, Hari Shreedharan
wrote:
> Congrats Roshan!
>
>
> Thanks,
> Hari
>
> On Tue, Nov 4, 2014 at 2:12 PM, Arvind Prabhakar
> wrote:
>
> > On behalf of Apache Flume PMC, it is my pleasure to announce that Roshan
> > Naik has been elected to the Flume Pro
I am not familiar with gnip.
Did you take a look at the twitter source?
On Thu, Nov 6, 2014 at 4:09 AM, Rafeeq S wrote:
> I am new to flume and I am trying to stream tweets which is from gnip
> using Flume.
>
> Please suggest , which Flume source need to be used to stream tweets from
> Gnip.
> D
Guy,
What version of flume is this?
-Jeff
On Fri, Nov 7, 2014 at 1:19 AM, Needham, Guy
wrote:
> Hi all,
>
> I have a configuration with a file channel configured such that:
>
> a1.channels.ch1.type = file
> a1.channels.ch1.checkpointDir = /hadoop/user/flume/channels/checkpoint
> a1.channels.c
Do you have the jms classes on your classpath?
java.lang.NoClassDefFoundError: javax/jms/JMSException
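One way to fix it (paths and jar names are examples) is to drop the JMS API
jar and the vendor client jars into a plugins.d directory:
mkdir -p /usr/lib/flume-ng/plugins.d/jms/libext
cp jms.jar com.ibm.mq.jar com.ibm.mqjms.jar /usr/lib/flume-ng/plugins.d/jms/libext/
# flume-ng picks up plugins.d jars on the classpath at startup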
On Fri, Dec 19, 2014 at 1:02 PM, Darshan Pandya wrote:
>
> Hi Folks,
> I am new to flume.
> I wanted to check if anyone has connected an IBM MQ to the JMS Source in
> Flume.
> I quickly configured flume w
You should be able to use the ChannelSize
On Mon, Jan 26, 2015 at 2:29 PM, Carlotta Hicks
wrote:
> Are these the counters from MonitoredCounterGroup? What is the scope of
> these counters? Can you reset these counters?
>
> -Original Message-
> From: Joey Echeverria [mailto:j...@clouder
Have you considered increasing the size of the memory channel? I haven't
played with Kafka sink much but in regards to hdfs we often add sinks which
can help to increase the flow of the channel.
The multi port Syslog source is the way to go here as it will give better
performance. We should probabl
Bob,
You may want to have a look at Apache Nifi.
http://ingest.tips/2014/12/22/getting-started-with-apache-nifi/
Regards,
Jeff
On Mon, Feb 2, 2015 at 3:49 PM, Bob Metelsky wrote:
> Steve - I appreciate you time on this...
>
> Yes, I want to use flume to copy .xml or .whatever files from a s