You can use the metrics Kafka publishes. I think the relevant metrics are:
Log.LogEndOffset
Log.LogStartOffset
Log.size
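If you just want the number from a shell, the GetOffsetShell tool should
also do it (broker list and topic below are placeholders; --time -1 returns
the latest offset, -2 the earliest):

    bin/kafka-run-class.sh kafka.tools.GetOffsetShell \
        --broker-list localhost:9092 --topic mytopic --time -1
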
Gwen
On Thu, Feb 5, 2015 at 11:54 AM, Bhavesh Mistry
wrote:
> HI All,
>
> I just need to get the latest offset # for topic (not for consumer group).
> Which API to get this i
Hi Xinyi,
The core dump never made it to the list. Perhaps it's too large for an
attachment.
Can you upload it somewhere and send a link? Also, the broker logs will be
helpful, and a description of what the test did.
Gwen
On Thu, Feb 5, 2015 at 11:27 PM, Xinyi Su wrote:
> Hi,
>
> I encounter a
Hi Eduardo,
1. "Why sometimes the applications prefer to connect to zookeeper instead
brokers?"
I assume you are talking about the clients and some of our tools?
These are parts of an older design and we are actively working on fixing
this. The consumer used Zookeeper to store offsets, in 0.8.2 t
I'm wondering how much of the time is spent by Logstash reading and
processing the log vs. time spent sending data to Kafka. Also, I'm not
familiar with Logstash internals, perhaps it can be tuned to send the data
to Kafka in larger batches?
At the moment it's difficult to tell where is the slowdo
storage.
> On Sun, Feb 8, 2015 at 9:25 AM, Gwen Shapira
> wrote:
>
> > Hi Eduardo,
> >
> > 1. "Why sometimes the applications prefer to connect to zookeeper instead
> > brokers?"
> >
> > I assume you are talking about the clients and some o
I have a 0.8.1 high level consumer working fine with a 0.8.2 server. A few of
them, actually :)
AFAIK the API did not change.
Do you see any error messages? Do you have timeout configured on the
consumer? What does the offset checker tool say?
On Fri, Feb 6, 2015 at 4:49 PM, Ricardo Ferreira wrote:
g.apache.zookeeper.KeeperException$NoNodeException:
> KeeperErrorCode = NoNode for /consumers/test-consumer-group/offsets/test/0.
>
> Perhaps the solution is downgrade the consumer libs to 0.8.1?
>
> Thanks,
>
> Ricardo
>
> On Sun, Feb 8, 2015 at 11:27 AM, Gwen Shapira
Looks good to me.
I like the idea of not blocking additional sends but not guaranteeing that
flush() will deliver them.
I assume that with linger.ms = 0, flush will just be a noop (since the
queue will be empty). Is that correct?
Gwen
On Sun, Feb 8, 2015 at 10:25 AM, Jay Kreps wrote:
> Follow
e marker.
>
> I would almost like the client to have a callback mechanism for processing,
> and the marker only gets moved if the high level consumer successfully
> implements my callback/processor (with no exceptions, at least).
>
>
>
> On Sun, Feb 8, 2015 at 9:49 AM, Gw
I know from the consumer, and the offset checker gave me nothing
> about that group.
>
> Thanks,
>
> Ricardo
>
> On Sun, Feb 8, 2015 at 1:19 PM, Gwen Shapira
> wrote:
>
>> Sorry, both the consumer and the broker are 0.8.2?
>>
>> So what's on 0.8.1?
Yep, still applicable.
They will do the same thing (commit offsets at regular intervals), only with
Kafka instead of Zookeeper.
On Mon, Feb 9, 2015 at 2:57 AM, tao xiao wrote:
> Hi team,
>
> If I set offsets.storage=kafka can I still use auto.commit.enable to turn
> off auto commit and auto.commi
Since the console-consumer seems to display strings correctly, it sounds
like an issue with the LogStash parser. Perhaps you'll have better luck asking
on the LogStash mailing list?
Kafka just stores the bytes you put in and gives the same bytes out when
you read messages. There's no parsing or encoding d
It's safe.
Just note that if you send Kafka anything it does not like, it will close
the connection on you. This is intentional and doesn't signal an issue with
Kafka.
Not sure if Nagios does this, but I like "canary" tests - produce a message
with timestamp every X seconds and have a monitor tha
l I can tell .
>
> 1189855393 LogEndOffset
> 1165330350 Size
> 1176813232 LogStartOffset
>
> Thanks for your help !!
>
> Thanks,
>
> Bhavesh
>
> Thanks,
> Bhavesh
>
> On Thu, Feb 5, 2015 at 12:55 PM, Gwen Shapira
> wrote:
>
> > You can use the m
btw. the name LogCleaner is seriously misleading. It's more of a log
compactor.
Deleting old logs happens elsewhere from what I've seen.
Gwen
On Tue, Feb 10, 2015 at 8:07 AM, Jay Kreps wrote:
> Probably you need to enable the log cleaner for those to show up? We
> disable it by default and so I
Thanks Joe :)
We definitely found Kafka to be solid enough for enterprise. It does
require taking care in shipping "safe" configurations.
Whether it's secure enough depends on requirements - with no authentication,
authorization or encryption, it's definitely not secure enough for some use
cases (
shy wrote:
>
> > +1
> >
> > On Tue, Feb 10, 2015 at 01:32:13PM -0800, Jay Kreps wrote:
> > > I agree that would be a better name. We could rename it if everyone
> likes
> > > Compactor better.
> > >
> > > -Jay
> > >
> > >
Looks good. Thanks for working on this.
One note, the Channel implementation from SSL only works on Java7 and up.
Since we are still supporting Java 6, I'm working on a lighter wrapper that
will be a composite on SocketChannel but will not extend it. Perhaps you'll
want to use that.
Looking forwa
Looking at the code, it does look like appending a message set to a
partition will succeed or fail as a whole.
You can look at Log.append() for the exact logic if you are interested.
Gwen
On Wed, Feb 11, 2015 at 6:13 AM, Piotr Husiatyński wrote:
> Hi,
>
> I'm writing new client library for kaf
.
>
> ~ Joe Stein
> - - - - - - - - - - - - - - - - -
>
> http://www.stealth.ly
> - - - - - - - - - - - - - - - - -
>
> On Thu, Feb 12, 2015 at 11:37 AM, Harsha wrote:
>
> >
> > Thanks for the review Gwen. I'll keep in mind about java 6 support.
> > -Harsha
> > On Wed, Feb 11, 2015,
It would.
The way we see things is that if retrying has a chance of success, we will
retry.
Duplicates are basically unavoidable, because the producer can't always
know if the leader received the message or not.
So we expect the consumers to de-duplicate messages.
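A consumer-side de-dup can be as simple as tracking an application-level
message id (names are made up, and a real implementation would bound or
persist the set):

    // Drop messages whose application-level id was already processed.
    // A producer retry re-delivers the same id, so the second copy is skipped.
    private final Set<String> seenIds = new HashSet<String>();

    boolean isFirstDelivery(String messageId) {
        return seenIds.add(messageId);   // add() returns false for a duplicate
    }
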
Gwen
On Thu, Feb 12, 2015 at 7:
Is it possible that you have iptables on the Ubuntu where you run your
broker?
Try disabling iptables and see if it fixes the issue.
On Tue, Feb 17, 2015 at 8:47 PM, Richard Spillane wrote:
> So I would like to have two machines: one running zookeeper and a single
> kafka node and another machi
ort is explicitly
> EXPOSEd and other ports in a similar range (e.g., 8080) can be accessed
> just fine.
>
> > On Feb 17, 2015, at 8:56 PM, Gwen Shapira wrote:
> >
> > Is it possible that you have iptables on the Ubuntu where you run your
> > broker?
> >
> > T
refused
> telnet: Unable to connect to remote host
>
>
> > On Feb 17, 2015, at 10:03 PM, Gwen Shapira
> wrote:
> >
> > What happens when you telnet to port 9092? try it from both your mac and
> > the ubuntu vm.
> >
> >
> > On Tue, Feb 17, 2015 at 9
Hi Daniel,
I think you can still use the same logic you had in the custom partitioner
in the old producer. You just move it to the client that creates the
records.
The reason you don't cache the result of partitionsFor is that the producer
should handle the caching for you, so it's not necessarily
We store offsets in INT64, so you can go as high as:
9,223,372,036,854,775,807
messages per topic-partition before looping around :)
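For a sense of scale, even at a sustained 1,000,000 messages per second
into a single partition, that is roughly 292,000 years before the offset
wraps.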
Gwen
On Fri, Feb 20, 2015 at 12:21 AM, Clement Dussieux | AT Internet <
clement.dussi...@atinternet.com> wrote:
> Hi,
>
>
> I am using Kafka_2.9.2-0.8.2 and play a
* ZK was not built for a 5K writes/sec type of load.
* Kafka 0.8.2.0 allows you to commit offsets to Kafka rather than ZK. I
believe this is recommended.
* You can also commit offsets in batches (e.g. commit once every 100 messages).
This will reduce the writes and give you at-least-once while controlling
Nice :) I like the idea of tying topic name to avro schemas.
I have experience with other people's data, and until now I mostly
recommended:
...
So we end up with things like:
etl.onlineshop.searches.validated
Or if I have my own test dataset that I don't want to share:
users.gshapira.newapp.tes
Camus uses the simple consumer, which doesn't have the concept of "consumer
group" in the API (i.e. Camus is responsible for allocating threads to
partitions on its own).
The client-id is hard coded and is "hadoop-etl" in some places (when it
initializes the offsets) and "camus" in other places.
T
e to be able to improve throughput on the consumer, will need to play
> with the number of partitions. Is there any recommendation on that ratio
> partition/topic or that can be scaled up/out with powerful/more hardware?
>
> Thanks
> Anand
>
> On Tue, Feb 24, 2015 at 8:11 PM, Gwen Shapira
Actually, that's a good point.
I don't think we are publishing our Scala / Java docs anywhere (well,
Jun Rao has the RC artifacts in his personal apache account:
https://people.apache.org/~junrao/)
Any reason we are not posting those to our docs SVN and linking from our
website? Many Apache projects wit
Take a look at the Reassign Partition Tool. It lets you specify which
replica exists on which broker:
https://cwiki.apache.org/confluence/display/KAFKA/Replication+tools#Replicationtools-6.ReassignPartitionsTool
It's a bit tricky to use, so feel free to follow up with more questions :)
Gwen
On Mo
Do you have the command you used to run Camus? and the config files?
Also, I noticed your file is on maprfs - you may want to check with
your vendor... I doubt Camus was extensively tested on that particular
FS.
On Mon, Mar 2, 2015 at 3:59 PM, Bhavesh Mistry
wrote:
> Hi Kafka User Team,
>
> I ha
new broker and eventually move leader also to new broker (with out
> incrementing replica count for that partition)
> Please let me know if more information required.
>
> t
> SunilKalva
>
> On Mon, Mar 2, 2015 at 10:43 PM, Gwen Shapira
> wrote:
>
>> Take a look at
ot;This file was not closed so try to close during the
> JVM finalize..");
>
> try{
>
> writer.close();
>
> }catch(Throwable th){
>
> log.error("File Close erorr during finalize()");
>
> }
>
> }
>
>
of course :)
unclean.leader.election.enable
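It goes in server.properties (or as a per-topic override). Enabling it
trades possible data loss for availability, so the value below is only an
example:

    # allow a replica that is not in the ISR to become leader
    unclean.leader.election.enable=true
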
On Mon, Mar 2, 2015 at 9:10 PM, tao xiao wrote:
> How do I achieve point 3? is there a config that I can set?
>
> On Tue, Mar 3, 2015 at 1:02 PM, Jiangjie Qin
> wrote:
>
>> The scenario you mentioned is equivalent to an unclean leader election.
>> The
Hi,
Good catch, Joe. Releasing with a broken test is not a good habit.
I provided a small patch that fixes the issue in KAFKA-1999.
Gwen
On Tue, Mar 3, 2015 at 9:08 AM, Joe Stein wrote:
> Jun, I have most everything looks good except I keep getting test failures
> from wget
> https://people.a
I hate bringing bad news, but...
You can't really reassign replicas if the leader is not available.
Since the leader is gone, the replicas have nowhere to replicate the data from.
Until you bring the leader back (or one of the replicas with unclean
leader election), you basically lost this parti
Jay Kreps has a gist with step by step instructions for reproducing
the benchmarks used by LinkedIn:
https://gist.github.com/jkreps/c7ddb4041ef62a900e6c
And the blog with the results:
https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines
Gwe
be really helpful.
>
> Thanks again.
>
> best,
> Yuheng
>
> On Thu, Mar 5, 2015 at 3:22 PM, Gwen Shapira wrote:
>
>> Jay Kreps has a gist with step by step instructions for reproducing
>> the benchmarks used by LinkedIn:
>> https://gist.github.com/jkreps/c7ddb4041
It doesn't keep track specifically, but there are open sockets that may
take a while to clean themselves up.
Note that if you use the async producer and don't close the producer
nicely, you may miss messages as the connection will close before all
messages are sent. Guess how we found out? :)
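The fix is a one-liner once you know about it - a sketch, assuming the new
producer API and a producer that is already constructed:

    try {
        producer.send(new ProducerRecord<String, String>("mytopic", "payload"));
    } finally {
        // close() blocks until buffered messages are actually sent,
        // which is exactly the step an abrupt exit skips
        producer.close();
    }
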
Sim
Nice! Thank you, this is super helpful.
I'd add that it's not just the client that can run out of memory with a large
number of partitions - brokers can run out of memory too. We allocate
max.message.size * #partitions on each broker for replication.
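For example (numbers picked only for illustration), with a 10 MB maximum
message size and 5,000 partition replicas on one broker, that works out to
roughly 50 GB of replication buffers on that broker.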
Gwen
On Thu, Mar 12, 2015 at 8:53 AM, Jun Rao wrote:
> S
There's a KafkaRDD that can be used in Spark:
https://github.com/tresata/spark-kafka. It doesn't exactly replace
Camus, but should be useful in building a Camus-like system in Spark.
On Fri, Mar 13, 2015 at 12:15 PM, Alberto Miorin
wrote:
> We use spark on mesos. I don't want to partition our clust
ed-driver-fault-tolerance-and-zero-data-loss-in-spark-streaming.html
>>
>> -Will
>>
>> On Fri, Mar 13, 2015 at 3:28 PM, Alberto Miorin > > wrote:
>>
>>> I'll try this too. It looks very promising.
>>>
>>> Thx
>>>
>>>
using an HDFS write-ahead log to ensure zero data loss while
>>> streaming:
>>> https://databricks.com/blog/2015/01/15/improved-driver-fault-tolerance-and-zero-data-loss-in-spark-streaming.html
>>>
>>> -Will
>>>
>>> On Fri, Mar 13, 2015 at 3:28
Camus uses MapReduce though.
If Alberto uses Spark exclusively, I can see why installing a MapReduce
cluster (with or without YARN) is not a desirable solution.
On Fri, Mar 13, 2015 at 1:06 PM, Thunder Stumpges wrote:
> Sorry to go back in time on this thread, but Camus does NOT use YARN. We hav
If you want each application to handle half of the partitions for the
topic, you need to configure the same group.id for both applications.
In this case, Kafka will just see each app as a consumer in the group
and it won't care that they are on different boxes.
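Concretely, both applications would carry the same line in their consumer
config (the group name is just an example):

    group.id=my-shared-group
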
If you want each application to hand
Kafka currently has request types 0-12.
If the bytes Kafka got were parsed to request type 17260, it looks
like someone is sending malformed or even random data.
Do you recognize the sending IP? do you know about client errors from
the same point in time?
Perhaps someone is running a port scanner?
; or
> If i start one application does it start consuming all 6 partitions, and if
> start later the other one, does it start sharing the partitions with the
> other application threads ?
>
> t
> SunilKalva
>
> On Mon, Mar 16, 2015 at 10:48 PM, Gwen Shapira
> wrote:
>>
Any reason not to use Spark Streaming directly with HDFS files, so
you'll get locality guarantees from the Hadoop framework?
StreamingContext has a textFileStream() method you could use for this.
On Mon, Mar 16, 2015 at 12:46 PM, Daniel Haviv
wrote:
> Hi,
> Is it possible to assign specific partitions
Really really nice!
Thank you.
On Mon, Mar 16, 2015 at 7:18 AM, Pierre-Yves Ritschard
wrote:
> Hi kafka,
>
> I just wanted to mention I published a very simple project which can
> connect as MySQL replication client and stream replication events to
> kafka: https://github.com/pyr/sqlstream
>
>
t; to make several copies of it, which wasteful in terms of space and time.
>
> Thanks,
> Daniel
>
>> On 16 במרץ 2015, at 22:12, Gwen Shapira wrote:
>>
>> Any reason not to use SparkStreaming directly with HDFS files, so
>> you'll get locality guarantees from the
ill
> listen on directoryA, B on directoryB, C on directoryC.
> For the file to be processed I will have to copy the file both to
> directoryA and to directoyB and only then will the streaming apps will
> start processing it.
>
> Daniel
>
>
>
> On Mon, Mar 16, 2015 at 10:25 P
Your Kafka log directory (in the config file, under log.dir) contains
directories that are not Kafka topics. Possibly a hidden directory.
Check what "ls -la" shows in that directory.
Gwen
On Mon, Mar 16, 2015 at 1:58 PM, Zakee wrote:
> Running into problem starting Kafka Brokes after migrating to a ne
I think the issue is with " --from-latest" - this means consumers will
consume only data that arrives AFTER they start.
If you do that, first start consumers, leave them running, and then start
producers.
If you want to run producers first and only start consuming when producers
are done, remove
We have a patch with examples:
https://issues.apache.org/jira/browse/KAFKA-1982
Unfortunately, it's not committed yet.
Gwen
On Fri, Mar 20, 2015 at 11:24 AM, Pete Wright
wrote:
> Thanks that's helpful. I am working on an example producer using the
> new API, if I have any helpful notes or exam
If you have feedback, don't hesitate to comment on the JIRA.
On Mon, Mar 23, 2015 at 4:19 PM, Pete Wright
wrote:
> Hi Gwen - thanks for sending this along. I'll patch my local checkout
> and take a look at this.
>
> Cheers,
> -pete
>
> On 03/20/15 21:16, Gwen Sha
On Thu, Apr 2, 2015 at 12:19 PM, Wes Chow wrote:
>
> How reusable are node ids? Specifically:
>
> 1. Node goes down and loses all data in its local Kafka logs. I bring it
> back up, same hostname, same IP, same node id, but with empty logs. Does it
> properly sync with the replicas and continue o
Wow, that's scary for sure.
Just to be clear - all you did was restart *one* broker in the cluster?
Everything else was ok before the restart? And that was a controlled shutdown?
Gwen
On Wed, Apr 1, 2015 at 11:54 AM, Thunder Stumpges
wrote:
> Well it appears we lost all the data on the one node ag
This is available from 0.8.2.0, and is enabled on the server by default. The
consumer needs to specify the offsets.storage parameter - the default is still
zookeeper, so the consumers should set it to 'kafka'.
The documentation also explains how to migrate from zookeeper offsets to
kafka offsets.
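In consumer config terms that is roughly (the dual-commit line is only
needed while migrating):

    offsets.storage=kafka
    # commit to both ZooKeeper and Kafka during the migration, then disable
    dual.commit.enabled=true
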
Gwen
On
a
> class org.apache.kafka.clients.consumer.ConsumerConfig and I do not see
> there a constant for offsets.storage
>
> Am I missing something?
>
> This is my pom dependency definition:
>
> <dependency>
>   <groupId>org.apache.kafka</groupId>
>   <artifactId>kafka_2.10</artifactId>
>   <version>0.8.2.1</version>
> </dependency>
>
> On Wed, Apr 8, 2015 at 6:29 PM,
This seems like a bug. We expect broker settings to set
defaults for topics. Perhaps open a JIRA?
On Fri, Apr 10, 2015 at 1:32 PM, Bryan Baugher wrote:
> To answer my own question via testing, setting min.insync.replicas on the
> broker does not change the default. The only way I can fin
Here's the algorithm as described in AdminUtils.scala:
/**
* There are 2 goals of replica assignment:
* 1. Spread the replicas evenly among brokers.
* 2. For partitions assigned to a particular broker, their other
replicas are spread over the other brokers.
*
* To achieve this goa
Were there any errors on broker logs when you sent the new batch of messages?
On Tue, Apr 14, 2015 at 7:41 AM, Pavel Bžoch wrote:
> Hi all,
> I have a question about automatic re-/creation of the topic. I started my
> broker with property "auto.create.topics.enable " set to true. Then I tried
> t
I think of it as default.offset :)
Realistically though, changing the configuration name will probably
cause more confusion than leaving a bad name as is.
On Tue, Apr 14, 2015 at 11:10 AM, James Cheng wrote:
> "What to do when there is no initial offset in ZooKeeper or if an offset is
> out of
You should receive OffsetOutOfRange code (1) for the partition.
You can also send an OffsetRequest and check for the last
available offset (log end offset) - this way you won't have to send a
fetch request that is likely to fail.
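For reference, the lookup with the SimpleConsumer looks roughly like this
(host, port, topic and partition are placeholders):

    SimpleConsumer consumer =
        new SimpleConsumer("broker-host", 9092, 100000, 64 * 1024, "offset-lookup");
    TopicAndPartition tp = new TopicAndPartition("mytopic", 0);
    Map<TopicAndPartition, PartitionOffsetRequestInfo> requestInfo =
        new HashMap<TopicAndPartition, PartitionOffsetRequestInfo>();
    // LatestTime() returns the log end offset; EarliestTime() would return the start
    requestInfo.put(tp, new PartitionOffsetRequestInfo(kafka.api.OffsetRequest.LatestTime(), 1));
    kafka.javaapi.OffsetRequest request = new kafka.javaapi.OffsetRequest(
        requestInfo, kafka.api.OffsetRequest.CurrentVersion(), "offset-lookup");
    OffsetResponse response = consumer.getOffsetsBefore(request);
    long logEndOffset = response.offsets("mytopic", 0)[0];
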
Are you implementing your own consumer from scratch? Or using on
nd offset) - this way you won't have to send a
>> fetch request that is likely to fail.
>
> Does this takes in account specific consumer offsets stored in Zookeeper?
>
> On Fri, Apr 17, 2015 at 5:57 PM, Gwen Shapira wrote:
>
>> You should receive OffsetOutOfRange code
1. getOffsetsBefore is per partition, not per consumer group. The
SimpleConsumer is completely unaware of consumer groups.
2. "offsets closest to specified timestamp per topic-consumerGroup" <-
I'm not sure I understand what you mean. Each consumer group persists
one offset per partition, that's the
nd* procedure. For this we need to be
> able to get offset for topic and partition for specified timestamp.
>
> On Tue, Apr 21, 2015 at 4:36 AM, Gwen Shapira wrote:
>
>> I believe it doesn't take consumers into account at all. Just the
>> offset available on the partiti
They should be trying to get back into sync on their own.
Do you see any errors in broker logs?
Gwen
On Tue, Apr 21, 2015 at 10:15 AM, Thomas Kwan wrote:
> We have 5 kafka brokers available, and created a topic with replication
> factor of 3. After a few broker issues (e.g. went out of file desc
Perhaps this will help:
https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whycan'tmyconsumers/producersconnecttothebrokers?
On Thu, Apr 23, 2015 at 3:24 PM, madhavan kumar
wrote:
> Dear all,
> I am trying to connect my python consumer to a remote kafka server. But
> in kafka/conn.py#r
Since it is a runtime error, the Maven dependency is less relevant than
what you have in your classpath (unless you built a shaded uber-jar).
You'll need the Scala runtime and zkclient jars in the classpath; can you
check that you have those around?
On Thu, Apr 23, 2015 at 6:15 AM, abdul hameed pathan
w
We don't normally plan dates for releases: when we are done with the
features we want in the next release and happy with the quality, we
release. Many Apache communities are like that.
If you need firmer roadmaps and specific release dates, there are a few
vendors selling Kafka distributions and support.
A
Ouch. That can be a painful discovery after a client upgrade. It can
break a lot of app code.
I can see the reason for a custom hash algorithm (lots of db products do
this, usually for stability, but sometimes for other hash properties
(Oracle has some cool guarantees around modifying number of part
We are doing work to support custom partitioners, so everything is
customizable :)
On Sun, Apr 26, 2015 at 8:52 PM, Wes Chow wrote:
>
> Along these lines too, is the function customizable? I could see how mmh3
> (or 2) would be generally sufficient, however in some cases you may want
> someth
Definitely +1 for advertising this in the docs.
What I can't figure out is the upgrade path... if my application assumes
that all data for a single user is in one partition (so it subscribes to a
single partition and expects everything about a specific subset of users to
be in that partition), thi
Batch failure is a bit meaningless, since in the same batch, some records
can succeed and others may fail.
To implement error handling logic (usually different from retry, since
the producer has a configuration controlling retries), we recommend using
the callback option of send().
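A minimal sketch of that pattern with the new producer (topic and the
handling itself are placeholders):

    producer.send(new ProducerRecord<String, String>("mytopic", "key", "value"),
        new Callback() {
            public void onCompletion(RecordMetadata metadata, Exception exception) {
                if (exception != null) {
                    // this specific record failed after the producer's own retries:
                    // log it, park it for later, or send it to an error topic
                } else {
                    // delivered: metadata.partition() and metadata.offset() say where
                }
            }
        });
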
Gwen
P.S
Aw
client internal construct?
> If the former I was under the impression that a produced MessageSet either
> succeeds delivery or errors in its entirety on the broker.
>
> Thanks,
> Magnus
>
>
> 2015-04-27 23:05 GMT+02:00 Gwen Shapira :
>
> > Batch failure is a bit meani
@Roshan - if the data was already written to Kafka, your approach will
generate LOTS of duplicates. I'm not convinced it's ideal.
What's wrong with callbacks?
On Mon, Apr 27, 2015 at 2:53 PM, Roshan Naik wrote:
> @Gwen
>
> - A failure in delivery of one or more events in the batch (typical Flum
Kind of what you need but not quite:
Sqoop2 is capable of getting data from HDFS to Kafka.
AFAIK it doesn't support Hive queries, but feel free to open a JIRA for
Sqoop :)
Gwen
On Tue, Apr 28, 2015 at 4:09 AM, Svante Karlsson
wrote:
> What's the best way of exporting contents (avro encoded) fr
on, it seems like the current high level consumer API does
> > not seem to be supporting it, atleast not in a straight forward fashion.
> >
> > Appreciate any alternate solutions.
> >
> > On Thu, Apr 23, 2015 at 8:08 PM, Gwen Shapira
> > wrote:
> >
>
I'm starting to think that the old adage "If two people say you are drunk,
lie down" applies here :)
Current API seems perfectly clear, useful and logical to everyone who wrote
it... but we are getting multiple users asking for the old batch behavior
back.
One reason to get it back is to make upgr
Why do we think atomicity is expected, if the old API we are emulating here
lacks atomicity?
I don't remember emails to the mailing list saying: "I expected this batch
to be atomic, but instead I got duplicates when retrying after a failed
batch send".
Maybe atomicity isn't as strong a requirement a
I feel a need to respond to the Sqoop-killer comment :)
1) Note that most databases have a single transaction log per db and in
order to get the correct view of the DB, you need to read it in order
(otherwise transactions will get messed up). This means you are limited to
a single producer reading
Which Kafka version are you using?
On Thu, Apr 30, 2015 at 4:11 PM, Dillian Murphey
wrote:
> Scenerio with 1 node broker, and 3 node zookeeper ensemble.
>
> 1) Create topic
> 2) Delete topic
> 3) Re-create with same name
>
> I'm noticing this recreation gives me Leader: non, and Isr: as empty.
>
en.
>
> On Thu, Apr 30, 2015 at 5:34 PM, Gwen Shapira
> wrote:
>
> > Which Kafka version are you using?
> >
> > On Thu, Apr 30, 2015 at 4:11 PM, Dillian Murphey <
> crackshotm...@gmail.com>
> > wrote:
> >
> > > Scenerio with 1 node broker,
Hi,
The leader always serves all requests. Replicas are used just for high
availability (i.e. to become leader if needed).
On Fri, May 1, 2015 at 3:27 PM, Gomathivinayagam Muthuvinayagam <
sankarm...@gmail.com> wrote:
> I am just wondering if I have no of replicas as 3, and
> if min.insync.replic
to achieve nice parallelism. So it seems the producer is the potential
> bottleneck, but I imagine you can scale that appropriately vertically and
> put the proper HA.
>
> Would love to hear your thoughts on this.
>
> Jonathan
>
>
>
> On Thu, Apr 30, 2015 at 5:
For Flume, we use the timeout configuration and catch the exception, with
the assumption that "no messages for a few seconds" == "the end".
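With the high level consumer that translates to something like the sketch
below (stream setup omitted, timeout value arbitrary):

    // In the consumer properties, make hasNext() give up instead of blocking forever:
    //   consumer.timeout.ms=5000
    ConsumerIterator<byte[], byte[]> it = stream.iterator();
    try {
        while (it.hasNext()) {
            process(it.next());   // process() stands in for whatever the app does
        }
    } catch (ConsumerTimeoutException e) {
        // nothing arrived within 5 seconds - treat it as "we reached the end"
    }
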
On Sat, May 9, 2015 at 2:04 AM, James Cheng wrote:
> Hi,
>
> I want to use the high level consumer to read all partitions for a topic,
> and know when I have
You may want to announce this at kafka-clie...@googlegroups.com, a mailing
list specifically for Kafka clients.
I'm sure they'll be thrilled to hear about it. It is also a good place for
questions on client development, if you ever need help.
On Mon, May 11, 2015 at 4:57 AM, Yousuf Fauzan
wrote:
Also, Flume 1.6 has Kafka source, sink and channel. Using Kafka source or
channel with Dataset sink should work.
Gwen
On Tue, May 12, 2015 at 6:16 AM, Warren Henning
wrote:
> You could start by looking at Linkedin's Camus and go from there?
>
> On Mon, May 11, 2015 at 8:10 PM, Rajesh Datla
> w
Kafka with Flume is one way (just use Flume's SpoolingDirectory source with
Kafka Channel or Kafka Sink).
Also, Kafka itself has a Log4J appender as part of the project; this will
work if the log is written with log4j.
Gwen
On Tue, May 12, 2015 at 12:52 PM, ram kumar wrote:
> hi,
>
> How can i
Being an open source project, we don't have committed release dates.
You can follow the progress of our security features by watching
https://issues.apache.org/jira/browse/KAFKA-1682 and its sub-tasks.
Hope this helps!
Gwen
On Sat, May 16, 2015 at 1:22 PM, Madabhattula Rajesh Kumar <
mrajaf...@
Flume with netcat source and Kafka Channel or Kafka Sink will do that.
A bit more complex than a Kafkacat equivalent, but will get the job done.
On Tue, May 19, 2015 at 3:02 AM, clay teahouse
wrote:
> Hi All,
>
> Does anyone know of an implementation of kafkacat that reads from socket?
> Or for
See inline.
On Wed, May 20, 2015 at 12:51 PM, Harut Martirosyan <
harut.martiros...@gmail.com> wrote:
> Hi. I've got several questions.
>
> 1. As far as I understood from docs, if rebalancing feature is needed (when
> consumer added/removed), High-level Consumer should be used, what about
> Simpl
If you set advertised.host.name in server.properties to the IP address, the
IP will be registered in ZooKeeper.
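In server.properties that is a single line (the address below is a
placeholder):

    # register this IP in ZooKeeper instead of the machine's hostname
    advertised.host.name=10.1.2.3
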
On Fri, May 22, 2015 at 2:20 PM, Achanta Vamsi Subhash <
achanta.va...@flipkart.com> wrote:
> Hi,
>
> Currently Kakfa brokers register the hostname in zookeeper.
>
> [zk: localhost:2181
t; configs to be reloaded but only variable external config changes should be
> allowed.
>
> https://issues.apache.org/jira/browse/KAFKA-1229
>
> On Sun, May 24, 2015 at 1:14 PM, Gwen Shapira
> wrote:
>
> > If you set advertised.host.name in server.properties to the IP address,
Hi,
What you said is exactly correct :)
We check ISR size twice. Once before writing to the leader, and once when
checking for acks. The first error is thrown if we detect a small ISR
before writing to the leader. The second if the ISR shrank after we
wrote to the leader but before we got enough acks
The existing consumer uses Zookeeper both to commit offsets and to
assign partitions to different consumers and threads in the same
consumer group.
While offsets can be committed to Kafka in 0.8.2 releases, which
greatly reduces the load on Zookeeper, the consumer still requires
Zookeeper to manage
What do the logs show?
On Wed, Jun 10, 2015 at 5:07 PM, Dillian Murphey
wrote:
> Ran this:
>
> $KAFKA_HOME/bin/kafka-reassign-partitions.sh
>
> But Kafka did not actually do the replication. Topic description shows the
> right numbers, but it just didn't replicate.
>
> What's wrong, and how do I