Re: Kafka in virtualized environments

2017-11-30 Thread John Yost
Great point by Girish--it's the delays of syncing with Zookeeper that are particularly problematic. Moreover, Zookeeper sync delays and session timeouts impact other systems as well, such as Storm. --John On Thu, Nov 30, 2017 at 10:14 AM, Girish Aher wrote: > We did not face any problems with kaf

Re: Kafka 0.9.0.1 partitions shrink and expand frequently after restart the broker

2017-11-09 Thread John Yost
Yep, the team here, including Ismael, pointed me in the right direction, which was much appreciated. :) On Thu, Nov 9, 2017 at 10:02 AM, Viktor Somogyi wrote: > I'm happy that it's solved :) > > On Thu, Nov 9, 2017 at 3:32 PM, John Yost wrote: > > > Excellent point

Re: Kafka 0.9.0.1 partitions shrink and expand frequently after restart the broker

2017-11-09 Thread John Yost
> problems. I would recommend you upgrade to 1.0.0 if that is feasible. > > Viktor > > > On Thu, Nov 9, 2017 at 2:59 PM, John Yost wrote: > > > I've seen this before and it was due to long GC pauses due in large part > to > > a memory heap > 8 GB. >

Re: Kafka 0.9.0.1 partitions shrink and expand frequently after restart the broker

2017-11-09 Thread John Yost
I've seen this before; it was due to long GC pauses caused in large part by a memory heap > 8 GB. --John On Thu, Nov 9, 2017 at 8:17 AM, Json Tu wrote: > Hi, > we have a kafka cluster which is made of 6 brokers, with 8 cpu and > 16G memory on each broker’s machine, and we have about 1600 t

Re: Kafka JVM heap limit

2017-11-08 Thread John Yost
I did and it did not help. The heap size was the issue. --John On Wed, Nov 8, 2017 at 9:30 AM, Ted Yu wrote: > Did you use G1GC ? > Thanks > ---- Original message ---- From: John Yost > Date: 11/8/17 5:48 AM (GMT-08:00) To: users@kafka.apache.org Cc: > ja...@scholz

Re: Kafka JVM heap limit

2017-11-08 Thread John Yost
In addition, in my experience, a memory heap > 8 GB leads to long GC pauses, which cause the ISR statuses to change constantly, leading to an unstable cluster. --John On Wed, Nov 8, 2017 at 4:30 AM, chidigam . wrote: > Meaning, already read the doc, but couldn't relate, having large Heap for >
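To keep GC pauses short enough that followers stay in the ISR, a common approach is a small heap plus G1GC. The settings below are a sketch with assumed values, not a configuration taken from this thread:

```shell
# Illustrative broker JVM settings (every value here is an assumption --
# tune for your own load). Kafka leans heavily on the OS page cache, so a
# modest heap is usually enough; G1GC targets short, predictable pauses.
export KAFKA_HEAP_OPTS="-Xms6g -Xmx6g"
export KAFKA_JVM_PERFORMANCE_OPTS="-server -XX:+UseG1GC \
  -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35"
```

The point is to hand spare RAM to the page cache rather than the heap; a 10+ GB heap mostly gives the collector more garbage to traverse per cycle.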

Re: No. of Kafka Instances in Single machine

2017-11-06 Thread John Yost
I would say the key thing is to have each Kafka server write to a separate set of 1..n disks, and plan accordingly. --John On Mon, Nov 6, 2017 at 6:23 AM, chidigam . wrote: > Hi All, > Let say, I have big machine, which having 120GB RAM, with lot of cores, > and very high disk capacity. > > Ho
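As a sketch of that layout (the broker ids and disk paths are hypothetical), each broker instance on the machine gets its own server.properties pointing at a disjoint set of disks:

```properties
# server-1.properties -- first broker instance
broker.id=1
port=9092
log.dirs=/disk1/kafka-logs,/disk2/kafka-logs

# server-2.properties -- second broker instance, non-overlapping disks
broker.id=2
port=9093
log.dirs=/disk3/kafka-logs,/disk4/kafka-logs
```

Sharing a spindle between two brokers makes their sequential log writes contend with each other, which defeats much of Kafka's I/O efficiency.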

Re: Idle cluster high CPU usage

2017-09-21 Thread John Yost
Oh wow, okay, not sure what it is then. On Thu, Sep 21, 2017 at 11:57 AM, Elliot Crosby-McCullough < elliot.crosby-mccullo...@freeagent.com> wrote: > I cleared out the DB directories so the cluster is empty and no messages > are being sent or received. > > On 21 September 2

Re: Idle cluster high CPU usage

2017-09-21 Thread John Yost
The only thing I can think of is message format...do the client and broker versions match? If the clients are a lower version than the brokers (e.g., 0.9.0.1 client, 0.10.0.1 broker), then I think there could be message format conversions both for incoming messages as well as for replication. --John
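If that mismatch is the cause, one mitigation (a sketch; whether it applies depends on your exact versions) is to pin the on-disk message format to the clients' version so the broker does not convert on every fetch:

```properties
# server.properties -- keep messages in the 0.9.0.1 format while any
# 0.9.0.1 clients remain, avoiding per-fetch down-conversion on the broker
log.message.format.version=0.9.0.1
```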

Re: Upgraded brokers from 0.9.0.1 -> 0.10.0.1: how to upgrade message format to 0.10.0.1?

2017-09-19 Thread John Yost
on is not particularly > costly since the data is in the heap and, if you use compression, 0.9.0.x > would do recompression either way. > > Ismael > > On Tue, Sep 19, 2017 at 2:41 PM, John Yost wrote: > > > Hi Everyone, > > > > We recently upgraded our cluster from 0

Upgraded brokers from 0.9.0.1 -> 0.10.0.1: how to upgrade message format to 0.10.0.1?

2017-09-19 Thread John Yost
Hi Everyone, We recently upgraded our cluster from 0.9.0.1 to 0.10.0.1 but had to keep our Kafka clients at 0.9.0.1. We now want to upgrade our clients and, concurrently, the message version to 0.10.0.1. When we did the 0.9.0.1 -> 0.10.0.1 broker upgrade we were not able to upgrade the kafka clie
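The usual two-phase sequence (a sketch of the documented upgrade procedure; confirm against the upgrade notes for your exact release) is to hold the message format at the old version until every client is upgraded, then bump it with a rolling restart:

```properties
# Phase 1: brokers on 0.10.0.1, clients still on 0.9.0.1
inter.broker.protocol.version=0.10.0.1
log.message.format.version=0.9.0.1

# Phase 2: once all clients are on 0.10.0.1, raise the format
# and perform a rolling restart of the brokers
log.message.format.version=0.10.0.1
```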

Re: kafka cluster crashs periodically

2017-07-18 Thread John Yost
I saw this recently as well. This could result from either really long GC pauses or slow Zookeeper responses. The former can result from an overly large memory heap or a sub-optimal GC algorithm/GC configuration. --John On Tue, Jul 18, 2017 at 3:18 AM, Mackey star wrote: > [2017-07-15 08:45:19,071]

Re: Kafka compatibility matrix needed

2017-07-18 Thread John Yost
Hi Everyone, I personally found that the 0.8.x clients do not work with 0.10.0. We upgraded our clients (KafkaSpout and custom consumers) to 0.9.0.1 and then Kafka produce/consume worked fine. --John On Tue, Jul 18, 2017 at 6:36 AM, Sachin Mittal wrote: > OK. > > Just a doubt I have is that my

Re: Infinite loop trying to connect to coordinator

2017-07-11 Thread John Yost
Hi Pierre, Do your brokers remain responsive? In other words, do you see any other symptoms, such as decreased write or read throughput, which may indicate long GC pauses or possibly heavy load on your zookeeper cluster, as evidenced by any SocketTimeoutExceptions on the Kafka and/or Zookeeper sides?

Re: 0.9.x to 0.10.x upgrade--any default settings gotchas?

2017-07-10 Thread John Yost
he data without copying it > to the JVM heap. > > Ismael > > On Sun, Jul 9, 2017 at 4:23 PM, John Yost wrote: > > > Hi Ismael, > > > > Gotcha, will do. Okay, in reading to docs you linked, that may explain > what > > we're seeing. When we upgraded to

Re: 0.10.1 memory and garbage collection issues

2017-07-10 Thread John Yost
> Hope this helps. > > On Sun, 9 Jul 2017 at 10:45 John Yost wrote: > > > Hey Ismael, > > > > Thanks a bunch for responding so quickly--really appreciate the > follow-up! > > I will have to get those details tomorrow when I return to the office.

Re: 0.9.x to 0.10.x upgrade--any default settings gotchas?

2017-07-09 Thread John Yost
some > questions in the related "0.10.1 memory and garbage collection issues" > thread that you started. > > Ismael > > On Sun, Jul 9, 2017 at 3:30 PM, John Yost wrote: > > > Hi Everyone, > > > > Ever since we've upgraded from 0.9.0.1 to 0.10.0 o

Re: 0.10.1 memory and garbage collection issues

2017-07-09 Thread John Yost
We would need more details to be able to help. What is the version of your > producers and consumers, is compression being used (and the compression > type if it is) and what is the broker/topic message format version? > > Ismael > > On Sun, Jul 9, 2017 at 1:13 PM, John Yost

0.9.x to 0.10.x upgrade--any default settings gotchas?

2017-07-09 Thread John Yost
Hi Everyone, Ever since we upgraded from 0.9.0.1 to 0.10.0, our five-node Kafka cluster has been unstable. Specifically, whereas before a 6GB memory heap worked fine, all five brokers crashed with out-of-memory errors within an hour of the upgrade. I boosted the memory heap to 10

Re: 0.10.1 memory and garbage collection issues

2017-07-09 Thread John Yost
-XX:MinMetaspaceFreeRatio=50 -XX:MaxMetaspaceFreeRatio=80 On Sun, Jul 9, 2017 at 8:13 AM, John Yost wrote: > Hey Everyone, > > When we originally upgraded from 0.9.0.1 to 0.10.0 with the exact same > settings we immediately observed OOM errors. I upped the heap size from 6 > GB to 10 GB and that

0.10.1 memory and garbage collection issues

2017-07-09 Thread John Yost
Hey Everyone, When we originally upgraded from 0.9.0.1 to 0.10.0 with the exact same settings we immediately observed OOM errors. I upped the heap size from 6 GB to 10 GB and that solved the OOM issue. However, I am now seeing that the ISR count for all partitions goes from 3 to 1 after about an h

Broker leaving cluster--even though Kafka broker remains up

2017-07-06 Thread John Yost
Hi Everyone, What causes a broker to leave a cluster even when the broker remains running? Is it loss of sync with Zookeeper? --John
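One common mechanism: broker liveness is tracked by an ephemeral znode under /brokers/ids, so a GC pause or network stall longer than the ZooKeeper session timeout deregisters the broker even though the process stays healthy. Raising the timeout buys headroom (the value below is illustrative, not a recommendation):

```properties
# server.properties -- tolerate longer pauses before the ZK session expires
# and the broker's registration disappears from the cluster
zookeeper.session.timeout.ms=12000
zookeeper.connection.timeout.ms=12000
```

This treats the symptom; if the pauses come from GC, shrinking the heap or tuning the collector is the real fix.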

Kafka broker stays up but leaves cluster

2017-07-05 Thread John Yost
Hi Everyone, We just upgraded to 0.10.0, and we've repeatedly seen a situation where a broker is up and appears to be fetching replica state from the lead replicas but the broker is not listed as part of the cluster. Any ideas as to why this is happening? Anything I should grep for in the logs? T

Decreasing number of partitions INCREASED write throughput--thoughts?

2016-02-20 Thread John Yost
Hi Everyone, I discovered something yesterday completely by accident: when I went from writing to 10 topics, each with 10 partitions, to 10 topics, each with 2 partitions--all other config params the same--my write throughput almost doubled! I was not expecting this, as I've always thought--and, admittedly
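One plausible explanation is batching efficiency, sketched as a toy model below. The send rate and linger window are invented numbers, and an even spread of messages over partitions is assumed: with a fixed aggregate rate, 100 partitions accumulate far fewer messages per partition per batching window than 20 do, so batches are smaller and the producer pays more per-request overhead.

```python
# Toy model (all numbers hypothetical): why fewer partitions can raise
# producer throughput. Messages are spread over every partition, so more
# partitions means fewer messages per partition accumulate during each
# batching window -- smaller batches, more produce requests per message.
MSGS_PER_SEC = 100_000
LINGER_SECONDS = 0.005  # hypothetical batching window

def avg_batch_size(total_partitions: int) -> float:
    """Average messages per batch, assuming an even spread over partitions."""
    return MSGS_PER_SEC / total_partitions * LINGER_SECONDS

# 10 topics x 10 partitions = 100 partitions -> small batches
small = avg_batch_size(100)
# 10 topics x 2 partitions = 20 partitions -> 5x larger batches
large = avg_batch_size(20)
print(small, large)
```

Under this model the 2-partition layout fills batches five times fuller than the 10-partition one, which is directionally consistent with the near-doubling observed, though real producer behavior has many more variables.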

Re: Does kafka.common.QueueFullException indicate back pressure in Kafka?

2016-02-20 Thread John Yost
; etc). The docs go through each JMX metric relevant here. Then from there > you can start understanding how to alleviate the problem. > > Feel free to share metrics and more information and we can explore them > together. > > Alex > > On Thu, Feb 18, 2016 at 5:18 AM, John Y

Re: kafka.common.QueueFullException

2016-02-18 Thread John Yost
Hi Alex, Great info, thanks! I asked a related question this AM--is a full queue possibly a symptom of back pressure within Kafka? --John On Thu, Feb 18, 2016 at 12:38 PM, Alex Loddengaard wrote: > Hi Saurabh, > > This is occurring because the produce message queue is full when a produce > req

Does kafka.common.QueueFullException indicate back pressure in Kafka?

2016-02-18 Thread John Yost
Hi Everyone, I am encountering this exception similar to Saurabh's report earlier today as I try to scale up a Storm -> Kafka output via the KafkaBolt (i.e., add more KafkaBolt executors). Question...does this necessarily indicate back pressure from Kafka where the Kafka writes cannot keep up wit
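The exception itself just means the producer's bounded in-flight buffer filled faster than it drained, which is back pressure in effect whatever the root cause (slow brokers, too few partitions, or simply too many KafkaBolt executors). A toy illustration with a plain bounded queue, not Kafka code:

```python
import queue

# Toy illustration: the producer buffers sends in a bounded in-memory queue.
# When brokers drain it slower than producers fill it, the non-blocking
# enqueue fails -- the analogue of kafka.common.QueueFullException.
buf = queue.Queue(maxsize=2)
buf.put("msg-1")
buf.put("msg-2")
try:
    buf.put("msg-3", block=False)  # buffer already full
    overflowed = False
except queue.Full:
    overflowed = True
print(overflowed)
```

So the question of "back pressure vs. misconfiguration" comes down to whether the drain side (broker write capacity) or the fill side (producer concurrency and queue sizing) is the binding constraint.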

Re: Migrating both Zookeeper ensemble and Kafka cluster please confirm steps

2015-11-02 Thread John Yost
lusters, or simply adding instance to the existing > one? > > > > On Sun, Nov 1, 2015 at 1:18 PM, John Yost wrote: > > > Hi Everyone, > > > > I need to migrate my organization's Kafka cluster along with the > underlying > > Zookeeper ensemble (cluste

Re: Quick kafka-reassign-partitions.sh script question

2015-11-02 Thread John Yost
hes up - if you > continue producing at this time, it will take longer to catch up. > 3. Once the new replica caught up, the old replica will get deleted. > > Hope this clarifies. > > On Mon, Nov 2, 2015 at 5:43 AM, John Yost wrote: > > > Hi Everyone, > > > >

Quick kafka-reassign-partitions.sh script question

2015-11-02 Thread John Yost
Hi Everyone, Perhaps a silly question...does one need to shut down incoming data feeds to Kafka prior to moving partitions via the kafka-reassign-partitions.sh script? My thought is yes, but I just want to be sure. Thanks --John
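For reference, a sketch of the CLI round trip (the ZooKeeper address and JSON file name are placeholders). Per the reply in this thread, the feeds can generally keep running: the tool replicates data to the new brokers in the background and only then drops the old replicas, though continued producing lengthens the catch-up.

```shell
# Start the reassignment described in reassign.json (placeholder file)
kafka-reassign-partitions.sh --zookeeper zk1:2181 \
  --reassignment-json-file reassign.json --execute

# Poll until every partition reports the reassignment completed
kafka-reassign-partitions.sh --zookeeper zk1:2181 \
  --reassignment-json-file reassign.json --verify
```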

Migrating both Zookeeper ensemble and Kafka cluster please confirm steps

2015-11-01 Thread John Yost
Hi Everyone, I need to migrate my organization's Kafka cluster along with the underlying Zookeeper ensemble (cluster) from one set of racks to another within our data center. I am pretty sure I have the steps correct, but I need to confirm just to ensure I am not missing anything. Here's what I t