I am no storage or ESX expert. What I was told by our storage folks is that they essentially created a dedicated storage pool in the SAN for the zookeeper VMs plus other VMs that did not have a lot of IO activity (non-DB VMs). I assume that implies dedicated physical disks in the SAN for that pool.
I am not sure whether a dedicated datastore was created in ESX for this pool, but I am guessing it was. I have not seen the issue since then.

Of course, the best solution is to run zookeeper on its own physical machines with dedicated disks, especially if you plan to use it for purposes in addition to Kafka.

I also want to mention that a *temporary* workaround for this problem is to increase the connection and session timeouts between Kafka and zookeeper.

On Thu, Nov 30, 2017 at 2:33 PM, Sean Glover <sean.glo...@lightbend.com> wrote:

> Girish, I'm curious what your solution was. Did you use locally attached
> storage for your ZK ensemble? Did you move it to static machines?
>
> On Thu, Nov 30, 2017 at 4:50 PM, John Yost <hokiege...@gmail.com> wrote:
>
> > Great point by Girish--it's the delays of syncing with Zookeeper that are
> > particularly problematic. Moreover, Zookeeper sync delays and session
> > timeouts impact other systems as well, such as Storm.
> >
> > --John
> >
> > On Thu, Nov 30, 2017 at 10:14 AM, Girish Aher <girisha...@gmail.com>
> > wrote:
> >
> > > We did not face any problems with the kafka application per se, but we
> > > have faced problems with zookeeper in virtualized environments due to
> > > slow fsyncs. We were using shared SAN storage, with pools shared with
> > > other VMs. So every time there was some kind of considerable storage
> > > activity, like a DB backup, our zookeeper fsyncs used to take tens of
> > > seconds, causing kafka-zookeeper sessions to time out.
> > >
> > > On Nov 30, 2017 2:22 AM, "Viktor Somogyi" <viktorsomo...@gmail.com>
> > > wrote:
> > >
> > > > Hi folks,
> > > >
> > > > Recently I bumped into an interesting question: using kafka in
> > > > virtualized environments, such as vmware. I'm not really familiar
> > > > with virtualization in-depth (how disk virtualization works, what
> > > > the OS-level supports are, etc.), therefore I think this is an
> > > > interesting discussion from Kafka's point of view.
> > > >
> > > > As far as I know, Kafka is designed mainly for a non-virtualized
> > > > environment (although I haven't seen it stated explicitly anywhere),
> > > > but thinking of its hard reliance on disk optimization I always
> > > > assumed this.
> > > >
> > > > Does anyone have experience with virtualized Kafka? Are you aware of
> > > > any pain points that people should consider (or performance issues)?
> > > > Are there any publications on this topic?
> > > >
> > > > Regards,
> > > > Viktor
>
> --
> Senior Software Engineer, Lightbend, Inc.
>
> <http://lightbend.com>
>
> @seg1o <https://twitter.com/seg1o>
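
For reference, the temporary workaround mentioned at the top of this reply (raising the Kafka-to-zookeeper timeouts) would go in the broker's server.properties, roughly as sketched below. The two property names are standard Kafka broker settings; the values are illustrative assumptions, not figures from this thread, and would need to be sized against the fsync stalls actually observed.

```properties
# Sketch only: the 30000 ms values are assumed for illustration.

# How long a broker's zookeeper session may go without a heartbeat
# before zookeeper expires it.
zookeeper.session.timeout.ms=30000

# How long the broker waits to establish a connection to zookeeper.
# If unset, Kafka falls back to the session timeout value.
zookeeper.connection.timeout.ms=30000
```

Note that the zookeeper server caps the session timeout it will grant (by default at 20 times its tickTime), so a larger value on the Kafka side may also require raising maxSessionTimeout in the zookeeper configuration. With fsyncs stalling for tens of seconds, this only buys headroom; it does not remove the underlying storage contention.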