@Girish, wow, that could be a nice issue to debug. I was thinking about exactly these kinds of issues with virtualized environments.
@Wim, how did you overcome the problem? Thinking about such issues, my first thoughts are to increase the VM's memory so the OS can use more of it for read/write caching, or to use smaller segments so that it syncs many smaller chunks instead of one big chunk of data at once (possibly by switching from async to synchronized <https://lonesysadmin.net/2013/12/22/better-linux-disk-caching-performance-vm-dirty_ratio/>).

On Fri, Dec 1, 2017 at 2:08 AM, Girish Aher <girisha...@gmail.com> wrote:

> I am no storage or ESX expert, what I was told by our storage folks is that
> they essentially created a dedicated storage pool in the SAN for zookeeper
> VMs plus other VMs that did not have a lot of IO activity (non DB VMs). I
> assume that implies dedicated physical disks in the SAN for that pool.
>
> I am not sure if a dedicated datastore was created in ESX for this pool, I
> am guessing they did.
> I have not seen the issue since then.
>
> Of course, the best solution is to have zookeeper on their own physicals
> and dedicated disks especially if you plan to use it for purposes in
> addition to Kafka.
>
> Also want to mention that a *temporary* solution around this problem is to
> increase the connection and session timeouts between Kafka and zookeeper.
>
>
> On Thu, Nov 30, 2017 at 2:33 PM, Sean Glover <sean.glo...@lightbend.com>
> wrote:
>
> > Girish, I'm curious what your solution was. Did you use locally attached
> > storage for your ZK ensemble? Did you move it to static machines?
> >
> > On Thu, Nov 30, 2017 at 4:50 PM, John Yost <hokiege...@gmail.com> wrote:
> >
> > > Great point by Girish--it's the delays of syncing with Zookeeper that are
> > > particularly problematic. Moreover, Zookeeper sync delays and session
> > > timeouts impact other systems as well such as Storm.
> > >
> > > --John
> > >
> > > On Thu, Nov 30, 2017 at 10:14 AM, Girish Aher <girisha...@gmail.com>
> > > wrote:
> > >
> > > > We did not face any problems with kafka application per se but we have
> > > > faced problems with zookeeper in virtualized environments due to
> > > > slowness in fsyncs. We were using a shared SAN storage with shared
> > > > pools with other VMs. So every time there was some kind of considerable
> > > > storage activity like DB backup or something, our zookeeper fsyncs used
> > > > to take tens of seconds causing kafka-zookeeper sessions to time out.
> > > >
> > > > On Nov 30, 2017 2:22 AM, "Viktor Somogyi" <viktorsomo...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi folks,
> > > > >
> > > > > Recently I bumped into an interesting question: using kafka in
> > > > > virtualized environments, such as vmware. I'm not really familiar
> > > > > with virtualization in-depth (how disk virtualization works, what
> > > > > are the OS level supports etc.), therefore I think this is an
> > > > > interesting discussion from Kafka's point. As far as I know Kafka is
> > > > > designed for a non-virtualized environment mainly (although I
> > > > > haven't seen it explicitly anywhere) but thinking of its hard
> > > > > reliance on disk optimization I always assumed this.
> > > > >
> > > > > Does anyone have experience with virtualized Kafka? Are you aware of
> > > > > any pain points that people should consider (or performance issues)?
> > > > > Are there any publications on this topic?
> > > > >
> > > > > Regards,
> > > > > Viktor
> > > > >
> >
> > --
> > Senior Software Engineer, Lightbend, Inc.
> >
> > <http://lightbend.com>
> >
> > @seg1o <https://twitter.com/seg1o>