I would say those specs are probably a bit much for ZooKeeper, particularly the memory and the SAS disks, assuming your ZooKeeper usage follows the typical pattern of many more reads than writes. The CPU and network interface seem about right, but I would go with lower-priced servers (perhaps R210s with SATA drives and the cheapest Xeon) and, if possible, run five servers instead of three so the ensemble can tolerate up to two server failures.
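The ensemble configuration itself stays tiny either way. As a rough sketch only (hostnames and paths here are placeholders, and each server also needs a matching myid file in its dataDir), a five-node zoo.cfg would look something like:

    tickTime=2000
    initLimit=10
    syncLimit=5
    dataDir=/var/lib/zookeeper
    clientPort=2181
    server.1=zk1.example.org:2888:3888
    server.2=zk2.example.org:2888:3888
    server.3=zk3.example.org:2888:3888
    server.4=zk4.example.org:2888:3888
    server.5=zk5.example.org:2888:3888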
Ray

On Mon, Mar 17, 2014 at 8:26 PM, Carlile, Ken <carli...@janelia.hhmi.org> wrote:

> OK, I understand. So for the Zookeeper cluster, can I go with something
> like:
>
> 3 x Dell R320:
> Single hexcore 2.5GHz Xeon, 32GB RAM, 4x10K 300GB SAS drives, 10GbE
>
> and if I do, can I drop the CPU specs on the broker machines to say, dual
> 6 cores? Or are we looking at something that is core-bound here?
>
> Thanks,
> Ken
>
> On Mar 15, 2014, at 11:09 AM, Ray Rodriguez <rayrod2...@gmail.com> wrote:
>
> > Imagine a situation where one of your nodes running a Kafka broker and
> > a ZooKeeper node goes down. You now have to contend with two
> > distributed systems that need to do leader election and consensus in
> > the case of a ZooKeeper ensemble, and partition rebalancing/repair in
> > the case of a Kafka cluster. So I think Jun's point is that when
> > running distributed systems, try to isolate them as much as possible
> > from running on the same node, to achieve better fault tolerance and
> > high availability.
> >
> > From the Kafka docs you can see that a ZooKeeper cluster doesn't need
> > to sit on very powerful hardware to be reliable, so I believe the
> > suggestion is to run a small independent ZooKeeper cluster that will be
> > used by Kafka. By all means don't hesitate to reuse that ZooKeeper
> > ensemble for other systems, as long as you can guarantee that all the
> > systems using the zk ensemble use some form of znode root to keep their
> > data separated within the ZooKeeper znode directory structure.
> >
> > This is an interesting topic, and I'd love to hear if anyone else is
> > running their zk alongside their Kafka brokers in production.
> >
> > Ray
> >
> >
> > On Sat, Mar 15, 2014 at 10:28 AM, Carlile, Ken <carli...@janelia.hhmi.org> wrote:
> >
> >> I'd rather not purchase dedicated hardware for ZK if I don't
> >> absolutely have to, unless I can use it for multiple clusters (i.e.
> >> Kafka, HBase, other things that rely on ZK). Would adding more cores
> >> help with ZK on the same machine? Or is that just a waste of cores,
> >> considering that it's Java under all of this?
> >>
> >> --Ken
> >>
> >> On Mar 15, 2014, at 12:07 AM, Jun Rao <jun...@gmail.com> wrote:
> >>
> >>> The spec looks reasonable. If you have other machines, it may be
> >>> better to put ZK on its own machines.
> >>>
> >>> Thanks,
> >>>
> >>> Jun
> >>>
> >>>
> >>> On Fri, Mar 14, 2014 at 10:52 AM, Carlile, Ken <carli...@janelia.hhmi.org> wrote:
> >>>
> >>>> Hi all,
> >>>>
> >>>> I'm looking at setting up a (small) Kafka cluster for streaming
> >>>> microscope data to Spark Streaming.
> >>>>
> >>>> The producer would be a single Windows 7 machine with a 1Gb or 10Gb
> >>>> ethernet connection running HTTP posts from Matlab (this bit is a
> >>>> little fuzzy, and I'm not the user, I'm an admin); the consumers
> >>>> would be 10-60 (or more) Linux nodes running Spark Streaming with
> >>>> 10Gb ethernet connections. The target data rate per the user is
> >>>> <200MB/sec, although I can see this scaling in the future.
> >>>>
> >>>> Based on the documentation, my initial thoughts were as follows:
> >>>>
> >>>> 3 nodes, all running ZK and the broker
> >>>>
> >>>> Dell R620
> >>>> 2x8 core 2.6GHz Xeon
> >>>> 256GB RAM
> >>>> 8x300GB 15K SAS drives (OS runs on 2, ZK on 1, broker on the last 5)
> >>>> 10Gb ethernet (single port)
> >>>>
> >>>> Do these specs make sense? Am I over- or under-speccing in any of
> >>>> the areas? It made sense to me to make the filesystem cache as large
> >>>> as possible, particularly when I'm dealing with a small number of
> >>>> brokers.
> >>>>
> >>>> Thanks,
> >>>> Ken Carlile
> >>>> Senior Unix Engineer, Scientific Computing Systems
> >>>> Janelia Farm Research Campus, HHMI
> >>
> >>
> >
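P.S. To make the znode root suggestion from my earlier message (quoted above) concrete: one way to do it, shown here with placeholder hostnames and chroot paths, is to point each system at its own chroot on the shared ensemble, e.g. in Kafka's server.properties:

    zookeeper.connect=zk1.example.org:2181,zk2.example.org:2181,zk3.example.org:2181/kafka

and for HBase set zookeeper.znode.parent in hbase-site.xml to something like /hbase, so the two systems never touch each other's znodes.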