I'd rather not purchase dedicated hardware for ZK if I don't absolutely have to, unless I can use it for multiple clusters (i.e., Kafka, HBase, and other things that rely on ZK). Would adding more cores help ZK on the same machine? Or is that just a waste of cores, considering that it's Java under all of this?
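(Side note on the shared-ensemble idea: as I understand it, one ZK ensemble can serve several clusters as long as each client is pointed at its own chroot/znode path, so the services don't step on each other's data. A rough sketch of the config keys involved, with made-up hostnames, not something I've actually run:

  # Kafka broker server.properties -- point Kafka at a /kafka chroot
  zookeeper.connect=zk1:2181,zk2:2181,zk3:2181/kafka

  # HBase keeps its own parent znode (default /hbase), set in hbase-site.xml:
  #   hbase.zookeeper.quorum  = zk1,zk2,zk3
  #   zookeeper.znode.parent  = /hbase

That doesn't answer the core-count question, but it's the mechanism that would let one set of ZK machines back multiple clusters.)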
--Ken

On Mar 15, 2014, at 12:07 AM, Jun Rao <jun...@gmail.com> wrote:

> The spec looks reasonable. If you have other machines, it may be better to
> put ZK on its own machines.
>
> Thanks,
>
> Jun
>
>
> On Fri, Mar 14, 2014 at 10:52 AM, Carlile, Ken
> <carli...@janelia.hhmi.org> wrote:
>
>> Hi all,
>>
>> I'm looking at setting up a (small) Kafka cluster for streaming microscope
>> data to Spark Streaming.
>>
>> The producer would be a single Windows 7 machine with a 1Gb or 10Gb
>> Ethernet connection running HTTP posts from Matlab (this bit is a little
>> fuzzy, and I'm not the user, I'm an admin); the consumers would be 10-60 (or
>> more) Linux nodes running Spark Streaming with 10Gb Ethernet connections.
>> The target data rate per the user is <200MB/sec, although I can see this
>> scaling in the future.
>>
>> Based on the documentation, my initial thoughts were as follows:
>>
>> 3 nodes, all running ZK and the broker
>>
>> Dell R620
>> 2x 8-core 2.6GHz Xeon
>> 256GB RAM
>> 8x 300GB 15K SAS drives (OS runs on 2, ZK on 1, broker on the last 5)
>> 10Gb Ethernet (single port)
>>
>> Do these specs make sense? Am I over- or under-speccing in any of the
>> areas? It made sense to me to make the filesystem cache as large as
>> possible, particularly since I'm dealing with a small number of brokers.
>>
>> Thanks,
>> Ken Carlile
>> Senior Unix Engineer, Scientific Computing Systems
>> Janelia Farm Research Campus, HHMI
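PS: for the 5-disk broker layout described above, a minimal server.properties sketch of how I'd spread the logs might look like the following (the mount points are placeholders, not our real paths, and the tuning values are only rough starting points):

  # spread partition logs across the five data disks; Kafka distributes
  # partitions over whatever directories are listed here
  log.dirs=/data1/kafka,/data2/kafka,/data3/kafka,/data4/kafka,/data5/kafka

  # rough starting points only -- would need tuning for the <200MB/sec target
  num.io.threads=8
  log.retention.hours=168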