The spec looks reasonable. If you have other machines, it may be better to put ZK on its own machines.
Thanks,
Jun

On Fri, Mar 14, 2014 at 10:52 AM, Carlile, Ken <carli...@janelia.hhmi.org> wrote:

> Hi all,
>
> I'm looking at setting up a (small) Kafka cluster for streaming microscope
> data to Spark-Streaming.
>
> The producer would be a single Windows 7 machine with a 1Gb or 10Gb
> ethernet connection running http posts from Matlab (this bit is a little
> fuzzy, and I'm not the user, I'm an admin), the consumer would be 10-60 (or
> more) Linux nodes running Spark-Streaming with 10Gb ethernet connections.
> Target data rate per the user is <200MB/sec, although I can see this
> scaling in the future.
>
> Based on the documentation, my initial thoughts were as follows:
>
> 3 nodes, all running ZK and the broker
>
> Dell R620
> 2x8 core 2.6GHz Xeon
> 256GB RAM
> 8x300GB 15K SAS drives (OS runs on 2, ZK on 1, broker on the last 5)
> 10Gb ethernet (single port)
>
> Do these specs make sense? Am I over or under-speccing in any of the
> areas? It made sense to me to make the filesystem cache as large as
> possible, particularly when I'm dealing with a small number of brokers.
>
> Thanks,
> Ken Carlile
> Senior Unix Engineer, Scientific Computing Systems
> Janelia Farm Research Campus, HHMI
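
For anyone sizing a similar cluster, here is a rough back-of-envelope sketch of the arithmetic behind the quoted spec. The ingest rate, broker count, RAM, and NIC speed come from the message above; the replication factor of 3, the single consumer group, and the even spread of traffic across brokers are assumptions made only for illustration, and the function names are hypothetical. Treat the numbers as upper bounds, not as a measurement.

```python
# Back-of-envelope sizing from the figures in the quoted message.
# Assumptions (NOT from the thread): replication factor 3, one consumer
# group reading the full stream once, traffic spread evenly over brokers.

INGEST_MB_S = 200      # target producer rate in MB/s (from the message)
BROKERS = 3            # from the message
RAM_GB = 256           # per broker (from the message)
NIC_GBIT = 10          # per broker, single port (from the message)
REPLICATION = 3        # assumption
CONSUMER_GROUPS = 1    # assumption


def per_broker_load(ingest_mb_s, brokers, replication, groups):
    """Return (inbound, outbound) MB/s per broker under even spreading."""
    # Every message is written once per replica, shared across the brokers.
    inbound = ingest_mb_s * replication / brokers
    # Outbound is follower replication fetches plus consumer reads.
    outbound = ingest_mb_s * (replication - 1 + groups) / brokers
    return inbound, outbound


def cache_window_seconds(ram_gb, inbound_mb_s):
    """Upper bound on how long new data stays in page cache.

    Pretends all RAM is page cache, which overstates the real window.
    """
    return ram_gb * 1024 / inbound_mb_s


inbound, outbound = per_broker_load(
    INGEST_MB_S, BROKERS, REPLICATION, CONSUMER_GROUPS
)
nic_mb_s = NIC_GBIT * 1000 / 8  # 10 Gbit/s expressed in MB/s, before overhead
window_s = cache_window_seconds(RAM_GB, inbound)

print(f"inbound per broker:  {inbound:.0f} MB/s")
print(f"outbound per broker: {outbound:.0f} MB/s")
print(f"NIC capacity:        {nic_mb_s:.0f} MB/s per direction")
print(f"page cache window:   about {window_s / 60:.0f} minutes at most")
```

Under these assumptions each broker sees traffic well below what a single 10Gb port can carry, and consumers that stay within roughly twenty minutes of the head of the log would mostly be served from memory. Adding consumer groups raises the outbound figure proportionally, so that is the term to watch if the Spark side grows.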