I'd rather not purchase dedicated hardware for ZK unless I absolutely have
to, or unless I can use it for multiple clusters (i.e., Kafka, HBase, and
other things that rely on ZK). Would adding more cores help ZK on the same
machine, or is that just a waste of cores, considering that it's Java under
all of this?
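
For reference, here's a rough sketch of how I'm picturing the per-box layout
if ZK and the broker end up sharing each machine, using the drive split from
my original message below. The hostnames, paths, and ports are just
placeholders, not anything we've actually set up yet:

  # zoo.cfg -- ZK snapshots/logs on the dedicated spindle
  tickTime=2000
  initLimit=10
  syncLimit=5
  dataDir=/zk/zookeeper-data
  clientPort=2181
  server.1=kafka01:2888:3888
  server.2=kafka02:2888:3888
  server.3=kafka03:2888:3888

  # server.properties -- partition logs spread across the 5 remaining drives
  broker.id=1
  log.dirs=/data1/kafka,/data2/kafka,/data3/kafka,/data4/kafka,/data5/kafka
  zookeeper.connect=kafka01:2181,kafka02:2181,kafka03:2181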

--Ken

On Mar 15, 2014, at 12:07 AM, Jun Rao <jun...@gmail.com> wrote:

> The spec looks reasonable. If you have other machines, it may be better to
> put ZK on its own machines.
> 
> Thanks,
> 
> Jun
> 
> 
> On Fri, Mar 14, 2014 at 10:52 AM, Carlile, Ken 
> <carli...@janelia.hhmi.org> wrote:
> 
>> Hi all,
>> 
>> I'm looking at setting up a (small) Kafka cluster for streaming microscope
>> data to Spark-Streaming.
>> 
>> The producer would be a single Windows 7 machine with a 1Gb or 10Gb
>> ethernet connection sending HTTP POSTs from Matlab (this bit is a little
>> fuzzy; I'm not the user, I'm an admin). The consumers would be 10-60 (or
>> more) Linux nodes running Spark-Streaming with 10Gb ethernet connections.
>> The target data rate per the user is <200 MB/sec, although I can see this
>> scaling in the future.
>> 
>> Based on the documentation, my initial thoughts were as follows:
>> 
>> 3 nodes, all running ZK and the broker
>> 
>> Dell R620
>> 2x8 core 2.6GHz Xeon
>> 256GB RAM
>> 8x300GB 15K SAS drives (OS runs on 2, ZK on 1, broker on the last 5)
>> 10Gb ethernet (single port)
>> 
>> Do these specs make sense? Am I over- or under-speccing in any of these
>> areas? It made sense to me to make the filesystem cache as large as
>> possible, particularly since I'm dealing with a small number of brokers.
>> 
>> Thanks,
>> Ken Carlile
>> Senior Unix Engineer, Scientific Computing Systems
>> Janelia Farm Research Campus, HHMI
>> 
