Re: Samza checkpoints in ZK

2015-02-26 Thread Chris Riccomini
Hey Dotan, Samza has the checkpoint-tool.sh script, which can be used to read checkpoints for a given task. The MetricsSnapshotReporter can also be used to read metrics from a Samza job to check its offset progress. I don't believe that there's anything on the open source side that's plug and play, but yo

Re: Samza checkpoints in ZK

2015-02-26 Thread Dotan Patrich
Hi Chris, Thanks for the info, very helpful! Seems very reasonable. By the way, it all started when I was looking for an open source monitoring tool for Samza/Kafka to see which tasks are the bottleneck in terms of performance. Do you have any experience with such a tool (other than the internal

Re: Samza checkpoints in ZK

2015-02-26 Thread Chris Riccomini
Hey Dotan, The high-level (ZK-based) Kafka consumer (not Samza's) currently uses ZK to store offsets. They (Kafka) are moving away from this in their rewritten, NIO-based consumer. They will adopt the strategy of storing offsets in a Kafka topic, just as Samza has done for years. The main m

Samza checkpoints in ZK

2015-02-25 Thread Dotan Patrich
Hi, I was looking for a quick and easy way to monitor task offsets and stumbled upon this utility: https://github.com/quantifind/KafkaOffsetMonitor It didn't work for me, and what I discovered is that it apparently looks for the consumer list and offsets in ZooKeeper, while Samza stores thos