Re: Can't get all stored values via range iterator

2015-11-16 Thread Yi Pan
Hi, Alexander, Sorry to reply late on this one. I embedded my questions and comments in-between the lines: On Sun, Nov 15, 2015 at 7:07 PM, Alexander Filipchik wrote: > > nodeIterator = store.range( > String.join(".", nodeId, String.valueOf(Character.MIN_VALUE)), > String.join("

Re: Monitoring consumer lag

2015-11-16 Thread Boris Shkolnik
Just to add to Jagasish's suggestion - you can configure MetricsSnapshotRecorder to output the metrics into a stream and read them from there if it works better for you. Boris. On Mon, Nov 16, 2015 at 12:32 PM, Jordan Shaw wrote: > Michael, > I should have added if your using Burrow in the cont

Re: question on yarn.container.cpu.cores

2015-11-16 Thread Navina Ramesh
Hi Chen, Samza container is still single threaded. In case of yarn based deployment, Samza uses this config value to verify that the cluster has sufficient capacity to support running your job. Apart from this verification, I don't believe we utilize this config value. If you set it to > 1, it won

question on yarn.container.cpu.cores

2015-11-16 Thread Chen Song
According to the documentation, each Samza container is single threaded. Why giving yarn.container.cpu.cores as a config option and what is the implication to set this to a value > 1? -- Chen Song

Re: Monitoring consumer lag

2015-11-16 Thread Jordan Shaw
Michael, I should have added if your using Burrow in the context of samza consumers it probably won't work because samza does it's own offset tracking (see checkpoint topics). The messages-behind-high-watermark is probably your best bet if you just want something out of the box and don't care about

Re: Monitoring consumer lag

2015-11-16 Thread Michael Ravits
Thanks Jagadish! I'll look further into this. Jordan, I tested Burrow with 0.8.3-SNAPSHOT and set it to read consumer offsets from zookeeper because I assumed that it's the default Kafka config for commiting offsets. Will try again with Burrow set to read from __consumer_offsets. Thanks On Mon,

Re: Review Request 40313: SAMZA-785

2015-11-16 Thread Boris Shkolnik
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/40313/#review106697 --- Ship it! - Boris Shkolnik On Nov. 13, 2015, 10:51 p.m., VENKATA

Re: Monitoring consumer lag

2015-11-16 Thread Jordan Shaw
Michael, It depends on how you define lag. 1) If you define lag as the total number of messages behind then burrow is a good tool as long as all your infrastructure is on 0.8.2, it basically works by inspecting the __consumer_offsets topic which was introduced in 0.8.2 (they said they were going t

Re: Monitoring consumer lag

2015-11-16 Thread Jagadish Venkatraman
The following metrics are reported via JMX. 1. Samza exposes a per-partition metric called "$topicname-$partition_name- messages-behind-high-watermark". This measures the number of messages behind the high watermark of your consumer. Ideally, at steady state, you would expect this metric to be zer

Monitoring consumer lag

2015-11-16 Thread Michael Ravits
Hi, I'd like to monitor consumer's lag. Found this tool https://github.com/linkedin/Burrow. But now realized that Samza is using it's own checkpointing mechanism. So question is what's the best way to monitor whether and how much the consumer is lagging? On a related subject, I'd also like to mo

Re: [Multiple producers for one task]

2015-11-16 Thread Jagadish Venkatraman
Hi Aram, I assume that based on the message fields, you would want to output to Cassandra, Graphite etc. A single samza job is an implementation of the StreamTask/WindowableTask interface. Samza will create multiple instances of your implementation and assign it to containers. Having a single sa

[Multiple producers for one task]

2015-11-16 Thread Aram Mkrtchyan
Hi guys, We're processing JSON data from Kafka using Samza, and we'd like to have a single Samza Job that's able to process and produce the messages to different systems. For example, consume messages from kafka, and produce them to Cassandra, Graphite and other systems, so that the messages are