Hi, Alexander,
Sorry for the late reply on this one. I have embedded my questions and
comments inline:
On Sun, Nov 15, 2015 at 7:07 PM, Alexander Filipchik
wrote:
>
> nodeIterator = store.range(
> String.join(".", nodeId, String.valueOf(Character.MIN_VALUE)),
> String.join("
Just to add to Jagadish's suggestion - you can configure
MetricsSnapshotReporter to output the metrics into a stream and read them
from there if it works better for you.
Boris.
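
A minimal sketch of the configuration this suggestion refers to, assuming a
Kafka system named "kafka" and an output stream named "metrics" (both names
are illustrative, not taken from this thread). The reporter and serde factory
classes are the ones shipped with Samza:

```properties
# Register a snapshot reporter alongside the default JMX reporter.
metrics.reporters=snapshot,jmx
metrics.reporter.snapshot.class=org.apache.samza.metrics.reporter.MetricsSnapshotReporterFactory
metrics.reporter.jmx.class=org.apache.samza.metrics.reporter.JmxReporterFactory

# Assumption: snapshots are written to the "metrics" stream of the "kafka" system.
metrics.reporter.snapshot.stream=kafka.metrics

# Serde used to write the snapshot messages to that stream.
serializers.registry.metrics.class=org.apache.samza.serializers.MetricsSnapshotSerdeFactory
systems.kafka.streams.metrics.samza.msg.serde=metrics
```

Any consumer can then read the snapshot messages from that stream and extract
the metrics of interest.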
On Mon, Nov 16, 2015 at 12:32 PM, Jordan Shaw wrote:
> Michael,
> I should have added: if you're using Burrow in the cont
Hi Chen,
A Samza container is still single threaded. In a YARN-based deployment,
Samza uses this config value to verify that the cluster has sufficient
capacity to support running your job.
Apart from this verification, I don't believe we utilize this config value.
If you set it to > 1, it won
According to the documentation, each Samza container is single threaded.
Why is yarn.container.cpu.cores offered as a config option, and what is the
implication of setting it to a value > 1?
--
Chen Song
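
For reference, the setting in question normally sits next to the other YARN
sizing keys in a job's properties file. A sketch with illustrative values
(the numbers are assumptions, not recommendations from this thread):

```properties
# Number of containers to request for the job (illustrative value).
yarn.container.count=2
# Memory requested per container, in megabytes (illustrative value).
yarn.container.memory.mb=1024
# CPU cores requested per container; the container itself stays single threaded.
yarn.container.cpu.cores=1
```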
Michael,
I should have added: if you're using Burrow in the context of Samza consumers,
it probably won't work, because Samza does its own offset tracking (see
checkpoint topics). The messages-behind-high-watermark metric is probably your
best bet if you just want something out of the box and don't care about
Thanks Jagadish! I'll look further into this.
Jordan, I tested Burrow with 0.8.3-SNAPSHOT and set it to read consumer
offsets from ZooKeeper because I assumed that it's the default Kafka config
for committing offsets. I will try again with Burrow set to read from
__consumer_offsets.
Thanks
On Mon,
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/40313/#review106697
---
Ship it!
- Boris Shkolnik
On Nov. 13, 2015, 10:51 p.m., VENKATA
Michael,
It depends on how you define lag.
1) If you define lag as the total number of messages behind, then Burrow is
a good tool as long as all your infrastructure is on 0.8.2. It basically
works by inspecting the __consumer_offsets topic, which was introduced in
0.8.2 (they said they were going t
The following metrics are reported via JMX.
1. Samza exposes a per-partition metric called
"$topicname-$partition_name-messages-behind-high-watermark". It measures how
many messages your consumer is behind the partition's high watermark.
Ideally, at steady state, you
would expect this metric to be zer
Hi,
I'd like to monitor consumer lag.
I found this tool: https://github.com/linkedin/Burrow.
But I have now realized that Samza uses its own checkpointing mechanism.
So the question is: what's the best way to monitor whether, and by how much,
the consumer is lagging?
On a related subject, I'd also like to mo
Hi Aram,
I assume that based on the message fields, you would want to output to
Cassandra, Graphite etc.
A single Samza job is an implementation of the StreamTask/WindowableTask
interface. Samza will create multiple instances of your implementation and
assign them to containers.
Having a single sa
Hi guys,
We're processing JSON data from Kafka using Samza, and we'd like to have a
single Samza Job that's able to process and produce the messages to
different systems.
For example, consume messages from kafka, and produce them to Cassandra,
Graphite and other systems, so that the messages are