improved resiliency, and if retries fail, the exception will be
> propagated to your *StreamTask*, and will end up failing the container (if
> you don't swallow it).
>
> Thanks,
> Jagadish
>
>
>
> On Fri, Oct 28, 2016 at 10:12 AM, David Yu
> wrote:
>
> > Hi,
Hi,
We recently experienced a Kafka broker crash. When a new broker was brought
up, we started seeing the following errors in Samza (0.10.1):
WARN o.a.k.c.producer.internals.Sender - Got error produce response with
correlation id 5199601 on topic-partition
session_key_partitioned_sessions-39, re
increase max.request.size for the producer. In
> this case you need to make sure the broker is also configured with the corresponding
> max message size. Lastly, you might also try to split the key into subkeys so
> the value will be smaller.
>
> Thanks,
> Xinyu
>
> On Thu, Oct 6, 2016 at 9:30
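For reference (an illustration, not taken from the original thread), the producer
and broker settings being discussed would typically look along these lines; the
system name "kafka" and the sizes are placeholders:

  # Samza job config: raise the Kafka producer's maximum request size
  systems.kafka.producer.max.request.size=5242880

  # Kafka broker config (server.properties): allow messages of the same size
  message.max.bytes=5242880
  replica.fetch.max.bytes=5242880

The broker-side limits need to be at least as large as the producer's
max.request.size, otherwise the broker will reject the larger batches the
producer is now willing to send.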
Hi,
Our Samza job (0.10.1) throws RecordTooLargeExceptions when flushing the KV
store changes to the changelog topic, as well as when sending outputs to Kafka.
We have two questions about this problem:
1. It seems that after the affected containers failed multiple times, the
job was able to recover and mo
's a change
> to the way Samza reads/writes coordinator stream messages.
>
> -Jake
>
> On Thu, Aug 25, 2016 at 8:44 AM, David Yu wrote:
>
> > After digging around a bit using kafka-console-consumer.sh, I'm able to
> > peek into the coordinator stream and see
with new configs. I was assuming that it would get
incremented with every deployment/update.
Thanks,
David
On Wed, Aug 24, 2016 at 5:39 PM David Yu wrote:
> Hi,
>
> I'm trying to understand the role of the coordinator stream during a job
> redeployment.
>
> From the Samza
Hi,
I'm trying to understand the role of the coordinator stream during a job
redeployment.
From the Samza documentation, I'm seeing the following about the
coordinator stream:
The Job Coordinator bootstraps configuration from the coordinator stream
each time upon job start-up. It periodically catch
in the window or commit methods, which
> could cause periodic spikes in utilization.
>
>
>
> On Wed, Aug 24, 2016 at 2:55 PM, David Yu wrote:
>
> > Interesting.
> >
> > To me, "event-loop-utilization" looks like a good indicator that shows us
> > ho
significant because the timer runs even for no-ops
>
> Since you're on 10.1, there's another useful metric
> "event-loop-utilization", which represents
> (process-ns+window-ns+commit-ns)/event-loop-ns
> (as you defined it). In other words, the proportion of time spent doing work
> in the event loop rather than waiting for input.
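As a rough illustration with invented numbers: if one pass of the event loop takes
100 ms, of which process-ns accounts for 40 ms, window-ns for 5 ms and commit-ns for
5 ms, then event-loop-utilization = (40 + 5 + 5) / 100 = 0.5, i.e. the loop spends
half its time doing work and half waiting for input.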
> this behavior, see the following properties in the config table (
> > http://samza.apache.org/learn/documentation/0.10/jobs/configuration-table.html)
> > systems.system-name.samza.fetch.threshold
> > task.poll.interval.ms
> >
> >
> >
> > On W
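To illustrate where those two properties live (here "kafka" stands for the job's
system name, and the values are believed to be the 0.10 defaults, not a tuning
recommendation):

  # max number of messages to buffer/prefetch across the system's partitions
  systems.kafka.samza.fetch.threshold=50000
  # how frequently (ms) the container polls the underlying systems for new messages
  task.poll.interval.ms=50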
on:
Is choose-ns the total number of ms used to choose a message from the input
stream? What are some gating factors (e.g. serialization?) for this metric?
Thanks,
David
On Wed, Aug 24, 2016 at 12:34 AM David Yu wrote:
> Some metric updates:
> 1. We started seeing some containers with a high
<https://www.dropbox.com/s/n1wxtngquv607nb/Screenshot%202016-08-24%2000.21.05.png?dl=0>.
On Tue, Aug 23, 2016 at 5:56 PM David Yu wrote:
> Hi, Jake,
>
> Thanks for your suggestions. Some of my answers inline:
>
> 1.
> On Tue, Aug 23, 2016 at 11:53 AM Jacob Maes wrote:
window time is high, but since it's only called once per minute,
> it looks like it only represents 1% of the event loop utilization. So I
> don't think that's a smoking gun.
>
> -Jake
>
> On Tue, Aug 23, 2016 at 9:18 AM, David Yu wrote:
>
> > Dear Samza guys,
Dear Samza guys,
We're looking for debugging suggestions for our Samza job (0.10.0), which
lags behind on consumption after running for a couple of hours, regardless
of the number of containers allocated (currently 5).
Briefly, the job aggregates events into sessions (in Avro) during process()
you'll
> need to make sure you're using Kafka 0.9 or higher on the Brokers.
>
> -Jake
>
> On Fri, Aug 5, 2016 at 10:57 PM, David Yu wrote:
>
> > I guess this might be the problem:
> >
> > 2016-08-06 05:23:23,622 [main ] WARN o.a.s.s.kafka.KafkaSystemFactory$ -
Samza has pretty good support for collecting metrics:
https://samza.apache.org/learn/documentation/0.10/container/metrics.html
With these metrics logged in Kafka, you can simply consume from this stream
for monitoring. In our case, we pipe these metrics into OpenTSDB for
visualization.
For example
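a minimal reporter setup along those lines (a sketch based on the metrics docs
linked above; the "snapshot" reporter name and the metrics stream name are
arbitrary) would be:

  metrics.reporters=snapshot,jmx
  metrics.reporter.snapshot.class=org.apache.samza.metrics.reporter.MetricsSnapshotReporterFactory
  metrics.reporter.snapshot.stream=kafka.samza-metrics
  metrics.reporter.jmx.class=org.apache.samza.metrics.reporter.JmxReporterFactory

A separate consumer can then read the samza-metrics topic and forward the
snapshots to OpenTSDB (or any other time-series store).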
our
> Kafka team and they said that's not uncommon. You might want to try
> gzip or some other compression.
>
>
> On Fri, Aug 5, 2016 at 12:10 PM, David Yu wrote:
>
> > I'm reporting back my observations after enabling compression.
> >
> > Looks like co
a.producer.compression.type=snappy
Am I missing anything?
Thanks,
David
On Wed, Aug 3, 2016 at 1:48 PM David Yu wrote:
> Great. Thx.
>
> On Wed, Aug 3, 2016 at 1:42 PM Jacob Maes wrote:
>
>> Hey David,
>>
>> what gets written to the changelog topic
>>
partitions owned by the tasks in the container. Restoration should
> happen only once and not multiple times.
>
> What log statements do you see that indicate that it is going through the
> changelog multiple times? Can you please share that ?
>
> Thanks!
> Navina
>
> On Fri, A
Within a given container, does the restoration process go through the
changelog topic once to restore ALL stores in that container? From the
logs, I have a feeling that it is going through the changelog multiple
times.
Can anyone confirm?
Thanks,
David
.producer.compression.type
>
> Hope this helps.
> -Jake
>
>
>
> On Wed, Aug 3, 2016 at 11:16 AM, David Yu wrote:
>
> > I'm trying to understand what gets written to the changelog topic. Is it
> > just the serialized value of the particular state store entry?
I'm trying to understand what gets written to the changelog topic. Is it
just the serialized value of the particular state store entry? If I want to
compress the changelog topic, do I enable that from the producer?
The reason I'm asking is that we are seeing producer throughput issues and
suspect
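For what it's worth, a sketch of the relevant settings, assuming a store named
"session-store" and a Kafka system named "kafka" (the changelog is an ordinary
Kafka topic, so compression is applied through that system's producer properties):

  # declare the changelog topic for the store
  stores.session-store.changelog=kafka.session-store-changelog
  # compress everything this job produces to the "kafka" system, changelog writes included
  systems.kafka.producer.compression.type=snappy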
Is there a way to control the number of producers? Our Samza job writes a
lot of data to the downstream Kafka topic. I was wondering if there is a
way to optimize concurrency by creating more async producers.
Thanks,
David
I'm using 0.10.
Thanks for confirming.
On Tue, Jul 26, 2016 at 1:12 PM Navina Ramesh
wrote:
> Hi David,
> Which version of Samza are you using? Starting with Samza 0.8, I think we
> always used async producer.
>
> Thanks!
> Navina
>
> On Tue, Jul 26, 2016 a
Our session-roll-up Samza job is experiencing throughput issues. We would
like to fine-tune our Kafka producer. Does Samza use "producer.type=sync"
by default? From the JMX console, I'm seeing some producer batching
metrics. I thought those were for async producers.
Thanks,
David
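If the producer is indeed async (as noted earlier in the thread), the usual Kafka
batching knobs can be passed straight through the job config; a sketch with
arbitrary values, assuming the output system is named "kafka":

  systems.kafka.producer.batch.size=65536
  systems.kafka.producer.linger.ms=10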
containers than others
(3), whereas now things are perfectly balanced. But I'm befuddled as to why
that might cause the particular issue.
Any insight is appreciated.
Thanks,
David
On Mon, Jun 13, 2016 at 10:57 PM, David Yu wrote:
> Hi, Yi,
>
> I couldn't find any errors in the
state store from the changelog. You can use it to
> recover the state store from changelog and compare w/ the local RocksDB to
> verify if there are any discrepancies.
>
> Best.
>
> -Yi
>
> On Sun, Jun 12, 2016 at 12:49 PM, David Yu
> wrote:
>
> > Jagadish,
> &
session's
> sessionId is *always* at the tail of your session table, and your session
> expiration order is the same as the order determined by the sessionId, that
> should work as well.
>
> -Yi
>
> On Mon, Jun 6, 2016 at 3:17 PM, David Yu wrote:
>
> > Hi, Yi,
s older than the retention are purged.
> 2. Compaction: Newer key-values will over-write older keys and only the
> most recent value is retained.
>
> I'm not sure if offsets are always monotonically increasing in Kafka or
> could change after a compaction or a time-based retention
My understanding of the store changelog is that each task writes store changes
to a particular changelog partition for that task. (Does that mean the
changelog keys are task names?)
One thing that confuses me is that the last offsets of some changelog
partitions do not move. I'm using the kafka GetOffsetShell
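For reference, a typical GetOffsetShell invocation to fetch the latest offset of
every partition of a changelog topic looks roughly like this (broker address and
topic name are placeholders):

  $ bin/kafka-run-class.sh kafka.tools.GetOffsetShell \
      --broker-list broker1:9092 \
      --topic my-job-session-store-changelog \
      --time -1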
Not sure if it is configurable. But you should be able to find an entry in
your samza log like this:
/data/b/yarn/nm/usercache/david/appcache/application_1464853403568_0010/container_e04_1464853403568_0010_01_02/state/session-store/Partition_14
On Fri, Jun 10, 2016 at 7:01 PM, Jack Huang wrote:
important to
> achieve fast and efficient operation. Please refer to this blog from
> RocksDB team:
>
> https://github.com/facebook/rocksdb/wiki/Implement-Queue-Service-Using-RocksDB
>
> -Yi
>
> On Mon, Jun 6, 2016 at 2:25 PM, David Yu wrote:
>
> > We use Samza RocksDB
We use Samza RocksDB to keep track of our user event sessions. The task
periodically calls window() to update all sessions in the store and purge
all closed sessions.
We do all of this in the same iterator loop.
Here's how we are doing it:
public void window(MessageCollector collector, TaskCoor
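For context, a minimal sketch of the pattern being described (the store name,
output stream, and the Session type with its isClosed() check are assumptions for
illustration, not the actual job code):

  import org.apache.samza.config.Config;
  import org.apache.samza.storage.kv.Entry;
  import org.apache.samza.storage.kv.KeyValueIterator;
  import org.apache.samza.storage.kv.KeyValueStore;
  import org.apache.samza.system.OutgoingMessageEnvelope;
  import org.apache.samza.system.SystemStream;
  import org.apache.samza.task.InitableTask;
  import org.apache.samza.task.MessageCollector;
  import org.apache.samza.task.TaskContext;
  import org.apache.samza.task.TaskCoordinator;
  import org.apache.samza.task.WindowableTask;

  public class SessionWindowTask implements WindowableTask, InitableTask {
    private static final SystemStream OUTPUT = new SystemStream("kafka", "closed-sessions");
    private KeyValueStore<String, Session> sessionStore; // Session = the job's Avro session record (assumed)

    @Override
    @SuppressWarnings("unchecked")
    public void init(Config config, TaskContext context) {
      sessionStore = (KeyValueStore<String, Session>) context.getStore("session-store");
    }

    @Override
    public void window(MessageCollector collector, TaskCoordinator coordinator) {
      // Walk every session currently in the RocksDB-backed store.
      KeyValueIterator<String, Session> it = sessionStore.all();
      try {
        while (it.hasNext()) {
          Entry<String, Session> entry = it.next();
          Session session = entry.getValue();
          if (session.isClosed()) {
            // Emit the finished session downstream and purge it from the store.
            collector.send(new OutgoingMessageEnvelope(OUTPUT, entry.getKey(), session));
            sessionStore.delete(entry.getKey());
          }
        }
      } finally {
        it.close(); // always release the RocksDB iterator
      }
    }
  }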
SIGTERM and a SIGKILL, though I can't find it at
> the moment.
>
> -Jake
>
> On Wed, May 18, 2016 at 10:32 AM, David Yu
> wrote:
>
> > From the NM log, I'm seeing:
> >
> > 2016-05-18 06:29:06,248 INFO
> >
> >
> org.apache.hadoop.yarn.ser
265 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
Application *application_1463512986427_0007* transitioned from RUNNING to
FINISHING_CONTAINERS_WAIT
(*Highlighted* is the particular samza application.)
The status never transitioned from FINISHING_CONTAINERS_WAIT :(
On Wed, May 18, 2016 at 10:21 AM, David Y
running when the NM shut down. This option has virtually eliminated
> orphaned containers in our clusters.
>
> -Jake
>
> On Tue, May 17, 2016 at 11:54 PM, David Yu
> wrote:
>
> > Samza version = 0.10.0
> > YARN version = Hadoop 2.6.0-cdh5.4.9
>
Samza version = 0.10.0
YARN version = Hadoop 2.6.0-cdh5.4.9
We are experiencing issues when killing a Samza job:
$ yarn application -kill application_1463512986427_0007
Killing application application_1463512986427_0007
16/05/18 06:29:05 INFO impl.YarnClientImpl: Killed application
application_14
I'm trying to debug our samza job, which seems to be stuck consuming
from our Kafka stream.
Every time I redeploy the job, only the same handful of events get
consumed, and then no more events get processed. I manually checked to make
sure the input stream is live and flowing. I also tried bot
stuck :(
Any ideas that could help me debug this will be appreciated.
On Wed, Mar 16, 2016 at 4:19 PM, David Yu wrote:
> No, instead, I updated the checkpoint topic with the "upcoming" offsets.
> (I should have done a check before that though).
>
> So a related question: if
e the debug line after
> troubleshooting. Else you risk filling up your logs.
>
> Let me know if you have more questions.
>
> Thanks!
> Navina
>
> On Wed, Mar 16, 2016 at 2:12 PM, David Yu wrote:
>
> > I'm trying to debug our samza job, which seems to be stuck fro
Looks like this has nothing to do with checkpointing. Our samza job has an
issue communicating with an external service, which left the particular
process() call waiting indefinitely. And it doesn't look like samza has a
way to time out a processing cycle.
On Thu, Mar 17, 2016 at 5:42 PM, Dav
from the correct offset in the stream :)
>
> Thanks!
> Navina
>
> On Wed, Mar 16, 2016 at 3:16 PM, David Yu wrote:
>
> > Finally seeing events flowing again.
> >
> > Yes, the "systems.kafka.consumer.auto.offset.reset" option is probably
> not
> > a fact
Strangely, I was not able to get the checkpoint value for one particular
partition. Could this cause the job to be stuck?
On Thu, Mar 17, 2016 at 5:23 PM, David Yu wrote:
> Hi, I wanna resurface this thread because I'm still facing issues with our
> samza not receiving events.
>
and then compare it
> with the topic offset.
>
> it's not that straightforward compared to existing tools such as yours,
> and requires additional effort to integrate with dashboard tools such as
> grafana.
>
> I think this group have better ways to do this ;-)
>
d = KafkaUtil.getClientId("samza-consumer", config)
>
> KafkaUtil.getClientId has this logic.
>
> I'm curious about your use case as to why you are interested in inspecting
> this information?
>
> Thanks,
> Jagadish
>
> On Tuesday, March 15, 2016, David
Our samza job is consuming from a Kafka topic. AFAIU, samza will auto
assign the job a consumer group id and client id. However, I'm not able to
see that showing up under zookeeper. Am I missing something?
basic metrics by googling the following
> > classes:
> > SamzaContainerMetrics, TaskInstanceMetrics, SystemConsumersMetrics and
> > SystemProducersMetrics. For your example, you can use the
> > "process-calls"
> > in SamzaContainerMetrics to get
Hi,
Where can I find the detailed descriptions of the out of the box metrics
provided by MetricsSnapshotReporterFactory and JmxReporterFactory?
I'm interested in seeing the basic metrics of my samza job (e.g.
messages_processed_per_sec). But it's hard to pinpoint the specific
metric that