Re: Samza 0.12.0 + synchronous KafkaProducer ?

2017-03-09 Thread Gaurav Agarwal
Thanks a lot, Jagadish, for bearing with us for this long. We were able to locate the configuration that ensures there is at most one request on the wire per partition: if(producerProperties.containsKey(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION) && producerProperties.get(Producer
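
For readers following the thread, a minimal sketch of how that producer setting might be pinned in a Samza job config. It assumes the Kafka system is registered under the hypothetical name "kafka" and relies on Samza passing `systems.<system-name>.producer.*` keys through to the Kafka producer. Note that Kafka applies this limit per broker connection. This is an illustration, not the configuration used by the posters.

```properties
# Assumption: the Kafka system is named "kafka" in this job; substitute your own system name.
# Allow at most one unacknowledged request per broker connection, so a retried
# batch cannot be reordered behind a later one.
systems.kafka.producer.max.in.flight.requests.per.connection=1
```

Setting the value to 1 trades some throughput for ordering, since the producer waits for each request to be acknowledged before sending the next one on that connection.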

Re: Samza 0.12.0 + synchronous KafkaProducer ?

2017-03-09 Thread Jagadish Venkatraman
*"If I make two send() calls (writing to same Kafka topic/partition), Samza is going to call those two send via KafkaProducer's send() method and wait for last send()'s future to complete. Is it guaranteed that these two messages will be delivered to Kafka broker in the order in which the send() wer

Re: build encode error

2017-03-09 Thread wu shaodong
Hi Jagadish, No, no, the problem is not the number of containers. Is it OK for one container to use 1G of memory? My task uses a great deal of memory. I want to understand this abnormal behavior. If the data-cleaning task creates 3 containers, how much memory should it use? No data is stored in memory. My task cre

Re: Samza 0.12.0 + synchronous KafkaProducer ?

2017-03-09 Thread Gaurav Agarwal
Hi Jagadish, I must be grossly overlooking something here: Samza already guarantees that for you. Currently, two *send()* calls that produce to the same topic partition are *always* delivered in-order. (regardless of whether you called commit, batched up or otherwise) > If I make two send() calls

Re: Samza 0.12.0 + synchronous KafkaProducer ?

2017-03-09 Thread Jagadish Venkatraman
To be clear, I'm merely saying we have not had a scenario for it yet. It's not unreasonable to imagine a scenario where you want the send to have succeeded, and then perform some action later in the same process call. *>> There is one more case to consider - what if a single process call sends multiple mes

Re: Samza 0.12.0 + synchronous KafkaProducer ?

2017-03-09 Thread Jagadish Venkatraman
> we use "synchronous" guarantee to ensure that the messages we emit from samza are delivered to Kafka linearly. Samza already guarantees that for you. Currently, two *send()* calls that produce to the same topic partition are *always* delivered in-order. (regardless of whether you called commit,

Re: Multi task Container Starvation

2017-03-09 Thread Ankit Malhotra
Hi Jagadish, I failed to mention an important detail, which is that we had change-logging on for the store. What is interesting is that most of our time is spent in the “send” method. I see from the code that we only send changelogs when we “putAllDirtyEntries()” from the object cache. Is ther

[GitHub] samza pull request #80: SAMZA-1112:BrokerProxy does not log fatal errors

2017-03-09 Thread twbecker
GitHub user twbecker opened a pull request: https://github.com/apache/samza/pull/80 SAMZA-1112:BrokerProxy does not log fatal errors Add an UncaughtExceptionHandler to the broker proxy thread so failures there get logged. You can merge this pull request into a Git repository by

Re: Multi task Container Starvation

2017-03-09 Thread Jagadish Venkatraman
1. What's the on-disk size of the store? (In one of our earlier experiments, when the state size was larger than 10G per partition, we observed writes slowing down.) 2. Can you benchmark how long writing to RocksDB takes on your SSD? You can look at https://github.com/apache/samza/blob/master/samza-test

Re: Multi task Container Starvation

2017-03-09 Thread Ankit Malhotra
Also, what baffles me a little is that the PUTs are taking 10x as long as the GETs. We’re running on SSDs and here is our config: stores.stage-store.write.batch.size=10 stores.stage-store.object.cache.size=20 stores.stage-store.rocksdb.num.write.buffers=3 stores.stage-store.rocksdb.compaction.styl

Re: Samza 0.12.0 + synchronous KafkaProducer ?

2017-03-09 Thread Gaurav Agarwal
There is one more case to consider: what if a single process call sends multiple messages? In this case, would we need to call checkpoint after every send() call inside the same process() call? That seems to be problematic, as once checkpointed, there is no safety net against any failures in subseq

Re: Samza 0.12.0 + synchronous KafkaProducer ?

2017-03-09 Thread Gaurav Agarwal
Hi Jagadish, please find reply inline: (it appears that there is no easy way today to guarantee ordered delivery of messages to Kafka from Samza without consuming the checkpointing flexibility). On Thu, Mar 9, 2017 at 11:01 PM, Jagadish Venkatraman < jagadish1...@gmail.com> wrote: > Hi Gaurav, >

Re: Samza 0.12.0 + synchronous KafkaProducer ?

2017-03-09 Thread Jagadish Venkatraman
Hi Gaurav, >> process->process->->doWork()->checkpoint->process.. What does *doWork()* do? Does it actually iterate over accumulated in-memory state, and send messages to Kafka? *>> I found the configuration 'batch.size' which says that "a batch size of zero will disable batching entirely"

Re: Multi task Container Starvation

2017-03-09 Thread Ankit Malhotra
Replies inline On 3/9/17, 11:24 AM, "Jagadish Venkatraman" wrote: I understand you are receiving messages from *all* partitions (but fewer messages from some partitions). Some questions: 1. Is it possible that you may have saturated the capacity of the entire contai

Re: build encode error

2017-03-09 Thread Jagadish Venkatraman
I'm not very clear what you mean. Perhaps, you could help me understand a bit more. *>>"But this task very very neey memory the one container to eat 1G memory!!"* You can certainly increase the number of containers (assuming you're running on Yarn). Please look at samza.apache.org/learn/documenta

Re: Multi task Container Starvation

2017-03-09 Thread Jagadish Venkatraman
I understand you are receiving messages from *all* partitions (but fewer messages from some partitions). Some questions: 1. Is it possible that you may have saturated the capacity of the entire container? 2. What is the time you spend inside *process* and *window* for the affected container? (How

Re: Multi task Container Starvation

2017-03-09 Thread Ankit Malhotra
Replies inline. -- Ankit > On Mar 9, 2017, at 12:34 AM, Jagadish Venkatraman wrote: > > We can certainly help you debug this more. Some questions: > > 1. Are you processing messages (at all) from the "suffering" containers? > (You can verify that by observing metrics/ logging etc.) Processi

Re: Re: hm...What's wrong with it?

2017-03-09 Thread Renato Marroquín Mogrovejo
Happy to help, Wu! Best, Renato M. 2017-03-09 9:58 GMT+01:00 wu shaodong : > Hi > > Thanks, man. > > After changing the Gradle version to 2.8, the build succeeded. > > Thanks very much! > > > > Sent from Mail for Windows 10 > > > > > > > *From: *wu shaodong > *Sent: *March 9, 2017

build encode error

2017-03-09 Thread wu shaodong
Hi everybody, here I come again. I’m using Samza to run a data-cleaning project. But this task needs a great deal of memory: one container eats 1G of memory! The task starts three containers, which will eat 3G of memory, plus 768M for the AM container. OMG! I’m only running six or seven tasks! It is my w