Review Request 33280: [SAMZA-561] Basic streaming SQL query planning support

2015-04-16 Thread Milinda Pathirage
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/33280/ --- Review request for samza and Yi Pan (Data Infrastructure). Bugs: SAMZA-561

Re: How to deal with bootstrapping

2015-04-16 Thread Benjamin Black
New-Rules-Job will need to know the complete map of partitions to offsets. On Thu, Apr 16, 2015 at 2:06 PM, jeremy p wrote: > Ben : I think we are talking about different things here. I'm not trying > to maintain ordering across a topic. I know that is not what Kafka and > Samza are meant for.

Re: How to deal with bootstrapping

2015-04-16 Thread jeremy p
Yan : It sounds like the checkpoint stream might help me! I would like to learn more about how New-Rules-Job can access the checkpoint stream for Old-Rules-Job. Can you please give me an example of how I would do this? Or could you please point me to some documentation or an article where I can l

Re: How to deal with bootstrapping

2015-04-16 Thread jeremy p
Ben : I think we are talking about different things here. I'm not trying to maintain ordering across a topic. I know that is not what Kafka and Samza are meant for. What I'm trying to do here is give my Old-Rules-Job a way of telling New-Rules-Job, "Once you hit this offset, start applying both

Re: How to deal with bootstrapping

2015-04-16 Thread Benjamin Black
If you need to maintain ordering of a sequence of messages, those messages should all be written to the same partition. If you are concerned with global ordering of all messages in a topic then Kafka is likely not going to be what you want. Ordering guarantees are strictly per partition. Samza is b
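
A minimal sketch of the per-partition ordering point above, assuming messages are routed by a hash of their key. The names `partition_for` and `route` are hypothetical, and CRC32 is only a stand-in for whatever partitioner the producer actually uses. Messages that share a key land in one partition and keep their relative order; nothing is implied about order across partitions.

```python
import zlib
from collections import defaultdict


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition with a deterministic hash (CRC32 as a stand-in)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route(messages, num_partitions):
    """Append each (key, value) message to the log of its partition, in arrival order."""
    partitions = defaultdict(list)
    for key, value in messages:
        partitions[partition_for(key, num_partitions)].append((key, value))
    return partitions


if __name__ == "__main__":
    msgs = [("a", 1), ("b", 1), ("a", 2), ("b", 2), ("a", 3)]
    logs = route(msgs, num_partitions=4)
    for partition, log in sorted(logs.items()):
        print(partition, log)
```

Within any one partition the values for a given key appear in the order they were sent, which is the only ordering guarantee the thread relies on.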

Re: How to deal with bootstrapping

2015-04-16 Thread Yan Fang
Hi Jeremy, Samza already has a checkpoint stream, which records the latest-processed offset. The new-job can reuse old-job's checkpoint stream. Thanks, Fang, Yan yanfang...@gmail.com On Thu, Apr 16, 2015 at 1:51 PM, jeremy p wrote: > Thank you for the response. Does this mean the Old-Rules-J

Re: How to deal with bootstrapping

2015-04-16 Thread jeremy p
Thank you for the response. Does this mean the Old-Rules-Job would need to maintain a Last-Processed-Old-Rules offset for each partition? On Thu, Apr 16, 2015 at 4:47 PM, Benjamin Black wrote: > Offsets are per partition. The alternative would have poor scaling behavior > for both brokers and c

Re: How to deal with bootstrapping

2015-04-16 Thread Benjamin Black
Offsets are per partition. The alternative would have poor scaling behavior for both brokers and consumers. On Thu, Apr 16, 2015 at 1:01 PM, jeremy p wrote: > Thanks to everybody for the responses! > > Yi : The queue must be processed in order, which means that I cannot use > Ben and Guozhang's

Re: How to deal with bootstrapping

2015-04-16 Thread jeremy p
Thanks to everybody for the responses! Yi : The queue must be processed in order, which means that I cannot use Ben and Guozhang's approach. However, it is not necessary that all rules be processed at the same offset and at the same speed. This is why I considered a solution where we had a separ

Re: How to deal with bootstrapping

2015-04-16 Thread Yi Pan
Hi, Jeremy, I saw the following requirements from your use case: 1) New rules need to be dynamically added w/o creating too many Samza jobs (e.g. 1 Samza job per new rule is too much) 2) Old rules need to continue processing when new rules are added I want to ask a few more questions regarding to

Re: Review Request 33142: [SAMZA-561] Review in progress

2015-04-16 Thread Milinda Pathirage
> On April 14, 2015, 10:14 p.m., Yi Pan (Data Infrastructure) wrote: > > samza-sql/src/main/java/org/apache/samza/sql/metadata/RelDataTypeToAvroSchemaConverter.java, > > line 28 > > One question here: it seems

Re: How to deal with bootstrapping

2015-04-16 Thread Yan Fang
You are able to call coordinator.shutdown to shut the job down after it reaches the offset. Thanks, Fang, Yan yanfang...@gmail.com On Thu, Apr 16, 2015 at 8:59 AM, Guozhang Wang wrote: > I feel Ben's solution is a bit simpler, in that you just need to restart your > current job with both rules on the

Re: Review Request 33142: [SAMZA-561] Review in progress

2015-04-16 Thread Milinda Pathirage
> On April 14, 2015, 10:14 p.m., Yi Pan (Data Infrastructure) wrote: > > samza-sql/src/main/java/org/apache/samza/sql/expressions/RexToJavaCompiler.java, > > line 102 > > Just a question, is it supposed to thr

Re: How to deal with bootstrapping

2015-04-16 Thread Guozhang Wang
I feel Ben's solution is a bit simpler, in that you just need to restart your current job with both rules at the checkpointed offset, and start a new job from offset 0 with only the new rule, which will stop at the checkpointed offset. But of course it requires the second job to be able to shutdown i
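
The catch-up job described above could be sketched roughly as follows. This is an illustration of the bookkeeping only, not Samza code: `CatchUpTracker`, `run_catch_up`, and the `target_offsets` map (partition to the offset the old job checkpointed) are hypothetical names, and a real job would read the targets from the checkpoint stream and request shutdown through the task coordinator instead of breaking out of a loop. It also assumes every partition in the map eventually delivers a message at or past its target.

```python
class CatchUpTracker:
    """Tracks which partitions have reached their checkpointed target offset."""

    def __init__(self, target_offsets):
        # Hypothetical input: partition id -> last offset checkpointed by the old job.
        self.target_offsets = dict(target_offsets)
        self.remaining = set(self.target_offsets)

    def observe(self, partition, offset):
        """Record a consumed message; return True once every partition is caught up."""
        if offset >= self.target_offsets[partition]:
            self.remaining.discard(partition)
        return not self.remaining


def run_catch_up(messages, target_offsets):
    """Apply the new rule from offset 0 up to each partition's target, then stop."""
    tracker = CatchUpTracker(target_offsets)
    processed = []
    for partition, offset in messages:
        if offset <= target_offsets[partition]:
            processed.append((partition, offset))  # the new rule would run here
        if tracker.observe(partition, offset):
            break  # stand-in for asking the job to shut down
    return processed


if __name__ == "__main__":
    stream = [(0, 0), (1, 0), (0, 1), (1, 1), (0, 2), (0, 3)]
    print(run_catch_up(stream, {0: 2, 1: 1}))
```

Because offsets are per partition, the stop condition is a map of partition to offset, not a single number, and the job only stops once all partitions have reached their targets.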

Re: Review Request 33146: New KeyValueStore Features

2015-04-16 Thread Mohamed Mahmoud (El-Geish)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/33146/ --- (Updated April 16, 2015, 10:43 a.m.) Review request for samza. Changes --