Re: Testing dev@samza.apache.org

2015-01-23 Thread Yi Pan
Yes! On Fri, Jan 23, 2015 at 9:49 AM, Chris Riccomini wrote: > Hey all, > > Could you please confirm that you're seeing this? I'm trying to verify the > TLP migration for: > > https://issues.apache.org/jira/browse/INFRA-9055 > > Cheers, > Chris >

Re: Streaming SQL - object models, ASTs and algebras

2015-01-28 Thread Yi Pan
Hi, Julian, First, welcome to the community! Let me try to answer some of your comments, in addition to what Jon, Milinda, and others have already commented on. In general, I don't think that our thoughts differ much. As you mentioned, the full stack would be a SQL parser -> AST -> a logic alg

Re: Streaming SQL - object models, ASTs and algebras

2015-01-28 Thread Yi Pan
ommon representation layer, or the physical operator API in Samza? Thanks! -Yi On Wed, Jan 28, 2015 at 10:02 AM, Yi Pan wrote: > Hi, Julian, > > First, welcome to join the community! Let me try to answer some of your > comments, in addition to what Jon, Milinda, and others already c

Re: Streaming SQL - object models, ASTs and algebras

2015-01-28 Thread Yi Pan
e academic meanings. But I am not sure how we would implement the window operator w/o the physical operators performing the relation/stream conversions. -Yi On Wed, Jan 28, 2015 at 2:01 PM, Julian Hyde wrote: > > On Jan 28, 2015, at 10:02 AM, Yi Pan wrote: > > > I try to understand

Re: Streaming SQL - object models, ASTs and algebras

2015-01-28 Thread Yi Pan
stays in streaming format all the way > through. > > My point was that stream-to-relation and relation-to-stream occur in EVERY > CQL query (and logical algebra) but do not necessarily occur in the > physical algebra. > > Julian > > > > On Jan 28, 2015, at 2:18

Re: Streaming SQL - object models, ASTs and algebras

2015-01-29 Thread Yi Pan
erious consideration. > An extended standard SQL is much more useful than a SQL-like language, and > I believe I have shown that we can add the necessary extensions to SQL > without destroying it. > > Building a SQL parser, validator, relational algebra, JDBC driver and > planni

Re: Streaming SQL - object models, ASTs and algebras

2015-01-29 Thread Yi Pan
ber of partitions for a stream has to be explicitly specified. Where do you suggest putting this information w/o breaking SQL syntax? On Thu, Jan 29, 2015 at 3:32 PM, Julian Hyde wrote: > > > On Jan 29, 2015, at 3:04 PM, Yi Pan wrote: > > > > Hi, Julian, > > > >

Re: Streaming SQL - object models, ASTs and algebras

2015-01-29 Thread Yi Pan
re blocking nature. > > Is it possible to do custom validations depending on the context? If I > > rephrase it, is validation logic extensible? > > > > Thanks > > Milinda > > > > On Thu, Jan 29, 2015 at 6:32 PM, Julian Hyde > wrote: > >

Re: Streaming SQL - object models, ASTs and algebras

2015-01-29 Thread Yi Pan
Thanks! Posted to SAMZA-390. On Thu, Jan 29, 2015 at 4:44 PM, Julian Hyde wrote: > > > On Jan 29, 2015, at 4:42 PM, Yi Pan wrote: > > > > One more, Julian, do you mind if I post your proposed SQL model to > > SAMZA-390? That way, more ppl can view it and we s

Re: Versioning beyond 0.9.0

2015-01-30 Thread Yi Pan
+1. On Fri, Jan 30, 2015 at 9:22 AM, Chris Riccomini wrote: > Hey all, > > I'm planning on opening a post-0.9.0 ticket. I was going to go with 0.10.0. > Is everyone good with that? Some people are kind of finicky about > double-digit minor version numbers. > > Cheers, > Chris >

Re: [DISCUSS] SQL workflow

2015-02-04 Thread Yi Pan
Just did the update w/ SAMZA-482. On Wed, Feb 4, 2015 at 10:31 AM, Chris Riccomini wrote: > I think so. There was some RB downtime, but it just got fixed. Yi, Navina, > Milinda, can you make sure your JIRAs have up to date RBs? > > On Wed, Feb 4, 2015 at 10:24 AM, sriram wrote: > > > Can we hav

Re: [DISCUSS] SQL workflow

2015-02-04 Thread Yi Pan
Hi, Jakob, Yes, for sure. -Yi On Wed, Feb 4, 2015 at 11:43 AM, Jakob Homan wrote: > This submodule will still be under the review-then-commit (RTC) > regime, correct? > > On 4 February 2015 at 11:13, Yi Pan wrote: > > Just did the update w/ SAMZA-482. > > > > O

Re: [DISCUSS] SQL workflow

2015-02-05 Thread Yi Pan
Hi, Jakob, > > Eh? Not sure what this means... > > I mean SAMZA-484 depends on SAMZA-482, and neither are committed. So Navina > is having to post Yi's patch, as well as her own, on the JIRA. It makes it > really hard to do code reviews because you can't tell whether Yi made the > changes or Navi

Re: [DISCUSS] SQL workflow

2015-02-08 Thread Yi Pan
+1 On Sat, Feb 7, 2015 at 8:38 AM, Chris Riccomini wrote: > Hey all, > > Are we +1 on this? I think Jakob was the only one who was curious about it. > > Cheers, > Chris > > On Thu, Feb 5, 2015 at 1:22 PM, Yi Pan wrote: > > > Hi, Jakob, > >

Window spec in SQL language vs Samza system details

2015-02-09 Thread Yi Pan
Hi, Julian and all, We had a discussion at LinkedIn last week regarding the window spec in the SQL language on top of Samza systems. There are some issues in the window spec that I want to discuss: Consider that we want to have a count of stock trades (as an infinite stream) that happened in the last ho

Re: Window spec in SQL language vs Samza system details

2015-02-10 Thread Yi Pan
t, you might break > composability, and that is a huge problem. Or you end up writing the > planner so that it produces the right plan when it sees the query in its > sugared version but not when expressed using the fundamentals (case in > point: if we had introduced a "tumbling window"

Re: Terminology: Tumbling and sliding windows

2015-02-17 Thread Yi Pan
+1 on consolidating the terminology as well. Azure's definition looks good to me. On Tue, Feb 17, 2015 at 9:13 AM, Chris Riccomini wrote: > Hey Julian, > > +1 I'm not sure if we actually *are* using the right terminology, but I > agree that Azure's terminology is what we should use. I think this

Re: Stream SQL for Samza Query Language Guide and Design

2015-02-17 Thread Yi Pan
I shared the same pain with wiki before. Either cwiki or Markdown sounds good to me. -Yi On Tue, Feb 17, 2015 at 9:53 AM, Chris Riccomini wrote: > Hey Milinda, > > Yea, I agree. Confluence is better than Moin Moin. If others agree, I think > we should just switch to Confluence. > > So, shall we

Re: Reprocessing and windowing

2015-02-23 Thread Yi Pan
Hey, Geoffry, We have started some work in SAMZA-552 to create a window operator API in Samza, as part of the effort to implement support for a high-level language. I will probably be able to have something to share in a few days and would love to get feedback regarding the window operator. Thank

Re: Test failure in samza-sql branch

2015-02-25 Thread Yi Pan
Hi, Milinda, I have seen a similar intermittent test failure on my boxes as well, just did not have time to dig into it yet. It seems to be a timing issue in the unit test. Could you open a JIRA s.t. we don't forget it? -Yi On Wed, Feb 25, 2015 at 6:54 AM, Milinda Pathirage wrote: > Hi Chris,

Re: Handling defaults and windowed aggregates in stream queries

2015-03-01 Thread Yi Pan
Hi, Milinda, Sorry to reply late on this. Here are some of my comments: 1) In Calcite's model, it seems that there is no stream-to-relation conversion step. In the first example where the window specification is missing, I like your solution to add the default LogicalNowWindow operator s.t. it mak

Re: Handling defaults and windowed aggregates in stream queries

2015-03-02 Thread Yi Pan
out a way to move the window to the input stream if > Calcite can move the window out from Project. I’ll see how we can do this. > > Also I’ll go ahead and implement default windows. We can change it later if > Julian or someone from Calcite comes up with a better suggestion. > > Th

Re: Handling defaults and windowed aggregates in stream queries

2015-03-05 Thread Yi Pan
licit window. > > > > In the algebra, we will start introducing Chi. It will evaporate for > > simple queries such as Filter. It will remain for more complex queries > such > > as stream-to-stream join, because you are joining the current row of one > > stream to a tim

Re: Handling defaults and windowed aggregates in stream queries

2015-03-06 Thread Yi Pan
> > interval '1' hour > > > > > > > > Query 6 is equivalent to query 5. But the system can notice the join > > > > condition involving the two streams' rowtimes and trim down the > windows > > > > (one window to an hour, another window t

Re: Handling defaults and windowed aggregates in stream queries

2015-03-06 Thread Yi Pan
hu, Mar 5, 2015 at 2:42 PM, Milinda Pathirage > wrote: > > > Hi Yi, > > > > Please find my comments inline. > > > > On Thu, Mar 5, 2015 at 1:18 PM, Yi Pan wrote: > > > >> Hi, Milinda, > >> > >> We have recently some d

A question regarding to the default semantic meaning of join

2015-03-06 Thread Yi Pan
Hi, Julian, I am writing down some detailed examples of join and need your further help in understanding the semantic meaning of the following example: SELECT id, value, cost FROM Orders OVER (ROWS 3 PRECEDING) JOIN Shipments OVER (ROWS 3 PRECEDING) ON Orders.id = Shipments.id In this example, i

Re: A question regarding to the default semantic meaning of join

2015-03-09 Thread Yi Pan
ns about joining on rowtime, then > your query and queries (1), (2), (3) are valid, but due to their > different windows are not equivalent. > > Julian > > On Fri, Mar 6, 2015 at 4:28 PM, Yi Pan wrote: > > Hi, Julian, > > > > I am writing down some detailed examples

Re: A question regarding to the default semantic meaning of join

2015-03-13 Thread Yi Pan
as a > short semantics email. :) > > > On Mar 9, 2015, at 12:48 PM, Yi Pan wrote: > > > > Hi, Julian, > > > > Thanks for the reply. I want to make sure that I understand your > > explanation on windows in JOIN more explicitly. > > For the following

Re: Samza questions

2015-03-26 Thread Yi Pan
Hi, Ori, My interpretation of the MV usage in Martin's talk is exactly what you have mentioned: it is considered a "view" instead of a regular table in a DB, hence read-only and possibly derived data that has already gone through the business logic. On Thu, Mar 26, 2015 at 1:55 PM, Yan Fang wrote

Re: [VOTE] Apache Samza 0.9.0 RC0

2015-03-26 Thread Yi Pan
I have run the integration test suite w/ 0.9.0-rc0. There was an issue w/ the integration test (SAMZA-621), but the test suite passed after I manually created a symlink to the file name the test script is looking for. Hence, +1 on the release. On Thu, Mar 26, 2015 at 5:39 PM, Roger Hoo

Re: Kafka Question

2015-03-31 Thread Yi Pan
Hi, Shekar, For windowing and SQL-like features, please watch the following tickets: SAMZA-552, SAMZA-561, SAMZA-562. As Chris said, we are still actively designing and developing those features in the samza-sql branch, and will merge it back to master at a later point. Cheers! -Yi On Tue, Mar 31, 20

Re: Stream SQL Query Planner Update

2015-04-06 Thread Yi Pan
Hi, Milinda, Great! Thanks for making the excellent progress in this! I will try to follow up with the patch today. Thanks! -Yi On Mon, Apr 6, 2015 at 11:00 AM, Milinda Pathirage wrote: > Hi All, > > I have attached a patch to SAMZA-561 ( > https://issues.apache.org/jira/browse/SAMZA-561) wh

Re: Joining Avro records

2015-04-09 Thread Yi Pan
Hi, Roger, Good question on that. I am actually not aware of any "automatic" way of doing this in Avro. I have tried to add generic Schema and Data interface in samza-sql branch to address the morphing of the schemas from input streams to the output streams. The basic idea is to have wrapper Schem

Re: Updating samza-sql branch to Java 1.7

2015-04-14 Thread Yi Pan
Hi, Milinda, Jakob already committed a change to remove Java 1.6 support in SAMZA-646. I think that it would be fine to move the samza-sql branch to Java 1.7. Regards. -Yi On Tue, Apr 14, 2015 at 12:47 PM, Milinda Pathirage wrote: > Hi Devs, > > Calcite dropped support for Java 1.6 in 1.1.0-incub

Re: Updating samza-sql branch to Java 1.7

2015-04-14 Thread Yi Pan
Hi, Chris, Let me do it. Thanks! -Yi On Tue, Apr 14, 2015 at 2:19 PM, Chris Riccomini wrote: > @Yi, are you going to merge master into the samza-sql branch, or should I? > > On Tue, Apr 14, 2015 at 2:02 PM, Yi Pan wrote: > > > Hi, Milinda, > > > > Jacob alread

Re: Updating samza-sql branch to Java 1.7

2015-04-14 Thread Yi Pan
Merged master to samza-sql. On Tue, Apr 14, 2015 at 2:57 PM, Jakob Homan wrote: > Yes, I removed the tests for JDK6 yesterday. We're 1.7 or above now > for development. > > On 14 April 2015 at 12:47, Milinda Pathirage > wrote: > > Hi Devs, > > > > Calcite dropped support for Java 1.6 in 1.1.0-

Re: How to deal with bootstrapping

2015-04-16 Thread Yi Pan
Hi, Jeremy, I saw the following requirements from your use case: 1) New rules need to be dynamically added w/o creating too many Samza jobs (e.g. 1 Samza job per new rule is too much) 2) Old rules need to continue processing when new rules are added I want to ask a few more questions regarding

Re: Questions about partitioning

2015-04-24 Thread Yi Pan
Hi, Susan, Welcome to Samza! First I will try to answer your question about partition assignment in Samza. The assignment from stream partitions to Samza tasks is determined by the SystemStreamPartitionGrouper. The default implementations include two assignment methods: 1 task per system stream par

Re: Errors and hung job on broker shutdown

2015-04-28 Thread Yi Pan
Roger, could you paste the full log from Samza container? If you can figure out which Kafka broker the message was sent to, it would be helpful if we get the log from the broker as well. On Tue, Apr 28, 2015 at 3:31 PM, Roger Hoover wrote: > Hi, > > I need some help figuring out what's going on.

Re: What next for streaming SQL?

2015-05-04 Thread Yi Pan
Hi, Julian, Thanks for the reply. I want to add a few more points here: {quote} Once you have computed that boundary and stored it in your data structure you can keep on adding rows until you see one rowtime 11:00:00 or higher. {quote} The above is not true when the incoming messages in the real

Re: What next for streaming SQL?

2015-05-04 Thread Yi Pan
Mon, May 4, 2015 at 12:22 AM, Yi Pan wrote: > Hi, Julian, > > Thanks for the reply. I want to add a few more points here: > {quote} > Once you have computed that boundary and stored it in your data structure > you can keep on adding rows until you see one rowtime 11:00:00 o

Re: Questions regarding Samza in production

2015-05-05 Thread Yi Pan
Hi, Jose, Good to know that you chose Samza! I will embed my answers inline below: On Mon, May 4, 2015 at 5:02 PM, José Barrueta wrote: > > - I assume caching will help a lot with serialization/deserialization of > the Value, but have you guys used the value to be of type other than > primiti

Re: Local state in Samza - sharing data between tasks

2015-05-05 Thread Yi Pan
Hi, Andreas, Are you describing a use case where the *same* copy of data is shared among all tasks? That will depend on a lot of factors: 1. Is your data size huge? 2. Can your data be partitioned to work with a single partition of input stream? 3. Do you have a means to bootstrap the data from a str

Re: What next for streaming SQL?

2015-05-05 Thread Yi Pan
Hi, Julian, Great! I am looking forward to it. Could you help answer my question regarding the sliding windows in the previous email? Thanks a lot! -Yi On Tue, May 5, 2015 at 10:46 AM, Julian Hyde wrote: > > On May 4, 2015, at 10:52 AM, Yi Pan wrote: > > > Just one obs

Re: Quick question regarding deserialization

2015-05-11 Thread Yi Pan
Hi, Jose, Please refer to the configuration wiki: http://samza.apache.org/learn/documentation/0.9/jobs/configuration-table.html Samza actually allows multiple Serde classes to be defined for different topics, as long as you don't have multiple schemas of messages in the same topic. Best, -Yi On Mon

Re: Log rotation on Samza/yarn logs

2015-05-14 Thread Yi Pan
Hi, Shekar, Are you having a problem w/ retention of too many old log files on disk? I did a quick search online to see whether there is any configuration for DailyRollingFileAppender and couldn't find any. The closest thing is this one: http://wiki.apache.org/logging-log4j/DailyRollingFileAppende

Re: Samza job throughput much lower than Kafka throughput

2015-05-20 Thread Yi Pan
Hi, George, Could you share w/ us the code and configuration of your sample test job? Thanks! -Yi On Wed, May 20, 2015 at 1:19 PM, George Li wrote: > Hi, > > We are evaluating Samza's performance, and our sample job with > TestPerformanceTask is much slower than a program reading directly from

Library version conflict issues

2015-05-20 Thread Yi Pan
Hi, all, Just curious about one thing: - Samza as a platform brings in a set of dependency libraries - Applications developed in Samza may bring in other libraries that conflicts w/ the Samza libraries (we have got one use case that requires jackson 1.4.2 which conflicts with jackson 1.8.5 that Sa

Re: Do we want to release the 0.9.1 now?

2015-05-21 Thread Yi Pan
Hi, Yan, I am voting to start it now. Guozhang has already signed up to follow the release process that Chris wrote up. There will be an announcement soon. Thanks! -Yi On Thu, May 21, 2015 at 2:21 PM, Yan Fang wrote: > Hi guys, > > Just ask, are there any other bugs that we want to back port

Re: Do we want to release the 0.9.1 now?

2015-05-21 Thread Yi Pan
SAMZA-658 (fix cached store iterator remove() function), > SAMZA-608 (don't hang on serde errors in system consumers) and > SAMZA-616 (make shutdown hook wait for container to finish) are all > bug fixes that I think should be eligible for a 0.9.1 point release. > Should we work

Re: Samza YarnJobFactory support for https

2015-05-21 Thread Yi Pan
Hi, Jose, Thanks a lot! I have opened a JIRA to support that: SAMZA-688. -Yi On Thu, May 21, 2015 at 8:03 PM, José Barrueta wrote: > Hi all, > > Once we figured out the problem we were able to easily come up with a > solution for this. > > Basically, we want to be able to set the `yarn.pac

Re: Do we want to release the 0.9.1 now?

2015-05-21 Thread Yi Pan
above and if you can give a +1 to move forward quickly with 0.9.1 release, that would be great! Thanks a lot! -Yi On Thu, May 21, 2015 at 4:21 PM, Yi Pan wrote: > Hi, Jakob, > > Thanks a lot for the thorough check-through. I agree w/ your point that > those bug fixes are important a

Re: Offset (counter) as String - why?

2015-05-28 Thread Yi Pan
Hi, Michael, The reason that the offset in the IncomingMessageEnvelope is of type String instead of long is the following: if you integrate a non-Kafka messaging system with Samza, there is no guarantee that the offset is of type long. E.g. ActiveMQ uses a composite format from connectionId, sessionId, produ
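
The point about opaque offsets can be illustrated with a minimal Python sketch. This is not Samza code: the `Envelope` class, the `kafka_offset_compare` helper, and the composite offset string are all hypothetical, made up only to show why a framework that supports several messaging systems would carry offsets as strings and leave their interpretation to system-specific code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Envelope:
    """Hypothetical stand-in for a message envelope; not the Samza class."""
    stream: str
    offset: str  # opaque to the framework, interpreted per messaging system
    message: object


def kafka_offset_compare(a: str, b: str) -> int:
    """Compare two Kafka-style offsets, which are decimal longs in string form.

    Returns -1, 0, or 1. Only valid for systems whose offsets are numeric.
    """
    x, y = int(a), int(b)
    return (x > y) - (x < y)


# A numeric offset and a made-up composite offset fit the same field.
kafka_msg = Envelope("kafka.page-views", "10", b"payload")
other_msg = Envelope("othermq.orders", "conn-1:session-2:42", b"payload")

# Plain string comparison is NOT a valid ordering for numeric offsets,
# which is why the comparison has to live in system-specific code.
lexicographic_says_smaller = "10" < "9"
numeric_result = kafka_offset_compare("10", "9")
```

In this sketch the framework only stores and passes the `offset` string around; anything that needs ordering calls a comparator supplied for that particular system.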

Re: ProcessJobFactory parent process

2015-05-29 Thread Yi Pan
Hi, Lukas, I assume that when you say "the job crashes", you were referring to the child process running the container, not the parent process? If yes, we were actually talking about adding container health-check/failure-detection in the JobCoordinator. SAMZA-680 would be a good place to start t

Re: ProcessJobFactory parent process

2015-05-29 Thread Yi Pan
15 at 12:59 PM, Lukas Steiblys wrote: > Yes, I'm talking about the child process crashing. I'd like the parent to > die as well if the child crashes so Docker can understand that the process > failed and restart the container. > > Lukas > > -----Original Message-

Re: ProcessJobFactory parent process

2015-06-01 Thread Yi Pan
ll for > > you. > > > > Thanks, > > Michael > > > > On Sat, May 30, 2015 at 12:22 AM, Lukas Steiblys > > wrote: > > > > > Yes, I think switching to ThreadJobFactory is a good solution. I think > > the > > > reasons why I switch

Re: [2/2] samza git commit: Yi's TopologyBuilder RB 34500

2015-06-01 Thread Yi Pan
Hi, Milinda, That was an accidental mistake. I have reverted the check-in. I am still working on that. Thanks! -Yi On Mon, Jun 1, 2015 at 9:34 PM, Milinda Pathirage wrote: > Hi Navina, > > Did we decide to push this patch to samza-sql branch? I thought Yi is > still working on this. Some Git

Re: Containers stuck in event loop

2015-06-02 Thread Yi Pan
Hi, Davide, Which version of Samza are you using now? Did you check SAMZA-608? It seems to me that you may be experiencing that bug. We are including this fix in the upcoming release soon. Regards! -Yi On Tue, Jun 2, 2015 at 12:44 AM, Davide Simoncelli wrote: > Hello, > > I have had problems

Re: Samza Consumer / Producer Question

2015-06-08 Thread Yi Pan
Hi, Chas, Could you share your job configuration as well? Thanks! -Yi On Mon, Jun 8, 2015 at 8:42 AM, Chas Personal wrote: > Hello, > > > > I am working on a Samza script currently and had a couple questions. I am > able to work with the Hello-Samza application and have been able to add to >

Confluent wiki pages are down

2015-06-12 Thread Yi Pan
Hi, all, Just FYI that the cwiki links are down now. I have filed an infra ticket for that: INFRA-9806 - Cwiki site down for Samza -Yi

Re: [DISCUSS] Samza 0.9.1 release

2015-06-16 Thread Yi Pan
+1 Agreed. Thanks! On Tue, Jun 16, 2015 at 10:15 AM, Yan Fang wrote: > Agreed on this. > > Thanks, > > Fang, Yan > yanfang...@gmail.com > > On Tue, Jun 16, 2015 at 10:14 AM, Guozhang Wang > wrote: > > > Hi all, > > > > We have been running a couple of our jobs against `0.9.1` branch last > wee

Re: [DISCUSS] Samza 0.9.1 release

2015-06-16 Thread Yi Pan
Hi, Shekar, This 0.9.1 is a bug-fix only release. No features added yet. New features are expected in 0.10.0. Thanks! On Tue, Jun 16, 2015 at 10:59 AM, Shekar Tippur wrote: > Wang, > > I have not caught up but can you please highlight if there are any feature > additions as well? > > - Shekar

Re: Measuring Samza Job Throughput

2015-06-17 Thread Yi Pan
Hi, Milinda, Tao @LinkedIn has done some Samza benchmark test using a standard word-count task. You may want to reach out to him for some detailed ideas on how to set up the perf tests. Best! -Yi On Wed, Jun 17, 2015 at 11:25 AM, Milinda Pathirage wrote: > Thank you all for the ideas. I'll ha

Re: [VOTE] Apache Samza 0.9.1 RC0

2015-06-19 Thread Yi Pan
+1. Ran the Samza failure test suite overnight and it succeeded. On Wed, Jun 17, 2015 at 5:54 PM, Guozhang Wang wrote: > Hey all, > > This is a call for a vote on a release of Apache Samza 0.9.1. This is a > bug-fix release against 0.9.0. > > The release candidate can be downloaded from here: > >

Re: [VOTE] Apache Samza 0.9.1 RC0

2015-06-22 Thread Yi Pan
> Thanks, > > > > > > > > Fang, Yan > > > > yanfang...@gmail.com > > > > > > > > On Sat, Jun 20, 2015 at 3:46 PM, Guozhang Wang > > > wrote: > > > > > > > > > Since we only get one vote so far, I t

Re: [VOTE] Apache Samza 0.9.1 RC0

2015-06-22 Thread Yi Pan
> > Yan, > > > > I tested to patch locally and it looks good. Creating a patched release > > for myself to test in our environment. Thanks, again. > > > > Sent from my iPhone > > > > > On Jun 22, 2015, at 10:59 AM, Yi Pan wrote: > > >

Re: [VOTE] Apache Samza 0.9.1 RC0

2015-06-22 Thread Yi Pan
, 2015 at 5:25 PM, Yan Fang wrote: > Hi Yi Pan, > > " Is there any document regarding to how to publish the maven staging link? > " > > -- Yes. Check the last part of the > https://github.com/apache/samza/blob/master/RELEASE.md . Not sure if you > have seen this. I

Re: [SAMZA-690] Changelog topic creation should not be in the container code

2015-06-25 Thread Yi Pan
Hi, Robert, Thanks for digging into this. I am embedding my answers below: On Thu, Jun 25, 2015 at 7:40 AM, Robert Zuljevic wrote: > 1. Is checkpoint topic referred to in the description coordinator > stream/topic? > In the master branch, checkpoint topic is deprecated (except for migr

Re: Installing Samza w/o internet connection

2015-06-25 Thread Yi Pan
Hi, Amos, I assume that you are referring to "preparing the build environment for Samza source code". As Milinda said, to set up the build environment, you will need a) an Internet connection to download required packages from Maven; b) a cached collection of required packages on your local machine

Re: [VOTE] Apache Samza 0.9.1 RC0

2015-06-25 Thread Yi Pan
r completing the vote, you can "release" the artifacts to the public > repository by clicking the "release" button. :) > > Thanks, > > Fang, Yan > yanfang...@gmail.com > > On Mon, Jun 22, 2015 at 5:30 PM, Yi Pan wrote: > > > Hi, Yan, > >

Re: Triggering emits for streaming window aggregates

2015-06-26 Thread Yi Pan
Hi, Milinda, I thought that in your example, the ordering field is given in GROUP BY. Are we missing a way to pass the ordering field(s) to the LogicalAggregate? -Yi On Fri, Jun 26, 2015 at 10:49 AM, Milinda Pathirage wrote: > Hi Julian, > > Even though this is a general question across all th

[VOTE] Apache Samza 0.9.1 RC1

2015-06-28 Thread Yi Pan
Hey all, This is a call for a vote on a release of Apache Samza 0.9.1. This is a bug-fix release against 0.9.0. The release candidate can be downloaded from here: http://people.apache.org/~nickpan47/samza-0.9.1-rc1/ The release candidate is signed with pgp key 911402D8, which is included in the

Re: Samza and sliding window

2015-06-29 Thread Yi Pan
Hi, Shekar, First, I would like to clarify what you meant by sliding window: is it defined as windows with size N and advance step size of 1 (which means that windows overlap and each input message would contribute to multiple counts in different windows)? Or windows with size N and advance step s
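
The two readings of "sliding window" in this question can be made concrete with a small Python sketch. It is an illustration only, not Samza's window API: the function name `window_starts` is hypothetical, and it assumes windows are counted in messages, cover the range `[start, start + size)`, and start at multiples of the advance step.

```python
def window_starts(seq: int, size: int, step: int) -> list[int]:
    """Return the start positions of every window that contains message `seq`.

    Windows are [start, start + size) with start = 0, step, 2 * step, ...
    """
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    # Smallest step-aligned start that is still greater than seq - size,
    # clamped at 0 because no window starts before the first message.
    first = max(0, ((seq - size) // step + 1) * step)
    return list(range(first, seq + 1, step))


# Size 3, advance step 1: windows overlap, so message 5 is counted in
# three different windows.
overlapping = window_starts(5, size=3, step=1)

# Size 3, advance step 3: windows do not overlap, so message 5 is counted
# in exactly one window.
disjoint = window_starts(5, size=3, step=3)
```

With an advance step of 1, each message contributes to up to `size` windows, whereas with an advance step equal to the window size each message contributes to exactly one, which is the distinction the question is trying to pin down.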

Re: Hopping and tumbling windows in streaming SQL

2015-06-29 Thread Yi Pan
Hey, Julian, That's awesome! I read through all the examples and it is really easy to express most of our use cases now! Thanks a lot! I have just a few additional points here: Q5. Aligned tumbling window TUMBLE does not have an align argument, so you need to use HOP. SELECT STREAM START(rowti

Re: SamzaAppMaster failling on yarn 2.5.1

2015-07-01 Thread Yi Pan
Hi, Nelson, We build and test Samza against YARN-2.5. There should not be an incompatibility issue here. From your logs, it seems that it is a security exception. Could you let us know your YARN site configuration? Is there any security mechanism configured in your YARN cluster that requires the A

Re: Samza and sliding window

2015-07-01 Thread Yi Pan
$map$1.apply(TraversableLike.scala:244) > > > > > > > > > > at > > > > > > > > > > > > > > > > > > > > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) > > > > >

Re: [VOTE] Apache Samza 0.9.1 RC1

2015-07-01 Thread Yi Pan
> >> > >> On Tue, Jun 30, 2015 at 2:10 AM, Yan Fang wrote: > >> > >> > +1 > >> > > >> > Verified MD5, Signature. > >> > > >> > Tested locally. > >> > > >> > Thanks, > >> > > >> > Fa

Re: Samza and sliding window

2015-07-01 Thread Yi Pan
Hi, Shekar, Could you attach the complete config file here? It would be hard to debug from just snippets of your configuration file. Thanks! -Yi On Wed, Jul 1, 2015 at 5:59 PM, Shekar Tippur wrote: > Sorry, after re-reading the docs, > > https://samza.apache.org/learn/documentation/0.8/jobs/

Re: [VOTE] Apache Samza 0.9.1 RC1

2015-07-02 Thread Yi Pan
n weekend- or holiday-straddling votes for > five days. > > > > -Jakob > > > > > > On 1 July 2015 at 20:49, Milinda Pathirage > wrote: > >> +1 for extending the voting period. > >> > >> Thanks > >> Milinda > >> >

Re: Thoughts and obesrvations on Samza

2015-07-02 Thread Yi Pan
Hi, all, Thanks Chris for sending out this proposal and Jay for sharing the extremely illustrative prototype code. I have been thinking it over many times and want to list out my personal opinions below: 1. Generally, I agree with most of the people here on the mailing list on two points:

Re: Thoughts and obesrvations on Samza

2015-07-02 Thread Yi Pan
he actual > resource assignment, process restart, etc, right? Is the additional value > add of the JobCoordinator just partition management? > > -Jay > > On Thu, Jul 2, 2015 at 11:32 AM, Yi Pan wrote: > > > Hi, all, > > > > > > Thanks Chris for sending o

Re: Thoughts and obesrvations on Samza

2015-07-02 Thread Yi Pan
that we need a pluggable partition management component, decoupled from the framework to do resource assignment, process restart, etc. On Thu, Jul 2, 2015 at 2:35 PM, Yi Pan wrote: > @Jay, yes, the current function in the JobCoordinator is just partition > management. Maybe we should just c

Re: Thoughts and obesrvations on Samza

2015-07-02 Thread Yi Pan
"containers") to run processes, while Samza > still needs to handle task assignment / scheduling like which tasks should > be allocated to which containers that consume from which partitions, etc. I > think this is what Yi meant for "partition management"? > > On Thu,

Re: Thoughts and obesrvations on Samza

2015-07-02 Thread Yi Pan
s to allow the job to control partition > assignment without having to deploy a custom partition assignment strategy > to the Kafka broker, is that right? > > The regex support and dynamic topic discovery you get for free as the > consumer needs to do that anyway. > > -Jay >

Re: Thoughts and obesrvations on Samza

2015-07-02 Thread Yi Pan
ake the a hard assertion on the server. > > So it may make sense to revisit this, I don't think it is necessarily a > massive change and would give more flexibility for the variety of cases. > > -Jay > > On Thu, Jul 2, 2015 at 3:38 PM, Yi Pan wrote: > > > @Guo

Re: Samza and sliding window

2015-07-02 Thread Yi Pan
Hi, Shekar, Sorry I was not able to follow up w/ you in time. It is great that you have found the configuration problem and made it work! As for the exception on the iterator, could you send us the log w/ the exception? Thanks! -Yi On Thu, Jul 2, 2015 at 4:36 PM, Shekar Tippur wrote: > Yi, > > L

Re: Thoughts and obesrvations on Samza

2015-07-06 Thread Yi Pan
Hi, Gianmarco, {quote} However, I think the fundamental operation that Samza, Copycat, and Kafka consumers should agree upon is "how can I specify in a simple and transparent way which partitions I want to consume, and how?". {quote} I agree that some basic partition distribution mechanism can be

Re: Thoughts and obesrvations on Samza

2015-07-06 Thread Yi Pan
Hi, Guozhang, {quote} but I think if we decide to go this route we'd better do it now than later as the protocol is not officially "released" yet. This may delay the first release of the new consumer. {quote} I totally agree. Given the potentially heavy migration cost later, I think that a slight d

Re: Thoughts and obesrvations on Samza

2015-07-06 Thread Yi Pan
Hi, Martin, Great to hear your voice! I will just try to focus on your questions regarding the "w/o YARN" part. {quote} For example, would host affinity (SAMZA-617) still be possible? {quote} It would be possible if we separate the job execution/process launching from the partition assignment amon

Re: Thoughts and obesrvations on Samza

2015-07-06 Thread Yi Pan
ll your > other stuff is tied to (say) Mesos/Marathon, and the layer of indirection > obscures the interface the user is already familiar with from other systems. > But I think I may actually be misunderstanding your proposal... > > -Jay > > On Mon, Jul 6, 2015 at 11:30 AM, Yi Pan w

Re: Thoughts and obesrvations on Samza

2015-07-06 Thread Yi Pan
tally unaware of the execution framework it uses. The job submission/configuration/launching tools can be completely isolated from the Samza container as a process, ideally. On Mon, Jul 6, 2015 at 3:13 PM, Yi Pan wrote: > @Jay, you got my point. > > {quote} > I think the question is wh

Re: Samza and sliding window

2015-07-06 Thread Yi Pan
2015 at 5:36 PM, Shekar Tippur > > wrote: > > >> > > >>> Yi, > > >>> > > >>> There is no exception. I want to do couple of things in the window. > > >>> > > >>> - Get all the keys and values and publish to an

Re: [VOTE] Apache Samza 0.9.1 RC1

2015-07-07 Thread Yi Pan
Hi, all, Is the vote done? We have got 4 binding and 2 non-binding votes for +1 so far. Thanks! -Yi On Mon, Jul 6, 2015 at 12:45 PM, Martin Kleppmann wrote: > +1 (binding) on RC1. Verified sig, built, tested with hello-samza. > > On 2 Jul 2015, at 19:22, Yi Pan wrote: >

Re: [VOTE] Apache Samza 0.9.1 RC1

2015-07-08 Thread Yi Pan
Hi, all, If there is no objection, I plan to close this vote as passed today. So far, counting the +1 vote from myself, we have got: RC1: +1 (binding) x 5 and +1 (non-binding) x 2 Thanks! -Yi On Tue, Jul 7, 2015 at 11:10 AM, Yi Pan wrote: > Hi, all, > > Is the vote done? We have got

Re: Powered by page update

2015-07-08 Thread Yi Pan
Hey, all, Reviving this thread. It would be really nice if we can update the Powered-by page when releasing 0.9.1. Thanks a lot! -Yi On Tue, Jun 16, 2015 at 5:31 PM, Chris Riccomini wrote: > Hey all, > > I'm seeing a lot of new faces on the mailing list, which is really awesome. > I want to i

Re: Thoughts and obesrvations on Samza

2015-07-09 Thread Yi Pan
Hi, Julian and Martin, Good point on community-merging vs project-merging and good summary! For Julian's point #2, I think that he was referring to the support to integrate w/ a cluster job execution framework, like YARN/Mesos/AWS. And who (i.e. the community) and which project (i.e. code) would

[RESULT][VOTE] Apache Samza 0.9.1 RC1

2015-07-09 Thread Yi Pan
Hey all, It looks like RC1 passed the vote: +1 (binding) x 5 (Martin, Jakob, Yan, Yi, Chris) I will publish a blog post for the 0.9.1 release this afternoon. Thanks everyone! -Yi On Wed, Jul 8, 2015 at 10:58 AM, Yi Pan wrote: > Hi, all, > > If there is no objection, I plan to close thi

Re: Thoughts and obesrvations on Samza

2015-07-12 Thread Yi Pan
Hi, Chris, Thanks for sending out this concrete set of points here. I agree w/ all but have a slightly different point of view on 8). My view on this is: instead of sunsetting Samza as a TLP, can we re-charter the scope of Samza to be the home for "running streaming process as a service"? My main motivatio

Re: Thoughts and obesrvations on Samza

2015-07-12 Thread Yi Pan
, 2015 at 7:29 PM, Yi Pan wrote: > Hi, Chris, > > Thanks for sending out this concrete set of points here. I agree w/ all > but have a slightly different point of view on 8). > > My view on this is: instead of sunsetting Samza as a TLP, can we re-charter the > scope of Samza to be the home f

Re: Thoughts and obesrvations on Samza

2015-07-13 Thread Yi Pan
Hi, Jay, Given all the user concerns and the board disagreement on sub-projects, I am supporting your 5th option as well. As you said, even if the end goal is the same, it might help to pave a smooth path forward. One thing I learned over the years is that what we planned for may not be the final produc
