----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/47835/#review148787 -----------------------------------------------------------
I started prototyping the following usecase: ` * A task that joins PageViewEvent {userId, pageUrl, timeStamp} with the UserProfile stream {userId, region}, and then counts the number of * page views by region. It outputs a kafka message {region, pageUrl, count}` and have my initial set of comments. I will follow up with you offline. samza-operator/src/main/java/org/apache/samza/operators/api/MessageStreams.java (line 37) <https://reviews.apache.org/r/47835/#comment216309> What do you think about adding a getSystemStreamPartition() function in the `SystemMessageStream` class? The following maybe an issue with the current design: In the `initOperator` callback that the user receives, how will a user know what SSP that a `SystemMessageStream` is for? For example, depending on what stream I'm consuming I would want to initialize them / perform different transforms on them. samza-operator/src/test/java/org/apache/samza/task/JoinOperatorTask.java (line 40) <https://reviews.apache.org/r/47835/#comment216294> I wonder if there's a better way to express the 'init' of the sources (instead of providing a callback that contains `a` single `MessageStream`. samza-operator/src/test/java/org/apache/samza/task/JoinOperatorTask.java (line 47) <https://reviews.apache.org/r/47835/#comment216295> The current implementation seems to rely on state maintained across multiple invocations of `initOperators` to correctly invoke the join function. Suggestion 1: I wonder if we could pass in list of all `SystemMessageStreams` in this method? (Or perhaps - declaratively create a `MessageStream` in code doing a `new MessageStream(systemStreamPartition);`? That way users can do a one-shot initialization? Suggestion 2: What if we just pass in a `List<SystemStreamPartition>` objects and users can choose to instantiate `messageStreams` and register them. Please let me know if I don't fully understand the constraints of the design! samza-operator/src/test/java/org/apache/samza/task/JoinOperatorTask.java (line 48) <https://reviews.apache.org/r/47835/#comment216308> Curious as to why should users rely on `Join.join` to do the join? I wonder if we could directly invoke a `joinSource.join(joinSource2)`? That way, I can do a more fluent groupBy on a join result? joinSource.join(joinSource2) .join(joinSource3) .window(SessionWindows.into ()) .sink(SinkFunction) - Jagadish Venkatraman On Sept. 12, 2016, 5:53 p.m., Yi Pan (Data Infrastructure) wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/47835/ > ----------------------------------------------------------- > > (Updated Sept. 12, 2016, 5:53 p.m.) > > > Review request for samza, Boris Shkolnik, Chris Pettitt, Chinmay Soman, Jake > Maes, Navina Ramesh, Jagadish Venkatraman, and Xinyu Liu. > > > Bugs: SAMZA-914 > https://issues.apache.org/jira/browse/SAMZA-914 > > > Repository: samza > > > Description > ------- > > SAMZA-914: initial draft of operator programming API. Design doc attached to > SAMZA-914: > https://issues.apache.org/jira/secure/attachment/12821524/SAMZA-914_%20operator%20Java%20programming%20API%20-%20Google%20Docs.pdf > > > Diffs > ----- > > build.gradle 16facbbf4dff378c561461786ff186bd9e0000ed > gradle/dependency-versions.gradle 52e25aa53a1edc85d478b48898621b26508ad4bb > samza-api/src/test/java/org/apache/samza/config/TestConfig.java > 5d066c5867e9df9e94e60bde825dedf10703b399 > samza-operator/src/main/java/org/apache/samza/operators/api/Join.java > PRE-CREATION > > samza-operator/src/main/java/org/apache/samza/operators/api/MessageStream.java > PRE-CREATION > > samza-operator/src/main/java/org/apache/samza/operators/api/MessageStreams.java > PRE-CREATION > samza-operator/src/main/java/org/apache/samza/operators/api/Triggers.java > PRE-CREATION > > samza-operator/src/main/java/org/apache/samza/operators/api/WindowState.java > PRE-CREATION > samza-operator/src/main/java/org/apache/samza/operators/api/Windows.java > PRE-CREATION > > samza-operator/src/main/java/org/apache/samza/operators/api/data/Message.java > PRE-CREATION > > samza-operator/src/main/java/org/apache/samza/operators/api/data/WindowOutput.java > PRE-CREATION > > samza-operator/src/main/java/org/apache/samza/operators/api/internal/Operators.java > PRE-CREATION > > samza-operator/src/main/java/org/apache/samza/operators/api/internal/Trigger.java > PRE-CREATION > > samza-operator/src/main/java/org/apache/samza/operators/api/internal/Window.java > PRE-CREATION > > samza-operator/src/main/java/org/apache/samza/operators/impl/ChainedOperators.java > PRE-CREATION > > samza-operator/src/main/java/org/apache/samza/operators/impl/OperatorBaseImpl.java > PRE-CREATION > > samza-operator/src/main/java/org/apache/samza/operators/impl/ProcessorContext.java > PRE-CREATION > > samza-operator/src/main/java/org/apache/samza/operators/impl/SimpleOperatorImpl.java > PRE-CREATION > > samza-operator/src/main/java/org/apache/samza/operators/impl/StateStoreImpl.java > PRE-CREATION > > samza-operator/src/main/java/org/apache/samza/operators/impl/window/SessionWindowImpl.java > PRE-CREATION > > samza-operator/src/main/java/org/apache/samza/task/StreamOperatorAdaptorTask.java > PRE-CREATION > samza-operator/src/main/java/org/apache/samza/task/StreamOperatorTask.java > PRE-CREATION > > samza-operator/src/test/java/org/apache/samza/operators/api/data/TestMessageStream.java > PRE-CREATION > > samza-operator/src/test/java/org/apache/samza/task/AssembleCallGraphTask.java > PRE-CREATION > > samza-operator/src/test/java/org/apache/samza/task/BroadcastOperatorTask.java > PRE-CREATION > > samza-operator/src/test/java/org/apache/samza/task/InputAvroSystemMessage.java > PRE-CREATION > samza-operator/src/test/java/org/apache/samza/task/JoinOperatorTask.java > PRE-CREATION > > samza-operator/src/test/java/org/apache/samza/task/TestStreamOperatorTasks.java > PRE-CREATION > samza-operator/src/test/java/org/apache/samza/task/WindowOperatorTask.java > PRE-CREATION > > samza-sql-calcite/src/test/java/org/apache/samza/sql/calcite/schema/TestAvroSchemaConverter.java > PRE-CREATION > samza-sql-core/README.md PRE-CREATION > samza-sql-core/src/main/java/org/apache/samza/sql/api/data/Data.java > PRE-CREATION > samza-sql-core/src/main/java/org/apache/samza/sql/api/data/EntityName.java > PRE-CREATION > samza-sql-core/src/main/java/org/apache/samza/sql/api/data/Relation.java > PRE-CREATION > samza-sql-core/src/main/java/org/apache/samza/sql/api/data/Schema.java > PRE-CREATION > samza-sql-core/src/main/java/org/apache/samza/sql/api/data/Stream.java > PRE-CREATION > samza-sql-core/src/main/java/org/apache/samza/sql/api/data/Table.java > PRE-CREATION > samza-sql-core/src/main/java/org/apache/samza/sql/api/data/Tuple.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/api/operators/Operator.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/api/operators/OperatorCallback.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/api/operators/OperatorRouter.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/api/operators/OperatorSpec.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/api/operators/SimpleOperator.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/api/operators/SqlOperatorFactory.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/data/IncomingMessageTuple.java > PRE-CREATION > samza-sql-core/src/main/java/org/apache/samza/sql/data/avro/AvroData.java > PRE-CREATION > samza-sql-core/src/main/java/org/apache/samza/sql/data/avro/AvroSchema.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/data/serializers/SqlAvroSerde.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/data/serializers/SqlAvroSerdeFactory.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/data/serializers/SqlStringSerde.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/data/serializers/SqlStringSerdeFactory.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/data/string/StringData.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/data/string/StringSchema.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/operators/factory/NoopOperatorCallback.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/operators/factory/SimpleOperatorFactoryImpl.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/operators/factory/SimpleOperatorImpl.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/operators/factory/SimpleOperatorSpec.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/operators/factory/SimpleRouter.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/operators/join/StreamStreamJoin.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/operators/join/StreamStreamJoinSpec.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/operators/partition/PartitionOp.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/operators/partition/PartitionSpec.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/operators/window/BoundedTimeWindow.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/operators/window/WindowSpec.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/operators/window/WindowState.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/window/storage/OrderedStoreKey.java > PRE-CREATION > samza-sql-core/src/main/java/org/apache/samza/system/sql/LongOffset.java > PRE-CREATION > samza-sql-core/src/main/java/org/apache/samza/system/sql/Offset.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/task/sql/RouterMessageCollector.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/task/sql/SimpleMessageCollector.java > PRE-CREATION > > samza-sql-core/src/test/java/org/apache/samza/sql/data/serializers/SqlAvroSerdeTest.java > PRE-CREATION > > samza-sql-core/src/test/java/org/apache/samza/task/sql/RandomWindowOperatorTask.java > PRE-CREATION > samza-sql-core/src/test/java/org/apache/samza/task/sql/StreamSqlTask.java > PRE-CREATION > > samza-sql-core/src/test/java/org/apache/samza/task/sql/UserCallbacksSqlTask.java > PRE-CREATION > settings.gradle 4c1aa107a11d413777e69bc4e48847b811aff7d2 > > Diff: https://reviews.apache.org/r/47835/diff/ > > > Testing > ------- > > ./gradlew clean build > > > Thanks, > > Yi Pan (Data Infrastructure) > >