----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/47835/#review142421 -----------------------------------------------------------
The description says this is the initial draft implementation, but the title says initial draft for the APIs. I take it the latter is more accurate? Some initial high-level thoughts: 1. We should use Java 8 constructs where possible (if we've moved to Java 8). For example, Function. If not, we should probably have some Function equivalent and various type specifications for map, flatMap, etc. 2. The stream operators don't seem to **do** anything - they primarily appear to hold metadata about something to be done. Are they intended only to be declarative? Are they also not intended to know directly how to build themselves, but rather you would run some graph processor (with knowledge about each operator type) to build out a real processor? This is pretty fundamental to the design. If there is a doc that covers this let me know; otherwise it would be super helpful to be on the same page about the end goal. 3. If bullet 2 is correct, how would you navigate through the graph? It would seem that you would need some way to navigate `(source stream, operator)` tuples? 4. Related to bullet 3, why do the operators know about their output streams? Abstractly aren't they totally independent in the sense that I could apply the same operator to multiple input streams to produce a corresponding number of output streams? 5. It looks like you can compose graphs in two ways: directly using the operators or using the interfaces on DataStream. I would choose one or the other and use appropriate hiding mechanisms to expose just the API the user should be concerned with. If you go with the latter I would extract interfaces for the operators (if that is even necessary) into a public package and hide everything else in a package private namespace. 6. I suspect you want to limit the ability to create custom operators (at least if the assumption about how graph walking would work in bullet 2 holds), so StreamOperator's constructor probably needs to be package private. I suspect some over arching docs or a bluejeans session would be very helpful in allowing me to dig deeper into this. samza-operator/src/main/java/org/apache/samza/operators/api/FlatMapper.java (line 34) <https://reviews.apache.org/r/47835/#comment207976> Have we moved to Java 8 yet? If so, it would be preferable to use the Java 8 constructs if possible. If not, I think this can be generalized a little more nicely as a function from T to a Collection of R instead of having a separate flat map function type. This comment can be applied generally wherever we're providing "functions" - Chris Pettitt On July 13, 2016, 8:54 a.m., Yi Pan (Data Infrastructure) wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/47835/ > ----------------------------------------------------------- > > (Updated July 13, 2016, 8:54 a.m.) > > > Review request for samza, Boris Shkolnik, Chris Pettitt, Chinmay Soman, Jake > Maes, Navina Ramesh, Jagadish Venkatraman, and Xinyu Liu. > > > Bugs: SAMZA-914 > https://issues.apache.org/jira/browse/SAMZA-914 > > > Repository: samza > > > Description > ------- > > SAMZA-914: initial draft implementation of operator programming API. > > > Diffs > ----- > > build.gradle 16facbbf4dff378c561461786ff186bd9e0000ed > gradle/dependency-versions.gradle 52e25aa53a1edc85d478b48898621b26508ad4bb > samza-operator/src/main/java/org/apache/samza/operators/api/FlatMapper.java > PRE-CREATION > samza-operator/src/main/java/org/apache/samza/operators/api/Mapper.java > PRE-CREATION > samza-operator/src/main/java/org/apache/samza/operators/api/Scanner.java > PRE-CREATION > samza-operator/src/main/java/org/apache/samza/operators/api/Sink.java > PRE-CREATION > > samza-operator/src/main/java/org/apache/samza/operators/api/data/DataStream.java > PRE-CREATION > > samza-operator/src/main/java/org/apache/samza/operators/api/join/Joiner.java > PRE-CREATION > > samza-operator/src/main/java/org/apache/samza/operators/api/window/SessionWindow.java > PRE-CREATION > > samza-operator/src/main/java/org/apache/samza/operators/api/window/Window.java > PRE-CREATION > samza-operator/src/main/java/org/apache/samza/operators/impl/Pipeline.java > PRE-CREATION > > samza-operator/src/main/java/org/apache/samza/operators/impl/SimpleOperatorImpl.java > PRE-CREATION > samza-operator/src/main/java/org/apache/samza/task/DataStreamTask.java > PRE-CREATION > > samza-operator/src/test/java/org/apache/samza/operators/api/TestScanner.java > PRE-CREATION > > samza-operator/src/test/java/org/apache/samza/operators/api/data/TestDataStream.java > PRE-CREATION > samza-operator/src/test/java/org/apache/samza/task/DataStreamJoinTask.java > PRE-CREATION > samza-operator/src/test/java/org/apache/samza/task/DataStreamSplitTask.java > PRE-CREATION > samza-operator/src/test/java/org/apache/samza/task/DataStreamUserTask.java > PRE-CREATION > samza-operator/src/test/java/org/apache/samza/task/TestDataStreamTask.java > PRE-CREATION > > samza-sql-calcite/src/test/java/org/apache/samza/sql/calcite/schema/TestAvroSchemaConverter.java > PRE-CREATION > samza-sql-core/README.md PRE-CREATION > samza-sql-core/src/main/java/org/apache/samza/sql/api/data/Data.java > PRE-CREATION > samza-sql-core/src/main/java/org/apache/samza/sql/api/data/EntityName.java > PRE-CREATION > samza-sql-core/src/main/java/org/apache/samza/sql/api/data/Relation.java > PRE-CREATION > samza-sql-core/src/main/java/org/apache/samza/sql/api/data/Schema.java > PRE-CREATION > samza-sql-core/src/main/java/org/apache/samza/sql/api/data/Stream.java > PRE-CREATION > samza-sql-core/src/main/java/org/apache/samza/sql/api/data/Table.java > PRE-CREATION > samza-sql-core/src/main/java/org/apache/samza/sql/api/data/Tuple.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/api/operators/Operator.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/api/operators/OperatorCallback.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/api/operators/OperatorRouter.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/api/operators/OperatorSpec.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/api/operators/SimpleOperator.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/api/operators/SqlOperatorFactory.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/data/IncomingMessageTuple.java > PRE-CREATION > samza-sql-core/src/main/java/org/apache/samza/sql/data/avro/AvroData.java > PRE-CREATION > samza-sql-core/src/main/java/org/apache/samza/sql/data/avro/AvroSchema.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/data/serializers/SqlAvroSerde.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/data/serializers/SqlAvroSerdeFactory.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/data/serializers/SqlStringSerde.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/data/serializers/SqlStringSerdeFactory.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/data/string/StringData.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/data/string/StringSchema.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/operators/factory/NoopOperatorCallback.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/operators/factory/SimpleOperatorFactoryImpl.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/operators/factory/SimpleOperatorImpl.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/operators/factory/SimpleOperatorSpec.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/operators/factory/SimpleRouter.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/operators/join/StreamStreamJoin.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/operators/join/StreamStreamJoinSpec.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/operators/partition/PartitionOp.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/operators/partition/PartitionSpec.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/operators/window/BoundedTimeWindow.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/operators/window/WindowSpec.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/operators/window/WindowState.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/sql/window/storage/OrderedStoreKey.java > PRE-CREATION > samza-sql-core/src/main/java/org/apache/samza/system/sql/LongOffset.java > PRE-CREATION > samza-sql-core/src/main/java/org/apache/samza/system/sql/Offset.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/task/sql/RouterMessageCollector.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/task/sql/SimpleMessageCollector.java > PRE-CREATION > > samza-sql-core/src/test/java/org/apache/samza/sql/data/serializers/SqlAvroSerdeTest.java > PRE-CREATION > > samza-sql-core/src/test/java/org/apache/samza/task/sql/RandomWindowOperatorTask.java > PRE-CREATION > samza-sql-core/src/test/java/org/apache/samza/task/sql/StreamSqlTask.java > PRE-CREATION > > samza-sql-core/src/test/java/org/apache/samza/task/sql/UserCallbacksSqlTask.java > PRE-CREATION > settings.gradle 4c1aa107a11d413777e69bc4e48847b811aff7d2 > > Diff: https://reviews.apache.org/r/47835/diff/ > > > Testing > ------- > > ./gradlew clean build > > > Thanks, > > Yi Pan (Data Infrastructure) > >