Hi, since I'm currently reworking the Stream operators I thought it's a good time to talk about the naming of some classes. We have some legacy problems with lots of Operators, OperatorBases, TwoInput, OneInput, Unary, Binary, etc. And maybe we can break things in streaming to have more consistent and future-proof naming.
In streaming, there are: - Tasks, these are an AbstractInvokabe and contain the main loop of a streaming vertex. They read from the inputs and forward data to the operator implementation. - Operators, these are invoked by a Task and are responsible for the actual logic of the operator. Think Map, Join, Reduce and so on. These are responsible for calling the user-defined function. - Operators (again, I know), these are user facing classes (some derived from DataStream, some not). There is for example SingleOutputStreamOperator, for the result of a DataStream transformation that has a single output. There are also TemporalOperator and its derived classes StreamCrossOperator and StreamJoinOperator. The actual operator inside a task (the ones I mentioned before that are responsible for the user logic) that executes a temporal join is called CoStreamWindow (with a JoinWindowFunction). As I currently have it in my PR, there are two Task classes, one for single input, and one for two-input operators. There are also the corresponding operator interfaces for unary and binary operators (see what I did there ... :D). What should we call all these classes (concepts). Also I'm heavily in favour of dropping all the Stream (or Streaming) prefixes and suffixes from the class names. I know I'm in streaming because the package is named streaming. And we should not restrain ourselves because the batch API also has things called operator. Also, the concept of one-input, two-input tasks and operators is not very scalable, Maybe we should have a single interface for operators that has a receiveElement(int, element) method that tells the operator from which input an element came. Then we can scale this to n-ary operators. This would of course have the overhead of always sending along the number of the input instead of encoding the input number in the method name, such as receiveElement1() and receiveElement2(). Any thoughts? :D (I know I'm writing the long annoying emails today but I think it is important we discuss these things before being stuck with them.) Cheers, Aljoscha