Which name changes are you referring to? The proposed names in my recent PR? Or the dropping of Stream from all the classes? For the rest I was just rambling about how I don't like the names in the batch API. :D
On Fri, May 8, 2015 at 12:31 PM, Gyula Fóra <gyula.f...@gmail.com> wrote:
> Generally I am in favor of making these name changes. My only concern is
> regarding the one-input and multiple-input operators.
>
> There is a general problem with the n-ary operators regarding type safety,
> which is why we now have SingleInput and Co (two-input) operators. I think
> we should keep these.
>
> On Fri, May 8, 2015 at 11:38 AM, Aljoscha Krettek <aljos...@apache.org>
> wrote:
>
>> Hi,
>> since I'm currently reworking the Stream operators I thought it's a
>> good time to talk about the naming of some classes. We have some
>> legacy problems with lots of Operators, OperatorBases, TwoInput,
>> OneInput, Unary, Binary, etc. And maybe we can break things in
>> streaming to have more consistent and future-proof naming.
>>
>> In streaming, there are:
>> - Tasks, these are an AbstractInvokable and contain the main loop of a
>> streaming vertex. They read from the inputs and forward data to the
>> operator implementation.
>>
>> - Operators, these are invoked by a Task and are responsible for the
>> actual logic of the operator. Think Map, Join, Reduce and so on. These
>> are responsible for calling the user-defined function.
>>
>> - Operators (again, I know), these are user facing classes (some
>> derived from DataStream, some not). There is for example
>> SingleOutputStreamOperator, for the result of a DataStream
>> transformation that has a single output. There are also
>> TemporalOperator and its derived classes StreamCrossOperator and
>> StreamJoinOperator. The actual operator inside a task (the ones I
>> mentioned before that are responsible for the user logic) that
>> executes a temporal join is called CoStreamWindow (with a
>> JoinWindowFunction).
>>
>> As I currently have it in my PR, there are two Task classes, one for
>> single input, and one for two-input operators. There are also the
>> corresponding operator interfaces for unary and binary operators (see
>> what I did there ... :D).
>>
>> What should we call all these classes (concepts)? Also I'm heavily in
>> favour of dropping all the Stream (or Streaming) prefixes and suffixes
>> from the class names. I know I'm in streaming because the package is
>> named streaming. And we should not restrain ourselves because the
>> batch API also has things called operator.
>>
>> Also, the concept of one-input, two-input tasks and operators is not
>> very scalable. Maybe we should have a single interface for operators
>> that has a receiveElement(int, element) method that tells the operator
>> from which input an element came. Then we can scale this to n-ary
>> operators. This would of course have the overhead of always sending
>> along the number of the input instead of encoding the input number in
>> the method name, such as receiveElement1() and receiveElement2().
>>
>> Any thoughts? :D (I know I'm writing the long annoying emails today
>> but I think it is important we discuss these things before being stuck
>> with them.)
>>
>> Cheers,
>> Aljoscha
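
To make the two designs from the quoted mail concrete, here is a rough Java sketch. It is only an illustration: the interface and class names (TwoInputOperator, NaryOperator, TwoInputAdapter, TaggingOperator) are made up for this example and are not the classes in the PR or in Flink. Only receiveElement(int, element), receiveElement1() and receiveElement2() come from the thread. The adapter also shows where the type-safety concern comes from: once the input number is a runtime argument, the element has to be cast.

```java
import java.util.ArrayList;
import java.util.List;

// All names below are hypothetical and exist only for this sketch.

/** Variant A: the input number is encoded in the method name, so each input keeps its type. */
interface TwoInputOperator<IN1, IN2> {
    void receiveElement1(IN1 element);
    void receiveElement2(IN2 element);
}

/** Variant B: one method carries the input number, so it scales to n inputs. */
interface NaryOperator {
    void receiveElement(int inputNumber, Object element);
}

/** Drives a variant-A operator through the variant-B interface, at the cost of unchecked casts. */
final class TwoInputAdapter<IN1, IN2> implements NaryOperator {
    private final TwoInputOperator<IN1, IN2> delegate;

    TwoInputAdapter(TwoInputOperator<IN1, IN2> delegate) {
        this.delegate = delegate;
    }

    @Override
    @SuppressWarnings("unchecked")
    public void receiveElement(int inputNumber, Object element) {
        // Input numbers are 1-based here to mirror receiveElement1/receiveElement2.
        switch (inputNumber) {
            case 1:
                delegate.receiveElement1((IN1) element);
                break;
            case 2:
                delegate.receiveElement2((IN2) element);
                break;
            default:
                throw new IllegalArgumentException("Unknown input: " + inputNumber);
        }
    }
}

/** Toy operator that records which input each element arrived on. */
final class TaggingOperator implements TwoInputOperator<String, Integer> {
    final List<String> seen = new ArrayList<>();

    @Override
    public void receiveElement1(String element) {
        seen.add("1:" + element);
    }

    @Override
    public void receiveElement2(Integer element) {
        seen.add("2:" + element);
    }
}
```

A task loop would call something like nary.receiveElement(1, "a") and nary.receiveElement(2, 42) on a TwoInputAdapter wrapping a TaggingOperator. The compiler can no longer catch a String sent to input 2; that mistake would only surface as a ClassCastException at runtime.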