Chesnay Schepler created FLINK-3751:
---------------------------------------

Summary: default Operator names are inconsistent
Key: FLINK-3751
URL: https://issues.apache.org/jira/browse/FLINK-3751
Project: Flink
Issue Type: Bug
Components: DataSet API, DataStream API
Affects Versions: 1.0.1
Reporter: Chesnay Schepler
Priority: Minor

h3. The Problem

If a user doesn't name an operator explicitly (generally using the name() method), then Flink auto-generates a name.

These generated names are really (like, _really_) inconsistent within and across APIs.

In the batch API, non-source/-sink operator names are _generally_ formed like this:
{code}FlatMap (FlatMap at main(WordCount.java:81)){code}
We have
* FlatMap, describing the runtime operator type
* another FlatMap, describing which user call created this operator
* main(WordCount.java:81), describing the call location

This already falls apart when you have a DataSource, which looks like this:
{code}DataSource (at getDefaultTextLineDataSet(WordCountData.java:70) (org.apache.flink.CollectionInputFormat){code}
It is missing the call that created the source (fromElements()) and suddenly includes the inputFormat name.

Sinks are a different story yet again, since collect() is displayed as
{code}
DataSink (collect())
{code}
which is missing the call location.

Then we have the Streaming API, where things are named completely differently as well:

The fromElements source is displayed as
{code}
Source: Collection Source
{code}
non-source/-sink operators are displayed simply as their runtime operator type
{code}
FlatMap
{code}
and sinks, at times, do not have a name at all.
{code}
Sink: Unnamed
{code}

To put the cherry on top, chains are displayed in the Batch API as
{code}
CHAIN <operator> -> <operator>
{code}
while in the Streaming API the CHAIN keyword is dropped
{code}
<operator> -> <operator>
{code}

Considering that these names are right in the user's face via the Dashboard, we should try to homogenize them a bit.
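
To make the inconsistency concrete, here is a minimal sketch of what a single shared naming helper could look like if every operator followed the batch FlatMap pattern. This is not existing Flink code: the class OperatorNames and its methods format() and chain() are hypothetical, and the sink call location used in main() is made up for illustration.

```java
/**
 * Hypothetical helper, NOT part of Flink. It applies the batch FlatMap
 * pattern "<type> (<user call> at <call location>)" to every operator kind.
 */
final class OperatorNames {

    private OperatorNames() {
    }

    /** Builds "<operatorType> (<userCall> at <callLocation>)". */
    static String format(String operatorType, String userCall, String callLocation) {
        return operatorType + " (" + userCall + " at " + callLocation + ")";
    }

    /** Builds a chain name using the batch-style CHAIN prefix. */
    static String chain(String... operatorNames) {
        return "CHAIN " + String.join(" -> ", operatorNames);
    }

    public static void main(String[] args) {
        // Same shape as the batch FlatMap name quoted in this issue.
        String flatMap = format("FlatMap", "FlatMap", "main(WordCount.java:81)");

        // A source named the same way: user call first, then call location.
        String source = format("DataSource", "fromElements",
                "getDefaultTextLineDataSet(WordCountData.java:70)");

        // A sink with a call location; line 93 is an invented example value.
        String sink = format("DataSink", "collect", "main(WordCount.java:93)");

        System.out.println(source);
        System.out.println(chain(flatMap, sink));
    }
}
```

With a helper like this, sources, regular operators and sinks would all carry the user call and the call location, and both APIs could share one chain format. Whether to keep the CHAIN prefix or drop it everywhere is an open choice; the sketch keeps it only because the Batch API already uses it.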