Stephan Humm... I see. Back off one step, how do Flink deal with corrupted input data right now, like a dead letter queue?
Thanks, Chen On Thu, Aug 11, 2016 at 5:40 AM, Stephan Ewen <[email protected]> wrote: > Hi! > > This is a very big change, both on the semantics, the runtime classes. > These changes are tricky to get in, and usually work best if you document > the changes and all implications well. > > Something like a deep design doc, or a FLIP would be great for this. > https://cwiki.apache.org/confluence/display/FLINK/ > Flink+Improvement+Proposals > > Greetings, > Stephan > > > On Thu, Aug 11, 2016 at 12:41 AM, Chen Qin <[email protected]> wrote: > > > Hi there, > > > > I am thinking of implement sideOutput into Flink which seems missing > > support. > > https://cloud.google.com/dataflow/model/par-do#side-outputs > > > > It is useful because it will help pipeline author redirect corrputed > input/ > > code bug to a side stream or write to a table and reconsile afterwards. > > > > After some hack prototyping, I were able to get it works for simple > tests. > > Basically, It allows env to register a side output typeInfo which will be > > passed to configurations during graph building; Adding a new transform > > which similar to selection transform but holding different input type; > > StreamEdge will has a boolean to see if that is side output edge, if so, > > create output writer loads side output type serializer and emit record > only > > when sideOutput is called. > > > > I have some problem passing side output type as template to each data > > stream. It means it will have to expose any output stream with two type > > parameters. As you can imagine, the API interface change will be sizable. > > > > Any suggestion? > > > > Chen > > > -- -Chen Qin
