Re: Split a dataset

2017-10-17 Thread Fabian Hueske
Unfortunately, it's not possible to bridge the gap between the DataSet and DataStream APIs. However, you can also use a CsvInputFormat in the DataStream API. Since there's no built-in API to configure the CSV input, you would have to create (and configure) the CsvInputFormat yourself. Once you hav

Re: Split a dataset

2017-10-17 Thread Magnus Vojbacke
Thank you, Fabian! If batch semantics are not important to my use case, is there any way to "downgrade" or convert a DataSet to a DataStream?
BR
/Magnus
> On 17 Oct 2017, at 10:54, Fabian Hueske wrote:
>
> Hi Magnus,
>
> there is no Split operator on the DataSet API.
>
> As you said, this ca

Re: Split a dataset

2017-10-17 Thread Fabian Hueske
Hi Magnus,
there is no Split operator on the DataSet API. As you said, this can be done using a FilterFunction. This also allows for non-binary splits:
DataSet setToSplit = ...
DataSet firstSplit = setToSplit.filter(new SplitCondition1());
DataSet secondSplit = setToSplit.filter(new SplitConditi
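The filter-based split described above can be illustrated without Flink. What follows is a minimal plain-Java sketch of the same idea using java.util.stream, not the DataSet API itself: the input is filtered once per split, each time with its own condition. The class name SplitSketch and the two predicates are hypothetical stand-ins for SplitCondition1 and SplitCondition2.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Plain-Java analogy, not Flink code: each split is the same input
// filtered by its own predicate, mirroring setToSplit.filter(...).
final class SplitSketch {

    // Hypothetical conditions standing in for SplitCondition1 and SplitCondition2.
    static final Predicate<Integer> IS_NEGATIVE = x -> x < 0;
    static final Predicate<Integer> IS_EVEN = x -> x % 2 == 0;

    // Returns the elements of input that satisfy condition, in input order.
    static <T> List<T> split(List<T> input, Predicate<T> condition) {
        return input.stream().filter(condition).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> setToSplit = List.of(-2, -1, 0, 1, 2);
        // One filter pass per split; add more predicates for more splits.
        System.out.println(split(setToSplit, IS_NEGATIVE));
        System.out.println(split(setToSplit, IS_EVEN));
    }
}
```

Because every split is an independent filter over the full input, the conditions do not have to be mutually exclusive: an element can land in several splits (here -2 is both negative and even) or in none.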

Split a dataset

2017-10-17 Thread Magnus Vojbacke
I'm looking for something like DataStream.split(), but for DataSets. I'd like to split my streaming data so messages go to different parts of an execution graph, based on arbitrary logic. DataStream.split() seems to be perfect, except that my source is a CSV file, and I have only found built in