I would like to add that if your predicate does some heavy-weight
computation that you want to avoid duplicating for the filters, then
you can insert a map before the filters, where you evaluate the
predicate and put the result into a field.
Best,
Gabor
2016-05-13 11:51 GMT+02:00 Fabian Hueske
Hi,
it is true that Gabor's approach of using two filters has a certain
overhead.
However, the overhead should be reasonable. The data stays on the same node
and the filter can be very lightweight.
I agree that this is not a very nice solution.
However, modifying the DataSet API such that an oper
Hi,
if it just require implementing a custom operator(i mean does not require
changes to network stack or other engine level changes) i can try to
implement it since i am working on optimizer and plan generation for a
month. Also we are going to implement our etl framework on flink and this
kind
Hi,
I agree that this would be very nice. Unfortunately Flink does only allow
one output from an operation right now. Maybe we can extends this somehow
in the future.
Cheers,
Aljoscha
On Thu, 12 May 2016 at 17:27 CPC wrote:
> Hi Gabor,
>
> Yes functionally this helps. But in this case i am proc
Hi Gabor,
Yes functionally this helps. But in this case i am processing an element
twice and sending whole data to two different operator . What i am trying
to achieve is like datastream split like functionality or a little bit
more:
In filter like scenario i want to do below pseudo operation:
Hello,
You can split a DataSet into two DataSets with two filters:
val xs: DataSet[A] = ...
val split1: DataSet[A] = xs.filter(f1)
val split2: DataSet[A] = xs.filter(f2)
where f1 and f2 are true for those elements that should go into the
first and second DataSets respectively. So far, the splits
Hi folks,
Is there any way in dataset api to split Dataset[A] to Dataset[A] and
Dataset[B] ? Use case belongs to a custom filter component that we want to
implement. We will want to direct input elements whose result is false
after we apply the predicate. Actually we want to direct input elements