Gábor,
Thank you for the reply. I gave that a go, and the flow still showed
parallelism 90 for each step. Is the UI perhaps not 100% accurate?

To get around it for now, I implemented a partitioner that sends all the
data to the same partition. It's a hack, but it works!
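
For anyone following along, the idea can be sketched roughly as below. This
is a minimal, self-contained illustration rather than the actual job code:
the Partitioner interface here is a local stand-in that mirrors the shape of
Flink's org.apache.flink.api.common.functions.Partitioner<K>, and the names
SinglePartitionSketch, ConstantPartitioner and route are hypothetical,
introduced only so the example runs without Flink on the classpath.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SinglePartitionSketch {

    // Local stand-in (assumption) with the same method shape as Flink's
    // Partitioner<K>: map a key to a partition index in [0, numPartitions).
    public interface Partitioner<K> {
        int partition(K key, int numPartitions);
    }

    // Ignores the key and always picks partition 0, so every record
    // ends up in the same partition.
    public static final class ConstantPartitioner<K> implements Partitioner<K> {
        @Override
        public int partition(K key, int numPartitions) {
            return 0;
        }
    }

    // Hypothetical helper that mimics what a runtime does with a
    // partitioner: place each key into the bucket the partitioner chooses.
    public static <K> List<List<K>> route(List<K> keys,
                                          Partitioner<K> partitioner,
                                          int numPartitions) {
        List<List<K>> buckets = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            buckets.add(new ArrayList<K>());
        }
        for (K key : keys) {
            buckets.get(partitioner.partition(key, numPartitions)).add(key);
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a", "b", "c");
        List<List<String>> buckets =
                route(keys, new ConstantPartitioner<String>(), 90);

        int nonEmpty = 0;
        for (List<String> bucket : buckets) {
            if (!bucket.isEmpty()) {
                nonEmpty++;
            }
        }
        System.out.println("partition 0 holds: " + buckets.get(0));
        System.out.println("non-empty partitions: " + nonEmpty);
    }
}
```

In a real job, a partitioner like this would be passed to
DataSet.partitionCustom(...) together with a key selector or field. As far
as I understand it, the operator itself would still be reported with
parallelism 90; only one of its subtasks would actually receive data.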

On Tue, Oct 3, 2017 at 4:12 AM, Gábor Gévay <gga...@gmail.com> wrote:

> Hi Garrett,
>
> You can call .setParallelism(1) on just this operator:
>
> ds.reduceGroup(new GroupReduceFunction...).setParallelism(1)
>
> Best,
> Gabor
>
>
>
> On Mon, Oct 2, 2017 at 3:46 PM, Garrett Barton <garrett.bar...@gmail.com>
> wrote:
> > I have a complex algorithm implemented using the DataSet API, and by
> > default it runs with parallelism 90 for good performance. At the end I
> > want to perform a clustering of the resulting data, and to do that
> > correctly I need to pass all the data through a single thread/process.
> >
> > I read in the docs that as long as I did a global reduce using
> > DataSet.reduceGroup(new GroupReduceFunction....), it would force it to a
> > single thread. Yet when I run the flow and bring it up in the UI, I see
> > parallelism 90 all the way through the DAG, including this one.
> >
> > Is there a config or feature to force the flow back to a single thread?
> > Or should I just split this into two completely separate jobs? I'd
> > rather not split, as I would like to use Flink's ability to iterate on
> > this algorithm and clustering combo.
> >
> > Thank you
>
