Thanks for bringing up this point!

+1 for the renaming.
@Marton: Is this a "complete" list, i.e., did you go through both APIs or
might there be more methods that are semantically identical but named
differently?

2015-06-01 17:31 GMT+02:00 Gyula Fóra <gyf...@apache.org>:

> +1 for the changes proposed by Marton (before the release)
>
> Aljoscha Krettek <aljos...@apache.org> wrote (on Mon, Jun 1, 2015,
> 16:32):
>
> > Yes, these renamings make sense. The partitionBy() is not yet in the
> > master for streaming, though.
> >
> > On Mon, Jun 1, 2015 at 4:10 PM, Márton Balassi <balassi.mar...@gmail.com>
> > wrote:
> > > Looking at the DataSet and DataStream APIs we have come to the conclusion
> > > with Aljoscha that there are a few methods that although providing the same
> > > functionality are named differently. These are the following:
> > >
> > >    1. rebalance (batch) / distribute (streaming): Rebalances the data sent
> > >    to the downstream operators thus equally distributing it.
> > >    2. partitionByHash, partitionCustom (batch) / partitionBy (streaming):
> > >    Partitioning has just recently been exposed in the streaming API and is not
> > >    as refined as the batch one. The streaming partitionBy is actually
> > >    partitionByHash.
> > >    3. Union (batch) / merge, connect (streaming): The streaming merge does
> > >    a union of two streams with the same type. Connect is conceptually
> > >    different, it provides a way of sharing state between two streams with
> > >    potentially different types without mapping them to a common type and then
> > >    merging them. This saves latency and an ugly mapping. The former advantage
> > >    can be offset by proper operator chaining, the second one would remain if
> > >    we did not have connect.
> > >
> > > To consolidate the naming I would suggest the following:
> > >
> > >    1. Rename streaming distribute to rebalance.
> > >    2. Rename streaming partitionBy to partitionByHash and file JIRA for
> > >    custom partitioning support for streaming.
> > >    3. Rename streaming merge to union, leave streaming connect as it is.
> >
>
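
For readers less familiar with the three operations being renamed, here is
a rough plain-Java sketch of what rebalance, partitionByHash and union mean
semantically, as described in Marton's list. It is an illustration on
in-memory lists only, not Flink code: the class PartitionSketch, its method
signatures and the use of hashCode() modulo the partition count are all
assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Plain-Java illustration only; hypothetical names, not the Flink API.
public class PartitionSketch {

    // Creates n empty output partitions.
    private static <T> List<List<T>> emptyPartitions(int n) {
        List<List<T>> parts = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            parts.add(new ArrayList<>());
        }
        return parts;
    }

    // "rebalance": hand elements out round-robin so every partition
    // receives (almost) the same number of elements.
    public static <T> List<List<T>> rebalance(List<T> input, int n) {
        List<List<T>> parts = emptyPartitions(n);
        for (int i = 0; i < input.size(); i++) {
            parts.get(i % n).add(input.get(i));
        }
        return parts;
    }

    // "partitionByHash": elements with equal keys always land in the
    // same partition; the load may be uneven.
    public static <T, K> List<List<T>> partitionByHash(
            List<T> input, Function<T, K> key, int n) {
        List<List<T>> parts = emptyPartitions(n);
        for (T element : input) {
            // floorMod keeps the index non-negative for negative hashes.
            int target = Math.floorMod(key.apply(element).hashCode(), n);
            parts.get(target).add(element);
        }
        return parts;
    }

    // "union": combine two inputs of the same type into one.
    public static <T> List<T> union(List<T> first, List<T> second) {
        List<T> out = new ArrayList<>(first);
        out.addAll(second);
        return out;
    }

    public static void main(String[] args) {
        List<String> words = List.of("a", "b", "a", "c", "a", "b");
        System.out.println(rebalance(words, 2));
        System.out.println(partitionByHash(words, w -> w, 2));
        System.out.println(union(List.of("x"), List.of("y", "z")));
    }
}
```

The sketch deliberately leaves out connect, since that one keeps two
differently typed inputs separate instead of merging them.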
