Re: Broadcast data sent increases with # slots per TM

2016-07-22 Thread Till Rohrmann
way is that the senders would have to know which sub-tasks are deployed to which TMs. As the broadcast variables are realized as additionally attached "broadcast

Re: Broadcast data sent increases with # slots per TM

2016-07-21 Thread Felix Neutatz
deployed to which TMs. As the broadcast variables are realized as additionally attached "broadcast channels", I am assuming that the same behavior will apply for broadcast joins as well.

Re: Broadcast data sent increases with # slots per TM

2016-07-07 Thread Felix Neutatz
broadcast joins as well. Is this the case? Regards, Alexander 2016-06-08 17:13 GMT+02:00 Kunft, Andreas:

[jira] [Created] (FLINK-4175) Broadcast data sent increases with # slots per TM

2016-07-07 Thread Felix Neutatz (JIRA)
Felix Neutatz created FLINK-4175:
Summary: Broadcast data sent increases with # slots per TM
Key: FLINK-4175
URL: https://issues.apache.org/jira/browse/FLINK-4175
Project: Flink
Issue Type

Re: Broadcast data sent increases with # slots per TM

2016-06-09 Thread Felix Neutatz
Is this the case? Regards, Alexander 2016-06-08 17:13 GMT+02:00 Kunft, Andreas: Hi Till, thanks for the fast answer. I'll think about a concrete way

Re: Broadcast data sent increases with # slots per TM

2016-06-09 Thread Stephan Ewen
think about a concrete way of implementing it and open a JIRA. Best Andreas From: Till Rohrmann Sent: Wednesday, 8 June 2016 15:53 To: dev@fl

Re: Broadcast data sent increases with # slots per TM

2016-06-09 Thread Till Rohrmann
and open a JIRA. Best Andreas From: Till Rohrmann Sent: Wednesday, 8 June 2016 15:53 To: dev@flink.apache.org Subject: Re: Broadcast data sent increases with # slots per TM

Re: Broadcast data sent increases with # slots per TM

2016-06-08 Thread Alexander Alexandrov
From: Till Rohrmann Sent: Wednesday, 8 June 2016 15:53 To: dev@flink.apache.org Subject: Re: Broadcast data sent increases with # slots per TM Hi Andreas, your observation is correct. The data is sent to each slot and the

Re: Broadcast data sent increases with # slots per TM

2016-06-08 Thread Kunft, Andreas
Hi Till, thanks for the fast answer. I'll think about a concrete way of implementing it and open a JIRA. Best Andreas From: Till Rohrmann Sent: Wednesday, 8 June 2016 15:53 To: dev@flink.apache.org Subject: Re: Broadcast data sent increases

Re: Broadcast data sent increases with # slots per TM

2016-06-08 Thread Till Rohrmann
Hi Andreas, your observation is correct. The data is sent to each slot and the receiving TM only materializes one copy of the data. The rest of the data is discarded. As far as I know, the reason why the broadcast variables are implemented that way is that the senders would have to know which sub
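The behaviour described in this reply (one copy sent per slot, only one copy materialized per TM) can be put into a rough back-of-the-envelope model. This is only a sketch under simplifying assumptions: a single sender, every slot on every TM receiving the full broadcast set, and no local (non-network) exchange. The function names and the example sizes are hypothetical and are not taken from Flink or from the benchmark in the thread.

```python
def broadcast_bytes_sent(dataset_bytes: int, task_managers: int, slots_per_tm: int) -> int:
    """Bytes shipped when every slot receives its own copy of the broadcast set."""
    return dataset_bytes * task_managers * slots_per_tm


def broadcast_bytes_needed(dataset_bytes: int, task_managers: int) -> int:
    """Bytes that would suffice if each TM received exactly one copy."""
    return dataset_bytes * task_managers


def redundancy_factor(task_managers: int, slots_per_tm: int) -> float:
    """Ratio of bytes sent to bytes needed; in this model it equals slots_per_tm."""
    sent = broadcast_bytes_sent(1, task_managers, slots_per_tm)
    needed = broadcast_bytes_needed(1, task_managers)
    return sent / needed


if __name__ == "__main__":
    dataset_mb = 100      # hypothetical broadcast set size in MB
    task_managers = 4     # hypothetical cluster size
    for slots in (1, 2, 4, 8):
        sent = broadcast_bytes_sent(dataset_mb, task_managers, slots)
        needed = broadcast_bytes_needed(dataset_mb, task_managers)
        print(f"slots/TM={slots}: sent={sent} MB, needed={needed} MB, "
              f"discarded={sent - needed} MB")
```

In this simplified model the network volume grows linearly with the number of slots per TM while the amount of data actually kept stays constant, which matches the observation reported in the thread.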

Broadcast data sent increases with # slots per TM

2016-06-08 Thread Kunft, Andreas
Hi, we are seeing an unexpected increase in the data sent over the network for broadcasts as the number of slots per TaskManager grows. We provide a benchmark [1]. The higher slot count not only increases the amount of data sent over the network but also hurts performance, as seen in the preliminary results bel