Re: Udf Performance and Object Creation

2015-08-14 Thread Stephan Ewen
Yes, map() is like a convenience function around mapPartition().

On Fri, Aug 14, 2015 at 6:09 PM, Flavio Pompermaier wrote:
> Hi Stephan, thanks for the reply!
> Now it's clearer. If I understood correctly, map and mapPartition are
> the same iff I have only one slot per task manager, right?
>
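The remark that map() behaves like a convenience function around mapPartition() can be illustrated with a small plain-Java sketch. It does not use Flink; `PartitionFunction`, `asPartitionFunction` and `runOnPartition` are hypothetical names introduced only for this illustration, and the real Flink interfaces may differ in detail.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Plain-Java illustration only; these are NOT Flink's interfaces.
final class MapAsMapPartition {

    /** Hypothetical partition-wise function: sees all records of one partition. */
    interface PartitionFunction<I, O> {
        void mapPartition(Iterator<I> records, Consumer<O> out);
    }

    /** Wraps a per-record function so that it runs as a partition-wise function. */
    static <I, O> PartitionFunction<I, O> asPartitionFunction(Function<I, O> perRecord) {
        return (records, out) -> {
            // The wrapper owns the loop; the per-record function only sees one element.
            while (records.hasNext()) {
                out.accept(perRecord.apply(records.next()));
            }
        };
    }

    /** Runs a partition function over one in-memory partition and collects its output. */
    static <I, O> List<O> runOnPartition(PartitionFunction<I, O> fn, List<I> partition) {
        List<O> result = new ArrayList<>();
        fn.mapPartition(partition.iterator(), result::add);
        return result;
    }

    public static void main(String[] args) {
        PartitionFunction<Integer, Integer> doubled = asPartitionFunction(x -> x * 2);
        System.out.println(runOnPartition(doubled, Arrays.asList(1, 2, 3))); // prints [2, 4, 6]
    }
}
```

In this sketch the only difference between the two styles is who writes the loop over the partition's records.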

Re: Udf Performance and Object Creation

2015-08-14 Thread Flavio Pompermaier
Hi Stephan, thanks for the reply! Now it's clearer. If I understood correctly, map and mapPartition are the same iff I have only one slot per task manager, right? I was convinced I had posted those questions in this thread as the 3rd or 4th message, hadn't I? On 14 Aug 2015 17:57, "Stephan Ewen" wro

Re: Udf Performance and Object Creation

2015-08-14 Thread Fabian Hueske
Oh, sorry, Flavio! I didn't see Hawin's questions :-( Thanks, Stephan, for picking this up!

2015-08-14 17:43 GMT+02:00 Flavio Pompermaier:
> Any insight about these 2 questions..?
> On 12 Aug 2015 17:38, "Flavio Pompermaier" wrote:
>
>> This is something I've never understood in depth: isn't a mapper cr

Re: Udf Performance and Object Creation

2015-08-14 Thread Stephan Ewen
Hi!

(1) A mapper is created once per parallel task. So if you create a program that runs a map() transformation with a parallelism of n, you will have n mapper instances in the cluster. Some may be on the same TaskManager, if the TaskManager has multiple slots.

(2) I would really like that. But i
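Point (1) can be pictured with a small simulation in plain Java. It does not run Flink and executes the "tasks" one after another; `CountingMapper` and `runWithParallelism` are hypothetical names used only to show that the number of mapper instances follows the parallelism, not the number of records.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Simulation only: mimics "one mapper instance per parallel task" without Flink.
final class MapperPerTaskDemo {

    /** Counts how many mapper objects have been constructed. */
    static final AtomicInteger INSTANCES = new AtomicInteger();

    /** Hypothetical mapper whose constructor records each instantiation. */
    static final class CountingMapper {
        CountingMapper() {
            INSTANCES.incrementAndGet();
        }

        int map(int value) {
            return value + 1;
        }
    }

    /** Simulates `parallelism` tasks and returns the number of mapper instances created. */
    static int runWithParallelism(int parallelism, int recordsPerTask) {
        INSTANCES.set(0);
        for (int task = 0; task < parallelism; task++) {
            CountingMapper mapper = new CountingMapper(); // created once per task
            for (int record = 0; record < recordsPerTask; record++) {
                mapper.map(record); // the same instance handles every record of this task
            }
        }
        return INSTANCES.get();
    }

    public static void main(String[] args) {
        System.out.println(runWithParallelism(4, 1000)); // prints 4
    }
}
```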

Re: Udf Performance and Object Creation

2015-08-14 Thread Fabian Hueske
I think Timo answered both questions (quoting Michael: "Hey Timo, yes that is what I needed to know. Thanks"). Maybe one more comment: the examples are not meant to deliver the best performance but to showcase Flink's APIs and concepts. Best, Fabian 2015-08-14 17:43 GMT+02:00 Flavio Pompermaier

Re: Udf Performance and Object Creation

2015-08-14 Thread Flavio Pompermaier
Any insight about these 2 questions..?

On 12 Aug 2015 17:38, "Flavio Pompermaier" wrote:
> This is something I've never understood in depth: isn't a mapper created
> for each record? If it's created only once per task manager, then it's not so
> different from mapPartition. What am I missing here?
>

Re: Udf Performance and Object Creation

2015-08-13 Thread Hawin Jiang
Thanks, Timo.
That is a good interview question.

Best regards,
Hawin

On Thu, Aug 13, 2015 at 1:11 AM, Michael Huelfenhaus <m.huelfenh...@davengo.com> wrote:
> Hey Timo,
>
> yes that is what I needed to know.
>
> Thanks
> - Michael
>
> On 12.08.2015 at 12:44, Timo Walther wrote:
>
> > Hello Mic

Re: Udf Performance and Object Creation

2015-08-13 Thread Michael Huelfenhaus
Hey Timo,

yes that is what I needed to know.

Thanks
- Michael

On 12.08.2015 at 12:44, Timo Walther wrote:
> Hello Michael,
>
> every time you code a Java program you should avoid object creation if you
> want an efficient program, because every created object needs to be garbage
> collecte

Re: Udf Performance and Object Creation

2015-08-12 Thread Timo Walther
Hello Michael, whenever you write a Java program you should avoid object creation if you want it to be efficient, because every created object has to be garbage collected later (which slows down your program). You can have small POJOs; just try to avoid calling "new" in your
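The advice about avoiding "new" can be sketched in plain Java, without Flink. The names `WordCount`, `AllocatingMapper` and `ReusingMapper` are hypothetical and exist only for this illustration. Note the assumption behind the reusing variant: it is only safe when the consumer does not keep a reference to a returned object after the next call.

```java
// Plain-Java illustration of object reuse; not taken from Flink.
final class ReuseSketch {

    /** Small mutable POJO used as the output record. */
    static final class WordCount {
        String word;
        int count;
    }

    /** Allocates a fresh output object for every record (more garbage to collect). */
    static final class AllocatingMapper {
        WordCount map(String word) {
            WordCount out = new WordCount();
            out.word = word;
            out.count = 1;
            return out;
        }
    }

    /** Allocates the output object once and overwrites its fields for every record. */
    static final class ReusingMapper {
        private final WordCount reuse = new WordCount();

        WordCount map(String word) {
            reuse.word = word;
            reuse.count = 1;
            return reuse; // same instance on every call
        }
    }

    public static void main(String[] args) {
        ReusingMapper mapper = new ReusingMapper();
        WordCount first = mapper.map("a");
        WordCount second = mapper.map("b");
        System.out.println(first == second); // prints true
    }
}
```

Because the reusing mapper hands out the same instance each time, `first.word` is "b" after the second call; that side effect is the price of the saved allocations.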