Re: OOM issues with Python Objects

2021-03-24 Thread Kevin Lam
Hi Dian, I have unit tests for which both sets of code (Row subclass vs. custom Python class) passes. The OOM occurs when reading a large amount of data from a kafka topic. At the moment I don't have a simple example to reproduce the issue, I'll let you know. On Tue, Mar 23, 2021 at 2:17 AM Dian

Re: OOM issues with Python Objects

2021-03-22 Thread Dian Fu
Hi Kevin, Is it possible to provide a simple example to reproduce this issue? PS: It will use pickle to perform the serialization/deserialization if you don't specify the type info. Regards, Dian On Mon, Mar 22, 2021 at 10:55 PM Arvid Heise wrote: > Hi Kevin, > > yes I understood that, but t

Re: OOM issues with Python Objects

2021-03-22 Thread Arvid Heise
Hi Kevin, yes I understood that, but then your Python class contains a Row field, where no mapping exists. I'm assuming PyFlink tries to do a deep conversion and fails to do so by ending in some infinite loop. On Mon, Mar 22, 2021 at 3:48 PM Kevin Lam wrote: > Thanks for the response Arvid! Poi

Re: OOM issues with Python Objects

2021-03-22 Thread Kevin Lam
Thanks for the response Arvid! Point of clarification, *things do NOT OOM when I use the Row subclass*. Instead, the code that doesn't use the Row subclass is the code that OOMs (ie. the simple python class). On Mon, Mar 22, 2021 at 10:24 AM Arvid Heise wrote: > Hi Kevin, > > I suspect that th

Re: OOM issues with Python Objects

2021-03-22 Thread Arvid Heise
Hi Kevin, I suspect that this is because Row is not supported as a Python field [1]; it's supposed to be a dict that is mapped to a Row by Flink. Maybe it runs in some infinite loop while trying serialize and hence the OOM. Subclassing Row might be an undocumented feature. I'm also pulling in Di

OOM issues with Python Objects

2021-03-19 Thread Kevin Lam
Hi all, I've encountered an interesting issue where I observe an OOM issue in my Flink Application when I use a DataStream of Python Objects, but when I make that Python Object a Subclass of pyflink.common.types.Row and provide TypeInformation, there is no issue. For the OOM scenario, no elements