Re: OOM issues with Python Objects

2021-03-24 Thread Kevin Lam
Hi Dian, I have unit tests for which both sets of code (Row subclass vs. custom Python class) passes. The OOM occurs when reading a large amount of data from a kafka topic. At the moment I don't have a simple example to reproduce the issue, I'll let you know. On Tue, Mar 23, 2021 at 2:17 AM Dian

Re: OOM issues with Python Objects

2021-03-22 Thread Dian Fu
Hi Kevin, Is it possible to provide a simple example to reproduce this issue? PS: It will use pickle to perform the serialization/deserialization if you don't specify the type info. Regards, Dian On Mon, Mar 22, 2021 at 10:55 PM Arvid Heise wrote: > Hi Kevin, > > yes I understood that, but t

Re: OOM issues with Python Objects

2021-03-22 Thread Arvid Heise
Hi Kevin, yes I understood that, but then your Python class contains a Row field, where no mapping exists. I'm assuming PyFlink tries to do a deep conversion and fails to do so by ending in some infinite loop. On Mon, Mar 22, 2021 at 3:48 PM Kevin Lam wrote: > Thanks for the response Arvid! Poi

Re: OOM issues with Python Objects

2021-03-22 Thread Kevin Lam
Thanks for the response Arvid! Point of clarification, *things do NOT OOM when I use the Row subclass*. Instead, the code that doesn't use the Row subclass is the code that OOMs (ie. the simple python class). On Mon, Mar 22, 2021 at 10:24 AM Arvid Heise wrote: > Hi Kevin, > > I suspect that th

Re: OOM issues with Python Objects

2021-03-22 Thread Arvid Heise
Hi Kevin, I suspect that this is because Row is not supported as a Python field [1]; it's supposed to be a dict that is mapped to a Row by Flink. Maybe it runs in some infinite loop while trying serialize and hence the OOM. Subclassing Row might be an undocumented feature. I'm also pulling in Di