Hi Dian,
I have unit tests for which both sets of code (Row subclass vs. custom
Python class) passes. The OOM occurs when reading a large amount of data
from a kafka topic.
At the moment I don't have a simple example to reproduce the issue, I'll
let you know.
On Tue, Mar 23, 2021 at 2:17 AM Dian
Hi Kevin,
Is it possible to provide a simple example to reproduce this issue?
PS: It will use pickle to perform the serialization/deserialization if you
don't specify the type info.
Regards,
Dian
On Mon, Mar 22, 2021 at 10:55 PM Arvid Heise wrote:
> Hi Kevin,
>
> yes I understood that, but t
Hi Kevin,
yes I understood that, but then your Python class contains a Row field,
where no mapping exists. I'm assuming PyFlink tries to do a deep conversion
and fails to do so by ending in some infinite loop.
On Mon, Mar 22, 2021 at 3:48 PM Kevin Lam wrote:
> Thanks for the response Arvid! Poi
Thanks for the response Arvid! Point of clarification, *things do NOT OOM
when I use the Row subclass*. Instead, the code that doesn't use the Row
subclass is the code that OOMs (ie. the simple python class).
On Mon, Mar 22, 2021 at 10:24 AM Arvid Heise wrote:
> Hi Kevin,
>
> I suspect that th
Hi Kevin,
I suspect that this is because Row is not supported as a Python field [1];
it's supposed to be a dict that is mapped to a Row by Flink.
Maybe it runs in some infinite loop while trying serialize and hence the
OOM.
Subclassing Row might be an undocumented feature.
I'm also pulling in Di