Hi Dian,
I have unit tests for which both sets of code (Row subclass vs. custom
Python class) passes. The OOM occurs when reading a large amount of data
from a kafka topic.
At the moment I don't have a simple example to reproduce the issue, I'll
let you know.
On Tue, Mar 23, 2021 at 2:17 AM Dian
Hi Kevin,
Is it possible to provide a simple example to reproduce this issue?
PS: It will use pickle to perform the serialization/deserialization if you
don't specify the type info.
Regards,
Dian
On Mon, Mar 22, 2021 at 10:55 PM Arvid Heise wrote:
> Hi Kevin,
>
> yes I understood that, but t
Hi Kevin,
yes I understood that, but then your Python class contains a Row field,
where no mapping exists. I'm assuming PyFlink tries to do a deep conversion
and fails to do so by ending in some infinite loop.
On Mon, Mar 22, 2021 at 3:48 PM Kevin Lam wrote:
> Thanks for the response Arvid! Poi
Thanks for the response Arvid! Point of clarification, *things do NOT OOM
when I use the Row subclass*. Instead, the code that doesn't use the Row
subclass is the code that OOMs (ie. the simple python class).
On Mon, Mar 22, 2021 at 10:24 AM Arvid Heise wrote:
> Hi Kevin,
>
> I suspect that th
Hi Kevin,
I suspect that this is because Row is not supported as a Python field [1];
it's supposed to be a dict that is mapped to a Row by Flink.
Maybe it runs in some infinite loop while trying serialize and hence the
OOM.
Subclassing Row might be an undocumented feature.
I'm also pulling in Di
Hi all,
I've encountered an interesting issue where I observe an OOM issue in my
Flink Application when I use a DataStream of Python Objects, but when I
make that Python Object a Subclass of pyflink.common.types.Row and provide
TypeInformation, there is no issue.
For the OOM scenario, no elements