Is using Beam SQL just massively more resource-intensive, or is there a bug 
somewhere (in my code or elsewhere)? (I'm using the Flink runner.)

Here is my code:
https://pastebin.com/nNtc9ZaG

Here is the error I get when I run the SQL transform and my memory/CPU usage 
skyrockets (truncated at the end because it's so long and seemingly 
repetitive):
https://pastebin.com/mywmkCQi

For example,

After several early-firing triggers: 13-15% CPU, 1.5-2 GB RAM, everything 
working fine:
`rowStream.apply(Combine.globally(Count.<Row>combineFn()).withoutDefaults()).apply(myPrint());`

After a single early-firing trigger: CPU usage shoots to 90%+, 4.7 GB+ RAM, and 
it soon crashes:
`rowStream.apply(SqlTransform.query("SELECT COUNT(*) FROM PCOLLECTION")).apply(myPrint());`

I can't imagine this is expected behavior, but maybe I'm just ignorant of how 
Beam SQL is implemented.

Most of this code is just getting Schema Registry Avro Kafka messages into a 
Row stream. There have been recent discussions on the mailing list about how 
to do that, and this is the best I could come up with. If that part is 
incorrect, I'd be glad to know.

Any help appreciated. Thank you.
