Sorry, I had a second look and your stack trace does not even point to
the spilling channel - it reads from the memory segment directly.
-> Setting the temp dirs will thus not make a difference.
I'm wondering why your deserializer eventually reads from a file on
gs:// directly, instead of, for examp

Hi Nico,
Unfortunately I can't share any of the data, but it is not even data being
processed at the point of failure - the job is still in the
matching-files-from-GCS phase.
I am using Apache Beam's FileIO to match files (roughly along the lines of
the sketch below), and during one of those match-files steps I get the
failure above.
Currently I run
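
For reference, a FileIO match step of this kind would look roughly like the
minimal sketch below; the bucket, pattern and class name are placeholders
rather than the actual job:

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.FileIO;
import org.apache.beam.sdk.io.TextIO;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.values.PCollection;

public class GcsMatchSketch {
  public static void main(String[] args) {
    // e.g. pass --runner=FlinkRunner to execute on Flink
    PipelineOptions options = PipelineOptionsFactory.fromArgs(args).create();
    Pipeline p = Pipeline.create(options);

    // The match step is where the "gs://" pattern has to be resolved by a
    // registered file system - the phase in which the failure occurs.
    PCollection<String> lines =
        p.apply("MatchFiles",
                FileIO.match().filepattern("gs://some-bucket/input/*.txt"))
         .apply("ReadMatches", FileIO.readMatches())
         .apply("ReadLines", TextIO.readFiles());

    p.run().waitUntilFinish();
  }
}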

Hi Encho,
the SpillingAdaptiveSpanningRecordDeserializer that you see in your
stack trace is executed while reading input records from another task.
If the (serialized) records are too large (> 5MiB), it will write and
assemble them in a spilling channel, i.e. on disk, instead of using
memory. This
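
When records do get spilled, the files go into Flink's temporary
directories. Assuming the standard configuration key, these can be pointed
at specific local disks in flink-conf.yaml, for example:

# flink-conf.yaml - placeholder path, adjust to a fast local disk
io.tmp.dirs: /mnt/local-disk/flink-tmp
# (the older key taskmanager.tmp.dirs has the same effect and is deprecated)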

Hello,
I am using Flink 1.5.3 and executing jobs through Apache Beam 2.6.0. One of
my jobs involves reading from Google Cloud Storage, which uses the file
scheme "gs://". Everything was fine, but once in a while I would get an
exception that the scheme is not recognised. Now I've started seeing them