Hello folks,
We see threads from
https://github.com/aws/aws-sdk-java/blob/master/aws-java-sdk-s3/src/main/java/com/amazonaws/services/s3/transfer/internal/TransferManagerUtils.java#L49
outlive a batch job that writes Parquet files to S3, causing a ClassLoader
leak. Is this a known issue? Logically, a close on the TransferManager
should shut down the ExecutorService (and thus the threads).
The code is fairly straightforward:
val job = new Job()
val hadoopOutFormat = new HadoopOutputFormat[Void, GenericRecord](
  new AvroParquetOutputFormat(), job)
AvroParquetOutputFormat.setSchema(job, schema)
FileOutputFormat.setOutputPath(job, new org.apache.hadoop.fs.Path(path))
ParquetOutputFormat.setCompression(job, CompressionCodecName.SNAPPY)
ParquetOutputFormat.setEnableDictionary(job, true) // do we need this?
and then an output.
This is using
scalaVersion := "2.12.12"
flinkVersion = "1.11.2"
hadoopVersion = "2.8.3"
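To make the expectation about the ExecutorService concrete, here is a minimal sketch in plain Scala with no AWS or Flink classes involved. The object, method, and thread names are made up for illustration, and it is not the SDK's code. It only shows that an idle pool thread stays alive after its task has finished, and goes away once the pool is shut down, which is what I would expect a close on the TransferManager to do for its own pool.

```scala
import java.util.concurrent.{ConcurrentLinkedQueue, Executors, ThreadFactory, TimeUnit}
import scala.collection.JavaConverters._

// Hypothetical stand-in for a library-owned thread pool; not the AWS SDK implementation.
object PoolShutdownSketch {

  // Returns (pool threads alive before shutdown, pool threads alive after shutdown).
  def demo(): (Int, Int) = {
    // Keep a handle on every thread the pool creates so we can inspect it later.
    val created = new ConcurrentLinkedQueue[Thread]()
    val factory = new ThreadFactory {
      override def newThread(r: Runnable): Thread = {
        val t = new Thread(r, "sketch-worker-" + created.size)
        created.add(t)
        t
      }
    }

    val pool = Executors.newFixedThreadPool(2, factory)

    // Run one trivial task and wait for it; the worker then sits idle but stays alive.
    pool.submit(new Runnable { override def run(): Unit = () }).get()
    val before = created.asScala.count(_.isAlive)

    // Shutting the pool down is what lets the worker threads exit.
    pool.shutdown()
    pool.awaitTermination(10, TimeUnit.SECONDS)
    created.asScala.foreach(_.join(10000))
    val after = created.asScala.count(_.isAlive)

    (before, after)
  }
}
```

Without the shutdown() call, the idle worker would keep running after the job's own code has returned. In our case that would also keep the job's ClassLoader reachable, if the thread references classes loaded by it.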
Regards
Vishal