Hi Billy,
I suspect that this is not possible in Flink as is. A tar file acts as a
directory containing an arbitrary number of files. AFAIK, Flink assumes
that every compressed file is just a single file, like gz without tar.
That is effectively your case, but then the tar part doesn't make much sense.
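
One possible workaround, if preprocessing the input outside Flink is acceptable, is to re-pack the single file inside the archive as a plain .gz, which readTextFile() should then be able to decompress on its own. The following is only a minimal sketch using Python's standard tarfile and gzip modules. The function name tar_gz_to_gz and the file names are made up for illustration, and it assumes the archive contains exactly one regular file.

```python
import gzip
import shutil
import tarfile


def tar_gz_to_gz(src_path, dst_path):
    """Re-pack the first regular file inside a .tar.gz as a plain .gz.

    Hypothetical helper: assumes the archive holds a single regular file.
    Returns the name of the member that was copied.
    """
    # "r|gz" reads the archive as a forward-only stream, so the
    # uncompressed tar is never written to disk or held in memory.
    with tarfile.open(src_path, mode="r|gz") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories, links, and other entries
            source = archive.extractfile(member)
            with gzip.open(dst_path, "wb") as target:
                # Copy in 1 MiB chunks to keep memory use flat.
                shutil.copyfileobj(source, target, length=1024 * 1024)
            return member.name
    raise ValueError("no regular file found in " + src_path)
```

For example, tar_gz_to_gz("foo.tar.gz", "foo.gz") would produce a foo.gz that contains only the file's contents, without the tar headers.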
We have an input file that is tarred and compressed to 12 GB. It is about
50 GB uncompressed.
With readTextFile(), I see it decompress the file, but then Flink doesn't
seem to handle the untar step. The archive contains just a single file. (We
don't control the input format.)
foo.tar.gz 12gb
foo.tar 50gb
then un