Hi,
I read this section of the Pig documentation:
https://pig.apache.org/docs/r0.11.1/func.html#handling-compression

According to it, concatenated bzip/gzip files will produce incorrect
results.
I ran a test: I concatenated some files and processed them. However, all the
results were identical to the ones produced from the non-concatenated
files. Why? I expected them to differ.

Then I saw: https://issues.apache.org/jira/i#browse/HADOOP-6835

My questions:
1. Is https://pig.apache.org/docs/r0.11.1/func.html#handling-compression
still correct, i.e. will concatenation produce wrong results? Is this true
for every concatenated file, or does it happen only occasionally?
2. Is there any way to find out whether a tar.gz or tar.bz2 file is
concatenated?
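
For question 2, one check I could imagine is counting the gzip members
(independent compressed streams) in a file: a plain file has one, a file
built with cat has several. Below is a minimal Python sketch of that idea.
It is not a Pig or Hadoop API; count_gzip_members is a name I made up, it
holds the whole input in memory, and I assume the file has no trailing
padding after the last member.

```python
import gzip
import zlib


def count_gzip_members(data: bytes) -> int:
    """Count the gzip members (independent streams) in a byte string."""
    members = 0
    while data:
        # 16 + MAX_WBITS makes zlib expect one gzip header and trailer.
        decomp = zlib.decompressobj(16 + zlib.MAX_WBITS)
        decomp.decompress(data)
        if not decomp.eof:
            raise ValueError("truncated or corrupt gzip member")
        members += 1
        # Bytes left after the end of this member start the next one, if any.
        data = decomp.unused_data
    return members


if __name__ == "__main__":
    # Two separately compressed pieces, glued together like `cat a.gz b.gz`.
    first = gzip.compress(b"first\n")
    second = gzip.compress(b"second\n")
    print(count_gzip_members(first))
    print(count_gzip_members(first + second))
```

A count greater than 1 would mean the file is concatenated. I think the
same loop could be written for bzip2 with bz2.BZ2Decompressor, which also
exposes eof and unused_data, but I have not tried it.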
