Hi,

I read this section of the Pig documentation: https://pig.apache.org/docs/r0.11.1/func.html#handling-compression
According to it, any concatenated bzip/gzip files will produce strange results. I ran a test: I concatenated some files and processed them. However, all the results were identical to the ones produced from the non-concatenated files. Why? I expected them to be different.

Then I saw: https://issues.apache.org/jira/i#browse/HADOOP-6835

My questions:
1. Is https://pig.apache.org/docs/r0.11.1/func.html#handling-compression still correct, i.e. will concatenation produce wrong results? Is this true for any concatenated files, or does it only happen occasionally?
2. Is there any way to find out whether a tar.gz or tar.bz2 file is concatenated?
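
To make question 2 concrete, here is a minimal sketch of the kind of check I have in mind, assuming that "concatenated" means the file contains more than one gzip member or bzip2 stream back to back. The function names are hypothetical, it only uses Python's standard zlib and bz2 modules, and it takes the whole file contents as bytes, so it is only suitable for small test files. It looks at the compressed container only, not at the tar archive inside.

```python
import bz2
import gzip
import zlib


def count_gzip_members(data: bytes) -> int:
    """Count back-to-back gzip members in `data` (hypothetical helper)."""
    count = 0
    while data:
        # wbits=31 tells zlib to expect a gzip header and trailer.
        d = zlib.decompressobj(wbits=31)
        d.decompress(data)
        if not d.eof:
            raise ValueError("truncated gzip member")
        count += 1
        # Bytes after the end of this member; empty if it was the last one.
        data = d.unused_data
    return count


def count_bz2_streams(data: bytes) -> int:
    """Count back-to-back bzip2 streams in `data` (hypothetical helper)."""
    count = 0
    while data:
        d = bz2.BZ2Decompressor()
        d.decompress(data)
        if not d.eof:
            raise ValueError("truncated bzip2 stream")
        count += 1
        data = d.unused_data
    return count


if __name__ == "__main__":
    # Build a small concatenated sample in memory instead of reading a file.
    sample = gzip.compress(b"first\n") + gzip.compress(b"second\n")
    print("gzip members:", count_gzip_members(sample))
```

A count greater than 1 would mean the file is concatenated in this sense. Is that a reasonable way to check, or is there something better?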
