Hello,

I have some questions about the compaction process. I need to manually trigger compaction on a standard partitioned ORC table (not ACID) and get back the list of files that were compacted. I could achieve this via HDFS by getting the directory listing and then triggering the compaction, but that would mean stopping the underlying processing so that no new files are added in between.

Here are some questions I could not answer myself from the material I found online:
- Is the compaction executed as a MapReduce job?
- Is there a way to get back the list of compacted files?
- How can the compaction criteria be customized?

Also, any links to documentation or other material would be really appreciated.

Thank you all for your time.

Riccardo