rdblue opened a new pull request #21: Add manifest listing files
URL: https://github.com/apache/incubator-iceberg/pull/21
 
 
   This adds a separate file, a manifest list, to track the manifests for a 
snapshot. The manifest list is an Avro file with a row for each manifest. The 
columns in each row are used to avoid reading manifests that cannot contain 
the data files being looked for.
   
   Columns include:
   * `manifest_path`: path of the manifest file
   * `partition_spec_id`: ID of the partition spec used to write the manifest 
(depends on #3)
   * `added_snapshot_id`: snapshot ID when the manifest was added to the table
   * `added_data_files_count`, `existing_data_files_count`, 
`deleted_data_files_count`: data file counts by status, to track operations
   * `partitions`: a summary (min, max, and containsNull for each field) of the 
partitions in the manifest file
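
   As a rough illustration of how the `partitions` summary could be used to 
skip manifests, here is a minimal Python sketch. It is not the code in this 
PR: the class names, the `lower`/`upper` field names, the helper functions, 
and the example paths are all hypothetical, and it assumes partition values 
are directly comparable.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class FieldSummary:
    """Summary of one partition field across a manifest (hypothetical names)."""
    contains_null: bool
    lower: Optional[Any]  # min partition value; None if every value is null
    upper: Optional[Any]  # max partition value; None if every value is null


@dataclass
class ManifestListRow:
    """One row of the manifest list, using the columns from the description."""
    manifest_path: str
    partition_spec_id: int
    added_snapshot_id: int
    added_data_files_count: int
    existing_data_files_count: int
    deleted_data_files_count: int
    partitions: List[FieldSummary]


def might_contain(row: ManifestListRow, field_pos: int, value: Any) -> bool:
    """Return False only when the summary proves no file can match `value`."""
    summary = row.partitions[field_pos]
    if value is None:
        return summary.contains_null
    if summary.lower is None or summary.upper is None:
        # No non-null values were recorded for this field.
        return False
    return summary.lower <= value <= summary.upper


def manifests_to_read(rows: List[ManifestListRow], field_pos: int, value: Any) -> List[str]:
    """Paths of manifests that must still be opened for an equality filter."""
    return [r.manifest_path for r in rows if might_contain(r, field_pos, value)]


# Hypothetical manifest list with a single date-string partition field.
rows = [
    ManifestListRow("m1.avro", 0, 100, 2, 0, 0,
                    [FieldSummary(False, "2018-01-01", "2018-01-31")]),
    ManifestListRow("m2.avro", 0, 101, 1, 2, 0,
                    [FieldSummary(True, "2018-02-01", "2018-02-28")]),
]
```

   With these rows, a filter on "2018-02-14" only needs to open `m2.avro`, and 
a filter on a value outside both ranges opens no manifests at all.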
