Hi Xiang,

On Thu, Nov 5, 2020 at 11:07 AM 李响 <wate...@gmail.com> wrote:
> Dear community:
>
> I am using SparkTableUtil to import an existing Hive table to an Iceberg
> table.
> The ORC files of the Hive table are in an old version of ORC, so I set a name
> mapping (like: id 1 mapped to _col0 and id 2 mapped to _col1...) to the
> Iceberg table by using "schema.name-mapping.default" so that the metrics of
> ORC files could be built correctly during the import process.
>
> After that, I plan to write new data into the Iceberg table (using the ORC
> version 1.6.5 in the iceberg package), how could I deal with that name
> mapping used for importing? Should I remove that? Does that name mapping
> do any harm when reading/writing from/to the new ORC file?
>

If I understand correctly, the name-mapping only applies when no Iceberg IDs
are found in the ORC file as type attributes, which is the case for the
imported data.

All new data you write with Iceberg/ORC will have the Iceberg field-id stored
as a type attribute, so when reading those new files the name-mapping should
have no effect, since the read path will detect the Iceberg field-ids.

Cheers,
--
Edgar R
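
To illustrate the lookup order described above, here is a minimal Python sketch. It is a simulation only, not Iceberg's actual reader code: the function resolve_ids, the attribute key "iceberg.id", and the column names "id" and "data" are assumptions made for the example. Only the mapping of id 1 to _col0 and id 2 to _col1 comes from the original question.

```python
# Assumed attribute key under which a field ID is stored on an ORC type.
ID_ATTR = "iceberg.id"


def resolve_ids(columns, name_mapping):
    """Return {column name: field ID} for one file (hypothetical helper).

    columns: dict of column name -> dict of type attributes.
    name_mapping: dict of column name -> field ID, standing in for the
    table's "schema.name-mapping.default" property.
    """
    # If the file carries any field IDs, they are used and the mapping is ignored.
    file_has_ids = any(ID_ATTR in attrs for attrs in columns.values())
    if file_has_ids:
        return {
            name: int(attrs[ID_ATTR]) if ID_ATTR in attrs else None
            for name, attrs in columns.items()
        }
    # No IDs in the file (imported Hive data): fall back to the name mapping.
    return {name: name_mapping.get(name) for name in columns}


# Mapping from the question: id 1 -> _col0, id 2 -> _col1.
name_mapping = {"_col0": 1, "_col1": 2}

# Imported Hive file: positional column names, no ID attributes.
imported_file = {"_col0": {}, "_col1": {}}

# File written through Iceberg: IDs stored as attributes (names are made up).
new_file = {"id": {ID_ATTR: "1"}, "data": {ID_ATTR: "2"}}

imported_ids = resolve_ids(imported_file, name_mapping)
new_ids = resolve_ids(new_file, name_mapping)
```

In this sketch the imported file resolves its columns through the mapping, while the new file resolves them from its own attributes whether or not the mapping is still set on the table.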