Peter,

The "data" tasks produce records using Iceberg's Record class and the internal representations. I believe that's what the existing Iceberg object inspectors use. Couldn't you just wrap this with an IcebergWritable and use the regular object inspectors?
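For reference, here is a minimal sketch of what "internal representations" means for the Long->OffsetDateTime problem mentioned in the quoted message below. It assumes the Iceberg convention that timestamptz values are stored as a long count of microseconds from the Unix epoch and date values as an int count of days from the epoch. The class and method names (InternalValueSketch, timestamptzFromMicros, dateFromDays) are hypothetical and use only java.time; they are not Iceberg API, and the object inspectors may well handle this conversion already.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.temporal.ChronoUnit;

// Hypothetical helper, not part of Iceberg: converts internal values
// (long micros, int days) into java.time objects.
public class InternalValueSketch {
  private static final OffsetDateTime EPOCH =
      Instant.ofEpochSecond(0).atOffset(ZoneOffset.UTC);
  private static final LocalDate EPOCH_DAY = EPOCH.toLocalDate();

  // Assumption: timestamptz is a long holding microseconds from the epoch (UTC).
  static OffsetDateTime timestamptzFromMicros(long micros) {
    return ChronoUnit.MICROS.addTo(EPOCH, micros);
  }

  // Assumption: date is an int holding days from the epoch.
  static LocalDate dateFromDays(int days) {
    return ChronoUnit.DAYS.addTo(EPOCH_DAY, days);
  }

  public static void main(String[] args) {
    // Both sample values fall on the same calendar day.
    System.out.println(timestamptzFromMicros(1_626_307_200_000_000L));
    System.out.println(dateFromDays(18_823));
  }
}
```

A per-field conversion like this would only be needed if the rows are copied into a GenericRecord by hand; passing the internal values straight to inspectors that expect them avoids it.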

On Thu, Jul 15, 2021 at 8:53 AM Peter Vary <pv...@cloudera.com.invalid> wrote:

> I have put together a somewhat working solution:
>
> case METADATA:
>   return (CloseableIterable) CloseableIterable.transform(((DataTask) currentTask).rows(), row -> {
>     Record record = GenericRecord.create(readSchema);
>     List<Types.NestedField> tableFields = tableSchema.asStruct().fields();
>     for (int i = 0; i < row.size(); i++) {
>       Types.NestedField tableField = tableFields.get(i);
>       if (readSchema.findField(tableField.name()) != null) {
>         record.setField(tableField.name(), row.get(i, tableField.type().typeId().javaClass()));
>       }
>     }
>     return record;
>   });
>
> This works only for int/long/string and similar types, and it has problems
> with Long->OffsetDateTime conversion and friends.
> I am almost sure that a better solution for this already exists
> somewhere :)
>
> On Jul 15, 2021, at 15:57, Peter Vary <pv...@cloudera.com> wrote:
>
> Hi Team,
>
> I am working to enable running queries on metadata tables through Hive.
> I was able to load the correct metadata table through the Catalogs, and I
> created the TableScan, but I am stuck there ATM.
>
> What is the recommended way to get the Records for the Schema defined by
> the MetadataTable using the Java API?
> For data files we create our own readers, but I guess we already have a
> better way to do that for metadata.
>
> Any pointers would be welcome.
>
> Thanks,
> Peter

--
Ryan Blue
Tabular