Thanks Ryan for checking this out!

IcebergWritable wraps a Record into a Container and a Writable, which is
why I try to create a Record here.

The problem is that the metadata table scan returns StructLike rows, and I
have to match those first against the metadata table's schema and then
against the read schema.
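In case it helps to see the matching problem concretely, here is a minimal, self-contained sketch (not Iceberg's API; the class and method names are made up for illustration) of projecting a positional row onto a narrower read schema by field name:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration: the incoming row is positional, so each value
// is matched to a field name via the table schema, and only the fields
// requested by the read schema are kept.
public class RowProjection {

  static Map<String, Object> project(List<String> tableFieldNames,
                                     Set<String> readFieldNames,
                                     List<Object> row) {
    Map<String, Object> record = new LinkedHashMap<>();
    for (int i = 0; i < row.size(); i++) {
      String name = tableFieldNames.get(i);
      // Skip values whose field is not part of the read schema
      if (readFieldNames.contains(name)) {
        record.put(name, row.get(i));
      }
    }
    return record;
  }

  public static void main(String[] args) {
    Map<String, Object> r = project(
        Arrays.asList("file_path", "file_format", "record_count"),
        new HashSet<>(Arrays.asList("file_path", "record_count")),
        Arrays.asList("s3://bucket/a.parquet", "PARQUET", 42L));
    System.out.println(r);  // {file_path=s3://bucket/a.parquet, record_count=42}
  }
}
```

The real code of course has to use Iceberg's Schema/StructLike types instead of plain lists, but the shape of the matching is the same.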

I have seen in the docs that metadata table queries already work for
Spark, and I was hoping to avoid duplicating that work, but I could not
find the relevant part of the code yet.
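As an aside, the Long -> OffsetDateTime conversion mentioned in the earlier message can be sketched with java.time alone, assuming the long holds microseconds from the Unix epoch (which is how Iceberg represents timestamptz values internally; Iceberg also ships a similar helper in its DateTimeUtil class):

```java
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;

// Sketch only: convert a raw microseconds-from-epoch long, as stored in a
// metadata row, into an OffsetDateTime at UTC.
public class MicrosToTimestamp {

  static OffsetDateTime fromMicros(long micros) {
    // floorDiv/floorMod keep pre-epoch (negative) values correct
    long seconds = Math.floorDiv(micros, 1_000_000L);
    long nanos = Math.floorMod(micros, 1_000_000L) * 1_000L;
    return OffsetDateTime.ofInstant(Instant.ofEpochSecond(seconds, nanos), ZoneOffset.UTC);
  }

  public static void main(String[] args) {
    System.out.println(fromMicros(0L));  // 1970-01-01T00:00Z
    System.out.println(fromMicros(1_626_765_900_000_000L));
  }
}
```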

Thanks Peter


On Tue, 20 Jul 2021, 02:05 Ryan Blue, <b...@tabular.io> wrote:

> Peter,
>
> The "data" tasks produce records using Iceberg's Record class and the
> internal representations. I believe that's what the existing Iceberg object
> inspectors use. Couldn't you just wrap this with an IcebergWritable and use
> the regular object inspectors?
>
> On Thu, Jul 15, 2021 at 8:53 AM Peter Vary <pv...@cloudera.com.invalid>
> wrote:
>
>> I have put together a somewhat working solution:
>>
>> case METADATA:
>>   return (CloseableIterable) CloseableIterable.transform(((DataTask) currentTask).rows(), row -> {
>>     Record record = GenericRecord.create(readSchema);
>>     List<Types.NestedField> tableFields = tableSchema.asStruct().fields();
>>     for (int i = 0; i < row.size(); i++) {
>>       Types.NestedField tableField = tableFields.get(i);
>>       if (readSchema.findField(tableField.name()) != null) {
>>         record.setField(tableField.name(), row.get(i, tableField.type().typeId().javaClass()));
>>       }
>>     }
>>     return record;
>>   });
>>
>> This works only for int/long/String and similar types, and it has
>> problems with the Long->OffsetDateTime conversion and friends.
>> I am almost sure there is an existing, better solution for this
>> somewhere already :)
>>
>> On Jul 15, 2021, at 15:57, Peter Vary <pv...@cloudera.com> wrote:
>>
>> Hi Team,
>>
>> I am working on enabling queries over metadata tables through Hive.
>> I was able to load the correct metadata table through the Catalogs, and I
>> created the TableScan, but I am stuck there ATM.
>>
>> What is the recommended way to get the Records for the Schema defined by
>> the MetadataTable using the Java API?
>> For data files we create our own readers, but I guess we already have a
>> better way to do that for metadata.
>>
>> Any pointers would be welcome.
>>
>> Thanks,
>> Peter
>>
>>
>>
>
> --
> Ryan Blue
> Tabular
>
