Re: Supporting TINYINT and SMALLINT in Iceberg

2021-12-10 Thread Ryan Blue
For TINYINT and SMALLINT, I don't think there is any advantage at the storage layer. Avro uses variable-length ints, and the columnar formats, Parquet and ORC, efficiently encode multiple values in a column. I don't see much value in these types beyond compatibility with existing SQL.
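
To illustrate the point about variable-length ints, here is a minimal sketch of the zig-zag varint scheme that Avro uses for its int and long types. The function name `zigzag_varint` is hypothetical and this is not Avro's or Iceberg's actual code; it only shows that the encoded width depends on the value, not on the declared type.

```python
def zigzag_varint(n: int) -> bytes:
    """Encode a signed integer (64-bit range assumed) as a zig-zag varint."""
    # Zig-zag step: map small magnitudes, positive or negative,
    # to small unsigned values (0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ...).
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while True:
        low7 = z & 0x7F
        z >>= 7
        if z:
            # More bits remain: set the continuation bit.
            out.append(low7 | 0x80)
        else:
            out.append(low7)
            return bytes(out)


# Encoded size for values at the edge of each SQL integer range.
for value in (0, 63, 127, -128, 32767, 2**31 - 1):
    print(value, len(zigzag_varint(value)))
```

Under this scheme any value in the TINYINT range (-128 to 127) takes at most 2 bytes and any value in the SMALLINT range takes at most 3 bytes, even when the column is declared as a 32-bit INT, which is why a narrower declared type saves nothing in the encoded data.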

Re: Supporting TINYINT and SMALLINT in Iceberg

2021-12-10 Thread Walaa Eldin Moustafa
Just to update this thread, we have agreed internally to use INT in the struct schema corresponding to union types. The reasons are two-fold: (1) Uncertainty around whether TINYINT will make it to Iceberg while we wanted to stick to the spec. (2) Since Avro does not support TINYINT either, this iss

Re: Supporting TINYINT and SMALLINT in Iceberg

2021-12-03 Thread Walaa Eldin Moustafa
Also, wanted to add another observation that is on the flip side of the initial argument. Some data formats like Avro do not support TINYINT either. So even if Trino uses TINYINT in the struct schema, when the table is written back to Avro, TINYINT will not be used. This supports the side of the ar