I think you'd have to go with one of the first two options (something in
the schema) rather than a flag in a library. The problem with a library
flag is that someone who has an Avro file to deserialize might not know
whether its UUIDs were encoded as bytes or as strings. They would be left
guessing one and retrying with the other if the first failed, which would
not be a pleasant experience.
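
To make the difference concrete, here is a rough Python sketch. It is not
Avro's API: the function names are made up for illustration, "uuid-bytes"
and "storage-type" are only the proposals from this thread, and `raw` is
assumed to be the value's payload with the length prefix already removed.
With the choice recorded in the schema the reader can decide up front;
without it, the reader falls back to trial and error.

```python
import uuid


def uuid_storage(schema: dict) -> str:
    """Return "bytes" or "string" based on what the schema declares.

    "uuid-bytes" and "storage-type" are the hypothetical spellings
    proposed in the thread, not part of any released Avro spec.
    """
    logical = schema.get("logicalType")
    if logical == "uuid-bytes":  # option 1: a new logical type
        return "bytes"
    if logical == "uuid":  # option 2: an extra attribute, defaulting to string
        return schema.get("storage-type", "string")
    raise ValueError(f"not a UUID schema: {schema!r}")


def decode_uuid(schema: dict, raw: bytes) -> uuid.UUID:
    """Decode a UUID payload using the representation named in the schema."""
    if uuid_storage(schema) == "bytes":
        return uuid.UUID(bytes=raw)  # 16 raw bytes
    return uuid.UUID(raw.decode("ascii"))  # 36-character text form


def guess_uuid(raw: bytes) -> uuid.UUID:
    """What a reader is left doing if the choice lives only in a library flag."""
    try:
        return uuid.UUID(raw.decode("ascii"))
    except (UnicodeDecodeError, ValueError):
        # First guess failed; retry with the other representation.
        return uuid.UUID(bytes=raw)


if __name__ == "__main__":
    u = uuid.uuid4()
    print(decode_uuid({"logicalType": "uuid-bytes"}, u.bytes) == u)
    print(decode_uuid({"logicalType": "uuid"}, str(u).encode("ascii")) == u)
    print(guess_uuid(u.bytes) == u)
```

The schema-driven path never needs the try/except, which is the whole
point of putting the information in the schema.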

-Scott

On Fri, Dec 22, 2023 at 5:00 AM Martin Grigorov <mgrigo...@apache.org>
wrote:

> Hi,
>
> How would the application tell Avro which storage type to use: string or
> bytes?
> - a new logical type? e.g. "logicalType": "uuid-bytes"
> - an extra attribute? e.g. { ..., "logicalType": "uuid", "storage-type":
> "bytes" }
> - a global switch that tells the library to always use "string" or "bytes"
> for all UUIDs?
> - ...
>
> Martin
>
> On Fri, Dec 22, 2023 at 10:49 AM Fokko Driesprong <fo...@apache.org>
> wrote:
>
> > Hey everyone,
> >
> > For Iceberg we're using UUIDs in Avro and we're storing them as binary,
> > rather than a string. This has several advantages such as more compact
> > storage, more efficient reading, and more efficient skipping. For more
> > details, please check out the doc that I've created
> > <https://docs.google.com/document/d/16_oSWrEM7AFUCTe0uuraAEHxywezLfoEz5ahzwvhGUk/edit#heading=h.43xuauwfk7ow>
> > (and feel free to comment). Also created AVRO-3918
> > <https://issues.apache.org/jira/browse/AVRO-3918> on Jira to track this.
> >
> > Looking forward to hearing from y'all!
> >
> > Kind regards and happy holidays,
> >
> > Fokko Driesprong
> >
>
