Thank you, Manu and Russell, for your answers.

Is there a document, ticket, or commit where I could find some information or an example of how the partition transforms are implemented, and which parts of the code they touch?

Many thanks again :)
Joseph

On Wed, Jul 5, 2023 at 1:43 PM <russell.spit...@gmail.com> wrote:

> We have been discussing something like this as well, either an arbitrary
> partitioning scheme or just a more extensive and customizable transform.
>
> An example I’m interested in is a geo hash index where we store offsets on
> a large grid to denote partitions. The total offset file for the whole
> planet still only ends up being in the low megabytes while accounting for
> high density in cities and low density over oceans
>
> Sent from my iPhone
>
> On Jul 4, 2023, at 8:08 AM, Joseph Allemandou <jalleman...@wikimedia.org>
> wrote:
>
> Hi Iceberg team,
>
> I'm working at the WikimediaFoundation, and we started using Iceberg for
> some of our big-data tables - we love it :)
>
> One of the needs we'll have in the future would be to partition data using
> a specific bucketing function.
> How complex would that be to add a new function to the ones already
> present in the Iceberg partitioning mechanism? Is there any docs on doing
> that?
> Bonus points: Are there any plans to make it possible for users to
> reference their own bucketing functions at table definition?
>
> Many thanks for the awesome project<3
>
> --
> Joseph Allemandou (joal) (he / him)
> Staff Data Engineer
> Wikimedia Foundation
>

--
Joseph Allemandou (joal) (he / him)
Staff Data Engineer
Wikimedia Foundation