Bumping this thread again. I received this SQL from a user today.
CREATE TABLE partition_by_epoch_timestamp ( > `id` STRING, > `data` DECIMAL(10,0), > `start_ts` DECIMAL(10,0), > `end_ts` DECIMAL(10,0) > ) > USING iceberg > PARTITIONED BY ( > days(from_unixtime(start_ts)) > ) Of course it fails now. Do you think it's possible to support? Or something similar? Thanks, Manu On Fri, Sep 13, 2024 at 3:51 AM rdb...@gmail.com <rdb...@gmail.com> wrote: > I was thinking about adding to the specification how the current date/time > transforms work with source columns that are longs. I don't think it helps > much to introduce a new transform, although that would make the > interpretation of the source data more obvious to users applying the > transform. > > On Wed, Sep 11, 2024 at 10:26 PM Micah Kornfield <emkornfi...@gmail.com> > wrote: > >> Maybe we could update the time-based partition functions to be applied >>> to a long column directly. It would treat that column like a timestamp in >>> milliseconds. Would that work? I need to think more about the implications >>> of doing that, but I don't think that we currently have an issue with >>> extending the supported source column types like that. >> >> >> By update do you mean adding new functions like >> HOUR_FROM_UNIX_EPOCH_MILLIS or overloading the definition to accept Long >> values and treat them as milliseconds. The former seems more reasonable to >> me. The latter I think has many of the same draw-backs raised on the other >> thread. IMO, both aren't super pleasing from a long term maintainability >> perspective. >> >> Cheers, >> Micah >> >> On Tue, Sep 10, 2024 at 8:50 AM rdb...@gmail.com <rdb...@gmail.com> >> wrote: >> >>> Maybe we could update the time-based partition functions to be applied >>> to a long column directly. It would treat that column like a timestamp in >>> milliseconds. Would that work? I need to think more about the implications >>> of doing that, but I don't think that we currently have an issue with >>> extending the supported source column types like that. >>> >>> On Mon, Sep 9, 2024 at 9:05 PM Manu Zhang <owenzhang1...@gmail.com> >>> wrote: >>> >>>> Hi all, >>>> >>>> I'd like to bump this thread since we don't want to allow long to >>>> timestamp promotion in V3 >>>> <https://lists.apache.org/thread/79y866zdhs2fmyv0nsfq3xvdsmqh7h8c>. >>>> What other options do we have? >>>> >>>> Thanks, >>>> Manu >>>> >>>> On Fri, Apr 5, 2024 at 12:09 AM Jean-Baptiste Onofré <j...@nanthrax.net> >>>> wrote: >>>> >>>>> Ah yes, milestone is fine. Thanks ! >>>>> >>>>> All good. >>>>> >>>>> Regards >>>>> JB >>>>> >>>>> On Thu, Apr 4, 2024 at 5:12 PM Eduard Tudenhoefner <edu...@tabular.io> >>>>> wrote: >>>>> > >>>>> > There is the V3 Spec milestone where it's tracked (amongst other >>>>> things). >>>>> > >>>>> > On Thu, Apr 4, 2024 at 9:44 AM Jean-Baptiste Onofré <j...@nanthrax.net> >>>>> wrote: >>>>> >> >>>>> >> Hi Eduard, >>>>> >> >>>>> >> Thanks for the update ! It makes sense to me. >>>>> >> >>>>> >> Maybe a GH label with spec or v3_spec would help to see what is >>>>> planned for v3 ? >>>>> >> >>>>> >> Regards >>>>> >> JB >>>>> >> >>>>> >> On Thu, Apr 4, 2024 at 9:36 AM Eduard Tudenhoefner < >>>>> edu...@tabular.io> wrote: >>>>> >> > >>>>> >> > Type promotion from Long to Timestamp is on the roadmap for the >>>>> V3 Spec, so that would be the preferred way. >>>>> >> > >>>>> >> > On Wed, Apr 3, 2024 at 10:38 AM Jean-Baptiste Onofré < >>>>> j...@nanthrax.net> wrote: >>>>> >> >> >>>>> >> >> Hi Manu >>>>> >> >> >>>>> >> >> TIMESTAMP_LONG type promotion could be the easiest way, it would >>>>> work >>>>> >> >> with the existing transform. >>>>> >> >> >>>>> >> >> Would it work for you? >>>>> >> >> >>>>> >> >> Regards >>>>> >> >> JB >>>>> >> >> >>>>> >> >> On Wed, Apr 3, 2024 at 5:56 AM Manu Zhang < >>>>> owenzhang1...@gmail.com> wrote: >>>>> >> >> > >>>>> >> >> > Hi all, >>>>> >> >> > >>>>> >> >> > We have source data with a timestamp field in LONG type to >>>>> land in an Iceberg table. We want to partition the table with the >>>>> timestamp >>>>> field such that we can query recent data more efficiently. However, LONG >>>>> is >>>>> not supported as the source type of time-based transform (hour, day, etc) >>>>> >> >> > >>>>> >> >> > I find the previous discussion >>>>> https://github.com/apache/iceberg/issues/417 and Ryan suggested two >>>>> solutions >>>>> >> >> > >>>>> >> >> > 1. type promotion from LONG to TIMESTAMP >>>>> >> >> > 2. custom transform >>>>> >> >> > >>>>> >> >> > As I understand it, neither solution has already been >>>>> implemented yet. Is there any progress in either direction? Which solution >>>>> does the community prefer? Any other suggestions are also appreciated. >>>>> >> >> > >>>>> >> >> > Thanks, >>>>> >> >> > Manu >>>>> >>>>