Re: [DISCUSS] Simplify multi-arg table metadata

2025-02-07 Thread Xianjin Ye
+1. I think it's good timing to allow multi-arg transform for V3 and onwards only. On 2025/02/03 18:26:00 "Driesprong, Fokko" wrote: > Hi everyone, > > While I was looking to add the V3 partition-spec (de/en)coder to PyIceberg, > I noticed that it allows for backporting the multi-arg transforms

Re: Table schema and partition spec update

2024-08-20 Thread Xianjin YE
table for the target spec. > This would make a fully dynamic sink. I don't have a concrete use case ATM, > so if it is not trivial, we could just leave it for later. > What surprised me is that there is no easy way to convert a Transform to a > PartitionSpec update. > > T

Re: Type promotion in v3

2024-08-20 Thread Xianjin YE
nce the >>>> type information would exist in Parquet. >>>> >>>> > And I think there’s also another aspect to consider: whether the new >>>> > type promotion is compatible with partition transforms. Currently all >>>> > the pa

Re: Type promotion in v3

2024-08-19 Thread Xianjin YE
ed on > the partition-spec-id. Evolving the partition spec would fix it. When we > decide to include the schema-id, we would be able to create the evaluator > based on the (partition-spec-id, schema-id) tuple when evaluating the > partitions. > > Kind regards, > Fokko > >

Re: Type promotion in v3

2024-08-19 Thread Xianjin YE
Thanks Ryan for bringing this up. > int and long to string Could you elaborate a bit on how we can support type promotion for `int` and `long` to `string` if the upper and lower bounds are already encoded in 4/8 bytes binary? It seems that we cannot add promotions to string as Piotr pointed o

Re: Table schema and partition spec update

2024-08-19 Thread Xianjin YE
Hey Péter, For evolving the schema, Spark has the ability to mergeSchema based on the new incoming Schema, you may want t

Re: [DISCUSS] Implementing a table-level statistics file to store column statistics

2024-08-06 Thread Xianjin YE
Thanks for raising the discussion Huaxin. I also think partition-level statistics file(s) are more useful and have advantages over table-level stats. For instance: 1. It would be straightforward to support incremental stats computing for large tables: by recalculating new or updated partitions on

Re: Flink Table Maintenance - Tag based locking

2024-08-06 Thread Xianjin YE
> DataFile rewrite will create a new manifest file. This means if a DataFile > rewrite task is finished and committed, and there is a concurrent > ManifestFile rewrite then the ManifestFile rewrite will fail. I have played > around with serializing the Maintenance Tasks (resulted in a very ugly/

Re: [DISCUSS] DROP PARTITION in Spark

2024-08-06 Thread Xianjin YE
at should happen, not > how to do it. > > On Fri, Aug 2, 2024 at 10:20 AM Xianjin YE <mailto:xian...@apache.org>> wrote: >> > we would instead add support for pushing down `CAST` expressions from Spark >> >> Supporting pushing down more expressions is de

Re: [DISCUSS] Clarify in REST spec expected implementation behavior for unknown updates or requirements

2024-08-06 Thread Xianjin YE
Thanks Amogh for driving this discussion. I’m also +1 for 400 status code as others pointed out that the server is unable to determine whether the request is well formed or not. > On Aug 6, 2024, at 05:28, Amogh Jahagirdar <2am...@gmail.com> wrote: > > I also went back and forth on 400 vs 422 but ult

Re: [DISCUSS] DROP PARTITION in Spark

2024-08-02 Thread Xianjin YE
> data for a date easily as well. > > On Fri, Aug 2, 2024 at 6:32 AM Xianjin YE <mailto:xian...@apache.org>> wrote: >> > b) they have a concern that with getting the WHERE filter of the DELETE >> > not aligned with partition boundaries they might end up havin

Re: [DISCUSS] DROP PARTITION in Spark

2024-08-02 Thread Xianjin YE
> b) they have a concern that with getting the WHERE filter of the DELETE not > aligned with partition boundaries they might end up having pos-deletes that > could have an impact on their read perf I think this is a legit concern and currently `DELETE FROM` cannot guarantee that. It would be va