subject:"Spark cannot read iceberg tables which were originally written by Impala"

Re: Spark cannot read iceberg tables which were originally written by Impala

2024-01-03 Thread OpenInx

Hi Zoltan, Thanks for the issue. I think it's fair to wait for a new major release for this breaking change. Best Regards. On Wed, Jan 3, 2024 at 11:16 PM Zoltán Borók-Nagy wrote: > Hi, > > I created IMPALA-12675 > about annotating > STRING

Re: Spark cannot read iceberg tables which were originally written by Impala

2024-01-03 Thread Zoltán Borók-Nagy

Hi, I created IMPALA-12675 about annotating STRINGs with UTF8 by default. The code change should be trivial, but I'm afraid we will need to wait for a new major release with this (because users might store binary data in STRING columns, so it

Re: Spark cannot read iceberg tables which were originally written by Impala

2024-01-03 Thread OpenInx

Thanks Zoltan and Ryan for your feedback. I think we all agree on adding an option to promote BINARY to String (Approach A) on the Flink/Spark/Hive reader side, so that the historic datasets written by Impala on Hive can be read correctly. Besides that, applying approach B to future Apache Impala rel

Re: Spark cannot read iceberg tables which were originally written by Impala

2024-01-01 Thread Ryan Blue

Thanks for bringing this up and for finding the cause. I think we should add an option to promote binary to string (Approach A). That sounds pretty reasonable overall. I think it would be great if Impala also produced correct Parquet files, but that's beyond our control and there's, no doubt, a to

Re: Spark cannot read iceberg tables which were originally written by Impala

2023-12-26 Thread Zoltán Borók-Nagy

Hey Everyone, Thank you for raising this issue and reaching out to the Impala community. Let me clarify that the problem only happens when there is a legacy Hive table written by Impala, which is then converted to Iceberg. When Impala writes into an Iceberg table there is no problem with interope

Spark cannot read iceberg tables which were originally written by Impala

2023-12-25 Thread OpenInx

Hi dev, Sensordata [1] encountered an interesting Apache Impala & Iceberg bug in a real customer production environment. Their customers use Apache Impala to create a large number of Apache Hive tables in HMS, and have ingested PB-level datasets into their Hive tables (which were originally written b

6 matches
