Re:Re: Performance evaluation of Trino 468, Spark 4.0.0-RC2, and Hive 4 on Tez/MR3

2025-04-22 Thread lisoda
Hello Sungwoo. BTW, would you consider adding HIVE4-LLAP as a control group for the trial? Tks. Lisoda. On 2025-04-22 16:37:29, "Sungwoo Park" wrote: From average response time analysis: For Spark, it performs better than its total execution time suggests, with an average res

Re:Re: Performance evaluation of Trino 468, Spark 4.0.0-RC2, and Hive 4 on Tez/MR3

2025-04-22 Thread lisoda
's optimization strategies or investigate the reasons for the poor execution plans of these typical SQL queries? Tks. Lisoda. On 2025-04-22 16:37:29, "Sungwoo Park" wrote: From average response time analysis: For Spark, it performs better than its total execution time suggests, w

Re:Re: Performance evaluation of Trino 468, Spark 4.0.0-RC2, and Hive 4 on Tez/MR3

2025-04-22 Thread lisoda
Maybe HIVE4 ON TEZ should enable the container reuse feature so that the start/stop overhead of the application can be further reduced. On 2025-04-22 16:37:29, "Sungwoo Park" wrote: From average response time analysis: For Spark, it performs better than its total execution time suggests,
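The container-reuse suggestion above could be tried with session-level settings along the following lines. This is a minimal sketch, not a tested tuning recipe: `tez.am.container.reuse.enabled` is a Tez property that normally defaults to true already, so it is worth checking the effective value before assuming reuse is off, and AM-level settings generally apply only to Tez sessions started after the change. The pre-warm settings are Hive properties, and the container count of 4 is an arbitrary placeholder.

```sql
-- Sketch only: values are placeholders, not recommendations.
-- Reuse Tez containers across tasks instead of launching a new one per task.
SET tez.am.container.reuse.enabled=true;
-- Optionally pre-warm containers when a Tez session starts (assumed helpful here, not measured).
SET hive.prewarm.enabled=true;
SET hive.prewarm.numcontainers=4;
```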

Re: which one is better to use in hive when storage date like text type ,string or varchar?

2025-01-15 Thread lisoda
For the most part, I think just using the string type is sufficient. Replied Message: From: liubin_w...@yeah.net | Date: 01/16/2025 15:01 | To: user@hive.apache.org | Subject: which one is better to use in hive when storage date like text type ,string or varchar?
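To make the difference between the two types concrete, here is a small sketch using a hypothetical table (the table and column names are invented for illustration). In Hive, `STRING` has no declared length, while `VARCHAR(n)` carries a maximum length and silently truncates longer values on assignment.

```sql
-- Hypothetical table for illustration only.
CREATE TABLE demo_text_types (
  note_str STRING,       -- no declared length limit
  note_vc  VARCHAR(20)   -- values longer than 20 characters are truncated
);
```

Unless a downstream system needs the declared length, `STRING` avoids having to pick a limit up front, which matches the advice in the reply.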

Re:Re: Blog article 'Performance Tuning for Single-table Queries'

2024-12-29 Thread lisoda
should seek their opinions. Seonggon Namgung & Sungwoo Park, what do you think? -lisoda On 2024-12-30 04:44:04, "Ayush Saxena" wrote: If it is related to Hive we can get it posted on the Hive’s Twitter page as well, If you say so & share us what you want to write alon

Re: Blog article 'Performance Tuning for Single-table Queries'

2024-12-28 Thread lisoda
see again. Replied Message: From: lisoda | Date: 12/24/2023 00:28 | To: user | Subject: Re: Blog article 'Performance Tuning for Single-table Queries'. 🎉🚀 Replied Message: From: Sungwoo Park | Date: 12/24/2023 00:06 | To: user@hive.

Re: Blog article 'Performance Tuning for Single-table Queries'

2024-12-28 Thread lisoda
1 Replied Message: From: Sungwoo Park | Date: 12/24/2023 00:06 | To: user@hive.apache.org | Subject: Blog article 'Performance Tuning for Single-table Queries'. Hello Hive users, I have published a new blog article 'Performance Tuning for Single-table Queries'.

Re:Re: Question related to reuse of BytesColumnVector.vector[][].

2024-11-15 Thread lisoda
oduce the >issue. > >Best, >Stamatis > >On Sun, Sep 29, 2024 at 7:51 AM lisoda wrote: >> >> Currently, when we run HIVE version 4.0.0, the Sql often breaks abnormally, >> and the log message is as follows: >> >> >> 2024-09-29 00:18:06,5

Re:Re: Question related to reuse of BytesColumnVector.vector[][].

2024-10-30 Thread lisoda
Hello Stamatis. I submitted an issue in which I described the steps to reproduce the problem and provided the relevant dataset. Issue: https://issues.apache.org/jira/browse/HIVE-28598 It would be great if you could check it out. Tks. Lisoda. On 2024-10-02 18:05:01, "Stamatis Zampe

Re:HIVE-28488/28489/28490 and the performance of Hive 4.0.1 on MR3 1.12 (vs Trino 453)

2024-10-09 Thread lisoda
leading to skewed tasks. This makes the execution of the Query very slow. At present, we do not have a very good way to handle this scenario. Does anyone have experience dealing with this kind of problem? If someone can guide me or join the discussion, I would be very grateful. Thanks

Question related to reuse of BytesColumnVector.vector[][].

2024-09-28 Thread lisoda
Currently, when we run HIVE version 4.0.0, SQL queries often fail abnormally, and the log message is as follows: 2024-09-29 00:18:06,569 [INFO] [Dispatcher thread {Central}] |HistoryEventHandler.criticalEvents|: [HISTORY][DAG:dag_1721298780048_105514_11][Event:TASK_ATTEMPT_FINISHED]: vertexName

HIVE-22392 appears to have failed to be handled correctly

2024-09-24 Thread lisoda
Hello Team. We found that HIVE-22392 doesn't seem to have been handled correctly: JIRA shows the code for this PR as merged into the master branch, but in fact this part of the code has not been merged. We use JdbcStorageHandler a lot, and we would really like it to support writes, and it woul

Re:Re: Re: Merge Operation Failing Results in this SQL Error [40000] [42000]

2024-09-04 Thread lisoda
Hello Okumin. After porting the patch HIVE-28428, I observed that the slow query problem disappeared. Currently the query efficiency for orc+zstd tables is basically the same as for orc+snappy. Therefore I strongly recommend adding HIVE-28428 to version 4.0.1. Regards, lisoda At 2024-09-02 23

Re:Re: Merge Operation Failing Results in this SQL Error [40000] [42000]

2024-09-01 Thread lisoda
Hello Clinton: We have actually encountered the same issue where, in many cases, querying Iceberg does not meet expected efficiency, falling short of regular ORC/Parquet tables in speed. Since the current HiveIcebergInputSplit does not support splits based on file size, reading can be slow when

Re:Re: Iceberg HadoopCatalog and location_based_table

2024-07-15 Thread lisoda
Tks. At 2024-07-15 16:15:36, "Denys Kuzmenko" wrote: >see `HadoopInputFile` as an example

Iceberg HadoopCatalog and location_based_table

2024-07-15 Thread lisoda
rd to your reply. Regards. lisoda. iceberg-fix_pr: https://github.com/apache/iceberg/pull/10623

Re:Re: Re: Support java/11/17/21

2024-07-10 Thread lisoda
>https://issues.apache.org/jira/browse/HADOOP-18197?focusedCommentId=17820711&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17820711 >[3] https://github.com/apache/hadoop/pull/6593 > >On Wed, 10 Jul 2024 at 13:15, lisoda wrote: >> >> Hello Sir. >

Re:Re: Support java/11/17/21

2024-07-10 Thread lisoda
wrote: We are working towards supporting JDK-17, should take couple of months, we don’t have a planned deadline for that as of now Btw. hadoop didn’t drop support for JDK-8…. -Ayush On 10 Jul 2024, at 12:52 PM, lisoda wrote: Hi. Currently, Iceberg/hadoop/spark and the rest of the third-party

Support java/11/17/21

2024-07-10 Thread lisoda
Hi. Currently, Iceberg/hadoop/spark and the rest of the third-party frameworks have dropped support for JAVA8 (or are planning to do so). When will HIVE be able to support a higher version of the JDK and what progress has been made in this regard?

Some problems encountered when reading ICEBERG with vectorisation turned on

2024-03-10 Thread lisoda
Hi. I am using HIVE 4.0.0 to read ICEBERG tables. I am having some problems with it, so if someone could guide me, that would be great. Env: Hadoop 3.3.6, Hive 4.0.0, Tez 0.10.2, Iceberg 1.4.3. Iceberg table: hadoop-catalog-table/location_based_table. Question 1: How tez.mrreader.config.update.pr

Re: Blog article 'Performance Tuning for Single-table Queries'

2023-12-23 Thread lisoda
🎉🚀 Replied Message: From: Sungwoo Park | Date: 12/24/2023 00:06 | To: user@hive.apache.org | Subject: Blog article 'Performance Tuning for Single-table Queries'. Hello Hive users, I have published a new blog article 'Performance Tuning for Single-table Queries'.

when enable reducededuplication, count(distinct)+group by very slow

2023-12-19 Thread lisoda
Hi team. I found that when I enable reducededuplication, count(distinct) + GROUP BY becomes very slow. Is there a problem with reducededuplication? Test query info: | CONFIG | SQL | TIME | | hive.optimize.reducededuplication=true | select count(1) from(select uni_shop_id,partner,count(distinct uni_id)
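One way to narrow down a slowdown like this is to compare the query plan with the optimization switched off and on. The sketch below is an assumption-laden illustration: `hive.optimize.reducededuplication` is the property named in the post, but the table name `some_db.some_table` and the `GROUP BY` columns are hypothetical, because the original query is truncated in the archive.

```sql
-- Hypothetical table and grouping columns; the original query is cut off in the archive.
SET hive.optimize.reducededuplication=false;
EXPLAIN
SELECT uni_shop_id, partner, COUNT(DISTINCT uni_id)
FROM some_db.some_table
GROUP BY uni_shop_id, partner;

-- Repeat with the optimization enabled and compare the reducer stages of the two plans.
SET hive.optimize.reducededuplication=true;
EXPLAIN
SELECT uni_shop_id, partner, COUNT(DISTINCT uni_id)
FROM some_db.some_table
GROUP BY uni_shop_id, partner;
```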

A deadLock problem

2023-12-07 Thread lisoda
Hi Team. [HIVE-27944] When HIVE-LLAP reads the ICEBERG table, a deadlock may occur. - ASF JIRA (apache.org) I submitted this issue. If anyone can help me, I would appreciate it. Tks.

Re:Re: hive can not read iceberg-parquet table

2023-11-22 Thread lisoda
Hi. Following your suggestion, I created three issues: [HIVE-27901] Hive's performance for querying the Iceberg table is very poor. - ASF JIRA (apache.org) [HIVE-27900] hive can not read iceberg-parquet table - ASF JIRA (apache.org) [HIVE-27898] HIVE4 can't use ICEBERG table in subqueries - ASF JI

Re:Re: hive can not read iceberg-parquet table

2023-11-21 Thread lisoda
Sorry, I don't have a JIRA account at the moment. The administrator rejected my earlier application because he thought such issues could be discussed by email. I'll try to apply for an account again.

Re:Re: hive can not read iceberg-parquet table

2023-11-21 Thread lisoda
1. TEZ_VERSION: 0.10.3 SNAPSHOT.
2. The Iceberg table is a COW (copy-on-write) table. Inserting even a small amount of data produces the same error.
3. Using ORC-format Iceberg tables works.
4. Disabling vectorization and using Parquet works.
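The workaround in point 4 can be applied per session. A minimal sketch, assuming the standard Hive property for vectorized execution is the switch that was used (the post does not name it):

```sql
-- Turn off vectorized execution for the current session only.
SET hive.vectorized.execution.enabled=false;
-- Then re-run the failing query against the Iceberg-Parquet table.
```

This trades read performance for correctness, so it is best treated as a temporary measure while the underlying reader issue is investigated.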

hive can not read iceberg-parquet table

2023-11-21 Thread lisoda
Hi team. I am currently testing HIVE-4.0.0-BETA. For better read performance, we use the Iceberg-Parquet table. However, we have found that HIVE is currently unable to handle iceberg-parquet tables correctly. Example: CREATE EXTERNAL TABLE iceberg_dwd.b_qqd_shop_rfm_parquet_snappy STORED BY '

Re: [EXTERNAL] Re: Slow Hive query with a lot of 'get_materialized_views_for_rewriting'

2023-11-16 Thread lisoda
May I ask when hive4 can be released? Replied Message: From: Butao Zhang | Date: 11/17/2023 12:24 | To: user@hive.apache.org | Subject: Re: [EXTERNAL] Re: Slow Hive query with a lot of 'get_materialized_views_for_rewriting'. Thanks for the info. I checked Hive3.1.

Re:Re: Re: Hive's performance for querying the Iceberg table is very poor.

2023-11-08 Thread lisoda
3-11-09 15:36:50, "Butao Zhang" wrote: Could you please provide detailed steps to reproduce this issue? e.g. how do you create the table? Thanks, Butao Zhang Replied Message: From: lisoda | Date: 11/9/2023 14:25 | To: | Subject: Re:Re: Re: Hive's performance for

Re:Re: Re: Hive's performance for querying the Iceberg table is very poor.

2023-11-08 Thread lisoda
Incidentally, I'm using a COW table, so there is no DELETE_FILE. On 2023-11-09 10:57:35, "Butao Zhang" wrote: Hi lisoda. You can check this ticket https://issues.apache.org/jira/browse/HIVE-27347 which can use iceberg basic stats to optimize count(*) query. Note: it didn'

Re:Re: Re: Hive's performance for querying the Iceberg table is very poor.

2023-11-08 Thread lisoda
es would be supported even today on Hive master, but 4.0.0 would have them running for sure. -Ayush On Tue, 24 Oct 2023 at 14:51, lisoda wrote: Thanks. I would like to know if hive currently supports push to ICEBERG table partition under JOIN condition. Because I see HIVE-27734 is not yet

Re: Announce: Hive-MR3 with Celeborn,

2023-10-24 Thread lisoda
Thanks. I will try. Replied Message: From: Sungwoo Park | Date: 10/24/2023 20:08 | To: user@hive.apache.org | Subject: Announce: Hive-MR3 with Celeborn, | Hi Hive users, Before the impending release of MR3 1.8, we would like to announce the release of Hive-MR3 wi

Re:Re: Hive's performance for querying the Iceberg table is very poor.

2023-10-24 Thread lisoda
tables? On 2023-10-24 11:03:07, "Ayush Saxena" wrote: Hi Lisoda, The iceberg jar for hive 3.1.3 doesn't have a lot of changes, We did a bunch of improvements on the 4.x line for Hive-Iceberg. You can give iceberg a try on the 4.0.0-beta-1 release mentioned here [1], we

Re: Hive's performance for querying the Iceberg table is very poor.

2023-10-23 Thread lisoda
cross Apache Iceberg, Apache Hudi and Apache Hive. Here is a video of connecting the 2 products through a webinar StarRocks did with Tabular (authors of Apache Iceberg). https://www.youtube.com/watch?v=bAmcTrX7hCI&t=10s On Mon, Oct 23, 2023 at 7:18 AM lisoda wrote: Hi Team. I rec

Hive's performance for querying the Iceberg table is very poor.

2023-10-23 Thread lisoda
Hi Team. I was recently testing Hive queries against Iceberg tables and found the performance to be very poor, almost unusable in a production environment. In addition, join conditions cannot be pushed down to Iceberg partitions. I'm using the 1.3.1 Hive Runt