Issue with Flink Job when Reading Data from Kafka and Executing SQL Query (q77 TPC-DS)

2023-12-19 Thread Вова Фролов
Hello Flink Community, I am writing to you about an issue I have encountered while using Apache Flink version 1.17.1. In my Flink job, I am using Kafka version 3.6.0 to ingest data from TPC-DS (current tpcds100, target size tpcds1), and then I am executing SQL queries, specifically, the q77 query

Flink caching mechanism

2024-01-11 Thread Вова Фролов
Hi Everyone, I'm currently trying to understand the caching mechanism in Apache Flink in general. As part of this exploration, I have a few questions about how Flink handles data caching, both in the context of SQL queries and more broadly. When I send a SQL query, for example to PostgreS

Long execution of SQL query to Kafka + Hive (q77 TPC-DS)

2024-01-23 Thread Вова Фролов
Hello, I am executing a heterogeneous SQL query (part of the data is in Hive and part in Kafka; the query uses the TPC-DS benchmark 100 GB data) in batch mode. However, the execution time is excessively long, taking approximately 11 minutes to complete, although the request to Hive only (without

Re: Long execution of SQL query to Kafka + Hive (q77 TPC-DS)

2024-01-26 Thread Вова Фролов
eturns, (profit - coalesce(profit_loss, 0)) as profit from ws left join wr on ws.wp_web_page_sk = wr.wp_web_page_sk ) x group by rollup (channel, id) order by channel, id LIMIT 100; Kind regards, Vladimir. Thu, 25 Jan 2024 at 14:43, Ron liu : > Hi, > >

The fault tolerance and recovery mechanism in batch mode within Apache Flink.

2024-02-16 Thread Вова Фролов
Hi everyone, I am currently exploring the fault tolerance and recovery mechanism in batch mode within Apache Flink. If I terminate the task manager process while the job is running, the job restarts from the point of failure. However, at some point, the job restarts from the very beginning. The