Re: [DISCUSS] FLIP-191: Extend unified Sink interface to support small file compaction

2021-11-03 Thread Guowei Ma
Hi, Many thanks to Fabian for drafting this FLIP! It looks very good to me. I see that most of us currently agree with option 2, but I personally feel that option 3 may be better :-) I have some small concerns about option 2: 1. Developers understand that the cost is relatively high. The merging of small fi

[jira] [Created] (FLINK-24759) Metrics on web UI is not correct in a filesystem source to filesystem sink streaming job

2021-11-03 Thread Caizhi Weng (Jira)
Caizhi Weng created FLINK-24759: --- Summary: Metrics on web UI is not correct in a filesystem source to filesystem sink streaming job Key: FLINK-24759 URL: https://issues.apache.org/jira/browse/FLINK-24759

[jira] [Created] (FLINK-24758) partition.time-extractor.kind support "yyyyMMdd"

2021-11-03 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-24758: Summary: partition.time-extractor.kind support "yyyyMMdd" Key: FLINK-24758 URL: https://issues.apache.org/jira/browse/FLINK-24758 Project: Flink Issue Type:

[jira] [Created] (FLINK-24757) Yarn application is not terminated after the job finishes when submitting a yarn-per-job insert job with SQL client

2021-11-03 Thread Caizhi Weng (Jira)
Caizhi Weng created FLINK-24757: --- Summary: Yarn application is not terminated after the job finishes when submitting a yarn-per-job insert job with SQL client Key: FLINK-24757 URL: https://issues.apache.org/jira/bro

Re: [DISCUSS] FLIP-191: Extend unified Sink interface to support small file compaction

2021-11-03 Thread Jingsong Li
Hi Fabian, Thanks for drafting the FLIP! ## A few thoughts on user requirements 1. Compact files from multiple checkpoints: this is what users need very much. 2. The compaction blocks the checkpointing - Some scenarios are required. For example, the user expects the output data to be consistent wit

[VOTE] FLIP-189: SQL Client Usability Improvements

2021-11-03 Thread Sergey Nuyanzin
Hi everyone, I would like to start a vote on FLIP-189: SQL Client Usability Improvements [1]. The FLIP was discussed in this thread [2]. FLIP-189 targets usability improvements of SQL Client such as parsing improvement, syntax highlighting, completion, and prompts. The vote will be open for at least 7

[jira] [Created] (FLINK-24756) Flink metric identifiers contain group variables.

2021-11-03 Thread Frederic Hemery (Jira)
Frederic Hemery created FLINK-24756: --- Summary: Flink metric identifiers contain group variables. Key: FLINK-24756 URL: https://issues.apache.org/jira/browse/FLINK-24756 Project: Flink Issue

[jira] [Created] (FLINK-24755) sun.misc doesn't exist

2021-11-03 Thread Aitozi (Jira)
Aitozi created FLINK-24755: -- Summary: sun.misc doesn't exist Key: FLINK-24755 URL: https://issues.apache.org/jira/browse/FLINK-24755 Project: Flink Issue Type: Bug Components: Build System

Re: [DISCUSS] FLIP-191: Extend unified Sink interface to support small file compaction

2021-11-03 Thread Fabian Paul
Hi David and Till, Thanks for your great feedback. One definitely confusing point in the FLIP is who is doing the actual compaction. The compaction will not be done by the CommittableAggregator operator but the committers so it should also not affect the checkpointing duration or have a signif

[jira] [Created] (FLINK-24754) PushDownSepcs are better to be print by a clear order.

2021-11-03 Thread xuyang (Jira)
xuyang created FLINK-24754: -- Summary: PushDownSepcs are better to be print by a clear order. Key: FLINK-24754 URL: https://issues.apache.org/jira/browse/FLINK-24754 Project: Flink Issue Type: Improv

[jira] [Created] (FLINK-24753) Enforce CHAR/VARCHAR precision when outputing to a Sink

2021-11-03 Thread Marios Trivyzas (Jira)
Marios Trivyzas created FLINK-24753: --- Summary: Enforce CHAR/VARCHAR precision when outputing to a Sink Key: FLINK-24753 URL: https://issues.apache.org/jira/browse/FLINK-24753 Project: Flink

Re: [DISCUSS] FLIP-189: SQL Client Usability Improvements

2021-11-03 Thread Sergey Nuyanzin
Hi Timo, I completely agree it would be great if we can propagate Calcite parser config in the way you have described. As you mentioned we could discuss this when it comes to the implementation. Meanwhile it looks like I can start voting (please correct me if I'm wrong). I will start it a bit lat

[jira] [Created] (FLINK-24752) Cleanup ScalarOperatorGens#generateCast

2021-11-03 Thread Francesco Guardiani (Jira)
Francesco Guardiani created FLINK-24752: --- Summary: Cleanup ScalarOperatorGens#generateCast Key: FLINK-24752 URL: https://issues.apache.org/jira/browse/FLINK-24752 Project: Flink Issue T

Re: [DISCUSS] FLIP-191: Extend unified Sink interface to support small file compaction

2021-11-03 Thread Till Rohrmann
Ideally, the compaction won't affect the checkpointing time. If the compaction takes longer to complete, then the result could be published with the next checkpoint. Of course, this would increase the end-to-end latency. I hope that with option 2, we can support both use cases: single task compact

[jira] [Created] (FLINK-24751) flink SQL comile failed cause by: java.lang.StackOverflowError

2021-11-03 Thread tao.yang03 (Jira)
tao.yang03 created FLINK-24751: -- Summary: flink SQL comile failed cause by: java.lang.StackOverflowError Key: FLINK-24751 URL: https://issues.apache.org/jira/browse/FLINK-24751 Project: Flink

[jira] [Created] (FLINK-24750) Single quotes converted to left/right single quotes during docs generation

2021-11-03 Thread Sergey Nuyanzin (Jira)
Sergey Nuyanzin created FLINK-24750: --- Summary: Single quotes converted to left/right single quotes during docs generation Key: FLINK-24750 URL: https://issues.apache.org/jira/browse/FLINK-24750 Proj

Re: [DISCUSS] FLIP-189: SQL Client Usability Improvements

2021-11-03 Thread Timo Walther
Hi Sergey, thanks for your explanation. Regarding keywords and other info: We should receive the information from the Flink SQL parser directly. We have added a couple of new keywords such as WATERMARK or MATCH_RECOGNIZE clauses. SQL92 would not help a user understand why a column name needs

Re: [DISCUSS] FLIP-191: Extend unified Sink interface to support small file compaction

2021-11-03 Thread David Morávek
Hi Fabian, thanks for drafting the FLIP! This is a really nice and useful topic to target ;) Few thoughts on the option 2) The file compaction is by definition quite costly IO bound operation. If I understand the proposal correctly, the aggregation itself would run during operator (aggregator) ch

[jira] [Created] (FLINK-24749) Reuse CheckpointStatsTracker across rescaling

2021-11-03 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-24749: Summary: Reuse CheckpointStatsTracker across rescaling Key: FLINK-24749 URL: https://issues.apache.org/jira/browse/FLINK-24749 Project: Flink Issue T

[jira] [Created] (FLINK-24748) Remove CheckpointStatsTracker#getJobCheckpointingConfiguration

2021-11-03 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-24748: Summary: Remove CheckpointStatsTracker#getJobCheckpointingConfiguration Key: FLINK-24748 URL: https://issues.apache.org/jira/browse/FLINK-24748 Project: Flink

Re: [DISCUSS] FLIP-191: Extend unified Sink interface to support small file compaction

2021-11-03 Thread Till Rohrmann
Thanks for creating this FLIP, Fabian. From your description I would be in favour of option 2 for the following reasons: Assuming that option 2 solves all our current problems, it seems like the least invasive change and smallest in scope. Your main concern is that it might not cover future use ca

Re: [DISCUSS] FLIP-189: SQL Client Usability Improvements

2021-11-03 Thread Sergey Nuyanzin
Hi 李宇彬, I think you are right. Thank you very much for the idea. I came across MySQL[1] and PostgreSQL[2] prompts and also found several interesting features like control symbols to change style, showing current property value and different datetime formats. I have added your proposals and my fin

[jira] [Created] (FLINK-24747) Add producedDataType to SupportsProjectionPushDown.applyProjection

2021-11-03 Thread Francesco Guardiani (Jira)
Francesco Guardiani created FLINK-24747: --- Summary: Add producedDataType to SupportsProjectionPushDown.applyProjection Key: FLINK-24747 URL: https://issues.apache.org/jira/browse/FLINK-24747 Proj

Re: [DISCUSS] FLIP-187: Adaptive Batch Job Scheduler

2021-11-03 Thread Till Rohrmann
I have to admit that I cannot think of a better name for the adaptive batch scheduler atm. Maybe it is good enough to call the two schedulers AdaptiveBatchScheduler and AdaptiveStreamingScheduler to tell which scheduler is used for which execution mode. It is true, though, that the former is adapti

[jira] [Created] (FLINK-24746) Alibaba maven mirror is unstable

2021-11-03 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-24746: Summary: Alibaba maven mirror is unstable Key: FLINK-24746 URL: https://issues.apache.org/jira/browse/FLINK-24746 Project: Flink Issue Type: Technica

[jira] [Created] (FLINK-24745) Add support for Oracle OGG json parser in flink-json module

2021-11-03 Thread Steven (Jira)
Steven created FLINK-24745: -- Summary: Add support for Oracle OGG json parser in flink-json module Key: FLINK-24745 URL: https://issues.apache.org/jira/browse/FLINK-24745 Project: Flink Issue Type:

[jira] [Created] (FLINK-24744) FlinkKafkaProducerITCase.testMigrateFromAtLeastOnceToExactlyOnce fails on Azure because org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does

2021-11-03 Thread Till Rohrmann (Jira)
Till Rohrmann created FLINK-24744: - Summary: FlinkKafkaProducerITCase.testMigrateFromAtLeastOnceToExactlyOnce fails on Azure because org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition

[jira] [Created] (FLINK-24743) New File Sink end-to-end test fails on Azure

2021-11-03 Thread Till Rohrmann (Jira)
Till Rohrmann created FLINK-24743: - Summary: New File Sink end-to-end test fails on Azure Key: FLINK-24743 URL: https://issues.apache.org/jira/browse/FLINK-24743 Project: Flink Issue Type: Bu

Re: [DISCUSS] FLIP-187: Adaptive Batch Job Scheduler

2021-11-03 Thread Lijie Wang
Hi David, Thanks for your comments. I personally think that "Adaptive" means: Flink automatically determines the appropriate scheduling and execution plan based on some information. The information can include both resource information and workload information, rather than being limited to a ce