Hey Aljoscha
A couple of thoughts for the two remaining TODOs in the doc:
# Processing Time Support in BATCH/BOUNDED execution mode
I think there are two somewhat orthogonal problems around this topic:
1. Firing processing timers at the end of the job
2. Having processing timers in the B
Hi everyone,
I updated the FLIP again and hope that I could address the mentioned
concerns.
@Leonard: Thanks for the explanation. I wasn't aware that ts_ms and
source.ts_ms have different semantics. I updated the FLIP and expose the
most commonly used properties separately. So frequently use
Huang Xingbo created FLINK-19163:
Summary: Add building py38 wheel package of PyFlink in Azure CI
Key: FLINK-19163
URL: https://issues.apache.org/jira/browse/FLINK-19163
Project: Flink
Issue
Hi Timo,
The updated CAST SYSTEM_METADATA behavior sounds good to me. I just noticed
that a BIGINT can't be converted to "TIMESTAMP(3) WITH LOCAL TIME ZONE".
So maybe we need to support this, or use "TIMESTAMP(3) WITH LOCAL TIME
ZONE" as the defined type of Kafka timestamp? I think this makes sens
+1
We just need to make sure to find a good name before the release but
shouldn't block any work on this.
Aljoscha
On 08.09.20 07:59, Xintong Song wrote:
Thanks for the vote, @Jincheng.
Concerning the namings, the original idea was, as you suggested, to have
separate configuration names fo
Hey Kurt,
Thank you for comments!
Ad. 1 I might have missed something here, but as far as I see it is that
using the current execution stack with regular state backends (RocksDB
in particular if we want to have spilling capabilities) is equivalent to
hash-based execution. I can see a different sp
Hi Shengkai,
first of I would not consider the section Problems of
SupportsWatermarkPushDown" as a "problem". It was planned to update the
WatermarkProvider once the interfaces are ready. See the comment in
WatermarkProvider:
// marker interface that will be filled after FLIP-126:
// Waterma
Hi Jark,
according to Flink's and Calcite's casting definition in [1][2]
TIMESTAMP WITH LOCAL TIME ZONE should be castable from BIGINT. If not,
we will make it possible ;-)
I'm aware of DeserializationSchema.getProducedType but I think that this
method is actually misplaced. The type should
Hi Becket,
Thank you for picking up this FLIP. I have a few questions:
* two thoughts on naming:
* idleTime: In the meantime, a similar metric "idleTimeMsPerSecond" has
been introduced in https://issues.apache.org/jira/browse/FLINK-16864. They
have a similar name, but different definitions of
Hello dev,
A user hits the following issue when using Flink 1.11.1 to connect to Hive
1.1.0:
java.lang.NoSuchMethodError:
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.(Lorg/apache/hadoop/conf/Configuration;)V
> at
> org.apache.flink.table.catalog.hive.client.HiveShimV100.getHiveMetastor
Hi Rui,
The maven artifacts are built using the script:
releasing/deploy_staging_jars.sh [1].
Regards,
Dian
[1] https://cwiki.apache.org/confluence/display/FLINK/Creating+a+Flink+Release
> 在 2020年9月8日,下午7:19,Rui Li 写道:
>
> maven artifacts
Hi Timo,
Regarding "pushing other computed columns into source, e.g. encrypted
records/columns, performing checksum checks, reading metadata etc.",
I'm not sure about this.
1. the planner don't know which computed column should be pushed into source
2. it seems that we can't improve performances i
Thanks Dian. The script looks all right to me. I'll double check with the
user whether the issue is related to his building environment.
On Tue, Sep 8, 2020 at 7:36 PM Dian Fu wrote:
> Hi Rui,
>
> The maven artifacts are built using the
> script: releasing/deploy_staging_jars.sh [1].
>
> Regards
Hi, Timo ~
"It is not about changelog mode compatibility, it is about the type
compatibility.”
For fromDataStream(dataStream, Schema), there should not be compatibility
problem or data type inconsistency. We know the logical type from Schema and
physical type from the dataStream itself.
For to
Hi Timo and Jark.Thanks for your replies.
Maybe I don't explain clearly in doc. I think the main reason behind is we
have no means to distinguish the calc in LogicalProject. Let me give you an
example to illustrate the reason. Assume we have 2 cases:
case 1:
create table MyTable (
int a,
int
+1
Best Regards,
Yu
On Tue, 8 Sep 2020 at 17:03, Aljoscha Krettek wrote:
> +1
>
> We just need to make sure to find a good name before the release but
> shouldn't block any work on this.
>
> Aljoscha
>
> On 08.09.20 07:59, Xintong Song wrote:
> > Thanks for the vote, @Jincheng.
> >
> >
> > Con
Hi Danny,
Your proposed signatures sound good to me.
fromDataStream(dataStream, Schema)
toDataStream(table, AbstractDataType)
They address all my concerns. The API would not be symmetric anymore,
but this is fine with me. Others raised concerns about deprecating
`fromDataStream(dataStream, Ex
Regarding #1, yes the state backend is definitely hash-based execution.
However there are some differences between
batch hash-based execution. The key difference is *random access &
read/write mixed workload". For example, by using
state backend in streaming execution, one have to mix the read and
Hi Jark, Hi Shengkai,
"shall we push the expressions in the following Projection too?"
This is something that we should at least consider.
I also don't find a strong use case. But what I see is that we are
merging concepts that actually can be separated. And we are executing
the same code twi
On Tue, Sep 8, 2020 at 6:55 PM Konstantin Knauf wrote:
> Hi Becket,
>
> Thank you for picking up this FLIP. I have a few questions:
>
> * two thoughts on naming:
>* idleTime: In the meantime, a similar metric "idleTimeMsPerSecond" has
> been introduced in https://issues.apache.org/jira/browse
Hi, Timo.
I agree with you that the concepts Watermark and ComputedColumn should be
separated. However, we are merging the interface SupportsCalcPushDown and
SupportsWatermarkPushDown actually. The concept computed column has
disappeared in optimization.
As for the drawback you mentiond, I have alr
Hey Konstantin,
Thanks for the feedback and suggestions. Please see the reply below.
* idleTime: In the meantime, a similar metric "idleTimeMsPerSecond" has
> been introduced in https://issues.apache.org/jira/browse/FLINK-16864. They
> have a similar name, but different definitions of idleness,
>
Serhat Soydan created FLINK-19164:
-
Summary: Release scripts break other dependency versions
unintentionally
Key: FLINK-19164
URL: https://issues.apache.org/jira/browse/FLINK-19164
Project: Flink
HI, Timo
Thanks for driving this FLIP.
Sorry but I have a concern about Writing metadata via DynamicTableSink section:
CREATE TABLE kafka_table (
id BIGINT,
name STRING,
timestamp AS CAST(SYSTEM_METADATA("timestamp") AS BIGINT) PERSISTED,
headers AS CAST(SYSTEM_METADATA("headers") AS MAP
Verified the issue was related to the building environment. The published
jar is good. Thanks Dian for the help.
On Tue, Sep 8, 2020 at 7:49 PM Rui Li wrote:
> Thanks Dian. The script looks all right to me. I'll double check with the
> user whether the issue is related to his building environmen
Ad. 1
Yes, you are right in principle.
Let me though clarify my proposal a bit. The proposed sort-style
execution aims at a generic KeyedProcessFunction were all the
"aggregations" are actually performed in the user code. It tries to
improve the performance by actually removing the need to use Ro
Hi, Thanks for posting this interesting question, and welcome to StateFun!
:-)
The first thing that I would like to mention is that, if your original
motivation for that scenario is a concern of a transient failures such as:
- did function Y ever received a message sent by function X ?
- did sendi
Hi Leonard,
the only alternative I see is that we introduce a concept that is
completely different to computed columns. This is also mentioned in the
rejected alternative section of the FLIP. Something like:
CREATE TABLE kafka_table (
id BIGINT,
name STRING,
timestamp INT SYSTEM_M
Hi Devs,
I'd like to propose an update to how state backends and checkpoint storage
are configured to help users better understand Flink.
Apache Flink's durability story is a mystery to many users. One of the most
common recurring questions from users comes from not understanding the
relationship
Dawid Wysakowicz created FLINK-19165:
Summary: Clean up the UnilateralSortMerger
Key: FLINK-19165
URL: https://issues.apache.org/jira/browse/FLINK-19165
Project: Flink
Issue Type: Bug
Hi Stephan,
Thanks for the input. Just a few more clarifications / questions.
*Num Bytes / Records Metrics*
1. At this point, the *numRecordsIn(Rate)* metrics exist in both
OperatorIOMetricGroup and TaskIOMetricGroup. I did not find
*numRecordsIn(Rate)* in the TaskIOMetricGroup updated anywhere
Jingsong Lee created FLINK-19166:
Summary: StreamingFileWriter should register Listener before the
initialization of buckets
Key: FLINK-19166
URL: https://issues.apache.org/jira/browse/FLINK-19166
Pro
Hi Zhu Zhu,
Add a new blocker: https://issues.apache.org/jira/browse/FLINK-19166
Will fix it soon.
Best,
Jingsong
On Tue, Sep 8, 2020 at 12:26 AM Zhu Zhu wrote:
> Hi All,
>
> Since there are still two 1.11.2 blockers items in progress, RC1 creation
> will be postponed to tomorrow.
>
> Thanks,
Hi all,
Thanks a lot for the discussion and votes. I will summarize the result in a
separate email.
Best,
Xingbo
Zhu Zhu 于2020年9月8日周二 上午11:16写道:
> +1
>
> Thanks,
> Zhu
>
> Hequn Cheng 于2020年9月8日周二 上午8:57写道:
>
> > +1 (binding)
> >
> >
> > On Tue, Sep 8, 2020 at 7:43 AM jincheng sun
> > wrote:
Hi all,
The voting time for FLIP-137[1] has passed. I'm closing the vote now.
There 6 + 1 votes, 4 of which are binding:
- Dian (binding)
- Jincheng (binding)
- Hequn (binding)
- Zhu Zhu (binding)
- Wei (non-binding)
- Shuiqiang (non-binding)
There are no disapproving votes.
Thus, FLIP-137 has
Hi Zhu Zhu,
Replenish its[1] influence:
For HiveStreamingSink & FileSystemSink in Table/SQL, partition commit make
partition visible for downstream Hive/Spark engines.
But due to FLINK-19166, will lose some partitions to commit after Job
failover in some cases, especially for short partitions.
In
Thanks for reporting this issue and offering to fix it @Jingsong Li
Agreed it is a reasonable blocker. I will postpone 1.11.2 RC1 creation
until it is fixed.
Thanks,
Zhu
Jingsong Li 于2020年9月9日周三 上午11:27写道:
> Hi Zhu Zhu,
>
> Replenish its[1] influence:
> For HiveStreamingSink & FileSystemSink i
I doubt that any sorting algorithm would work with only knowing the keys
are different but without
information of which is greater.
Best,
Kurt
On Tue, Sep 8, 2020 at 10:59 PM Dawid Wysakowicz
wrote:
> Ad. 1
>
> Yes, you are right in principle.
>
> Let me though clarify my proposal a bit. The
Thanks for the summary Timo ~
I want to clarify a little bit, what is the conclusion about the
fromChangelogStream and toChangelogStream, should we use this name or we use
fromDataStream with an optional ChangelogMode flag ?
Best,
Danny Chan
在 2020年9月8日 +0800 PM8:22,Timo Walther ,写道:
> Hi Danny
Hi,
I'm +1 to use the naming of from/toDataStream, rather than
from/toInsertStream. So we don't need to deprecate the existing
`fromDataStream`.
I'm +1 to Danny's proposal: fromDataStream(dataStream, Schema) and
toDataStream(table, AbstractDataType)
I think we can also keep the method `createTem
darion yaphet created FLINK-19168:
-
Summary: Upgrade Kafka client version
Key: FLINK-19168
URL: https://issues.apache.org/jira/browse/FLINK-19168
Project: Flink
Issue Type: Improvement
tinny cat created FLINK-19167:
-
Summary: Proccess Function Example could not work
Key: FLINK-19167
URL: https://issues.apache.org/jira/browse/FLINK-19167
Project: Flink
Issue Type: Bug
I also share the concern that reusing the computed column syntax but have
different semantics
would confuse users a lot.
Besides, I think metadata fields are conceptually not the same with
computed columns. The metadata
field is a connector specific thing and it only contains the information
that
The conclusion is that we will drop `fromChangelogStream(ChangelogMode,
DataStream)` but will keep `fromChangelogStream(DataStream)`.
The latter is necessary to have a per-record changeflag.
We could think about merging `fromChangelogStream`/`fromDataStream` and
`toChangelogStream`/`toDataStre
44 matches
Mail list logo