Re: RocksDB state on HDFS seems not being cleanned up

2019-11-05 Thread Yun Tang
Hi Shuwen, Since you have just 10 “chk-” folders as expected, and the “chk-” folder is removed only after we have successfully removed the shared state when subsuming checkpoints [1], I think you probably do not have many orphaned state files left. To verify this, you could use state proces

Re: [DISCUSS] FLIP-84: Improve & Refactor API of Table Module

2019-11-05 Thread Jark Wu
Hi Terry, I would suggest changing the title a bit. For example, "Improve & Refactor TableEnvironment APIs". Or more specifically, "Improve & Refactor TableEnvironment execute/sqlQuery/sqlUpdate.. APIs". Currently, the title is a little broad (there are so many APIs in the table module). Make the tit

Re: Flunk savepoin(checkpoint) load api or debug

2019-11-05 Thread Jark Wu
Btw, user questions should be asked in user@f.a.o or user-zh@f.a.o. The dev ML is mainly used to discuss development. Best, Jark On Wed, 6 Nov 2019 at 15:36, Jark Wu wrote: > Hi, > > Savepoint.load(env, path) is in state processor API library, you should > add the following dependency in your p

Re: Flunk savepoin(checkpoint) load api or debug

2019-11-05 Thread Jark Wu
Hi, Savepoint.load(env, path) is in the State Processor API library; you should add the following dependency to your project: <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-state-processor-api_2.11</artifactId> <version>1.9.1</version> </dependency> You can see the documentation for more detailed instructions [1]. Best, Jark [1]: https://ci.apache.org/proj
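
For readers following this thread, here is a minimal sketch of loading a savepoint with the State Processor API. It assumes Flink 1.9.x with the `flink-state-processor-api` artifact mentioned in the message on the classpath. In that version `Savepoint.load` also takes a state backend as a third argument; the path and the choice of `MemoryStateBackend` below are placeholders, not values from the thread.

```scala
import org.apache.flink.api.java.ExecutionEnvironment
import org.apache.flink.runtime.state.memory.MemoryStateBackend
import org.apache.flink.state.api.{ExistingSavepoint, Savepoint}

object LoadSavepointSketch {
  def main(args: Array[String]): Unit = {
    // The State Processor API is built on the batch (DataSet) environment.
    val env = ExecutionEnvironment.getExecutionEnvironment

    // Hypothetical location; point this at a real savepoint or checkpoint directory.
    val savepointPath = "hdfs:///path/to/savepoint"

    // Loads the savepoint metadata. The state backend passed here is an assumption;
    // use the backend that matches how the state should be read.
    val savepoint: ExistingSavepoint =
      Savepoint.load(env, savepointPath, new MemoryStateBackend())

    // From here, state can be read through the reader methods on ExistingSavepoint,
    // using the uid of the operator whose state is being inspected.
  }
}
```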

[jira] [Created] (FLINK-14626) User jar packaged with hadoop dependencies may cause class conflit with hadoop jars on yarn

2019-11-05 Thread Victor Wong (Jira)
Victor Wong created FLINK-14626: --- Summary: User jar packaged with hadoop dependencies may cause class conflit with hadoop jars on yarn Key: FLINK-14626 URL: https://issues.apache.org/jira/browse/FLINK-14626

[jira] [Created] (FLINK-14625) Eliminate cross join in multi join to reduce cost

2019-11-05 Thread Leonard Xu (Jira)
Leonard Xu created FLINK-14625: -- Summary: Eliminate cross join in multi join to reduce cost Key: FLINK-14625 URL: https://issues.apache.org/jira/browse/FLINK-14625 Project: Flink Issue Type: Im

[jira] [Created] (FLINK-14624) Support computed column as rowtime attribute

2019-11-05 Thread Jark Wu (Jira)
Jark Wu created FLINK-14624: --- Summary: Support computed column as rowtime attribute Key: FLINK-14624 URL: https://issues.apache.org/jira/browse/FLINK-14624 Project: Flink Issue Type: Sub-task

[jira] [Created] (FLINK-14623) Add computed column information into TableSchema

2019-11-05 Thread Danny Chen (Jira)
Danny Chen created FLINK-14623: -- Summary: Add computed column information into TableSchema Key: FLINK-14623 URL: https://issues.apache.org/jira/browse/FLINK-14623 Project: Flink Issue Type: Sub-

[jira] [Created] (FLINK-14622) Cooperate WatermarkSpec in TableSourceScan with minibatch configuration

2019-11-05 Thread Jark Wu (Jira)
Jark Wu created FLINK-14622: --- Summary: Cooperate WatermarkSpec in TableSourceScan with minibatch configuration Key: FLINK-14622 URL: https://issues.apache.org/jira/browse/FLINK-14622 Project: Flink

[jira] [Created] (FLINK-14621) Do not generate watermark assigner operator if no time attribute operations on rowtime

2019-11-05 Thread Jark Wu (Jira)
Jark Wu created FLINK-14621: --- Summary: Do not generate watermark assigner operator if no time attribute operations on rowtime Key: FLINK-14621 URL: https://issues.apache.org/jira/browse/FLINK-14621 Project:

Re: RocksDB state on HDFS seems not being cleanned up

2019-11-05 Thread shuwen zhou
Hi Yun and Till, Thank you for your response. For @Yun 1. No, I just renamed the checkpoint directory name since the directory name contains company data. Sorry for the confusion. 2. Yes, I set state.checkpoints.num-retained: 10 state.backend.rocksdb.predefined-options: FLASH_SSD_OPTIMIZED In fli

Re: [DISCUSS] FLIP-83: Flink End-to-end Performance Testing Framework

2019-11-05 Thread Yang Wang
Thanks Yu for starting this discussion. I'm in favor of adding an e2e performance testing framework. Currently the e2e tests are mainly focused on functionality and written in shell. We need a better e2e framework for performance and functionality tests. Best, Yang Biao Liu wrote on Tue, Nov 5, 2019, AM 10:1

[jira] [Created] (FLINK-14620) Rewrite the elasticsearch related end-to-end tests by using the newly introduce e2e java framework

2019-11-05 Thread Zheng Hu (Jira)
Zheng Hu created FLINK-14620: Summary: Rewrite the elasticsearch related end-to-end tests by using the newly introduce e2e java framework Key: FLINK-14620 URL: https://issues.apache.org/jira/browse/FLINK-14620

Re: [VOTE] FLIP-69: Flink SQL DDL Enhancement

2019-11-05 Thread Terry Wang
Hi Bowen: Thanks for your feedback. Your opinion convinced me, and I have just removed the section about the catalog create statement and also removed `DBPROPERTIES` and `PROPERTIES` from the alter DDLs. Open to more comments or votes :) ! Best, Terry Wang > On Nov 6, 2019, at 07:22, Bowen Li wrote: > > Hi Terry, > > I w

[jira] [Created] (FLINK-14619) Failed to fetch BLOB

2019-11-05 Thread liang yu (Jira)
liang yu created FLINK-14619: Summary: Failed to fetch BLOB Key: FLINK-14619 URL: https://issues.apache.org/jira/browse/FLINK-14619 Project: Flink Issue Type: Bug Affects Versions: 1.9.1

Flunk savepoin(checkpoint) load api or debug

2019-11-05 Thread qq
Hi all, I want to load checkpoint or savepoint metadata in a dev environment. In this case, I want to debug saved checkpoint metadata. I know Flink provides an API, Savepoint.load(env, path), but I can’t find it and can’t use it. Does anyone know about this? Could you help me? Thanks very muc

Re: [VOTE] FLIP-69: Flink SQL DDL Enhancement

2019-11-05 Thread Bowen Li
Hi Terry, I went over the FLIP in detail again. The FLIP mostly LGTM. A couple of issues: - since we don't plan to support catalog ddl, can you remove them from the FLIP? - I found there are some discrepancies in proposed database and table DDLs. For db ddl, the create db syntax proposes specifying

[jira] [Created] (FLINK-14618) Give more detailed debug information on akka framesize exception

2019-11-05 Thread Jacob Sevart (Jira)
Jacob Sevart created FLINK-14618: Summary: Give more detailed debug information on akka framesize exception Key: FLINK-14618 URL: https://issues.apache.org/jira/browse/FLINK-14618 Project: Flink

Re: [VOTE] FLIP-69: Flink SQL DDL Enhancement

2019-11-05 Thread Peter Huang
+1 for the enhancement. On Tue, Nov 5, 2019 at 11:04 AM Xuefu Z wrote: > +1 to the long missing feature in Flink SQL. > > On Tue, Nov 5, 2019 at 6:32 AM Terry Wang wrote: > > > Hi all, > > > > I would like to start the vote for FLIP-69[1] which is discussed and > > reached consensus in the disc

[jira] [Created] (FLINK-14617) Dataset Parquet ClassCastException for SpecificRecord

2019-11-05 Thread Dominik Wosiński (Jira)
Dominik Wosiński created FLINK-14617: Summary: Dataset Parquet ClassCastException for SpecificRecord Key: FLINK-14617 URL: https://issues.apache.org/jira/browse/FLINK-14617 Project: Flink

[jira] [Created] (FLINK-14616) Clarify the ordering guarantees in the "The Broadcast State Pattern"

2019-11-05 Thread Filip Niksic (Jira)
Filip Niksic created FLINK-14616: Summary: Clarify the ordering guarantees in the "The Broadcast State Pattern" Key: FLINK-14616 URL: https://issues.apache.org/jira/browse/FLINK-14616 Project: Flink

Re: [VOTE] FLIP-69: Flink SQL DDL Enhancement

2019-11-05 Thread Xuefu Z
+1 to the long missing feature in Flink SQL. On Tue, Nov 5, 2019 at 6:32 AM Terry Wang wrote: > Hi all, > > I would like to start the vote for FLIP-69[1] which is discussed and > reached consensus in the discussion thread[2]. > > The vote will be open for at least 72 hours. I'll try to close it

Re: RocksDB state on HDFS seems not being cleanned up

2019-11-05 Thread Yun Tang
@Till Rohrmann, I think just setting `cleanupInBackground()` should be enough for RocksDB to clean up in the compaction filter since Flink 1.9.0 [1] @Shuwen, I have several questions about your setup: 1. Is `flink-chk743e4568a70b626837b` the real folder for checkpoints? I don't think a job-id would

Re: [VOTE] Accept Stateful Functions into Apache Flink

2019-11-05 Thread Stephan Ewen
Thanks, all, I love it when a quorum comes together! Also, really cool to see all the votes from outside the PMC, thank you for voicing your interest. Result: - 19/25 PMC members voted, that is 76% of the PMC. - 20 non-PMC members voted. - 19 x +1 (binding) - 20 x +1 (non-binding) - 0 x

Re: RocksDB state on HDFS seems not being cleanned up

2019-11-05 Thread Till Rohrmann
Hi Shuwen, I think the problem is that you configured state TTL to clean up on full snapshots, which aren't executed when using RocksDB with incremental snapshots. Instead you need to activate `cleanupInRocksdbCompactFilter`: val ttlConfig = StateTtlConfig .newBuilder(Time.minutes(30)) .updateT
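
The snippet in this message is cut off in the listing. As a rough sketch of the kind of configuration being described, assuming Flink 1.9.x (where `cleanupInRocksdbCompactFilter()` takes no arguments), a TTL config that relies on the RocksDB compaction filter instead of full-snapshot cleanup might look like the following. The state name and key/value types are hypothetical and not taken from the thread.

```scala
import org.apache.flink.api.common.state.{MapStateDescriptor, StateTtlConfig}
import org.apache.flink.api.common.time.Time

object TtlConfigSketch {
  // TTL of 30 minutes, refreshed on create and write, with expired entries
  // dropped by the RocksDB compaction filter rather than on full snapshots.
  val ttlConfig: StateTtlConfig = StateTtlConfig
    .newBuilder(Time.minutes(30))
    .updateTtlOnCreateAndWrite()
    .cleanupInRocksdbCompactFilter()
    .build()

  // Hypothetical MapState descriptor; name and types are placeholders.
  val descriptor = new MapStateDescriptor[String, java.lang.Long](
    "example-state", classOf[String], classOf[java.lang.Long])

  // Attach the TTL configuration to the state descriptor.
  descriptor.enableTimeToLive(ttlConfig)
}
```

In Flink 1.9 the compaction filter is, to the best of my knowledge, disabled by default and also has to be switched on for the RocksDB backend, for example with `state.backend.rocksdb.ttl.compaction.filter.enabled: true` in the Flink configuration.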

Re: RocksDB state on HDFS seems not being cleanned up

2019-11-05 Thread shuwen zhou
Hi Jiayi, I understand that the shared folder stores state for multiple checkpoints. I think the shared folder should only retain data for the number of checkpoints given by “state.checkpoints.num-retained” and remove outdated checkpoints, shouldn't it? In my case I suspect that outdated checkpoint's states w

[VOTE] FLIP-69: Flink SQL DDL Enhancement

2019-11-05 Thread Terry Wang
Hi all, I would like to start the vote for FLIP-69[1] which is discussed and reached consensus in the discussion thread[2]. The vote will be open for at least 72 hours. I'll try to close it by 2019-11-08 14:30 UTC, unless there is an objection or not enough votes. [1] https://cwiki.apache.org

Re: [DISCUSS] FLIP 69 - Flink SQL DDL Enhancement

2019-11-05 Thread Terry Wang
Hi Bowen~ We don’t intend to support create/drop catalog syntax in this FLIP; we may support it if there is indeed strong demand. I’m going to kick off a vote for this FLIP, feel free to review again. Best, Terry Wang > On Sep 26, 2019, at 00:44, Xuefu Z wrote: > > Actually catalogs are more of

Re: [DISCUSS] Flink Avro Cloudera Registry (FLINK-14577)

2019-11-05 Thread Gyula Fóra
Thanks Matyas for starting the discussion! I think this would be a very valuable addition to Flink as many companies are already using the Hortonworks/Cloudera registry and it would enable them to connect to Flink easily. @Dawid: Regarding the implementation, this is a much more lightweight connector

Re: RocksDB state on HDFS seems not being cleanned up

2019-11-05 Thread bupt_ljy
Hi Shuwen, “Shared” means that the state files are shared among multiple checkpoints, which happens when you enable incremental checkpointing [1]. Therefore, it’s reasonable that the size keeps growing if you set “state.checkpoints.num-retained” to a big value. [1] https://flink.apache.

Re: [DISCUSS] Flink Avro Cloudera Registry (FLINK-14577)

2019-11-05 Thread Dawid Wysakowicz
Hi Matyas, I think this would be a valuable addition. You may reuse some of the already available abstractions for writing an Avro deserialization schema based on a schema registry (have a look at RegistryDeserializationSchema and SchemaCoderProvider). There is also an open PR for adding a similar

[jira] [Created] (FLINK-14615) Add Flink Web UI capabilities for savepoint

2019-11-05 Thread Mario Georgiev (Jira)
Mario Georgiev created FLINK-14615: --- Summary: Add Flink Web UI capabilities for savepoint Key: FLINK-14615 URL: https://issues.apache.org/jira/browse/FLINK-14615 Project: Flink Issue Type:

[jira] [Created] (FLINK-14614) add annotation location javastyle rule

2019-11-05 Thread lamber-ken (Jira)
lamber-ken created FLINK-14614: -- Summary: add annotation location javastyle rule Key: FLINK-14614 URL: https://issues.apache.org/jira/browse/FLINK-14614 Project: Flink Issue Type: Improvement

[jira] [Created] (FLINK-14613) Add validation check when applying UDF to tempral table key in Temporal Table Join condition

2019-11-05 Thread hailong wang (Jira)
hailong wang created FLINK-14613: Summary: Add validation check when applying UDF to tempral table key in Temporal Table Join condition Key: FLINK-14613 URL: https://issues.apache.org/jira/browse/FLINK-14613

[DISCUSS] Flink Avro Cloudera Registry (FLINK-14577)

2019-11-05 Thread Őrhidi Mátyás
Dear Flink Community! We have noticed a recent request for Hortonworks schema registry support ( FLINK-14577 ). We have an implementation for it already, and we would be happy to contribute it to Apache Flink. You can find the documentation below

[jira] [Created] (FLINK-14612) Degenerate the current ConcurrentHashMap type of intermediateResults to a normal HashMap type.

2019-11-05 Thread vinoyang (Jira)
vinoyang created FLINK-14612: Summary: Degenerate the current ConcurrentHashMap type of intermediateResults to a normal HashMap type. Key: FLINK-14612 URL: https://issues.apache.org/jira/browse/FLINK-14612

Anybody can help to review the PR about re-work the e2e framework in Java ?

2019-11-05 Thread OpenInx
Hi: I’m working on the e2e framework rework, i.e. rewriting the e2e framework in Java so that we can do more things, such as running it on both standalone and distributed Flink clusters, testing on a standalone or distributed Kafka env, running under a Maven env, etc. Talked with Till and Chesnay in Slack

[jira] [Created] (FLINK-14611) Move allVerticesInSameSlotSharingGroupByDefault from ExecutionConfig to StreamGraph

2019-11-05 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-14611: --- Summary: Move allVerticesInSameSlotSharingGroupByDefault from ExecutionConfig to StreamGraph Key: FLINK-14611 URL: https://issues.apache.org/jira/browse/FLINK-14611 Project: Fl

RocksDB state on HDFS seems not being cleanned up

2019-11-05 Thread shuwen zhou
Hi Community, I have a job running on Flink 1.9.0 on YARN with RocksDB on HDFS with incremental checkpoints enabled. I have some MapState in code with the following config: val ttlConfig = StateTtlConfig .newBuilder(Time.minutes(30)) .updateTtlOnCreateAndWrite() .cleanupInBackground() .cleanupFul