Re: How to use FsBackBackend without getting deprecation warning

2020-08-10 Thread Yun Tang
Hi Nikola You could use codes below to get rid of the warnings. StateBackend fsStateBackend = new FsStateBackend("hdfs://namenode:40010/flink/checkpoints"); env.setStateBackend(fsStateBackend); In fact, this warning is actually no harmful. Best Yun Tang ___

Re: Flink checkpoint recovery time

2020-08-18 Thread Yun Tang
savepoint. Unfortunately, this part of time is also not recorded in metrics now. If you find the task is in RUNNING state but not consume any record, that might be stuck in restoring checkpoint/savepoint. Best Yun Tang From: Zhinan Cheng Sent: Tuesday, August 1

Re: Flink checkpointing with Azure block storage

2020-08-20 Thread Yun Tang
nfigured, using default (Memory / JobManager) MemoryStateBackend". You can view the log to see whether your changes printed to search for "Loading configuration property". [1] https://ci.apache.org/projects/flink/flink-docs-stable/ops/filesystems/azure.html#credential

Re: How to sink invalid data from flatmap

2020-08-24 Thread Yun Tang
Hi Chao I think side output [1] might meet your requirements. [1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/side_output.html Best Yun Tang From: 范超 Sent: Tuesday, August 25, 2020 10:54 To: user Subject: How to sink invalid data from

Re: [ANNOUNCE] Apache Flink 1.10.2 released

2020-08-25 Thread Yun Tang
Thanks for Zhu's work to manage this release and everyone who contributed to this! Best, Yun Tang From: Yangze Guo Sent: Tuesday, August 25, 2020 14:47 To: Dian Fu Cc: Zhu Zhu ; dev ; user ; user-zh Subject: Re: [ANNOUNCE] Apache Flink 1.10.2 released T

Re: Monitor the usage of keyed state

2020-08-26 Thread Yun Tang
ects/flink/flink-docs-release-1.11/ops/config.html#state-backend-rocksdb-memory-managed [2] https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/config.html#state-backend-rocksdb-metrics-block-cache-usage Best, Yun Tang From: Andrey Zagrebin Sent: Tue

Re: [ANNOUNCE] New PMC member: Dian Fu

2020-08-27 Thread Yun Tang
Congratulations , Dian! Best Yun Tang From: Yang Wang Sent: Friday, August 28, 2020 10:28 To: Arvid Heise Cc: Benchao Li ; dev ; user-zh ; Dian Fu ; user Subject: Re: [ANNOUNCE] New PMC member: Dian Fu Congratulations Dian ! Best, Yang Arvid Heise

Re: Performance issue associated with managed RocksDB memory

2020-08-27 Thread Yun Tang
b.com/dataArtisans/frocksdb/blob/49bc897d5d768026f1eb816d960c1f2383396ef4/include/rocksdb/write_buffer_manager.h#L47 [2] https://github.com/dataArtisans/frocksdb/blob/49bc897d5d768026f1eb816d960c1f2383396ef4/db/column_family.cc#L196 Best Yun Tang From: Juha Mynttinen Sent: Monday, August 24, 2020

Re: Flink Migration

2020-08-28 Thread Yun Tang
host could be resolved. You can also check the service of 'jobmanager' whether work as expected via 'kubectl get svc' . [1] https://ci.apache.org/projects/flink/flink-docs-stable/ops/upgrading.html#compatibility-table Best Yun Tang From: N

Re: flink on k8s 如果jobmanager 所在pod重启后job失败如何处理

2020-09-03 Thread Yun Tang
jobmanager high availability[1] and you could refer to [2] for plans of HighAvailabilityService based on native k8s APIs. [1] https://ci.apache.org/projects/flink/flink-docs-stable/ops/jobmanager_high_availability.html [2] https://issues.apache.org/jira/browse/FLINK-12884 Best Yun Tang

Re: Flink 1.11 TaskManagerLogFileHandler -Exception

2020-09-04 Thread Yun Tang
x27; locates [1] via property {log.file}. [1] https://github.com/apache/flink/blob/6b9cdd41743edd24a929074d62a57b84e7b2dd97/flink-runtime/src/main/java/org/apache/flink/runtime/clusterframework/BootstrapTools.java#L419 Best, Yun Tang From: aj Sent: Friday, Septem

Re: Performance issue associated with managed RocksDB memory

2020-09-08 Thread Yun Tang
I think it's worth to give some hints in Flink documentations. When talking about your idea to sanity check the arena size, I think a warning should be enough as Flink seems never throw exception directly when the performance could be poor. Best Yun Tang

Re: flink checkpoint timeout

2020-09-14 Thread Yun Tang
/browse/FLINK-14816 Best Yun Tang From: Deshpande, Omkar Sent: Tuesday, September 15, 2020 10:25 To: user@flink.apache.org Subject: Re: flink checkpoint timeout I have followed this https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/memory

Re: info about flinkml

2020-09-14 Thread Yun Tang
] https://issues.apache.org/jira/browse/FLINK-12470 [4] https://github.com/alibaba/Alink Best Yun Tang From: Cristian Lorenzetto Sent: Monday, September 14, 2020 18:59 To: user@flink.apache.org Subject: info about flinkml Hi i m evaluating to adopt flink instead

Re: Performance issue associated with managed RocksDB memory

2020-09-14 Thread Yun Tang
Hi Juha Would you please consider to contribute this back to community? If agreed, please open a JIRA ticket and we could help review your PR then. Best Yun Tang From: Juha Mynttinen Sent: Thursday, September 10, 2020 19:05 To: Stephan Ewen Cc: Yun Tang ; user

Re: Poor performance with large keys using RocksDB and MapState

2020-09-24 Thread Yun Tang
://issues.apache.org/jira/browse/FLINK-17800 Best Yun Tang From: ירון שני Sent: Wednesday, September 23, 2020 23:56 To: user@flink.apache.org Subject: Poor performance with large keys using RocksDB and MapState Hello, I have a poor throughput issue, and I think I

Re: Poor performance with large keys using RocksDB and MapState

2020-10-01 Thread Yun Tang
lock-Cache#caching-index-filter-and-compression-dictionary-blocks [2] https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/state/state_backends.html#memory-management Best Yun Tang From: ירון שני Sent: Tuesday, September 29, 2020 17:49 To: Yun Tang

Re: 回复: need help about "incremental checkpoint",Thanks

2020-10-02 Thread Yun Tang
ink-runtime/src/main/java/org/apache/flink/runtime/state/filesystem/AbstractFsCheckpointStorage.java#L60 Best Yun Tang From: 大森林 Sent: Saturday, October 3, 2020 9:30 To: David Anderson Cc: user Subject: 回复: need help about "incremental checkpoint",T

Re: checkpoint fail

2020-10-10 Thread Yun Tang
Hi Song Flink-1.4.2 is a bit too old, and I think this error is caused by FLINK-8876 [1][2] which should be fixed after Flink-1.5, please consider to upgrade Flink version. [1] https://issues.apache.org/jira/browse/FLINK-8876 [2] https://issues.apache.org/jira/browse/FLINK-8836 Best Yun Tang

Re: Is MapState tied to Operator Input & Output type?

2020-10-13 Thread Yun Tang
hich using the 'TestProcess'? The state might not be restored if you change your code without id assigned. [1] https://ci.apache.org/projects/flink/flink-docs-stable/ops/state/savepoints.html#assigning-operator-ids Best Yun Tang From: Arpith P Sent: Tuesd

Re: Is MapState tied to Operator Input & Output type?

2020-10-13 Thread Yun Tang
cess keyed state and timers you have to apply the ProcessFunction on a keyed stream"? [1] https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/stream/operators/process_function.html#the-processfunction Best Yun Tang From: Arpith P Sent: Tuesda

Re: Rocksdb - Incremental vs full checkpoints

2020-10-13 Thread Yun Tang
200 KB), I think your RocksDB has not ever triggered compaction to reduce sst files, that's why the size constantly increase. Best Yun Tang From: sudranga Sent: Wednesday, October 14, 2020 10:40 To: user@flink.apache.org Subject: Rocksdb - Incremental vs

Re: Large state RocksDb backend increases app start time

2020-10-14 Thread Yun Tang
[2] https://issues.apache.org/jira/browse/FLINK-17288 Best Yun Tang From: Arpith P Sent: Thursday, October 15, 2020 0:50 To: user Subject: Large state RocksDb backend increases app start time Hi, I'm currently storing around 70GB of data in map sate back

Re: akka.framesize configuration does not runtime execution

2020-10-18 Thread Yun Tang
. BTW, why you could have so large RPC message over than 1GB? [1] https://github.com/apache/flink/blob/f705f0af6ba50f6e68c22484d1daeda842518d27/flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/PendingCheckpoint.java#L313 Best Yun Tang From: Yuval

Re: Flink checkpointing state

2020-10-26 Thread Yun Tang
Hi Boris Please refer to FLINK-12884[1] for current progress of native HA support of k8s which targets for release-1.12. [1] https://issues.apache.org/jira/browse/FLINK-12884 Best Yun Tang From: Boris Lublinsky Sent: Tuesday, October 27, 2020 2:56 To: user

Re: Rocksdb - Incremental vs full checkpoints

2020-10-26 Thread Yun Tang
n the machine and find where locates state dir to see how sst files stored for each checkpoint when local recovery is enabled [1]. [1] https://ci.apache.org/projects/flink/flink-docs-stable/ops/state/large_state_tuning.html#task-local-recovery Best Yun Tang

Re: Flink checkpointing state

2020-10-29 Thread Yun Tang
Hi Added Yang Wang who mainly develops this feature, I think he could provide more information. Best Yun Tang From: Boris Lublinsky Sent: Tuesday, October 27, 2020 22:57 To: Yun Tang Cc: user Subject: Re: Flink checkpointing state Thanks Yun, This refers to

Re: Job Manager is taking very long time to finalize the Checkpointing.

2020-11-18 Thread Yun Tang
rease. [1] https://ci.apache.org/projects/flink/flink-docs-stable/ops/config.html#state-backend-fs-memory-threshold Best Yun Tang From: Slim Bouguerra Sent: Wednesday, November 18, 2020 6:16 To: user@flink.apache.org Subject: Job Manager is taking very long time to fi

Re: What happens when a job is rescaled

2020-11-18 Thread Yun Tang
/RocksDBIncrementalRestoreOperation.java#L274 [2] https://issues.apache.org/jira/browse/FLINK-17288 Best Yun Tang From: Richard Deurwaarder Sent: Saturday, November 14, 2020 0:14 To: user Subject: What happens when a job is rescaled Hello, I have a question about what

Re: Job Manager is taking very long time to finalize the Checkpointing.

2020-11-18 Thread Yun Tang
/flink-docs-master/monitoring/checkpoint_monitoring.html#monitoring Best Yun Tang From: Slim Bouguerra Sent: Thursday, November 19, 2020 7:56 To: Yun Tang Cc: user@flink.apache.org Subject: Re: Job Manager is taking very long time to finalize the Checkpointing. Hi Y

Re: fromCollection() and savepoints

2020-11-25 Thread Yun Tang
/c354f7bd679b9fa8c1e0d75feb3827ccca7f317b/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/FromElementsFunction.java#L123 Best Yun Tang From: Tomasz Dudziak Sent: Wednesday, November 25, 2020 18:22 To: user@flink.apache.org Subject: fromCollection

Re: State Processor API SQL State

2020-12-01 Thread Yun Tang
0faa8454db36f0/flink-table/flink-table-runtime-blink/src/main/java/org/apache/flink/table/runtime/operators/join/stream/StreamingJoinOperator.java#L83 Best Yun Tang From: Dominik Wosiński Sent: Tuesday, December 1, 2020 21:05 To: dev Subject: State Processor AP

Re: Questions regarding DDL and savepoints

2020-12-02 Thread Yun Tang
/flink/flink-docs-release-1.11/ops/cli.html#restore-a-savepoint Best Yun Tang From: Kevin Kwon Sent: Thursday, December 3, 2020 8:31 To: user@flink.apache.org Subject: Questions regarding DDL and savepoints I have a question regarding DDLs if they are considered

Re: Running Flink job as a rest

2020-12-02 Thread Yun Tang
k-examples/flink-examples-streaming/src/main/java/org/apache/flink/streaming/examples/socket/SocketWindowWordCount.java Best Yun Tang From: dhurandar S Sent: Thursday, December 3, 2020 5:31 To: Flink Dev ; user Subject: Running Flink job as a rest Can Flink job

Re: Flink 1.9Version State TTL parameter configuration it does not work

2020-12-06 Thread Yun Tang
what's RocksDB doing. [1] https://github.com/jvm-profiling-tools/async-profiler Best Yun Tang From: Andrey Zagrebin Sent: Friday, December 4, 2020 17:49 To: user Subject: Re: Flink 1.9Version State TTL parameter configuration it does not work Hi

Re: Problem when restoring from savepoint with missing state & POJO modification

2020-12-07 Thread Yun Tang
Hi Bastien, Flink supports to register state via state descriptor when calling runtimeContext.getState(). However, once the state is registered, it cannot be removed anymore. And when you restore from savepoint, the previous state is registered again [1]. Flink does not to drop state directly a

Re: Recommendation about RocksDB Metrics ?

2020-12-08 Thread Yun Tang
trics.total-sst-files-size BTW, state.backend.rocksdb.metrics.column-family-as-variable is not rocksDB internal metrics but to expose column family as variable so that we could classify different state status. Best Yun Tang From: Steven Wu Sent: Wednesday,

Re: Flink cli Stop command exception

2020-12-09 Thread Yun Tang
Hi Suchithra, Have you ever checked job manager log to see whether the savepoint is triggered and why the savepoint failed to complete. Best Yun Tang From: V N, Suchithra (Nokia - IN/Bangalore) Sent: Wednesday, December 9, 2020 23:45 To: user@flink.apache.org

Re: Problem when restoring from savepoint with missing state & POJO modification

2020-12-09 Thread Yun Tang
-processing-api/src/main/java/org/apache/flink/state/api/WritableSavepoint.java#L103 [2] https://github.com/apache/flink/blob/168124f99c75e873adc81437c700f85f703e2248/flink-libraries/flink-state-processing-api/src/main/java/org/apache/flink/state/api/output/StatePathExtractor.java#L54 Best Yun Tang

Re: [ANNOUNCE] Apache Flink 1.12.0 released

2020-12-10 Thread Yun Tang
Thanks Dian and Robert for driving this release and thanks everyone who makes this great work possible ! Best Yun Tang From: Wei Zhong Sent: Thursday, December 10, 2020 20:32 To: d...@flink.apache.org Cc: user ; annou...@apache.org Subject: Re: [ANNOUNCE

Re: Queryable state on task managers that are not running the job

2020-12-23 Thread Yun Tang
r, no queryable state could be queried. [1] https://ci.apache.org/projects/flink/flink-docs-stable/deployment/resource-providers/yarn.html#per-job-cluster-mode Best Yun Tang From: Martin Boyanov Sent: Monday, December 21, 2020 19:04 To: user@flink.apache.org Su

Re: Restoring from checkpoint with different parallism

2021-01-11 Thread Yun Tang
Hi Rex, I think doc [1] should have given some descriptions. Rescaling from previous checkpoint is still supported in current Flink version. [1] https://ci.apache.org/projects/flink/flink-docs-stable/ops/state/checkpoints.html#resuming-from-a-retained-checkpoint Best Yun Tang

Re: Setting different timeouts for savepoints and checkpoints

2021-01-18 Thread Yun Tang
Hi Timo and Rex, Actually, there existed several existing issues: FLINK-9465 [1] targets for CLI option while FLINK-10360 [2] targets for REST API. [1] https://issues.apache.org/jira/browse/FLINK-9465 [2] https://issues.apache.org/jira/browse/FLINK-10360 Best Yun Tang

Re: Problem with overridden hashCode/equals in keys in Flink 1.11.3 when checkpointing with RocksDB

2021-01-21 Thread Yun Tang
/custom_serialization.html Best Yun Tang From: Robert Metzger Sent: Thursday, January 21, 2021 22:49 To: David Haglund Cc: user@flink.apache.org Subject: Re: Problem with overridden hashCode/equals in keys in Flink 1.11.3 when checkpointing with RocksDB Hey David, this is a good

Re: rocksdb block cache usage

2021-01-27 Thread Yun Tang
Hi, If you have enabled managed memory, and since all rocksDB instances share the same block cache within one slot, all flink_taskmanager_job_task_operator_window_contents_rocksdb_block_cache_pinned_usage in the same slot would report the same value. Best Yun Tang

Re: Understanding n LIST calls as part of checkpointing

2020-03-08 Thread Yun Tang
should not come from Flink if you're using Flink-1.5+ [1] https://issues.apache.org/jira/browse/FLINK-8540 Best Yun Tang From: Piyush Narang Sent: Saturday, March 7, 2020 6:15 To: user Subject: Understanding n LIST calls as part of checkpointing Hi fo

Re: Question about RocksDBStateBackend Compaction Filter state cleanup

2020-03-17 Thread Yun Tang
compaction_filter.h#L140 Best Yun Tang From: LakeShen Sent: Tuesday, March 17, 2020 15:30 To: dev ; user-zh ; user Subject: Question about RocksDBStateBackend Compaction Filter state cleanup Hi community , I see the flink RocksDBStateBackend state

Re: savepoint - checkpoint - directory

2020-03-26 Thread Yun Tang
Hi Fanbin To resume from checkpoint, you should provide at least the directory named as /path/chk-x or /path/chk-x/_metadata. The sub-dir named as “shared” is used to store incremental checkpoint content. You could refer to [1] for more information. BTW, stop with savepoint could help reduce

Re: [Third-party Tool] Flink memory calculator

2020-03-29 Thread Yun Tang
Very interesting and convenient tool, just a quick question: could this tool also handle deployment cluster commands like "-tm" mixed with configuration in `flink-conf.yaml` ? Best Yun Tang From: Yangze Guo Sent: Friday, March 27, 2020 18:00 To: u

Re: Log file environment variable 'log.file' is not set.

2020-03-29 Thread Yun Tang
path, I am afraid this cannot understand hdfs paths. [1] https://github.com/apache/flink/blob/ae3b0ff80b93a83a358ab474060473863d2c30d6/flink-runtime/src/main/java/org/apache/flink/runtime/clusterframework/BootstrapTools.java#L420 Best Yun Tang From: Vitaliy Sem

Re: Latency tracking together with broadcast state can cause job failure

2020-04-01 Thread Yun Tang
Hi Lasse Never meet this problem before, but can you share some exception stack trace so that we could take a look. The simple project to reproduce is also a good choice. Best Yun Tang From: Lasse Nedergaard Sent: Tuesday, March 31, 2020 19:10 To: user

Re: Flink incremental checkpointing - how long does data is kept in the share folder

2020-04-07 Thread Yun Tang
/src/main/java/org/apache/flink/runtime/checkpoint/Checkpoints.java#L96 Best Yun Tang From: Shachar Carmeli Sent: Tuesday, April 7, 2020 16:19 To: user@flink.apache.org Subject: Flink incremental checkpointing - how long does data is kept in the share folder We

Re: New kafka producer on each checkpoint

2020-04-07 Thread Yun Tang
/flink/streaming/connectors/kafka/FlinkKafkaProducer.java#L871 Best Yun Tang From: Maxim Parkachov Sent: Monday, April 6, 2020 23:16 To: user@flink.apache.org Subject: New kafka producer on each checkpoint Hi everyone, I'm trying to test exactly once function

Re: Using MapState clear, put methods in snapshotState within KeyedCoProcessFunction, valid or not?

2020-04-07 Thread Yun Tang
quot;models" is just a HashMap[(String, String), Model], and I don't know why we need to couple all models to just one specific key. Best Yun Tang From: Salva Alcántara Sent: Sunday, April 5, 2020 20:22 To: user@flink.apache.org Subject: Re: Using MapSt

Re: [ANNOUNCE] Apache Flink Stateful Functions 2.0.0 released

2020-04-08 Thread Yun Tang
Excited to see the stateful functions release! Thanks for the great work of manager Gordon and everyone who ever contributed to this. Best Yun Tang From: Till Rohrmann Sent: Wednesday, April 8, 2020 14:30 To: dev Cc: Oytun Tez ; user Subject: Re: [ANNOUNCE

Re: Possible memory leak in JobManager (Flink 1.10.0)?

2020-04-08 Thread Yun Tang
/ClusterEntrypoint.java#L260 [2] https://github.com/apache/flink/blob/d7e247209358779b6485062b69965b83043fb59d/flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/ZooKeeperCompletedCheckpointStore.java#L234 Best Yun Tang From: Marc LEGER Sent: Wednesday

Re: Possible memory leak in JobManager (Flink 1.10.0)?

2020-04-09 Thread Yun Tang
heckpoints would occupy about 585MB memory, which is close to your observed scenario. >From my point of view, the checkpoint interval of one second is really too >often and would not make much sense in production environment. Best Yun Tang From: Till Rohrmann

Re: FlinkRuntimeException: Unexpected list element deserialization failure

2020-04-09 Thread Yun Tang
org/apache/flink/contrib/streaming/state/RocksDBListState.java#L151 Best Yun Tang From: anaray Sent: Friday, April 10, 2020 1:25 To: user@flink.apache.org Subject: FlinkRuntimeException: Unexpected list element deserialization failure Hi flink team, I see below

Re: Flink incremental checkpointing - how long does data is kept in the share folder

2020-04-13 Thread Yun Tang
ointMetadata to load '_metadata' to know which files belonging to that checkpoint. [1] https://ci.apache.org/projects/flink/flink-docs-stable/ops/state/checkpoints.html#directory-structure Best Yun Tang From: Shachar Carmeli Sent: Sunday,

Re: Quick survey on checkpointing performance

2020-04-15 Thread Yun Tang
async time would not be too large, the most common reason is operator receiving the barrier late which lead to the end-to-end duration large. I hope you could offer the UI of your checkpoint details for further investigation. [1] https://issues.apache.org/jira/browse/FLINK-13390 Best Yun Tang

Re: [PROPOSAL] Contribute training materials to Apache Flink

2020-04-15 Thread Yun Tang
could be run well as expected. Best Yun Tang From: Till Rohrmann Sent: Wednesday, April 15, 2020 16:08 To: dev Cc: Eduardo Winpenny Tejedor ; Seth Wiesman ; Niels Basjes ; user Subject: Re: [PROPOSAL] Contribute training materials to Apache Flink Hi David

Re: Streaming Job eventually begins failing during checkpointing

2020-04-15 Thread Yun Tang
().getBroadcastState ? Did you pass a different operator state descriptor each time? Best Yun Tang From: Stephen Patel Sent: Thursday, April 16, 2020 2:09 To: user@flink.apache.org Subject: Streaming Job eventually begins failing during checkpointing I've got a flink (

Re: Streaming Job eventually begins failing during checkpointing

2020-04-16 Thread Yun Tang
/4fc924a8193bb9495c6b7ba755ced576bb8a35d5/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/stableinput/BufferingDoFnRunner.java#L95 Best Yun Tang From: Stephen Patel Sent: Thursday, April 16, 2020 22:30 To: Yun Tang Cc: user@flink.apache.org Subject

Re: Checkpoint Space Usage Debugging

2020-04-17 Thread Yun Tang
Hi Kent You can view checkpoint details via web UI to know how much checkpointed data uploaded for each operator, and you can compare the state size as time goes on to see whether they upload checkpointed data in stable range. Best Yun Tang From: Kent Murra

Re: Checkpoints for kafka source sometimes get 55 GB size (instead of 2 MB) and flink job fails during restoring from such checkpoint

2020-04-18 Thread Yun Tang
" source? Best Yun Tang From: Jacob Sevart Sent: Saturday, April 18, 2020 9:22 To: Oleg Vysotsky Cc: Timo Walther ; user@flink.apache.org ; Long Nguyen Subject: Re: Checkpoints for kafka source sometimes get 55 GB size (instead of 2 MB) and flink job fa

Re: Modelling time for complex events generated out of simple ones

2020-04-19 Thread Yun Tang
when generating them. I think there is no absolute rules and all depends on your actual scenarios. Best Yun Tang From: Salva Alcántara Sent: Monday, April 20, 2020 2:03 To: user@flink.apache.org Subject: Modelling time for complex events generated out of simple

Re: Flink incremental checkpointing - how long does data is kept in the share folder

2020-04-20 Thread Yun Tang
would say some files are not mentioned in the metadata file but are related to the checkpoint? How to judge that they are related to specific checkpoint? BTW, my name is "Yun" which means cloud in Chinese, not the delicious "Yum" 🙂 Best Yun Tang _

Re: Checkpoints for kafka source sometimes get 55 GB size (instead of 2 MB) and flink job fails during restoring from such checkpoint

2020-04-20 Thread Yun Tang
ts.html#allowing-non-restored-state Best Yun Tang From: Oleg Vysotsky Sent: Tuesday, April 21, 2020 6:45 To: Jacob Sevart ; Timo Walther ; user@flink.apache.org Cc: Long Nguyen ; Gurpreet Singh Subject: Re: Checkpoints for kafka source sometimes get 55 GB size (i

Re: Checkpoints for kafka source sometimes get 55 GB size (instead of 2 MB) and flink job fails during restoring from such checkpoint

2020-04-21 Thread Yun Tang
rs/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaConsumerBase.java#L115 [3] https://github.com/apache/flink/blob/99cbaa929ff9f2f5c387cbf4f76a0166f83a3a8c/flink-runtime/src/main/java/org/apache/flink/runtime/state/PartitionableL

Re: Latency tracking together with broadcast state can cause job failure

2020-04-21 Thread Yun Tang
Hi Lasse Really sorry for missing your reply. I'll run your project and find the root cause in my day time. And thanks for @Robert Metzger<mailto:rmetz...@apache.org> 's kind remind. Best Yun Tang From: Robert Metzger Sent: Tuesday, April 2

Re: Latency tracking together with broadcast state can cause job failure

2020-04-22 Thread Yun Tang
FLINK-17322 [1] to track this problem, and related owner would take a look at this problem. Really thank you for reporting this bug. [1] https://issues.apache.org/jira/browse/FLINK-17322 Best Yun Tang From: Yun Tang Sent: Wednesday, April 22, 2020 1:43 To: Lasse

Re: Flink 1.10.0 stop command

2020-04-22 Thread Yun Tang
Hi I think you could still use ./bin/flink cancel to cancel the job. What is the exception thrown? Best Yun Tang From: seeksst Sent: Wednesday, April 22, 2020 18:17 To: user Subject: Flink 1.10.0 stop command Hi, When i test 1.10.0, i found i must to

Re: Flink incremental checkpointing - how long does data is kept in the share folder

2020-04-22 Thread Yun Tang
job, the files which are newly created than last checkpoint completed time should also been filter out (if you are not retain multi checkpoints). The rest files are safe to remove. A simple way is stopping the job, and remove all files not recorded in the checkpoint metadata. Best Yun Tang

Re: K8s native - checkpointing to S3 with RockDBStateBackend

2020-04-23 Thread Yun Tang
Hi Averell Please build your own flink docker with S3 plugin as official doc said [1] [1] https://ci.apache.org/projects/flink/flink-docs-stable/ops/deployment/docker.html#using-plugins Best Yun Tang From: Averell Sent: Thursday, April 23, 2020 20:58 To: user

Re: RocksDB default logging configuration

2020-04-27 Thread Yun Tang
db.localdir" [1] should work for RocksDB in Flink-1.7.1. [1] https://github.com/apache/flink/blob/808cc1a23abb25bd03d24d75537a1e7c6987eef7/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBStateBackend.java#L285

Re: Savepoint memory overhead

2020-04-29 Thread Yun Tang
Hi Lasse Which version of Flink did you use? Before Flink-1.10, there might exist memory problem when RocksDB executes savepoint with write batch[1]. [1] https://issues.apache.org/jira/browse/FLINK-12785 Best Yun Tang From: Lasse Nedergaard Sent: Wednesday

Re: RocksDB default logging configuration

2020-04-29 Thread Yun Tang
ate.backend: rocksdb state.checkpoints.dir: hdfs:///checkpoint-path Best Yun Tang From: Bajaj, Abhinav Sent: Wednesday, April 29, 2020 3:16 To: Yun Tang ; user@flink.apache.org Cc: Chesnay Schepler Subject: Re: RocksDB default logging configuration Thanks Yun for your r

Re: Savepoint memory overhead

2020-04-29 Thread Yun Tang
-backend Best Yun Tang From: Lasse Nedergaard Sent: Thursday, April 30, 2020 12:39 To: Yun Tang Cc: user Subject: Re: Savepoint memory overhead We using Flink 1.10 running on Mesos. Med venlig hilsen / Best regards Lasse Nedergaard Den 30. apr. 2020 kl. 04.53

Re: What is the RocksDB local directory in flink checkpointing?

2020-05-08 Thread Yun Tang
/BootstrapTools.java#L478-L489 Best Yun Tang From: Till Rohrmann Sent: Wednesday, May 6, 2020 17:35 To: LakeShen Cc: dev ; user ; user-zh Subject: Re: What is the RocksDB local directory in flink checkpointing? Hi LakeShen, `state.backend.rocksdb.localdir` defin

Re: Flink on Kubernetes unable to Recover from failure

2020-05-08 Thread Yun Tang
ontainers running do not mean they're all registered to the job manager, I think you could refer to the JM and TM log to see whether the register connection is lost. Best Yun Tang From: Robert Metzger Sent: Friday, May 8, 2020 22:33 To: Morgan Geldenhuys Cc:

Re: Prometheus Pushgateway Reporter Can not DELETE metrics on pushgateway

2020-05-13 Thread Yun Tang
"operator_id;task_id;task_attempt_id", which are rarely used, in >metrics.reporter..scope.variables.excludes. [1] https://ci.apache.org/projects/flink/flink-docs-release-1.10/monitoring/metrics.html#reporter Best Yun Tang From: Thomas Huang Sent: Wed

Re: Incremental state with purging

2020-05-13 Thread Yun Tang
seconds. [1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/state.html#state-time-to-live-ttl Best Yun Tang From: Annemarie Burger Sent: Wednesday, May 13, 2020 2:46 To: user@flink.apache.org Subject: Incremental state with purging Hi, I

Re: Protection against huge values in RocksDB List State

2020-05-14 Thread Yun Tang
/50d63a2af01a46dd938dc1b717067339c92da040/java/src/main/java/org/rocksdb/RocksDB.java#L1382 Best Yun Tang From: Robin Cassan Sent: Friday, May 15, 2020 0:37 To: user Subject: Protection against huge values in RocksDB List State Hi all! I cannot seem to find any

Re: the savepoint problem of upgrading job from blink-1.5 to flink-1.10

2020-05-15 Thread Yun Tang
blink. As you can see, we use "org.apache.flink.runtime.state.KeyGroupsStateSnapshot" instead of "org.apache.flink.runtime.state.KeyGroupsStateHandle", and thus the savepoint generated by Blink cannot be easily consumed by Flink. Best Yun Tang ___

Re: Protection against huge values in RocksDB List State

2020-05-15 Thread Yun Tang
Yun Tang From: Robin Cassan Sent: Friday, May 15, 2020 20:59 To: Yun Tang Cc: user Subject: Re: Protection against huge values in RocksDB List State Hi Yun, thanks for your answer! And sorry I didn't see this limitation from the documentation, makes sens

Re: Rocksdb implementation

2020-05-18 Thread Yun Tang
ests/flink-queryable-state-test/src/main/java/org/apache/flink/streaming/tests/queryablestate/QsStateClient.java#L108-L122 Best Yun Tang From: Arvid Heise Sent: Monday, May 18, 2020 23:40 To: Jaswin Shah Cc: user@flink.apache.org Subject: Re: Rocksdb implementation Hi Ja

Re: Using Queryable State within 1 job + docs suggestion

2020-05-18 Thread Yun Tang
nable this feature on server side as Flink job needs to access the queryable-state classes. If you're just running your Flink job locally, add dependency could let your local job access the queryable-state classes which is actually the doc wanted to tell users.

Re: RocksDB savepoint recovery performance improvements

2020-05-18 Thread Yun Tang
[1] https://github.com/apache/flink/blob/f35679966eac9e3bb53a02bcdbd36dbd1341d405/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/snapshot/RocksFullSnapshotStrategy.java#L308 Best Yun Tang From: Joey Pereira Sent: Tues

Re: Using Queryable State within 1 job + docs suggestion

2020-05-21 Thread Yun Tang
they could query each other, which provide better performance. Best Yun Tang From: Annemarie Burger Sent: Thursday, May 21, 2020 19:45 To: user@flink.apache.org Subject: Re: Using Queryable State within 1 job + docs suggestion Hi, Thanks for your response! What i

Re: In consistent Check point API response

2020-05-25 Thread Yun Tang
f Flink, it will create the checkpoint statics >without restored checkpoint and assign it once the latest savepoint/checkpoint >is restored. [1] [1] https://github.com/apache/flink/blob/50253c6b89e3c92cac23edda6556770a63643c90/flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/Che

Re: In consistent Check point API response

2020-05-25 Thread Yun Tang
, to tell your story. [1] https://github.com/apache/flink/blob/8f992e8e868b846cf7fe8de23923358fc6b50721/flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointCoordinator.java#L1250 Best Yun Tang From: Vijay Bhaskar Sent: Monday, May 25, 2020 17

Re: In consistent Check point API response

2020-05-26 Thread Yun Tang
estored" is for last restored checkpoint and "completed" is for last completed checkpoint, they are actually not the same thing. The only scenario that they're the same in numbers is when Flink just restore successfully before a new checkpoint completes. Best Yun Tang __

Re: RocksDB savepoint recovery performance improvements

2020-05-26 Thread Yun Tang
that is for quick fix at his scenario. Best Yun Tang From: Steven Wu Sent: Wednesday, May 27, 2020 0:36 To: Joey Pereira Cc: user@flink.apache.org ; Yun Tang ; Mike Mintz ; Shahid Chohan ; Aaron Levin Subject: Re: RocksDB savepoint recovery performanc

Re: In consistent Check point API response

2020-05-26 Thread Yun Tang
flink/blob/master/docs/monitoring/checkpoint_monitoring.zh.md Best Yun Tang From: Vijay Bhaskar Sent: Tuesday, May 26, 2020 15:18 To: Yun Tang Cc: user Subject: Re: In consistent Check point API response Thanks Yun. How can i contribute better documentation of t

Re: Running and Maintaining Multiple Jobs

2020-05-28 Thread Yun Tang
Hi Prasanna As far as I know, Flink does not allow to submit new jobgraph without restarting it, and I actually not understand what's your 3rd question meaning. From: Prasanna kumar Sent: Friday, May 29, 2020 11:18 To: Yun Tang Cc: user Subject: Re: Ru

Re: State expiration in Flink

2020-05-31 Thread Yun Tang
Hi Vasily After Flink-1.10, state will be cleaned up periodically as CleanupInBackground is enabled by default. Thus, even you never access some specific entry of state and that entry could still be cleaned up. Best Yun Tang From: Vasily Melnik Sent: Saturday

Re: State expiration in Flink

2020-06-01 Thread Yun Tang
Hi Vasily As far as I know, current TTL of state lack of such kind of trigger, and perhaps onTimer or process specific event to trigger could help your scenario. Best Yun Tang. From: Vasily Melnik Sent: Monday, June 1, 2020 14:13 To: Yun Tang Cc: user Subject

Re: Flink not restoring from checkpoint when job manager fails even with HA

2020-06-07 Thread Yun Tang
not be discarded. If you want resume from previous job, you should use -s command to resume from retained checkpoint. [1] [1] https://ci.apache.org/projects/flink/flink-docs-stable/ops/state/checkpoints.html#resuming-from-a-retained-checkpoint Best Yun Tang Fr

Re: Flink not restoring from checkpoint when job manager fails even with HA

2020-06-07 Thread Yun Tang
Hi Bhaskar We strongly not encourage to use such hack configuration to make job always having with the same special job id. If you stick to use this, all runs of this jobgraph would have the same job id. Best Yun Tang From: Vijay Bhaskar Sent: Monday, June 8

Re: Flink not restoring from checkpoint when job manager fails even with HA

2020-06-08 Thread Yun Tang
Hi Bhaskar By default, you will get a new job id. There existed some hack and hidden method to set the job id but is not meant to be used by the user Best Yun Tang From: Vijay Bhaskar Sent: Monday, June 8, 2020 15:03 To: Yun Tang Cc: Kathula, Sandeep ; user

  1   2   3   4   >