Re: What happen to state in Flink Task Manager when crash?

2019-01-11 Thread Congxian Qiu
Hi, Siew Wai Yow When the job is running, the states are stored in the local RocksDB, Flink will copy all the needed states to checkpointPath when doing a Checkpoint. If there have any failures, the job will be restored from the last previously *Successfully* checkpoint, and assign the restored st

Re: What happen to state in Flink Task Manager when crash?

2019-01-11 Thread Siew Wai Yow
Thanks. But this is something I know. I would like to know will the other TM take over the crashed TM's state to ensure data completion(say the state BYKEY, different key state will be stored in different TM) OR the crashed TM need to be recovered to continue? For example, 5 records, rec1:KEYA

Is the eval method invoked in a thread safe manner in Fink UDF

2019-01-11 Thread Anil
Is the eval method invoked in a thread safe manner in Fink UDF. -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: What happen to state in Flink Task Manager when crash?

2019-01-11 Thread Jamie Grier
Flink is designed such that local state is backed up to a highly available system such as HDFS or S3. When a TaskManager fails state is recovered from there. I suggest reading this: https://ci.apache.org/projects/flink/flink-docs-release-1.7/ops/state/state_backends.html On Fri, Jan 11, 2019 at

breaking the pipe line

2019-01-11 Thread Alieh
Hello all, I have a very very long pipeline (implementation of an incremental algorithm). It takes a very long time for Flink execution planner to create the plan. So I splitted the pipeline into several independent pipelines by writing down the intermediate results and again read them. Is th

What happen to state in Flink Task Manager when crash?

2019-01-11 Thread Siew Wai Yow
Hello, May i know what happen to state stored in Flink Task Manager when this Task manager crash. Say the state storage is rocksdb, would those data transfer to other running Task Manager so that complete state data is ready for data processing? Regards, Yow

Re: NoMatchingTableFactoryException when test flink sql with kafka in flink 1.7

2019-01-11 Thread Joshua Fan
Hi Timo Thank you for your advice. It is truely a typo. After I fix it, the same exception remains. But when I add the inAppendMode() to the StreamTableDescriptor, the exception disappears, and it can find the proper kafka08factory. And another exception turns out. Caused by: org.apache.flink.sh

Re: NoMatchingTableFactoryException when test flink sql with kafka in flink 1.7

2019-01-11 Thread Timo Walther
Hi Jashua, according to the property list, you passed "connector.version=0.10" so a Kafka 0.8 factory will not match. Are you sure you are compiling the right thing? There seems to be a mismatch between your screenshot and the exception. Regards, Timo Am 11.01.19 um 15:43 schrieb Joshua Fa

NoMatchingTableFactoryException when test flink sql with kafka in flink 1.7

2019-01-11 Thread Joshua Fan
Hi, I want to test flink sql locally by consuming kafka data in flink 1.7, but it turns out an exception like below. Exception in thread "main" >> org.apache.flink.table.api.NoMatchingTableFactoryException: Could not find >> a suitable table factory for >> 'org.apache.flink.table.factories.Stream

Flink 1.7.1 - Parallel tasks in different Task Managers

2019-01-11 Thread Francisco Blaya
Hi, We are trying to upgrade to Flink 1.7.1. We have a job with parallelism > 1 over 2 tasks. Due to implementation details outside our control one of these tasks cannot share the same JVM with another parallel instance i.e. if the parallelism is 4 we need 4 slots across different Task Managers.

Re: Multiple MapState vs single nested MapState in stateful Operator

2019-01-11 Thread Gagan Agrawal
This makes perfect sense to me. Thanks Congxian and Kostas for your inputs. Gagan On Thu, Jan 10, 2019 at 6:03 PM Kostas Kloudas wrote: > Hi Gagan, > > I agree with Congxian! > In MapState, when accessing the state/value associated with a key in the > map, then the whole value is de-serialized

StreamingFileSink cannot get AWS S3 credentials

2019-01-11 Thread Taher Koitawala
Hi All, We have implemented S3 sink in the following way: StreamingFileSink sink= StreamingFileSink.forBulkFormat(new Path("s3a://mybucket/myfolder/output/"), ParquetAvroWriters.forGenericRecord(schema)) .withBucketCheckInterval(50l).withBucketAssigner(new CustomBucketAssigner()).build();

Flink 1.7.0 HA based on zookeepers

2019-01-11 Thread min.tan
Hi, I have a simple HA setting with Flink 1.7.0: Node1 (active master, active slave) Node2 (standby master, active slave) Step 1, start-cluster.sh from Node1, no problem Step 2, manually kill the active master on Node1, no problem and the standby master become active Step 3, bin/jobmanager.sh st

Re: Building Flink from source according to vendor-specific versionbut causes protobuf conflict

2019-01-11 Thread Gary Yao
Hi Wei, Did you build Flink with maven 3.2.5 as recommended in the documentation [1]? Also, did you use the -Pvendor-repos flag to add the cloudera repository when building? Best, Gary [1] https://ci.apache.org/projects/flink/flink-docs-release-1.7/flinkDev/building.html#build-flink [2] https://