Re: Flink not restoring from checkpoint when job manager fails even with HA

2020-06-07 Thread Yun Tang
Hi Bhaskar, We strongly discourage using such a hack configuration to make the job always have the same special job id. If you stick to using it, all runs of this jobgraph will have the same job id. Best Yun Tang From: Vijay Bhaskar Sent: Monday, June 8, 2020

Re: Flink not restoring from checkpoint when job manager fails even with HA

2020-06-07 Thread Vijay Bhaskar
Hi Yun, If we start using the special Job ID and redeploy the job, then after deployment will it get assigned the special Job ID, or a new Job ID? Regards Bhaskar On Mon, Jun 8, 2020 at 9:33 AM Yun Tang wrote: > Hi Sandeep > > In general, Flink assigns a unique job-id to each job and uses

Re: Flink not restoring from checkpoint when job manager fails even with HA

2020-06-07 Thread Yun Tang
Hi Sandeep, In general, Flink assigns a unique job-id to each job and uses that id as the ZK path. Thus, when the checkpoint store shuts down in a globally terminal state (e.g. FAILED, CANCELLED), it needs to clean up the paths in ZK to ensure there is no resource leak, as the next job would have a different job-id. I
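[Editor's note] To make the cleanup concrete, a sketch of the ZooKeeper layout involved. The path names are derived from the high-availability.zookeeper.path.root, high-availability.cluster-id, and high-availability.zookeeper.path.checkpoints defaults; treat the exact subpaths as illustrative, since they vary by version:

    /flink/default/checkpoints/<job-id>/...       completed checkpoint handles
    /flink/default/checkpoint-counter/<job-id>    checkpoint id counter

Because an auto-generated <job-id> changes on every submission, Flink removes these nodes once a job reaches a globally terminal state; pinning a fixed job id makes every run share the same nodes, which is exactly the behavior discouraged above.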

Re: UpsertStreamTableSink for Aggregate Only Query

2020-06-07 Thread Jark Wu
Hi Satyam, Currently, `UpsertStreamTableSink` requires the query to contain a primary key, and the key will be passed to `UpsertStreamTableSink#setKeyFields`. If there is no primary key in the query, an error will be thrown, as you can see. It should work for all the group by queries (if no projection
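[Editor's note] For illustration (not part of the original message), a minimal sketch of the two query shapes in a recent PyFlink; the thread itself concerns the Java/Scala Table API, and the table and column names here are made up:

    from pyflink.table import EnvironmentSettings, TableEnvironment

    t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())
    t_env.execute_sql(
        "CREATE TABLE clicks (user_id STRING) WITH ('connector' = 'datagen')")

    # A GROUP BY query derives a unique key (user_id), so an upsert sink
    # can receive its key fields and the query is accepted.
    keyed = t_env.sql_query(
        "SELECT user_id, COUNT(*) AS cnt FROM clicks GROUP BY user_id")

    # A plain SELECT derives no key, which is the case that makes an
    # UpsertStreamTableSink raise the error described above.
    keyless = t_env.sql_query("SELECT user_id FROM clicks")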

Re: Native K8S not creating TMs

2020-06-07 Thread Yang Wang
Hi Kevin, It may be because of the character-length limitation in K8s (no more than 63)[1], so the pod name cannot be too long. I notice that you are using the automatically generated cluster-id from the client. That may cause problems; could you set a meaningful cluster-id for your Flink session? For example, k
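[Editor's note] The setting being suggested is presumably kubernetes.cluster-id, passed when the session is started; the command shape below is assumed from the Flink 1.10 native Kubernetes setup:

    ./bin/kubernetes-session.sh -Dkubernetes.cluster-id=my-flink-session

A short, explicit id also keeps the generated pod and service names safely under the 63-character limit mentioned above.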

Re: Beam/Python/Flink unable to deserialize UnboundedSource

2020-06-07 Thread Pradip Thachile
I've attached below some minimal sample code that reproduces this issue. This works perfectly with the DirectRunner. -Pradip

#!/usr/bin/env python3
import apache_beam as beam
import logging
import os

class DummyPipeline(beam.PTransform):
    def expand(self, p):
        (
            p
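[Editor's note] For context, a self-contained sketch of what the truncated repro above might look like; the PubSub subscription and everything past the truncation point are assumptions, not the original code:

    #!/usr/bin/env python3
    # Hypothetical reconstruction of the truncated sample above.
    import logging

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions


    class DummyPipeline(beam.PTransform):
        def expand(self, p):
            return (
                p
                # Unbounded source: this is the read that fails to
                # deserialize on the Flink runner in the report above.
                | "Read" >> beam.io.ReadFromPubSub(
                    subscription="projects/<project>/subscriptions/<sub>")
                | "Log" >> beam.Map(lambda msg: logging.info("got: %s", msg))
            )


    if __name__ == "__main__":
        # Streaming mode is required for an unbounded source; runner options
        # (e.g. --runner=FlinkRunner) would be passed on the command line.
        with beam.Pipeline(options=PipelineOptions(streaming=True)) as p:
            p | DummyPipeline()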

Beam/Python/Flink unable to deserialize UnboundedSource

2020-06-07 Thread Pradip Thachile
Hey folks, I've got a Beam/Python pipeline that works on the DirectRunner and am now trying to run it on a local dev Flink cluster. Running it yields an error right out of the gate about not being able to deserialize UnboundedSource (my PubSub source). I'm not sure how to debug this and would love to

Re: Flink savepoints history

2020-06-07 Thread Chesnay Schepler
I can't quite find the answer right now, but the Web UI relies entirely on the REST API. So, whenever you see something in the UI, and wonder where that data comes from, open up the developer tools in your browser, go to the network tab, reload the page and see what requests are being made. On
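[Editor's note] As a concrete illustration of that approach, a minimal sketch that calls the same endpoints the Web UI uses to show checkpoint/savepoint data (endpoint and field names per the Flink 1.10 REST API; the JobManager address is an assumption):

    import requests

    BASE = "http://localhost:8081"

    # /jobs lists all jobs; /jobs/<id>/checkpoints includes the latest
    # completed checkpoint and the latest savepoint for each job.
    for job in requests.get(f"{BASE}/jobs").json()["jobs"]:
        stats = requests.get(f"{BASE}/jobs/{job['id']}/checkpoints").json()
        savepoint = (stats.get("latest") or {}).get("savepoint")
        print(job["id"], job["status"], savepoint)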

Re: Simple stateful polling source

2020-06-07 Thread Ken Krugler
Hi Chesnay, > On Jun 19, 2019, at 6:05 AM, Chesnay Schepler wrote: > > A (Rich)SourceFunction that does not implement RichParallelSourceFunction is > always run with a parallelism of 1. RichSourceFunction

Flink savepoints history

2020-06-07 Thread M Singh
Hi: I wanted to find out if we can access the savepoints created for a job, or for all jobs, using the Flink CLI or REST API. I took a look at the CLI (Apache Flink 1.10 Documentation: Command-Line Interface) and REST API (https://ci.apache.org/projects/flink/flink-docs-release-1.10/monitoring/rest_api.

Re: Run command after Batch is finished

2020-06-07 Thread Mark Davis
Hi Jeff, Unfortunately this is not good enough for me. My clients are very volatile: they start a batch and can go away at any moment without waiting for it to finish. Think of an elastic web application or an AWS Lambda. I hoped to find something that could be deployed to the cluster together with
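[Editor's note] One workaround, not from the thread, just a sketch of the pattern: deploy a tiny watcher next to the cluster that polls the REST API and runs the follow-up command once the job reaches a terminal state, so the volatile client never has to wait. Endpoint and field names follow the standard REST API; the address, job id, and command are placeholders:

    import subprocess
    import time

    import requests

    BASE = "http://jobmanager:8081"  # assumed JobManager address
    JOB_ID = "<job-id>"              # recorded by whoever submitted the batch
    TERMINAL = {"FINISHED", "FAILED", "CANCELED"}

    # Poll until the job leaves the running states.
    while True:
        state = requests.get(f"{BASE}/jobs/{JOB_ID}").json()["state"]
        if state in TERMINAL:
            break
        time.sleep(10)

    # Run the post-batch command only on success.
    if state == "FINISHED":
        subprocess.run(["./post-batch-cleanup.sh"], check=True)  # hypothetical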