Re: Should I use static database connection pool?

2019-10-16 Thread Xin Ma
Thanks, John. If I don't make my pool static, I think it will create one instance for each task. If the pool is static, each JVM can hold one instance. Depending on the deployment approach, it can create one to multiple instances. Is this correct? Konstantin's talk mentions static variables can lead to

Re: standalone flink savepoint restoration

2019-10-16 Thread Yun Tang
Hi Matt, Have you ever configured `high-availability.cluster-id`? If not, a Flink standalone job would first try to recover from the high-availability checkpoint store named '/default'. If a checkpoint existed there, Flink would always restore from that checkpoint, disabling 'allowNonRestoredState'[1] (alw
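Yun's point can be made concrete with a configuration sketch. The option keys below are real Flink settings, but every value is a placeholder, not something from this thread:

```yaml
# flink-conf.yaml (sketch): give each standalone job its own HA namespace
# so it does not pick up another job's checkpoint store under '/default'.
high-availability: zookeeper
high-availability.cluster-id: /my-job-name           # placeholder value
high-availability.zookeeper.quorum: zk-host:2181     # placeholder address
high-availability.storageDir: s3://bucket/flink/ha   # placeholder path
```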

standalone flink savepoint restoration

2019-10-16 Thread Matt Anger
Hello everyone, I am running a Flink job in k8s as a standalone HA job. Now I have updated my job with some additional sinks, which I guess have made the checkpoints incompatible with the newer version, meaning Flink now crashes on bootup with the following: Caused by: java.lang.IllegalStateException: Th
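For reference, a hedged sketch of how a restore is usually forced past state that no longer maps to an operator. The flags are from the Flink 1.x CLI and the standalone job-cluster scripts; all paths and class names are placeholders:

```sh
# Resume from a savepoint but skip state that no longer maps to an operator:
flink run -s s3://bucket/savepoints/savepoint-abc123 --allowNonRestoredState -d my-job.jar

# The standalone job-cluster entrypoint accepts equivalent options, e.g.:
# standalone-job.sh start --fromSavepoint <path> --allowNonRestoredState \
#     --job-classname com.example.MyJob
```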

Re: Consumer group and flink

2019-10-16 Thread Vishwas Siravara
Please ignore this email. On Wed, Oct 16, 2019 at 1:40 PM Vishwas Siravara wrote: > Hi guys, > Is it necessary to specify a consumer group name for a kafka streaming job > when checkpointing is enabled? Since the offsets are not stored in kafka > how does specifying a consumer group help ? > > B

Consumer group and flink

2019-10-16 Thread Vishwas Siravara
Hi guys, Is it necessary to specify a consumer group name for a Kafka streaming job when checkpointing is enabled? Since the offsets are not stored in Kafka, how does specifying a consumer group help? Best, Vishwas

Re: Should I use static database connection pool?

2019-10-16 Thread John Smith
Xin, the open()/close() cycle of a sink function is only called once, so I don't think you even need to make your pool static. Can someone confirm this? Miki, the JDBC connector lacks some functionality; for instance, it only flushes batches when the batch interval is reached. So if you set batch
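The "one instance per JVM vs. one per task" distinction discussed in this thread can be sketched with a plain static holder. This is a self-contained illustration, not Flink code: in Flink, note that per-job classloaders can still give each job its own copy of a "static" field, so treat this as the JVM-level pattern only. All names are made up:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the "static pool" pattern: a static field is created at most
// once per classloader/JVM, so all sink subtasks deployed into the same
// TaskManager JVM share it, while a plain instance field would give each
// parallel subtask its own pool.
public class SharedPoolHolder {
    private static final AtomicInteger CREATED = new AtomicInteger();
    private static volatile SharedPoolHolder instance;

    private SharedPoolHolder() {
        CREATED.incrementAndGet();
        // a real holder would build the connection pool here
    }

    public static SharedPoolHolder getInstance() {
        if (instance == null) {
            synchronized (SharedPoolHolder.class) {
                if (instance == null) {
                    instance = new SharedPoolHolder();
                }
            }
        }
        return instance;
    }

    public static int timesCreated() {
        return CREATED.get();
    }

    public static void main(String[] args) {
        // Two "subtasks" in the same JVM see the same pool object.
        SharedPoolHolder a = SharedPoolHolder.getInstance();
        SharedPoolHolder b = SharedPoolHolder.getInstance();
        System.out.println("same instance: " + (a == b)
                + ", created: " + timesCreated());
    }
}
```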

Re: Kinesis Connector and Savepoint/Checkpoint restore.

2019-10-16 Thread Steven Nelson
In my situation I believe it's because we have idle shards (it's a testing environment). I dug into the connector code and it looks like it only updates the shard state when a record is processed or when the shard hits shard_end. So, for an idle shard it would never get a checkpointed state. I gues

Re: Kinesis Connector and Savepoint/Checkpoint restore.

2019-10-16 Thread Ravi Bhushan Ratnakar
Do you know a step-by-step process to reproduce this problem? -Ravi On Wed 16 Oct, 2019, 17:40 Steven Nelson, wrote: > I have verified this behavior in 1.9.0, 1.8.1 and 1.7.2. > > About half my shards start over at trim horizon. Why would some shard > statuses appear to not exist in a savepoint

Re: Flink State Migration Version 1.8.2

2019-10-16 Thread ApoorvK
Yes, I have tried giving it as an option; the case class also has a default constructor, but it is still unable to migrate. -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

RE: EXT :Re: Jar Uploads in High Availability (Flink 1.7.2)

2019-10-16 Thread Martin, Nick J [US] (IS)
Yeah, I’ll do that if I have to. I’m hoping there’s a ‘right’ way to do it that’s easier. If I have to implement the zookeeper lookups in my load balancer myself, that feels like a definite step backwards from the pre-1.5 days when the cluster would give 307 redirects to the current leader From

Re: Kinesis Connector and Savepoint/Checkpoint restore.

2019-10-16 Thread Steven Nelson
I have verified this behavior in 1.9.0, 1.8.1 and 1.7.2. About half my shards start over at trim horizon. Why would some shard statuses appear to not exist in a savepoint? This seems like a big problem. -Steve On Wed, Oct 16, 2019 at 12:08 AM Ravi Bhushan Ratnakar < ravibhushanratna...@gmail.co

Re: Flink State Migration Version 1.8.2

2019-10-16 Thread miki haiat
Can you try to add the new variables as an Option? On Wed, Oct 16, 2019, 17:17 ApoorvK wrote: > I have been trying to alter the current state case class (scala) which has > 250 variables; now when I add 10 more variables to the class, and when I > run > my flink application from the save point ta

Re: Should I use static database connection pool?

2019-10-16 Thread miki haiat
If it's a sink that uses JDBC, why not use the Flink JDBC sink connector? On Wed, Oct 16, 2019, 17:03 Xin Ma wrote: > I have watched one of the recent Flink Forward videos, Apache Flink Worst > Practices by Konstantin Knauf. The talk helps me a lot and mentions that we > should avoid using stat

Re: JDBC Table Sink doesn't seem to sink to database.

2019-10-16 Thread John Smith
Ok, I think I found it: it's the batch interval setting. From what I see, if we want a "realtime" stream to the database we have to set it to 1; otherwise the sink will wait until the batch interval count is reached. The batch interval mechanism doesn't seem correct? If the default size is 5000 and y
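John's observation, that writes only reach the database once the batch count is hit, can be illustrated with a minimal, self-contained buffer. This is not Flink's actual JDBC sink code; all names below are made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal illustration of count-based batching: rows are buffered and only
// handed to the writer once batchSize rows have accumulated. With a large
// batchSize and a slow stream, rows can sit unflushed indefinitely, which
// is why batchSize = 1 behaves like a "realtime" sink.
public class BatchBuffer {
    private final int batchSize;
    private final List<String> buffer = new ArrayList<>();
    private final List<List<String>> flushed = new ArrayList<>();

    public BatchBuffer(int batchSize) {
        this.batchSize = batchSize;
    }

    public void add(String row) {
        buffer.add(row);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    public void flush() {
        // stand-in for executing the JDBC batch against the database
        if (!buffer.isEmpty()) {
            flushed.add(new ArrayList<>(buffer));
            buffer.clear();
        }
    }

    public int pending()        { return buffer.size(); }
    public int flushedBatches() { return flushed.size(); }

    public static void main(String[] args) {
        BatchBuffer b = new BatchBuffer(5000);
        for (int i = 0; i < 4999; i++) b.add("row-" + i);
        // 4999 rows are buffered, but nothing has been written yet
        System.out.println("pending=" + b.pending() + " flushed=" + b.flushedBatches());

        BatchBuffer realtime = new BatchBuffer(1);
        realtime.add("row");
        // with batchSize = 1, every row is written immediately
        System.out.println("pending=" + realtime.pending() + " flushed=" + realtime.flushedBatches());
    }
}
```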

Flink State Migration Version 1.8.2

2019-10-16 Thread ApoorvK
I have been trying to alter the current state case class (Scala), which has 250 variables; now I add 10 more variables to the class and run my Flink application from the savepoint taken before (some of the variables are objects which are also maintained as state). It fails to migrate the

Re: Mirror Maker 2.0 cluster and starting from latest offset and other queries

2019-10-16 Thread Vishal Santoshi
1. Still no clue, apart from the fact that ConsumerConfig gets it from somewhere (need to override it; have tried both auto.offset.reset = latest and consumer.auto.offset.reset = latest) [2019-10-16 13:50:34,260] INFO ConsumerConfig values: allow.auto.create.topics = true auto.commit.interva

Should I use static database connection pool?

2019-10-16 Thread Xin Ma
I have watched one of the recent Flink Forward videos, Apache Flink Worst Practices by Konstantin Knauf. The talk helps me a lot and mentions that we should avoid using static variables to share state between tasks. So should I also avoid a static database connection? Because I am facing a weird iss

Re: FLINK-13497 / "Could not create file for checking if truncate works" / HDFS

2019-10-16 Thread Congxian Qiu
Glad to hear it! Best, Congxian Adrian Vasiliu wrote on Tue, Oct 15, 2019 at 9:10 PM: > Hi, > FYI we've switched to a different Hadoop server, and the issue vanished... > It does look as if the cause was on the Hadoop side. > Thanks again, Congxian. > Adrian > > > - Original message - > From: "Adrian Vasili

Re: Fwd: Is it possible to get Flink job name in an operator?

2019-10-16 Thread Chesnay Schepler
If you have control over the job you can modify it to use ExEnv#execute(String jobName), and pass this explicitly to your functions in some form (like the global job parameters). Beyond that there is no way to access the job name from within a function/operator. On 15/10/2019 08:53, 马阳阳 wrot
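Chesnay's suggestion might look roughly like the outline below. It uses `ParameterTool` for the global job parameters, needs Flink on the classpath, and the key name `"job.name"` is an arbitrary choice, so treat it as a sketch rather than a definitive implementation:

```java
// Outline only -- requires flink-streaming-java; not runnable standalone.
String jobName = "my-job";

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

// choose the name once and ship it to the functions as a global job parameter
ParameterTool params = ParameterTool.fromMap(
        java.util.Collections.singletonMap("job.name", jobName));
env.getConfig().setGlobalJobParameters(params);

// ... build the topology ...

// inside any RichFunction the name can then be read back:
// String name = getRuntimeContext().getExecutionConfig()
//         .getGlobalJobParameters().toMap().get("job.name");

env.execute(jobName);  // same name passed to execute()
```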

Elasticsearch6UpsertTableSink: how to trigger ES delete index.

2019-10-16 Thread ouywl
Hi, When I use Elasticsearch6UpsertTableSink, it seems to implement deleting from the index. The code looks like: @Override public void process(Tuple2<Boolean, Row> element, RuntimeContext ctx, RequestIndexer indexer) { if (element.f0) { process

Re: Verifying correctness of StreamingFileSink (Kafka -> S3)

2019-10-16 Thread Kostas Kloudas
Hi Amran, If you want to know which partition your input data comes from, you can always have a separate bucket for each partition. As described in [1], you can extract the offset/partition/topic information for an incoming record and, based on this, decide the appropriate bucket to put the rec
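A rough outline of Kostas' suggestion follows. It uses the Flink/Kafka APIs and is not self-contained (it needs flink-connector-kafka and flink-streaming-java on the classpath); the class name `PartitionAwareSchema` is made up:

```java
// 1) Carry the Kafka partition alongside each record:
public class PartitionAwareSchema
        implements KafkaDeserializationSchema<Tuple2<Integer, String>> {

    @Override
    public Tuple2<Integer, String> deserialize(ConsumerRecord<byte[], byte[]> record) {
        // record.partition() / record.topic() / record.offset() are all available here
        return Tuple2.of(record.partition(),
                new String(record.value(), StandardCharsets.UTF_8));
    }

    @Override
    public boolean isEndOfStream(Tuple2<Integer, String> next) {
        return false;
    }

    @Override
    public TypeInformation<Tuple2<Integer, String>> getProducedType() {
        return TypeInformation.of(new TypeHint<Tuple2<Integer, String>>() {});
    }
}

// 2) Bucket on the carried partition in the StreamingFileSink, e.g. with a
//    BucketAssigner<Tuple2<Integer, String>, String> whose getBucketId()
//    returns "partition=" + element.f0, so each Kafka partition lands in
//    its own directory and can be verified independently.
```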