RE: Flink sql CLI parsing avro format data error 解析avro数据失败

2021-08-26 Thread Schwalbe Matthias
Hi Wayne, The only obvious difference between you ACRO schema and the table schema is, that you AVRO ccc is nullable and your SQL ‘ccc’ is not nullable. Please adjust one of the two. Also (not entirely sure) in order to correctly map an AVRO nullable to SQL it needs to have a default value like

Re: Any one can help me? How to connect offline hadoop cluster and realtime hadoop cluster by different hive catalog?

2021-08-26 Thread Caizhi Weng
Hi! It seems that your Flink cluster cannot connect to realtime-cluster-master001/xx.xx.xx.xx:8050. Please check your network and port status. Jim Chen 于2021年8月27日周五 下午2:20写道: > Hi, All > My flink version is 1.13.1 and my company have two hadoop cluster, > offline hadoop cluster and realtime

Any one can help me? How to connect offline hadoop cluster and realtime hadoop cluster by different hive catalog?

2021-08-26 Thread Jim Chen
Hi, All My flink version is 1.13.1 and my company have two hadoop cluster, offline hadoop cluster and realtime hadoop cluster. Now, on realtime hadoop cluster, we want to submit flink job to connect offline hadoop cluster by different hive catalog. I use different hive configuration diretory in h

Re: Not able to avoid Dynamic Class Loading

2021-08-26 Thread Arvid Heise
Without a real stacktrace, everything is a guess work. So please provide it together with your Flink version. It might be that some transition of DebeziumAvroRegistryDeserializationSchema (let's say open) will cause an illegal state where it's not serializable. On Thu, Aug 26, 2021 at 9:35 PM Kev

Re: checkpoints/.../shared cleanup

2021-08-26 Thread Alexey Trenikhun
"the shared subfolder still grows" - while upgrading job, we cancel job with savepoint, my expectations that Flink will clean checkpoint including shared directory, since checkpoints are not reatained, then we start upgraded job from savepoint, however when I look into shared folder I see older

Re: Flink KafkaConsumer metrics in DataDog

2021-08-26 Thread Debraj Manna
Yes we are also facing the same problem and not able to find any solution. On Thu, Aug 26, 2021 at 5:59 PM Chesnay Schepler wrote: > AFAIK this metric is directly forwarded from Kafka as-is; so Flink isn't > calculating anything. > > I suggest to reach out to the Kafka folks. > > On 25/08/2021 1

Re: 1.9 to 1.11 Managed Memory Migration Questions

2021-08-26 Thread Caizhi Weng
Hi! I've read the first mail again and discover that the direct memory OOM occurs when the job is writing to the sink, not when the data is transferring between tasks through the network. I'm not familiar with HDFS, but I guess writing to HDFS will require some direct memory. Maybe a detailed sta

Re: checkpoints/.../shared cleanup

2021-08-26 Thread Alexey Trenikhun
Hi Matthias, I don't use externalized checkpoints (from Flink UI Persist Checkpoints Externally: Disabled), why do you think checkpoint(s) should be retained? It kind of contradicts with documentation [1] - Checkpoints are by default not retained and are only used to resume a job from failures.

Flink sql CLI parsing avro format data error 解析avro数据失败

2021-08-26 Thread Wayne
i have Apache Avro schema 我的avro schema 如下 { "type" : "record", "name" : "KafkaAvroMessage", "namespace" : "xxx", "fields" : [ { "name" : "aaa", "type" : "string" }, { "name" : "bbb", "type" : [ "null", "string" ], "default" : null },{ "name" : "ccc", "

RE: 1.9 to 1.11 Managed Memory Migration Questions

2021-08-26 Thread Hailu, Andreas [Engineering]
Hi Caizhi, thanks for responding. The networking keys you suggested didn’t help, but I found that adding the ‘taskmanager.memory.task.off-heap.size’ with a value of ‘1g’ lead to a successful job. I can see on this property’s documentation [1] that the default value is 0 bytes. Task Off-Heap Me

Re: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden)

2021-08-26 Thread jonas eyob
@Svend - that seems to have done the trick, adding the bucket itself as a resource got flink to write to the configured s3 bucket. @Gil - we manage our kubernetes cluster on aws with kops. But we do assign the iam roles through the deployment annotations. Seems presto is able to use the s3:// sche

Re: Not able to avoid Dynamic Class Loading

2021-08-26 Thread Kevin Lam
I also tested serializing an instance of `OurSource` with `org.apache.commons.lang3.SerializationUtils.clone` and it worked fine. On Thu, Aug 26, 2021 at 3:27 PM Kevin Lam wrote: > Hi Arvid, > > Got it, we don't use Avro.schema inside of > DebeziumAvroRegistryDeserializationSchema, but I tried

Re: Not able to avoid Dynamic Class Loading

2021-08-26 Thread Kevin Lam
Hi Arvid, Got it, we don't use Avro.schema inside of DebeziumAvroRegistryDeserializationSchema, but I tried to test it with a unit test and `org.apache.commons.lang3.SerializationUtils.clone` runs successfully. I'm curious as to why things work (are serializable) when we use dynamic classloading,

Re: Identify metrics belonging to the "same" task manager in kubernetes

2021-08-26 Thread gaurav kulkarni
Hi,  I have another question: What mechanisms are usually used to correlate prometheus flink metrics for kubernetes?  Thanks,Gaurav  On Thursday, August 26, 2021, 10:06:30 AM PDT, gaurav kulkarni wrote: Thanks for the response! For #2, custom labels should work too for our case.  Than

Re: Bulk Scheduler timeout when creating several jobs in flink kubernetes HA deployment

2021-08-26 Thread Gil De Grove
Hello Matthias, I'll extract the logs from the cluster au update that here. For the tm's, i'll try to find relevant logs, we had many of them deployed at that time. And all of the logs may not be that interesting to upload. Regards, Gil On Thu, Aug 26, 2021, 12:31 Matthias Pohl wrote: > Hi Gil

Re: Identify metrics belonging to the "same" task manager in kubernetes

2021-08-26 Thread gaurav kulkarni
Thanks for the response! For #2, custom labels should work too for our case.  Thanks,Gaurav On Thursday, August 26, 2021, 08:28:27 AM PDT, Chesnay Schepler wrote: 1) As is there is no way to accomplish this. 2) Yes (datadog, graphite) but if you are happy with Prometheus I instea

Re: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden)

2021-08-26 Thread Gil De Grove
Hi Jonas, Just wondering, are you trying to deploy via iam service account annotations in a AWS eks cluster? We noticed that when using presto, the iam service account was using en ec2 metadata API inside AWS. However, when using eks service account, the API used is the webtoken auth. Not sure

Re: Flink Avro Timestamp Precision Issue

2021-08-26 Thread Akshay Agarwal
Thanks Matthias for pointing us to the links, we will definitely follow those. *Regards* *Akshay Agarwal* On Thu, Aug 26, 2021 at 6:43 PM Matthias Pohl wrote: > Hi Akshay, > thanks for reaching out to the community. There was a similar question on > the mailing list earlier this month [1]. Unf

Re: Dynamic Cluster/Index Routing for Elasticsearch Sink

2021-08-26 Thread David Morávek
Hi Rion, personally I'd start with unit test in the base module using a test sink implementation. There is already *DummyElasticsearchSink* that you may be able to reuse (just note that we're trying to get rid of Mockito based tests such as this one). I'm bit unsure that integration test would ac

Re: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden)

2021-08-26 Thread Svend
Hi Jonas, Just a thought, could you try this policy? If I recall correctly, I think you need ListBucket on the bucket itself, whereas the other can have a path prefix like the "/*" you added " { "Version": "2012-10-17", "Statement": [ { "Action": [ "s

Re: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden)

2021-08-26 Thread jonas eyob
Hey Matthias, Yes, I have followed the documentation on the link you provided - and decided to go for the recommended approach of using IAM roles. The hive.s3.use-instance-credentials configuration parameter I got from [1] (first bullet) since I am using the flink-s3-fs-presto plugin - which says:

Re: hdfs lease issues on flink retry

2021-08-26 Thread Matthias Pohl
I see - I should have checked my mailbox before answering. I received the email and was able to login. On Thu, Aug 26, 2021 at 6:12 PM Matthias Pohl wrote: > The link doesn't work, i.e. I'm redirected to a login page. It would be > also good to include the Flink logs and make them accessible for

Re: hdfs lease issues on flink retry

2021-08-26 Thread Matthias Pohl
The link doesn't work, i.e. I'm redirected to a login page. It would be also good to include the Flink logs and make them accessible for everyone. This way others could share their perspective as well... On Thu, Aug 26, 2021 at 5:40 PM Shah, Siddharth [Engineering] < siddharth.x.s...@gs.com> wrote

Re: Dynamic Cluster/Index Routing for Elasticsearch Sink

2021-08-26 Thread Rion Williams
Just chiming in on this again. I think I have the pieces in place regarding the implementation (both a DynamicElasticsearchSink class and ElasticsearchSinkRouter interface) added to the elasticsearch-base module. I noticed that HttpHost wasn't available within that module/in the tests, so I'd susp

RE: hdfs lease issues on flink retry

2021-08-26 Thread Shah, Siddharth [Engineering]
Hi Matthias, Thank you for responding and taking time to look at the issue. Uploaded the yarn lags here: https://lockbox.gs.com/lockbox/folders/963b0f29-85ad-4580-b420-8c66d9c07a84/ and have also requested read permissions for you. Please let us know if you’re not able to see the files. From

Re: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden)

2021-08-26 Thread Matthias Pohl
Hi Jonas, have you included the s3 credentials in the Flink config file like it's described in [1]? I'm not sure about this hive.s3.use-instance-credentials being a valid configuration parameter. Best, Matthias [1] https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/filesystems

Re: Identify metrics belonging to the "same" task manager in kubernetes

2021-08-26 Thread Chesnay Schepler
1) As is there is no way to accomplish this. 2) Yes (datadog, graphite) but if you are happy with Prometheus I instead recommend to fork the reporter and adjust it accordingly. Would only custom prefixes/suffixes work for your use-case, or would a custom label also work? 3) Sure, there's ple

Re: Not able to avoid Dynamic Class Loading

2021-08-26 Thread Arvid Heise
Hi Kevin, the consumer needs to be serializable. Apparently you are also serializing the Avro schema (probably as part of your DebeziumAvroRegistryDeserializationSchema) and that fails. You may want to copy our SerializableAvroSchema [1] Make sure that everything is serializable. You can check th

Re: Identify metrics belonging to the "same" task manager in kubernetes

2021-08-26 Thread gaurav kulkarni
Thanks for the response!  1) That's correct. I was wondering if there is any logical ID for task manager that stays the same even if its moved to a different pod and is a part of metrics emitted. The scenario is if a TM#1 was running on pod#1 and then is moved to pod#2 and if we can correlate t

KafkaFetcher [] - Committing offsets to Kafka failed.

2021-08-26 Thread bat man
Hi, I am using flink 12.1 to consume data from kafka in a streaming job. Using the flink-connector-kafka_2.12:1.12.1. Kafka broker version is 2.2.1 In logs I see warnings like this - 2021-08-26 13:36:49,903 WARN org.apache.flink.streaming.connectors.kafka.internals.KafkaFetcher [] - Committing o

RE: Deploying Flink on Kubernetes with fractional CPU and different limits and requests

2021-08-26 Thread Alexis Sarda-Espinosa
I think it would be nice if the task manager pods get their values from the configuration file only if the pod templates don’t specify any resources. That was the goal of supporting pod templates, right? Allowing more custom scenarios without letting the configuration options get bloated. Regar

Re: Deploying Flink on Kubernetes with fractional CPU and different limits and requests

2021-08-26 Thread Denis Cosmin NUTIU
Hi Matthias, Thanks for getting back to me and for your time! We have some Flink jobs deployed on Kubernetes and running kubectl top pod gives the following result: NAMECPU(cores) MEMORY(bytes) aa-78c8cb77d4-zlmpg 8

Re: hdfs lease issues on flink retry

2021-08-26 Thread Matthias Pohl
Hi Siddharth, thanks for reaching out to the community. This might be a bug. Could you share your Flink and YARN logs? This way we could get a better understanding of what's going on. Best, Matthias On Tue, Aug 24, 2021 at 10:19 PM Shah, Siddharth [Engineering] < siddharth.x.s...@gs.com> wrote:

Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden)

2021-08-26 Thread jonas eyob
Hey, I am setting up HA on a standalone Kubernetes Flink application job cluster. Flink (1.12.5) is used and I am using S3 as the storage backend * The JobManager shortly fails after starts with the following errors (apologies in advance for the length), and I can't understand what's going on. *

Re: Disabling autogenerated uid/hash doesn't work when using file source

2021-08-26 Thread Matthias Pohl
Hi Vishal, you're right: the FileSource itself doesn't provide these methods. But you could get them through the DataStreamSource (which implements SingleOutputStreamOperator and provides these two methods [1,2]). It is returned by StreamExecutionEnvironment.fromSource [3]. fromSource would need th

Re: 【Announce】Zeppelin 0.10.0 is released, Flink on Zeppelin Improved

2021-08-26 Thread Till Rohrmann
Cool, thanks for letting us know Jeff. Hopefully, many users use Zeppelin together with Flink. Cheers, Till On Thu, Aug 26, 2021 at 4:47 AM Leonard Xu wrote: > Thanks Jeff for the great work ! > > Best, > Leonard > > 在 2021年8月25日,22:48,Jeff Zhang 写道: > > Hi Flink users, > > We (Zeppelin commun

Re: Flink Avro Timestamp Precision Issue

2021-08-26 Thread Matthias Pohl
Hi Akshay, thanks for reaching out to the community. There was a similar question on the mailing list earlier this month [1]. Unfortunately, it just doesn't seem to be supported, yet. The feature request was already created with FLINK-23589 [2]. Best, Matthias [1] https://lists.apache.org/thread.

Re: Not able to avoid Dynamic Class Loading

2021-08-26 Thread Kevin Lam
Hi! We're using 1.13.1. We have a class in our user code that extends FlinkKafkaConsumer, that's built for reading avro records from Kafka. However it doesn't hold any Schema objects as fields so I'm a little confused. Something like this: ``` class OurSource[T <: ClassTag: TypeInformation: Deco

Re: Kinesis Producer not working with Flink 1.11.2

2021-08-26 Thread Matthias Pohl
Hi Sanket, have you considered reaching out to the Kinesis community? I might be wrong but it looks like a Kinesis issue. Best, Matthias On Tue, Aug 24, 2021 at 7:13 PM Sanket Agrawal wrote: > Hi, > > > > We are trying to use Kinesis along with Flink(1.11.2) and JDK 11 on EMR > cluster(6.2). Wh

Re: checkpoints/.../shared cleanup

2021-08-26 Thread Matthias Pohl
Hi Alexey, thanks for reaching out to the community. I have a question: What do you mean by "the shared subfolder still grows"? As far as I understand, the shared folder contains the state of incremental checkpoints. If you cancel the corresponding job and start a new job from one of the retained i

Re: Flink KafkaConsumer metrics in DataDog

2021-08-26 Thread Chesnay Schepler
AFAIK this metric is directly forwarded from Kafka as-is; so Flink isn't calculating anything. I suggest to reach out to the Kafka folks. On 25/08/2021 17:23, Shilpa Shankar wrote: Hello , We have enabled DataDogHTTPReporter to fetch metrics on flink v1.13.1 running on kubernetes. The metric

Re: Deploying Flink on Kubernetes with fractional CPU and different limits and requests

2021-08-26 Thread Matthias Pohl
Hi Denis, I did a bit of digging: It looks like there is no way to specify them independently. You can find documentation about pod templates for TaskManager and JobManager [1]. But even there it states that for cpu and memory, the resource specs are overwritten by the Flink configuration. The code

Re: Queries regarding Flink upgrade strategies

2021-08-26 Thread Matthias Pohl
Hi Amit, upgrading Flink versions means that you should stop your jobs with a savepoint first. A new cluster with the new Flink version can be deployed next. Then, this cluster can be used to start the jobs from the previously created savepoints. Each job should pick up the work from where it stopp

Re: Query on Metrics | Cumulative Metrics via Prometheus Reporter

2021-08-26 Thread Chesnay Schepler
So you want the scrape intervals to be synced, i.e., so that metrics are gathered in each process at roughly the same time? Could you clarify whether you use the PrometheusReporter or the PrometheusPushGatewayReporter? If it is the former, then this is something you'd need to fix on the Promet

Re: Query on Metrics | Cumulative Metrics via Prometheus Reporter

2021-08-26 Thread bhawana gupta
Hi Chesnay, Elaborating the use case, for example I have 3 operators : Operator A(parallelism 1, running on taskmanager1) --> Operator B(parallelism 3, running on taskmanager1, taskmanager2) --> Operator C(Parallelism 1, running on taskmanager1). Suppose there are 10 records in process since Op

Re: Identify metrics belonging to the "same" task manager in kubernetes

2021-08-26 Thread Chesnay Schepler
1) Can you clarify what you mean with "same"? Are you referring to something like an index, i.e., "This is TaskManager #2", without being tied to a specific process? 2) It is not possible for the PrometheusReporter. On 26/08/2021 04:22, gaurav kulkarni wrote: Hi, We have multiple flink clust

Re: Query on Metrics | Cumulative Metrics via Prometheus Reporter

2021-08-26 Thread Chesnay Schepler
I'm not sure I understand what the desired behavior would be. Is it that you want metrics on each TM to be scraped in a well-defined order? I.e., first all metrics of Operator A, then all metrics of Operator B and so on? Maybe you could provide an example to make things clear. On 26/08/2021

Re: Bulk Scheduler timeout when creating several jobs in flink kubernetes HA deployment

2021-08-26 Thread Matthias Pohl
Hi Gil, could you provide the complete logs (TaskManager & JobManager) for us to investigate it? The error itself and the behavior you're describing sounds like expected behavior if there are not enough slots available for all the submitted jobs to be handled in time. Have you tried increasing the

Re: NullPointerException when using KubernetesHaServicesFactory

2021-08-26 Thread jonas eyob
So when spinning it up on minikube, and then ssh into one of the JobManager pods shows following for the commands you mentioned: flink@local-thoros-jobmanager-6l9lz:~$ id uid=(flink) gid=(flink) groups=(flink) flink@local-thoros-jobmanager-6l9lz:~$ ls -la $FLINK_HOME/plugins/s3-fs-pr

Re: Error while deserializing the element

2021-08-26 Thread JING ZHANG
My previous response in mail list was too brief, sorry about that. For the use of config 'state.backend.rocksdb.timer-service.factory: HEAP', I would like to add some information. Please correct me if anything is wrong. 1. The configure could help to avoid the EOF exception when recover rock

Deploying Flink on Kubernetes with fractional CPU and different limits and requests

2021-08-26 Thread Denis Cosmin NUTIU
Hello, I've developed a Flink job and I'm trying to deploy it on a Kubernetes cluster using Flink Native. Setting kubernetes.taskmanager.cpu=0.5 and kubernetes.jobmanager.cpu=0.5 sets the requests and limits to 500m, which is correct, but I'd like to set the requests and limits to different value

Flink Avro Timestamp Precision Issue

2021-08-26 Thread Akshay Agarwal
Hi everyone, We are trying out flink 1.13.1 with kafka topics as avro backend but we are facing an issue while creating Table SQL that avro doesn't support precision greater than 3. I am not getting the reason why flink isn't supporting timestamps greater than 3 (exception

Queries regarding Flink upgrade strategies

2021-08-26 Thread Amit Bhatia
Hi, We are using Flink 1.13.2 with Kubernetes HA solution provided by flink. We have created a deployment for JobManager and TaskManager with option to deploy multiple replicas and the same is bundled in a single helm chart. So we have below queries regarding Flink upgrade strategies, kindly help