Re: NullPointerException in StateTable.put()

2021-08-16 Thread László Ciople
Ok, thank you for the tips. I will modify it and get back to you :) On Tue, Aug 17, 2021 at 9:42 AM David Morávek wrote: > Hi Laszlo, > > Please use reply-all for mailing list replies. This may help others > finding their answer in the future ;) > > >> sb.append(DeviceDetail.class.getName()).app

Re: RabbitMQ 3.9+ Native Streams

2021-08-16 Thread David Morávek
This would be awesome! We have the contribution guide [1] that should give you a rough idea on how to approach the contribution. Let me know if you need any further guidance, I'd be happy to help ;) [1] https://flink.apache.org/contrib

Re: NullPointerException in StateTable.put()

2021-08-16 Thread David Morávek
Hi Laszlo, Please use reply-all for mailing list replies. This may help others finding their answer in the future ;) > sb.append(DeviceDetail.class.getName()).append('@').append(Integer.toHexString(System.identityHashCode(this))).append('['); This part will again make your key non-deterministi

Re: Can i contribute for flink doc ?

2021-08-16 Thread Caizhi Weng
Hi! Thanks for your interest in contributing to Flink. Currently most of the committers are busy with the upcoming Flink 1.14 so there might be few people having their eyes on the new PRs, especially if they do not exist in a JIRA issue. Please follow Jing Zhang's advice by first creating the cor

Re: Handling HTTP Requests for Keyed Streams

2021-08-16 Thread Caizhi Weng
Hi! As you mentioned that the configuration fetching is very infrequent, why don't you use a blocking approach to send HTTP requests and receive responses? This seems like a more reasonable solution to me. Rion Williams 于2021年8月17日周二 上午4:00写道: > Hi all, > > I've been exploring a few different o

Re: Can i contribute for flink doc ?

2021-08-16 Thread JING ZHANG
Hi Camile, First of all, thanks for the great contribution, the document improvement is very helpful. but I don't know why nobody merges it and no comment. > Maybe we could try the following way to start the first contribution, please go document [1] for more detailed information. 1. Please make s

Re: Windowed Aggregation With Event Time over a Temporary View

2021-08-16 Thread JING ZHANG
Hi Joe, > > caused by: org.apache.calcite.runtime.CalciteContextException: From line > 1, column 139 to line 1, column 181: Cannot apply '$TUMBLE' to arguments of > type '$TUMBLE(, )'. Supported form(s): > '$TUMBLE(, )' The first parameter of Group Window Function [1] must be a field with time att

Can i contribute for flink doc ?

2021-08-16 Thread Camile Sing
Hi, all I'm a Flink user. recently I find some problems when I use Flink, it takes some time to understand the internal mechanisms. This really makes me know more about Flink, but I think the doc can be clearer, so I open some merge requests for the doc: - https://github.com/apache/flink/pull/1

Re: redis sink from flink

2021-08-16 Thread Yik San Chan
By the way, this post in Chinese showed how we do it exactly with code. https://yiksanchan.com/posts/flink-bulk-insert-redis And yes it had buffered writes support by leveraging Flink operator state, and Redis Pipelining. Feel free to let you know if you have any questions. On Tue, Aug 17, 2021

Re: redis sink from flink

2021-08-16 Thread Yik San Chan
Hi Jin, I was in the same shoes. I tried bahir redis connector at first, then I felt it was very limited, so I rolled out my own. It was actually quite straightforward. All you need to do is to extend RichSinkFunction, then put your logic inside. Regarding Redis clients, Jedis (https://github.com

Re: redis sink from flink

2021-08-16 Thread Yangze Guo
Hi, Jin IIUC, the DataStream connector `RedisSink` can still be used. However, the Table API connector `RedisTableSink` might not work (at least in the future) because it is implemented based on the deprecated Table connector abstraction. You can still give it a try, though. Best, Yangze Guo On

Re: Flink taskmanager in crash loop

2021-08-16 Thread Yangze Guo
Hi, Abhishek, Do you see something like "Fatal error occurred while executing the TaskManager" in your log or would you like to provide the whole task manager log? Best, Yangze Guo On Tue, Aug 17, 2021 at 5:17 AM Abhishek Rai wrote: > > Hello, > > In our production environment, running Flink 1.

redis sink from flink

2021-08-16 Thread Jin Yi
is apache bahir still a thing? it hasn't been touched for months (since redis 2.8.5). as such, looking at the current flink connector docs, it's no longer pointing to anything from the bahir project. looking around in either the flink or bahir newsgroups doesn't turn up anything regarding bahir'

Re: RabbitMQ 3.9+ Native Streams

2021-08-16 Thread Rob Englander
I will definitely consider the contribution idea :) On Mon, Aug 16, 2021 at 3:16 PM David Morávek wrote: > Hi Rob, > > there is currently no on-going effort for this topic, I think this would > be a really great contribution though. This seems to be pushing RabbitMQ > towards new usages ;) > >

Flink taskmanager in crash loop

2021-08-16 Thread Abhishek Rai
Hello, In our production environment, running Flink 1.13 (Scala 2.11), where Flink has been working without issues with a dozen or so jobs running for a while, Flink taskmanager started crash looping with a period of ~4 minutes per crash. The stack trace is not very informative, therefore reachin

Handling HTTP Requests for Keyed Streams

2021-08-16 Thread Rion Williams
Hi all, I've been exploring a few different options for storing tenant-specific configurations within Flink state based on the messages I have flowing through my job. Initially I had considered creating a source that would periodically poll an HTTP API and connect that stream to my original event

Re: RabbitMQ 3.9+ Native Streams

2021-08-16 Thread David Morávek
Hi Rob, there is currently no on-going effort for this topic, I think this would be a really great contribution though. This seems to be pushing RabbitMQ towards new usages ;) Best, D. On Mon, Aug 16, 2021 at 8:16 PM Rob Englander wrote: > I'm wondering if there's any work underway to develop

RE: Upgrading from Flink on YARN 1.9 to 1.11

2021-08-16 Thread Hailu, Andreas [Engineering]
Hi David, You’re correct about classpathing problems – thanks for your help in spotting them. I was able to get past that exception by removing some conflicting packages in my shaded JAR, but I’m seeing something else that’s interesting. With the 2 threads trying to submit jobs, one of the thre

RabbitMQ 3.9+ Native Streams

2021-08-16 Thread Rob Englander
I'm wondering if there's any work underway to develop DataSource/DataSink for RabbitMQ's streams recently released in RMQ 3.9? Rob Englander

Keystore format limitations for TLS

2021-08-16 Thread Alexis Sarda-Espinosa
Hello, I am trying to configure TLS communication for a Flink cluster running on Kubernetes. I am currently using the BCFKS format and setting that as default via javax.net.ssl.keystoretype and javax.net.ssl.truststoretype (which are injected in the environment variable FLINK_ENV_JAVA_OPTS). Th

Windowed Aggregation With Event Time over a Temporary View

2021-08-16 Thread Joseph Lorenzini
Hi all,   I am on Flink 1.12.3.   So here’s the scenario: I have a Kafka topic as a source, where each record repsents a change to an append only audit log. The kafka record has the following fields:   id (unique identifier for that audit log entry) operation id (is shared across

Re: Dynamic Cluster/Index Routing for Elasticsearch Sink

2021-08-16 Thread Rion Williams
Thanks for this suggestion David, it's extremely helpful. Since this will vary depending on the elements retrieved from a separate stream, I'm guessing something like the following would be roughly the avenue to continue down: fun main(args: Array) { val parameters = mergeParametersFromProper

Re: NullPointerException in StateTable.put()

2021-08-16 Thread David Morávek
Great, let me know if that helped ;) Best, D. On Mon, Aug 16, 2021 at 4:36 PM László Ciople wrote: > The events are json dictionaries and the key is a field which represents a > device id, or if it doesn't exist, then actually a *hashCode *of the > device object in the dictionary is used. So th

Re: NullPointerException in StateTable.put()

2021-08-16 Thread David Morávek
My intuition is that you have a non-deterministic shuffle key. If you perform any "per-key" operation, you need to make sure that the same key always end up in the same partition. To simplify this, it means that the key needs to have a consistent *hashCode* and *equals* across different JVMs. Usua

Re: Exploring Flink for a HTTP delivery service.

2021-08-16 Thread David Morávek
Hi Prasanna, here are some quick thoughts 1) Batching is an aggregation operation.But what I have seen in the > examples of windowing is that they get the count/max/min operation in the > particular window. So could the batching be implemented via a windowing > mechanism ? > I guess a custom fu

NullPointerException in StateTable.put()

2021-08-16 Thread László Ciople
Hello, I am trying to write a Flink application which receives data from Kafka, does processing on keyed windowed streams and sends results on a different topic. Shortly after the job is started it fails with a NullPointerException in StateTable.put(). I am really confused by this error, because I

Re: s3 access denied with flink-s3-fs-presto

2021-08-16 Thread David Morávek
Hi Vamshi, >From your configuration I'm guessing that you're using Amazon S3 (not any implementation such as Minio). Two comments: - *s3.endpoint* should not contain bucket (this is included in your s3 path, eg. *s3:///*) - "*s3.path.style.access*: true" is only correct for 3rd party implementati

Re: Upgrading from Flink on YARN 1.9 to 1.11

2021-08-16 Thread David Morávek
Hi Andreas, Per-job and session deployment modes should not be affected by this FLIP. Application mode is just a new deployment mode (where job driver runs embedded within JM), that co-exists with these two. >From information you've provided, I'd say your actual problem is this exception: ``` C

Re: Problems with reading ORC files with S3 filesystem

2021-08-16 Thread David Morávek
Hi Piotr, unfortunately this is a known long-standing issue [1]. The problem is that ORC format is not using Flink's filesystem abstraction for actual reading of the underlying file, so you have to adjust your Hadoop config accordingly. There is also a corresponding SO question [2] covering this.

Re: Scaling Flink for batch jobs

2021-08-16 Thread Gorjan Todorovski
Thanks, I'll check more about job tuning. On Mon, 16 Aug 2021 at 06:28, Caizhi Weng wrote: > Hi! > > if I use parallelism of 2 or 4 - it takes the same time. >> > It might be that there is no data in some parallelisms. You can click on > the nodes in Flink web UI and see if it is the case for ea

Re: JobManager Resident memory Always Increasing

2021-08-16 Thread David Morávek
Hi Pranjul, which deployment mode are you using? - For session cluster, I'd say it's possible that memory grows with # of jobs. - For application mode, there is actual user-code executed, so if you're using some native libraries in your job driver, that may be another reason for the growing memor