Without DISTINCT, unique lines show up many times in Flink SQL

2022-08-16 Thread Marco Villalobos
Hello everybody, When I perform this simple set of queries, a unique line from the source file shows up many times. I have verified repeatedly that a unique line in the source shows up as many as 100 times in the SELECT statement. Is this the correct behavior for Flink 1.15.1? FYI, it does s…
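For reference, the deduplicating form of the query in Flink SQL looks like the sketch below (table and column names are illustrative, not from the original message). Without DISTINCT, the result contains one row per row emitted by the source, so any re-read or multi-emit in the source shows up as duplicates downstream:

```sql
-- Illustrative: assumes a source table `lines` with one STRING column `line`.
-- Returns every row the source emits, including repeats:
SELECT line FROM lines;

-- Deduplicates on the selected columns:
SELECT DISTINCT line FROM lines;
```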

Re: Pyflink :: Conversion from DataStream to TableAPI

2022-08-16 Thread Ramana
Hi Yuan - Thanks for your response. Wondering if the window API supports non-keyed streams? On Wed, Aug 17, 2022, 06:43 yu'an huang wrote: > Hi, > > > Pyflink should support the window API. You can read this document. > > https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/dev/python/dat…
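On the non-keyed question: PyFlink exposes windows on keyed streams, and a common workaround for a logically non-keyed stream is to `key_by` a constant so every record lands in one key group (note this serializes the windowed stage onto one parallel instance). The grouping idea behind that workaround can be sketched in plain Python, with no PyFlink dependency; all names here are illustrative:

```python
from collections import defaultdict

def tumbling_windows(records, key_fn, window_size):
    """Group (timestamp, value) records into tumbling windows per key."""
    windows = defaultdict(list)
    for ts, value in records:
        window_start = ts - (ts % window_size)  # align to window boundary
        windows[(key_fn(value), window_start)].append(value)
    return dict(windows)

records = [(0, "a"), (1, "b"), (6, "c"), (7, "d")]
# Constant key: the whole stream falls into one key group, which behaves
# like a non-keyed (global) window.
result = tumbling_windows(records, key_fn=lambda v: 0, window_size=5)
# result == {(0, 0): ["a", "b"], (0, 5): ["c", "d"]}
```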

Re: Pyflink :: Conversion from DataStream to TableAPI

2022-08-16 Thread yu'an huang
Hi, Pyflink should support the window API. You can read this document: https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/dev/python/datastream/operators/windows/ Hope this helps. Best, Yuan On Tue, 16 Aug 2022 at 3:11 PM, Ramana wrote: > Hi All - > > Trying to achieve the following…

Passing Dynamic Table Options to Catalog's getTable()

2022-08-16 Thread Krzysztof Chmielewski
Hi, I'm working on my own Catalog implementation and I'm wondering if there is any way to pass Dynamic Table Options [1] to the Catalog's getTable(...) method. [1] https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/hints/#dynamic-table-options Regards, Krzysztof Chmielewski
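For context, dynamic table options are supplied per-query via the OPTIONS hint, as in the sketch below (catalog, table, and option names are illustrative). To my knowledge these hints are merged into the table's options by the planner after the catalog lookup, rather than being passed into getTable() itself, which is likely why the method sees no trace of them:

```sql
-- Illustrative names; the hint overrides/extends the table's options
-- for this query only:
SELECT id, name
FROM my_catalog.my_db.my_table
  /*+ OPTIONS('scan.startup.mode'='earliest-offset') */;
```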

Re: Flink-Statefun : Design Solution advice ( ASYNC IO)

2022-08-16 Thread Tymur Yarosh
No, I can’t. I have no previous experience with Python. According to the examples, Stateful Functions in Python should already be async, so an async API/DB client should do the job when used properly. Regarding your scenarios (API/DB service down): from my experience with the Java SDK, there…

Re: Flink-Statefun : Design Solution advice ( ASYNC IO)

2022-08-16 Thread Himanshu Sareen
Thanks, Tymur. I think the DataStream API job will also work for the use case; I'll try a quick POC. Further, can you suggest how we should fetch data from an external API/DB from a remote stateful function (Python SDK)? We have a Flink-Statefun application which listens to Kafka streams and po…
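Since remote Python functions can be async handlers, the usual pattern is an async client call bounded by a timeout, with a fallback when the API/DB is down or slow. A stdlib-only sketch of that pattern, with the fetch simulated (in a real handler you would call an async Cassandra/HTTP client instead):

```python
import asyncio

async def fetch_record(record_id, fail=False):
    # Stand-in for an async DB/API call (e.g. an async Cassandra driver).
    await asyncio.sleep(0.01)
    if fail:
        raise ConnectionError("service down")
    return {"id": record_id, "value": record_id * 10}

async def fetch_with_fallback(record_id, timeout=1.0, fail=False):
    # Bound the call and degrade gracefully when the service is down or slow;
    # a real job might retry or route to a dead-letter topic instead.
    try:
        return await asyncio.wait_for(fetch_record(record_id, fail), timeout)
    except (ConnectionError, asyncio.TimeoutError):
        return {"id": record_id, "value": None}

result = asyncio.run(fetch_with_fallback(7))
print(result)  # {'id': 7, 'value': 70}
```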

Re: Flink-Statefun : Design Solution advice ( ASYNC IO)

2022-08-16 Thread Tymur Yarosh
Ok, I don't see it as a good case for Stateful Functions. Statefun is a fantastic tool when you treat each function as an object with its data and behavior, and these objects communicate with each other in arbitrary directions. Also, Stateful Functions remote API is asynchronous, so you can use

Re: Flink-Statefun : Design Solution advice ( ASYNC IO)

2022-08-16 Thread Himanshu Sareen
Hi Tymur, 1. Why do you want a remote function to call an embedded function in this case? To implement an Async I/O call: https://nightlies.apache.org/flink/flink-statefun-docs-master/docs/sdk/flink-datastream/#completing-async-requests 2. Do you have separate outgoing Kafka messages or join…

Re: Flink-Statefun : Design Solution advice ( ASYNC IO)

2022-08-16 Thread Tymur Yarosh
Hi, Just a few questions: 1. Why do you want a remote function to call an embedded function in this case? 2. Do you have separate outgoing Kafka messages or join them per record? 3. What is the explicit acknowledgment at the end for? Best, Tymur Yarosh 16 Aug 2022, 11:48 +0300, Himanshu Sareen,

Flink-Statefun : Design Solution advice ( ASYNC IO)

2022-08-16 Thread Himanshu Sareen
Team, I'm solving a use case and need advice/suggestions on whether Flink is the right choice for the solution. 1. Ingress: Kafka events/messages consist of multiple IDs. 2. Task: for each ID in a Kafka message, query Cassandra (asynchronously) to fetch records. Prepare multiple small messages out of the…
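The per-ID fan-out in step 2 maps naturally onto concurrent async lookups followed by building one small message per result. A stdlib-only sketch of that fan-out shape, with the Cassandra query simulated (names are illustrative, not from the original thread):

```python
import asyncio

async def query_db(record_id):
    # Stand-in for an async Cassandra query for one ID.
    await asyncio.sleep(0.01)
    return {"id": record_id, "rows": [record_id, record_id + 1]}

async def handle_message(ids):
    # Fan out one concurrent lookup per ID in the incoming Kafka message,
    # then build one small outgoing message per result.
    results = await asyncio.gather(*(query_db(i) for i in ids))
    return [{"id": r["id"], "payload": r["rows"]} for r in results]

messages = asyncio.run(handle_message([1, 2, 3]))
print(len(messages))  # 3
```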

Pyflink :: Conversion from DataStream to TableAPI

2022-08-16 Thread Ramana
Hi All - Trying to achieve the following: 1. Ingest the data from RMQ. 2. Decompress the data read from RMQ. 3. Window it for 5 minutes and process the data. 4. Sink the processed data. Was able to achieve steps 1 and 2, however realized that the PyFlink *DataStream* API doesn't have window support. Given…
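If the stream is converted to a table (or the pipeline is expressed in SQL directly), the 5-minute window in step 3 can be written with Flink's windowing table-valued functions. A sketch, assuming a table `events` with a watermarked timestamp column `ts` (both names illustrative):

```sql
-- Tumbling 5-minute windows via the windowing TVF syntax:
SELECT window_start, window_end, COUNT(*) AS cnt
FROM TABLE(
  TUMBLE(TABLE events, DESCRIPTOR(ts), INTERVAL '5' MINUTES))
GROUP BY window_start, window_end;
```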