Apache Flink - Question about application restart

2020-05-22 Thread M Singh
Hi Flink Folks: If I have a Flink Application with 10 restarts, if it fails and restarts, then: 1. Does the job have the same id ?2. Does the automatically restarting application, pickup from the last checkpoint ? I am assuming it does but just want to confirm. Also, if it is running on AWS EMR I

Apache Flink - Error on creating savepoints using REST interface

2020-05-22 Thread M Singh
Hi: I am using Flink 1.6.2 and trying to create a savepoint using the following curl command using the following references (https://ci.apache.org/projects/flink/flink-docs-release-1.6/monitoring/rest_api.html) and  https://ci.apache.org/projects/flink/flink-docs-release-1.6/api/java/org/apache/

Re: Apache Flink - Question about application restart

2020-05-25 Thread M Singh
side and persist in DFS for reuse2. yes if high availability is enabled Thanks,Zhu Zhu M Singh 于2020年5月23日周六 上午4:06写道: Hi Flink Folks: If I have a Flink Application with 10 restarts, if it fails and restarts, then: 1. Does the job have the same id ?2. Does the automatically restarting

Re: Apache Flink - Error on creating savepoints using REST interface

2020-05-25 Thread M Singh
On Saturday, May 23, 2020, 10:17:27 AM EDT, Chesnay Schepler wrote: You also have to set the boolean cancel-job parameter. On 22/05/2020 22:47, M Singh wrote: Hi: I am using Flink 1.6.2 and trying to create a savepoint using the following curl command using the following

Re: Apache Flink - Question about application restart

2020-05-26 Thread M Singh
et me know if I've missed anything. Thanks On Monday, May 25, 2020, 05:32:39 PM EDT, M Singh wrote: Hi Zhu Zhu: Just to clarify - from what I understand, EMR also has by default restart times (I think it is 3). So if the EMR restarts the job - the job id is the same since the job gr

ClusterClientFactory selection

2020-05-26 Thread M Singh
Hi: I wanted to find out which parameter/configuration allows flink cli pick up the appropriate cluster client factory (especially in the yarn mode). Thanks

Re: Apache Flink - Question about application restart

2020-05-28 Thread M Singh
checkpoints associated with this job. Cheers,Till On Tue, May 26, 2020 at 12:42 PM M Singh wrote: Hi Zhu Zhu: I have another clafication - it looks like if I run the same app multiple times - it's job id changes.  So it looks like even though the graph is the same the job id is not depende

Re: ClusterClientFactory selection

2020-05-28 Thread M Singh
e) is also used for the same purpose from the execution environments. Cheers, Kostas On Wed, May 27, 2020 at 4:49 AM Yang Wang wrote: > > Hi M Singh, > > The Flink CLI picks up the correct ClusterClientFactory via java SPI. You > could check YarnClusterClientFactory#isCompatibleWit

Re: Apache Flink - Question about application restart

2020-05-28 Thread M Singh
will recover the submitted jobs from a persistent storage system. Cheers,Till On Thu, May 28, 2020 at 4:05 PM M Singh wrote: Hi Till/Zhu/Yang:  Thanks for your replies. So just to clarify - the job id remains same if the job restarts have not been exhausted.  Does Yarn also resubmit the job in

Stopping a job

2020-06-04 Thread M Singh
Hi: I am running a job which consumes data from Kinesis and send data to another Kinesis queue.  I am using an older version of Flink (1.6), and when I try to stop the job I get an exception  Caused by: java.util.concurrent.ExecutionException: org.apache.flink.runtime.rest.util.RestClientExc

Re: Stopping a job

2020-06-06 Thread M Singh
this SO thread [1] helps you already? [1] https://stackoverflow.com/questions/53735318/flink-how-to-solve-error-this-job-is-not-stoppable On Thu, Jun 4, 2020 at 7:43 PM M Singh wrote: Hi: I am running a job which consumes data from Kinesis and send data to another Kinesis queue.  I am using an

Re: Stopping a job

2020-06-06 Thread M Singh
flink-docs-release-1.6/api/java/org/apache/flink/api/common/functions/StoppableFunction.html[2] https://github.com/apache/flink/blob/release-1.6/flink-core/src/main/java/org/apache/flink/api/common/functions/StoppableFunction.java On Sat, Jun 6, 2020 at 8:03 PM M Singh wrote: Hi Arvid: I chec

Flink savepoints history

2020-06-07 Thread M Singh
Hi: I wanted to find out if we can access the savepoints created for a job or all jobs using Flink CLI or REST API.   I took a look at the CLI (Apache Flink 1.10 Documentation: Command-Line Interface) and REST API (https://ci.apache.org/projects/flink/flink-docs-release-1.10/monitoring/rest_api.

Re: Stopping a job

2020-06-08 Thread M Singh
Cc: M Singh , User-Flink Subject: Re: Stopping a job   What Arvid said is correct.  The only thing I have to add is that "stop" allows also exactly-once sinks to push out their buffered data to their final destination (e.g. Filesystem). In other words, it takes into account si

Kinesis ProvisionedThroughputExceededException

2020-06-15 Thread M Singh
Hi: I am using multiple (almost 30 and growing) Flink streaming applications that read from the same kinesis stream and get  ProvisionedThroughputExceededException exception which fails the job. I have seen a reference  http://mail-archives.apache.org/mod_mbox/flink-user/201811.mbox/%3CCAJnSTVxpu

Re: Kinesis ProvisionedThroughputExceededException

2020-06-16 Thread M Singh
ot;1.5") // multiplying pause to 1.5 on each next step     conf.put(ConsumerConfigConstants.SHARD_GETRECORDS_RETRIES, "100") // and make up to 100 retries with best regards, Roman Grebennikov | g...@dfdx.me On Mon, Jun 15, 2020, at 13:45, M Singh wrote: Hi: I am using multiple (almost

Re: Kinesis ProvisionedThroughputExceededException

2020-06-18 Thread M Singh
data to consumer, instead of consumer periodically pulling the data). Roman Grebennikov | g...@dfdx.me On Wed, Jun 17, 2020, at 04:39, M Singh wrote: Thanks Roman for your response and advice. >From my understanding increasing shards will increase throughput but still if >more

Flink AskTimeoutException killing the jobs

2020-07-02 Thread M Singh
Hi: I am using Flink 1.10 on AWS EMR cluster. We are getting AskTimeoutExceptions which is causing the flink jobs to die.    Caused by: akka.pattern.AskTimeoutException: Ask timed out on [Actor[akka://flink/user/resourcemanager#-1602864959]] after [1 ms]. Message of type [org.apache.flink.run

Re: Flink AskTimeoutException killing the jobs

2020-07-03 Thread M Singh
10 sec. I would suggest to look into the jobmanager logs and gc logs, see if there's any problem that prevent the process from handling the rpc messages timely. Thank you~ Xintong Song On Fri, Jul 3, 2020 at 3:51 AM M Singh wrote: Hi: I am using Flink 1.10 on AWS EMR cluster.

Re: Flink AskTimeoutException killing the jobs

2020-07-06 Thread M Singh
imely. The Akka ask timeout does not seem to be the root problem to me. Thank you~ Xintong Song On Sat, Jul 4, 2020 at 12:12 AM M Singh wrote: Hi Xintong/LakeShen: We have the following setting in flink-conf.yaml akka.ask.timeout: 180 s akka.tcp.timeout: 180 s But still see this exception.

Apache Flink - A question about Tables API and SQL interfaces

2021-05-12 Thread M Singh
Hey Folks: I have the following questions regarding Table API/SQL in streaming mode: 1. Is there is a notion triggers/evictors/timers when using Table API or SQL interfaces ?2. Is there anything like side outputs and ability to define allowed lateness when dealing with the Table API or SQL interf

Re: Apache Flink - A question about Tables API and SQL interfaces

2021-05-12 Thread M Singh
e-apisql[2]: https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/table/tableapi/#groupby-window-aggregation[3]: https://github.com/ververica/flink-sql-cookbook#aggregations-and-analytics[4]: https://github.com/ververica/sql-training/wiki/Queries-and-Time On Wed, May 12, 2021 at 1:30 PM

Apache Flink - How to use/invoke LookupTableSource/Function

2021-07-06 Thread M Singh
Hey Folks: I am trying to understand how LookupTableSource works and have a few questions:  1. Are there other examples/documentation on how create a query that uses it vs ScanTableSource ?2. Are there any best practices for using this interface ?3. How does the planner decide to use LookupTableS

Re: Apache Flink - How to use/invoke LookupTableSource/Function

2021-07-07 Thread M Singh
tion would be thrown out in the > compile phase. [1]  https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/table/sql/queries/joins/ Best regards,JING ZHANG M Singh 于2021年7月7日周三 上午8:23写道: Hey Folks: I am trying to understand how LookupTableSource works and have a few que

Apache Flink - How to get heap dump when a job is failing in EMR

2019-08-21 Thread M Singh
Hi: Is there any configuration to get heap dump when job fails in an EMR ?  Thanks

Apache Flink - Operator name and uuid best practices

2019-11-16 Thread M Singh
Hi: I am working on a project and wanted to find out what are the best practices for setting name and uuid for operators: 1. Are there any restrictions on the length of the name and uuid attributes ?2. Are there any restrictions on the characters used for name and uuid (blank spaces, etc) ?3. Ca

Re: Re:Apache Flink - Operator name and uuid best practices

2019-11-16 Thread M Singh
ery operator, which gives me better experience in monitoring and scaling. Hope this helps. [1]  https://ci.apache.org/projects/flink/flink-docs-stable/ops/upgrading.html#matching-operator-state Best, Jiayi Liao At 2019-11-16 18:35:38, "M Singh" wrote: Hi: I am wo

Apache Airflow - Question about checkpointing and re-run a job

2019-11-16 Thread M Singh
Hi: I have a Flink job and sometimes I need to cancel and re run it.  From what I understand the checkpoints for a job are saved under the job id directory at the checkpoint location. If I run the same job again, it will get a new job id and the checkpoint saved from the previous run job (which

Re: Apache Airflow - Question about checkpointing and re-run a job

2019-11-17 Thread M Singh
Folks - Please let me know if you have any advice on this question.  Thanks On Saturday, November 16, 2019, 02:39:18 PM EST, M Singh wrote: Hi: I have a Flink job and sometimes I need to cancel and re run it.  From what I understand the checkpoints for a job are saved under the job id

Re: Apache Airflow - Question about checkpointing and re-run a job

2019-11-18 Thread M Singh
-1.9/ops/state/checkpoints.html#resuming-from-a-retained-checkpoint Best,Congxian M Singh 于2019年11月18日周一 上午2:54写道: Folks - Please let me know if you have any advice on this question.  Thanks On Saturday, November 16, 2019, 02:39:18 PM EST, M Singh wrote: Hi: I have a Flink job and

Re: Re:Apache Flink - Operator name and uuid best practices

2019-11-20 Thread M Singh
in any case. Best, Arvid On Sat, Nov 16, 2019 at 6:40 PM M Singh wrote: Thanks Jiayi for your response. I am thinking on the same lines.   Regarding using the same name and uuid, I believe the checkpoint state for an operator will be easy to identify if the uuid is the same as name.  But I am

Re: Re:Apache Flink - Operator name and uuid best practices

2019-11-21 Thread M Singh
, and the assigned in the application belongs to one operator, they are different. Best,Congxian M Singh 于2019年11月21日周四 上午6:18写道: Hi Arvid: Thanks for your clarification. I am giving supplying uid for the stateful operators and find the following directory structure on in the chkpoint directory

Re: Re:Apache Flink - Operator name and uuid best practices

2019-11-21 Thread M Singh
flink-runtime/src/test/java/org/apache/flink/runtime/checkpoint/CheckpointMetadataLoadingTest.java Best,Congxian M Singh 于2019年11月21日周四 下午7:44写道: Hi Congxian: For my application i see many uuids under the chk-6 directory ( I posted one in the sample above).   I am trying to understand that

Apache Flink - Uid and name for Flink sources and sinks

2019-11-21 Thread M Singh
Hi Folks: I am assigning uid and name for all stateful processors in our application and wanted to find out the following: 1. Should we assign uid and name to the sources and sinks too ?  2. What are the pros and cons of adding uid to sources and sinks ?3. The sinks have uid and hashUid - which

Re: Apache Flink - Uid and name for Flink sources and sinks

2019-11-22 Thread M Singh
Hi Folks - Please let me know if you have any advice on the best practices for setting uid for sources and sinks.  Thanks.  MansOn Thursday, November 21, 2019, 10:10:49 PM EST, M Singh wrote: Hi Folks: I am assigning uid and name for all stateful processors in our application and

Apache Flink - Troubleshooting exception while starting the job from savepoint

2019-11-22 Thread M Singh
Hi: I have a flink application in which some of the operators have uid and name and some stateless ones don't. I've taken a save point and tried to start another instance of the application from a savepoint - I get the following exception which indicates that the operator is not available to the

Re: Apache Flink - Troubleshooting exception while starting the job from savepoint

2019-11-23 Thread M Singh
Hey Folks:    Please let me know how to resolve this issue since using  --allowNonRestoredState without knowing if any state will be lost seems risky. ThanksOn Friday, November 22, 2019, 02:55:09 PM EST, M Singh wrote: Hi: I have a flink application in which some of the operators have

Re: Apache Flink - Uid and name for Flink sources and sinks

2019-11-24 Thread M Singh
ed for creating a uid for the sink ?   >> Not sure what do you mean by "attribute id". Could you give some more >> detailed information about it? Regards, Dian On Fri, Nov 22, 2019 at 6:27 PM M Singh wrote: Hi Folks - Please let me know if you have any advice on the best

Apache Flink - Throttling stream flow

2019-11-24 Thread M Singh
Hi: I have an Flink streaming application that invokes  some other web services.  However the webservices have limited throughput.  So I wanted to find out if there any recommendations on how to throttle the Flink datastream so that they don't overload the downstrream services.  I am using Kines

Re: Apache Flink - Throttling stream flow

2019-11-25 Thread M Singh
able/dev/table/udfs.html Here is a throttling iterator example:  https://github.com/apache/flink/blob/master/flink-examples/flink-examples-streaming/src/main/java/org/apache/flink/streaming/examples/utils/ThrottledIterator.java M Singh 于2019年11月25日周一 上午5:50写道: Hi: I have an Flink streaming ap

Re: Apache Flink - Uid and name for Flink sources and sinks

2019-11-25 Thread M Singh
Thanks DIan for your pointers.  MansOn Sunday, November 24, 2019, 08:57:53 PM EST, Dian Fu wrote: Hi Mans, Please see my reply inline below. 在 2019年11月25日,上午5:42,M Singh 写道: Thanks Dian for your answers. A few more questions: 1. If I do not assign uids to operators/sources and

Re: Apache Flink - Troubleshooting exception while starting the job from savepoint

2019-11-25 Thread M Singh
ld you please provide some code from your job, just to see if > there is anything problematic with the application code? > Normally there should be no problem with not providing UIDs for some > stateless operators. > > Cheers, > Kostas > > On Sat, Nov 23, 2019 at 11:16 AM M S

Re: Apache Flink - Troubleshooting exception while starting the job from savepoint

2019-11-29 Thread M Singh
n-ids-to-all-operators-in-my-job[2]   https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/state/savepoints.html#what-happens-if-i-delete-an-operator-that-has-state-from-my-job Best,Congxian M Singh 于2019年11月26日周二 上午10:49写道: Hi Kostas/Congxian: Thanks fo your response.   Based on

Re: Apache Flink - Throttling stream flow

2019-11-29 Thread M Singh
Wednesday, November 27, 2019, 11:32:06 AM EST, Rong Rong wrote: Hi Mans, is this what you are looking for [1][2]? --Rong [1] https://issues.apache.org/jira/browse/FLINK-11501[2]  https://github.com/apache/flink/pull/7679 On Mon, Nov 25, 2019 at 3:29 AM M Singh wrote: Thanks Ciazhi & Th

Side output question

2019-12-02 Thread M Singh
Hi: I am replacing SplitOperator in my flink application with a simple processor with side outputs. My questions is that does the main stream from which we get the side outputs need to have any events (ie, produced using by the using collector.collect) ?  Or can we have all the output as side o

Apache Flink - Retries for async processing

2019-12-09 Thread M Singh
Hi Folks: I am working on a project where I will be using Flink's async processing capabilities.  The job has to make http request using a token.  The token expires periodically and needs to be refreshed. So, I was looking for patterns for handling async call failures and retries when the token

Re: Side output question

2019-12-10 Thread M Singh
should be no issue to only have side-outputs in your operator. There should also be no big drawbacks. I guess mostly some metrics will not be properly populated, but you can always populate them manually or add new ones. Best, Arvid On Mon, Dec 2, 2019 at 8:40 PM M Singh wrote: Hi: I am replacing

Re: Apache Flink - Retries for async processing

2019-12-10 Thread M Singh
:08:39 PM EST, Jingsong Li wrote: Hi M Singh, Our internal has this scenario too, as far as I know, Flink does not have this internal mechanism in 1.9 too.I can share my solution:- In async function, start a thread factory.- Send the call to thread factory when this call has failed. Do

Apache Flink - Clarifications about late side output

2019-12-10 Thread M Singh
Hi: I have a few questions about the side output late data.   Here is the API stream .keyBy(...) <- keyed versus non-keyed windows .window(...) <- required: "assigner" [.trigger(...)]<- optional: "trigger" (else default trigger) [.

Re: Apache Flink - Clarifications about late side output

2019-12-11 Thread M Singh
time for testing the > semantics. > > I hope this helps. Otherwise of course we can help you in finding the > answers to the remaining questions. > > Regards, > Timo > > > > On 10.12.19 20:32, M Singh wrote: >> Hi: >> >> I have a few question

Re: Apache Flink - Retries for async processing

2019-12-11 Thread M Singh
Thanks Zhu for your advice.  Mans On Tuesday, December 10, 2019, 09:32:01 PM EST, Zhu Zhu wrote: Hi M Singh, I think you would be able to know the request failure cause and whether it is recoverable or not.You can handle the error as you like. For example, if you think the error is

Re: Apache Flink - Clarifications about late side output

2019-12-11 Thread M Singh
that helps,David On Wed, Dec 11, 2019 at 1:40 PM M Singh wrote: Thanks Timo for your answer.  I will try the prototype but was wondering if I can find some theoretical documentation to give me a sound understanding. Mans On Wednesday, December 11, 2019, 05:44:15 AM EST, Timo Walther wrote:

Apache Flink - Flink Metrics - How to distinguish b/w metrics for two job manager on the same host

2019-12-18 Thread M Singh
Hi: I am using AWS EMR with Flink application and two of the job managers are running on the same host.  I am looking at the metrics documentation (Apache Flink 1.9 Documentation: Metrics) and and see the following:  | | | | Apache Flink 1.9 Documentation: Metrics | | | - metr

Re: Apache Flink - Flink Metrics - How to distinguish b/w metrics for two job manager on the same host

2019-12-19 Thread M Singh
e, you can specify it to distinguish  different Flink cluster.[1] [1]:  https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/metrics.html#datadog-orgapacheflinkmetricsdatadogdatadoghttpreporter Best,Vino M Singh 于2019年12月19日周四 上午2:54写道: Hi: I am using AWS EMR with Flink application and two

Apache Flink - Flink Metrics collection using Prometheus on EMR from streaming mode

2019-12-24 Thread M Singh
Hi: I wanted to find out what's the best way of collecting Flink metrics using Prometheus in a streaming application on EMR/Hadoop. Since the Flink streaming jobs could be running on any node - is there any Prometheus configuration or service discovery option available that will dynamically pick

Re: Apache Flink - Flink Metrics collection using Prometheus on EMR from streaming mode

2019-12-25 Thread M Singh
ecommended for a streaming job? Best,Vino M Singh 于2019年12月24日周二 下午4:02写道: Hi: I wanted to find out what's the best way of collecting Flink metrics using Prometheus in a streaming application on EMR/Hadoop. Since the Flink streaming jobs could be running on any node - is there any

Apache Flink - Sharing state in processors

2020-01-10 Thread M Singh
Hi: I have a few question about how state is shared in processors in Flink. 1. If I have a processor instantiated in the Flink app, and apply use in multiple times in the Flink -     (a) if the tasks are in the same slot - do they share the same processor on the taskmanager ?     (b) if the tasks

Re: Apache Flink - Sharing state in processors

2020-01-23 Thread M Singh
/streaming/runtime/partitioner/KeyGroupStreamPartitioner.java#L58 BestYun Tang From: M Singh Sent: Friday, January 10, 2020 23:29 To: User Subject: Apache Flink - Sharing state in processors Hi: I have a few question about how state is shared in processors in Flink. 1. If I have a processor instantiated in t

Apache Flink Job fails repeatedly due to RemoteTransportException

2020-01-28 Thread M Singh
Hi Folks: We have streaming Flink application (using v 1.6.2) and it dies within 12 hours.  We have configured number of restarts which is 10 at the moment. Sometimes the job runs for some time and then within a very short time has a number of restarts and finally fails.  In other instances, the

Flink on Kubernetes - Session vs Job cluster mode and storage

2020-02-22 Thread M Singh
Hey Folks: I am trying to figure out the options for running Flink on Kubernetes and am trying to find out the pros and cons of running in Flink Session vs Flink Cluster mode (https://ci.apache.org/projects/flink/flink-docs-stable/ops/deployment/kubernetes.html#flink-session-cluster-on-kubernete

Re: Flink on Kubernetes - Session vs Job cluster mode and storage

2020-02-24 Thread M Singh
should we use emptyDir ? [1].  https://ci.apache.org/projects/flink/flink-docs-master/ops/deployment/native_kubernetes.html M Singh 于2020年2月23日周日 上午2:28写道: Hey Folks: I am trying to figure out the options for running Flink on Kubernetes and am trying to find out the pros and cons of running in F

Re: Apache Flink Job fails repeatedly due to RemoteTransportException

2020-02-24 Thread M Singh
Thanks will try your recommendations and apologize for the delayed response. On Wednesday, January 29, 2020, 09:58:26 AM EST, Till Rohrmann wrote: Hi M Singh, have you checked the TaskManager logs of  ip-xx-xxx-xxx-xxx.ec2.internal/xx.xxx.xxx.xxx:39623 for any suspicious logging

Re: Flink on Kubernetes - Session vs Job cluster mode and storage

2020-02-26 Thread M Singh
at 3:36 AM Yang Wang wrote: Hi M Singh, > Mans - If we use the session based deployment option for K8 - I thought K8 >will automatically restarts any failed TM or JM.  In the case of failed TM - the job will probably recover, but in the case of failed JM - perhaps we need to resubmit al

Re: Flink on Kubernetes - Session vs Job cluster mode and storage

2020-02-26 Thread M Singh
BTW - Is there any limit to the amount of data that can be stored on emptyDir in K8 ?   On Wednesday, February 26, 2020, 07:33:54 PM EST, M Singh wrote: Thanks Yang and Arvid for your advice and pointers.  Mans On Wednesday, February 26, 2020, 09:52:26 AM EST, Arvid Heise wrote

Re: Apache Flink - How to use/invoke LookupTableSource/Function

2021-07-08 Thread M Singh
rojects/flink/flink-docs-release-1.13/docs/dev/table/sql/queries/joins/#event-time-temporal-join M Singh 于2021年7月7日周三 下午5:22写道: Hi Jing: Thanks for your explanation and references.  I looked at your reference (https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/table/sql/queries/jo

Apache Flink - Reading Avro messages from Kafka with schema in schema registry

2021-07-08 Thread M Singh
Hi: I am trying to read avro encoded messages from Kafka with schema registered in schema registry. I am using the class (https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/formats/avro/registry/confluent/ConfluentRegistryAvroDeserializationSchema.html) using the me

Re: Apache Flink - Reading Avro messages from Kafka with schema in schema registry

2021-07-15 Thread M Singh
r is that it is ever used in a single parallel instance and it is not sent over a network again. So basically there you use the writer schema retrieved from schema registry as the reader schema. I hope this answers your questions. Best, Dawid [1] https://avro.apache.org/docs/1.10.2/spec.h

Re: Apache Flink - Reading Avro messages from Kafka with schema in schema registry

2021-07-15 Thread M Singh
over time) thus you need schema registry. Best, Dawid [1]https://avro.apache.org/docs/1.10.2/spec.html#Data+Serialization+and+Deserialization [2] https://avro.apache.org/docs/1.10.2/spec.html#Schema+Resolution On 15/07/2021 15:48, M Singh wrote: Hello Dawid: Thanks for your an

Apache Flink - Using upsert JDBC sink for DataStream

2021-10-16 Thread M Singh
Hi Folks: I am working on Flink DataStream pipeline and would like to use JDBC upsert functionality.  I found a class TableJdbcUpsertOutputFormat but am not sure who to use it with the JdbcSink as shown in the document  (https://nightlies.apache.org/flink/flink-docs-master/api/java//org/apache/fl

Re: Apache Flink - Using upsert JDBC sink for DataStream

2021-10-18 Thread M Singh
quot; + " GROUP BY len, cTag\n" + ")\n" + "GROUP BY cnt, cTag") .await();   [1]  https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/connectors/table/jdbc/#key-handling[2]   https:

Re: Apache Flink - Using upsert JDBC sink for DataStream

2021-10-20 Thread M Singh
ormat`, `JdbcOutputFormat.Builder` would choose to create a `TableJdbcUpsertOutputFormat` or `JdbcOutputFormat` instance depends on whether key fields is defined in DML.  [1]  https://ci.apache.org/projects/flink/flink-docs-master/docs/connectors/table/jdbc/#idempotent-writes Best,JING ZHANG M Singh 于2021年10

Apache Flink - Building flink docs locally

2022-01-15 Thread M Singh
Hi:   I wanted to find out what's the command for building the flink docs locally and the location of the final docs.  Is it apart of the build commands (Building Flink from Source) and can they be built separately ?   Thanks 

Re: Apache Flink - Building flink docs locally

2022-01-16 Thread M Singh
Thanks Chesnay for your pointers.  Mans On Sunday, January 16, 2022, 06:19:09 AM EST, Chesnay Schepler wrote: see https://github.com/apache/flink/tree/master/docs On 15/01/2022 15:04, M Singh wrote: Hi:   I wanted to find out what's the command for building the

Apache Fink - Adding/Removing KeyedStreams at run time without stopping the application

2022-01-22 Thread M Singh
Hi Folks: I am working on an exploratory project in which I would like to add/remove KeyedStreams (https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/operators/overview/#keyby) without restarting the Flink streaming application. Is it possible natively in Apache Flink ?  If

Apache Flink - Can AllWindowedStream be parallel ?

2022-01-22 Thread M Singh
Hi Folks: The documentation for AllWindowedStream (https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/operators/overview/#datastream-rarr-allwindowedstream) has a note: This is in many cases a non-parallel transformation. All records will be gathered in one task for the wi

Re: Apache Fink - Adding/Removing KeyedStreams at run time without stopping the application

2022-01-24 Thread M Singh
s not possible to do so without restarting the job and as far as I know there is no existing framework/pattern to achieve this. By the way, why do you need this functionality? Could you elaborate more on your use case? M Singh 于2022年1月22日周六 21:32写道: Hi Folks: I am working on an exploratory proje

Re: Apache Flink - Can AllWindowedStream be parallel ?

2022-01-24 Thread M Singh
would be  SingleOutputStreamOperator, which cannot be parallel. [1]  https://github.com/apache/flink/blob/master/flink-streaming-scala/src/main/scala/org/apache/flink/streaming/api/scala/AllWindowedStream.scala BestYun TangFrom: M Singh Sent: Sunday, January 23, 2022 4:24 To: User-Flink Subject

Re: Apache Fink - Adding/Removing KeyedStreams at run time without stopping the application

2022-01-25 Thread M Singh
gregationKey to the stream based on the value of groupByFields   The flatMap means one message may be emitted several times with different values of aggregationKey so it may belong to multiple aggregations.       From: M Singh Sent: Monday, January 24, 2022 9:52 PM To: Caizhi We

Re: Apache Fink - Adding/Removing KeyedStreams at run time without stopping the application

2022-01-25 Thread M Singh
],groupByType:bySession,aggregationKey:'3'} {session:3,account:1,value:3200,groupByFields:[account],groupByType:byAccount,aggregationKey:'2'}       From: Colletta, Edward Sent: Tuesday, January 25, 2022 1:29 PM To: M Singh ; Caizhi Weng ; User-Flink Subject: RE: Apache F

Apache Flink - Will later events from a table with watermark be propagated and when can CURRENT_WATERMARK be null in that situation

2022-02-11 Thread M Singh
Hi: The flink docs (https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/functions/systemfunctions/) indicates that the CURRENT_WATERMARK(rowtime) can return null: Note that this function can return NULL, and you may have to consider this case. For example, if you want to filter

Re: Apache Flink - Will later events from a table with watermark be propagated and when can CURRENT_WATERMARK be null in that situation

2022-02-11 Thread M Singh
2022 at 16:45, M Singh wrote: Hi: The flink docs (https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/functions/systemfunctions/) indicates that the CURRENT_WATERMARK(rowtime) can return null: Note that this function can return NULL, and you may have to consider this case. F

Re: Apache Flink - Will later events from a table with watermark be propagated and when can CURRENT_WATERMARK be null in that situation

2022-02-11 Thread M Singh
. Mans On Friday, February 11, 2022, 05:02:49 PM EST, M Singh wrote: Hi Martijn: Thanks for the reference.    My understanding was that if we use watermark then any event with event time (in the above example) < event_time - 30 seconds will be dropped automatically.    My question [1]

Re: Apache Flink - Will later events from a table with watermark be propagated and when can CURRENT_WATERMARK be null in that situation

2022-02-12 Thread M Singh
current watermark is NULL, and no events will be considered late. Hope this helps clarify things. Regards,David On Sat, Feb 12, 2022 at 12:01 AM M Singh wrote: I thought a little more about your references Martijn and wanted to confirm one thing - the table is specifying the watermark and the d

Apache Flink - User Defined Functions - Exception when passing all arguments

2022-02-16 Thread M Singh
Hi: I have a simple concatenate UDF (for testing purpose) defined as:     public static class ConcatenateFunction extends ScalarFunction {        public String eval(@DataTypeHint(inputGroup = InputGroup.ANY) Object ... inputs) {            return Arrays.stream(inputs).map(i -> i.toString()).coll

Apache Flink - Continuously monitoring directory using filesystem connector - 1.14.3

2022-02-17 Thread M Singh
Hi:   I have a simple application and am using file system connector to monitor a directory and then print to the console (using datastream).  However, the application stops after reading the file in the directory (at the moment I have a single file in the directory).   I am using Apache Flink v

Re: Apache Flink - Continuously monitoring directory using filesystem connector - 1.14.3

2022-02-18 Thread M Singh
18, 2022 at 1:28 AM M Singh wrote: Hi:   I have a simple application and am using file system connector to monitor a directory and then print to the console (using datastream).  However, the application stops after reading the file in the directory (at the moment I have a single file in the

Apache Flink - Continuously streaming data using jdbc connector

2022-02-20 Thread M Singh
Hi Folks: I am trying to monitor a jdbc source and continuously streaming data in an application using the jdbc connector.  However, the application stops after reading the data in the table. I've checked the docs (https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/table/jdbc/)

Re: Apache Flink - Continuously streaming data using jdbc connector

2022-02-21 Thread M Singh
6:23 AM M Singh wrote: Hi Folks: I am trying to monitor a jdbc source and continuously streaming data in an application using the jdbc connector.  However, the application stops after reading the data in the table. I've checked the docs (https://nightlies.apache.org/flink/flink-docs-m

Re: Apache Flink - User Defined Functions - Exception when passing all arguments

2022-02-22 Thread M Singh
phase rather than when parsing, but it shouldn't work anyway. I suggest you to just use Table API for that, as it's richer. You can even use withColumns(range(..)) which gives you more control. Hope it helps,FG On Thu, Feb 17, 2022 at 1:34 AM M Singh wrote: Hi: I have a simple concatenat

Apache Flink - Exception on left outer join with 'kafka' connector

2022-02-22 Thread M Singh
Hi Folks: I am using 'kafka' connector and joining with data from jdbc source (using connector).   I am using Flink v 1.14.3.  If I do a left outer join between kafka source and jdbc source, and try to save it to another kafka sink using connectors api, I get the following exception: Exception i

Apache Flink - How to destroy global window and release it's resources

2019-04-10 Thread M Singh
Hi: I have a use case where I need to create a global window where I need to wait for unknown time for certain events for a particular key.  I understand that I can create a global window and use a custom trigger to initiate the function computation.  But I am not sure how to destroy the window

Re: Apache Flink - Question about broadcast state pattern usage

2019-04-10 Thread M Singh
rgs. Since that you could pass multiple MapStateDescriptors to it. [1]  https://ci.apache.org/projects/flink/flink-docs-release-1.7/dev/stream/state/state.html#using-managed-operator-state[2]   https://ci.apache.org/projects/flink/flink-docs-release-1.7/dev/stream/state/broadcast_state.html Best,Guowei

Re: Apache Flink - How to destroy global window and release it's resources

2019-04-12 Thread M Singh
proper TriggerResult, which defines how to deal with the window elements after computing a window in your trigger implementation. You could find the detail information from the doc[1]. 1.  https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/stream/operators/windows.html#fire-and-purge

Re: Apache Flink - Question about broadcast state pattern usage

2019-04-12 Thread M Singh
= ... MapStateDescriptor bcState2 = ... DataStream stream = ... BroadcastStream bcStream = stream.broadcast(bcState1, bcState2); Best,Fabian Am Mi., 10. Apr. 2019 um 19:44 Uhr schrieb M Singh : Hi Guowei; Thanks for your answer. Do you have any example which illustrates using broadcast is used wi

Apache Flink - CEP vs SQL detecting patterns

2019-04-12 Thread M Singh
Hi: I am looking at the documentation of the CEP and there is way to access patterns which have timeout.  But I could  not find similar capability in the Table and SQL interface detecting patterns.  I am assuming that the CEP interface is more comprehensive and complete than the SQL/Table interf

Re: Apache Flink - CEP vs SQL detecting patterns

2019-04-20 Thread M Singh
in SQL, as there is no such feature in SQL standard. The only addition to SQL standard we introduce so far is the WITHIN clause. We might introduce the timed out patterns some time in the future, but personally I am not aware of such plans. Best, Dawid On 12/04/2019 22:40, M Singh wrote

Apache Flink - Question about dynamically changing window end time at run time

2019-04-23 Thread M Singh
Hi: I am working on a project and need to change the end time of the window dynamically.  I want to find out if the end time of the window is used internally (for sorting windows/etc) except for handling watermarks that would cause problems if the end time was changed during run time after the w

Re: Apache Flink - Question about dynamically changing window end time at run time

2019-04-24 Thread M Singh
hanks,Rong [1]  https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/stream/operators/windows.html#session-windows[2]   https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/libs/cep.html#groups-of-patterns On Tue, Apr 23, 2019 at 8:19 AM M Singh wrote: Hi: I am working on a project

Re: Apache Flink - Question about dynamically changing window end time at run time

2019-04-28 Thread M Singh
memory for the key+window combination. Plus there is just window per key with Global Windows.  On Wed, Apr 24, 2019 at 7:47 AM M Singh wrote: Hi Rong: Thanks for your answer. >From what I understand the dynamic gap session windows are also created when >the event is encountered.  I need

Re: Apache Flink - Question about dynamically changing window end time at run time

2019-04-29 Thread M Singh
ly cleaned up. Best,Fabian Am So., 28. Apr. 2019 um 16:31 Uhr schrieb M Singh : Thanks Sameer/Rong: As Fabian and you have mentioned, the window still sticks around forever for global window, so I am trying avoid that scenario. Fabian & Flink team - do you have any insights into what wou

  1   2   >