Re: Exception: This method must be called from inside the mailbox thread

2020-11-24 Thread Arvid Heise
Hi KristoffSC, I'd strongly suggest not blocking the task thread if it involves external services. RPC notification cannot be processed and checkpoints are delayed when the task thread is blocked. That's what AsyncIO is for. If your third party library just takes a few ms to finish computation wi

Re: slot problem

2020-11-24 Thread Zhu Zhu
Each task will be assigned a dedicated thread for its data processing. A slot can be shared by multiple tasks if they are in the same slot sharing group[1]. [1] https://ci.apache.org/projects/flink/flink-docs-release-1.10/concepts/runtime.html#task-slots-and-resources Thanks, Zhu ゞ野蠻遊戲χ 于2020年1

slot problem

2020-11-24 Thread ?g???U?[????
Hi all        Can only one thread run at a time for a slot? Or one slot can run multiple threads in parallel at the same time? Thanks, Jiazhi

Re: Is there a way we can specify operator ID for DDLs?

2020-11-24 Thread Danny Chan
Hi Kevin Kwon ~ Do you want to customize only the source operator name or all the operator name in order for the state compatibility ? State compatibility is an orthogonal topic and keep the operator name is one way to solve it. Kevin Kwon 于2020年11月25日周三 上午1:11写道: > For SQLs, I know that the o

Re: Concise example of how to deploy flink on Kubernetes

2020-11-24 Thread George Costea
Thank you. This is very helpful. On Mon, Nov 23, 2020 at 9:46 AM Till Rohrmann wrote: > Hi George, > > Here is some documentation about how to deploy a stateful function job > [1]. In a nutshell, you need to deploy a Flink cluster on which you can run > the stateful function job. This can either

Re: Exception: This method must be called from inside the mailbox thread

2020-11-24 Thread KristoffSC
Hi Arvid, Thank you for your answer. And what if a) would block task's thread? Let's say I'm ok with making entire task thread to wait on this third party lib. In that case I would be safe from having this exception even though I would not use AsyncIO? -- Sent from: http://apache-flink-user

Statefun delayed message

2020-11-24 Thread Timothy Bess
Hi everyone, I have a question about how delayed messages work, I tried to dig through some docs on it, but not sure it addresses exactly my question. Basically, if I send a delayed message with exactly-once mode on, does Flink need to wait until the delayed message sends to commit Kafka offsets?

Re: statefun creates unexpected new physical function

2020-11-24 Thread Lian Jiang
Probolved solved. It is because another function sends messages to myFunc by using non hard coded ids. Thanks. On Tue, Nov 24, 2020 at 11:24 AM Lian Jiang wrote: > Hi, > > I am using statefun 2.2.0 and have below routing: > > downstream.forward(myFunc.TYPE, myFunc.TYPE.name(), message); > > I

Re: Exception: This method must be called from inside the mailbox thread

2020-11-24 Thread Arvid Heise
Hi KristoffSC, sorry for the confusing error message. In short, mailbox thread = task thread. your operator a) calls collector.collect from a different thread (in which the CompleteableFuture is completed). However, all APIs must always be used from the task thread. The only way to cross thread

statefun creates unexpected new physical function

2020-11-24 Thread Lian Jiang
Hi, I am using statefun 2.2.0 and have below routing: downstream.forward(myFunc.TYPE, myFunc.TYPE.name(), message); I expect this statement will create only one physical myFunc because the id is hard coded with myFunc.TYPE.name(). This design can use the PersistedValue field in myFunc for all i

Re: Dynamic ad hoc query deployment strategy

2020-11-24 Thread lalala
Hi Timo and Dawid, Thank you for a detailed answer; it looks like we need to reconsider all job submission flow. What is the best way to compare the new job graph? Can we use Flink visualizer to ensure that the new job graph shares the table as you mention It is not guaranteed? Best regards,

Is there a way we can specify operator ID for DDLs?

2020-11-24 Thread Kevin Kwon
For SQLs, I know that the operator ID assignment is not possible now since the query optimizer may not be backward compatible in each release But are DDLs also affected by this? for example, CREATE TABLE mytable ( id BIGINT, data STRING ) with ( connector = 'kafka' ... id = 'mytable'

Re: Questions on Table to Datastream conversion and onTimer method in KeyedProcessFunction

2020-11-24 Thread Timo Walther
Hi Fuyao, great that you could make progress. 2. Btw nice summary of the idleness concept. We should have that in the docs actually. 4. By looking at tests like `IntervalJoinITCase` [1] it seems that we also support FULL OUTER JOINs as interval joins. Maybe you can make use of them. 5. "b

Exception: This method must be called from inside the mailbox thread

2020-11-24 Thread KristoffSC
Hi, I faced an issue on Flink 1.11. It was for now one time thing and I cannot reproduce it. However I think something is lurking there... I cannot post full stack trace and user code however I will try to describe the problem. Setup without any resource groups with only one Operator chain restri

Re: How to setup Regions for Fault Tolerance in Flink when using Side Outputs

2020-11-24 Thread Till Rohrmann
Hi Patrick, Flink supports regional failover [1] which only restarts all tasks connected via pipelined data exchanges. Hence, either when having an embarrassingly parallel topology or running a batch job, Flink should not restart the whole job in case of a task failure. However, in the case of si

Re: Learn flink source code book recommendation

2020-11-24 Thread Timo Walther
Hi, one advice I can give you is to checkout the code and execute some of the examples in debugging mode. Esp. within Flink's functions e.g. MapFunction or ProcessFunction you can set a breakpoint and look at the stack trace. This gives you a good overview about the Flink stack in general.

Re: Dynamic ad hoc query deployment strategy

2020-11-24 Thread Timo Walther
I agree with Dawid. Maybe one thing to add is that reusing parts of the pipeline is possible via StatementSets in TableEnvironment. They allow you to add multiple queries that consume from a common part of the pipeline (for example a common source). But all of that is compiled into one big job

Re: JobListener weird behaviour

2020-11-24 Thread Till Rohrmann
Hi Flavio, looking only at the code, then the job should first transition into a globally terminal state before notifying the client about it. The only possible reason I could see for this behaviour is that the RestServerEndpoint uses an ExecutionGraphCache (DefaultExecutionGraphCache is the imple

How to setup Regions for Fault Tolerance in Flink when using Side Outputs

2020-11-24 Thread Eifler, Patrick
Hi all, We are trying to setup regions to enable Flink to only stop failing tasks based on region instead of failing the entire stream. We are using one main stream that is reading from a kafka topic and a bunch of side outputs for processing each event from that topic differently. For the proce

Re: Jdbc input format and system properties

2020-11-24 Thread Flavio Pompermaier
Just to close this thread I found the cause of the problem: looking into the code of the mysql connector the value of PropertyDefinitions.SYSP_disableAbandonedConnectionCleanup is "com.mysql.cj.disableAbandonedConnectionCleanup" and not "com.mysql.disableAbandonedConnectionCleanup" as stated in [1]

Re: Question: How to avoid local execution being terminated before session window closes

2020-11-24 Thread Timo Walther
For debugging you can also implement a simple non-parallel source using `org.apache.flink.streaming.api.functions.source.SourceFunction`. You would need to implement the run() method with an endless loop after emitting all your records. Regards, Timo On 24.11.20 16:07, Klemens Muthmann wrote:

Re: Dynamic ad hoc query deployment strategy

2020-11-24 Thread Dawid Wysakowicz
Hi, Really sorry for a late reply. To the best of my knowledge there is no such possibility to "attach" to a source/reader of a different job. Every job would read the source separately. `The GenericInMemoryCatalog is an in-memory implementation of a catalog. All objects will be available only f

Re: Learn flink source code book recommendation

2020-11-24 Thread Till Rohrmann
Hi, the best way learning Flink's source code would be to dive into it [1]. [1] https://github.com/apache/flink Cheers, Till On Tue, Nov 24, 2020 at 3:43 AM 心剑 <2752980...@qq.com> wrote: > Excuse me, I want to learn the flink source code. Do you have any good > information and the latest books

Re: Print on screen DataStream content

2020-11-24 Thread Simone Cavallarin
ok, thanks you all for the help! s From: David Anderson Sent: 24 November 2020 15:16 To: Simone Cavallarin Cc: user@flink.apache.org Subject: Re: Print on screen DataStream content Simone, What you want to do is to override the toString() method on Event so th

Re: Print on screen DataStream content

2020-11-24 Thread David Anderson
Simone, What you want to do is to override the toString() method on Event so that it produces a more helpful String as its result, and then use stream.print() in your IDE (where stream is a DataStream). By the way, printOrTest(stream) isn't part of Flink -- that's just something used by the tra

Re: Print on screen DataStream content

2020-11-24 Thread Simone Cavallarin
Hi, yes, I would like to debug locally on my IDE. This is what I tried so far, but no luck. a)String ff = result.toString(); System.out.print(ff); b) printOrTest(stream); c)stream.print(); d) System.out.println(stream.print()); This is the output and to me it looks l

Re: Question: How to avoid local execution being terminated before session window closes

2020-11-24 Thread Klemens Muthmann
Hi, Thanks for your reply. I am using processing time instead of event time, since we do get the events in batches and some might arrive days later. But for my current dev setup I just use a CSV dump of finite size as input. I will hand over the pipeline to some other guys, who will need to

Re: Job Manager logs

2020-11-24 Thread Timo Walther
Hi Saksham, could you tell us a bit more about your deployement where you run Flink. This seems to be the root exception: 2020-11-24 11:11:16,296 ERROR org.apache.flink.runtime.rest.handler.taskmanager.TaskManagerStdoutFileHandler [] - Failed to transfer file from TaskExecutor f0dc0ae680e65a

Re: Question: How to avoid local execution being terminated before session window closes

2020-11-24 Thread Timo Walther
Hi Klemens, what you are observing are reasons why event-time should be preferred over processing-time. Event-time uses the timestamp of your data while processing-time is to basic for many use cases. Esp. when you want to reprocess historic data, you want to do that at full speed instead of

Job Manager logs

2020-11-24 Thread saksham sapra
Hi All, I am running a job in flink and somehow the job is failing and the task manager is getting out of the pool unknowingly. Also some heartbeat timeout exceptions are coming. Thanks, Saksham 2020-11-24 11:07:44,594 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint[] - ---

Re: Print on screen DataStream content

2020-11-24 Thread Timo Walther
Hi Simone, if you are just executing DataStream pipelines locally in your IDE while prototyping. You should be able to use `DataStream#print()` which just prints to standard out [1] (It might be hidden between the log messages). For debugging locally, you can also just set breakpoints in your

Re: Logs of JobExecutionListener

2020-11-24 Thread Flavio Pompermaier
ok that's fine to me, just add an @internal annotation on the RestClusterClient if it is intended only for internal use.. but wouldn't be easier to provide some sort of client generation facility (e.g. swagger or similar)? Il mar 24 nov 2020, 11:38 Till Rohrmann ha scritto: > I see the point in

Re: Hi I'm having problems with self-signed certificiate trust with Native K8S

2020-11-24 Thread Till Rohrmann
Hi Kevin, I expect the 1.12.0 release to happen within the next 3 weeks. Cheers, Till On Tue, Nov 24, 2020 at 4:23 AM Yang Wang wrote: > Hi Kevin, > > Let me try to understand your problem. You have added the trusted keystore > to the Flink app image(my-flink-app:0.0.1) > and it could not be l

Re: Logs of JobExecutionListener

2020-11-24 Thread Till Rohrmann
I see the point in having a richer RestClusterClient. However, I think we first have to make a decision whether the RestClusterClient is something internal or not. If it is something internal, then only extending the RestClusterClient and not adding these convenience methods to ClusterClient could

Re: Print on screen DataStream content

2020-11-24 Thread Simone Cavallarin
I tried to `DataStream#print()` but I don't quite understand how to implement it. Could you please give me an example? I'm using Intellij so what I would need is just to see the data on my screen. Thanks From: David Anderson Sent: 24 November 2020 10:01 To: Pan

Re: Print on screen DataStream content

2020-11-24 Thread David Anderson
When Flink is running on a cluster, `DataStream#print()` prints to files in the log directory. Regards, David On Tue, Nov 24, 2020 at 6:03 AM Pankaj Chand wrote: > Please correct me if I am wrong. `DataStream#print()` only prints to the > screen when running from the IDE, but does not work (pri

Question: How to avoid local execution being terminated before session window closes

2020-11-24 Thread Klemens Muthmann
Hi, I have written an Apache Flink Pipeline containing the following piece of code (Java): stream.window(ProcessingTimeSessionWindows.withGap(Time.seconds(50))).aggregate(new CustomAggregator()).print(); If I run the pipeline using local execution I see the following behavior. The "CustomA

JobListener weird behaviour

2020-11-24 Thread Flavio Pompermaier
Hello everybody, these days I have been trying to use the JobListener to implement a simple logic in our platform that consists in calling an external service to signal that the job has ended and, in case of failure, save the error cause. After some problems to make it work when starting a job usi