Re: UnsupportedOperationException from org.apache.flink.shaded.asm6.org.objectweb.asm.ClassVisitor.visitNestHostExperimental using Java 11

2020-01-13 Thread Piotr Nowojski
Hi, Yes, this is work in progress [1]. It looks like the Java 11 support is targeted for Flink 1.10 which should be released this or the following month. Piotrek [1] https://issues.apache.org/jira/browse/FLINK-10725 > On 9 Jan 2020, at 15:56

Re: Data overflow in SpillingResettableMutableObjectIterator

2020-01-13 Thread Piotr Nowojski
Hi Jian, Thank you for reporting the issue. I see that you have already created a ticket for this [1]. Piotrek [1] https://issues.apache.org/jira/browse/FLINK-15549 > On 9 Jan 2020, at 09:10, Jian Cao wrote: > > Hi all: > We are using fl

Re: Long end-to-end checkpointing time on a single subtask

2020-01-13 Thread Arvid Heise
Hi Robin, I noticed that I answered privately, so let me forward that to the user list. Please come back to the ML if you have more questions. Best, Arvid On Thu, Jan 9, 2020 at 5:47 PM Robin Cassan wrote: > Hi Arvid, thanks a lot for this quick response! > We have wrongly assumed that our d

Re: [Question] Failed to submit flink job to secure yarn cluster

2020-01-13 Thread Ethan Li
Sorry forgot to update on this. I figured it out. KMS is not set up correctly in my environment. ResourceManager is also missing key provider config. PE is fixing it. Thanks for looking into this Ethan Li > On Jan 13, 2020, at 21:38, Yang Wang wrote: > >  > I am not familiar with kerber

Re: Job Cluster vs Session Cluster deploying and configuration

2020-01-13 Thread Yang Wang
Hi KristoffSC, Glad to hear that you are looking to run Flink in container land. First, you are right: Flink supports both session and per-job clusters in a container environment. The differences are in the job submission process and in isolation. For a session cluster, you do not need to build your own

Re: [Question] Failed to submit flink job to secure yarn cluster

2020-01-13 Thread Yang Wang
I am not familiar with Kerberos. However, I find "keyProvider null cannot renew token" in the Yarn ResourceManager logs. Could you please check that the key provider has been configured correctly? Best, Yang On Fri, Jan 10, 2020 at 10:54 PM, Ethan Li wrote: > Hi Yangze, > > Thanks for your reply. Those are the docs

Re: Please suggest helpful tools

2020-01-13 Thread Kurt Young
First, could you check whether the added filter conditions are executed before the join operators? If they are already pushed down and executed before the join, it should be some real join keys generating the data skew. Best, Kurt On Tue, Jan 14, 2020 at 5:09 AM Eva Eva wrote: > Hi Kurt, > > Assuming I'm

Re: Please suggest helpful tools

2020-01-13 Thread Eva Eva
Hi Kurt, Assuming I'm joining two tables, "latestListings" and "latestAgents" like below: "SELECT * FROM latestListings l " + "LEFT JOIN latestAgents aa ON l.listAgentKeyL = aa.ucPKA " + "LEFT JOIN latestAgents ab ON l.buyerAgentKeyL = ab.ucPKA " + "LEFT JOIN latestAgents

Understanding watermark

2020-01-13 Thread Cam Mach
Hello Flink expert, We have a pipeline that reads both bounded and unbounded sources, and our understanding is that when the bounded sources complete they should get a watermark of +inf, and then we should be able to take a savepoint and safely restart the pipeline. However, we have a source that never
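For reference on the watermark behaviour this thread asks about: an operator with several inputs advances its event-time clock to the minimum of its input watermarks, and a source that finishes emits a final watermark of Long.MAX_VALUE (the "+inf" mentioned above). The sketch below only illustrates that arithmetic, assuming plain long timestamps; the class and method names are hypothetical and are not part of Flink's API.

```java
// Illustrative sketch only: WatermarkSketch and combinedWatermark are
// hypothetical names, not Flink classes or methods.
class WatermarkSketch {
    // Timestamp carried by the final watermark of a finished source ("+inf").
    static final long MAX_WATERMARK = Long.MAX_VALUE;

    // The combined watermark of an operator is the minimum over its inputs,
    // so one input that has not advanced holds back the whole operator.
    static long combinedWatermark(long... inputWatermarks) {
        if (inputWatermarks.length == 0) {
            throw new IllegalArgumentException("at least one input is required");
        }
        long min = Long.MAX_VALUE;
        for (long w : inputWatermarks) {
            min = Math.min(min, w);
        }
        return min;
    }

    public static void main(String[] args) {
        // One finished bounded source, one unbounded source at t=1000.
        System.out.println(combinedWatermark(MAX_WATERMARK, 1_000L));
        // All sources finished.
        System.out.println(combinedWatermark(MAX_WATERMARK, MAX_WATERMARK));
    }
}
```

Under this model the combined watermark only reaches +inf once every input has reached it, which is why a single source that does not complete keeps the downstream watermark finite.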

Re: Yarn Kerberos issue

2020-01-13 Thread Rong Rong
Hi Juan, Sorry, I think I hit the send button too soon as well :-) There's a 2nd part of the analysis which was already captured in FLINK-15561 but not sent: > *It seems like the delegation token checker is imposed in > YarnClusterDescriptor [1], but not in HadoopModule [2].* in regards to your commen

Re: Custom File Sink using EventTime and defined custom file name for parquet file

2020-01-13 Thread Leonard Xu
Hi, David For your first description, I’m a little confused about duplicated records when backfilling, could you describe your usage scenario/code more? I remember a backfill user solution from Pinterest which is very similar to yours and uses Flink too [1], hope that can help you. Best, Leo

PubSub source throwing grpc errors

2020-01-13 Thread Itamar Syn-Hershko
Hi all, We are trying to use the PubSub source with a very minimal and basic Flink application as a POC, and getting the following error consistently every couple of seconds. What am I missing? ``` io.grpc.internal.ManagedChannelOrphanWrapper$ManagedChannelReference cleanQueue SEVERE: *~*~*~ Chan

Incorrect Physical Plan when unioning two different windows, giving incorrect SQL query results

2020-01-13 Thread Benoit Hanotte
Hello, We seem to be facing an issue with Flink where the physical plan after planner optimization is not correct. I have been able to reproduce the issue in the following "simplified" use case (it doesn't seem to happen in trivial cases): 1. We open 2 event streams ("clicks" and "displays")

Re: How can I find out which key group belongs to which subtask

2020-01-13 Thread Till Rohrmann
This feature won't be more public than it is today. Cheers, Till On Fri, Jan 10, 2020 at 9:51 PM 杨东晓 wrote: > Thanks Till, I will do some tests on this. Will this be a public > feature in the next release version or later? > > On Fri, Jan 10, 2020 at 6:15 AM, Till Rohrmann wrote: > >> Hi, >> >> you would n
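As background to the question in this thread's subject, here is a minimal sketch of the arithmetic that maps a key group to a subtask index. It assumes the formula used by Flink's internal KeyGroupRangeAssignment class (key group id times parallelism, integer-divided by max parallelism); since that class is internal, the formula may change between releases. The class and method names below are hypothetical, and the step that hashes a key into a key group id is deliberately left out.

```java
// Illustrative sketch only: KeyGroupSketch is a hypothetical name. The formula
// mirrors what Flink's internal KeyGroupRangeAssignment is assumed to compute.
class KeyGroupSketch {
    // Maps a key group id in [0, maxParallelism) to a subtask index
    // in [0, parallelism) using integer division.
    static int operatorIndexForKeyGroup(int maxParallelism, int parallelism, int keyGroupId) {
        if (keyGroupId < 0 || keyGroupId >= maxParallelism) {
            throw new IllegalArgumentException("key group id out of range");
        }
        if (parallelism <= 0 || parallelism > maxParallelism) {
            throw new IllegalArgumentException("parallelism out of range");
        }
        return keyGroupId * parallelism / maxParallelism;
    }

    public static void main(String[] args) {
        int maxParallelism = 128; // assumed value for the example
        int parallelism = 4;      // assumed value for the example
        for (int keyGroup : new int[] {0, 31, 32, 127}) {
            System.out.println("key group " + keyGroup + " -> subtask "
                    + operatorIndexForKeyGroup(maxParallelism, parallelism, keyGroup));
        }
    }
}
```

With these assumed values each subtask owns a contiguous range of 32 key groups, so a key group's owner can be read off without running the job.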

[ANNOUNCE] Flink Forward San Francisco 2020 Call for Presentation extended!

2020-01-13 Thread Fabian Hueske
Hi everyone, We know some of you only came back from holidays last week. To give you more time to submit a talk, we decided to extend the Call for Presentations for Flink Forward San Francisco 2020 until Sunday January 19th. The conference takes place on March 23-25 with two days of talks and one

Re: Yarn Kerberos issue

2020-01-13 Thread Juan Gentile
Thank you Rong We believe that the job (or scheduler) launching Flink should be the one responsible for renewing the DT. Here is some documentation that could be useful regarding Spark https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/security/README.md#dt-