Re: submit job failed on Yarn HA

2019-02-27 Thread Gary Yao
Hi, How did you determine "jmhost" and "port"? Actually you do not need to specify these manually. If the client is using the same configuration as your cluster, the client will look up the leading JM from ZooKeeper. If you have already tried omitting the "-m" parameter, you can check in the clie

Re: flink list and flink run commands timeout

2019-02-27 Thread Gary Yao
Hi Sen Sun, The question is already resolved. You can find the entire email thread here: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/flink-list-and-flink-run-commands-timeout-td22826.html Best, Gary On Wed, Feb 27, 2019 at 7:55 AM sen wrote: > Hi Aneesha: > > I

Re: okio and okhttp not shaded in the Flink Uber Jar on EMR

2019-02-27 Thread Gary Yao
Hi, Actually Flink's inverted class loading feature was designed to mitigate problems with different versions of libraries that are not compatible with each other [1]. You may want to debug why it does not work for you. You can also try to use the Hadoop free Flink distribution, and export the HA

Re: submit job failed on Yarn HA

2019-02-28 Thread Gary Yao
st the rest api : http://activeRm/proxy/appId/jars > > > The all client log is in the mail attachment. > > > > > 在 2019年2月27日,下午9:30,Gary Yao 写道: > > Hi, > > How did you determine "jmhost" and "port"? Actually you do not need to > specify

Re: okio and okhttp not shaded in the Flink Uber Jar on EMR

2019-02-28 Thread Gary Yao
wrote: > Thanks Gary, > > I will try to look into why the child-first strategy seems to have failed > for this dependency. > > Best, > Austin > > On Wed, Feb 27, 2019 at 12:25 PM Gary Yao wrote: > >> Hi, >> >> Actually Flink's inverted class loading

Re: submit job failed on Yarn HA

2019-03-04 Thread Gary Yao
/state_backends.html#available-state-backends On Mon, Mar 4, 2019 at 10:01 AM 孙森 wrote: > Hi Gary: > > > Yes, I enable the checkpoints in my program . > > 在 2019年3月4日,上午3:03,Gary Yao 写道: > > Hi Sen, > > Did you set a restart strategy [1]? If you enabled checkpo

Re: submit job failed on Yarn HA

2019-03-04 Thread Gary Yao
link-docs-release-1.7/ops/deployment/yarn_setup.html#recovery-behavior-of-flink-on-yarn On Tue, Mar 5, 2019 at 7:41 AM 孙森 wrote: > Hi Gary: > I used FsStateBackend . > > > The jm log is here: > > > After restart , the log is : > > > > > Best! > Sen &g

Re: submit job failed on Yarn HA

2019-03-05 Thread Gary Yao
fixed as soon as possible. > Best! > Sen > > 在 2019年3月5日,下午3:15,Gary Yao 写道: > > Hi Sen, > > In that email I meant that you should disable the ZooKeeper configuration > in > the CLI because the CLI had troubles resolving the leader from ZooKeeper. > What > you sh

Re: K8s job cluster and cancel and resume from a save point ?

2019-03-12 Thread Gary Yao
Hi Vishal, This issue was fixed recently [1], and the patch will be released with 1.8. If the Flink job gets cancelled, the JVM should exit with code 0. There is a release candidate [2], which you can test. Best, Gary [1] https://issues.apache.org/jira/browse/FLINK-10743 [2] http://apache-flink-

Re: K8s job cluster and cancel and resume from a save point ?

2019-03-12 Thread Gary Yao
of this > release ? > > On Tue, Mar 12, 2019 at 10:27 AM Gary Yao wrote: > >> Hi Vishal, >> >> This issue was fixed recently [1], and the patch will be released with >> 1.8. If >> the Flink job gets cancelled, the JVM should exit with code 0. There

Re: K8s job cluster and cancel and resume from a save point ?

2019-03-12 Thread Gary Yao
Mar 12, 2019 at 10:32 AM Vishal Santoshi < > vishal.santo...@gmail.com> wrote: > >> :) That makes so much more sense. Is k8s native flink a part of this >> release ? >> >> On Tue, Mar 12, 2019 at 10:27 AM Gary Yao wrote: >> >>> Hi Vishal, >&g

Re: local disk cleanup after crash

2019-03-12 Thread Gary Yao
Hi, If no other TaskManager (TM) is running, you can delete everything. If multiple TMs share the same host, as far as I know, you will have to parse TM logs to know what directories you can delete [1]. As for local recovery, tasks that were running on a crashed TM are lost. From the documentation

Re: K8s job cluster and cancel and resume from a save point ?

2019-03-12 Thread Gary Yao
could test it for you. > > Without 1.8and this exit code we are essentially held up. > > On Tue, Mar 12, 2019 at 10:56 AM Gary Yao wrote: > >> Nobody can tell with 100% certainty. We want to give the RC some exposure >> first, and there is also a release process th

Re: Flink 1.7.2: Task Manager not able to connect to Job Manager

2019-03-14 Thread Gary Yao
Hi Harshith, Can you share JM and TM logs? Best, Gary On Thu, Mar 14, 2019 at 3:42 PM Kumar Bolar, Harshith wrote: > Hi all, > > > > I'm trying to upgrade our Flink cluster from 1.4.2 to 1.7.2 > > > > When I bring up the cluster, the task managers refuse to connect to the > job managers with t

Re: Re: Flink 1.7.2: Task Manager not able to connect to Job Manager

2019-03-14 Thread Gary Yao
otingTerminator - Shutting > down remote daemon. > 2019-03-14 11:47:35,952 INFO > akka.remote.RemoteActorRefProvider$RemotingTerminator - Remote > daemon shut down; proceeding with flushing remote transports. > 2019-03-14 11:47:35,959 INFO > akka.remote.RemoteAc

Re: Re: Re: Flink 1.7.2: Task Manager not able to connect to Job Manager

2019-03-15 Thread Gary Yao
1.us-east-1.abc.com:28945/user/resourcemanager > <http://fl...@flink0-1.flink1.us-east-1.abc.com:28945/user/resourcemanager> > under registration id 170ee6a00f80ee02ead0e88710093d77.* > > > > > > Thanks, > > Harshith > > > > *From: *Harshith Kumar Bolar

Re: Re: Re: Flink 1.7.2: Task Manager not able to connect to Job Manager

2019-03-15 Thread Gary Yao
I forgot to add line numbers to the first link in my previous email: https://github.com/apache/flink/blob/c6878aca6c5aeee46581b4d6744b31049db9de95/flink-dist/src/main/flink-bin/bin/jobmanager.sh#L21-L25 On Fri, Mar 15, 2019 at 8:08 AM Gary Yao wrote: > Hi Harshith, > > In the jobm

Re: Where does the logs in Flink GUI's Exception tab come from?

2019-03-15 Thread Gary Yao
Hi Averell, I think I have answered your question previously [1]. The bottom line is that the error is logged on INFO level in the ExecutionGraph [2]. However, your effective log level (of the root logger) is WARN. The log levels are ordered as follows [3]: TRACE < DEBUG < INFO < WARN < ERRO

Re: inconsistent module descriptor for hadoop uber jar (Flink 1.8)

2019-04-16 Thread Gary Yao
Hi, Can you describe how to reproduce this? Best, Gary On Mon, Apr 15, 2019 at 9:26 PM Hao Sun wrote: > Hi, I can not find the root cause of this, I think hadoop version is mixed > up between libs somehow. > > --- ERROR --- > java.text.ParseException: inconsistent module descriptor file found

[DISCUSS] Temporarily remove support for job rescaling via CLI action "modify"

2019-04-23 Thread Gary Yao
Hi all, As the subject states, I am proposing to temporarily remove support for changing the parallelism of a job via the following syntax [1]: ./bin/flink modify [job-id] -p [new-parallelism] This is an experimental feature that we introduced with the first rollout of FLIP-6 (Flink 1.5). Ho

Re: [DISCUSS] Temporarily remove support for job rescaling via CLI action "modify"

2019-04-24 Thread Gary Yao
or now. Actually some users are not aware of that > it’s > > > still experimental, and ask quite a lot about the problem it causes. > > > > > > Best, > > > Paul Lam > > > > > > 在 2019年4月24日,14:49,Stephan Ewen 写道: > > > > > >

Re: Flink CLI

2019-04-26 Thread Gary Yao
Hi Steve, (1) The CLI action you are looking for is called "modify" [1]. However, we want to temporarily disable this feature beginning from Flink 1.9 due to some caveats with it [2]. If you have objections, it would be appreciated if you could comment on the respective thread on

Re: [DISCUSS] Temporarily remove support for job rescaling via CLI action "modify"

2019-04-29 Thread Gary Yao
Since there were no objections so far, I will proceed with removing the code [1]. [1] https://issues.apache.org/jira/browse/FLINK-12312 On Wed, Apr 24, 2019 at 1:38 PM Gary Yao wrote: > The idea is to also remove the rescaling code in the JobMaster. This will > make > it easier to r

Re: Queryable State race condition or serialization errors?

2019-05-21 Thread Gary Yao
Hi Burgess Chen, If you are using MemoryStateBackend or FsStateBackend, you can observe race conditions on the state objects. However, the RocksDBStateBackend should be safe from these issues [1]. Best, Gary [1] https://ci.apache.org/projects/flink/flink-docs-release-1.6/dev/stream/state/queryab

Flink Dashboard stopped showing list of uploaded jars

2016-07-20 Thread Gary Yao
Hi all, I accidentally packaged a Flink Job for which the main method could not be looked up. This breaks the Flink Dashboard's job submission page (no jobs are displayed). I opened a ticket: https://issues.apache.org/jira/browse/FLINK-4236 Is there a way to recover from this without restartin

Sporadic exceptions when checkpointing to S3

2016-07-27 Thread Gary Yao
Hi all, I am using the filesystem state backend with checkpointing to S3. From the JobManager logs, I can see that it works most of the time, e.g., 2016-07-26 17:49:07,311 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering checkpoint 3 @ 1469555347310 2016-07-26 17

Re: Unsubscribe

2017-11-09 Thread Gary Yao
Hi Paolo, If you haven't done so already, you need to write to user-unsubscr...@flink.apache.org to unsubscribe. Best, Gary On Thu, Nov 9, 2017 at 5:28 PM, Paolo Cristofanelli < cristofanelli.pa...@gmail.com> wrote: > Hi, > I would like to unsubscribe from the mailing list. > > Best, > P

Re: all task managers reading from all kafka partitions

2017-11-17 Thread Gary Yao
Forgot to hit "reply all" in my last email. On Fri, Nov 17, 2017 at 8:26 PM, Gary Yao wrote: > Hi Robert, > > To get your desired behavior, you should start a single job with > parallelism set to 4. > > Flink does not rely on Kafka's consumer groups to d

Re: all task managers reading from all kafka partitions

2017-11-18 Thread Gary Yao
oncurency in one JVM. > > Thanks! > > > > > > > > > Оригинално писмо > > >От: Gary Yao g...@data-artisans.com > > >Относно: Re: all task managers reading from all kafka partitions > > >До: "r. r." > > >Из

Re: Performance of docker-flink

2017-12-07 Thread Gary Yao
Hi Jayant, Running Flink in a Docker container should not have an impact on the performance in itself. Docker does not employ virtualization. To put it simply, Docker containers are processes on the host operating system that are isolated against each other using kernel features. See [1] for a mor

Re: Trigger not firing when using BoundedOutOfOrdernessTimestampExtractor

2018-01-11 Thread Gary Yao
Hi Jayant, The difference is that the Watermarks from BoundedOutOfOrdernessTimestampExtractor are based on the greatest timestamp of all previous events. That is, if you do not receive new events, the Watermark will not advance. In contrast, your custom implementation of AssignerWithPeriodicWaterm

Re: Two issues when deploying Flink on DC/OS

2018-01-11 Thread Gary Yao
Hi Dongwon, I am not familiar with the deployment on DC/OS. However, Eron Wright and Jörg Schad (cc'd), who have worked on the Mesos integration, might be able to help you. Best, Gary On Tue, Jan 9, 2018 at 10:29 AM, Dongwon Kim wrote: > Hi, > > I've launched JobManager and TaskManager on DC/O

Re: Aggregation using event timestamp than clock window

2018-01-12 Thread Gary Yao
Hi Rohan, Your ReportTimestampExtractor assigns timestamps to the stream records correctly but uses the wall clock to emit Watermarks (System.currentTimeMillis). In Flink Watermarks are the mechanism to advance the event time. Hence, you should emit Watermarks according to the time that you extrac

Re: Aggregation using event timestamp than clock window

2018-01-14 Thread Gary Yao
the window and dispatch the report. Intention is i > don't want to loss the last hour of the data since the stream end in > between the hour. > > Rohan > > On Fri, Jan 12, 2018 at 12:00 AM, Gary Yao wrote: > >> Hi Rohan, >> >> Your ReportTimestampExt

Re: Aggregation using event timestamp than clock window

2018-01-15 Thread Gary Yao
i don't get > message for the given id then i would like to close the window and send > this to destination system(in my case kafka topic.) > > > > > Rohan > > On Sun, Jan 14, 2018 at 1:00 PM, Gary Yao wrote: > >> Hi Rohan, >> >> I am not sure if

Re: Far too few watermarks getting generated with Kafka source

2018-01-18 Thread Gary Yao
Hi William, How often does the Watermark get updated? Can you share your code that generates the watermarks? Watermarks should be strictly ascending. If your code produces watermarks that are not ascending, smaller ones will be discarded. Could it be that the events in Kafka are more "out of order

Re: Flip-6 + Dynamic scaling

2018-01-26 Thread Gary Yao
Hi Jayant, I am working on FLIP-6, and I can tell you that we are aiming at shipping it with Flink 1.5. For the scope and release timeline see this blog post: https://flink.apache.org/news/2017/11/22/release-1.4-and-1.5-timeline.html Best, Gary On Wed, Jan 24, 2018 at 7:02 AM, Jayant Ameta w

Re: Extending Flink Slots when running on Yarn

2018-02-02 Thread Gary Yao
Hi Julio, When you start the Flink YARN session, you have to specify the number of TaskManagers and number of slots per TaskManager. There is currently no officially supported way to add more TaskManagers to a long running YARN session. We are aware of this limitation, and there are ongoing develo

Re: Flink REST API

2018-02-02 Thread Gary Yao
Hi Raja, The registered tracking URL of the YARN application can be used to issue HTTP requests against the REST API. You can retrieve the URL by using the YARN client: yarn application -list In the output, the rightmost column shows the URL, e.g., Application-Id ... Trac

Re: [EXTERNAL] Re: Flink REST API

2018-02-09 Thread Gary Yao
erving in YARN Resource Manager. > Not sure, If the Hadoop environment I am on, is using some proxies or etc. > > Do you have any thoughts on why this is happening different in YARN > command line Vs YARN Resource Manager ? > > > > > > Thanks a lot again. > > > &

Re: [EXTERNAL] Re: Flink REST API

2018-02-11 Thread Gary Yao
ication -list”* > > > > Thanks a lot. > > > > Regards, > > Raja. > > > > *From: *Gary Yao > *Date: *Friday, February 9, 2018 at 9:25 AM > *To: *Raja Aravapalli > *Cc: *"user@flink.apache.org" > *Subject: *Re: [EXTERNAL] Re: Flink REST AP

Re: Unable to submit job from the web UI in flink 4.0

2018-02-13 Thread Gary Yao
.com/questions/48158617/flink-cannot-submit-new-job>* > > > On Mon, Feb 12, 2018 at 12:23 PM, Gary Yao wrote: > >> Hi Puneet, >> >> Are you not able to upload the jars? If the jar upload already fails, can >> you >> try to upload from the command li

Re: Job is be cancelled, but the stdout log still prints

2018-03-08 Thread Gary Yao
Hi, You are not shutting down the ScheduledExecutorService [1], which means that after job cancelation the thread will continue running getLiveInfo(). The user code class loader, and your classes won't be garbage collected. You should use the RichFunction#close callback to shutdown your thread poo

Re: Flink UI not responding on Yarn + Flink

2018-03-08 Thread Gary Yao
Hi Samar, Can you share the JobManager and TaskManager logs returned by: yarn logs -applicationId ? Is your browser rendering a blank page, or does the HTTP request not finish? Can you show the output of one of the following commands: curl -v http://host:port curl -v http://host:port/jobs

Re: Change conf.yaml properties flink docker

2018-03-15 Thread Gary Yao
Hi Miki, A custom image is not needed to do that. You can mount a directory containing a custom flink-conf.yaml [1], and set the environment variable FLINK_CONF_DIR to point to that directory [2][3]. Best, Gary [1] https://docs.docker.com/storage/volumes/ [2] https://docs.docker.com/engine/refer

Re: Apache Zookeeper vs Flink Zookeeper

2018-03-21 Thread Gary Yao
Hi Alex, You can use vanilla Apache ZooKeeper. The class FlinkZooKeeperQuorumPeer is only used if you start ZooKeeper via the provided script bin/zookeeper.sh. FlinkZooKeeperQuorumPeer does not add any functionality except creating ZooKeeper's myid file. Best, Gary On Wed, Mar 21, 2018 at 12:02

Re: NoClassDefFoundError for jersey-core on YARN

2018-03-28 Thread Gary Yao
Hi Juho, Can you try submitting with HADOOP_CLASSPATH=`hadoop classpath` set? [1] For example: HADOOP_CLASSPATH=`hadoop classpath` link-${FLINK_VERSION}/bin/flink run [...] Best, Gary [1] https://ci.apache.org/projects/flink/flink-docs-master/ops/deployment/hadoop.html#configuring-flink-with-h

Re: cancel-with-savepoint: 404 Not Found

2018-03-29 Thread Gary Yao
Hi Juho, Thank you for testing the Apache Flink 1.5 release. For FLIP-6 [1], the "cancel with savepoint" API was reworked. Unfortunately the FLIP-6 REST API documentation still needs to be re-generated [2][3]. Under the new API, you first issue a POST request against /jobs/:jobid/savepoints, and

Re: cancel-with-savepoint: 404 Not Found

2018-03-29 Thread Gary Yao
e-1.5/flink-runti > me/src/main/java/org/apache/flink/runtime/rest/handler/ > job/savepoints/SavepointHandlers.java#L59 > > > On Thu, Mar 29, 2018 at 11:25 AM, Gary Yao wrote: > >> Hi Juho, >> >> Thank you for testing the Apache Flink 1.5 release. >> >

Re: All sub-tasks stuck in CREATED state after "BlobServerConnection: GET operation failed"

2018-03-29 Thread Gary Yao
Hi Juho, The log message Could not allocate all requires slots within timeout of 30 ms. Slots required: 20, slots allocated: 8 indicates that you do not have enough resources in your cluster left. Can you verify that after you started the job submission the YARN cluster does not reach its

Re: All sub-tasks stuck in CREATED state after "BlobServerConnection: GET operation failed"

2018-03-29 Thread Gary Yao
8 >> 15:27 and I terminated it at ~28-03-2018 15:36. >> >> Do you happen to know about what that BlobServerConnection error means in >> the code? If it may lead into some unrecoverable state (where neither >> restart is attempted, nor job is failed for good).. >>

Re: State management and heap usage

2018-04-12 Thread Gary Yao
Hi Michael, You can configure the default state backend by setting state.backend in flink-conf.yaml, or you can configure it per job [1]. The default state backend is "jobmanager" (MemoryStateBackend), which stores state and checkpoints on the Java heap. RocksDB must be explicitly enabled, e.g., b

Re: Access logs for a running Flink app in YARN cluster

2018-04-13 Thread Gary Yao
Hi, I see two options: 1. You can login to the slave machines, which run the NodeManagers, and access the container logs. The path of the container logs can be configured in yarn-site.xml with the key yarn.nodemanager.log-dirs. In my tests with EMR, the logs are stored at /var/log/hadoop-yarn/con

Re: Old Flink jobs restarting on Job Manager failover

2018-04-15 Thread Gary Yao
Hi Steve, What is the Flink version you are using? Jobs are recovered from metadata stored in ZooKeeper. The behavior you describe indicates that the submitted job graph is not deleted from ZooKeeper. By default, the jobs that should be running/recovered are stored in znode: /flink/default/job

Re: rest.port is reset to 0 by YarnEntrypointUtils

2018-04-18 Thread Gary Yao
Hi Dongwon, I think the rationale was to avoid conflicts between multiple Flink instances running on the same YARN cluster. There is a ticket that proposes to allow configuring a port range instead [1]. Best, Gary [1] https://issues.apache.org/jira/browse/FLINK-5758 On Tue, Apr 17, 2018 at 9:56

Re: Testing on Flink 1.5

2018-04-19 Thread Gary Yao
Hi Amit, Thank you for the follow up. What you describe sounds like a bug but I am not able to reproduce it. Can you open an issue in Jira with an outline of your code and how you submit the job? > Could you also recommend us the best practice in FLIP6, should we use YARN session or submit jobs i

Re: masters file only needed when using start-cluster.sh script?

2018-04-20 Thread Gary Yao
Hi David, You are right. If you don't use start-cluster.sh, the conf/masters file is not needed. Best, Gary On Wed, Apr 18, 2018 at 8:25 AM, David Corley wrote: > The HA documentation is a little confusing in that it suggests JM > registration and discovery is done via Zookeeper, but it also r

Re: How to run flip-6 on mesos

2018-04-23 Thread Gary Yao
Hi Miki, Did you try to submit a job? With the introduction of FLIP-6, resources are allocated dynamically. Best, Gary On Tue, Apr 24, 2018 at 8:31 AM, miki haiat wrote: > > HI, > Im trying to tun flip-6 on mesos but its not clear to me what is the > correct way to do it . > > I run the sessi

Re: akka.remote.OversizedPayloadException

2018-04-23 Thread Gary Yao
Hi Alex, Can you try add the following two lines to your flink-conf.yaml? akka.framesize: 314572800b akka.client.timeout: 10min AFAIK it is not needed to use Java system properties here. Best, Gary On Mon, Apr 23, 2018 at 8:48 PM, Alex Soto wrote: > Hello, > > I am using Flink version 1.

Re: How to run flip-6 on mesos

2018-04-24 Thread Gary Yao
ng the web UI . > Can you refer me to some example how to submit a job ? > Using REST ? to which port ? > > thanks, > > miki > > On Tue, Apr 24, 2018 at 9:42 AM, Gary Yao wrote: > >> Hi Miki, >> >> Did you try to submit a job? With the introduction o

Re: Testing on Flink 1.5

2018-04-24 Thread Gary Yao
Hi Amit, web.timeout should only affect RPC calls originating from the REST API. In FLIP-6, the submission of the job graph happens via HTTP. The value under akka.ask.timeout is still used as the default timeout for RPC calls [1][2]. Since you also had custom heartbeats settings, you should consid

Re: How to run flip-6 on mesos

2018-04-24 Thread Gary Yao
; java:260) >> at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask( >> ForkJoinPool.java:1339) >> at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPoo >> l.java:1979) >> at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinW &

Re: How to run flip-6 on mesos

2018-04-25 Thread Gary Yao
e date for 1.5 release ? > > Thanks allot for your help . > > > > On Tue, Apr 24, 2018 at 10:39 AM, Gary Yao wrote: > >> Hi Miki, >> >> The stacktrace you posted looks familiar [1]. We have fixed the issue in >> Flink >> 1.5. What is the Flink

Re: Cannot submit jobs on a HA Standalone JobManager

2018-05-03 Thread Gary Yao
Hi Julio, Are you using the -m flag of "bin/flink run" by any chance? In HA mode, you cannot manually specify the JobManager address. The client determines the leader through ZooKeeper. If you did not configure the ZooKeeper quorum in the flink-conf.yaml on the machine from which you are submittin

Re: How to set fix JobId for my application.

2018-05-03 Thread Gary Yao
Hi, AFAIK it is not possible to set the job id manually. You could get the job id via the REST API. For example, http://host:port/joboverview/running gives you a list of running jobs [1] which you can filter by name. Would that work for you? Best, Gary [1] https://ci.apache.org/projects/flink/f

Re: IOException: Size of the state is larger than the maximum permitted memory-backed state

2018-05-03 Thread Gary Yao
Hi, It looks like you are still using the MemoryStateBackend. Are you overriding the state backend settings from within your job? To debug this, it would be helpful to see the JobManager logs and the contents of your flink-conf.yaml Best, Gary On Wed, May 2, 2018 at 3:25 AM, syed wrote: > Hi;

Re: How to set fix JobId for my application.

2018-05-05 Thread Gary Yao
Hi, If I understand correctly, you are using Queryable State to access state of one job from another one. To avoid redeployment, some operator in application B would need to regularly poll the REST API to discover changes of the job id of application A. It is doable but not advised to use Queryabl

Re: Cannot submit jobs on a HA Standalone JobManager

2018-05-05 Thread Gary Yao
ace can redirect from the >> secondary back to primary. >> >> Currently I'm still running 1.4.0 (and I plan to upgrade to 1.4.2 as soon >> as I can fix this). >> >> I'll try again with the HA/ZooKeeper properly set up on my machine and, >> if it s

Re: Flink 1.5 release timing

2018-05-10 Thread Gary Yao
Hi Ken, If you haven't seen it already, the 1.5 release candidate #2 is out since yesterday [1][2]. [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Release-1-5-0-release-candidate-2-td22197.html [2] http://people.apache.org/~trohrmann/flink-1.5.0-rc2/ Best, Gary On Wed,

Re: FLIP-6 Docker / Kubernetes

2018-05-10 Thread Gary Yao
Hi, Unfortunately, there won't be native support for Kubernetes in Flink 1.5 yet [1]. That is, you will still have to deploy a standalone session cluster and submit a job via the REST API. Can you open an issue in JIRA regarding the Blob Cache corruption? FLINK-5908 is closed already. Best, Gary

Re: Message guarantees with S3 Sink

2018-05-17 Thread Gary Yao
Hi Amit, The BucketingSink doesn't have well defined semantics when used with S3. Data loss is possible but I am not sure whether it is the only problem. There are plans to rewrite the BucketingSink in Flink 1.6 to enable eventually consistent file systems [1][2]. Best, Gary [1] http://apache-f

Re: HA stand alone cluster error

2018-06-06 Thread Gary Yao
Hi Miki, Sorry for the late reply. If you are able to reproduce the first problem, it would be good to see the complete JobManager logs. The second exception indicates that you have not removed all data from ZooKeeper. On recovery, Flink looks up the locations of the submitted JobGraphs in ZooKee

Re: Maven release

2017-10-18 Thread Gary Yao
Hi Biswajit, The distribution management configuration can be found in the parent pom of flink-parent: org.apache apache 18 apache.releases.https Apache Release Distribution Repository https://repository.apache.org/service/local/staging/deploy/maven2 apache.sna

<    1   2