Hi,
How did you determine "jmhost" and "port"? Actually, you do not need to
specify these manually. If the client is using the same configuration as your
cluster, the client will look up the leading JM from ZooKeeper.
If you have already tried omitting the "-m" parameter, you can check in the
client logs
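For reference, a minimal sketch of the HA client configuration in
flink-conf.yaml (the host names and cluster id are placeholders):

    high-availability: zookeeper
    high-availability.zookeeper.quorum: zk1:2181,zk2:2181,zk3:2181
    high-availability.cluster-id: /default

With this in place, "./bin/flink list" should resolve the leading JobManager
without the "-m" flag.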
Hi Sen Sun,
The question is already resolved. You can find the entire email thread here:
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/flink-list-and-flink-run-commands-timeout-td22826.html
Best,
Gary
On Wed, Feb 27, 2019 at 7:55 AM sen wrote:
> Hi Aneesha:
>
> I
Hi,
Actually, Flink's inverted class loading feature was designed to mitigate
problems with different versions of libraries that are not compatible with
each other [1]. You may want to debug why it does not work for you.
You can also try to use the Hadoop-free Flink distribution, and export the
HADOOP_CLASSPATH
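A sketch of what that typically looks like, assuming the hadoop command is on
the PATH:

    export HADOOP_CLASSPATH=`hadoop classpath`
    ./bin/flink run ...

Alternatively, the class loading strategy can be switched via
classloader.resolve-order in flink-conf.yaml (child-first is the default).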
st the REST API: http://activeRm/proxy/appId/jars
>
>
> All the client logs are in the mail attachment.
>
>
>
>
> On Feb 27, 2019, at 9:30 PM, Gary Yao wrote:
>
> Hi,
>
> How did you determine "jmhost" and "port"? Actually you do not need to
> specify
wrote:
> Thanks Gary,
>
> I will try to look into why the child-first strategy seems to have failed
> for this dependency.
>
> Best,
> Austin
>
> On Wed, Feb 27, 2019 at 12:25 PM Gary Yao wrote:
>
>> Hi,
>>
>> Actually Flink's inverted class loading
/state_backends.html#available-state-backends
On Mon, Mar 4, 2019 at 10:01 AM 孙森 wrote:
> Hi Gary:
>
>
> Yes, I enabled checkpointing in my program.
>
> On Mar 4, 2019, at 3:03 AM, Gary Yao wrote:
>
> Hi Sen,
>
> Did you set a restart strategy [1]? If you enabled checkpointing
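For example, a fixed-delay restart strategy can be set on the execution
environment (a sketch; assumes a StreamExecutionEnvironment named env and the
Flink 1.7 API):

    import org.apache.flink.api.common.restartstrategy.RestartStrategies;
    import org.apache.flink.api.common.time.Time;

    // restart at most 3 times, waiting 10 seconds between attempts
    env.setRestartStrategy(
        RestartStrategies.fixedDelayRestart(3, Time.seconds(10)));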
link-docs-release-1.7/ops/deployment/yarn_setup.html#recovery-behavior-of-flink-on-yarn
On Tue, Mar 5, 2019 at 7:41 AM 孙森 wrote:
> Hi Gary:
> I used FsStateBackend.
>
>
> The jm log is here:
>
>
> After restart, the log is:
>
>
>
>
> Best!
> Sen
fixed as soon as possible.
> Best!
> Sen
>
> On Mar 5, 2019, at 3:15 PM, Gary Yao wrote:
>
> Hi Sen,
>
> In that email I meant that you should disable the ZooKeeper configuration
> in the CLI because the CLI had trouble resolving the leader from ZooKeeper.
> What you sh
Hi Vishal,
This issue was fixed recently [1], and the patch will be released with 1.8.
If the Flink job gets cancelled, the JVM should exit with code 0. There is a
release candidate [2], which you can test.
Best,
Gary
[1] https://issues.apache.org/jira/browse/FLINK-10743
[2]
http://apache-flink-
of this
> release?
>
> On Tue, Mar 12, 2019 at 10:27 AM Gary Yao wrote:
>
>> Hi Vishal,
>>
>> This issue was fixed recently [1], and the patch will be released with
>> 1.8. If
>> the Flink job gets cancelled, the JVM should exit with code 0. There
Mar 12, 2019 at 10:32 AM Vishal Santoshi <vishal.santo...@gmail.com> wrote:
>
>> :) That makes so much more sense. Is k8s-native Flink a part of this
>> release?
>>
>> On Tue, Mar 12, 2019 at 10:27 AM Gary Yao wrote:
>>
>>> Hi Vishal,
Hi,
If no other TaskManager (TM) is running, you can delete everything. If
multiple TMs share the same host, as far as I know, you will have to parse
TM logs to know what directories you can delete [1]. As for local recovery,
tasks that were running on a crashed TM are lost. From the documentation
could test it for you.
>
> Without 1.8 and this exit code, we are essentially held up.
>
> On Tue, Mar 12, 2019 at 10:56 AM Gary Yao wrote:
>
>> Nobody can tell with 100% certainty. We want to give the RC some exposure
>> first, and there is also a release process th
Hi Harshith,
Can you share JM and TM logs?
Best,
Gary
On Thu, Mar 14, 2019 at 3:42 PM Kumar Bolar, Harshith wrote:
> Hi all,
>
>
>
> I'm trying to upgrade our Flink cluster from 1.4.2 to 1.7.2
>
>
>
> When I bring up the cluster, the task managers refuse to connect to the
> job managers with t
> akka.remote.RemoteActorRefProvider$RemotingTerminator - Shutting down remote daemon.
> 2019-03-14 11:47:35,952 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator - Remote daemon shut down; proceeding with flushing remote transports.
> 2019-03-14 11:47:35,959 INFO  akka.remote.RemoteAc
1.us-east-1.abc.com:28945/user/resourcemanager
> under registration id 170ee6a00f80ee02ead0e88710093d77.*
>
>
>
>
>
> Thanks,
>
> Harshith
>
>
>
> *From: *Harshith Kumar Bolar
I forgot to add line numbers to the first link in my previous email:
https://github.com/apache/flink/blob/c6878aca6c5aeee46581b4d6744b31049db9de95/flink-dist/src/main/flink-bin/bin/jobmanager.sh#L21-L25
On Fri, Mar 15, 2019 at 8:08 AM Gary Yao wrote:
> Hi Harshith,
>
> In the jobm
Hi Averell,
I think I have answered your question previously [1]. The bottom line is that
the error is logged on INFO level in the ExecutionGraph [2]. However, your
effective log level (of the root logger) is WARN. The log levels are ordered
as follows [3]:
TRACE < DEBUG < INFO < WARN < ERROR
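For example, raising the root logger to INFO in conf/log4j.properties would
make those ExecutionGraph messages visible (a sketch of the stock Flink
configuration):

    log4j.rootLogger=INFO, file
    log4j.appender.file=org.apache.log4j.FileAppender
    log4j.appender.file.file=${log.file}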
Hi,
Can you describe how to reproduce this?
Best,
Gary
On Mon, Apr 15, 2019 at 9:26 PM Hao Sun wrote:
> Hi, I cannot find the root cause of this. I think the Hadoop version is mixed
> up between libs somehow.
>
> --- ERROR ---
> java.text.ParseException: inconsistent module descriptor file found
Hi all,
As the subject states, I am proposing to temporarily remove support for
changing the parallelism of a job via the following syntax [1]:
./bin/flink modify [job-id] -p [new-parallelism]
This is an experimental feature that we introduced with the first rollout of
FLIP-6 (Flink 1.5). Ho
or now. Actually, some users are not aware that it's
> > > still experimental, and ask quite a lot about the problems it causes.
> > >
> > > Best,
> > > Paul Lam
> > >
> > > On Apr 24, 2019, at 14:49, Stephan Ewen wrote:
> > >
> > >
Hi Steve,
(1)
The CLI action you are looking for is called "modify" [1]. However, we want
to temporarily disable this feature beginning from Flink 1.9 due to some
caveats with it [2]. If you have objections, it would be appreciated if you
could comment on the respective thread on
Since there were no objections so far, I will proceed with removing the
code [1].
[1] https://issues.apache.org/jira/browse/FLINK-12312
On Wed, Apr 24, 2019 at 1:38 PM Gary Yao wrote:
> The idea is to also remove the rescaling code in the JobMaster. This will
> make
> it easier to r
Hi Burgess Chen,
If you are using MemoryStateBackend or FsStateBackend, you can observe race
conditions on the state objects. However, the RocksDBStateBackend should be
safe from these issues [1].
Best,
Gary
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.6/dev/stream/state/queryab
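For example, enabling RocksDB explicitly (a sketch; assumes the
flink-statebackend-rocksdb dependency is on the classpath, and the checkpoint
path is a placeholder):

    import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;

    // state lives in RocksDB; queries only see serialized copies
    env.setStateBackend(new RocksDBStateBackend("hdfs:///flink/checkpoints"));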
Hi all,
I accidentally packaged a Flink Job for which the main method could not
be looked up. This breaks the Flink Dashboard's job submission page
(no jobs are displayed). I opened a ticket:
https://issues.apache.org/jira/browse/FLINK-4236
Is there a way to recover from this without restartin
Hi all,
I am using the filesystem state backend with checkpointing to S3.
From the JobManager logs, I can see that it works most of the time, e.g.,
2016-07-26 17:49:07,311 INFO
org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering
checkpoint 3 @ 1469555347310
2016-07-26 17
Hi Paolo,
If you haven't done so already, you need to write to
user-unsubscr...@flink.apache.org
to unsubscribe.
Best,
Gary
On Thu, Nov 9, 2017 at 5:28 PM, Paolo Cristofanelli <cristofanelli.pa...@gmail.com> wrote:
> Hi,
> I would like to unsubscribe from the mailing list.
>
> Best,
> Paolo
Forgot to hit "reply all" in my last email.
On Fri, Nov 17, 2017 at 8:26 PM, Gary Yao wrote:
> Hi Robert,
>
> To get your desired behavior, you should start a single job with
> parallelism set to 4.
>
> Flink does not rely on Kafka's consumer groups to d
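For completeness, starting one job with parallelism 4 instead of four separate
jobs (a sketch; the jar name is a placeholder):

    ./bin/flink run -p 4 your-job.jar

Flink then distributes the Kafka partitions over the four parallel source
subtasks itself.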
concurrency in one JVM.
>
> Thanks!
>
>
>
>
>
>
>
> > Original message
>
> > From: Gary Yao g...@data-artisans.com
>
> > Subject: Re: all task managers reading from all kafka partitions
>
> > To: "r. r."
>
> > Sent
Hi Jayant,
Running Flink in a Docker container should not have an impact on performance
in itself. Docker does not employ virtualization. To put it simply, Docker
containers are processes on the host operating system that are isolated
against each other using kernel features. See [1] for a more detailed
explanation.
Hi Jayant,
The difference is that the Watermarks from
BoundedOutOfOrdernessTimestampExtractor are based on the greatest timestamp of
all previous events. That is, if you do not receive new events, the Watermark
will not advance. In contrast, your custom implementation of
AssignerWithPeriodicWatermarks
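To illustrate, a minimal sketch of the extractor (MyEvent and its timestamp
accessor are placeholders):

    stream.assignTimestampsAndWatermarks(
        new BoundedOutOfOrdernessTimestampExtractor<MyEvent>(Time.seconds(10)) {
            @Override
            public long extractTimestamp(MyEvent event) {
                // the watermark trails the greatest timestamp seen so far
                return event.getTimestamp();
            }
        });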
Hi Dongwon,
I am not familiar with the deployment on DC/OS. However, Eron Wright and
Jörg Schad (cc'd), who have worked on the Mesos integration, might be able
to help you.
Best,
Gary
On Tue, Jan 9, 2018 at 10:29 AM, Dongwon Kim wrote:
> Hi,
>
> I've launched JobManager and TaskManager on DC/O
Hi Rohan,
Your ReportTimestampExtractor assigns timestamps to the stream records
correctly, but uses the wall clock to emit Watermarks (System.currentTimeMillis).
In Flink, Watermarks are the mechanism to advance event time. Hence, you should
emit Watermarks according to the time that you extract from the events.
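A sketch of an event-time based assigner (the Report type and its timestamp
accessor are placeholders):

    public class ReportTimestampExtractor
            implements AssignerWithPeriodicWatermarks<Report> {

        private long maxTimestamp = Long.MIN_VALUE + 1;

        @Override
        public long extractTimestamp(Report report, long previousTimestamp) {
            long ts = report.getTimestamp();
            maxTimestamp = Math.max(maxTimestamp, ts);
            return ts;
        }

        @Override
        public Watermark getCurrentWatermark() {
            // advance event time from the extracted timestamps,
            // not from System.currentTimeMillis()
            return new Watermark(maxTimestamp - 1);
        }
    }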
the window and dispatch the report. The intention is that I
> don't want to lose the last hour of data since the stream ends in
> between the hour.
>
> Rohan
>
> On Fri, Jan 12, 2018 at 12:00 AM, Gary Yao wrote:
>
>> Hi Rohan,
>>
>> Your ReportTimestampExt
i don't get a
> message for the given id, then I would like to close the window and send
> this to the destination system (in my case a Kafka topic).
>
>
>
>
> Rohan
>
> On Sun, Jan 14, 2018 at 1:00 PM, Gary Yao wrote:
>
>> Hi Rohan,
>>
>> I am not sure if
Hi William,
How often does the Watermark get updated? Can you share your code that
generates the watermarks? Watermarks should be strictly ascending. If your
code produces watermarks that are not ascending, smaller ones will be
discarded. Could it be that the events in Kafka are more "out of order"
Hi Jayant,
I am working on FLIP-6, and I can tell you that we are aiming at shipping it
with Flink 1.5. For the scope and release timeline see this blog post:
https://flink.apache.org/news/2017/11/22/release-1.4-and-1.5-timeline.html
Best,
Gary
On Wed, Jan 24, 2018 at 7:02 AM, Jayant Ameta w
Hi Julio,
When you start the Flink YARN session, you have to specify the number of
TaskManagers and the number of slots per TaskManager. There is currently no
officially supported way to add more TaskManagers to a long-running YARN
session. We are aware of this limitation, and there are ongoing develo
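For reference, the session is sized once at startup (a sketch; the numbers are
placeholders):

    ./bin/yarn-session.sh -n 4 -s 2 -jm 1024 -tm 4096

where -n is the number of TaskManagers and -s the number of slots per
TaskManager.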
Hi Raja,
The registered tracking URL of the YARN application can be used to issue
HTTP
requests against the REST API. You can retrieve the URL by using the YARN
client:
yarn application -list
In the output, the rightmost column shows the URL, e.g.,
Application-Id ... Trac
erving in YARN Resource Manager.
> Not sure if the Hadoop environment I am on is using some proxies, etc.
>
> Do you have any thoughts on why this behaves differently in the YARN
> command line vs. the YARN Resource Manager?
>
>
>
>
>
> Thanks a lot again.
>
>
>
ication -list”*
>
>
>
> Thanks a lot.
>
>
>
> Regards,
>
> Raja.
>
>
>
> *From: *Gary Yao
> *Date: *Friday, February 9, 2018 at 9:25 AM
> *To: *Raja Aravapalli
> *Cc: *"user@flink.apache.org"
> *Subject: *Re: [EXTERNAL] Re: Flink REST AP
.com/questions/48158617/flink-cannot-submit-new-job>*
>
>
> On Mon, Feb 12, 2018 at 12:23 PM, Gary Yao wrote:
>
>> Hi Puneet,
>>
>> Are you not able to upload the jars? If the jar upload already fails, can
>> you try to upload from the command line
Hi,
You are not shutting down the ScheduledExecutorService [1], which means that
after job cancellation the thread will continue running getLiveInfo(). The
user code class loader and your classes won't be garbage collected. You
should use the RichFunction#close callback to shut down your thread pool.
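A sketch of the close() callback (assumes a RichMapFunction; all names are
placeholders):

    public class LiveInfoFunction extends RichMapFunction<String, String> {

        private transient ScheduledExecutorService executor;

        @Override
        public void open(Configuration parameters) {
            executor = Executors.newSingleThreadScheduledExecutor();
            // schedule getLiveInfo() here
        }

        @Override
        public void close() {
            // called on cancellation; lets the classes be garbage collected
            if (executor != null) {
                executor.shutdownNow();
            }
        }

        @Override
        public String map(String value) {
            return value;
        }
    }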
Hi Samar,
Can you share the JobManager and TaskManager logs returned by:
yarn logs -applicationId ?
Is your browser rendering a blank page, or does the HTTP request not finish?
Can you show the output of one of the following commands:
curl -v http://host:port
curl -v http://host:port/jobs
Hi Miki,
A custom image is not needed to do that. You can mount a directory containing
a custom flink-conf.yaml [1], and set the environment variable FLINK_CONF_DIR
to point to that directory [2][3].
Best,
Gary
[1] https://docs.docker.com/storage/volumes/
[2] https://docs.docker.com/engine/refer
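Putting both together, a sketch (the paths and image tag are placeholders):

    docker run \
      -v /host/path/conf:/opt/flink/conf-custom \
      -e FLINK_CONF_DIR=/opt/flink/conf-custom \
      flink:1.5 jobmanager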
Hi Alex,
You can use vanilla Apache ZooKeeper. The class FlinkZooKeeperQuorumPeer is
only used if you start ZooKeeper via the provided script bin/zookeeper.sh.
FlinkZooKeeperQuorumPeer does not add any functionality except creating
ZooKeeper's myid file.
Best,
Gary
On Wed, Mar 21, 2018 at 12:02
Hi Juho,
Can you try submitting with HADOOP_CLASSPATH=`hadoop classpath` set? [1]
For example:
HADOOP_CLASSPATH=`hadoop classpath` flink-${FLINK_VERSION}/bin/flink run [...]
Best,
Gary
[1]
https://ci.apache.org/projects/flink/flink-docs-master/ops/deployment/hadoop.html#configuring-flink-with-h
Hi Juho,
Thank you for testing the Apache Flink 1.5 release.
For FLIP-6 [1], the "cancel with savepoint" API was reworked. Unfortunately,
the FLIP-6 REST API documentation still needs to be re-generated [2][3]. Under
the new API, you first issue a POST request against /jobs/:jobid/savepoints, and
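A sketch of the new request flow (host, port, and paths are placeholders):

    # trigger a savepoint, optionally cancelling the job
    curl -X POST -H "Content-Type: application/json" \
      -d '{"target-directory": "hdfs:///savepoints", "cancel-job": true}' \
      http://jobmanager:8081/jobs/<jobid>/savepoints

    # poll the returned trigger id for the result
    curl http://jobmanager:8081/jobs/<jobid>/savepoints/<triggerid>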
e-1.5/flink-runti
> me/src/main/java/org/apache/flink/runtime/rest/handler/
> job/savepoints/SavepointHandlers.java#L59
>
>
> On Thu, Mar 29, 2018 at 11:25 AM, Gary Yao wrote:
>
>> Hi Juho,
>>
>> Thank you for testing the Apache Flink 1.5 release.
>>
>
Hi Juho,
The log message
Could not allocate all requires slots within timeout of 30 ms. Slots
required: 20, slots allocated: 8
indicates that you do not have enough resources left in your cluster. Can you
verify that after you started the job submission the YARN cluster does not
reach its
8
>> 15:27 and I terminated it at ~28-03-2018 15:36.
>>
>> Do you happen to know about what that BlobServerConnection error means in
>> the code? If it may lead into some unrecoverable state (where neither
>> restart is attempted, nor job is failed for good)..
>>
Hi Michael,
You can configure the default state backend by setting state.backend in
flink-conf.yaml, or you can configure it per job [1]. The default state
backend
is "jobmanager" (MemoryStateBackend), which stores state and checkpoints on
the
Java heap. RocksDB must be explicitly enabled, e.g., b
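For example, in flink-conf.yaml (the checkpoint path is a placeholder):

    state.backend: rocksdb
    state.checkpoints.dir: hdfs:///flink/checkpoints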
Hi,
I see two options:
1. You can log in to the slave machines, which run the NodeManagers, and
access the container logs. The path of the container logs can be configured in
yarn-site.xml with the key yarn.nodemanager.log-dirs. In my tests with EMR,
the logs are stored at /var/log/hadoop-yarn/con
Hi Steve,
What is the Flink version you are using?
Jobs are recovered from metadata stored in ZooKeeper. The behavior you
describe
indicates that the submitted job graph is not deleted from ZooKeeper. By
default, the jobs that should be running/recovered are stored in znode:
/flink/default/job
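To inspect that state, the ZooKeeper CLI can be used (a sketch; the znode path
depends on your high-availability.zookeeper.path settings):

    ./bin/zkCli.sh -server zk1:2181
    ls /flink/default/jobgraphs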
Hi Dongwon,
I think the rationale was to avoid conflicts between multiple Flink
instances
running on the same YARN cluster. There is a ticket that proposes to allow
configuring a port range instead [1].
Best,
Gary
[1] https://issues.apache.org/jira/browse/FLINK-5758
On Tue, Apr 17, 2018 at 9:56
Hi Amit,
Thank you for the follow-up. What you describe sounds like a bug, but I am not
able to reproduce it. Can you open an issue in Jira with an outline of your
code and how you submit the job?
> Could you also recommend the best practice in FLIP-6: should we use a
YARN session or submit jobs i
Hi David,
You are right. If you don't use start-cluster.sh, the conf/masters file is
not needed.
Best,
Gary
On Wed, Apr 18, 2018 at 8:25 AM, David Corley wrote:
> The HA documentation is a little confusing in that it suggests JM
> registration and discovery is done via Zookeeper, but it also r
Hi Miki,
Did you try to submit a job? With the introduction of FLIP-6, resources are
allocated dynamically.
Best,
Gary
On Tue, Apr 24, 2018 at 8:31 AM, miki haiat wrote:
>
> Hi,
> I'm trying to run FLIP-6 on Mesos, but it's not clear to me what the
> correct way to do it is.
>
> I run the sessi
Hi Alex,
Can you try adding the following two lines to your flink-conf.yaml?
akka.framesize: 314572800b
akka.client.timeout: 10min
AFAIK it is not needed to use Java system properties here.
Best,
Gary
On Mon, Apr 23, 2018 at 8:48 PM, Alex Soto wrote:
> Hello,
>
> I am using Flink version 1.
ng the web UI.
> Can you refer me to some example of how to submit a job?
> Using REST? To which port?
>
> thanks,
>
> miki
>
> On Tue, Apr 24, 2018 at 9:42 AM, Gary Yao wrote:
>
>> Hi Miki,
>>
>> Did you try to submit a job? With the introduction o
Hi Amit,
web.timeout should only affect RPC calls originating from the REST API. In
FLIP-6, the submission of the job graph happens via HTTP. The value under
akka.ask.timeout is still used as the default timeout for RPC calls [1][2].
Since you also had custom heartbeat settings, you should consid
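For example, in flink-conf.yaml (the values are placeholders):

    akka.ask.timeout: 60 s
    heartbeat.timeout: 50000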
; java:260)
>> at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>> at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>> at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinW
e date for 1.5 release?
>
> Thanks a lot for your help.
>
>
>
> On Tue, Apr 24, 2018 at 10:39 AM, Gary Yao wrote:
>
>> Hi Miki,
>>
>> The stacktrace you posted looks familiar [1]. We have fixed the issue in
>> Flink
>> 1.5. What is the Flink
Hi Julio,
Are you using the -m flag of "bin/flink run" by any chance? In HA mode, you
cannot manually specify the JobManager address. The client determines the
leader through ZooKeeper. If you did not configure the ZooKeeper quorum in
the flink-conf.yaml on the machine from which you are submittin
Hi,
AFAIK it is not possible to set the job id manually.
You could get the job id via the REST API. For example,
http://host:port/joboverview/running gives you a list of running jobs [1],
which you can filter by name. Would that work for you?
Best,
Gary
[1]
https://ci.apache.org/projects/flink/f
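A sketch of filtering by job name on the command line (assumes the jq tool,
and that the response lists jobs under a "jobs" array):

    curl -s http://host:port/joboverview/running \
      | jq '.jobs[] | select(.name == "my-job") | .jid'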
Hi,
It looks like you are still using the MemoryStateBackend. Are you overriding
the state backend settings from within your job? To debug this, it would be
helpful to see the JobManager logs and the contents of your flink-conf.yaml.
Best,
Gary
On Wed, May 2, 2018 at 3:25 AM, syed wrote:
> Hi;
Hi,
If I understand correctly, you are using Queryable State to access state of
one
job from another one. To avoid redeployment, some operator in application B
would need to regularly poll the REST API to discover changes of the job id
of
application A. It is doable but not advised to use Queryabl
ace can redirect from the
>> secondary back to primary.
>>
>> Currently I'm still running 1.4.0 (and I plan to upgrade to 1.4.2 as soon
>> as I can fix this).
>>
>> I'll try again with the HA/ZooKeeper properly set up on my machine and,
>> if it s
Hi Ken,
If you haven't seen it already, the 1.5 release candidate #2 has been out
since yesterday [1][2].
[1]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Release-1-5-0-release-candidate-2-td22197.html
[2] http://people.apache.org/~trohrmann/flink-1.5.0-rc2/
Best,
Gary
On Wed,
Hi,
Unfortunately, there won't be native support for Kubernetes in Flink 1.5 yet
[1]. That is, you will still have to deploy a standalone session cluster and
submit a job via the REST API.
Can you open an issue in JIRA regarding the Blob Cache corruption?
FLINK-5908 is
closed already.
Best,
Gary
Hi Amit,
The BucketingSink doesn't have well-defined semantics when used with S3. Data
loss is possible, but I am not sure whether it is the only problem. There are
plans to rewrite the BucketingSink in Flink 1.6 to support eventually
consistent file systems [1][2].
Best,
Gary
[1]
http://apache-f
Hi Miki,
Sorry for the late reply. If you are able to reproduce the first problem, it
would be good to see the complete JobManager logs.
The second exception indicates that you have not removed all data from
ZooKeeper. On recovery, Flink looks up the locations of the submitted
JobGraphs in ZooKee
Hi Biswajit,
The distribution management configuration can be found in the parent pom of
flink-parent:
  <parent>
    <groupId>org.apache</groupId>
    <artifactId>apache</artifactId>
    <version>18</version>
  </parent>
  ...
  <distributionManagement>
    <repository>
      <id>apache.releases.https</id>
      <name>Apache Release Distribution Repository</name>
      <url>https://repository.apache.org/service/local/staging/deploy/maven2</url>
    </repository>
    <snapshotRepository>
      <id>apache.sna