Hi,
you should expect that the size can vary for some checkpoints, even if the
change rate is constant. Some checkpoints upload compacted replacements
for previous checkpoints to prevent the checkpoint history from growing
without bounds. Whenever that
happens, the checkpoint does some
Hi,
afaik, the option is not exposed in the current state of the source code.
I can see it being useful and technically possible using:
db.getLongProperty(stateColumnFamilyHandle, "rocksdb.estimate-num-keys");
Though a couple of things come to mind to take into account for this feature:
Hi all,
I am trying to test failure recovery of a Flink job when a JM or TM goes
down.
Our target is to have the job auto-restart and return to normal condition in
any case.
However, what I am seeing is very strange, and I hope someone here can help
me understand it.
When JM or TM went down, I see the jo
Hi Kien
From your description, your job has already started to execute checkpoints
after the job failover, which means your job was in RUNNING status. From my
point of view, the actual recovery time should be the time during the job's
status: RESTARTING -> CREATED -> RUNNING [1].
Your trouble sounds more
Hi trung,
Can you provide more information to aid in positioning? For example, the
size of the state generated by a checkpoint and more log information, you
can try to switch the log level to DEBUG.
Thanks, vino.
Yun Tang wrote on Thu, Sep 6, 2018 at 7:42 PM:
> Hi Kien
>
> From your description, your job has
Thanks Fabian,
I didn't notice select() wasn't SQL compliant.
sqlQuery works fine, it's all right :)
All the best
François
2018-09-05 12:30 GMT+02:00 Fabian Hueske :
> Hi
>
> You are using SQL syntax in a Table API query. You have to stick to Table
> API syntax or use SQL as
>
> tEnv.sqlQuery
Hello,
We have a Flink pipeline where we are windowing our data after a keyBy, i.e.
myStream.keyBy().window().process(MyIncrementalAggregation(),
MyProcessFunction()).
I have two questions about the above line of code:
1) Does the state in the process window function qualify as KeyedState or
Oper
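The incremental-aggregation pattern asked about above can be modeled with plain Java. This is a toy sketch, not Flink's actual API; all class and method names here are made up for illustration. Per key, elements are folded incrementally into an accumulator, and a final "process" step sees only the single pre-aggregated value per key, which is the point of combining an incremental aggregation with a process window function.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/**
 * Toy model (not Flink API) of keyBy().window().aggregate(agg, process):
 * the window buffer holds one accumulator per key instead of all elements.
 */
public class ToyWindowedAggregation {

    // Incremental step: fold each element into a running sum per key.
    static Map<String, Long> aggregate(List<Map.Entry<String, Long>> window) {
        Map<String, Long> acc = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : window) {
            acc.merge(e.getKey(), e.getValue(), Long::sum);
        }
        return acc;
    }

    // Final step: formats the single accumulated value per key,
    // analogous to a process function that receives only the aggregate.
    static List<String> process(Map<String, Long> accumulated) {
        List<String> out = new ArrayList<>();
        accumulated.forEach((k, v) -> out.add(k + "=" + v));
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Long>> window = List.of(
                Map.entry("a", 1L), Map.entry("b", 2L), Map.entry("a", 3L));
        System.out.println(process(aggregate(window)));  // [a=4, b=2]
    }
}
```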
Hi Yun,
Yes, the job's status changed to RUNNING pretty fast after the failure (~1 min).
As soon as the status changed to RUNNING, the first checkpoint kicked off and
it took 30 mins. I need exactly-once as I'm maintaining some
aggregation metrics; do you know what's different about the first checkpoi
And here is the snapshot of my checkpoint metrics in normal condition.
On Thu, Sep 6, 2018 at 9:21 AM trung kien wrote:
> Hi Yun,
>
> Yes, the job's status changed to RUNNING pretty fast after the failure (~1
> min).
>
> As soon as the status changed to RUNNING, the first checkpoint kicked off
> and it
Hi Kien
You could try to kill one TM container by using 'yarn container -signal
FORCEFUL_SHUTDOWN' command, and then watch the first checkpoint
after the job failover. You could view the checkpoint details [1] to see whether
there is an outlier operator or sub-task that took an extremely long time to
Thanks very much. Now it works fine.
--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
Hi,all
I'm using "bin/flink run -m yarn-cluster" to run my program on YARN. However,
it seems that I can't add my own files to the classpath before the job is
submitted to YARN. For example, I have a conf file, which is located in my own
conf directory, and I need to load a file from the conf dire
Hi Subramanya,
if the container is still running and the TM simply cannot connect to the
JobManager, then the ResourceManager does not see a problem. The RM thinks
in terms of containers, and as long as n containers are running, it won't
start new ones. That's the reason why the TM should exit in
Hello Chesnay,
Thanks for the information.
Decided to move straight away to launching a standalone cluster.
I'm now having another problem when trying to submit a job through my Java
program after launching the standalone cluster.
I configured the cluster (flink-1.6.0/conf/flink-conf.yaml) to us
Did you by chance use the RemoteEnvironment and pass in 6123 as the
port? If so, try using 8081 instead, which is the REST port.
On 06.09.2018 18:24, Miguel Coimbra wrote:
Hello Chesnay,
Thanks for the information.
Decided to move straight away to launching a standalone cluster.
I'm now havin
Hello all,
We are running Flink 1.5.3 on Kubernetes with RocksDB as statebackend.
When performing some load testing we got an OutOfMemoryError: native memory
exhausted, causing the job to fail and be restarted.
After the Taskmanager is restarted, the job is recovered from a Checkpoint,
but it
Exactly, that was the problem.
Didn't realize the restructured cluster channels all communications to the
REST port.
Thanks again.
Best,
On Thu, 6 Sep 2018 at 17:57, Chesnay Schepler wrote:
> Did you by chance use the RemoteEnvironment and pass in 6123 as the port?
> If so, try using 8081 inst
Hi everyone,
I'm running a YARN session on a cluster with one master and one core and
would like to use the Monitoring API programmatically to submit jobs. I
have found that the configuration variables are read but ignored when
starting the session - it seems to choose a random port each run.
Her
Hi Austin,
`rest.port` is the current config option for "The port that the
server listens on / the client connects to.", with deprecated key
`web.port`, which in turn replaced the deprecated key `jobmanager.web.port`, so it is
enough to configure `rest.port` only (at least for 1.6). However, in your case
t
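For a standalone cluster, pinning the port comes down to a one-line entry in flink-conf.yaml. A minimal sketch (the value is an example; on a YARN session this setting may be ignored and a random port chosen instead):

```yaml
# flink-conf.yaml fragment: fix the port the REST server listens on
# (also the port the client connects to). 8081 is Flink's usual default.
rest.port: 8081
```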
Hi, Julio:
Is checkpointing enabled in your job? The Flink Kafka connector only commits
offsets when checkpointing is enabled.
On Tue, Sep 4, 2018 at 11:43 PM Tzu-Li (Gordon) Tai
wrote:
> Hi Julio,
>
> As Renjie had already mentioned, to achieve exactly-once semantics with
> the Kafka consumer, Flink need
Hi all,
Here I would prefer to force a task to run in LAZY_FROM_SOURCE schedule mode
with all ResultPartitionTypes being BLOCKING.
But I cannot find an option to configure that in StreamExecutionEnvironment,
thus I'm using the below as a workaround, which is quite tricky.
final StreamExecutionEnvironment env =
StreamExecution
Hi bupt,
Not sure about the answer.
Do you mean that you can't read the file from the local FS? Have you tried
loading the file through a full path? Or did you choose the wrong classloader?
Best, Hequn
On Thu, Sep 6, 2018 at 11:01 PM bupt_ljy wrote:
> Hi,all
>
>I’m using “bin/flink run -m yarn-clus
Hi Edward,
From this log: Caused by: java.io.EOFException, it seems that the state
metadata file has been corrupted.
But I can't confirm it; maybe Stefan knows more details, so I'm pinging him for you.
Thanks, vino.
Edward Rojas wrote on Fri, Sep 7, 2018 at 1:22 AM:
> Hello all,
>
> We are running Flink 1.5.3 on Kube
Hi Harshvardhan,
1) Yes, ProcessWindowFunction extends AbstractRichFunction; through
getRuntimeContext, you can access the keyed state API.
2) ProcessWindowFunction gives you considerable flexibility; based
on processing time / event time / timers / its clear method /
customized implementat
Hi Hequn,
I can read it by using the full path, but I want it to be on the program's
classpath. For example, I use "bin/flink run -m yarn-cluster" to run the
program on Server1, and I have a conf file "config.conf" located in
"/server/conf" on Server1; I can't read this file by using
ConfigFactor
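One approach worth trying (a sketch, assuming Flink's YARN CLI; the paths and the job class name below are hypothetical) is to ship the local conf directory to the YARN containers with the `-yt`/`--yarnship` option, which transfers the directory's files alongside the job so they can be found at runtime:

```shell
# Sketch (paths and class name are examples): ship a local conf directory
# into the YARN containers when submitting the job.
bin/flink run -m yarn-cluster -yt /server/conf -c com.example.MyJob myjob.jar
```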
Hi all,
I am encountering a weird problem when running Flink 1.6 in YARN per-job
clusters.
The job fails about half an hour after it starts. Related logs are
attached as an image.
This piece of log comes from one of the taskmanagers. There are no
other related log lines.
No ERROR-level log
Hi Austin,
The config options rest.port, jobmanager.web.port, etc. are intentionally
ignored on YARN. The port should be chosen randomly to avoid conflicts with
other containers [1]. I do not see a way to set a fixed port at the
moment, but there is a related ticket for that [2]. The Flink