Hi Vino,
So I will use the default setting of DELETE_ON_CANCELLATION. When the
program is cancelled, the checkpoint will be deleted; when the program fails,
because the checkpoint will not be deleted, I can still have a checkpoint
that can be used to resume.
Please help to correct me if I am wrong.
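For reference, a minimal sketch of how this retention behaviour is typically
configured in the job code (Flink 1.6-style API; the checkpoint interval is
illustrative, not a value from this thread):

    import org.apache.flink.streaming.api.environment.CheckpointConfig;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    env.enableCheckpointing(60_000L); // illustrative checkpoint interval
    // DELETE_ON_CANCELLATION removes the externalized checkpoint on a manual cancel,
    // so only a failure leaves a checkpoint behind to resume from;
    // RETAIN_ON_CANCELLATION keeps it in both cases.
    env.getCheckpointConfig().enableExternalizedCheckpoints(
            CheckpointConfig.ExternalizedCheckpointCleanup.DELETE_ON_CANCELLATION);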
I see,
Thanks for the clarification.
Cheers,
Kostas
> On Sep 25, 2018, at 8:51 AM, Averell wrote:
>
> Hi Kostas,
>
> I use PROCESS_CONTINUOUSLY mode, and checkpoint interval of 20 minutes. When
> I said "Within that 15 minutes, checkpointing process is not triggered
> though" in my previous e
Hi Kostas,
I use PROCESS_CONTINUOUSLY mode, and a checkpoint interval of 20 minutes. When
I said "Within that 15 minutes, checkpointing process is not triggered
though" in my previous email, I was not complaining that checkpointing is not
running, but saying that the slowness is not due to an ongoing checkpoint.
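For context, a minimal sketch of the setup described above (continuous file
monitoring plus a 20-minute checkpoint interval); the path, input format and
directory-scan interval are illustrative assumptions, not Averell's actual values:

    import org.apache.flink.api.java.io.TextInputFormat;
    import org.apache.flink.core.fs.Path;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.source.FileProcessingMode;

    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    env.enableCheckpointing(20 * 60 * 1000L); // 20-minute checkpoint interval

    // Continuously monitor the directory, re-scanning it every 60 seconds.
    env.readFile(
            new TextInputFormat(new Path("s3://bucket/input")),
            "s3://bucket/input",
            FileProcessingMode.PROCESS_CONTINUOUSLY,
            60_000L);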
anyone?
--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
Hi Averell,
Can you describe your settings in a bit more detail?
For example, are you reading in PROCESS_CONTINUOUSLY mode or PROCESS_ONCE?
What is your checkpoint interval?
The above is to help understand why checkpoints are not processed within these 15
minutes.
Kostas
> On Sep 25, 2018, at 8:08 AM
Hi All,
I am using a FsStateBackend in local execution, and I set the
DELETE_ON_CANCELLATION option for checkpoints. When I click the “stop” button in
IntelliJ IDEA, the log shows that the job has switched to the CANCELED state, but when
I check the local file system, the checkpoint directory and files still exist
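For reference, a minimal sketch of a local FsStateBackend setup like the one
described (directory and interval are illustrative); note that stopping the
process from the IDE kills the JVM directly, which may never run Flink's
cancellation-time cleanup:

    import org.apache.flink.runtime.state.filesystem.FsStateBackend;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    env.setStateBackend(new FsStateBackend("file:///tmp/flink-checkpoints")); // illustrative path
    env.enableCheckpointing(10_000L); // illustrative interval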
Hi Henry,
I added my comments in blue in your original email.
Thanks, vino.
徐涛 wrote on Tuesday, September 25, 2018 at 12:56 PM:
> Hi Vino,
> *What is the definition of and the difference between job cancellation and job failure?*
> Can I say that if the program is shut down manually, then it is a job
> cancellation,
>
Hi everyone,
I have 2 file sources, which I want to start reading in a specified
order (e.g. source2 should only start 5 minutes after source1 has started).
I could not find any Flink documentation mentioning this capability, and I also
tried to search the mailing list, without any success.
However
Hi Kostas,
Yes, applying the filter on the 100K files takes time, and the delay of 15
minutes I observed is definitely caused by that big number of files and the
cost of each individual file-status check. However, the delay is much
smaller when checkpointing is off.
Within that 15 minutes, checkpoint
Hi Vino,
What is the definition of and the difference between job cancellation and job failure?
Can I say that if the program is shut down manually, then it is a
job cancellation,
and if the program is shut down due to some error, it
is a job failure?
This is important
Hi Henry,
To answer your question:
What is the definition and difference between job cancellation and job failure?
> Both the cancellation and the failure of a job cause the job to enter a
terminal state. But cancellation is triggered manually and terminates the job
normally, while failure is usually a pass
Hi
Just a quick thought on this:
You might be able to use a delegation token to access HBase [1]. It might be a
more secure way than distributing your keytab to all the YARN
nodes.
Hope this helps.
--
Rong
[1] https://wiki.apache.org/hadoop/Hbase/HBaseTokenAuthentication
On Mon, Sep 24
Hi All,
I mean, can I guarantee that a savepoint can always be made before a
manual cancellation? If I use the DELETE_ON_CANCELLATION option on checkpoints, is
there any probability that I do not have a checkpoint to recover from?
Thanks a lot.
Best
Henry
> On Sep 25, 2018, at 10:41 AM,
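For reference, one way to guarantee a savepoint before a manual cancellation is
the CLI's cancel-with-savepoint command (target directory and job id are
illustrative):

    bin/flink cancel -s hdfs:///flink/savepoints <jobId>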
Hi Bryant,
Maybe Stefan can answer your question; I have pinged him for you.
Thanks, vino.
Bryant Baltes wrote on Tuesday, September 25, 2018 at 12:29 AM:
> Hi All,
>
> After upgrading from 1.3.2 to 1.5.2, one of our apps that uses
> checkpointing no longer writes metadata files to the state.checkpoints.dir
> location provided
Hi Aljoscha,
Sorry for my late response. In my experience, if
flink-conf.yaml sets "security.kerberos.login.keytab" and
"security.kerberos.login.contexts" with a Kerberos keytab file, then YARN will
ship the keytab file to the TaskManagers.
I can also find log lines like:
" INFO org.
Hi All,
In the Flink documentation, it says:
DELETE_ON_CANCELLATION: “Delete the checkpoint when the job is
cancelled. The checkpoint state will only be available if the job fails.”
What is the definition of and the difference between job cancellation and job failure?
If I run the program on YARN,
Hey guys, Stefan,
Yeah, sorry about the stacks. Completely forgot about them.
But I think we figured out why it's taking so long (and yeah, Stefan was
right from the start): This specific slot is receiving 5x more records than
any other slot (on a recent run, it had 10x more records than the seco
Thanks for the clarification, Dawid and Till.
@Till We have a few streaming jobs that need to be running all the time, and
we plan on using the modify tool to update the parallelism of jobs as we scale
the cluster in and out, so knowing the total slot count is crucial to this
workflow.
As Dawid pointed o
Hi All,
After upgrading from 1.3.2 to 1.5.2, one of our apps that uses
checkpointing no longer writes metadata files to the state.checkpoints.dir
location provided to the flink conf. I see this email chain addressed this
here:
https://lists.apache.org/thread.html/922f77880eca2a7b279e153090da2388b
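For reference, the configuration being described is roughly the following
flink-conf.yaml entries (values are illustrative); externalized checkpoints
also have to be enabled in the job code for metadata to be retained:

    # Illustrative values only
    state.backend: filesystem
    state.checkpoints.dir: hdfs:///flink/checkpoints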
Hi Suraj,
at the moment Flink's new mode does not support such a behaviour. There are
plans to set a min number of running TaskManagers which won't be released.
But no work has been done in this direction yet, afaik. If you want, then
you can help the community with this effort.
Cheers,
Till
On
Hi,
I have nothing more to add. You (Dawid) and Vino explained it correctly :)
Piotrek
> On 24 Sep 2018, at 15:16, Dawid Wysakowicz wrote:
>
> Hi Harshvardhan,
>
> Flink won't buffer all the events between checkpoints. Flink uses Kafka's
> transaction, which are committed only on checkpoints
Hi Harshvardhan,
Flink won't buffer all the events between checkpoints. Flink uses
Kafka's transactions, which are committed only on checkpoints, so the
data will be persisted on Kafka's side, but only available to read
once committed.
I've cc'ed Piotr, who implemented the Kafka 0.11 connector
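For context, a minimal sketch of the transactional (exactly-once) Kafka 0.11
producer being referred to; the broker address, topic and upstream stream are
illustrative assumptions. Downstream consumers additionally need
isolation.level=read_committed to see only committed data:

    import java.util.Properties;
    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer011;
    import org.apache.flink.streaming.util.serialization.KeyedSerializationSchemaWrapper;

    DataStream<String> stream = ...; // some upstream pipeline (illustrative)

    Properties props = new Properties();
    props.setProperty("bootstrap.servers", "broker:9092"); // illustrative broker
    props.setProperty("transaction.timeout.ms", "900000"); // should cover the checkpoint interval

    stream.addSink(new FlinkKafkaProducer011<>(
            "output-topic",                                   // illustrative topic
            new KeyedSerializationSchemaWrapper<>(new SimpleStringSchema()),
            props,
            FlinkKafkaProducer011.Semantic.EXACTLY_ONCE));    // commits only on checkpoints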
Hi Suraj,
As far as I know, this was changed with FLIP-6 to allow dynamic resource
allocation.
Till, cc'ed, might know if there is a switch to restore the old behavior or
whether there are plans to support it.
Best,
Dawid
On 24/09/18 12:24, suraj7 wrote:
> Hi,
>
> I am using Amazon EMR to run Flink Cluster o
Hi Yuvraj,
It looks like some race condition to me. Would it be OK for you to switch
to either Event or Ingestion time [1]?
I have also cc'ed @Aljoscha, who might give you a bit more insight.
Best,
Dawid
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.6/dev/event_time.html#event-time--pro
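For reference, switching the time characteristic is a one-liner on the
environment (with event time you also need timestamps and watermarks assigned,
e.g. via assignTimestampsAndWatermarks on the source stream):

    import org.apache.flink.streaming.api.TimeCharacteristic;

    env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);      // or
    env.setStreamTimeCharacteristic(TimeCharacteristic.IngestionTime);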
yep, they're there. thank you!
On Mon, Sep 24, 2018 at 12:54 PM 杨力 wrote:
> They are provided in taskmanagers.
>
Sayat Satybaldiyev wrote on Monday, September 24, 2018 at 6:38 PM:
>
>> Dear all,
>>
>> While configuring JMX with Flink, I don't see some bean metrics that
>> belongs to the job, in particular, the numb
this is my code:

DataStream cityWithGeoHashesDataStream = filteredGeohashDataStream
    .keyBy(FilteredGeoHashes::getCity)
    .window(ProcessingTimeSessionWindows.withGap(Time.seconds(4)))
    .process(new ProcessWindowFunction() {
        @Override
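For readability, a fully-typed sketch of the same pattern; FilteredGeoHashes, a
hypothetical CityGeoHashes output type and a String city key are assumptions
based on the snippet above, not the poster's actual classes:

    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.functions.windowing.ProcessWindowFunction;
    import org.apache.flink.streaming.api.windowing.assigners.ProcessingTimeSessionWindows;
    import org.apache.flink.streaming.api.windowing.time.Time;
    import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
    import org.apache.flink.util.Collector;

    DataStream<CityGeoHashes> cityWithGeoHashesDataStream = filteredGeohashDataStream
        .keyBy(FilteredGeoHashes::getCity)
        .window(ProcessingTimeSessionWindows.withGap(Time.seconds(4)))
        .process(new ProcessWindowFunction<FilteredGeoHashes, CityGeoHashes, String, TimeWindow>() {
            @Override
            public void process(String city, Context ctx,
                                Iterable<FilteredGeoHashes> elements,
                                Collector<CityGeoHashes> out) {
                // aggregate the session's elements and emit one result per city session
            }
        });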
Hi all,
I am stuck with this error,
please help me.
I am using a session window.
2018-09-23 07:15:08,097 INFO org.apache.flink.runtime.taskmanager.Task
- city-geohashes-processor (24/48)
(26aed9a769743191c7cb0257087e490a) switched from RUNNING to FAILED.
java.lang.Unsupporte
They are provided in taskmanagers.
Sayat Satybaldiyev wrote on Monday, September 24, 2018 at 6:38 PM:
> Dear all,
>
> While configuring JMX with Flink, I don't see some bean metrics that
> belongs to the job, in particular, the number in/out records per operator.
> I've checked REST API and those numbers provided ther
Dear all,
While configuring JMX with Flink, I don't see some bean metrics that
belong to the job, in particular the number of in/out records per operator.
I've checked the REST API and those numbers are provided there. Does Flink provide
such a bean, or is there additional configuration for it?
Here's a li
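For reference, a minimal flink-conf.yaml sketch of the JMX reporter setup
(the port range is illustrative); the per-operator numRecordsIn/numRecordsOut
beans are then exposed by the TaskManager JVMs, as noted in the reply above:

    metrics.reporter.jmx.class: org.apache.flink.metrics.jmx.JMXReporter
    metrics.reporter.jmx.port: 8789-8799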
Hello,
We have a query regarding the SSL algorithms available across Flink versions. From the
documentation of Flink 1.6.0 we can see that the following SSL algorithm options are
supported:
security.ssl.algorithms:
TLS_DHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_DHE_RSA_WITH_AES_2
Hi,
I am using Amazon EMR to run Flink Cluster on YARN. My setup consists of
m4.large instances for 1 master and 2 core nodes. I have started the Flink
cluster on YARN with the command: flink-yarn-session -n 2 -d -tm 4096 -s 4.
The Flink Job Manager and Application Manager start, but there are no TaskManagers
Dear community,
this is the weekly community update thread #39. Please post any news and
updates you want to share with the community to this thread.
# Flink 1.6.1 and Flink 1.5.4 released
The community has released new bug fix releases: Flink 1.6.1 and Flink
1.5.4 [1, 2].
# Open source review
Hi Alexander,
the issue for the reactive mode, the mode which reacts to newly available
resources and scales up accordingly, is here:
https://issues.apache.org/jira/browse/FLINK-10407. It does not contain a
lot of details but we are actively working on publishing the corresponding
design docum
I can't really help you here.
Digging into the backing Java internals isn't supported, and neither is
registering a Kryo serializer (which is why it isn't exposed in the
Python environment).
The Jython-related serialization logic doesn't care about Flink's usual
type serialization mechanism, s
Thanks, I'll check it out.
On Mon, Sep 24, 2018 at 9:49 AM vino yang wrote:
> Hi,
>
> According to the instructions in the script:
>
> # Long.MAX_VALUE in TB: This is an upper bound, much less direct memory will
> be used
> TM_MAX_OFFHEAP_SIZE="8388607T"
>
>
> I think you may need to confirm i
No, this isn't really possible. You need a Java process to kick off the
processing.
The only thing I can come up with is to open the flink-streaming-python
module in the IDE and manually call the PythonStreamBinder class with
the same arguments that you pass on the CLI, as a test.
On 17.09.20
Hello Stefan,
Thank you for the help.
I've actually lost those logs due to several cluster restarts that we did,
which caused log rotation (limit = 5 versions).
The log lines that I posted were the only ones that showed signs of
some problem.
*The configuration of the job is as follows:
Hi Averell,
Happy to hear that the problem is no longer there and if you have more news
from your
debugging, let us know.
The thing that I wanted to mention is that from what you are describing, the
problem does
not seem to be related to checkpointing, but to the fact that applying your
filt
I have one more question:
is it possible that if I do keyBy on the stream, it will get partitioned
automatically?
I ask because I am getting all the data in the same partition in Kafka.
Thanks
Yubraj Singh
On Mon, Sep 24, 2018 at 12:34 PM yuvraj singh <19yuvrajsing...@gmail.com>
wrote:
> I am processing
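For context: keyBy only controls how records are distributed among Flink's
parallel operators; which Kafka partition a record ends up in is decided by the
producer's partitioner. A minimal sketch of passing an empty custom partitioner
so that Kafka's own partitioner decides (broker, topic and schema are
illustrative assumptions, not the poster's NudgeCarLevelProducer internals):

    import java.util.Optional;
    import java.util.Properties;
    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer011;
    import org.apache.flink.streaming.connectors.kafka.partitioner.FlinkKafkaPartitioner;
    import org.apache.flink.streaming.util.serialization.KeyedSerializationSchemaWrapper;

    Properties props = new Properties();
    props.setProperty("bootstrap.servers", "broker:9092"); // illustrative broker

    // With Optional.empty(), Flink does not pin each sink subtask to one Kafka
    // partition (the default FlinkFixedPartitioner behaviour); Kafka partitions by
    // the record key from the serialization schema, or round-robin if the key is null.
    FlinkKafkaProducer011<String> producer = new FlinkKafkaProducer011<>(
            "output-topic",                                       // illustrative topic
            new KeyedSerializationSchemaWrapper<>(new SimpleStringSchema()),
            props,
            Optional.<FlinkKafkaPartitioner<String>>empty());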
I am processing data and then sending it to Kafka via a Kafka sink.
This is the method where I am producing the data:
nudgeDetailsDataStream.keyBy(NudgeDetails::getCarSugarID).addSink(NudgeCarLevelProducer.getProducer(config))
.name("nudge-details-producer")
.uid("nudge-details-producer"