Thanks Junrui for creating the FLIP and kicking off this discussion.
Exposing a mutable ExecutionConfig, which is even shared by multiple
operators, is truly a defect and can lead to unexpected results.
+1
Thanks,
Zhu
Junrui Lee wrote on Wed, Nov 15, 2023 at 16:53:
> Hi all,
>
> I'd like to start a discussio
Thanks for kicking off the new release.
+1 for January 17th as the feature freeze date :)
+1 for Qingsheng, Leonard, Martijn and Matthias as release managers
Thanks,
Zhu
Dong Lin wrote on Sun, Oct 23, 2022 at 15:09:
>
> Thanks for kicking off the release plan.
>
> +1 for the proposed timeline.
>
> Best,
> D
Hi Shilpa,
JobType was introduced in 1.13. So I guess the cause is that the client
which creates and submits
the job is still 1.12.2. The client generates an outdated job graph which
does not have its JobType
set, which results in this NPE problem.
Thanks,
Zhu
Austin Cawley-Edwards wrote on Thu, Jul 1, 2021 at 1 AM
Thanks Dawid and Guowei for being the release managers! And thanks everyone
who has made this release possible!
Thanks,
Zhu
Yun Tang wrote on Thu, May 6, 2021 at 2:30 PM:
> Thanks for Dawid and Guowei's great work, and thanks for everyone involved
> for this release.
>
> Best
> Yun Tang
>
Thanks Roman and Yuan for being the release managers! Thanks everyone who
has made this release possible!
Cheers,
Zhu
Piotr Nowojski wrote on Sat, Mar 6, 2021 at 12:38 AM:
> Thanks Roman and Yuan for your work and driving the release process :)
>
> On Fri, Mar 5, 2021 at 15:53, Till Rohrmann wrote:
>
>> Great
Thanks Xintong for being the release manager and everyone who helped with
the release!
Cheers,
Zhu
Dian Fu wrote on Fri, Jan 29, 2021 at 5:56 PM:
> Thanks Xintong for driving this release!
>
> Regards,
> Dian
>
> On Jan 29, 2021, at 5:24 PM, Till Rohrmann wrote:
>
> Thanks Xintong for being our release manager. Well don
Each task will be assigned a dedicated thread for its data processing.
A slot can be shared by multiple tasks if they are in the same slot sharing
group[1].
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.10/concepts/runtime.html#task-slots-and-resources
Thanks,
Zhu
ゞ野蠻遊戲χ wrote on 2020-1
Hi Arpith,
All tasks being in CREATED state indicates that no task has been scheduled
yet. It is strange that a job gets stuck in this state.
Is it possible that you share the job manager log so we can check what is
happening there?
Thanks,
Zhu
Arpith P wrote on Mon, Sep 21, 2020 at 3:52 PM:
> Hi,
>
> We have Flink 1.8.0 clust
The Apache Flink community is very happy to announce the release of Apache
Flink 1.11.2, which is the second bugfix release for the Apache Flink 1.11
series.
Apache Flink® is an open-source stream processing framework for
distributed, high-performing, always-available, and accurate data streaming
Hi Zheng,
To divide managed memory for operators[1], we need to consider which tasks
will
run in the same slot. In batch jobs, vertices in different regions may not
run at
the same time. If we put them in the same slot sharing group, running tasks
may run slower with less managed memory, while man
Congratulations Dian!
Thanks,
Zhu
Zhijiang wrote on Thu, Aug 27, 2020 at 6:04 PM:
> Congrats, Dian!
>
> --
> From:Yun Gao
> Send Time: Thu, Aug 27, 2020, 17:44
> To:dev ; Dian Fu ; user <
> user@flink.apache.org>; user-zh
> Subject:Re: Re: [ANNOUNC
The Apache Flink community is very happy to announce the release of Apache
Flink 1.10.2, which is the first bugfix release for the Apache Flink 1.10
series.
Apache Flink® is an open-source stream processing framework for
distributed, high-performing, always-available, and accurate data streaming
a
that each
slot can contain a source task. With the config cluster.evenly-spread-out-slots
set to true, slots can be evenly distributed across all available TaskManagers
in most cases.
Thanks,
Zhu Zhu
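The option mentioned above goes into flink-conf.yaml; a minimal sketch (the key name matches Flink's documented configuration option):

```yaml
# flink-conf.yaml — prefer spreading slots across all available
# TaskManagers instead of filling one TaskManager first
cluster.evenly-spread-out-slots: true
```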
Ken Krugler wrote on Fri, Aug 7, 2020 at 5:28 AM:
> Hi all,
>
> Was there any change in how sub-tasks get all
What you observed is right. At the moment, one IntermediateDataSet can have
only one consumer JobEdge in the production code path.
Thanks,
Zhu Zhu
yuehan1 wrote on Tue, Aug 4, 2020 at 5:14 PM:
> IntermediateDataSet.java has a JobEdge list named consumers.
> In which case, an IntermediateDataSet co
resources for a requested container can be found in Flink JM log.
Thanks,
Zhu Zhu
mars wrote on Wed, Jul 29, 2020 at 10:52 PM:
> Hi All,
>
> I have an EMR Cluster with one Master Node and 3 worker Nodes ( it has
> auto
> scaling enabled and the max no.of worker nodes can go up to 8).
>
> I have
Hi Joseph,
The availability of pipelined result partition is notified to JM
via scheduleOrUpdateConsumers RPC.
Just want to mention that it's better to send such questions to the user
mailing list.
Thanks,
Zhu Zhu
Fork Joseph wrote on Tue, Jul 21, 2020 at 3:30 AM:
> Hi,
>
> According to
of your job
is c23a172cda6cc659296af6452ff57f45, but the REST request is getting the info
of job be3d6b9751b6e9c509b9bedeb581a72e.
Thanks,
Zhu Zhu
Prasanna kumar wrote on Wed, Jun 3, 2020 at 2:16 AM:
> Hi ,
>
> I am running flink locally in my machine with following configurations.
>
> # The
) but each submission will be treated as a different job
and will have a different job id.
Thanks,
Zhu Zhu
M Singh wrote on Fri, May 29, 2020 at 4:59 AM:
> Thanks Till - in the case of restart of flink master - I believe the jobid
> will be different. Thanks
>
> On Thursday, May 28, 2020, 11:
mentioned.
One example is that you can submit an application to a cluster multiple
times at the same time, different JobIDs are needed to differentiate them.
Thanks,
Zhu Zhu
Till Rohrmann wrote on Wed, May 27, 2020 at 10:05 PM:
> Hi,
>
> if you submit the same job multiple times, then it will get eve
Hi M,
Regarding your questions:
1. yes. The id is fixed once the job graph is generated.
2. yes
Regarding yarn mode:
1. the job id stays the same because the job graph is generated once
at the client side and persisted in DFS for reuse
2. yes if high availability is enabled
Thanks,
Zhu Zhu
M
Thanks Yu for being the release manager. Thanks everyone who made this
release possible!
Thanks,
Zhu Zhu
Benchao Li wrote on Fri, May 15, 2020 at 7:51 PM:
> Thanks Yu for the great work, and everyone else who made this possible.
>
> Dian Fu wrote on Fri, May 15, 2020 at 6:55 PM:
>
>> Thanks Yu for man
Ticket FLINK-17714 is created to track this requirement.
Thanks,
Zhu Zhu
Till Rohrmann wrote on Wed, May 13, 2020 at 8:30 PM:
> Yes, you are right Zhu Zhu. Extending
> the RestartBackoffTimeStrategyFactoryLoader to also load custom
> RestartBackoffTimeStrategies sound like a good improvement for t
ng to support later.
@Till Rohrmann @Gary Yao what do
you think?
[1]
https://lists.apache.org/thread.html/6ed95eb6a91168dba09901e158bc1b6f4b08f1e176db4641f79de765%40%3Cdev.flink.apache.org%3E
Thanks,
Zhu Zhu
Ken Krugler wrote on Wed, May 13, 2020 at 7:34 AM:
> Hi Til,
>
> Sorry, missed the key q
It seems something bad happened in the task managers and led to
heartbeat timeouts.
These TMs were not released by Flink but lost their connections with the
master node.
I think you need to check the TM logs to see what happened there.
Thanks,
Zhu Zhu
seeksst wrote on Sun, Apr 26, 2020 at 2:13 PM:
> Thank you for y
ugh
resource?
Thanks,
Zhu Zhu
seeksst wrote on Sun, Apr 26, 2020 at 11:21 AM:
> Hi,
>
>
> Recently, I find a problem when job failed in 1.10.0, flink didn’t
> release resource first.
>
>
>
> You can see I used flink on yarn, and it doesn’t allocate task
> manager, beacause
to confirm that the line "FOG_PREDICTION_FUNCTION (15/20) (
3086efd0e57612710d0ea74138c01090) switched from RUNNING to FAILED" does not
appear in the JM log, right? This might be an issue where the message was
lost on the network, which should be a rare case. Do you encounter it often?
Thanks,
this case.
@Bruce would you take a look at the TM log? If the guess is right, in task
manager logs there will be one line "Task {} is already in state FAILED."
Thanks,
Zhu Zhu
Till Rohrmann wrote on Fri, Apr 10, 2020 at 12:51 AM:
> For future reference, here is the issue to track the reconciliat
e root cause. Would you check whether the hostname
*prod-bigd-dn11* is resolvable? And whether port 43757 on that machine
is accessible?
Thanks,
Zhu Zhu
Vitaliy Semochkin wrote on Fri, Mar 27, 2020 at 1:54 AM:
> Hi,
>
> I'm facing an issue similar to
> https://issues.apach
break the network
connection between the Flink app and the source service).
Thanks,
Zhu Zhu
Eleanore Jin wrote on Thu, Mar 5, 2020 at 8:40 AM:
> Hi,
>
> I have a flink application and checkpoint is enabled, I am running locally
> using miniCluster.
>
> I just wonder if there is a way to simulat
Congratulations Jingsong!
Thanks,
Zhu Zhu
Fabian Hueske wrote on Sat, Feb 22, 2020 at 1:30 AM:
> Congrats Jingsong!
>
> Cheers, Fabian
>
> On Fri, Feb 21, 2020 at 17:49, Rong Rong wrote:
>
> > Congratulations Jingsong!!
> >
> > Cheers,
> > Rong
> >
cost is expected, one can increase
the cancellation timeout ("task.cancellation.timeout") to avoid this issue.
Thanks,
Zhu Zhu
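The timeout mentioned above is a flink-conf.yaml option; a sketch of raising it (the value of 600000 is an arbitrary example, not a recommendation):

```yaml
# flink-conf.yaml — raise the task cancellation timeout
# (value is in milliseconds; the default is 180000, i.e. 3 minutes)
task.cancellation.timeout: 600000
```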
Soheil Pourbafrani wrote on Sat, Feb 15, 2020 at 6:41 AM:
> Hi,
>
> I developed a single Flink job that read a huge amount of files and after
> some simple pr
Cheers!
Thanks Gary and Yu for the great job as release managers.
And thanks to everyone whose contribution makes the release possible!
Thanks,
Zhu Zhu
Wyatt Chun wrote on Wed, Feb 12, 2020 at 9:36 PM:
> Sounds great. Congrats & Thanks!
>
> On Wed, Feb 12, 2020 at 9:31 PM Yu Li wrote:
>
>
Congratulations Dian.
Thanks,
Zhu Zhu
hailongwang <18868816...@163.com> wrote on Fri, Jan 17, 2020 at 10:01 AM:
>
> Congratulations Dian !
>
> Best,
> Hailong Wang
>
>
>
>
> On 2020-01-16 21:15:34, "Congxian Qiu" wrote:
>
> Congratulations Dian Fu
>
&
Hi Ken,
This is actually a bug: a Partition should not require a UID. It is
fixed in 1.9.2 and 1.10; see FLINK-14910
<https://jira.apache.org/jira/browse/FLINK-14910>.
Thanks,
Zhu Zhu
Ken Krugler wrote on Fri, Jan 10, 2020 at 7:51 AM:
> Hi all,
>
> [Of course, right after hitting send I r
case.
KristoffSC wrote on Thu, Jan 9, 2020 at 9:26 PM:
> Thank you David and Zhu Zhu,
> this helps a lot.
>
> I have follow up questions though.
>
> Having this
> /"Instead the Job must be stopped via a savepoint and restarted with a new
> parallelism"/
>
> and slot s
e-flink-user-mailing-list-archive.2336050.n4.nabble.com/DISCUSS-Temporarily-remove-support-for-job-rescaling-via-CLI-action-quot-modify-quot-td27447.html
Thanks,
Zhu Zhu
KristoffSC wrote on Thu, Jan 9, 2020 at 1:05 AM:
> Hi all,
> I must say I'm very impressed by Flink and what it can do.
>
> I
Yes. State TTL is by default disabled.
Thanks,
Zhu Zhu
LakeShen wrote on Mon, Jan 6, 2020 at 10:09 AM:
> I saw the flink source code, I find the flink state ttl default is
> never expire,is it right?
>
> LakeShen wrote on Mon, Jan 6, 2020 at 9:58 AM:
>
>> Hi community,I have a question about flink
Hi Aaron,
It is thread safe since the state snapshot happens in the same thread as
the user function.
Thanks,
Zhu Zhu
Aaron Langford wrote on Thu, Dec 19, 2019 at 11:25 AM:
> Hello Flink Community,
>
> I'm hoping to verify some understanding:
>
> If I have a function with managed sta
Hi KristoffSC,
Flink does not support specifying the TM for tasks.
So I think you need to launch a separate job to do the "AsyncCall + map" in
the secured zone.
Thanks,
Zhu Zhu
KristoffSC wrote on Wed, Dec 18, 2019 at 8:04 PM:
> Hi,
> I have a question regarding job/operator deployment
original event was
processed.
Thanks,
Zhu Zhu
Rafi Aroch wrote on Wed, Dec 18, 2019 at 3:50 PM:
> Hi Pooja,
>
> Here's an implementation from Jamie Grier for de-duplication using
> in-memory cache with some expiration time:
>
> https://github.com/jgrier/FilteringExample/blob/master/src/mai
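The in-memory cache with an expiration time mentioned above can be sketched without any Flink dependency. All names here are illustrative (this is not the linked example's code or Flink's API): the cache remembers ids for a fixed TTL window and treats repeats inside the window as duplicates.

```java
import java.util.HashMap;
import java.util.Map;

/** Minimal TTL-based de-duplication cache sketch (illustrative names). */
public class DedupCache {
    private final long ttlMillis;
    // id -> timestamp of the last time it was seen
    private final Map<String, Long> seen = new HashMap<>();

    public DedupCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    /** Returns true if the id was not seen within the TTL window. */
    public boolean firstSeen(String id, long nowMillis) {
        Long last = seen.get(id);
        if (last != null && nowMillis - last < ttlMillis) {
            return false; // duplicate within the expiration window
        }
        seen.put(id, nowMillis); // remember (or refresh) the id
        return true;
    }
}
```

In a real Flink job this map would live in keyed state (or a Guava cache as in the linked example) so it survives checkpoints and is scoped per key.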
on cluster or submit a job
in job cluster mode.
Thanks,
Zhu Zhu
Sidney Feiner wrote on Tue, Dec 17, 2019 at 11:08 PM:
> I'm using Flink 1.9.1 with PrometheusPushGateway to report my metrics. The
> jobName the metrics are reported with is defined in the flink-conf.yaml
> file which makes the j
not be a problem with the unique
transaction_id assumption.
Thanks,
Zhu Zhu
Pooja Agrawal wrote on Tue, Dec 17, 2019 at 9:17 PM:
>
> Hi,
>
> I have a use case where we are reading events from kinesis stream.The
> event can look like this
> Event {
> event_id,
> transaction_id
>
Hi Jesús,
If your job has checkpointing enabled, you can monitor
'numberOfCompletedCheckpoints' to see whether the job is still alive and
healthy.
Thanks,
Zhu Zhu
Jesús Vásquez wrote on Tue, Dec 17, 2019 at 2:43 AM:
> The thing about numRunningJobs metric is that i have to configure in
Thanks Hequn for driving the release and everyone who makes this release
possible!
Thanks,
Zhu Zhu
Wei Zhong wrote on Thu, Dec 12, 2019 at 3:45 PM:
> Thanks Hequn for being the release manager. Great work!
>
> Best,
> Wei
>
> On Dec 12, 2019, at 15:27, Jingsong Li wrote:
>
> Thanks Hequn
the error is recoverable, you can just retry (or refresh the token), and
only complete the ResultFuture when it succeeds (or when it times out).
Thanks,
Zhu Zhu
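The retry-until-complete pattern above can be sketched with plain CompletableFutures. The names and the maxAttempts cap are illustrative, not Flink's AsyncFunction API; the assumption is only that the async client hands back a CompletableFuture per call.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

/** Sketch: retry a recoverable async operation, completing the outer
 *  future only on success (or exceptionally after the attempts run out). */
public class Retry {
    public static <T> CompletableFuture<T> withRetries(
            Supplier<CompletableFuture<T>> op, int maxAttempts) {
        CompletableFuture<T> result = new CompletableFuture<>();
        attempt(op, maxAttempts, result);
        return result;
    }

    private static <T> void attempt(
            Supplier<CompletableFuture<T>> op, int attemptsLeft,
            CompletableFuture<T> result) {
        op.get().whenComplete((value, error) -> {
            if (error == null) {
                result.complete(value);               // success
            } else if (attemptsLeft > 1) {
                attempt(op, attemptsLeft - 1, result); // recoverable: retry
            } else {
                result.completeExceptionally(error);   // give up
            }
        });
    }
}
```

Inside an async operator the outer future would only feed the ResultFuture once it completes, matching the advice above.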
M Singh wrote on Tue, Dec 10, 2019 at 8:51 PM:
> Thanks Jingsong for sharing your solution.
>
> Since both refreshing the token and the actual AP
parse it later
in the main method.
Thanks,
Zhu Zhu
Протченко Алексей wrote on Tue, Nov 19, 2019 at 12:29 AM:
>
> Hello all.
>
> I have a question about providing complex configuration to Flink job. We
> are working on some kind of platform for running used-defined packages
> which actually ca
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/deployment/docker.html#flink-job-cluster
[2]
https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/deployment/yarn_setup.html#run-a-single-flink-job-on-yarn
Thanks,
Zhu Zhu
amran dean wrote on Tue, Nov 19, 2019 at 5:53 AM:
> Is it possible
There is no plan for release 1.9.2 yet.
Flink 1.10.0 is planned to be released in early January.
Thanks,
Zhu Zhu
srikanth flink wrote on Mon, Nov 11, 2019 at 9:53 PM:
> Zhu Zhu,
>
> That's awesome and is what I'm looking for.
> Any update on when would be the next release date?
>
&
Hi Srikanth,
Is this issue what you encounter? FLINK-12122: a job would tend to fill one
TM before using another.
If it is, you may need to wait for the release 1.9.2 or 1.10, since it is
just fixed.
Thanks,
Zhu Zhu
vino yang wrote on Mon, Nov 11, 2019 at 5:48 PM:
> Hi srikanth,
>
> What
with the initial state, i.e. the job will consume data
from the very beginning, and a large amount of data may be reprocessed.
Thanks,
Zhu Zhu
钟旭阳 wrote on Tue, Nov 5, 2019 at 3:01 PM:
> hello:
>
>
> I am currently learning flink.I recently had a problem with Flink for
> disaster recovery testin
Hi Caio,
Did you check whether there are enough resources to launch the other nodes?
Could you attach the logs you mentioned? And elaborate how the tasks are
connected in the topology?
Thanks,
Zhu Zhu
Caio Aoque wrote on Wed, Oct 30, 2019 at 8:31 AM:
> Hi, I've been running some flink scala appl
source interfaces provided in ExecutionEnvironment, like
#readTextFile and #readFile, use FileInputFormat, so the input locality is
supported by default.
Thanks,
Zhu Zhu
Pritam Sadhukhan wrote on Mon, Oct 21, 2019 at 10:17 AM:
> Hi Zhu Zhu,
>
> Thanks for your detailed answer.
> Can you pleas
,
Zhu Zhu
Pritam Sadhukhan wrote on Fri, Oct 18, 2019 at 10:59 AM:
> Hi,
>
> I am trying to process data stored on HDFS using flink batch jobs.
> Our data is splitted into 16 data nodes.
>
> I am curious to know how data will be pulled from the data nodes with the
> same number of parall
I think ExecutionConfig.GlobalJobParameters is the way to do this if you
want to retrieve it in runtime.
Or you can just pass the name to each operator you implement to have it
serialized together with the udf.
Thanks,
Zhu Zhu
马阳阳 wrote on Tue, Oct 15, 2019 at 3:11 PM:
> As the title. Is it possible now?
I mean the Kafka source provided in Flink can correctly ignore null
deserialized values.
isEndOfStream allows you to control when to end the input stream.
If it is used for running infinite stream jobs, you can simply return false.
Thanks,
Zhu Zhu
John Smith wrote on Sat, Oct 12, 2019 at 8:40 PM:
>
a
null deserialized record.
Just pay attention not to make *DeserializationSchema#isEndOfStream* throw
errors when a null record is provided.
Thanks,
Zhu Zhu
John Smith wrote on Sat, Oct 12, 2019 at 5:36 AM:
> Hi using Flink 1.8.0.
>
> I am ingesting data from Kafka, unfortunately for the time bei
be java.io.tmpdir in standalone mode.
Thanks,
Zhu Zhu
John Smith wrote on Fri, Oct 11, 2019 at 2:41 AM:
> And can that folder be shared so that all nodes see it?
>
> On Thu, 10 Oct 2019 at 14:36, Yun Tang wrote:
>
>> Hi John
>>
>> The jar is not stored in HA path, I think
hange the parallelism via manual rescaling.
Thanks,
Zhu Zhu
Akshay Iyangar wrote on Fri, Sep 27, 2019 at 4:20 AM:
> Hi
>
> So we are running a beam pipeline that uses flink as its execution engine.
> We are currently on flink1.8
>
> So per the flink documentation I see that there is an option
"e12", 7L) AND
allowedLateness(Time.milliseconds(2L)) change
to allowedLateness(Time.milliseconds(4L)) *
The same as case 1).
Thanks,
Zhu Zhu
Indraneel R wrote on Thu, Sep 26, 2019 at 2:24 AM:
> Hi Everyone,
>
> I am trying to execute this simple sessionization pipeline, with the
> allo
We will then keep the decision that we do not support customized restart
strategy in Flink 1.10.
Thanks Steven for the inputs!
Thanks,
Zhu Zhu
Steven Wu wrote on Thu, Sep 26, 2019 at 12:13 AM:
> Zhu Zhu, that is correct.
>
> On Tue, Sep 24, 2019 at 8:04 PM Zhu Zhu wrote:
>
>> Hi
Yes. 1.8.2 contains all commits in 1.8.1.
Subramanyam Ramanathan
wrote on Wed, Sep 25, 2019 at 5:03 PM:
> Hi Zhu,
>
>
>
> Thanks a lot !
>
> Since 1.8.2 is also available, would it be right to assume 1.8.2 would
> also contain the fix ?
>
>
>
> Thanks,
>
> Subb
Hi Steven,
As a conclusion, since we will have a meter metric[1] for restarts,
customized restart strategy is not needed in your case.
Is that right?
[1] https://issues.apache.org/jira/browse/FLINK-14164
Thanks,
Zhu Zhu
Steven Wu wrote on Wed, Sep 25, 2019 at 2:30 AM:
> Zhu Zhu,
>
> Sorry, I
Hi Subramanyam,
I checked the commits.
There are 2 fixes in FLINK-10455; only releases 1.8.1 and 1.9.0
contain both of them.
Thanks,
Zhu Zhu
Subramanyam Ramanathan
wrote on Tue, Sep 24, 2019 at 11:02 PM:
> Hi Zhu,
>
>
>
> We also use FlinkKafkaProducer(011), hence I felt this fi
throw NoClassDefFoundError because the class loader gets closed.
Thanks,
Zhu Zhu
Subramanyam Ramanathan
wrote on Tue, Sep 24, 2019 at 8:07 PM:
> Hi,
>
>
>
> Thank you.
>
> I think the takeaway for us is that we need to make sure that the threads
> are stopped in the close() method.
>
>
>
> W
ric to show failovers that respects
fine-grained recovery.
[1] https://issues.apache.org/jira/browse/FLINK-14164
Thanks,
Zhu Zhu
Steven Wu wrote on Tue, Sep 24, 2019 at 6:41 AM:
>
> When we setup alert like "fullRestarts > 1" for some rolling window, we
> want to use counter. if it i
Hi Stephen,
I think disposing of static components in the closing stage of a task is
required.
This is because your code (operators/UDFs) is part of the task, meaning
it can only be executed while the task is not disposed.
Thanks,
Zhu Zhu
Stephen Connolly wrote on Tue, Sep 24, 2019 at 2:13 AM:
> Curren
not be restarted when task failures happen and the "fullRestart"
value will not increment in such cases.
I'd appreciate if you can help with these questions and we can make better
decisions for Flink.
Thanks,
Zhu Zhu
Steven Wu wrote on Sun, Sep 22, 2019 at 3:31 AM:
> Zhu Zhu,
>
> Flink fullRes
Thanks Steven for the feedback!
Could you share more information about the metrics you add in your
customized restart strategy?
Thanks,
Zhu Zhu
Steven Wu wrote on Fri, Sep 20, 2019 at 7:11 AM:
> We do use config like "restart-strategy:
> org.foobar.MyRestartStrategyFactoryFactory". Mainly t
Flink 1.10
Other usages are still supported, including all the strategies and
configuration approaches described in
https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/task_failure_recovery.html#restart-strategies
.
Feel free to share in this thread if you have any concerns about it.
Thanks,
Zh
estart-strategy:
org.foobar.MyRestartStrategyFactoryFactory".
The usage of restart strategies you mentioned will keep working with the
new scheduler.
Thanks,
Zhu Zhu
Oytun Tez wrote on Thu, Sep 12, 2019 at 10:05 PM:
> Hi Zhu,
>
> We are using custom restart strategy like this:
>
> environment.setRestartStrategy(f
hanks,
Zhu Zhu
Congratulations Zili!
Thanks,
Zhu Zhu
Terry Wang wrote on Wed, Sep 11, 2019 at 5:34 PM:
> Congratulations!
>
> Best,
> Terry Wang
>
>
>
> On Sep 11, 2019, at 5:28 PM, Dian Fu wrote:
>
> Congratulations!
>
> On Sep 11, 2019, at 5:26 PM, Jeff Zhang wrote:
>
> Congratulations Zili Che
tionEnvironment();"
Thanks,
Zhu Zhu
spoganshev wrote on Fri, Sep 6, 2019 at 11:39 PM:
> Due to OptimizerPlanEnvironment.execute() throwing exception on the last
> line
> there is not way to post-process batch job execution result, like:
>
> JobExecutionResult r = env.execute(); //
cs-release-1.9/getting-started/examples/
https://ci.apache.org/projects/flink/flink-docs-release-1.9/getting-started/tutorials/local_setup.html
Thanks,
Zhu Zhu
alaa wrote on Thu, Sep 5, 2019 at 3:10 PM:
> thank for your reply but Unfortunately this solution is not suitable
>
> <
> http://apach
client, I think you can change
"val env: StreamExecutionEnvironment = DemoStreamEnvironment.env" to be "val
env = StreamExecutionEnvironment.getExecutionEnvironment"
in the demo code like TotalArrivalCount.scala.
Thanks,
Zhu Zhu
alaa wrote on Wed, Sep 4, 2019 at 5:18 PM:
> Hallo
1s looks good to me.
And I think the conclusion about when a user should override the delay is
worth documenting.
Thanks,
Zhu Zhu
Steven Wu wrote on Tue, Sep 3, 2019 at 4:42 AM:
> 1s sounds a good tradeoff to me.
>
> On Mon, Sep 2, 2019 at 1:30 PM Till Rohrmann wrote:
>
>> Thanks
ed cache of YARN.
In this way, the localized dist jar can be shared by different YARN
applications and it will not be removed when the YARN application which
localized it terminates.
This requires some changes in Flink though.
We will open an issue to contribute this optimization to the community.
Thank
In our production, we usually override the restart delay to be 10 s.
We once encountered cases where external services were overwhelmed by
reconnections from frequently restarted tasks.
As a safer, though not optimized, option, a default delay larger than 0 s is
better in my opinion.
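The 10 s production delay described above can be sketched in flink-conf.yaml (key names from Flink's fixed-delay restart strategy; the attempt count is an arbitrary example):

```yaml
# flink-conf.yaml — restart failed tasks with a 10 s delay so that
# external services are not hammered by rapid reconnections
restart-strategy: fixed-delay
restart-strategy.fixed-delay.attempts: 3
restart-strategy.fixed-delay.delay: 10 s
```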
未来阳光 <2217232...@q
multiple TM logs. However it can be much smaller
than the "yarn logs ..." generated log.
Thanks,
Zhu Zhu
Yu Yang wrote on Fri, Aug 30, 2019 at 3:58 PM:
> Hi,
>
> We run flink jobs through yarn on hadoop clusters. One challenge that we
> are facing is to simplify flink job log access.
>
One optimization we use is letting YARN reuse the flink-dist jar
that was localized when running previous jobs.
Thanks,
Zhu Zhu
Jörn Franke wrote on Fri, Aug 30, 2019 at 4:02 PM:
> Increase replication factor and/or use HDFS cache
> https://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/
with a non-null value.
In that case, would you check if the docker image is storing a
pre-generated legacy(1.7.2) JobGraph which is not compatible with Flink 1.9?
Thanks,
Zhu Zhu
Steven Nelson wrote on Wed, Aug 28, 2019 at 11:23 PM:
> I am trying to update a cluster running in HA mode from 1.7.2 to 1.9.0. I
&
ink's task manager process.
For example for passing LD_LIBRARY_PATH as an env variable to the workers,
set: containerized.taskmanager.env.LD_LIBRARY_PATH: "/usr/lib/native" in
the flink-conf.yaml.
Thanks,
Zhu Zhu
Abhishek Jain wrote on Sun, Aug 25, 2019 at 2:48 AM:
> Hi Miki,
> Thanks for your
,
Zhu Zhu
Juan Gentile wrote on Fri, Aug 23, 2019 at 7:48 PM:
> Hello!
>
>
>
> We are running Flink on Yarn and we are currently getting the following
> error:
>
>
>
> *2019-08-23 06:11:01,534 WARN
> org.apache.hadoop.security.UserGroupInformation -
>
Thanks Gordon for the update.
Congratulations that we have Flink 1.9.0 released!
Thanks to all the contributors.
Thanks,
Zhu Zhu
Eliza wrote on Thu, Aug 22, 2019 at 8:10 PM:
>
>
> On Thu, Aug 22, 2019 at 8:03 PM, Tzu-Li (Gordon) Tai wrote:
> > The Apache Flink community is very happy to announce
Hi Vishwas,
You can configure "state.checkpoints.num-retained" to specify the max
checkpoints to retain.
By default it is 1.
Thanks,
Zhu Zhu
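The option above goes into flink-conf.yaml; a minimal sketch (the value of 3 is an arbitrary example):

```yaml
# flink-conf.yaml — keep the last 3 completed checkpoints
# instead of only the most recent one (the default of 1)
state.checkpoints.num-retained: 3
```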
Vishwas Siravara wrote on Thu, Aug 22, 2019 at 6:48 AM:
> I am also using exactly once checkpointing mode, I have a kafka source and
> sink
t the stored resource manager address in HA is
replaced by jobmanager address in any case?
Thanks,
Zhu Zhu
Aleksandar Mastilovic wrote on Thu, Aug 22, 2019 at 8:16 AM:
> Hi all,
>
> I’m experimenting with using my own implementation of HA services instead
> of ZooKeeper that would persist JobManager
Hi Lu,
I think it's OK to choose any way as long as it works.
Though I've no idea how you would extend SplittableIterator in your case.
The underlying implementation is ParallelIteratorInputFormat, and its
processing is not tied to a certain subtask index.
Thanks,
Zhu Zhu
Lu Niu wrote on Fri, Aug 16, 2019
Hi Jiangang,
Does "flink run -j jarpath ..." work for you?
If that jar is deployed to the same path on each worker machine, you can
try "flink run -C classpath ..." as well.
Thanks,
Zhu Zhu
刘建刚 wrote on Thu, Aug 15, 2019 at 5:31 PM:
> We are using per-job to load udf jar when
Split*s in one request, you can implement a new
*InputSplit* which wraps multiple *FileInputSplit*s. And you may need to
define in your *InputFormat* how to process the new *InputSplit*.
Thanks,
Zhu Zhu
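The wrapping idea above amounts to grouping many small splits into batches so each request hands a whole batch to a subtask. A dependency-free sketch (hypothetical names, no Flink types):

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: partition a list of splits into fixed-size batches, each of
 *  which a wrapping InputSplit could carry (illustrative names). */
public class SplitBatcher {
    public static <T> List<List<T>> batch(List<T> splits, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < splits.size(); i += batchSize) {
            // copy the sublist so each batch is independent of the source
            batches.add(new ArrayList<>(
                splits.subList(i, Math.min(i + batchSize, splits.size()))));
        }
        return batches;
    }
}
```

In the Flink setting, each batch would become one wrapping InputSplit, and the custom InputFormat would iterate the wrapped file splits inside it.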
Lu Niu wrote on Thu, Aug 15, 2019 at 12:26 AM:
> Hi,
>
> I have a data set backed by a director
Hi Vishwas,
If what you want is to set JVM options for Flink client JVM when running
jobs with "flink run", I think export the variable 'JVM_ARGS' does help.
Thanks,
Zhu Zhu
Vishwas Siravara wrote on Thu, Aug 15, 2019 at 4:03 AM:
> I understand that when I run a flink job from comm
parallelism individually, you can invoke setParallelism on
each operator.
Thanks,
Zhu Zhu
Zili Chen wrote on Mon, Aug 12, 2019 at 8:00 PM:
> Hi Cam,
>
> If you set parallelism to 60, then you would make use of all 60 slots you
> have and
> for you case, each slot executes a chained operator contai
Another possibility is the JM is killed externally, e.g. K8s may kill JM/TM
if it exceeds the resource limit.
Thanks,
Zhu Zhu
Zhu Zhu wrote on Mon, Aug 12, 2019 at 1:45 PM:
> Hi Cam,
>
> Flink master should not die when getting disconnected with task managers.
> It may exit for cases below:
>
connection to ZK
3. encounters other unexpected fatal errors. In this case we need to check
the log to see what happens then
Thanks,
Zhu Zhu
Cam Mach wrote on Mon, Aug 12, 2019 at 12:15 PM:
> Hello Flink experts,
>
> We are running Flink under Kubernetes and see that Job Manager
> die/restarted w
Hi Cam,
This case is expected due to slot sharing.
A slot can be shared by one instance of each of several different tasks,
so the number of used slots equals the max parallelism among your tasks.
You can specify the shared group with slotSharingGroup(String
slotSharingGroup) on operators.
Thanks,
Zhu Zhu
Abhishek Jain wrote on
JM main thread and increased
computation complexity of each RPC handling.
Thanks,
Zhu Zhu
qi luo wrote on Sun, Aug 11, 2019 at 6:17 PM:
> Hi Chad,
>
> In our cases, 1~2k TMs with up to ~10k TM slots are used in one job. In
> general, the CPU/memory of Job Manager should be increased with more TMs.
file exists but has some static initialization process that may
fail. This can also cause the class not to be loaded and result in a
NoClassDefFoundError.
Thanks,
Zhu Zhu
Subramanyam Ramanathan
wrote on Sat, Aug 10, 2019 at 2:38 PM:
> Hi.
>
>
>
> 1) The url pattern example :
> file:///r
data are somehow corrupted, this case
may happen.
Do you have the detailed error info on why your program exits?
That can be helpful to identify the root cause.
Thanks,
Zhu Zhu
Hynek Noll wrote on Fri, Aug 9, 2019 at 8:59 PM:
> Hi,
> I'm trying to implement a custom FileInputFormat (to rea
Hi pengchengling,
Does this issue happen before you submit any job to the cluster or
after some jobs are terminated?
If it's the latter case, did you wait for a while to see if the
unavailable slots became available again?
Thanks,
Zhu Zhu
pengcheng...@bonc.com.cn wrote on Fri, Aug 9, 2019 at 4 PM
cause.
Thanks,
Zhu Zhu
Subramanyam Ramanathan wrote on Fri, Aug 9, 2019
at 1:45 AM:
>
>
> Hello,
>
>
>
> I'm currently using flink 1.7.2.
>
>
>
> I'm trying to run a job that's submitted programmatically using the
> ClusterClient API.
>
>
han one URL. The protocol must be supported by the " + "{@link
java.net.URLClassLoader}."*
I think it should work.
Thanks,
Zhu Zhu
Vishwas Siravara wrote on Thu, Aug 8, 2019 at 1:03 AM:
> Hi ,
> I am running flink on a standalone cluster without any resource manager
> like yarn or K