Hi Chad,
In our case, 1-2k TMs with up to ~10k TM slots are used in one job. In
general, the JobManager's CPU/memory should be increased as the number of TMs
grows.
Regards,
Qi
> On Aug 11, 2019, at 2:03 AM, Chad Dombrova wrote:
>
> Hi,
> I'm still on my task management investigation, and I'm curious t
Congratulations and thanks for the hard work!
Qi
> On Aug 22, 2019, at 8:03 PM, Tzu-Li (Gordon) Tai wrote:
>
> The Apache Flink community is very happy to announce the release of Apache
> Flink 1.9.0, which is the latest major release.
>
> Apache Flink® is an open-source stream processing fra
Hi guys,
In Flink 1.9, HiveTableSink was added to support writing to Hive, but it only
supports batch mode. StreamingFileSink can write to HDFS in streaming mode,
but it has no Hive-related functionality (e.g. adding Hive partitions).
Is there any easy way we can streaming write to Hive (with exactl
ther tweaks you need to make.
>
> It's on our list for 1.10, not high priority though.
>
> Bowen
>
> On Wed, Sep 4, 2019 at 2:23 AM Qi Luo wrote:
>
>> Hi guys,
>>
>> In Flink 1.9 HiveTableSink is added to support writing to Hive, but it
>> only support
i
> Send Time: Friday, September 6, 2019, 05:21
> To:Qi Luo
> Cc:user ; snake.fly318 ;
> lichang.bd
> Subject:Re: Streaming write to Hive
>
> Hi,
>
> I'm not sure if there's one yet. Feel free to create one if not.
>
> On Wed, Sep 4, 2019 at 11:28 PM Qi Luo wrote:
> Hi
This is weird. Could you paste your entire exception trace here?
> On Nov 26, 2018, at 4:37 PM, Flink Developer
> wrote:
>
> In addition, after the Flink job has failed from the above exception, the
> Flink job is unable to recover from the previous checkpoint. Is this the
> expected behavior? Ho
Hi Alex and Lukas,
This error is controlled by another RPC timeout (which is hard-coded and not
affected by “akka.ask.timeout”). Could you open a JIRA issue so I can propose
a fix for it?
Cheers,
Qi
> On Dec 12, 2018, at 7:07 AM, Alex Vinnik wrote:
>
> Hi there,
>
> Run into the same prob
om>> wrote:
> Hi Qi,
>
> Thanks for looking into this. Here is ticket
> https://issues.apache.org/jira/browse/FLINK-11143
> <https://issues.apache.org/jira/browse/FLINK-11143>
>
> Best,
> -Alex
>
> On Tue, Dec 11, 2018 at 8:47 PM qi luo <mailto:lu
MERATE_NESTED_FILES_FLAG,
> true);
>
> JsonLinesInputFormat jsonInputFormat = new JsonLinesInputFormat(new
> Path(inputPath), configuration);
> jsonInputFormat.setFilesFilter(new BucketingSinkFilter());
>
> DataSet input = env.readFile(jsonInputFormat,
> inputPath).withParameters(co
Hi Gyula,
Your issue is possibly related to [1], where slots are prematurely released.
I’ve raised a PR, which is still pending review.
[1] https://issues.apache.org/jira/browse/FLINK-10941
> On Dec 20, 2018, at 9:33 PM, Gyula Fóra wrote:
>
> Hi!
>
> Since we have moved to the new execution mode wi
Hi Hao,
Since Flink uses a child-first class loader, you may try searching for the class
“com.zendesk.fraudprevention.examples.ConnectedStreams$$anon$90$$anon$45” in
your fat JAR. Is that an inner class?
Best,
Qi
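A quick way to perform the suggested check from the command line (the JAR path is hypothetical; `jar` ships with the JDK):

```shell
# List all entries in the fat JAR and search for the missing class.
# Anonymous/inner classes show up as Outer$$anon$N.class entries.
jar tf path/to/your-fat.jar | grep 'ConnectedStreams\$\$anon'
```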
> On Jan 3, 2019, at 7:01 AM, Hao Sun wrote:
>
> Hi,
>
> I am wondering if the
Hi,
We’re trying to distribute batch input data to N HDFS files, partitioned by
hash, using the DataSet API. What I’m doing looks like:
env.createInput(…)
.partitionByHash(0)
.setParallelism(N)
.output(…)
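The routing behind this pipeline can be mimicked in plain Java (a minimal sketch with made-up records and no Flink dependency): each record is assigned to one of N buckets by key hash, which is what partitionByHash does across the N parallel sink instances.

```java
import java.util.ArrayList;
import java.util.List;

public class HashPartitionSketch {
    public static void main(String[] args) {
        int n = 4; // number of target partitions (output files)
        List<List<String>> buckets = new ArrayList<>();
        for (int i = 0; i < n; i++) buckets.add(new ArrayList<>());

        // Route each record to a bucket by key hash, as partitionByHash(0)
        // does with the first tuple field.
        String[] records = {"alpha", "beta", "gamma", "delta", "epsilon"};
        for (String key : records) {
            int p = Math.floorMod(key.hashCode(), n);
            buckets.get(p).add(key);
        }

        // Every record lands in exactly one bucket.
        int total = 0;
        for (List<String> b : buckets) total += b.size();
        System.out.println(total); // prints 5
    }
}
```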
This works well for a small number of files. But when we need to distribute to
arallelizing the reads from the files, given
> the parallelism that you’re specifying.
>
> — Ken
>
>
>> On Mar 11, 2019, at 5:42 AM, qi luo <mailto:luoqi...@gmail.com> wrote:
>>
>> Hi,
>>
>> We’re trying to distribute batch input data
//ci.apache.org/projects/flink/flink-docs-release-1.7/api/java/org/apache/flink/api/common/functions/MapPartitionFunction.html>
> would be easiest, I think.
>
> — Ken
>
>
>
>> On Mar 12, 2019, at 2:28 AM, qi luo <mailto:luoqi...@gmail.com> wrote:
>
> On Mar 13, 2019, at 1:26 AM, qi luo <mailto:luoqi...@gmail.com> wrote:
>>
>> Hi Ken,
>>
>> Do you mean that I can create a batch sink which writes to N files?
>
> Correct.
>
>> That sounds viable, but since our data size is huge (billions of r
/flink-utils/>, for a rough but working
> version of a bucketing sink.
>
> — Ken
>
>
>> On Mar 13, 2019, at 7:46 PM, qi luo <mailto:luoqi...@gmail.com> wrote:
>>
>> Hi Ken,
>>
>> Agree. I will try partitionBy() to reduce the number
uffles are an important building block for this and there might be
> better support for your use case in the future.
>
> Best, Fabian
>
> On Fri, Mar 15, 2019 at 03:56, qi luo <mailto:luoqi...@gmail.com> wrote:
> Hi Ken,
>
> That looks awesome! I’ve
K-10429
> <https://issues.apache.org/jira/browse/FLINK-10429>
>
>
> On Fri, Mar 15, 2019 at 12:13, qi luo <mailto:luoqi...@gmail.com> wrote:
> Hi Fabian,
>
> I understand this is a by-design behavior, since Flink is firstly built for
> streaming. Su
Hi Yinhua,
This looks like the TM executing the sink is down, maybe due to OOM or some
other error. You can check the JVM heap and GC logs to see if there are any clues.
Regards,
Qi
> On Mar 28, 2019, at 7:23 PM, yinhua.dai wrote:
>
> Hi,
>
> I write a single flink job with flink SQL with vers
Hello,
Today we encountered an issue where our Flink job requested YARN containers
indefinitely. In the JM log below, there were errors when starting TMs (caused
by underlying HDFS errors), so the allocated containers failed and the job kept
requesting new containers. The failed containers
>
> Thanks,
> Rong
>
> [1] https://issues.apache.org/jira/browse/FLINK-10868
> <https://issues.apache.org/jira/browse/FLINK-10868>
> On Fri, Mar 29, 2019 at 5:09 AM qi luo <mailto:luoqi...@gmail.com> wrote:
> Hello,
>
> Today we encountered an issue
+1. It would be great if someone could benchmark the different GCs in Flink
(we may do it in the next few months).
I’m told that the default parallel GC provides better throughput but longer
pauses (we encountered 2min+ GC pauses on large datasets), whereas G1GC
provides shorter pauses but also
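For reference, the collector choice is made through JVM options in flink-conf.yaml. A sketch (verify the exact keys against your Flink version's configuration docs):

```yaml
# Applies to both JobManager and TaskManagers; use env.java.opts.jobmanager /
# env.java.opts.taskmanager to set them separately.
env.java.opts: "-XX:+UseG1GC -XX:MaxGCPauseMillis=200"
```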
Hi all,
We use Flink 1.5 batch and start thousands of jobs per day. Occasionally we
observed stuck jobs, due to TMs hanging in the “DEPLOYING” state.
The TM log shows that they are stuck downloading JARs in BlobClient:
...
INFO org.apache.flink.runtime.taskexecutor.TaskExecuto
.run(Thread.java:748)
I checked the latest master code. There’s still no socket timeout in the
BlobClient. Should I create an issue to add this timeout?
Regards,
Qi
> On Apr 19, 2019, at 7:49 PM, qi luo wrote:
>
> Hi all,
>
> We use Flink 1.5 batch and start thousands o
ly and isn't easy to reproduce.
> On Apr 25, 2019, at 6:40 PM, Dawid Wysakowicz wrote:
>
> Hi,
>
> Feel free to open a JIRA for this issue. By the way have you investigated
> what is the root cause for it hanging?
>
> Best,
>
> Dawid
>
> On 25/04/2019 0
Hi Stephan,
We have met similar issues to those Ken described. Will all of these issues
hopefully be fixed in 1.9?
Thanks,
Qi
> On Jun 26, 2019, at 10:50 PM, Stephan Ewen wrote:
>
> Hi Ken!
>
> Sorry to hear you are going through this experience. The major focus on
> streaming so far means that the
> other reasons? Or you can try to increase the value of
> `taskmanager.registration.timeout`. For allocating containers using multiple
> threads, I personally think it's going to get very complicated, and the more
> recommended way is to put some waiting work into asynchronous
> need to wait until the release code branch is cut.
>
> Thank you~
> Xintong Song
>
>
> On Wed, Jul 10, 2019 at 1:55 PM qi luo <mailto:luoqi...@gmail.com> wrote:
> Thanks Xintong and Haibo, I’ve found the fix in the blink branch.
>
> We’re also gl
Hi guys,
We run thousands of Flink batch jobs every day. The batch jobs are submitted
in attached mode, so the client knows when a job finishes and can then take
further actions. To respond to user abort actions, we submit the jobs with
“--shutdownOnAttachedExit” so the Flink cluster can
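The submission pattern described above can be sketched with the Flink CLI (the JAR path is hypothetical; `-sae` is the documented short form of `--shutdownOnAttachedExit`):

```shell
# Attached mode: the client blocks until the job finishes; with
# --shutdownOnAttachedExit the cluster shuts down if the client exits.
flink run -m yarn-cluster --shutdownOnAttachedExit path/to/batch-job.jar
```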
chanism now. To achieve this, I think it
> may be necessary to add a REST-based heartbeat mechanism between Dispatcher
> and Client. At present, perhaps you can add a monitoring service to deal with
> these residual Flink clusters.
>
> Best,
> Haibo
>
> At 2019-07-