Congratulations!
Best,
Guowei
On Tue, Mar 28, 2023 at 12:02 PM Yuxin Tan wrote:
> Congratulations!
>
> Best,
> Yuxin
>
>
> Guanghui Zhang wrote on Tue, Mar 28, 2023 at 11:06:
>
>> Congratulations!
>>
>> Best,
>> Zhang Guanghui
>>
>> Hang Ruan wrote on Tue, Mar 28, 2023 at 10:29:
>>
>> > Congratulations!
>> >
>> > Best,
astProcessFunction but if there's a way to avoid that also
> while also ensuring ordered processing of events, then do let me know.
>
> On Fri, Apr 29, 2022 at 7:35 AM Guowei Ma wrote:
>
>> Hi Vishal
>>
>> I want to understand your needs first. Your requirements are:
Hi Vishal
I want to understand your needs first. Your requirements are: After a
stateful operator receives a notification, it needs to traverse all the
data stored in the operator state, communicate with an external system
during the traversal process (maybe similar to join?). In order to improve
Hi Sam
I think the first step is to see which part of your Flink app is blocking
the completion of Checkpoint. Specifically, you can refer to the
"Checkpoint Details" section of the document [1]. Using these methods, you
should be able to observe where the checkpoint is blocked, for example, it
ma
Hi
AFAIK the commit-files action happens in the committer operator instead of
on the JM side after the new Sink API [1].
It means this would not happen if you use the new `FlinkSink` [2].
[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-143%3A+Unified+Sink+API
[2]
https://github.com/apach
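As an illustration only, wiring a sink through the unified Sink API looks
roughly like this (a sketch using Flink's FileSink, which is built on
FLIP-143; the imports, path, and `stream` variable are assumptions):

import org.apache.flink.api.common.serialization.SimpleStringEncoder;
import org.apache.flink.connector.file.sink.FileSink;
import org.apache.flink.core.fs.Path;

// The commit of finished files is handled by the committer operator
// inside the job graph, not by the JobManager.
FileSink<String> sink = FileSink
    .forRowFormat(new Path("hdfs:///output"), new SimpleStringEncoder<String>("UTF-8"))
    .build();
stream.sinkTo(sink);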
ht cause irreparable damage
> to applications but it could be configured per exception.
>
>
>
>
>
> Regards,
>
> José Brandão
>
>
>
> *From: *Guowei Ma
> *Date: *Friday, 15 April 2022 at 11:04
> *To: *Jose Brandao
> *Cc: *user@flink.apache.org
> *Subj
Hi, Jose
I assume you are using the DataStream API.
In general, for any UDF exception in a DataStream job, only the
developer of the DataStream job knows whether the exception can be
tolerated, because in some cases tolerating exceptions can cause errors in
the final result. So you still have
>
> On Wed, Apr 13, 2022 at 1:23 PM Guowei Ma wrote:
>
>> Hi
>> I think you need to export HADOOP_CLASSPATH correctly. [1]
>>
>> [1]
>> https://nightlies.apache.org/flink/flink-docs-
Hi
I think you need to export HADOOP_CLASSPATH correctly. [1]
[1]
https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/resource-providers/yarn/#preparation
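For reference, the preparation step in [1] boils down to running something
like the following before submitting (assuming a Hadoop client is installed
on the machine):

export HADOOP_CLASSPATH=`hadoop classpath`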
Best,
Guowei
On Wed, Apr 13, 2022 at 12:50 PM Anubhav Nanda
wrote:
> Hi,
>
> I have setup flink 1.13.5 and we are us
Hi Zhanghao
AFAIK, you should look at the `StreamingJobGraphGenerator`, not the
`JobGraphGenerator`, which is only used by the old Flink stream SQL stack.
From the comment on `StreamingJobGraphGenerator::isChainableInput`, a
union operator does not support chaining currently.
Best,
Guowei
On We
Hi, Huang
From the document [1], it seems that you need to close the iteration stream,
such as `iteration.closeWith(feedback);`.
BTW You also could get a detailed iteration example from here [2].
[1]
https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/datastream/operators/overview/#ite
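A minimal sketch of the pattern (the body logic is illustrative; assumes the
usual DataStream imports):

IterativeStream<Long> iteration = input.iterate();
DataStream<Long> body = iteration.map(v -> v - 1);
DataStream<Long> feedback = body.filter(v -> v > 0);
iteration.closeWith(feedback); // without this, the loop is never closed
DataStream<Long> output = body.filter(v -> v <= 0);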
It seems that the key's hashcode is not stable.
So would you like to show the details of `TraceKeyOuterClass.TraceKey`?
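For example (a sketch; the accessors are hypothetical): key on a stable
field rather than on the message object, whose hashCode may not be
deterministic across JVMs:

// Key on a deterministic field instead of the protobuf message itself.
KeySelector<TraceFragment, String> ks =
    fragment -> fragment.getTraceKey().getTraceId(); // hypothetical accessors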
Best,
Guowei
On Sun, Mar 20, 2022 at 3:21 PM Prashant Deva wrote:
> here is the key code (in kotlin)
>
> val ks = object: KeySelector TraceFragmentOuterClass.TraceFragm
Hi, Ken
If you are talking about the Batch scene, there may be another idea that
the engine automatically and evenly distributes the amount of data to be
processed by each Stage to each worker node. This also means that, in some
cases, the user does not need to manually define a Partitioner.
At p
Hi, Killian
Sorry for responding late!
I think there is no simple way to catch CSV processing errors. That
means that you need to do it yourself. (Correct me if I am missing
something.)
I think you could use the RocksDB State Backend [1], which would spill data
to disk.
[1]
https://nightlies.apac
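A minimal sketch of enabling it (Flink 1.13+ style; the checkpoint path is
an assumption):

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.setStateBackend(new EmbeddedRocksDBStateBackend()); // spills state to local disk
env.getCheckpointConfig().setCheckpointStorage("hdfs:///flink/checkpoints");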
> both 2.8.0 pulsar and 2.9.1 pulsar)
>
>
>
> Regards,
>
> Ananth
>
>
>
> *From: *Guowei Ma
> *Date: *Monday, 21 February 2022 at 4:57 pm
> *To: *Ananth Gundabattula
> *Cc: *user@flink.apache.org
> *Subject: *Re: Pulsar connector 2.9.1 failing job submissi
Hi,
You can try flink's cdc connector [1] to see if it meets your needs.
[1] https://github.com/ververica/flink-cdc-connectors
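For illustration, a MySQL CDC source is built roughly like this (a sketch
assuming a recent flink-cdc-connectors version; host, credentials, and table
names are placeholders):

MySqlSource<String> source = MySqlSource.<String>builder()
    .hostname("localhost")
    .port(3306)
    .databaseList("mydb")
    .tableList("mydb.orders")
    .username("flink")
    .password("secret")
    .deserializer(new JsonDebeziumDeserializationSchema())
    .build();
env.fromSource(source, WatermarkStrategy.noWatermarks(), "mysql-cdc");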
Best,
Guowei
On Mon, Feb 21, 2022 at 6:23 AM M Singh wrote:
> Hi Folks:
>
> I am trying to monitor a jdbc source and continuously streaming data in an
> application
Hi, Ananth
I don't see any error logs on the server side, so it's hard to tell what
the specific problem is. From the current log, there are two things to try
first:
1. From the client's log, it is a 5-minute timeout, so you can telnet
127.0.0.1:8086 to see if there is a problem with the local n
Hi, Kartik
FileSink does not expose the file name to the user now.
Would you like to share your scenario, which needs the file name?
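As a partial workaround, you can at least control the part-file naming via
OutputFileConfig (a sketch; prefix, suffix, and path are illustrative):

OutputFileConfig config = OutputFileConfig.builder()
    .withPartPrefix("mydata")
    .withPartSuffix(".txt")
    .build();
FileSink<String> sink = FileSink
    .forRowFormat(new Path("hdfs:///out"), new SimpleStringEncoder<String>())
    .withOutputFileConfig(config)
    .build();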
Best,
Guowei
Sent from my iPhone
> On Jan 30, 2022, at 6:38 PM, Kartik Khare wrote:
>
> Hi,
> For my use case, I want to get the part file name that is being created in
> the HDF
Hi, Shawn
Thank you for your sharing. Unfortunately I do not think there is an easy
way to achieve this now.
Actually we have a customer who has the same requirement but the scenario
is a little different. The bounded and unbounded pipelines have some
differences, but the customer wants to reuse some s
Hi, Shawn
You want to use the correct state(n-1) for day n-1 and the full amount of
data for day n to produce the correct state(n) for day n.
Then use state(n) to initialize a job to process the data for day n+1.
Am I understanding this correctly?
Best,
Guowei
Shawn Du wrote on Wed, Jan 26, 2022 at 7:15 PM:
Hi Shawn
Currently Flink cannot trigger a savepoint at the end of the input. An
alternative way might be to develop a customized source,
which triggers a savepoint when it notices that all the input splits have
been handled.
Or you could look at the state processor API [1], which might be helpful.
Hi, Shawn
I think Flink does not support this mechanism yet.
Would you like to share the scenario in which you need this savepoint at
the end of the bounded input?
Best,
Guowei
On Wed, Jan 26, 2022 at 1:50 PM Shawn Du wrote:
> Hi experts,
>
> assume I have several files and I want to replay these
> I noticed that the jobmanager has a very high CPU usage at the moment, like
> 2000%. I'm reasoning about the cause by profiling.
>
> Best,
> Paul Lam
>
> On Jan 21, 2022, at 09:56, Guowei Ma wrote:
>
> Hi, Paul
>
> Would you like to share some information such as the Flink version yo
w ones. Would
> you know how I can enforce a reading order here ?
>
> Thanks,
> Meghajit
>
>
>
>
> On Thu, Jan 20, 2022 at 2:29 PM Guowei Ma wrote:
>
>> Hi, Meghajit
>>
>> 1. From the implementation [1] the order of split depends on the
>> implem
Hi, Paul
Would you like to share some information, such as the Flink version you used
and the memory of the TM and JM?
And when does the timeout happen? For example, at the beginning of the job or
during the running of the job?
Best,
Guowei
On Thu, Jan 20, 2022 at 4:45 PM Paul Lam wrote:
> Hi,
>
> I’m tuning a
Hi, Meghajit
1. From the implementation [1], the order of splits depends on the
implementation of the FileSystem.
2. From the implementation [2], the order of the files also depends on the
implementation of the FileSystem.
3. Currently there is no such public interface, which you could extend to
imp
Hi, Mason
I assigned the JIRA to you.
Thanks for your contribution.
Best,
Guowei
On Wed, Jan 19, 2022 at 2:07 PM Mason Chen wrote:
> Hi all,
>
> There is some interest from our users to use prometheus push gateway
> reporter with a https endpoint. So, I've filed
> https://issues.apache.org/jira/br
Hi, summer
>>>Now I need to use a third-party jar in the Flink service, should I put
it under ${FLINK_HOME}/lib?
I think maybe an alternative way is to put the third-party jar into a fat
jar.
>>>How to enable Flink to automatically load third-party jars?
In general this is the JVM mechanism. It m
Hi Rahul
From your description, I guess you could try different flink.yaml files (one
for the server and another for the client).
I am not an expert about SSL and security stuff. So please correct me if I
am wrong.
Best,
Guowei
On Wed, Nov 24, 2021 at 3:54 AM Rahul wrote:
> Hello,
> I am trying to se
Hi Joseph
Would you like to give more details about the error message?
Best,
Guowei
On Thu, Nov 25, 2021 at 2:59 AM Joseph Lorenzini
wrote:
> Hi all,
>
>
>
> I have an implementation of KafkaDeserializationSchema interface that
> deserializes a kafka consumer record into a generic record.
>
>
>
e722/no_job/blob_t-274d3c2d5acd78ced877d898b1877b10b62a64df-590b54325d599a6782a77413691e0a7b
> does not exist and failed to copy from blob store.
> at
> org.apache.flink.runtime.blob.BlobServer.getFileInternal(BlobServer.java:516)
> at
> org.apache.flink.runtime.blob.BlobServer.getFile
rver.getFileInternal(BlobServer
> .java:516)
> at org.apache.flink.runtime.blob.BlobServer.getFileInternal(BlobServer
> .java:444)
> at org.apache.flink.runtime.blob.BlobServer.getFile(BlobServer.java:
> 369)
> at org.apache.flink.runtime.rest.handler.taskmanager.
> AbstractT
mode, but the problem is that I
> won't know when a batch job finishes if I don't run it in batch mode, since a
> streaming process will never end.
>
> Thanks.
>
> On Wed, Nov 3, 2021 at 4:38 PM Guowei Ma wrote:
>
>> Hi, Yan
>> I do not think it is a bug. Maybe
On Wed, Nov 3, 2021 at 1:58 PM Guowei Ma wrote:
>
>> Hi
>>
>> I did not run your program directly, but I see that you are now using the
>> Batch execution mode. I suspect it is related to this, because in the Batch
>> execution mode FLINK will "sort"
Hi Oliver
I think Alexey is right that you could not assume that the record would be
output in the event time order.
And there is a small addition: I see from your output that there are actually
multiple concurrent subtasks (probably 11). You also can't expect these
subtasks to be ordered accordi
Hi Long
From the API point of view, this processing time can be omitted. This is
mainly for unification of event-time and processing-time scenarios, and
alignment with other window APIs.
Thanks Jark Wu for telling me this offline.
Best,
Guowei
On Wed, Nov 3, 2021 at 11:55 AM Long Nguyễn
wrote:
>
Hi Kevin
If you want to change this configuration(execution.checkpointing.timeout)
without restarting the job, as far as I know, there may not be such a
method.
But could you consider increasing this value by default?
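For example (the value is illustrative), in flink-conf.yaml:

execution.checkpointing.timeout: 30min

or programmatically before submitting:

env.getCheckpointConfig().setCheckpointTimeout(30 * 60 * 1000); // milliseconds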
Best,
Guowei
On Wed, Nov 3, 2021 at 5:15 AM Kevin Lam wrote:
> Hi all,
>
> W
Thanks Daisy & Kevin very much for your introduction to the improvement of the
TM blocking shuffle; credit-based + IO scheduling is indeed a very interesting
thing. At the same time, I look forward to this becoming a default setting for
the TM blocking shuffle.
Best,
Guowei
On Wed, Nov 3, 2021 at 2:46 PM Gen Luo wrote
Hi, Smith
It seems that the log file(blob_t-274d3c2d5acd78ced877d898b1877b10b62a64df-
590b54325d599a6782a77413691e0a7b) is deleted for some reason. But AFAIK
there are no other reports of this exception. (Maybe others know
what would happen.)
1. I think if you could refresh the page and you
Hi
I did not run your program directly, but I see that you are now using the
Batch execution mode. I suspect it is related to this, because in the Batch
execution mode FLINK will "sort" the Key (this might be an unstable sort).
So would you like to experiment with the results of running with Strea
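For example, to force streaming execution for the experiment (Flink 1.12+):

env.setRuntimeMode(RuntimeExecutionMode.STREAMING);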
Hi, Qihua
AFAIK there is no way to do it. Maybe you need to implement a "new" sink to
achieve this target.
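A minimal sketch of such a wrapper (assuming both delegates are
RichSinkFunction instances; error handling omitted):

public class DualSink<T> extends RichSinkFunction<T> {
    private final RichSinkFunction<T> first;
    private final RichSinkFunction<T> second;

    public DualSink(RichSinkFunction<T> first, RichSinkFunction<T> second) {
        this.first = first;
        this.second = second;
    }

    @Override
    public void open(Configuration parameters) throws Exception {
        // Propagate the runtime context so the delegates behave as normal sinks.
        first.setRuntimeContext(getRuntimeContext());
        second.setRuntimeContext(getRuntimeContext());
        first.open(parameters);
        second.open(parameters);
    }

    @Override
    public void invoke(T value, Context context) throws Exception {
        // Forward every record to both delegates.
        first.invoke(value, context);
        second.invoke(value, context);
    }

    @Override
    public void close() throws Exception {
        first.close();
        second.close();
    }
}

Note that this gives no atomicity guarantee across the two sinks; on failure,
a record may reach one sink and not the other.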
Best,
Guowei
On Wed, Nov 3, 2021 at 12:40 PM Qihua Yang wrote:
> Hi,
>
> Our flink application has two sinks(DB and kafka topic). We want to push
> same data to both sinks. Is it possibl
e given?
> Also I see that the FileSink takes GenericRecord, so how can the
> DataStream be converted to a GenericRecord?
>
> Please bear with me if my questions don't make any sense.
>
> On Sun, Sep 26, 2021 at 9:12 AM Guowei Ma wrote:
>
>> Hi, Harshvardhan
>>
Hi, Harshvardhan
I think CaiZhi is right.
I only have a small addition. Because I see that you want to convert Table
to DataStream, you can look at FileSink (ParquetWriterFactory)[1].
[1]
https://ci.apache.org/projects/flink/flink-docs-master/docs/connectors/datastream/file_sink/#bulk-encoded-for
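For illustration, a bulk-encoded Parquet sink for GenericRecord looks roughly
like this (a sketch assuming the flink-parquet dependency and an Avro Schema
named `schema`):

FileSink<GenericRecord> sink = FileSink
    .forBulkFormat(new Path("hdfs:///out"), ParquetAvroWriters.forGenericRecord(schema))
    .build();
stream.sinkTo(sink);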
Hi, Puneet
Could you share whether you are using Flink's session mode or application
mode?
From the log, you are using `StandaloneDispatcher`, but it is used in
both session and application mode.
If you use application mode, this might be in line with expectations.
Best,
Guowei
On Fri, Se
mer plus an interval time.
> If a MAX_WATERMARK comes, the timer is triggered, then registers another
> timer, and so on forever.
> I'm not sure whether Marco meets a similar problem.
>
> Best,
> JING ZHANG
>
>
>
> Guowei Ma wrote on Fri, Sep 24, 2021 at 4:01 PM:
>
>> Hi Marco
>
Hi Marco
Indeed, as mentioned by JING, if you drain when triggering a
savepoint, you will encounter this MAX_WATERMARK.
But I have a question: in theory, even with MAX_WATERMARK, there will not be
an infinite number of timers. And these timers should be generated by the
application code.
You
Hi, Thomas
I am not an expert on S3, but I think Flink needs write/read/delete (and maybe
list) permissions on the path (bucket).
BTW, What error did you encounter?
Best,
Guowei
On Fri, Sep 24, 2021 at 5:00 AM Thomas Wang wrote:
> Hi,
>
> I'm trying to figure out what exact s3 permissions does a flink
Hi Hill
As far as I know, you cannot use byte[] as a keyBy key. You could find more
information in [1].
[1]
https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/dev/datastream/operators/overview/#keyby
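A common workaround is to derive a stable key from the bytes instead of
keying on the array itself (a sketch; `getKeyBytes` is a hypothetical
accessor, and the bytes are assumed to be valid UTF-8):

// Arrays use identity hashCode, so wrap the bytes in a value with
// stable equals/hashCode, e.g. a String.
stream.keyBy(record -> new String(record.getKeyBytes(), StandardCharsets.UTF_8));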
Best,
Guowei
On Fri, Sep 24, 2021 at 3:15 PM Caizhi Weng wrote:
> Hi!
>
> It de
can pull the logs?
>
> Thanks,
> Hemant
>
> On Tue, Sep 14, 2021 at 12:22 PM Guowei Ma wrote:
>
>> Hi,
>>
>> Could you share some logs when the job fails?
>>
>> Best,
>> Guowei
>>
>>
>> On Mon, Sep 13, 2021 at 10:59 PM bat man wr
Hi, Puneet
In general, every job has its own classloader. You could find more detailed
information in the doc [1].
You could put common jars into "/lib" to avoid this [2].
[1]
https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/ops/debugging/debugging_classloading/
[2]
https://ni
Hi,
Could you share some logs when the job fails?
Best,
Guowei
On Mon, Sep 13, 2021 at 10:59 PM bat man wrote:
> Hi,
>
> I am running a POC to evaluate Flink on Native Kubernetes. I tried
> changing the default log location by using the configuration -
> kubernetes.flink.log.dir
> However, th
Hi, Ashutosh
As far as I know, there is no way to rename the system metrics.
But would you like to share why you need to rename the metrics?
Best,
Guowei
On Mon, Sep 13, 2021 at 2:29 PM Ashutosh Uttam
wrote:
> Hi team,
>
> We are using PrometheusReporter to expose Flink metrics to Prome
Hi, Kevin
1. Could you give me some specific information, such as what version of
Flink you are using, and is it using DataStream or SQL?
2. As far as I know, RocksDB will put state on disk, so it will not consume
memory all the time and cause OOM in theory.
So you can see if there are any obje
am?
>
> Thanks,
>
> On Thu, Sep 2, 2021 at 9:24 PM Guowei Ma wrote:
>
>> Hi, John
>>
>> I agree with Caizhi that you might need to customize a window trigger.
>> But there is a small addition, you need to convert Table to DataStream
>> first.
>>
Hi, John
I agree with Caizhi that you might need to customize a window trigger. But
there is a small addition, you need to convert Table to DataStream first.
Then you can customize the trigger of the window. Because as far as I know,
Table API does not support custom windows yet. For details on ho
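A rough sketch of that conversion (Flink 1.13 style; the key, window size,
`MyTrigger`, and `MyWindowFunction` are hypothetical):

DataStream<Row> stream = tableEnv.toDataStream(table);
stream.keyBy(row -> (String) row.getField("userId"))
    .window(TumblingEventTimeWindows.of(Time.minutes(5)))
    .trigger(new MyTrigger())        // your custom Trigger implementation
    .process(new MyWindowFunction());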
Hi, Julian
I notice that your configuration
includes "restart-strategy.fixed-delay.attempts: 10". It means that the job
would fail after 10 times failure. So maybe it leads to the job not
restarting again and you could increase this value.
But I am not sure if this is the root cause. So if this do
Hi, Niklas
As far as I know, the maximum parallelism is not currently displayed on the
web UI. Maximum parallelism is a per-operator concept, so I
understand that it is a little difficult to show. However, each job can
have its own default value; if not, there is a calculation meth
nTotalEvents
> super.invoke(value, context)
> sessionsWritten.inc()
> }
>
> Though I still get Caused by: org.apache.flink.util.SerializedThrowable:
> null
> So, my assumption is that something wrong with "override def open()" method
>
> Thanks!
Hi, Daniel
Could you tell me the version of Flink you use? I want to look at the
corresponding code.
Best,
Guowei
On Wed, Aug 11, 2021 at 11:23 PM Daniel Vol wrote:
> Hi Matthias,
>
> First, thanks for a fast reply.
> I am new to Flink, so probably I miss a lot in terms of flow and objects
> pa
Hi Burcu
Could you show more logs? I could try to help find out what is happening.
But to be honest, 1.4 is too old a version and the community no longer
supports it. You'd better upgrade to a newer version.
Best,
Guowei
On Fri, Jun 25, 2021 at 2:48 PM Burcu Gül POLAT EĞRİ
wrote:
> Dear All,
>
>
Hi Pranjul
There are already some system metrics that track the jvm
status(CPU/Memory/Threads/GC). You could find them in the [1]
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/ops/metrics/#system-metrics
Best,
Guowei
On Fri, Jun 25, 2021 at 2:33 PM Pranjul Ahuja wrote:
Hi Sandeep
What I understand is that you want to manipulate the state. So I think you
could use the old schema to read the state first, and then write it to a
new schema, instead of using a new schema to read an old schema format data.
In addition, I would like to ask, if you want to do "State Sch
Hi Qihua
It seems that the job fails because of a checkpoint timeout (10 min), from the
second picture. I found that the checkpoint failure is because one of your own
custom sources could not acknowledge the checkpoint.
So I think you could add some log in your source to figure out what is
happening at the moment.
Hi Padarn
Will there be these errors if the jobgraph is not modified?
In addition, is this the whole error stack? Is it possible that other errors
caused the stream to be closed?
Best,
Guowei
On Tue, Jun 15, 2021 at 9:54 PM Padarn Wilson wrote:
> Hi all,
>
> We have a job that has a medium size state
Hi, Lingfeng
Did these job errors you posted happen when the job (`SpendReport`) was
running in the IDE?
According to my understanding, this document [1] & repository [2] mean that
the example is to be run in Docker, not in the IDE.
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/try
Hi, Jiang
I am afraid I might misunderstand what you mean, so could you elaborate on how
you want to change it? For example, which interface or class do you want to
add a method to?
Although I am not a state expert, as far as I know, due to incremental
checkpoints, when CompleteCheckpoint is discardin
Hi, Marco
I think you could try the `FileSource`, and you could find an example in
[1]. The `FileSource` scans the files under the given
directory recursively.
Would you mind opening an issue for lacking the document?
[1]
https://github.com/apache/flink/blob/master/flink-connectors/flink-con
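A minimal sketch (the format class is named TextLineInputFormat in recent
Flink versions and TextLineFormat in older ones; the path is illustrative):

FileSource<String> source = FileSource
    .forRecordStreamFormat(new TextLineInputFormat(), new Path("hdfs:///input"))
    .build();
DataStream<String> lines =
    env.fromSource(source, WatermarkStrategy.noWatermarks(), "file-source");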
Hi,
Would you like to share your code? It is very helpful to verify the
problem.
I think you could use the `JoinedStream.with().uid(xxx)` to set the
name/UID.
Best,
Guowei
On Mon, May 24, 2021 at 2:36 PM Marco Villalobos
wrote:
> Hi,
>
> Stream one has one element.
> Stream two has 2 el
Hi, Yik San
You need to change the following line:
protected final val LOG = LoggerFactory.getLogger(getClass)
Scala has no `static` keyword, so to get the static-logger equivalent, move it
into a companion object, e.g. (class name illustrative):
object MyJob { val LOG = LoggerFactory.getLogger(classOf[MyJob]) }
Best,
Guowei
On Mon, May 24, 2021 at 2:41 PM Yik San Chan
wrote:
> Hi community,
>
> I have a job that cons
Hi
I think you are right that the metrics are reset after the job restarts. This
is because the metrics are only stored in memory.
I think you could store the metrics in Flink state [1], which could be
restored after the job restarts.
[1]
https://ci.apache.org/projects/flink/flink-docs-rele
Hi, Gary
I think it might be a bug, so would you like to open a JIRA for this?
And could you share the exception in which the TaskManagerLocation is null?
It might be very helpful for verifying the cause.
Best,
Guowei
On Thu, May 13, 2021 at 10:36 AM Yangze Guo wrote:
> Hi, it seems to be related t
Hi,
In fact, not only will the JobManager (ResourceManager) kill a timed-out
TaskManager, but if a TaskManager finds that it cannot connect to the
JobManager (ResourceManager), it will also exit by itself.
You can look at the time period during which the HB timeout occurred and
what happened in the log. Under no
Hi
I think you could configure some restart strategy[1] likes
restart-strategy: fixed-delay
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/execution/task_failure_recovery/#fixed-delay-restart-strategy
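A complete flink-conf.yaml example (values illustrative):

restart-strategy: fixed-delay
restart-strategy.fixed-delay.attempts: 3
restart-strategy.fixed-delay.delay: 10s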
Best,
Guowei
On Thu, May 13, 2021 at 12:02 PM 1095193...@qq.com <109
Hi Sudhansu,
I think you do not need to set the config in flink-conf.
Best,
Guowei
On Thu, May 13, 2021 at 1:06 PM sudhansu jena
wrote:
> Hi Team,
>
> We have recently enabled Check Pointing in our flink job using
> FSStateBackend pointing to S3 bucket.
>
> Below is the sample code for enabling
for your reply! This information was still missing. The presenter
> mentioned the documentation but I hadn't found it. So your link to the
> specific place is valuable too.
>
> Günter
> On 13.05.21 06:09, Guowei Ma wrote:
>
> Hi,
> I do not try it. But from the documenta
Hi,
I have not tried it, but from the documentation [1] it seems that you might
need to add "jobmanager.rpc.address: jobmanager" to the FLINK_PROPERTIES
before creating a network.
[1]
https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/resource-providers/standalone/docker/
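Roughly following that page (image tag and names illustrative):

FLINK_PROPERTIES="jobmanager.rpc.address: jobmanager"
docker network create flink-network
docker run --rm --name=jobmanager --network flink-network \
    --env FLINK_PROPERTIES="${FLINK_PROPERTIES}" flink:latest jobmanager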
Best,
Gu
Hi, chenxuying
There is currently no official support for this.
What I am curious about is why you have this requirement. In theory, you
can always build your own image.
Best,
Guowei
On Mon, Apr 19, 2021 at 9:58 PM chenxuying wrote:
> Hi all, I deployed the flink in K8S by session cluster [1]
>
Hi, Olivier
Yes. The introduction of this concept is to solve the problem of rescaling
the keyed state.
Best,
Guowei
On Mon, Apr 19, 2021 at 8:56 PM Olivier Nouguier
wrote:
> Hi,
> May I have the confirmation that the max-parallelism limitation only
> occurs when keyed states are used ?
>
>
> --
t; input.b,
>> input.c.avg.over(col('w'))) \
>> .execute_insert('MySink') \
>> .wait()
>>
>> But running into following exception:
>>
>> py4j.protocol.Py4JError: An error occurred while calling
>> z:org.ap
Hi, Mary
Flink has an alignment mechanism for synchronization. All upstream
tasks (for example reduce1) will send a message after the end of a round
to inform all downstream tasks that they have processed all the data. When the
downstream (reduce2) has collected all the messages from all of its upstream
t
umn - more like pass through
> from input to sink. What's the best way to achieve this? I was thinking
> that making it part of the select() clause would do it, but as you said
> there needs to be some aggregation performed on it.
>
> Thanks,
> Sumeet
>
>
> On Mon
Hi, Sumeet
For "input.b" I think you should aggregate the non-group-key
column[1].
But I am not sure why the "input.c.avg.alias('avg_value')" has resolved
errors. Would you mind giving more detailed error information?
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/tabl
he permission to merge this feature [1], it's a
>>>> useful improvement to sql client and won't affect
>>>> other components too much. We were planning to merge it yesterday but met
>>>> some tricky multi-process issue which
>>>> has a very high
ster - Using application-defined
> state backend: RocksDBStateBackend{checkpointStreamBackend=File State
> Backend (checkpoints: 'gs:///flink-checkpoints', savepoints:
> 'gs:///flink-savepoints', asynchronous: TRUE, fileStateThreshold:
> 1048576), localRocksDbDirectories
azonS3Client.java:5062)
> ~[?:?]
> at
> com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5008)
> ~[?:?]
> at
> com.amazonaws.services.s3.AmazonS3Client.initiateMultipartUpload(AmazonS3Client.java:3581)
> ~[?:?]
> at
> org.apache.hadoop.
Hi,
I think there are many reasons that could lead to the checkpoint timeout.
Would you like to share some detailed information about the checkpoint? For
example, the detailed checkpoint information from the web UI [1]. And which
Flink version do you use?
[1]
https://ci.apache.org/projects/flink/flink-docs-
Hi, Yaroslav
AFAIK there is no official GCS FileSystem support in Flink. Is the GCS
filesystem implemented by yourself?
Would you like to share the whole log of the JM?
BTW: From the following log, I think the implementation already has some
retry mechanism.
>>> Interrupted while sleeping before retry. Giv
Hi, Rex
I think that Flink does not have an official release that supports the ARM
architecture. There are some efforts and discussion [1][2][3] about
supporting the architecture. I think you could find some builds at
openlabtesting [4].
But AFAIK there is no clear timeline for that. (Correct me
Hi, Sumeet
I am not an expert on PyFlink, but I think @Dian Fu
might give some insight into this problem.
Best,
Guowei
On Thu, Apr 1, 2021 at 12:12 AM Sumeet Malhotra
wrote:
> Cross posting from StackOverlow here:
>
>
> https://stackoverflow.com/questions/66888486/pyflink-extract-nested
Hi, Kevin
If you use RocksDB and want to know the data size on disk, I think that
is the right metric. But the SST files might include some expired data,
and some data in memory is not included in the SST files yet. In general I
think it could reflect the state size of your application.
I think tha
Hi, Robert
I think you could try to change "s3://argo-artifacts/" to
"s3a://argo-artifacts/".
This is because currently `StreamingFileSink` only supports Hadoop-based
S3, not Presto-based S3 [1].
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/connectors/streamfile_
Hi, community:
Friendly reminder that today (3.31) is the last day of feature development.
Under normal circumstances, you will not be able to submit new features
from tomorrow (4.1). Tomorrow we will create 1.13.0-rc0 for testing;
everyone is welcome to help test together.
After the test is relatively stable
${FLINK_HOME}/plugins/s3-fs-presto
> COPY flink-s3-fs-presto-1.12.2.jar $FLINK_HOME/plugins/s3-fs-presto/
>
> Thanks!
>
> On Thu, Mar 25, 2021 at 9:24 AM Guowei Ma wrote:
>
>> Hi,
>> After some discussion with Wang Yang offline, it seems that there might
>> be
Hi,
I am not an expert on JMH, but it seems that this is not an error. From the
log it looks like the job has not finished:
the data source continues to read data when JMH finishes.
Thread[Legacy Source Thread - Source:
TableSourceScan(table=[[default_catalog, default_database,
CLICKHOUSE_SOURCE_
Hi,
After some discussion with Wang Yang offline, it seems that there might be
a JobManager failover. So would you like to share the full JobManager log?
Best,
Guowei
On Wed, Mar 24, 2021 at 10:04 PM Lukáš Drbal wrote:
> Hi,
>
> I would like to use native kubernetes execution [1] for one batch job
Hi, Roc
Thanks for your detailed explanation.
I could not find any "stream" operator that uses `ExternalSorterBuilder`
using IDEA's "find usages".
Best,
Guowei
On Wed, Mar 24, 2021 at 3:27 PM Roc Marshal wrote:
> Hi, Guowei Ma.
> As far as I know, flink writes s
Hi, Roc
Could you explain more about your question?
Best,
Guowei
On Wed, Mar 24, 2021 at 2:47 PM Roc Marshal wrote:
> Hi,
>
> Can someone tell me where flink uses memory spilling to write to disk?
> Thank you.
>
> Best, Roc.
>
>
>
>
Hi,
You need some persistent storage (like HDFS) for the checkpoints. It is a
prerequisite of Flink's fault tolerance [1].
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/stream/state/checkpointing.html#prerequisites
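For example, in flink-conf.yaml (the path is illustrative):

state.checkpoints.dir: hdfs:///flink/checkpoints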
Best,
Guowei
On Wed, Mar 24, 2021 at 1:21 PM Maminspapin wrote:
Hi, M
Could you give the full stack? This might not be the root cause.
Best,
Guowei
On Wed, Mar 24, 2021 at 2:46 AM Claude M wrote:
> Hello,
>
> I'm trying to setup Flink in Kubernetes using the Application Mode as
> described here:
> https://ci.apache.org/projects/flink/flink-docs-master/docs/
I am
>> temporarily using taskmanager.memory.network.batch-shuffle-read.size in
>> my PR now. Any suggestions about that?
>>
>> Best,
>> Yingjie (Kevin)
>>
>> --
>> From: Guowei Ma
>> Date: 2021-03-09 17:28:35