Hi, Sohi
Seems like the checkpoint file
`hdfs:/pipeline/job/checkpoints/e9a08c0661a6c31b5af540cf352e1265/chk-470/5fb3a899-8c0f-45f6-a847-42cbb71e6d19`
did not exist for some reason. You can check the life cycle of this file
in the HDFS audit log to find out why it did not exist. Maybe the
chec
Yes. The file got deleted:
2019-01-15 10:40:41,360 INFO FSNamesystem.audit: allowed=true ugi=hdfs
(auth:SIMPLE) ip=/192.168.3.184 cmd=delete
src=/pipeline/job/checkpoints/e9a08c0661a6c31b5af540cf352e1265/chk-470/5fb3a899-8c0f-45f6-a847-42cbb71e6d19
dst=null perm=null
Maybe you're generating a non-standard JSON record.
Hi Zhenghua
Yes, the topic is polluted somehow. After I created a new topic to consume
from, it is OK now.
Yours sincerely
Joshua
On Tue, Jan 15, 2019 at 4:28 PM Zhenghua Gao wrote:
> Maybe you're generating a non-standard JSON record.
Hi!
Lately I seem to be hitting a bug in the rocksdb timer service. This
happens mostly at checkpoints but sometimes even at watermark:
java.lang.RuntimeException: Exception occurred while processing valve
output watermark:
at
org.apache.flink.streaming.runtime.io.StreamTwoInputProcessor
Hi,
I have never seen this before. I would expect to see this exception if the
write batch was flushed and contained a write against a column family that
does not exist (anymore). However, we initialize everything relevant in
RocksDBCachingPriorityQueueSet as final (CF handle) and never dro
Hi John,
this is definitely not how Flink should behave in this situation and could
indicate a bug. From the logs I couldn't figure out the problem. Would it
be possible to obtain the full logs for the TMs and JM at DEBUG log
level? This would help me debug the problem further.
Cheers,
Till
Hi John,
this looks strange indeed. How many concurrent operators do you have that
write state to S3?
After the cancellation, the JobManager should keep the slots for some time
until they are freed. This is the normal behaviour and can be controlled
with `slot.idle.timeout`. Could you maybe shar
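For reference, a minimal sketch of the corresponding flink-conf.yaml entry;
the value is in milliseconds and 50000 is, if I remember correctly, the
default:

# how long the JobManager keeps idle slots before releasing them (ms)
slot.idle.timeout: 50000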
Hi Alexandru,
you can use the `modify` command `bin/flink modify <jobId> --parallelism
<newParallelism>` to modify the parallelism of a job. At the moment, it is
implemented as first taking a savepoint, stopping the job, and then
redeploying it with the changed parallelism, resuming from the
savepoint.
Cheers,
Till
Same here Pasquale, the logs on DEBUG log level could be helpful. My guess
would be that the respective tasks are overloaded or there is some resource
congestion (network, disk, etc.).
You should see in the web UI the number of incoming and outgoing events. It
would be good to check that the events
Hi All,
I use the following code to try to build a RestClient:
org.elasticsearch.client.RestClient.builder(new HttpHost(xxx, xxx, "http")).build()
but at runtime a NoSuchMethodError is thrown. I think the
reason is:
There are two RestClient classes, one
Hi
As you know, TableFactoryService has many methods for finding a suitable
service to load. Some of them use a user-defined classloader; the others
just use the default classloader.
Now I use ConnectTableDescriptor to registerTableSource in the environment,
which uses TableFactoryUtil to load the service,
Hi,
Just an update from our side. We couldn't find anything specific in the
logs, and the problem is not easily reproducible. This week the system is
running fine, which also makes me suspect some resourcing issue.
But so far we haven't been able to find the reason, though we have
discarded
Hi,
I have seen a few cases where, for certain jobs, a small imbalance in the
state partition assignment cascaded into a larger imbalance of the job. If
your max parallelism mod parallelism is not 0, it means that some tasks have
one partition more than others. Again, depending on how much par
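To make that concrete with some arithmetic of my own (not taken from this
job): with max parallelism 128 and parallelism 20, 128 mod 20 = 8, so 8
subtasks are assigned 7 key groups each while the other 12 get only 6,
a built-in skew of roughly 17% before any data skew comes on top.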
Hi Stefan,
Thanks for your suggestion. As you may see from the original screenshot,
the actual state is small, even smaller than that of some of the other
subtasks. We are consuming from a Kafka topic with 600 partitions, with
parallelism set to around 20. Our metrics show that all the subtasks
Dear Flink users and developers!
I am starting this discussion to collect feedback about maintaining a custom
RocksDB branch for Flink, in case anyone sees problems with this approach.
Are there people who already use a custom RocksDB build with the
RocksDB state backend?
As you might already know,
Hi all,
I was wondering if anybody has any recommendations on making HTTP requests
from Flink to another service.
In the long term we are looking for a solution that is both performant and
integrates well with our Flink program.
Does it matter which library we use? Do we need a special connector
Thanks Till!
To execute the above (using Kubernetes), would one enter the running
JobManager service and execute it there?
Does the following REST API call do the same: `/jobs/:jobid/rescaling`?
I assume it changes the base parallelism, but what will it do if I have
already set the parallelism of my operato
I can send you some debug logs and the execution plan, can I use your personal
email? There might be sensitive info in the logs.
Incoming and outgoing records are fairly evenly distributed across subtasks,
with similar but alternating loads; when the checkpoint is triggered, the
load drops to nearly zero
Hi Joshua,
Could you use `TableFactoryService` directly to register TableSource? The
code looks like:
final TableSource tableSource =
> TableFactoryService.find(StreamTableSourceFactory.class,
> streamTableDescriptor, classloader)
> .createStreamTableSource(propertiesMap);
> tableEnv.registerTabl
Thanks for the tip Stefan, you are probably right that this might be
related to a custom change.
We have a change that deletes every state that hasn't been registered in
the open method, and maybe it accidentally deletes the timer service as
well; I need to check.
Thanks!
Gyula
On Tue, Jan 15, 2019
Hi Alexandru,
at the moment `/jobs/:jobid/rescaling` will always change the parallelism
for all operators. The maximum is the maximum parallelism which you have
defined for an operator.
I agree that it should also be possible to rescale an individual operator.
The internal functionality is alre
Can you share more about which use case you are trying to implement?
On Tue, Jan 15, 2019 at 2:02 PM wrote:
> Hi all,
>
>
>
> I was wondering if anybody has any recommendations on making HTTP
> requests from Flink to another service.
>
> In the long term we are looking for a solution that is both
Hi Pasquale,
if you configured a checkpoint directory, then the MemoryStateBackend will
also write the checkpoint data to disk in order to persist it.
Cheers,
Till
On Tue, Jan 15, 2019 at 1:08 PM Pasquale Vazzana wrote:
> I can send you some debug logs and the execution plan, can I use your
>
Hi,
I have a Flink program that needs to process many messages, and part of this
processing is to enrich the data through an external web service via HTTP
calls.
Example:
val myStream: DataStream[String]
myStream
.map(new MyProcessingFunction)
.map(new MyWebServiceHttpClient)
.print
Any s
Hello,
I am trying to install Flink on Kubernetes; it's almost working.
I am using the kube files from the Flink 1.7.1 docs.
My cluster starts well, and my 2 taskmanagers register
successfully with the jobmanager.
On the web UI, I see them:
akka.tcp://flink@dev-flink-taskmanager-3717639837-gvwh4
:37057/user/tas
Hi Jacopo,
Check this:
https://ci.apache.org/projects/flink/flink-docs-release-1.7/dev/stream/operators/asyncio.html
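A rough sketch of that pattern (my own illustration; `callWebService` stands
in for whatever HTTP client you end up using, and `myStream` is the stream
from your example):

import java.util.Collections;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import org.apache.flink.streaming.api.datastream.AsyncDataStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.async.ResultFuture;
import org.apache.flink.streaming.api.functions.async.RichAsyncFunction;

class AsyncHttpEnricher extends RichAsyncFunction<String, String> {
    @Override
    public void asyncInvoke(String input, ResultFuture<String> resultFuture) {
        // Run the blocking HTTP call on another thread and hand the result
        // back to Flink when it completes.
        CompletableFuture.supplyAsync(() -> callWebService(input))
            .thenAccept(r -> resultFuture.complete(Collections.singleton(r)));
    }

    private String callWebService(String input) {
        return input; // placeholder for the real HTTP client call
    }
}

// At most 100 requests in flight, 1 second timeout per request.
DataStream<String> enriched = AsyncDataStream.unorderedWait(
    myStream, new AsyncHttpEnricher(), 1000, TimeUnit.MILLISECONDS, 100);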
Best,
Alex
On Tue, 15 Jan 2019 at 13:57, wrote:
> Hi,
>
>
>
> I have a flink program which needs to process many messages and part of
> this processing is to process the data us
That's great news!
Are there any plans to expose it in the upcoming Flink release?
On Tue, 15 Jan 2019 at 12:59, Till Rohrmann wrote:
> Hi Alexandru,
>
> at the moment `/jobs/:jobid/rescaling` will always change the parallelism
> for all operators. The maximum is the maximum parallelism which yo
Never mind...
The problem was already discussed in the thread
"Flink 1.7 jobmanager tries to lookup taskmanager by its hostname in k8s
environment".
--
Bastien DINE
Data Architect / Software Engineer / Sysadmin
bastiendine.io
On Tue, Jan 15, 2019 at 15:16, bastien dine
wrote:
> Hello,
>
Hello Jamie,
Does #1 apply to batch jobs too?
Regards,
--
Bastien DINE
Data Architect / Software Engineer / Sysadmin
bastiendine.io
On Mon, Jan 14, 2019 at 20:39, Jamie Grier wrote:
> There are a lot of different ways to deploy Flink. It would be easier to
> answer your
Hi Chris,
there is no way to provide "exactly-once" and avoid duplicates without the
transactions available since Kafka 0.11.
The only way I can think of is building a custom deduplication step on the
consumer side,
e.g. using an in-memory cache with eviction or some other temporary storage
to keep the set of record IDs that have already been processed.
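Something like the following is what I mean; this is only a rough sketch,
and how you derive a stable record ID is application-specific:

import java.util.LinkedHashMap;
import java.util.Map;

// Bounded LRU "seen" set for consumer-side deduplication (sketch).
class Deduplicator {
    private static final int MAX_ENTRIES = 100_000; // tune to your duplicate window

    private final Map<String, Boolean> seen =
        new LinkedHashMap<String, Boolean>(1024, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                return size() > MAX_ENTRIES; // evict least-recently-seen IDs
            }
        };

    // Returns true the first time an ID is seen, false for duplicates.
    boolean isFirstOccurrence(String recordId) {
        return seen.put(recordId, Boolean.TRUE) == null;
    }
}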
Dawid , Gary,
Got it . Thanks
--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
Hi Henry,
I was not sure if this is the suggested way, but from what I understand of
the pom file in elasticsearch5, you are allowed to change the sub-version
of org.elasticsearch.client by manually overriding it with
-Delasticsearch.version=5.x.x
during the Maven build process if you are using a
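A hypothetical invocation of that override (the concrete 5.x.x version is
only an example; pick the one matching your cluster):

mvn clean package -Delasticsearch.version=5.6.10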
Hi Cristian,
Have you tried extending AbstractUdfStreamOperator and
overriding processWatermark?
This method should deliver the increasing watermark. Do you use processing
or event time for your records?
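A rough sketch of what I mean; the names and the simple pass-through map are
illustrative only, not a drop-in implementation:

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator;
import org.apache.flink.streaming.api.operators.OneInputStreamOperator;
import org.apache.flink.streaming.api.watermark.Watermark;
import org.apache.flink.streaming.runtime.streamrecord.StreamRecord;

class WatermarkObservingOperator<T>
        extends AbstractUdfStreamOperator<T, MapFunction<T, T>>
        implements OneInputStreamOperator<T, T> {

    WatermarkObservingOperator(MapFunction<T, T> userFunction) {
        super(userFunction);
    }

    @Override
    public void processElement(StreamRecord<T> element) throws Exception {
        output.collect(element.replace(userFunction.map(element.getValue())));
    }

    @Override
    public void processWatermark(Watermark mark) throws Exception {
        // Observe the increasing watermark here, then keep the default
        // behavior of forwarding it downstream.
        super.processWatermark(mark);
    }
}

You would then plug it into your pipeline via DataStream#transform.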
Best,
Andrey
On Mon, Jan 14, 2019 at 11:03 PM Cristian wrote:
> Hello.
>
> Flink emits watermark m
I'm not aware of anyone working on this feature right now.
On Tue, Jan 15, 2019 at 3:22 PM Alexandru Gutan
wrote:
> That's great news!
>
> Are there any plans to expose it in the upcoming Flink release?
>
> On Tue, 15 Jan 2019 at 12:59, Till Rohrmann wrote:
>
>> Hi Alexandru,
>>
>> at the momen
Ethan, it depends on what you mean by easy ;) It just depends a lot on
what infra tools you already have in place. On bare metal it's probably
safe to say there is no "easy" way. You need a lot of automation to make
it easy.
Bastien, IMO, #1 applies to batch jobs as well.
On Tue, Jan 15, 2019
Hi,
I'm setting up Flink 1.7.0 on a Kubernetes cluster and am seeing some
unexpected behavior when using the Prometheus Reporter.
With the following setup in flink-conf.yaml:
metrics.reporters: prometheus
metrics.reporter.prometheus.class:
org.apache.flink.metrics.prometheus.PrometheusReporter
It makes sense. Thank you very much, Jamie!
> On Jan 15, 2019, at 12:48 PM, Jamie Grier wrote:
>
> Ethan, it depends on what you mean by easy ;) It just depends a lot on what
> infra tools you already have in place. On bare metal it's probably safe to
> say there is no "easy" way. You ne
Hi, Sohi
You can check out the docs [1][2] to find the answer.
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.7/dev/stream/state/checkpointing.html#enabling-and-configuring-checkpointing
[2]
https://ci.apache.org/projects/flink/flink-docs-release-1.7/dev/restart_strategies.html
sohim
Hi Hequn
Yes, the TableFactoryService has a proper method. As I
use StreamTableDescriptor to connect to Kafka, StreamTableDescriptor
actually uses ConnectTableDescriptor, which calls TableFactoryUtil to do the
service loading, and TableFactoryUtil does not use a user-defined
classloader, so I can not use
Hi Hequn
Thanks. Now I know what you mean: use tableEnv.registerTableSource
instead of StreamTableDescriptor.registerTableSource. Yes, it is a
good solution.
It would be even better if StreamTableDescriptor itself could use a
user-defined classloader.
Thank you.
Yours sincerely
Joshua
On Wed, J
Hi,
Can someone please help with this issue? We have even tried to set
fs.s3a.impl in core-site.xml, but it's still not working.
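For reference, this is the kind of entry we added (sketched from memory; the
value assumes the standard Hadoop S3A filesystem class):

<property>
  <name>fs.s3a.impl</name>
  <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
</property>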
Regards,
Vinay Patil
On Fri, Jan 11, 2019 at 5:03 PM Taher Koitawala [via Apache Flink User
Mailing List archive.] wrote:
> Hi All,
> We have implemented S3 sink