Hi Folks,
I was wondering if it's possible to keep partial outputs from DataSet
programs.
I have a batch pipeline that writes its output to HDFS using
writeAsFormattedText. When it fails, the output file is deleted, but I would
like to keep it so that I can generate new inputs for the pipeline to av
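For context, a minimal sketch of the kind of job in question; the path, data, and formatter below are placeholders, not the original pipeline:

```java
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.io.TextOutputFormat.TextFormatter;

public class FormattedTextSink {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Placeholder data; the real pipeline's DataSet goes here.
        DataSet<String> results = env.fromElements("a", "b", "c");

        // When the job fails, Flink removes this output as part of cleanup,
        // which is the behavior being asked about.
        results.writeAsFormattedText(
            "hdfs:///tmp/pipeline-output",
            new TextFormatter<String>() {
                @Override
                public String format(String value) {
                    return "record: " + value;
                }
            });

        env.execute("formatted-text-sink");
    }
}
```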
Thanks. It's worth mentioning in the release notes of 1.1.2 that the file
source is broken. We spent a substantial amount of time trying to figure out
the root cause.
On Sep 27, 2016 9:40 PM, "Stephan Ewen" wrote:
> Sorry for the inconvenience. This is a known issue and is being fixed for
> Flink 1.1.3 - t
Thanks Stephan for your inputs
We are getting the checkpointing issue for other projects as well in which
the window and encryption stuff is not there (using Flink 1.1.1).
As you suggested, I will try using RocksDB and run the pipeline on EMR to
provide more details.
Regards,
Vinay Patil
On Tue
@vinay - Flink needs to store all pending windows in the checkpoint, i.e.,
windows that have elements but have not yet fired/been purged.
I guess client side encryption can add to the delay.
If you use RocksDB asynchronous snapshots (1.1.x) then this delay should be
hidden.
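A minimal sketch of wiring that up on 1.1.x; the checkpoint URI and interval below are placeholders:

```java
import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class RocksDbAsyncSnapshots {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Placeholder checkpoint directory; point this at HDFS or S3.
        RocksDBStateBackend backend =
            new RocksDBStateBackend("hdfs:///flink/checkpoints");
        // Snapshot RocksDB asynchronously so that checkpointing does not
        // stall record processing while the window state is written out.
        backend.enableFullyAsyncSnapshots();

        env.setStateBackend(backend);
        env.enableCheckpointing(60_000); // checkpoint every 60 seconds

        // ... build the windowed pipeline here; trivial stand-in below ...
        env.fromElements(1, 2, 3).print();
        env.execute("rocksdb-async-snapshots");
    }
}
```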
Greetings,
Stephan
On Tue
@CVP
In your case, Flink stores in checkpoints only the Kafka offsets (a few
bytes) and the custom state (e).
Here is an illustration of the checkpoint and what is stored (from the
Flink docs).
https://ci.apache.org/projects/flink/flink-docs-master/internals/stream_checkpointing.html
I am quite pu
Sorry for the inconvenience. This is a known issue and is being fixed for
Flink 1.1.3 - the problem is that the streaming file sources were reworked
to continuously monitor the file system, but the watermarks are not handled
correctly.
https://issues.apache.org/jira/browse/FLINK-4329
So far, 2/3 par
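For anyone affected, a minimal sketch of the kind of streaming file source that hits this; the path is a placeholder:

```java
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class FileSourceSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // In 1.1.x this goes through the reworked continuous-monitoring
        // file source, whose watermark handling is what FLINK-4329 fixes.
        DataStream<String> lines = env.readTextFile("hdfs:///tmp/input");

        lines.print();
        env.execute("file-source-sketch");
    }
}
```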
Hi Chen,
Please upload your Flink Scala library dependencies.
Regards
Sunny.
On Tue, Sep 27, 2016 at 5:56 PM, Chen Bekor wrote:
> Hi,
>
> Since upgrading to Flink 1.1.2 from 1.0.2, I'm experiencing a regression in
> my code. In order to isolate the issue, I have written a small Flink job
> that
Hi,
Since upgrading to Flink 1.1.2 from 1.0.2, I'm experiencing a regression in
my code. In order to isolate the issue, I have written a small Flink job
that demonstrates it.
The job does some time-based window operations on an input CSV file (in
the example below - count the number of events o
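A sketch of that general shape of job - counting events per key in one-minute windows - using processing-time windows and placeholder field names and paths for brevity (not the original test job):

```java
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.time.Time;

public class WindowCountSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Placeholder CSV of "key,payload" lines; not the original test data.
        DataStream<Tuple2<String, Long>> events = env
            .readTextFile("file:///tmp/events.csv")
            .map(new MapFunction<String, Tuple2<String, Long>>() {
                @Override
                public Tuple2<String, Long> map(String line) {
                    String[] fields = line.split(",");
                    return new Tuple2<>(fields[0], 1L); // one count per event
                }
            });

        events
            .keyBy(0)                     // group by the key column
            .timeWindow(Time.minutes(1))  // one-minute windows
            .sum(1)                       // count events per key and window
            .print();

        env.execute("window-count-sketch");
    }
}
```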
Hi Stephan,
Ok, I think that may be taking a lot of time. So when you say it stores
everything, does that mean that all the input to the window is stored in
the state backend?
For example: for my apply function, the input is an Iterable, the DTO can
contain multiple elements, and the DTO contains roughly
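For reference, a sketch of such an apply function with a hypothetical DTO type; the point is that every element in the Iterable was held in the window state until the window fired:

```java
import org.apache.flink.api.java.tuple.Tuple;
import org.apache.flink.streaming.api.functions.windowing.WindowFunction;
import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
import org.apache.flink.util.Collector;

// Hypothetical stand-in for the DTO mentioned above.
class DTO { /* fields elided */ }

public class CountingWindowFunction
        implements WindowFunction<DTO, Long, Tuple, TimeWindow> {

    @Override
    public void apply(Tuple key, TimeWindow window,
                      Iterable<DTO> input, Collector<Long> out) {
        // Every element reachable through `input` was buffered in the state
        // backend until the window fired - that is the "everything" above.
        long count = 0;
        for (DTO ignored : input) {
            count++;
        }
        out.collect(count);
    }
}
```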
Hi Seth,
the 1.2-SNAPSHOT is very fragile at the moment because of multiple big
changes for dynamic scaling.
Maybe Stefan (in CC) has an idea what is happening with the keyed state
backend here?
Timo
On 27/09/16 at 16:14, swiesman wrote:
Hi all,
I am working on an analytics project and a
Thank you Ufuk!
Once the execution graph has been built, is there a way to 'view' the graph to
understand operator chaining & slot placement specific to my scenario?
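One way to inspect this, sketched below with a placeholder pipeline: dump the execution plan as JSON via getExecutionPlan() and render it in the Flink plan visualizer:

```java
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class PlanDump {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Placeholder pipeline; substitute the real job here.
        env.fromElements("a", "b")
            .map(new MapFunction<String, String>() {
                @Override
                public String map(String value) {
                    return value.toUpperCase();
                }
            })
            .print();

        // Prints the job graph as JSON, including which operators were
        // chained together; it can be rendered in the Flink plan visualizer.
        System.out.println(env.getExecutionPlan());
    }
}
```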
-----Original Message-----
From: Ufuk Celebi [mailto:u...@apache.org]
Sent: Tuesday, September 27, 2016 8:52 AM
To: user@flink.apa
Hi all,
I am working on an analytics project and am developing against Flink
1.2-SNAPSHOT. The pipeline that I have built works; i.e., I can ingest data,
perform operations, and output the expected result. I can also see
checkpoints being written to RocksDB using Amazon S3 as the state backend.
When
Hello,
I'm dealing with an analytical job in streaming and I don't know how to
write the last part.
Actually, I want to count all the elements in a window that have a given
status, so I keep a state with a Map[Status, Long]. This state is updated
from tuples containing the oldStatus and the new
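A sketch of one way to keep such a map in keyed state; statuses are modeled as Strings here, and all names, types, and the ValueState/TypeHint usage are assumptions rather than the poster's code:

```java
import java.util.HashMap;

import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.typeinfo.TypeHint;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.util.Collector;

// Input tuples are (oldStatus, newStatus); use this after a keyBy(...).
public class StatusCounts
        extends RichFlatMapFunction<Tuple2<String, String>, HashMap<String, Long>> {

    private transient ValueState<HashMap<String, Long>> counts;

    @Override
    public void open(Configuration parameters) {
        counts = getRuntimeContext().getState(new ValueStateDescriptor<>(
            "status-counts",
            TypeInformation.of(new TypeHint<HashMap<String, Long>>() {}),
            null));
    }

    @Override
    public void flatMap(Tuple2<String, String> transition,
                        Collector<HashMap<String, Long>> out) throws Exception {
        HashMap<String, Long> current = counts.value();
        if (current == null) {
            current = new HashMap<>();
        }
        // Move one element from the old status bucket to the new one.
        if (transition.f0 != null) {
            current.put(transition.f0, current.getOrDefault(transition.f0, 1L) - 1);
        }
        current.put(transition.f1, current.getOrDefault(transition.f1, 0L) + 1);
        counts.update(current);
        out.collect(current);
    }
}
```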
I opened a PR on Kryo (https://github.com/EsotericSoftware/kryo/pull/463)
that fixes 2.24.0. We should push for a 2.24.1 release. In the meantime, if
you want to try the fix, it's in this branch:
https://github.com/marianoguerra/kryo/tree/fix-462-for-2.24.0
I made a build of Flink 1.1.2 with this fix
Hi Max,
that's exactly what I was looking for. What do you mean by 'the best thing
is if you keep a local copy of your sampling jars and work directly with
them'?
Best,
Flavio
On Tue, Sep 27, 2016 at 2:35 PM, Maximilian Michels wrote:
> Hi Flavio,
>
> This is not really possible at the moment.
On Mon, Sep 26, 2016 at 9:46 PM, Ramanan, Buvana (Nokia - US)
wrote:
> When the operators (including source / sink) are chained, what is the method
> of communication between them?
In general, when operators are chained, they execute in the same
"physical" task and records are passed directly to t
Valid argument. I will add a comment to the issue.
On 27/09/16 at 14:25, Stephan Ewen wrote:
I would not bump the Kryo version easily - the serialization format
changed (that's why they have a new major version), which would render
all Flink savepoints and checkpoints incompatible.
On Tue,
Hi Flavio,
This is not really possible at the moment. Though there is a workaround.
You can create a dummy jar file (may be empty). Then you can use
./flink run -C hdfs:///path/to/cluster.jar -c org.package.SampleClass
/path/to/dummy.jar
That way Flink will include your cluster jar and you can l
I would not bump the Kryo version easily - the serialization format changed
(that's why they have a new major version), which would render all Flink
savepoints and checkpoints incompatible.
On Tue, Sep 27, 2016 at 1:19 PM, Timo Walther wrote:
> Hi Luis,
>
> there is already an issue for bumping
Hi Stefan,
Thanks a million for your detailed explanation. I appreciate it.
- The *ZooKeeper bundled with Kafka 0.9.0.1* was used to start
ZooKeeper. There is only 1 instance (standalone) of ZooKeeper running on my
localhost (Ubuntu 14.04)
- There is only 1 Kafka broker (*version
Gordon,
Thank you for your quick response!
I am looking forward to that feature. I will periodically check that JIRA.
I am also interested in Robert's implementation because my current use case
is system monitoring, and scalability has a higher priority than
correctness.
Regards,
Hironori
2016
Hi Luis,
there is already an issue for bumping up the kryo version
(https://issues.apache.org/jira/browse/FLINK-3154). You could open a PR
if you like.
Timo
On 27/09/16 at 13:10, Luis Mariano Guerra wrote:
On Tue, Sep 27, 2016 at 11:22 AM, Luis Mariano Guerra
mailto:mari...@event-fabric.co
On Tue, Sep 27, 2016 at 11:22 AM, Luis Mariano Guerra <
mari...@event-fabric.com> wrote:
> Hi, I created an issue at Kryo with a simple project that reproduces the
> problem, but I'm still asking here in case anyone knows a solution:
>
> https://github.com/EsotericSoftware/kryo/issues/462
>
I found a
Hi!
This is definitely a planned feature for the Kafka connectors, there’s a JIRA
exactly for this [1].
We’re currently going through some blocking tasks to make this happen; I
also hope to speed things up over there :)
Your observation is correct that the Kafka consumer uses “assign()” instead
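Until the discovery feature lands, the topic/partition assignment is fixed when the consumer is created. A minimal sketch of the current usage; the broker address, group id, and topic name are placeholders:

```java
import java.util.Properties;

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09;
import org.apache.flink.streaming.util.serialization.SimpleStringSchema;

public class KafkaSourceSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder
        props.setProperty("group.id", "example-group");           // placeholder

        // The consumer fixes its topic/partition assignment at job start
        // (assign(), not subscribe()), so partitions added later - and
        // topics matching a pattern - are not picked up yet.
        DataStream<String> stream = env.addSource(
            new FlinkKafkaConsumer09<>("accesslog", new SimpleStringSchema(), props));

        stream.print();
        env.execute("kafka-source-sketch");
    }
}
```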
Hi Max,
actually I have a jar containing sampling jobs and I need to collect
results from a client.
I've tried to use ExecutionEnvironment.createRemoteEnvironment but I fear
that it's not the right way to do that because
I just need to tell the cluster the main class and the parameters to run
the j
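For reference, a minimal sketch of what createRemoteEnvironment looks like; the host, port, jar path, and job body are placeholders:

```java
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;

public class RemoteSubmit {
    public static void main(String[] args) throws Exception {
        // Host, port, and jar path are placeholders. The jar must contain
        // the user code (here, the sampling jobs) so Flink can ship it to
        // the cluster with each execution.
        ExecutionEnvironment env = ExecutionEnvironment.createRemoteEnvironment(
            "jobmanager-host", 6123, "/path/to/sampling-jobs.jar");

        DataSet<Integer> sample = env.fromElements(1, 2, 3)
            .map(new MapFunction<Integer, Integer>() {
                @Override
                public Integer map(Integer value) {
                    return value * 2; // stand-in for a sampling transformation
                }
            });

        // print() executes the job on the cluster and brings the result
        // back to the client.
        sample.print();
    }
}
```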
Hi, I created an issue at Kryo with a simple project that reproduces the
problem, but I'm still asking here in case anyone knows a solution:
https://github.com/EsotericSoftware/kryo/issues/462
Basically, I get an error trying to serialize a simple tree-like structure;
the error happens with a 2 node,
Hello,
I want FlinkKafkaConsumer to follow Kafka topic/partition changes.
This means:
- When we add partitions to a topic, we want FlinkKafkaConsumer to
start reading added partitions.
- We want to specify topics by pattern (e.g. accesslog.*), and want
FlinkKafkaConsumer to start reading
Hi Flavio,
Do you want to sample from a running batch job? That would be like
Queryable State in streaming jobs but it is not supported in batch
mode.
Cheers,
Max
On Mon, Sep 26, 2016 at 6:13 PM, Flavio Pompermaier
wrote:
> Hi to all,
>
> I have a use case where I need to tell a Flink cluster
@vinay - Window operators store everything in the state backend.
On Mon, Sep 26, 2016 at 7:34 PM, vinay patil
wrote:
> I am not sure about that, I will run the pipeline on cluster and share the
> details
> Since window is a stateful operator, it will store only the key part in
> the state backe