I had a similar problem. I ended up solving it by not relying on checkpoints
for recovery; instead I re-read my input sources (in my case a Kafka
topic) from the earliest offset and rebuild only the state I need. I
only need to care about the past 1 to 2 days of state, so I can afford to
drop anything older.
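Roughly what that looks like (a sketch assuming the universal Kafka
connector; the broker address and topic name below are placeholders):

import java.util.Properties;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

public class RebuildFromEarliest {
    public static FlinkKafkaConsumer<String> consumer() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "kafka:9092"); // placeholder broker

        FlinkKafkaConsumer<String> consumer = new FlinkKafkaConsumer<>(
                "events", new SimpleStringSchema(), props); // placeholder topic
        // Ignore committed offsets on restart and re-read from the earliest
        // retained offset, rebuilding state rather than restoring a checkpoint.
        consumer.setStartFromEarliest();
        return consumer;
    }
}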
Can the RocksDB state backend used by Flink be queried from outside, e.g.
via SQL?
Or maybe a better question: is there an existing RocksDB SinkFunction?
Thanks
Tim
What are the pros and cons of Kafka offset keeping vs Flink offset
keeping? Is one more reliable than the other? Personally I prefer having
Flink manage it, since it is intrinsically tied to Flink's checkpointing
mechanism. But I'm interested to learn from others' experiences.
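For context, this is the setup I mean by Flink-managed (a sketch; the
checkpoint interval is arbitrary):

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

public class OffsetSetup {
    public static void configure(StreamExecutionEnvironment env,
                                 FlinkKafkaConsumer<String> consumer) {
        // Offsets become part of each checkpoint and are what Flink
        // restores from after a failure.
        env.enableCheckpointing(60_000);
        // Commit to Kafka on completed checkpoints, purely so external
        // monitoring tools can see consumer progress; recovery ignores it.
        consumer.setCommitOffsetsOnCheckpoints(true);
    }
}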
Thanks
Tim
For streaming jobs that use RocksDB, my understanding is that state is
allocated off-heap via RocksDB.
If this is true, then does it still make sense to leave 70% (the default
taskmanager.memory.fraction) of the heap for Flink Managed Memory, given
that it is likely not being used for state? Or am I mistaken?
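If that reasoning holds, lowering the legacy fraction would be a one-line
change in flink-conf.yaml (the 0.3 below is purely illustrative, not a
recommendation):

# flink-conf.yaml
# Reclaim heap that managed memory would otherwise reserve, since state
# lives off-heap in RocksDB.
taskmanager.memory.fraction: 0.3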
Why not implement your own SinkFunction, or maybe inherit from the one you
are using now?
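Something along these lines, say (a bare skeleton; MyClient is a
hypothetical stand-in for whatever connection your current sink wraps):

import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

public class MySink extends RichSinkFunction<String> {
    private transient MyClient client; // hypothetical client type

    @Override
    public void open(Configuration parameters) {
        client = new MyClient(); // connect once per parallel subtask
    }

    @Override
    public void invoke(String value, Context context) {
        client.send(value); // called for every record
    }

    @Override
    public void close() {
        if (client != null) {
            client.close();
        }
    }
}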
Tim
On Sat, Dec 14, 2019, 8:52 AM Theo Diefenthal <
theo.diefent...@scoop-software.de> wrote:
> Hi there,
>
> In my pipeline, I write data into a partitioned parquet directory via
> StreamingFileSink and a
> https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/state.html
>
> Cheers,
> Till
>
> On Thu, Nov 7, 2019 at 2:11 PM Timothy Victor wrote:
>
>> I have a FilterFunction implementation which accepts an argument in its
>> constructor which it stores as an instance member.
I have a FilterFunction implementation which accepts an argument in its
constructor which it stores as an instance member. For example:

class ThresholdFilter implements FilterFunction<MyEvent> { // element type name assumed
    private final MyThreshold threshold;
    private int numElementsSeen;

    public ThresholdFilter(MyThreshold threshold) {
        this.threshold = threshold;
    }
}
On Mon, Oct 21, 2019, 10:37 AM Pritam Sadhukhan wrote:
> Can you please share your dockerfile?
> Please upload your jar at /opt/flink/product-libs/flink-web-upload/.
>
> Regards,
> Pritam.
>
> On Mon, 21 Oct 2019 at 19:58, Timothy Victor wrote:
>
>> Thanks Pritam.
>>
ost:8081/jars/28f05eb0-9aab-4a18-ae66-f1e10970c11f_soar-ueba-training-service.jar/run>
>
> with the request parameters if any.
>
>
> Please let me know if this helps.
>
>
> Regards,
>
> Pritam.
>
> On Sat, 19 Oct 2019 at 20:06, Timothy Victor wrote:
>
>> I
I have a Flink docker image with my job's JAR already contained within. I
would like to run a job with this jar via the REST API. Is that possible?
I know I can run a job via REST using the jar ID (the ID assigned by Flink
when a jar is uploaded). However I don't have such an ID, since this jar is
already baked into the image.
https://bugs.openjdk.java.net/browse/JDK-8146436
>
> Roman Grebennikov | g...@dfdx.me
>
>
> On Sat, Oct 12, 2019, at 08:38, Timothy Victor wrote:
>
> This part about the GC not cleaning up after the job finishes makes
> sense. However, I observed that even after I run
> It is the task manager that keeps the buffer to be used for the next
> batch job. When the new batch job is running, the task executor allocates
> new buffers, which will reuse the memory of the previous buffers that the
> JVM hasn't released.
>
> Thank you~
>
> Xintong Song
>
>
>
> Sometimes Flink/the JVM does not release memory after
> jobs/tasks finish, so that it can be reused directly by other jobs/tasks,
> reducing allocation/deallocation overhead and improving
> performance.
>
>
> Thank you~
>
> Xintong Song
>
>
>
> On Thu
After a batch job finishes in a Flink standalone cluster, I notice that the
memory isn't freed up. I understand Flink uses its own memory manager
and just allocates a large tenured byte array that is not GC'ed. But does
the memory used in this byte array get released when the batch job is done?
We see a very similar (if not the same) error running version 1.9 on
Kubernetes. So far what we have discovered is that a taskmanager gets
killed and a new one is created, but the JM still thinks it needs to connect
to the old (now dead) TM. I was even able to see a taskmanager on the
same host
The Flink job manager UI isn't meant to be accessed from outside a firewall,
I think. Plus I don't think it was designed with security in mind, and
honestly it doesn't need to be, in my opinion.
If you need security, then address your network setup. And if it is still
a problem, then just turn off the UI.
Hi
I'm looking to write a sink function for writing to websockets, in
particular ones that speak the WAMP protocol (
https://wamp-proto.org/index.html).
Before going down that path, I wanted to ask if:
a) anyone has done something like that already, so I don't reinvent stuff
b) there are any caveats or warnings to be aware of
> we need to deploy and
> test the whole project code. And we want to work in a more modular way.
> As you said, I also think the size of the jar is less important than what
> I wrote above.
>
> Thanks.
>
> *From:* Timothy Victor
> *Sent:* Wednesday, July 3, 2019 2:31 PM
>
I think any jars in $FLINK_HOME/lib get picked up in the classpath.
As for dynamic update, wouldn't you want to run it through your tests/CI
before deploying?
FWIW we also use a fat jar. It's large, but since we just build a
docker container anyway it doesn't really matter.
Tim
On Sat, Jun 29, 2019, 9:04 AM Vishal Santoshi wrote:
>
>> I have not tried on bare metal. We have no option but k8s.
>>
>> And this is a job cluster.
>>
>> On Sat, Jun 29, 2019 at 9:01 AM Timothy Victor wrote:
>>
>>> Hi Vishal, can this be reproduced on a bare metal instance as well?
Hi Vishal, can this be reproduced on a bare metal instance as well? Also
is this a job or a session cluster?
Thanks
Tim
On Sat, Jun 29, 2019, 7:50 AM Vishal Santoshi
wrote:
> OK this happened again and it is bizarre (and is definitely not what I
> think should happen).
>
>
>
>
> The job fai
I would choose encapsulation if the fields are indeed related and it makes
sense for your model. In general, I feel it is not a good thing to let
Flink's (or any other framework's) internal mechanics dictate your data model.
Tim
On Mon, Jun 17, 2019, 4:59 AM Frank Wilson wrote:
> Hi,
>
> Is it be
Thanks for the insight. I was also interested in this topic.
One thought that occurred to me: what about the queuing delay when sending
to your message bus (e.g. Kafka)? I am guessing the probe happens before the
message is added to the send queue?
Thanks again
Tim
On Thu, Jun 13, 2019, 6:08 AM
It's hard to tell without more info.
From the method that threw the exception, it looks like it was trying to
deserialize the accumulator. By any chance did you change your
accumulator class but forget to update the serialVersionUID? Just
wondering if it is trying to deserialize to a different version of the class.
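To illustrate what I mean (the accumulator class and its field are made up):

import java.io.Serializable;

public class MyAccumulator implements Serializable {
    // Keep this stable across compatible changes to the class. If it drifts,
    // or is left to the compiler's default (which changes with the class
    // shape), restoring old serialized state fails with errors like this one.
    private static final long serialVersionUID = 1L;

    public long count;
}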
Not an expert, but I would think this will not be trivial, since the reason
for rolling on checkpoints is to guarantee exactly-once semantics
in the event of a failure, which in turn is tightly integrated with the
checkpointing mechanism. The precursor to the StreamingFileSink was the
BucketingSink, which
You must have checkpointing enabled to use the StreamingFileSink. The
feature relies on checkpointing to achieve exactly-once semantics.
>> This is integrated with the checkpointing mechanism to provide exactly
once semantics.
https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/f
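A minimal sketch of what that looks like (the path, checkpoint interval,
and socket source are placeholders):

import org.apache.flink.api.common.serialization.SimpleStringEncoder;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;

public class FileSinkJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        // Required: in-progress part files are only finalized on checkpoints.
        env.enableCheckpointing(60_000);

        StreamingFileSink<String> sink = StreamingFileSink
                .forRowFormat(new Path("/tmp/out"),
                        new SimpleStringEncoder<String>("UTF-8"))
                .build();

        env.socketTextStream("localhost", 9999) // placeholder source
                .addSink(sink);

        env.execute("streaming-file-sink-demo");
    }
}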
> Could you share the error stack trace?
>
> Thanks & Regards,
> Sushant Sawant
>
>
> On Fri, 24 May 2019, 19:18 Timothy Victor, wrote:
>
>> If a flink job crashes during startup (throws exception) the entire
>> cluster goes down. This is even on a simple bare metal host.
If a Flink job crashes during startup (throws an exception), the entire
cluster goes down. This happens even on a simple bare metal host.
I have tried catching the exception, but even that didn't prevent the JM and
cluster from crashing.
Has anyone run into this problem?
I'm on Flink 1.7.1
Thanks
Tim
This is probably a very subjective question, but nevertheless here are my
reasons for choosing Flink over KStreams or even Spark.
a) KStreams couples you tightly to Kafka, and I personally don't want my
stream processing engine to be married to my message bus. There are other
(even better) alternatives
One approach I use is to write the git commit sha to the jar's manifest
while compiling it (I don't use semantic versioning but rather the commit
sha).
Then at runtime I read the implementation version
(class.getPackage().getImplementationVersion()) and print that in the job
name.
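Concretely, something like this (class and job names are placeholders; the
pipeline itself is elided):

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class VersionedJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        // ... define sources, transformations, and sinks here ...

        // Implementation-Version is read from the jar manifest written at
        // build time; it comes back null when running from the IDE.
        String sha = VersionedJob.class.getPackage().getImplementationVersion();
        env.execute("my-pipeline-" + (sha != null ? sha : "dev"));
    }
}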
Tim
I face the same issue in Flink 1.7.1.
Would be good to know a solution.
Tim
On Mon, Apr 8, 2019, 12:45 PM Jins George wrote:
> Hi,
>
>
>
> I am facing a weird problem in which jobs from ‘Completed Jobs’ section in
> Flink 1.7.2 UI disappear. Looking at the job manager logs, I see the job
> wa
>
>> kinesisStream.keyBy(new KeySelector() {...}, info);
>> //specify typeInfo through
>>
>
> TIA,
> Vijay
>
> On Tue, Apr 2, 2019 at 6:06 PM Timothy Victor wrote:
>
>> Flink needs type information for serializing and deserializing objects
Flink needs type information for serializing and deserializing objects, and
that is lost due to Java type erasure. The only way to work around this is
to specify the return type of the function called in the lambda.
Fabian's answer here explains it well.
https://stackoverflow.com/questions/50945
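For example, roughly (assuming a DataStream<String> named words; the tuple
types are illustrative):

import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;

public class TypeHintExample {
    public static DataStream<Tuple2<String, Integer>> withTypes(
            DataStream<String> words) {
        // The lambda's Tuple2 type parameters are erased at compile time,
        // so the result type must be declared explicitly via returns(...).
        SingleOutputStreamOperator<Tuple2<String, Integer>> counts = words
                .map(w -> Tuple2.of(w, 1))
                .returns(Types.TUPLE(Types.STRING, Types.INT));
        return counts;
    }
}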
Yesterday I came across a weird problem when attempting to run 2 nearly
identical jobs on a cluster. I was able to solve it (or rather work around
it), but am sharing here so we can consider a potential fix in Flink's
KafkaProducer code.
My scenario is as follows. I have a Flink program that read
This is more a Java question than a Flink one per se. But I believe you need
to specify the rounding mode because it is calling longValueExact. If it
just called longValue it would have worked without throwing an exception,
but you would risk overflowing 64 bits and getting a totally erroneous
answer.
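A small illustration of the difference (the value is made up):

import java.math.BigDecimal;
import java.math.RoundingMode;

public class RoundingDemo {
    public static void main(String[] args) {
        BigDecimal d = new BigDecimal("3.7");

        // Silently drops the fraction, and wraps on overflow: 3
        long truncated = d.longValue();

        // Round explicitly first, then convert; longValueExact throws
        // ArithmeticException instead of silently overflowing: 4
        long exact = d.setScale(0, RoundingMode.HALF_UP).longValueExact();

        // Calling d.longValueExact() directly would throw, because the
        // value still has a nonzero fractional part.
        System.out.println(truncated + " vs " + exact);
    }
}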
>> checkpoint.
>>
>>
>> Am I better off using BucketingSink? When to use BucketingSink and when
>> to use RollingSink is not clear at all, even though on the surface it sure
>> looks like RollingSink is a better version of BucketingSink (or not).
>>
>> Re
I think the only rolling policy that can be used is CheckpointRollingPolicy,
to ensure exactly-once semantics.
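For a row format that looks roughly like this (path and encoder are
placeholders; OnCheckpointRollingPolicy is the concrete class, and bulk
formats such as Parquet enforce checkpoint rolling on their own):

import org.apache.flink.api.common.serialization.SimpleStringEncoder;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;
import org.apache.flink.streaming.api.functions.sink.filesystem.rollingpolicies.OnCheckpointRollingPolicy;

public class CheckpointRolledSink {
    public static StreamingFileSink<String> sink() {
        return StreamingFileSink
                .forRowFormat(new Path("/tmp/out"),
                        new SimpleStringEncoder<String>("UTF-8"))
                // Roll part files only on checkpoints, which is what gives
                // the exactly-once guarantee on restore.
                .withRollingPolicy(OnCheckpointRollingPolicy.build())
                .build();
    }
}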
Tim
On Sun, Feb 10, 2019, 9:13 AM Vishal Santoshi wrote:
> Can StreamingFileSink be used instead of
> https://ci.apache.org/projects/flink/flink-docs-release-1.7/dev/connectors/filesystem_sink.html,
> ev
Hi -
Has there been any update on the below issue? I am also facing the same
problem.
http://mail-archives.apache.org/mod_mbox/flink-user/201812.mbox/%3ccac2r2948lqsyu8nab5p7ydnhhmuox5i4jmyis9g7og6ic-1...@mail.gmail.com%3E
There is a similar issue (
https://stackoverflow.com/questions/50806228