Thanks guys for your answers, that is exactly the information I was looking
for.
Krzysztof
2015-12-01 19:22 GMT+01:00 Robert Metzger:
> Hi Flavio,
>
> 1. You don't have to register serializers if it's working for you. I would
> add a custom serializer if it's not working or if the performance is poor
Thanks, I'm basing what I'm doing on what I see there. One
thing that's not clear to me in that example is why supervisor is used to
keep the container alive, rather than using some simpler means. It doesn't
look like it's been configured to supervise anything.
On Wed, Dec 2, 2015 at 1
I forgot you're using Flink 0.10.1. The above was for the master.
So here's the commit for Flink 0.10.1:
https://github.com/mxm/flink/commit/a41f3866f4097586a7b2262093088861b62930cd
git fetch https://github.com/mxm/flink/ \
a41f3866f4097586a7b2262093088861b62930cd && git checkout FETCH_HEAD
http
Have you looked at
https://github.com/apache/flink/tree/master/flink-contrib/docker-flink
? This demonstrates how to use Flink with Docker. In particular it
states: "Images [..] run Supervisor to stay alive when running
containers."
Have a look at flink/config-flink.sh.
Cheers,
Max
On Wed, Dec 2
Sent from my iPhone
> On Dec 3, 2015, at 1:41 AM, Maximilian Michels wrote:
>
> Great. Here is the commit to try out:
> https://github.com/mxm/flink/commit/f49b9635bec703541f19cb8c615f302a07ea88b3
>
> If you already have the Flink repository, check it out using
>
> git fetch https://github.com/mxm/flink/
> f49
Great. Here is the commit to try out:
https://github.com/mxm/flink/commit/f49b9635bec703541f19cb8c615f302a07ea88b3
If you already have the Flink repository, check it out using
git fetch https://github.com/mxm/flink/ \
  f49b9635bec703541f19cb8c615f302a07ea88b3 && git checkout FETCH_HEAD
Alternativel
A bit of extra information on the example where I posted the link:
The example checks whether two events follow each other within a certain
time:
- The first event in the example is called "compute.instance.create.start"
(in your case, it would be the event that an order was placed)
- The seco
Yep, I think this makes sense. I'm currently patching the flink-daemon.sh
script to remove the `&`, but I don't think it's a very robust solution,
particularly when this script changes across versions of Flink. I'm very
new to Docker, but the resources I've found indicate that the process must
run
Hi Mihail!
Do I understand you correctly that the use case is to raise an alarm if an
order has not been processed within a certain time period (a certain
number of days)?
If that is the case, the use case is actually perfect for a special form of
session windows that monitor such timeouts. I have
Hi Mihail,
not sure if I correctly got your requirements, but you can define windows
on a keyed stream. This basically means that you partition the stream, for
example by order-id, and compute windows over the keyed stream. This will
create one (or more, depending on the window type) window for ea
Hi Gyula, Hi Stephan,
thank you for your replies.
We need a state which grows indefinitely for the following use case. An
event is created when a customer places an order. Another event is created
when the order is sent. These events typically occur within days. We need
to catch the cases when th
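The matching logic behind this use case (raise an alarm when the follow-up
event does not arrive within the timeout) can be sketched independently of
Flink's window API. This is a minimal, hypothetical Python model of the idea,
not Flink code; the event names are made up:

```python
from datetime import datetime, timedelta

def find_overdue_orders(events, timeout):
    """Given (event_type, order_id, timestamp) tuples, return the ids of
    orders that were placed but not sent within `timeout`.

    For this sketch, "now" is taken as the latest timestamp seen.
    """
    placed = {}  # order_id -> time the order was placed
    now = max(ts for _, _, ts in events)
    for event_type, order_id, ts in sorted(events, key=lambda e: e[2]):
        if event_type == "order.placed":
            placed[order_id] = ts
        elif event_type == "order.sent":
            placed.pop(order_id, None)  # follow-up arrived in time: no alarm
    return {oid for oid, ts in placed.items() if now - ts > timeout}

t0 = datetime(2015, 12, 1)
events = [
    ("order.placed", "a", t0),
    ("order.sent",   "a", t0 + timedelta(days=1)),  # sent in time
    ("order.placed", "b", t0),                      # never sent -> alarm
    ("order.placed", "c", t0 + timedelta(days=9)),  # still within timeout
]
overdue = find_overdue_orders(events, timeout=timedelta(days=3))
```

A streaming implementation would of course evaluate this incrementally per
key rather than over a collected list, which is what the keyed windows
discussed above provide.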
Hi Arnaud!
One thing you can do is to periodically retrieve them by querying the
monitoring API:
https://ci.apache.org/projects/flink/flink-docs-release-0.10/internals/monitoring_rest_api.html
A nice approach would be to let the JobManager eagerly publish the
metrics. I think that Christian Krei
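Polling the monitoring REST API amounts to fetching JSON and picking out the
fields you care about. The sketch below parses a hypothetical excerpt of a
response; the exact endpoint and field names may differ by Flink version, so
treat the shape as an assumption and check the linked documentation:

```python
import json

# Hypothetical excerpt of a JSON response from the monitoring REST API.
response_body = """
{
  "jobs": [
    {"jid": "abc123", "name": "my-streaming-job", "state": "RUNNING",
     "duration": 123456}
  ]
}
"""

def running_job_names(body):
    """Parse the response and return the names of jobs in RUNNING state."""
    payload = json.loads(body)
    return [job["name"] for job in payload["jobs"]
            if job["state"] == "RUNNING"]

names = running_job_names(response_body)
```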
Sure, just give me the git repo url to build and I'll give it a try.
Niels
On Wed, Dec 2, 2015 at 4:28 PM, Maximilian Michels wrote:
> I mentioned that the exception gets thrown when requesting container
> status information. We need this to send a heartbeat to YARN but it is
> not very crucial
Hello,
I use Grafana/Graphite to monitor my applications. The Flink GUI is really
nice, but it disappears after the job completes and consequently is not
suitable for long-term monitoring.
For batch applications, I simply send the accumulator values at the end of
the job to my Graphite database.
I mentioned that the exception gets thrown when requesting container
status information. We need this to send a heartbeat to YARN but it is
not very crucial if this fails once for the running job. Possibly, we
could work around this problem by retrying N times in case of an
exception.
Would it be
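The "retry N times in case of an exception" idea is a generic pattern; the
sketch below is not Flink or YARN API, just an illustration of the retry
loop being proposed:

```python
def retry(n, fn, *args, **kwargs):
    """Call fn, making up to n attempts in total; re-raise the last
    exception only if every attempt fails."""
    last_error = None
    for attempt in range(n):
        try:
            return fn(*args, **kwargs)
        except Exception as err:
            last_error = err
    raise last_error

# Demo: a call that fails twice before succeeding.
attempts = []
def flaky_heartbeat():
    attempts.append(1)
    if len(attempts) < 3:
        raise RuntimeError("container status unavailable")
    return "ok"

result = retry(5, flaky_heartbeat)
```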
Do you think it is possible to push this thing ahead? I need to implement
this interactive feature of Datasets. Do you think it is possible to
implement the persist() method in Flink (similar to Spark)? If you want, I
can work on it with some instructions.
On Wed, Dec 2, 2015 at 3:05 PM, Maximilia
No, I was just asking.
No upgrade is possible for the next month or two.
This week is the busiest of the year for us ...
Our shop is doing about 10 orders per second these days ...
So they won't upgrade until next January/February
Niels
On Wed, Dec 2, 2015 at 3:59 PM, Maximilian Michels wrote:
>
Hi Niels,
You mentioned you have the option to update Hadoop and redeploy the
job. Would be great if you could do that and let us know how it turns
out.
Cheers,
Max
On Wed, Dec 2, 2015 at 3:45 PM, Niels Basjes wrote:
> Hi,
>
> I posted the entire log from the first log line at the moment of fai
Hi,
I posted the entire log from the first log line at the moment of failure to
the very end of the logfile.
This is all I have.
As far as I understand the Kerberos and keytab mechanism in Hadoop YARN,
it catches the "Invalid Token" and then (if a keytab is available) gets a
new Kerberos ticket (or TGT?)
Hi Flavio,
I was working on this some time ago but it didn't make it in yet and
priorities shifted a bit. The pull request is here:
https://github.com/apache/flink/pull/640
The basic idea is to remove Flink's ResultPartition buffers from memory
lazily, i.e. keep them as long as enough memory is ava
Hi Niels,
Sorry to hear you experienced this exception. From a first glance, it
looks like a bug in Hadoop to me.
> "Not retrying because the invoked method is not idempotent, and unable to
> determine whether it was invoked"
That is nothing to worry about. This is Hadoop's internal retry
mech
Mihail!
The Flink windows are currently in-memory only. There are plans to relax
that, but for the time being, having enough memory in the cluster is
important.
@Gyula: I think window state is currently also limited when using the
SqlStateBackend, by the size of a row in the database (because win
Hi Fabian,
I have already created a JIRA for this one.
https://issues.apache.org/jira/browse/FLINK-3099
Thanks a lot for this.
Cheers
On Wed, Dec 2, 2015 at 6:02 PM, Fabian Hueske wrote:
> Hi Welly,
>
> at the moment we only provide native Windows .bat scripts for start-local
> and the CLI clie
Hi Nirmalya,
please find my answers in line.
2015-12-02 3:26 GMT+01:00 Nirmalya Sengupta:
> Hello Fabian (),
>
> Many thanks for your encouraging words about the blogs. I want to make a
> sincere attempt.
>
> To summarise my understanding of the rule of removal of the elements from
> the window
Hi Welly,
at the moment we only provide native Windows .bat scripts for start-local
and the CLI client.
However, we check that the Unix scripts (including start-webclient.sh) work
in a Windows Cygwin environment.
I have to admit I am not familiar with MinGW, so I am not sure what is
happening there.
Hi,
We have a Kerberos secured Yarn cluster here and I'm experimenting
with Apache Flink on top of that.
A few days ago I started a very simple Flink application (just stream
the time as a String into HBase 10 times per second).
I (deliberately) asked our IT-ops guys to make my account have a m
Hi,
I am working on a use case that involves storing state for billions of
keys. For this we use a MySql state backend that will write each key-value
state to MySql server so it will only hold a limited set of key-value pairs
on heap while maintaining the processing guarantees.
This will keep our
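The pattern described — persist every key-value pair to an external store
while keeping only a bounded set on heap — can be sketched as a
write-through cache with LRU eviction. This is an illustrative model (a
dict stands in for the MySQL server), not the actual state backend:

```python
from collections import OrderedDict

class WriteThroughState:
    """Bounded in-memory cache over a backing store. Writes go to the
    store immediately; the cache evicts its least-recently-used entry
    when it exceeds capacity."""

    def __init__(self, capacity, store=None):
        self.capacity = capacity
        self.store = store if store is not None else {}
        self.cache = OrderedDict()

    def put(self, key, value):
        self.store[key] = value          # write-through: store is always current
        self.cache[key] = value
        self.cache.move_to_end(key)
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)   # evict least recently used

    def get(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)
            return self.cache[key]
        value = self.store[key]          # cache miss: read back from the store
        self.put(key, value)
        return value

state = WriteThroughState(capacity=2)
for k in ("a", "b", "c"):
    state.put(k, k.upper())
# Only 2 entries stay on heap, but all 3 survive in the store.
```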
Hi Aljoscha,
we have no upper bound for the number of expected keys. The max size for an
element is 1 KB.
There are 2 Maps, a KeyBy, a timeWindow, a Reduce and a writeAsText
operators in the job. In the first Map we parse the contained JSON object
in each element and forward it as a Flink Tuple.
There is an overview of the guarantees the various sources can give you:
https://ci.apache.org/projects/flink/flink-docs-master/apis/fault_tolerance.html#fault-tolerance-guarantees-of-data-sources-and-sinks
On Wed, Dec 2, 2015 at 9:19 AM, Till Rohrmann
wrote:
> Just a small addition. Your sources have
Hi Mihail,
could you please give some information about the number of keys that you are
expecting in the data and how big the elements are that you are processing in
the window.
Also, are there any other operations that could be taxing on memory? I think
the different exception you see for 500M
Yes, with the "start-cluster-streaming.sh" script.
If the TaskManager gets 5GB of heap it manages to process ~100 million
messages and then throws the above OOM.
If it gets only 500MB it manages to process ~8 million and a somewhat
misleading exception is thrown:
12/01/2015 19:14:07  Source: Cus
It's good news that the issue has been resolved.
Regarding the OOM, did you start Flink in the streaming mode?
On Wed, Dec 2, 2015 at 10:18 AM, Vieru, Mihail
wrote:
> Thank you, Robert! The issue with Kafka is now solved with the
> 0.10-SNAPSHOT dependency.
>
> We have run into an OutOfMemory ex
Hi Brian,
I don't recall that Docker requires commands to run in the foreground. Still, if
that is your requirement, simply remove the "&" at the end of this line in
flink-daemon.sh:
$JAVA_RUN $JVM_ARGS ${FLINK_ENV_JAVA_OPTS} "${log_setting[@]}" -classpath
"`manglePathList "$FLINK_TM_CLASSPATH:$INTERN
Thank you, Robert! The issue with Kafka is now solved with the
0.10-SNAPSHOT dependency.
We have run into an OutOfMemory exception though, which appears to be
related to the state. As my colleague, Javier Lopez, mentioned in a
previous thread, state handling is crucial for our use case. And as the
Hi Welly,
We still have to decide on the next release date but I would expect
Flink 0.10.2 within the next weeks. If you can't work around the union
limitation, you may build your own Flink either from the master or the
release-0.10 branch which will eventually be Flink 0.10.2.
Cheers,
Max
On Tu
Hi Brian,
as far as I know, this is currently not possible with our scripts.
However it should be relatively easy to add by simply executing the Java
command in flink-daemon.sh in the foreground. Do you want to add this?
Cheers,
Till
On Dec 1, 2015 9:40 PM, "Brian Chhun" wrote:
> Hi All,
>
>
Just a small addition. Your sources have to be replayable to some extent.
With replayable I mean that they can continue from some kind of offset.
Otherwise the checkpointing won't help you. The Kafka source supports that
for example.
Cheers,
Till
On Dec 1, 2015 11:55 PM, "Márton Balassi" wrote:
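A replayable source, in the sense described above, is simply one that can
resume emitting from a saved offset, so a checkpoint only needs to record
that offset. This mirrors what the Kafka source provides, but the code is a
generic sketch, not Flink's connector API:

```python
class ReplayableSource:
    """A source over a fixed record list that can resume from an offset,
    which is exactly the property checkpointing relies on."""

    def __init__(self, records):
        self.records = records
        self.offset = 0

    def read(self):
        record = self.records[self.offset]
        self.offset += 1
        return record

    def checkpoint(self):
        return self.offset           # all the state we need to persist

    def restore(self, offset):
        self.offset = offset         # resume from the saved position

source = ReplayableSource(["r0", "r1", "r2", "r3"])
source.read(); source.read()         # consume r0, r1
saved = source.checkpoint()          # offset 2 is recorded
source.read()                        # r2 consumed, then a failure happens
source.restore(saved)                # recover from the checkpoint
replayed = source.read()             # r2 is replayed, not lost
```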