Hi again,
it turns out I had a bug in my own logic; Flink works as expected (which is
perfect). So, maybe this helps others:
Problem:
- execute superstep-dependent UDFs on datasets which do not have access
to the iteration context
Solution:
- add a dummy element to the working set (W) at the beginning of the step
function (see the sketch below)
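For others searching the archives, a minimal sketch of a superstep-dependent
UDF inside a bulk iteration (Scala DataSet API; the multiply-by-superstep
body is purely illustrative):

import org.apache.flink.api.common.functions.RichMapFunction
import org.apache.flink.api.scala._

object SuperstepExample {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment
    val initial = env.fromElements(1.0, 2.0, 3.0)

    // Rich functions applied inside an iteration see the
    // IterationRuntimeContext, which exposes the superstep number.
    val result = initial.iterate(10) { current =>
      current.map(new RichMapFunction[Double, Double] {
        override def map(v: Double): Double = {
          val superstep = getIterationRuntimeContext.getSuperstepNumber
          v * superstep // superstep-dependent logic goes here
        }
      })
    }

    result.print()
  }
}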
Hello,
right now Flink's local matrices are rather raw, so for this kind of usage
you should rely on Breeze. If you need to perform operations (slicing, in
this case), Breeze is the better option if you don't want to reimplement
everything.
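For example, slicing with Breeze looks like this (a quick sketch):

import breeze.linalg.{DenseMatrix, DenseVector}

val m = DenseMatrix(
  (1.0, 2.0, 3.0),
  (4.0, 5.0, 6.0),
  (7.0, 8.0, 9.0))

val topLeft = m(0 to 1, 0 to 1)                 // rows 0-1, columns 0-1
val thirdColumn: DenseVector[Double] = m(::, 2) // all rows, column 2
val secondRow = m(1, ::)                        // row 1, as a row vector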
In case you already developed against Flink's matrices, there…
OK, I found a product that seems to be what I am looking for: Apache Zeppelin.
I will have a look into that one. If anyone can point me to an example (Git)
of outputting data from Flink to a Zeppelin notebook, I would be happy.
- Original message -
> From: Palle
> To: user@flink.apache.org
Hi,
Flinkspector is indeed a good choice to circumvent this problem as it
specifically has several mechanisms to deal with these synchronization
problems. Unfortunately, I'm still looking for a reasonable solution to
support checking of Scala types.
Maybe I will provide a version in which you can…
Hi,
I created a new doc specifically about the interplay of lateness and window
state garbage collection:
https://docs.google.com/document/d/1vgukdDiUco0KX4f7tlDJgHWaRVIU-KorItWgnBapq_8/edit?usp=sharing
There is still some stuff that needs to be figured out, both in the new doc
and the existing doc…
Hi,
right now the only way of getting at the timestamps is writing a custom
operator and using that with DataStream.transform(). Take a look at
StreamMap, which is the operator implementation that executes MapFunctions.
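A hedged sketch of such an operator (assuming the 1.x operator API; names
are illustrative, mirror StreamMap for the details):

import org.apache.flink.streaming.api.operators.{AbstractStreamOperator, OneInputStreamOperator}
import org.apache.flink.streaming.api.watermark.Watermark
import org.apache.flink.streaming.runtime.streamrecord.StreamRecord

// Emits (value, timestamp) pairs by reading each record's internal timestamp.
class TimestampedMap[T] extends AbstractStreamOperator[(T, Long)]
    with OneInputStreamOperator[T, (T, Long)] {

  override def processElement(element: StreamRecord[T]): Unit = {
    output.collect(new StreamRecord(
      (element.getValue, element.getTimestamp), element.getTimestamp))
  }

  // Forward watermarks unchanged, as StreamMap does.
  override def processWatermark(mark: Watermark): Unit = {
    output.emitWatermark(mark)
  }
}

// usage: stream.transform("extract timestamps", new TimestampedMap[MyType])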
I think the links in the doc should point to
https://ci.apache.org/projects/fl
Thanks Aljoscha :) I added some comments that might seem relevant from the
user's point of view.
Gyula
Aljoscha Krettek wrote (at 10:33 on Mon, 30 May 2016):
> Hi,
> I created a new doc specifically about the interplay of lateness and
> window state garbage collection:
> https://docs.goo
I was trying to run a basic program in Java by submitting it to the job
manager in Flink. I have a native library from OpenCV. When I try to
submit the job I get "java.lang.UnsatisfiedLinkError: no opencv_java310 in
java.library.path"; however, when I run it in IntelliJ by setting up the
Flink execution…
Hi,
a hot update of a running cluster is not possible right now, and no one is
working on this for the near future. We are aware that this would be nice
to have, though.
For 2), this is possible, but not without stopping the job. Savepoints are
the feature that was introduced for that:
htt
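The basic workflow is (a sketch; the exact flags are on the savepoints
documentation page):

bin/flink savepoint <jobId>                  # trigger a savepoint; prints its path
bin/flink cancel <jobId>                     # stop the old job
bin/flink run -s <savepointPath> my-job.jar  # start the new version from it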
Hi,
the state will be kept indefinitely, but we are planning to introduce a
setting that would allow setting a time-to-live on state. I think this is
exactly what you would need. As an alternative, maybe you could implement
your program using windows? That way you would also bound how long state
is kept.
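For example, a minimal sketch (window size and field names are illustrative):

import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.windowing.time.Time

case class Click(userId: String, clicks: Int)

// Window state only lives until the window fires and is purged, so this
// bounds how long Flink keeps state per key.
def hourlyCounts(events: DataStream[Click]): DataStream[Click] =
  events
    .keyBy(_.userId)
    .timeWindow(Time.hours(1))
    .sum("clicks")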
Thanks for the feedback! :-) I already read the comments on the file.
On Mon, 30 May 2016 at 11:10 Gyula Fóra wrote:
> Thanks Aljoscha :) I added some comments that might seem relevant from the
> user's point of view.
>
> Gyula
>
> Aljoscha Krettek wrote (at 10:33 on Mon, 30 May 2016):
I tried to reproduce the error on a subset of the data; reducing the
available memory and greatly increasing GC pressure (by creating a lot of
useless objects in one of the first UDFs) caused this error:
Caused by: java.io.IOException: Thread 'SortMerger spilling thread'
terminated due to an exception…
2) What about the following: You trigger a savepoint and deploy the
2nd version from the savepoint. You let both job versions run at the
same time and switch the consumers of the job to the new version (e.g.
Kafka output topic v2). On the Flink side, this should be possible,
but moves the problem to…
Hi Flavio,
can you privately share the source code of your Flink job with me?
I'm wondering whether the issue might be caused by a version mixup between
different versions on the cluster (different JVM versions? or different
files in the lib/ folder?). How are you deploying the Flink job?
Regards,
Hi Palle,
I think there is currently no way of sending the data from a streaming
Flink job into Zeppelin.
What rate / amount of data do you expect to send every 10 seconds to the
visualization tool?
People have used Flink -> ES -> Kibana for this purpose in the past [1],
but I think you cannot se…
Hi Lydia,
the `FlinkMLTools.persist` method is used to save ML models and can also be
used to save Matrix and Vector objects. Note that the method uses
TypeSerializerOutputFormat, which is a binary output format.
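A quick usage sketch (the path is illustrative):

import org.apache.flink.api.scala._
import org.apache.flink.ml.common.FlinkMLTools
import org.apache.flink.ml.math.DenseVector

val env = ExecutionEnvironment.getExecutionEnvironment
val vectors = env.fromElements(DenseVector(1.0, 2.0), DenseVector(3.0, 4.0))

// Writes the DataSet to the path in binary form and returns a new DataSet
// that reads it back from there.
val persisted = FlinkMLTools.persist(vectors, "hdfs:///models/vectors")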
Regards,
Chiwan Park
> On May 30, 2016, at 11:31 AM, Lydia Ickler wrote:
>
> Hi,
>
> I would…
Hello Flink team,
How can I partition and share static state among instances of a streaming
operator?
I have a huge list of keys and values which are used to filter tuples in a
stream. The list does not change. Currently I am sharing the list with each
operator instance via the constructor, a…
Hi all,
I'm using Flink 1.0.1 and a FlinkKafkaConsumer09.
I'm very interested in getting data from a specific time offset in Kafka.
Is there a property that can do this?
Or is there another way of handling such issues?
I'm also using checkpointing.
If I deploy a new pipeline with the same i…
Hi Stavros,
I have the same problem as you and am trying to solve it. Did you find a
solution in the meantime?
Thank you
Hi Simon,
Timestamp-awareness has been added to Kafka 0.10 only [1]. I'm not sure if
the 0.9 connector code of Flink will support Kafka 0.10 immediately.
Another way of handling the issue would be
- Implementing a custom offset-metadata system (storing the timestamp for
some of the offsets, say every…
Hi Robert.
Thank you for the answer.
I am looking at a rate of at most 10,000 elements per 10 seconds, so
Elastic/Kibana is probably the way to go. I'll find a way to model it.
Thanks.
/Palle
- Original message -
> From: Robert Metzger
> To: user@flink.apache.org
> Date: Mon, 30 May 2…
Just transform the list into a DataStream; a DataStream can be finite.
One solution, in the context of a streaming environment, is to use Kafka or
any other distributed broker; Flink ships with a Kafka source.
1) Create a Kafka topic dedicated to your list of key/values. Inject your va…
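One way to wire this up is to consume that topic as a second (control)
stream, broadcast it, and connect it to the data stream. A hedged sketch
(types, field names, and the filter logic are illustrative):

import org.apache.flink.streaming.api.functions.co.CoFlatMapFunction
import org.apache.flink.streaming.api.scala._
import org.apache.flink.util.Collector

import scala.collection.mutable

case class Event(key: String, payload: String)

// Broadcasting `rules` gives every parallel instance the full list.
def filterByList(events: DataStream[Event],
                 rules: DataStream[(String, String)]): DataStream[Event] =
  events
    .connect(rules.broadcast)
    .flatMap(new CoFlatMapFunction[Event, (String, String), Event] {
      private val table = mutable.Map.empty[String, String]

      // Main stream: keep only tuples whose key is in the list.
      override def flatMap1(e: Event, out: Collector[Event]): Unit =
        if (table.contains(e.key)) out.collect(e)

      // Control stream: update the local copy of the list.
      override def flatMap2(kv: (String, String), out: Collector[Event]): Unit =
        table += kv
    })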
Hi,
sounds doable. I think it should be easy to set up a first proof of concept
for this.
Let us know if you need any further assistance.
Regards,
Robert
On Mon, May 30, 2016 at 2:29 PM, Palle wrote:
> Hi Robert.
>
> Thank you for the answer.
>
> I am looking at a rate of max 10.000 elements /
Hi David,
I guess you can verify it by adding custom log statements to the Flink
code (for that, you need to recompile Flink).
Maybe a debugger is also sufficient (if you are running Flink locally).
We are currently reworking the reading of static files for the streaming
environment. Maybe it's…
Hi,
I plan to leverage Flink data stream programs within a larger application.
I'd like to be able to execute a data stream program in detached mode directly
from the StreamExecutionEnvironment, similar to how I can execute a program in
blocking mode. I was expecting to find
StreamExecutionEn…
Dear Philippe,
that is exactly what I need. Thank you for the concise explanation.
This approach is excellent, as it also permits the values to be easily
updated externally.
Kind regards
Leon
On 30 May 2016 at 14:31, philippe.capar...@orange.fr wrote:
>
>
> Just transform the list into a DataStream. A…
Hi,
Alexis is right. The original data set is only read once, and the two
flatMaps run in parallel on multiple machines in the cluster.
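In code this just means reusing the same DataSet variable (a trivial sketch):

import org.apache.flink.api.scala._

val env = ExecutionEnvironment.getExecutionEnvironment
val input = env.readTextFile("hdfs:///input") // read once

// Both branches consume the same scan of `input`; the source is not re-read.
val words  = input.flatMap(_.split("\\s+"))
val fields = input.flatMap(_.split(","))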
Regards,
Robert
On Fri, May 27, 2016 at 11:10 PM, Alexis Gendronneau <
a.gendronn...@gmail.com> wrote:
> Hi Jon,
>
> I'm pretty sure your input will be processed
Hi everyone,
basic question, but I just want to check if my current understanding of
the log4j property files inside flink's conf directory is correct.
log4j-cli.properties: Used by the client "flink run/list" for code not
executed in the cluster
log4j-yarn-session.properties: Used by the client w…
I believe the log4j.properties are always used for JM/TM logs,
and log4j-yarn-session.properties strictly for pure YARN stuff.
On 30.05.2016 18:30, Konstantin Knauf wrote:
Hi everyone,
basic question, but I just want to check if my current understanding of
the log4j property files inside flink's…
I think "log4j.properties" is also used for YARN (it is included in the
shipped bundle, together with jars).
Otherwise it is correct.
Would be good to write that down somewhere...
On Mon, May 30, 2016 at 6:30 PM, Konstantin Knauf <
konstantin.kn...@tngtech.com> wrote:
> Hi everyone,
>
> basic q
I am not sure, but the reason could be that you are not submitting a "fat jar"
containing all the libs when submitting it to the job manager.
When you run it through IntelliJ, it takes care of including the libraries.
To create a fat jar, use the maven-shade-plugin; a minimal snippet is below.
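A minimal sketch of the plugin configuration (adjust to your project):

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
    </execution>
  </executions>
</plugin>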
Let me know if it works.
Thanks,
Arpit
On
Hi Jordan,
the ./bin/flink client's run command has a -d / --detached flag for
detached execution.
However, this doesn't allow you to programmatically control the running job.
What you probably have to do is use a RemoteEnvironment and submit the job
in a blocking way from a separate thread, as sketched below.
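A minimal sketch of that workaround (host, port, and jar path are
illustrative):

import org.apache.flink.streaming.api.scala._

val env = StreamExecutionEnvironment.createRemoteEnvironment(
  "jobmanager-host", 6123, "target/my-job.jar")

env.fromElements(1, 2, 3).map(_ * 2).print() // build the topology as usual

// execute() blocks until the job finishes, so run it from a separate
// thread to get detached-like behaviour in the calling application.
new Thread(new Runnable {
  override def run(): Unit = env.execute("pseudo-detached job")
}).start()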
Hi,
I have a YARN cluster with 7 nodes:
- Five nodes: 16 GB RAM
- One node: 8 GB RAM
- One YARN ResourceManager: 8 GB
I want to use the 8 GB machine (not the ResourceManager) specifically as the
job manager and the other five nodes as task managers.
Is there a way to do that?
I’ve merged a patch [1] for this issue. Now we can use Option as a key.
[1]:
https://git-wip-us.apache.org/repos/asf?p=flink.git;a=commit;h=c60326f85faaa38bcc359d555cd2d2818ef2e4e7
Regards,
Chiwan Park
> On Apr 5, 2016, at 2:08 PM, Chiwan Park wrote:
>
> I just found that Timur created a JIRA
Hello,
I'm curious about the ability to alter the processing of streams in Flink at
run-time.
A potential use-case may look like the following:
1. I have a stream already running (i.e. data processing has already
started) in the Flink cluster
2. At some point in time I decide that I need to add some more st…
Hi Arpit,
I'm not sure what you are trying to do, but if you want YARN to execute a
type of job on particular nodes, you may find a way using node labels:
https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/NodeLabel.html
I'm not sure you'll be able to specify that the JobManager ha…
I will try and come back with feedback if it's possible.
Thanks,
Arpit
On Tue, May 31, 2016 at 12:00 PM, Alexis Gendronneau <
a.gendronn...@gmail.com> wrote:
> Hi Arpit,
>
> I'm not sure what you are trying to do, but if you want YARN to execute a
> type of job on particular nodes, you may find a way