You should also add Apache Mahout, whose new Samsara DSL also runs on Flink.
-s
On 06.04.2016 08:50, Henry Saputra wrote:
Thanks, Slim. I have just updated the wiki page with these entries.
On Tue, Apr 5, 2016 at 10:20 AM, Slim Baltagi <sbalt...@gmail.com> wrote:
Hi
The followi
I was gonna hold off on that until we get Mahout 0.12.0 out the door
(targeted for this weekend).
I would add Apache NiFi to the list.
Future:
Apache Mahout
Apache BigTop
OpenStack and Kubernetes (skunkworks)
On Wed, Apr 6, 2016 at 3:03 AM, Sebastian wrote:
> You should also add Apache
Cheerz,
I have been working for the last few months on a comparison of different data
processing engines and recently came across Apache Flink. After reading
different academic papers comparing Flink with other data processing engines, I
would definitely give it a shot. The only issue I am currently hav
From my side I was starting the YARN session from the cluster:
flink-0.10.1/bin/yarn-session.sh -n 64 -s 4 -jm 4096 -tm 4096
Then getting the IP/port from the WebUI and then from Eclipse:
ExecutionEnvironment env =
ExecutionEnvironment.createRemoteEnvironment("xx.xx.xx.xx", 40631,
"target/FlinkTe
Yes, for sure.
I added some documentation for AWS here:
https://ci.apache.org/projects/flink/flink-docs-release-1.0/setup/aws.html
Would be nice to update that page with your pull request. :-)
– Ufuk
On Wed, Apr 6, 2016 at 4:58 AM, Chiwan Park wrote:
> Hi Timur,
>
> Great! Bootstrap action fo
Hi Balaji,
from the stack trace it looks as if you cannot open a connection to Redis.
Have you checked that you can access Redis from all your TaskManager nodes?
Cheers,
Till
On Wed, Apr 6, 2016 at 7:46 AM, Balaji Rajagopalan <
balaji.rajagopa...@olacabs.com> wrote:
> I am trying to use AWS EMR ya
Hi,
I'm trying out the new CEP library but have some problems with event
detection.
In my case Flink detects the event pattern: A followed by B within 10
seconds.
But a short time after event detection, when the event pattern isn't matched
anymore, the program crashes with the error message:
04/06/20
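For reference, the pattern described above would look roughly like the sketch below with the CEP API. This is not the actual job from this thread: the Event type with its getType() accessor is a made-up placeholder.

import org.apache.flink.api.common.functions.FilterFunction;
import org.apache.flink.cep.CEP;
import org.apache.flink.cep.PatternStream;
import org.apache.flink.cep.pattern.Pattern;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.windowing.time.Time;

public class AThenBWithin10s {
    // Made-up event type for illustration.
    public static class Event {
        private final String type;
        public Event(String type) { this.type = type; }
        public String getType() { return type; }
    }

    // "A followed by B within 10 seconds"
    public static PatternStream<Event> detect(DataStream<Event> input) {
        Pattern<Event, ?> pattern = Pattern.<Event>begin("A")
                .where(new FilterFunction<Event>() {
                    @Override
                    public boolean filter(Event e) {
                        return "A".equals(e.getType());
                    }
                })
                .followedBy("B")
                .where(new FilterFunction<Event>() {
                    @Override
                    public boolean filter(Event e) {
                        return "B".equals(e.getType());
                    }
                })
                .within(Time.seconds(10));
        return CEP.pattern(input, pattern);
    }
}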
Hi Zach!
I am working on incremental checkpointing, hope to have it in the master in
the next weeks.
The current approach is to have a full self-contained checkpoint every
once in a while, and have incremental checkpoints most of the time. Having
a full checkpoint every now and then spares you
Till,
I have checked that I am able to establish a connection from all the
TaskManager nodes by installing redis-cli on those nodes. The thing is, in the
constructor I am able to set and get values, and I am also getting PONG for the
ping. But once the object is initialized, when I try to call DriverStreamHel
Hi Norman,
which version of Flink are you using? We recently fixed some issues with
the CEP library which looked similar to your error message. The problem
occurred when using the CEP library with processing time. Switching to
event or ingestion time solved the problem.
The fixes to make it also
Hmm, I'm not a Redis expert, but are you sure that you see a successful ping
reply in the logs of the TaskManagers and not only in the client logs?
Another thing: is the redisClient thread-safe? Multiple map tasks might be
accessing the set and get methods concurrently.
Another question: The code
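One common way to avoid sharing one client across threads is to open a dedicated connection per parallel task inside a RichFunction. A rough sketch, assuming the Jedis client (host and port are placeholders):

import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;
import redis.clients.jedis.Jedis;

public class RedisLookupMap extends RichMapFunction<String, String> {
    private transient Jedis jedis;

    @Override
    public void open(Configuration parameters) {
        // One connection per parallel task instance, created on the
        // TaskManager itself, so nothing is shared across task threads.
        jedis = new Jedis("redis-host", 6379);
    }

    @Override
    public String map(String key) {
        return jedis.get(key); // placeholder lookup logic
    }

    @Override
    public void close() {
        if (jedis != null) {
            jedis.close();
        }
    }
}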
Hey Zach,
just added some documentation, which will be available in ~ 30 mins
here:
https://ci.apache.org/projects/flink/flink-docs-release-1.0/internals/back_pressure_monitoring.html
If you think that something is missing there, I would appreciate some
feedback. :-)
Back pressure is determined
Thanks for the info! It is a bit difficult to tell based on the documentation
whether or not you need to put your jar onto the Flink master node and run the
flink command from there in order to get a job running. The documentation on
https://ci.apache.org/projects/flink/flink-docs-release-1.0/se
For me, it took the local jar and uploaded it into the cluster.
2016-04-06 13:16 GMT+02:00 Shannon Carey :
> Thanks for the info! It is a bit difficult to tell based on the
> documentation whether or not you need to put your jar onto the Flink master
> node and run the flink command from th
The Flink PMC is pleased to announce the availability of Flink 1.0.1.
The official release announcement:
http://flink.apache.org/news/2016/04/06/release-1.0.1.html
Release binaries:
http://apache.openmirror.de/flink/flink-1.0.1/
Please update your Maven dependencies to the new 1.0.1 version and
The website has been updated for 1.0.1. :-)
@Till: If you don't mention it in the post, it makes sense to have a
note at the end of the post saying that the code examples only work
with 1.0.1.
On Mon, Apr 4, 2016 at 3:35 PM, Till Rohrmann wrote:
> Thanks a lot to all for the valuable feedback. I
"Getting Started" in main page shows "Download 1.0" instead of 1.0.1
-Matthias
On 04/06/2016 02:03 PM, Ufuk Celebi wrote:
> The website has been updated for 1.0.1. :-)
>
> @Till: If you don't mention it in the post, it makes sense to have a
> note at the end of the post saying that the code exam
Till,
Found the issue; it was my bad assumption about GlobalConfiguration. I
thought that once the configuration is read on the client machine, the
GlobalConfiguration params would be passed on to the TaskManager nodes as
well. They were not, and the default values were getting picked up, which was
local
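For anyone hitting the same thing: one way to make this explicit is to read the values on the client and hand them to the function through its constructor, since the serialized function instance is what actually travels to the TaskManagers. A minimal sketch (the redisHost field is just an example value to ship):

import org.apache.flink.api.common.functions.RichMapFunction;

public class ConfiguredLookup extends RichMapFunction<String, String> {
    // Set on the client; serialized and shipped with the job, unlike
    // values read via GlobalConfiguration on the client side only.
    private final String redisHost;

    public ConfiguredLookup(String redisHost) {
        this.redisHost = redisHost;
    }

    @Override
    public String map(String value) {
        return redisHost + "/" + value; // placeholder logic
    }
}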
Hi Ritesh,
I have some experience with RDF and Flink. What do you mean by accessing a
Jena model? How do you create it?
From my experience, reading triples from Jena models is evil because it has
some problems with garbage collection.
On 6 Apr 2016 00:51, "Ritesh Kumar Singh"
wrote:
> Hi,
>
> I
That is a good point, Ufuk. Will add the note.
On Wed, Apr 6, 2016 at 2:03 PM, Ufuk Celebi wrote:
> The website has been updated for 1.0.1. :-)
>
> @Till: If you don't mention it in the post, it makes sense to have a
> note at the end of the post saying that the code examples only work
> with 1.0
On Wed, Apr 6, 2016 at 2:18 PM, Matthias J. Sax wrote:
> "Getting Started" in main page shows "Download 1.0" instead of 1.0.1
We always had it like that, but I agree that it can be confusing. 1.0
indicates the "series" and the download page shows the exact version.
We can certainly change it.
–
What about the YARN (and HDFS) configuration? Should I put yarn-site.xml
directly into the classpath? Or can I set the variables in the execution
environment? I will give it a try tomorrow morning, will report back, and if
successful will blog about it ofc ☺
From: Christophe Salperwyck [mailto:christophe.salperw...@
Great to hear that you solved your problem :-)
On Wed, Apr 6, 2016 at 2:29 PM, Balaji Rajagopalan <
balaji.rajagopa...@olacabs.com> wrote:
> Till,
> Found the issue; it was my bad assumption about GlobalConfiguration.
> I thought that once the configuration is read on the client machine
>
Hi Flavio,
1. How do you access your RDF dataset via Flink? Are you reading it as a
normal input file and splitting the records, or do you have some wrappers in
place to convert the RDF data into triples? Can you please share some code
samples if possible?
2. I am using Jena TDB command
I exported it as an environment variable before starting Flink:
flink-0.10.1/bin/yarn-session.sh -n 64 -s 4 -jm 4096 -tm 4096
2016-04-06 15:36 GMT+02:00 Serhiy Boychenko :
> What about YARN(and HDFS) configuration? I put yarn-site.xml directly into
> classpath? Or I can set the variables in the e
Hi Shannon!
Welcome to the Flink community!
You are right, sinks in general need to be idempotent if you want
"exactly-once" semantics, because there can be a replay of elements that
were already written.
However, what you describe later, overwriting of a key with a new value (or
the same value
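As a toy illustration of that point (the in-memory map below just stands in for a real key-value store): replaying an element re-applies the same put, and the store ends up in the same state.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.functions.sink.SinkFunction;

public class UpsertSink implements SinkFunction<Tuple2<String, String>> {
    // Stand-in for an external key-value store.
    private static final Map<String, String> STORE = new ConcurrentHashMap<>();

    @Override
    public void invoke(Tuple2<String, String> record) {
        // Overwriting a key with the same value after a replay leaves
        // the store unchanged; that is what makes this sink idempotent.
        STORE.put(record.f0, record.f1);
    }
}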
Hello,
I'm getting started with Flink for a use case that could leverage the
window processing abilities of Flink that Spark does not offer.
Basically I have dumps of timeseries data (10y in ticks) from which I need to
calculate many metrics in an exploratory manner based on event time. NOTE:
I don't
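The general shape of such windowed metric computation would be something like the sketch below (instrument names, values, window size, and the metric are all made up; ingestion time keeps the example self-contained, while true event time would additionally need timestamps/watermarks assigned from the tick data):

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.time.Time;

public class TickWindows {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        // Ingestion time for a self-contained sketch; switch to
        // EventTime plus a timestamp assigner for the real use case.
        env.setStreamTimeCharacteristic(TimeCharacteristic.IngestionTime);

        env.fromElements(
                Tuple2.of("instrumentA", 101.5),
                Tuple2.of("instrumentA", 102.0),
                Tuple2.of("instrumentB", 55.3))
           .keyBy(0)                     // one window stream per instrument
           .timeWindow(Time.minutes(10)) // tumbling 10-minute windows
           .max(1)                       // example metric: max price
           .print();

        env.execute("tick metrics sketch");
    }
}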
The new back pressure docs are great, thanks Ufuk! I'm sure those will help
others as well.
In the Source => A => B => Sink example, if A and B show HIGH back
pressure, should Source also show HIGH? In my case it's a Kafka source and
Elasticsearch sink. I know currently our Kafka can provide data
Hi Stephan - incremental checkpointing sounds really interesting and
useful, I look forward to trying it out.
Thanks,
Zach
On Wed, Apr 6, 2016 at 4:39 AM Stephan Ewen wrote:
> Hi Zach!
>
> I am working on incremental checkpointing, hope to have it in the master
> in the next weeks.
>
> The cur
Ah sorry, I forgot to mention that in the docs.
The way that data is pulled from Kafka is bypassing Flink's task
Thread. The topic is consumed in a separate Thread and the task Thread
is just waiting. That's why you don't see any back pressure for Kafka
sources. I would expect your Kafka source to
Yeah I don't think that's the case for my setup either :) I wrote a simple
Flink job that just consumes from Kafka and sinks events/sec rate to
Graphite. That consumes from Kafka several orders of magnitude higher than
the job that also sinks to Elasticsearch. As you said, the downstream back
pres
Hi,
I am interested too. For my part, I was thinking of using HBase as a backend
so that my data are stored sorted. That is nice to have for generating
timeseries in the right order.
Cheers,
Christophe
2016-04-06 21:22 GMT+02:00 Raul Kripalani :
> Hello,
>
> I'm getting started with Flink for a use case that
I tried RocksDB, but the result was almost the same.
I used the following code and put 2.6 million distinct records into Kafka.
After processing all records, the state on HDFS became about 250MB and the
time needed for the checkpoint was almost 5 sec. Processing throughput was:
FsStateBackend -> 8000m
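For anyone wanting to reproduce the comparison, switching backends is a matter of setting them on the environment; a sketch along these lines (checkpoint path and interval are placeholders, and RocksDB needs the flink-statebackend-rocksdb dependency):

import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.runtime.state.filesystem.FsStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class BackendSetup {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(5000); // checkpoint every 5 seconds

        // Variant 1: heap state, snapshots written to HDFS
        env.setStateBackend(new FsStateBackend("hdfs:///flink/checkpoints"));

        // Variant 2: RocksDB state, same checkpoint directory
        // env.setStateBackend(new RocksDBStateBackend("hdfs:///flink/checkpoints"));
    }
}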
Hello,
I'm not sure whether it's a Hadoop or Flink-specific question, but since I
ran into this in the context of Flink I'm asking here. I would be glad if
anyone can suggest a more appropriate place.
I have a native library that I need to use in my Flink batch job that I run
on EMR, and I try to
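Not EMR-specific, but for context, the JVM side of using such a library typically looks like the sketch below (library and method names are made up); the hard part is making sure the .so file ends up on java.library.path on every TaskManager:

public class NativeLibHolder {
    static {
        // Resolves libmynative.so (Linux naming) via java.library.path,
        // which must include the directory where the library was shipped.
        System.loadLibrary("mynative");
    }

    // Made-up native entry point for illustration.
    public static native double score(double[] features);
}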
Hi Till,
I used Flink version 1.0.0 and tried all three TimeCharacteristics.
Now I tried the new Flink 1.0.1, which gives me the following error.
After detecting an event it processes a few stream tuples but then crashes.
I'm not sure how to solve that part of the error message: "This can indicate
t