Hi, it looks like the flink-connector-kafka jar is not available where
the job is running. Did you put it in Flink's library folder on all
the machines, or did you submit it with the job?
On Thu, Jul 16, 2015, 21:05 Wendong wrote:
> Hi Gyula,
>
> Cool. I removed .print and the error was gone.
One improvement suggestion; please check whether it is valid.
To check whether a system is adequately reliable, testers often
deliberately perform delete operations.
Steps:
1. Go to "flink\build-target\log"
2. Delete the "flink-xx-jobmanager-linux-3lsu.log" file
3. Run jobs along with writing log info, m
Hi Martin,
good to hear that you like Flink :-)
AFAIK, there are no plans to add a containment join. The Flink community is
currently working on adding support for outer joins.
Regarding a containment join, I am not sure about the number of use cases.
I would rather try to implement it on top of F
Hi Gyula,
Cool. I removed .print and the error was gone.
However, env.execute failed with errors:
.
Caused by: java.lang.Exception: Call to registerInputOutput() of invokable
failed
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:504)
...
Caused by: org.apache.fli
but isn't it a problem to have multiple vertices with the
same id?
On 16 Jul 2015 19:34, "Andra Lungu" wrote:
> For now, there is a validator that checks whether the vertex ids
> correspond to the target/src ids in the edges. If you want to check for
> vertex ID uniqueness, you'll
Hey,
You are getting that error because you are calling print after adding a
sink, which is an invalid operation.
Remove either addSink or print :)
Cheers,
Gyula
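To illustrate the fix, here is a minimal sketch (the actual sink is elided;
this assumes the streaming Scala API of that era):

    import org.apache.flink.streaming.api.scala._

    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val stream = env.fromElements("a", "b", "c")

    // Either print the stream (print is itself a sink)...
    stream.print()

    // ...or attach an explicit sink instead -- but don't do both on the
    // result of addSink, since a sink cannot be consumed again:
    // stream.addSink(mySinkFunction)

    env.execute("sink-or-print")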
On Thu, Jul 16, 2015 at 7:37 PM Wendong wrote:
> Thanks! I tried your updated MySimpleStringSchema and it works for both
Thanks! I tried your updated MySimpleStringSchema and it works for both
source and sink.
However, my problem is the runtime error "Data stream sinks cannot be
copied" listed in the previous post. I hope someone has run into this
problem before and can give a hint.
Wendong
For now, there is a validator that checks whether the vertex ids correspond
to the target/src ids in the edges. If you want to check for vertex ID
uniqueness, you'll have to implement your own custom validator... I know
of people hitting the same error outside Gelly, so I doubt that the lack of
unique id
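A simple way to check this yourself before building the graph is to compare
the total and distinct id counts on the vertex DataSet (a sketch using the
plain DataSet API; the tuple layout of the vertices is an assumption):

    import org.apache.flink.api.scala._

    // Fails fast if the vertex DataSet contains duplicate ids.
    // `vertices` is assumed to be a DataSet[(Long, String)] of (id, value).
    def assertUniqueIds(vertices: DataSet[(Long, String)]): Unit = {
      val total = vertices.count()
      val distinct = vertices.map(_._1).distinct().count()
      require(total == distinct,
        s"Found ${total - distinct} duplicate vertex ids")
    }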
Good to hear that your problem is solved :-)
Cheers,
Till
On Thu, Jul 16, 2015 at 5:45 PM, Philipp Goetze <
philipp.goe...@tu-ilmenau.de> wrote:
> Hi Till,
>
> many thanks for your effort. I finally got it working.
>
> I'm a bit embarrassed because the issue was solved by using the same
> flink
I just checked the web job manager; it says the runtime of the Flink job is
349 ms, but it actually takes 18 s measured with the "time" command in the
terminal.
Should I care more about the latter timing?
Hi Till,
many thanks for your effort. I finally got it working.
I'm a bit embarrassed because the issue was solved by using the same
flink-dist JAR as the locally running Flink version. That is to say, I had
used an older snapshot version for compiling than for running :-[
Best Regards,
Philipp
On
Hi Philipp,
what I usually do to run a Flink program on a cluster from within my IDE is
create a RemoteExecutionEnvironment. Since I have one UDF (the map function
which doubles the values) defined, I also need to specify the jar
containing this class. In my case, the jar is called test-1.0-SNAPSH
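For reference, the pattern looks roughly like this (host, port, jar path,
and the job itself are placeholders; a sketch against the batch Scala API
of that era):

    import org.apache.flink.api.scala._

    // The jar must contain the classes of all UDFs used in the program.
    val env = ExecutionEnvironment.createRemoteEnvironment(
      "jobmanager-host", 6123, "target/test-1.0-SNAPSHOT.jar")

    env.fromElements(1, 2, 3)
      .map(_ * 2)              // the UDF that lives in the shipped jar
      .writeAsText("/tmp/out")

    env.execute("remote-job")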
I thought a bit about this error... in my job I was generating multiple
vertices with the same id.
Could this cause such errors? Maybe there could be a check for such
situations in Gelly...
On Tue, Jul 14, 2015 at 10:00 PM, Andra Lungu wrote:
> Hello,
>
> Sorry for the delay. The bug is not in Ge
As the JavaDoc explains:
> * @param jarFiles The JAR files with code that needs to be shipped to
> *                 the cluster. If the program uses user-defined functions,
> *                 user-defined input formats, or any libraries, those must
> *                 be provided in the J
Is it possible that spawning JVMs takes a long time on your system, and
that this accounts for all the time?
On Thu, Jul 16, 2015 at 3:34 PM, Vinh June
wrote:
> Here are my logs
> http://pastebin.com/AJwiy2D8
> http://pastebin.com/K05H3Qur
> from client log, it seems to take ~2s, but with "time flink
Hey Tim,
I think my previous mail was intercepted or something similar. However,
you can find my reply below. I already tried a simpler job which just
does an env.fromElements... but I still get the same stack trace.
How do you normally submit jobs (jars) from within the code?
Best Regards,
Philipp
You are right, Arnaud. Sorry about this. :( I'm pushing a fix right now (for
0.9.1 as well). Thanks for reporting this!
On 16 Jul 2015, at 16:22, LINZ, Arnaud wrote:
> Hello,
>
> According to the documentation, getIndexOfThisSubtask starts from 1;
>
> /**
> * Gets the number o
Hello,
According to the documentation, getIndexOfThisSubtask starts from 1:
/**
 * Gets the number of the parallel subtask. The numbering starts from 1
 * and goes up to the parallelism, as returned by
 * {@link #getNumberOfParallelSubtasks()}.
 *
 * @return
Missing reference:
[1]
https://ci.apache.org/projects/flink/flink-docs-release-0.9/apis/cluster_execution.html
On Thu, 2015-07-16 at 16:04 +0200, Juan Fumero wrote:
> Hi,
> I would like to use the createRemoteEnvironment to run the application
> in a cluster and I have some questions. Followin
Hi,
I would like to use the createRemoteEnvironment to run the application
in a cluster and I have some questions. Following the documentation in
[1], it is not clear to me how to use it.
What should be the content of the jar file? All the external libraries
that I use? Or do I need to include the pr
Here are my logs
http://pastebin.com/AJwiy2D8
http://pastebin.com/K05H3Qur
From the client log, it seems to take ~2 s, but with "time flink run ...",
the actual time is ~18 s.
If you use the sample data from the example, there must be an issue with
the setup.
In Flink's standalone mode, it runs in 100ms on my machine.
It may be that the command line client takes a long time to start up, so
the program run time appears long. If it takes so long, one
Hi Philipp,
it seems that Stephan was right and that your JobGraph is somehow
corrupted. You can see from the JobSubmissionException that the JobGraph
contains a vertex whose InvokableClassName is null. Furthermore,
even the ID and the vertex name are null. This is a strong indicator, t
Hello,
I've been struggling with this simple issue for hours now: I am unable to
get the accumulator result of a streaming context; the accumulator map in
the JobExecutionResult is always empty.
Simple test code (directly inspired by the documentation):
My source =
public static clas
@Stephan: I use the sample data that comes with the example
Hey Tim,
here the console output now with log4j:
0 [pool-7-thread-1-ScalaTest-running-FlinkCompileIt] INFO
org.apache.flink.client.program.Client - Starting program in
interactive mode
121 [pool-7-thread-1-ScalaTest-running-FlinkCompileIt] DEBUG
org.apache.flink.api.scala.ClosureCleaner$
You can increase Flink's managed memory by increasing the TaskManager JVM
heap (taskmanager.heap.mb) in flink-conf.yaml.
There is an explanation of the options in the Flink documentation [1].
Regards,
Chiwan Park
[1]
https://ci.apache.org/projects/flink/flink-docs-master/setup/config.html#common-options
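For example (the values are illustrative only):

    # flink-conf.yaml
    # JVM heap per TaskManager; managed memory is carved out of this.
    taskmanager.heap.mb: 2048
    # Optionally pin managed memory to a fixed size in MB instead of a
    # fraction of the heap:
    # taskmanager.memory.size: 1024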
Could you also look into the JobManager logs? You may be submitting a
corrupt JobGraph...
On Thu, Jul 16, 2015 at 11:45 AM, Till Rohrmann
wrote:
> When you run your program from the IDE, then you can specify a
> log4j.properties file. There you can configure where and what to log. It
> should be
I found it in the JobManager log:
"21:16:54,986 INFO org.apache.flink.runtime.taskmanager.TaskManager
- Using 25 MB for Flink managed memory."
Is there a way to explicitly set this for local execution?
Vinh,
Are you using the sample data built into the example, or are you using your
own data?
On Thu, Jul 16, 2015 at 8:54 AM, Vinh June
wrote:
> I ran it on local, from terminal.
> And it's the Word Count example so it's small
Hey Vinh,
you have to look into the logs folder and find the log of the TaskManager
(something like *taskmanager*.log)
– Ufuk
On 16 Jul 2015, at 11:35, Vinh June wrote:
> Hi Max,
> When I call 'flink run', it doesn't show any information like that
When you run your program from the IDE, then you can specify a
log4j.properties file. There you can configure where and what to log. It
should be enough to place the log4j.properties file in the resource folder
of your project. An example properties file could look like:
log4j.rootLogger=INFO, tes
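A complete minimal version of such a file might look like the following
(the appender name "test" and the pattern are assumptions, since the
original mail is cut off here):

    log4j.rootLogger=INFO, test
    log4j.appender.test=org.apache.log4j.ConsoleAppender
    log4j.appender.test.layout=org.apache.log4j.PatternLayout
    log4j.appender.test.layout.ConversionPattern=%d{HH:mm:ss,SSS} %-5p %c - %m%n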
Hi Max,
When I call 'flink run', it doesn't show any information like that
Hi Till,
the problem is that this is the only output :( Or is it possible to get
a more verbose log output?
Maybe it is important to note that both Flink and our project are built
with Scala 2.11.
Best Regards,
Philipp
On 16.07.2015 11:12, Till Rohrmann wrote:
Hi Philipp,
could you post
Hi Philipp,
could you post the complete log output? This might help to get to the
bottom of the problem.
Cheers,
Till
On Thu, Jul 16, 2015 at 11:01 AM, Philipp Goetze <
philipp.goe...@tu-ilmenau.de> wrote:
> Hi community,
>
> in our project we try to submit built Flink programs to the jobmanag
Hi community,
in our project we try to submit built Flink programs to the jobmanager
from within Scala code. The test program is executed correctly when
submitted via the wrapper script "bin/flink run ..." and also with the
webclient. But when executed from within the Scala code nothing seems
Hi,
your first example doesn't work because the SimpleStringSchema does not
work for sinks. You can use this modified serialization schema:
https://gist.github.com/aljoscha/e131fa8581f093915582. This works for both
source and sink (I think the current SimpleStringSchema is not correct and
should be
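In case the gist goes away, the idea is roughly the following (a sketch
only; the exact interface and method names are assumptions based on the
0.9-era streaming API):

    // A schema usable for both a source and a sink: it turns bytes
    // into Strings and Strings back into bytes.
    class MySimpleStringSchema
        extends DeserializationSchema[String]
        with SerializationSchema[String, Array[Byte]] {

      // source side: byte[] -> String
      override def deserialize(message: Array[Byte]): String =
        new String(message)

      override def isEndOfStream(nextElement: String): Boolean = false

      // sink side: String -> byte[]
      override def serialize(element: String): Array[Byte] =
        element.getBytes
    }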
Hi George,
Thanks for the details. It looks like I have a long way to go.
For the big data benchmark, I would like to use those test cases, test data,
and test methodology to test different big data technologies.
BTW, I agree with you that no one system will necessarily be optimal for
all cases for
Hi Vinh,
If you run your program locally, then Flink uses the local execution mode
which allocates only little managed memory. Managed memory is used by Flink
to perform operations on serialized data. These operations can get slow if
too little memory gets allocated because data needs to be spille
ds1.filter(...) // here the selection of query 1
ds2.filter(...) // here the selection of query 2

exists:
ds1.join(ds2.distinct(id)).where(id).equalTo(id) {
  // join by your join key(s); note the distinct operator, otherwise you
  // will get many lines for each input line
  (left, right) => left // collect left
}
or
ds1.cogr
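The cut-off alternative presumably continues with coGroup; a hedged sketch
of that variant (the element type and field names are hypothetical):

    import org.apache.flink.api.scala._
    import org.apache.flink.util.Collector

    case class Row(id: Int, payload: String) // hypothetical element type

    // Emits each left element that has at least one match on the right,
    // without needing distinct() first: each key group is handled once.
    def existsJoin(ds1: DataSet[Row], ds2: DataSet[Row]): DataSet[Row] =
      ds1.coGroup(ds2).where(_.id).equalTo(_.id) {
        (lefts, rights, out: Collector[Row]) =>
          if (rights.hasNext) lefts.foreach(out.collect)
      }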
Hi everyone,
first of all, thanks for building this great framework! We are using Flink,
and especially Gelly, for building a graph analytics stack (gradoop.com).
I was wondering whether there is [planned] support for a containment join
operator. Consider the following example:
DataSet> left := {[0, 1],