Re: Best way to write data to HDFS by Flink

2015-06-25 Thread Hawin Jiang
Hi Stephan, Yes, that is a great idea. If it is possible, I will try my best to contribute some code to Flink. But I have to run some Flink examples first to understand Apache Flink. I just ran some Kafka with Flink examples. No examples are working for me. I am so sad right now. I didn't get any

Re: Kafka0.8.2.1 + Flink0.9.0 issue

2015-06-25 Thread Hawin Jiang
Dear Marton Here are some errors when I run KafkaProducerExample.java from Eclipse. kafka.common.KafkaException: fetching topic metadata for topics [Set(flink-kafka-topic)] from broker [ArrayBuffer(id:0,host:192.168.0.112,port:2181)] failed at kafka.client.ClientUtils$.fetchTopicMetadata(ClientU
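A detail worth noting in the stack trace above: the broker address is listed as 192.168.0.112:2181, but 2181 is ZooKeeper's default port, while Kafka brokers listen on 9092 by default. A hedged sketch of what the 0.8-era producer configuration would look like with a broker port (the host is taken from the error message; port 9092 is an assumption based on Kafka's default):

```properties
# Producer config sketch (Kafka 0.8.x "old" producer). The host comes from
# the error message above; 9092 is Kafka's default broker port -- adjust it
# to match the "port" setting in your broker's server.properties.
metadata.broker.list=192.168.0.112:9092
```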

Re: Kafka0.8.2.1 + Flink0.9.0 issue

2015-06-25 Thread Hawin Jiang
Dear Marton, I have upgraded my Flink to 0.9.0. But I could not consume data from Kafka with Flink. I have fully followed your example. Please help me. Thanks. Here is my code: StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironment().setParallelism(4); DataStream kafk

Re: Connecting the channel failed: Connection refused

2015-06-25 Thread Stephan Ewen
That makes perfect sense, thanks! On 25.06.2015 at 21:39, "Aaron Jackson" wrote: > So the JobManager was running on host1. This also explains why I didn't > see the problem until I had asked for a sizeable degree of parallelism > since it probably never assigned a task to host3. > > Thanks for you

Re: Connecting the channel failed: Connection refused

2015-06-25 Thread Aaron Jackson
So the JobManager was running on host1. This also explains why I didn't see the problem until I had asked for a sizeable degree of parallelism since it probably never assigned a task to host3. Thanks for your help On Thu, Jun 25, 2015 at 3:34 AM, Stephan Ewen wrote: > Nice! > > TaskManagers ne

ArrayIndexOutOfBoundsException when running job from JAR

2015-06-25 Thread Mihail Vieru
Hi, I get an ArrayIndexOutOfBoundsException when I run my job from a JAR in the CLI. This doesn't occur in the IDE. I've built the JAR using the "maven-shade-plugin" and the pom.xml configuration Robert has provided here: https://stackoverflow.com/questions/30102523/linkage-failure-when-runn

Re: writeAsCsv on HDFS

2015-06-25 Thread Hawin Jiang
Hi Flavio, Here is the example from Marton: You can use the env.writeAsText method directly. StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(); env.addSource(PersistentKafkaSource(..)) .map(/* do your operations */) .writeAsText("hdfs://:/path/to/your/file

Re: writeAsCsv on HDFS

2015-06-25 Thread Stephan Ewen
You could also just qualify the HDFS URL, if that is simpler (put host and port of the namenode in there): "hdfs://myhost:40010/path/to/file" On Thu, Jun 25, 2015 at 3:20 PM, Robert Metzger wrote: > You have to put it into all machines > > On Thu, Jun 25, 2015 at 3:17 PM, Flavio Pompermaier > w
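The shape of such a fully qualified URL can be inspected with plain java.net.URI; the sketch below (host and port are the placeholder values from the message above, not a real namenode) just shows which parts the scheme-qualified form carries, which is why no external Hadoop configuration is needed to resolve it:

```java
import java.net.URI;

public class QualifiedHdfsPath {
    public static void main(String[] args) {
        // Scheme, namenode host, and port are all embedded in the URL itself.
        URI qualified = URI.create("hdfs://myhost:40010/path/to/file");
        System.out.println(qualified.getScheme()); // hdfs
        System.out.println(qualified.getHost());   // myhost
        System.out.println(qualified.getPort());   // 40010
    }
}
```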

Re: writeAsCsv on HDFS

2015-06-25 Thread Robert Metzger
You have to put it into all machines On Thu, Jun 25, 2015 at 3:17 PM, Flavio Pompermaier wrote: > Do I have to put the hadoop conf file on each task manager or just on the > job-manager? > > On Thu, Jun 25, 2015 at 3:12 PM, Chiwan Park > wrote: > >> It represents the folder containing the hadoo

Re: writeAsCsv on HDFS

2015-06-25 Thread Flavio Pompermaier
Do I have to put the hadoop conf file on each task manager or just on the job-manager? On Thu, Jun 25, 2015 at 3:12 PM, Chiwan Park wrote: > It represents the folder containing the hadoop config files. :) > > Regards, > Chiwan Park > > > > On Jun 25, 2015, at 10:07 PM, Flavio Pompermaier > wrot

Re: Documentation Error

2015-06-25 Thread Ufuk Celebi
On 25 Jun 2015, at 14:31, Maximilian Michels wrote: > Thanks for noticing, Chiwan. I have the feeling this problem arose when the > website was updated. The problem about linking documentation pages from the > main website is that it is currently hard to go back to the main web site > from th

Re: writeAsCsv on HDFS

2015-06-25 Thread Chiwan Park
It represents the folder containing the hadoop config files. :) Regards, Chiwan Park > On Jun 25, 2015, at 10:07 PM, Flavio Pompermaier wrote: > > fs.hdfs.hadoopconf represents the folder containing the hadoop config files > (*-site.xml) or just one specific hadoop config file (e.g. core-site

Re: Documentation Error

2015-06-25 Thread Maximilian Alber
Hi Robert, thanks for the offer. At the moment I'm too busy. But maybe when we begin to use Flink for ML in the BBDC. Cheers, Max On Thu, Jun 25, 2015 at 2:48 PM, Robert Metzger wrote: > Hey Maximilian Alber, > I don't know if you are interested in contributing in Flink, but if you > would like

Re: writeAsCsv on HDFS

2015-06-25 Thread Flavio Pompermaier
*fs.hdfs.hadoopconf* represents the folder containing the hadoop config files (*-site.xml) or just one specific hadoop config file (e.g. core-site.xml or the hdfs-site.xml)? On Thu, Jun 25, 2015 at 3:04 PM, Robert Metzger wrote: > Hi Flavio, > > there is a file called "conf/flink-conf.yaml" > Ad

Re: writeAsCsv on HDFS

2015-06-25 Thread Robert Metzger
Hi Flavio, there is a file called "conf/flink-conf.yaml". Add a new line to the file with the following contents: fs.hdfs.hadoopconf: /path/to/your/hadoop/config This should fix the problem. Flink cannot load the configuration file from the jar containing the user code, because the file system i
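For concreteness, the entry would look like this in conf/flink-conf.yaml (the path here is a placeholder; point it at the directory holding your *-site.xml files):

```yaml
# flink-conf.yaml -- the value is a directory (not a single file) containing
# the Hadoop *-site.xml configs; /etc/hadoop/conf is a common but not
# universal location on Hadoop clusters.
fs.hdfs.hadoopconf: /etc/hadoop/conf
```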

Re: writeAsCsv on HDFS

2015-06-25 Thread Flavio Pompermaier
Could you describe it better with an example, please? Why doesn't Flink automatically load the properties of the Hadoop conf files within the jar? On Thu, Jun 25, 2015 at 2:55 PM, Robert Metzger wrote: > Hi, > > Flink is not loading the Hadoop configuration from the classloader. You > have to spe

Re: writeAsCsv on HDFS

2015-06-25 Thread Robert Metzger
Hi, Flink is not loading the Hadoop configuration from the classloader. You have to specify the path to the Hadoop configuration in the flink configuration "fs.hdfs.hadoopconf" On Thu, Jun 25, 2015 at 2:50 PM, Flavio Pompermaier wrote: > Hi to all, > I'm experiencing some problem in writing a f

writeAsCsv on HDFS

2015-06-25 Thread Flavio Pompermaier
Hi to all, I'm experiencing some problems writing a file as CSV on HDFS with Flink 0.9.0. The code I use is myDataset.writeAsCsv(new Path("hdfs:///tmp", "myFile.csv").toString()); If I run the job from Eclipse everything works fine, but when I deploy the job on the cluster (cloudera 5.1.3) I ob

Re: Documentation Error

2015-06-25 Thread Robert Metzger
Hey Maximilian Alber, I don't know if you are interested in contributing to Flink, but if you would like to, these small fixes to the documentation are really helpful for us! It's actually quite easy to work with the documentation locally. It is located in the "docs/" directory of the Flink source.

Re: Memory in local setting

2015-06-25 Thread Sebastian
Is there a way to configure this setting for a delta iteration in the scala API? Best, Sebastian On 17.06.2015 10:04, Ufuk Celebi wrote: On 17 Jun 2015, at 09:35, Mihail Vieru wrote: Hi, I had the same problem and setting the solution set to unmanaged helped: VertexCentricConfiguration p

Re: Documentation Error

2015-06-25 Thread Maximilian Alber
Another one: in the stream guide under "Connecting to the outside world" "Sources" I guess one "by" should be a "be". http://ci.apache.org/projects/flink/flink-docs-master/apis/streaming_guide.html On Thu, Jun 25, 2015 at 2:42 PM, Maximilian Michels wrote: > Thanks Max. I think the documentatio

Re: Documentation Error

2015-06-25 Thread Maximilian Michels
Thanks Max. I think the documentation has grown a lot and needs an overhaul. We should remove the unnecessary non-Flink-related stuff (e.g. configuring ssh keys in the setup guide). I like your idea of having an "essential" guide that just covers the basics for people already familiar with other bi

Re: Documentation Error

2015-06-25 Thread Maximilian Alber
Something different. I just read through the Spark documentation and yours. While the Spark one is quite unstructured and easy to understand, yours is structured and really detailed. It's great that you have that in depth documentation, but I would recommend you to make a boiled-down page with just

Re: Documentation Error

2015-06-25 Thread Maximilian Michels
Thanks for noticing, Chiwan. I have the feeling this problem arose when the website was updated. The problem about linking documentation pages from the main website is that it is currently hard to go back to the main web site from the documentation (the nav and URL changes). However, now we are suf

Re: Documentation Error

2015-06-25 Thread Chiwan Park
The "How to contribute" and coding guidelines pages are also duplicated on the web site and in the documentation. I think this duplication is not needed; we should merge them. Regards, Chiwan Park > On Jun 25, 2015, at 9:01 PM, Maximilian Michels wrote: > > Thanks. Fixed. Actually, that one is n

Re: Documentation Error

2015-06-25 Thread Maximilian Michels
Thanks. Fixed. Actually, that one is not linked anywhere, right? Just realized the FAQ page is duplicated on the web site and the Flink documentation. So there is http://ci.apache.org/projects/flink/flink-docs-master/faq.html and http://flink.apache.org/faq.html I'm guessing somebody wanted a FAQ

Re: Documentation Error

2015-06-25 Thread Maximilian Alber
Another one: on http://ci.apache.org/projects/flink/flink-docs-master/faq.html in the "What is parallelism? How do I set it?" Section the links are broken. Cheers, Max On Wed, Jun 24, 2015 at 9:52 AM, Maximilian Michels wrote: > Hi Max, > > Thanks for noticing! Fixed on the master and for the 0

Re: Cannot instantiate Mysql connection

2015-06-25 Thread Stephan Ewen
Good to hear it works. Libraries, class-loading, and initialization seem to be among the things that remain tricky once one switches to distributed processing. On Thu, Jun 25, 2015 at 10:58 AM, Flavio Pompermaier wrote: > Sorry for the late response but I was on vacation the last 2 weeks.. >

Re: Cannot instantiate Mysql connection

2015-06-25 Thread Flavio Pompermaier
Sorry for the late response but I was on vacation the last 2 weeks.. Calling Class.forName("com.mysql.jdbc.Driver") in the main() of my class made the things work! Thanks for the support, Flavio On Fri, Jun 5, 2015 at 11:13 PM, Stephan Ewen wrote: > Can you manually load the driver class, with
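The mechanism behind this fix can be sketched as follows: calling Class.forName in main() forces the driver class's static initializer to run (which registers it with java.sql.DriverManager) before the job is shipped to the cluster. The helper below is a hedged illustration, exercised with a JDK class since the MySQL connector jar from the thread may not be on the local classpath:

```java
public class DriverPreload {
    // Tries to load a class by name, running its static initializer.
    // In the thread, calling this with "com.mysql.jdbc.Driver" from main()
    // is what made the MySQL driver register itself with DriverManager.
    static boolean preload(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Demonstrated with a JDK class that is always present; swap in
        // "com.mysql.jdbc.Driver" when the connector jar is on the classpath.
        System.out.println(preload("java.sql.DriverManager")); // true
    }
}
```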

Re: Connecting the channel failed: Connection refused

2015-06-25 Thread Stephan Ewen
Nice! TaskManagers need to announce where they listen for connections. We do not yet block "localhost" as an acceptable address, to not prohibit local test setups. There are some routines that try to select an interface that can communicate with the outside world. Is host3 running on the same m
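When the automatic interface selection picks the wrong address (e.g. localhost), the address a TaskManager announces can be pinned per machine. A hedged config sketch: in current Flink versions the key is taskmanager.hostname; whether the 0.9 release discussed here already supports it is an assumption, and the hostname value is a placeholder.

```yaml
# flink-conf.yaml on the affected machine -- pin the address this TaskManager
# announces instead of relying on automatic interface selection.
# "host3.example.com" is a placeholder for the machine's reachable hostname.
taskmanager.hostname: host3.example.com
```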