[jira] [Created] (FLINK-2365) Review of How to contribute page

2015-07-15 Thread Enrique Bautista Barahona (JIRA)
Enrique Bautista Barahona created FLINK-2365: Summary: Review of How to contribute page Key: FLINK-2365 URL: https://issues.apache.org/jira/browse/FLINK-2365 Project: Flink Issue

[jira] [Created] (FLINK-2364) Link to Guide is Broken

2015-07-15 Thread Suminda Dharmasena (JIRA)
Suminda Dharmasena created FLINK-2364: Summary: Link to Guide is Broken Key: FLINK-2364 URL: https://issues.apache.org/jira/browse/FLINK-2364 Project: Flink Issue Type: Bug R

Re: Read XML from HDFS

2015-07-15 Thread santosh_rajaguru
Thanks, Fabian and Kostas, for the info. Using XMLInputFormat, I am able to read an XML file from HDFS. Cheers, Santosh -- View this message in context: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Read-XML-from-HDFS-tp7023p7035.html Sent from the Apache Flink Mailing List archive. ma

Re: Student looking to contribute to Stratosphere

2015-07-15 Thread Rohit Shinde
Okay! Thank you! On Wed, Jul 15, 2015 at 6:22 PM, Ufuk Celebi wrote: > Hey Rohit, > > it's best to do the discussion related to a specific issue *in* the issue > itself instead of the mailing list. > > In general, it's better to ask specific questions. But a general pointer > would be to look i

Re: Student looking to contribute to Stratosphere

2015-07-15 Thread Ufuk Celebi
Hey Rohit, it's best to do the discussion related to a specific issue *in* the issue itself instead of the mailing list. In general, it's better to ask specific questions. But a general pointer would be to look into the existing ML algorithm implementations, Stephan's approximate PageRank impleme

Re: Student looking to contribute to Stratosphere

2015-07-15 Thread Rohit Shinde
I intend to solve this issue: https://issues.apache.org/jira/browse/FLINK-1748 Could someone give me some pointers on how to approach this? On Wed, Jul 15, 2015 at 4:58 PM, Kostas Tzoumas wrote: > IDE choice is up to you with some limitations, see here for IDE setup > instructions: > > https://

Re: [Gelly] Help with GSA compiler tests

2015-07-15 Thread Stephan Ewen
Lady Kalamari, The plan looks good. To test whether the data is partitioned there: If you have the optimizer plan, make sure the global properties have a partitioning property of "HASH_PARTITIONED". Thanks, Stephan On Wed, Jul 15, 2015 at 2:07 PM, Vasiliki Kalavri wrote: > Hi, > > thank you S

Re: [Gelly] Help with GSA compiler tests

2015-07-15 Thread Vasiliki Kalavri
Hi, thank you Stephan! Here's the missing part of the plan: http://i.imgur.com/N861tg1.png There is one hash partition / sort. Is this what you're talking about? Regarding your second point, how can I test if the data is known to be partitioned at the end? -Vasia. On 15 July 2015 at 13:13, St

Re: Read XML from HDFS

2015-07-15 Thread Kostas Tzoumas
Perhaps there is also an existing HadoopInputFormat for XML that you might be able to reuse for your purposes (Flink supports Hadoop input formats). For example, there is an XMLInputFormat in the Apache Mahout codebase that you could take a look at: https://github.com/apache/mahout/blob/ad84344e40

Re: Read XML from HDFS

2015-07-15 Thread Fabian Hueske
Hi Santosh, yes that is possible, if you want to read a complete file without splitting it into records. However, you need to implement a custom InputFormat for that which extends Flink's FileInputFormat. If you want to split it into records, you need a character sequence that delimits two record
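The record-splitting idea in the reply above can be illustrated without any Flink code. The following is only a minimal sketch in plain Java, not Flink's FileInputFormat API: it assumes each record is wrapped in a known start and end tag, and the class name, method name, and `<record>` element are hypothetical, chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class XmlRecordSplitter {

    // Collects every substring that begins at startTag and ends with endTag.
    // Assumes records are not nested inside each other.
    public static List<String> splitRecords(String xml, String startTag, String endTag) {
        List<String> records = new ArrayList<>();
        int from = 0;
        while (true) {
            int start = xml.indexOf(startTag, from);
            if (start < 0) {
                break; // no further record starts
            }
            int end = xml.indexOf(endTag, start + startTag.length());
            if (end < 0) {
                break; // an unterminated trailing record is dropped
            }
            end += endTag.length();
            records.add(xml.substring(start, end));
            from = end;
        }
        return records;
    }

    public static void main(String[] args) {
        String xml = "<records><record id=\"1\">a</record>"
                + "<record id=\"2\">b</record></records>";
        // The start tag includes the trailing space so that the
        // enclosing <records> element is not matched.
        for (String record : splitRecords(xml, "<record ", "</record>")) {
            System.out.println(record);
        }
    }
}
```

A real input format would additionally have to handle records that cross split boundaries, which this sketch ignores because it works on a single in-memory string.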

Re: Student looking to contribute to Stratosphere

2015-07-15 Thread Kostas Tzoumas
IDE choice is up to you with some limitations, see here for IDE setup instructions: https://ci.apache.org/projects/flink/flink-docs-release-0.9/internals/ide_setup.html Scala IDE is not limited to Scala, it is based on Eclipse, so you can develop in Java. Most committers are using IntelliJ as far

Re: Student looking to contribute to Stratosphere

2015-07-15 Thread Rohit Shinde
What IDE should I use? There are various options and I already have Eclipse Luna. The IDE page lists that the Scala IDE is the best. So should I go with the Scala IDE? Will I be able to develop in Java later? On Wed, Jul 15, 2015 at 4:44 PM, Kostas Tzoumas wrote: > Hi Rohit, > > If you are just

Re: Student looking to contribute to Stratosphere

2015-07-15 Thread Kostas Tzoumas
Hi Rohit, If you are just working on your laptop, I personally find it much easier to work without Hadoop and use the local file system or just Java collections for testing and trying out ideas. When you move to a cluster, it is common to use a Hadoop installation to store large files in HDFS. Th

Re: [Gelly] Help with GSA compiler tests

2015-07-15 Thread Stephan Ewen
Hey Vasia! Sorry for the late response... Thanks for pinging again! The optimizer is acting a little funky here - seems an artifact of the "properties" optimization. -> The initial join needs to be partitioned and sorted. Can you check whether one partitioning and sorting happens before the it

Read XML from HDFS

2015-07-15 Thread santosh_rajaguru
Hi, is there any way to read the complete XML string or file from HDFS using Flink? Thanks and Regards, Santosh -- View this message in context: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Read-XML-from-HDFS-tp7023.html

Re: [Gelly] Help with GSA compiler tests

2015-07-15 Thread Vasiliki Kalavri
Hey, any input on this? or a hint? or where to look to figure this out by myself? Thanks! -Vasia. On 7 July 2015 at 15:20, Vasiliki Kalavri wrote: > Hello to my squirrels, > > I've started looking into FLINK-1943 > and I need some help > to un

[jira] [Created] (FLINK-2363) Add an end-to-end overview of program execution in Flink to the docs

2015-07-15 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-2363: Summary: Add an end-to-end overview of program execution in Flink to the docs Key: FLINK-2363 URL: https://issues.apache.org/jira/browse/FLINK-2363 Project: Flink

Re: Hadoop 2.5.2 compatible - flink

2015-07-15 Thread santosh_rajaguru
Thanks Robert. I will try that too -- View this message in context: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Hadoop-2-5-2-compatible-flink-tp7018p7020.html

Re: Hadoop 2.5.2 compatible - flink

2015-07-15 Thread Robert Metzger
Hi Santosh, I would try the Hadoop 2.4.1 build of Flink. On Wed, Jul 15, 2015 at 10:13 AM, santosh_rajaguru wrote: > Hi, > > which version of flink is compatible with hadoop-2.5.2? > The releases in 0.9.0 explicitly mentioned the compatibility with 2.2.0, > 2.6.0, 2.7.0, but not 2.5.x > > Rega

Hadoop 2.5.2 compatible - flink

2015-07-15 Thread santosh_rajaguru
Hi, which version of Flink is compatible with hadoop-2.5.2? The 0.9.0 releases explicitly mention compatibility with 2.2.0, 2.6.0, 2.7.0, but not 2.5.x. Regards, Santosh -- View this message in context: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Hadoop-2-5-2-compa