Re: Simple record matching using Spark SQL

2014-07-24 Thread Yin Huai
Hi Sarath, Have you tried the current branch 1.0? If not, can you give it a try and see if the problem can be resolved? Thanks, Yin On Thu, Jul 24, 2014 at 11:17 AM, Yin Huai wrote: > Hi Sarath, > > I will try to reproduce the problem. > > Thanks, > > Yin > > > On Wed, Jul 23, 2014 at 11:32

Re: Simple record matching using Spark SQL

2014-07-24 Thread Yin Huai
Hi Sarath, I will try to reproduce the problem. Thanks, Yin On Wed, Jul 23, 2014 at 11:32 PM, Sarath Chandra < sarathchandra.jos...@algofusiontech.com> wrote: > Hi Michael, > > Sorry for the delayed response. > > I'm using Spark 1.0.1 (pre-built version for hadoop 1). I'm running spark > prog

Re: Simple record matching using Spark SQL

2014-07-17 Thread Michael Armbrust
What version are you running? Could you provide a jstack of the driver and executor when it is hanging? On Thu, Jul 17, 2014 at 10:55 AM, Sarath Chandra < sarathchandra.jos...@algofusiontech.com> wrote: > Added below 2 lines just before the sql query line - > ... > file1_schema.count; > fi
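For reference, a stack dump of a hanging driver or executor can be captured roughly like this (the PID below is a placeholder; use `jps` to find the actual JVM processes on the driver and worker machines):

```shell
# List running JVMs with their main classes to identify the driver/executor
jps -lm
# Dump the thread stacks of a given JVM (replace 12345 with the real pid)
jstack 12345 > driver-stack.txt
```

Threads stuck in the same place across several dumps usually point at where the job is hanging.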

Re: Simple record matching using Spark SQL

2014-07-17 Thread Sarath Chandra
Added below 2 lines just before the sql query line - ... file1_schema.count; file2_schema.count; ... and it started working. But I couldn't get the reason. Can someone please explain? What was happening earlier, and what is happening with the addition of these 2 lines? ~Sarath On Thu, Jul
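One plausible reading of why the extra counts changed the behavior (not confirmed in this thread): Spark transformations and registered tables are lazy, while `count` is an action that forces each input RDD to actually be read and computed before the SQL query runs. A minimal sketch of that lazy-vs-action distinction in the Spark 1.0-era API; the path and schema here are illustrative placeholders, not the original code:

```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext

object LazySketch {
  // Hypothetical two-field schema, stand-in for the real record layout
  case class Rec(key: String, value: String)

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext("local[2]", "lazy-sketch")
    val sqlContext = new SQLContext(sc)
    import sqlContext.createSchemaRDD // implicit RDD -> SchemaRDD in Spark 1.0

    // Transformations only: nothing is read from HDFS at this point.
    val recs = sc.textFile("hdfs://localhost:54310/user/hduser/file1.csv")
      .map(_.split(","))
      .map(a => Rec(a(0), a(1)))
    recs.registerAsTable("recs") // still lazy

    // Actions: these force the file to actually be read and computed.
    recs.count()
    sqlContext.sql("SELECT * FROM recs").collect().foreach(println)
  }
}
```

If forcing the counts first makes the job complete, that suggests the hang is tied to how the combined lazy plan is scheduled, rather than to reading the files themselves.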

Re: Simple record matching using Spark SQL

2014-07-17 Thread Sarath Chandra
No Sonal, I'm not doing any explicit call to stop the context. If you see my previous post to Michael, the commented portion of the code is my requirement. When I run this over the standalone spark cluster, the execution keeps running with no output or error. After waiting for several minutes I'm killing

Re: Simple record matching using Spark SQL

2014-07-17 Thread Sonal Goyal
Hi Sarath, Are you explicitly stopping the context? sc.stop() Best Regards, Sonal Nube Technologies On Thu, Jul 17, 2014 at 12:51 PM, Sarath Chandra < sarathchandra.jos...@algofusiontech.com> wrote: > Hi Michael, Soumya, >

Re: Simple record matching using Spark SQL

2014-07-17 Thread Sarath Chandra
Hi Michael, Soumya, Can you please check and let me know what the issue is? What am I missing? Let me know if you need any logs to analyze. ~Sarath On Wed, Jul 16, 2014 at 8:24 PM, Sarath Chandra < sarathchandra.jos...@algofusiontech.com> wrote: > Hi Michael, > > Tried it. It's correctly print

Re: Simple record matching using Spark SQL

2014-07-16 Thread Sarath Chandra
Hi Michael, Tried it. It's correctly printing the line counts of both the files. Here's what I tried -

Code:
package test
object Test4 {
  case class Test(fld1: String,
                  fld2: String,
                  fld3: String,
                  fld4: String,
                  fld5: String,
                  fld6: Double,
                  fld7: String);

Re: Simple record matching using Spark SQL

2014-07-16 Thread Michael Armbrust
What if you just run something like: sc.textFile("hdfs://localhost:54310/user/hduser/file1.csv").count() On Wed, Jul 16, 2014 at 10:37 AM, Sarath Chandra < sarathchandra.jos...@algofusiontech.com> wrote: > Yes Soumya, I did it. > > First I tried with the example available in the documentation

Re: Simple record matching using Spark SQL

2014-07-16 Thread Sarath Chandra
Yes Soumya, I did it. First I tried with the example available in the documentation (example using people table and finding teenagers). After successfully running it, I moved on to this one which is starting point to a bigger requirement for which I'm evaluating Spark SQL. On Wed, Jul 16, 2014 a

Re: Simple record matching using Spark SQL

2014-07-16 Thread Soumya Simanta
Can you try submitting a very simple job to the cluster? > On Jul 16, 2014, at 10:25 AM, Sarath Chandra > wrote: > > Yes it is appearing on the Spark UI, and remains there with state as > "RUNNING" till I press Ctrl+C in the terminal to kill the execution. > > Barring the statements to cre
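A convenient trivial job for this kind of sanity check is the bundled SparkPi example, submitted via spark-submit; the master URL and jar path below are placeholders to adapt to the actual cluster and Spark 1.0.1 layout:

```shell
# Submit the bundled SparkPi example to the standalone cluster;
# replace the master URL and the examples jar path with your own.
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master spark://master-host:7077 \
  ./lib/spark-examples-1.0.1-hadoop1.0.4.jar 10
```

If even this hangs, the problem is with cluster setup or submission rather than with the record-matching code.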

Re: Simple record matching using Spark SQL

2014-07-16 Thread Sarath Chandra
Yes, it is appearing on the Spark UI, and remains there with state "RUNNING" till I press Ctrl+C in the terminal to kill the execution. Barring the statements to create the spark context, if I copy-paste the lines of my code into the spark shell, it runs perfectly, giving the desired output. ~Sarath On

Re: Simple record matching using Spark SQL

2014-07-16 Thread Soumya Simanta
When you submit your job, it should appear on the Spark UI. Same with the REPL. Make sure your job is submitted to the cluster properly. On Wed, Jul 16, 2014 at 10:08 AM, Sarath Chandra < sarathchandra.jos...@algofusiontech.com> wrote: > Hi Soumya, > > Data is very small, 500+ lines in each file.

Re: Simple record matching using Spark SQL

2014-07-16 Thread Sarath Chandra
Hi Soumya, Data is very small, 500+ lines in each file. Removed the last 2 lines and placed this at the end: "matched.collect().foreach(println);". Still no luck. It's been more than 5 minutes and the execution is still running. Checked logs; nothing in stdout. In stderr I don't see anything going wrong, all

Re: Simple record matching using Spark SQL

2014-07-16 Thread Soumya Simanta
Check your executor logs for the output or if your data is not big collect it in the driver and print it. > On Jul 16, 2014, at 9:21 AM, Sarath Chandra > wrote: > > Hi All, > > I'm trying to do a simple record matching between 2 files and wrote following > code - > > import org.apache.sp
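The original code is truncated in the archive. As a rough sketch of what a two-file match with Spark 1.0-era Spark SQL typically looked like, with collect-and-print in the driver as suggested above: the file paths, field names, and join key here are assumptions for illustration, not Sarath's actual code:

```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext

object RecordMatchSketch {
  // Hypothetical schema; the real one had seven fields
  case class Rec(key: String, value: String)

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext("local[2]", "record-match-sketch")
    val sqlContext = new SQLContext(sc)
    import sqlContext.createSchemaRDD // implicit RDD -> SchemaRDD in Spark 1.0

    // Parse a CSV file into an RDD of case-class records
    def load(path: String) =
      sc.textFile(path).map(_.split(",")).map(a => Rec(a(0), a(1)))

    load("hdfs://localhost:54310/user/hduser/file1.csv").registerAsTable("file1")
    load("hdfs://localhost:54310/user/hduser/file2.csv").registerAsTable("file2")

    // Records present in both files, matched on the assumed key column
    val matched = sqlContext.sql(
      "SELECT f1.key, f1.value FROM file1 f1 JOIN file2 f2 ON f1.key = f2.key")

    // Data is small (500+ lines per file), so collect in the driver and print
    matched.collect().foreach(println)
  }
}
```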