Re: Running Spark on a single machine

2014-03-16 Thread goi cto
or iPhone > > > On Sun, Mar 16, 2014 at 11:39 PM, goi cto wrote: > >> Hi, >> >> I know it is probably not the purpose of spark but the syntax is easy and >> cool... >> I need to run some spark like code in memory on a single machine any >> poi

Running Spark on a single machine

2014-03-16 Thread goi cto
Hi, I know it is probably not the purpose of Spark, but the syntax is easy and cool... I need to run some Spark-like code in memory on a single machine. Any pointers on how to optimize it to run on only one machine? -- Eran | CTO

How to work with ReduceByKey?

2014-03-13 Thread goi cto
Hi, I have an RDD with > which I want to reduceByKey and get I+I and List of List (add the integers and build a list of the lists). BUT reduceByKey requires that the return value is of the same type as the input so I can combine the lists. JavaPairRDD>>> callCount = byCaller.*reduceByKey*( new
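One common way around the same-type constraint described above is to wrap each record's list in an outer list before reducing, so the value already has the output type and the reduce function only has to add counts and concatenate. The sketch below illustrates that idea in plain Java, without Spark: the `Acc` class, the `String` element type, and the `reduceByKey` helper built on `HashMap.merge` are all hypothetical stand-ins, since the original generic types were lost from the message.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ReduceByKeySketch {
    // Hypothetical value type: an integer plus a list of lists.
    static final class Acc {
        final int count;
        final List<List<String>> lists;

        Acc(int count, List<List<String>> lists) {
            this.count = count;
            this.lists = lists;
        }
    }

    // Pre-reduce step: wrap one record's list in an outer list,
    // so the value already has the type the reduce function returns.
    static Acc wrap(int count, List<String> list) {
        List<List<String>> lists = new ArrayList<>();
        lists.add(list);
        return new Acc(count, lists);
    }

    // Reduce function: same type in and out. Adds the integers and
    // concatenates the outer lists without mutating either input.
    static Acc combine(Acc a, Acc b) {
        List<List<String>> merged = new ArrayList<>(a.lists);
        merged.addAll(b.lists);
        return new Acc(a.count + b.count, merged);
    }

    // Local stand-in for a reduce-by-key: folds values that share a key
    // with combine(). This is NOT the Spark API, only an analogy.
    static Map<String, Acc> reduceByKey(List<Map.Entry<String, Acc>> pairs) {
        Map<String, Acc> out = new HashMap<>();
        for (Map.Entry<String, Acc> pair : pairs) {
            out.merge(pair.getKey(), pair.getValue(), ReduceByKeySketch::combine);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Acc>> pairs = new ArrayList<>();
        pairs.add(new AbstractMap.SimpleEntry<>("alice", wrap(2, Arrays.asList("a"))));
        pairs.add(new AbstractMap.SimpleEntry<>("alice", wrap(3, Arrays.asList("b", "c"))));
        pairs.add(new AbstractMap.SimpleEntry<>("bob", wrap(1, Arrays.asList("d"))));

        Map<String, Acc> result = reduceByKey(pairs);
        Acc alice = result.get("alice");
        System.out.println(alice.count + " " + alice.lists);
    }
}
```

In Spark itself the wrapping would presumably happen in a map step over the pair RDD before calling reduceByKey; the exact types depend on the original code, which is cut off here.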

What is the difference between map and flatMap

2014-03-12 Thread goi cto
Hi, Can someone explain to me the difference between map and flatMap and what is a good use case for each? -- Eran | CTO
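As a rough illustration of the distinction being asked about: map produces exactly one output element per input element, while flatMap lets each input produce zero or more outputs and flattens them into a single collection. The sketch below uses Java 8 streams as a stand-in for Spark RDDs (an assumption made so the example runs without a cluster); the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class MapVsFlatMap {
    // map: one output per input (here, each line becomes its length).
    static List<Integer> lineLengths(List<String> lines) {
        return lines.stream()
                .map(String::length)
                .collect(Collectors.toList());
    }

    // flatMap: each input yields several outputs, flattened into one list
    // (here, each line is split into its words).
    static List<String> words(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Arrays.stream(line.split(" ")))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("a b", "c d e");
        System.out.println(lineLengths(lines)); // [3, 5]
        System.out.println(words(lines));       // [a, b, c, d, e]
    }
}
```

A typical use case for map is a per-record transformation (parsing a line into a record), and for flatMap a one-to-many expansion (splitting lines into words, as in a word count).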

Re: Problem with "delete spark temp dir" on spark 0.8.1

2014-03-04 Thread goi cto
his file once the program completed. Eran On Tue, Mar 4, 2014 at 11:36 AM, Akhil Das wrote: > Hi, > > Try to clean your temp dir, System.getProperty("java.io.tmpdir") > > Also, Can you paste a longer stacktrace? > > > > > Thanks > Best Regards > > >

Fwd: Problem with "delete spark temp dir" on spark 0.8.1

2014-03-04 Thread goi cto
Hi, I am running a Spark Java program on a local machine. When I try to write the output to a file (RDD.saveAsTextFile), I get this exception: Exception in thread "Delete Spark temp dir ..." This is running on my local Windows machine. Any ideas? -- Eran | CTO

Re: Beginners Hadoop question

2014-03-03 Thread goi cto
> My favorite quotes (today): > "If debugging is the process of removing software bugs, then programming > must be the process of putting ..." > - Edsger Dijkstra > > "If you pay peanuts you get monkeys" > > > > 2014-03-03 12:10 GMT+01:00 goi cto :

Beginners Hadoop question

2014-03-03 Thread goi cto
Hi, I am sorry for the beginner's question, but... I have Spark Java code that reads a file (c:\my-input.csv), processes it, and writes an output file (my-output.csv). Now I want to run it on Hadoop in a distributed environment. 1) Should my input file be one big file or separate smaller files? 2) if w

Problem with "delete spark temp dir" on spark 0.8.1

2014-03-03 Thread goi cto
Hi, I am running a Spark Java program on a local machine. When I try to write the output to a file (RDD.saveAsTextFile), I get this exception: Exception in thread "Delete Spark temp dir ..." This is running on my local Windows machine. Any ideas? -- Eran | CTO

Error: Could not find or load main class org.apache.spark.repl.Main on GitBash

2014-03-03 Thread goi cto
Hi, I am trying to run spark-shell in Git Bash on Windows with Spark 0.9. I am getting "*Error: Could not find or load main class org.apache.spark.repl.Main*". I tried running sbt/sbt clean assembly, which completed successfully, but the problem still exists. Any other ideas? Which path variables shou