Hi Will,
For issue #2 I was concerned that the build & packaging had to be
internal, so I am using the already packaged make-distribution.sh
(modified to use a Maven build) to create a tarball, which I then package
using an RPM spec file.
On a side note, it would be interesting to learn h
I am seeing a small standalone cluster (master, slave) hang when I reach a
certain memory threshold, but I cannot work out how to configure memory to
avoid this.
I added memory by configuring SPARK_DAEMON_MEMORY=2G, and I can see it
allocated, but it does not help.
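For reference, a minimal sketch of how I understand the two memory knobs,
assuming the hang comes from executor memory rather than the daemons (the
master URL and app name below are placeholders): SPARK_DAEMON_MEMORY only
sizes the master and worker JVMs, while the heap that actually holds job
data belongs to the executors and is set per application.

import org.apache.spark.{SparkConf, SparkContext}

// SPARK_DAEMON_MEMORY (spark-env.sh) sizes only the master/worker daemons.
// Cached RDDs and shuffle buffers live in the executors, whose heap is
// configured per application, e.g. via spark.executor.memory:
val conf = new SparkConf()
  .setMaster("spark://master-host:7077") // placeholder master URL
  .setAppName("MemoryConfigSketch")      // placeholder app name
  .set("spark.executor.memory", "2g")    // executor heap, not daemon heap
val sc = new SparkContext(conf)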
The reduce is by key to get th
It worked when I converted the nested RDD to an array
--
case class TradingTier(tierId: String, lowerLimit: Int, upperLimit: Int,
  transactionFees: Double)
// userTransactions: Seq[(accountId, numTransactions)]
val userTransactionsRDD =
  sc.parallelize(Seq((id1, 2), (
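Roughly, the working version keeps the small tier table as a plain local
array, which can be referenced inside a transformation, whereas a nested
RDD cannot; the tier values, account IDs, and fee computation below are
hypothetical placeholders:

// Hypothetical completion of the snippet above: the tier table stays a
// local array because RDDs cannot be nested inside the closures of other
// RDD operations.
val tiers = Array(
  TradingTier("tier1", 0, 10, 1.00),
  TradingTier("tier2", 11, 100, 0.50))

val userTransactionsRDD =
  sc.parallelize(Seq(("id1", 2), ("id2", 42))) // placeholder account data

// Look up each account's tier locally; the array is captured in the closure.
val feesByAccount = userTransactionsRDD.map { case (accountId, numTx) =>
  val tier = tiers.find(t => numTx >= t.lowerLimit && numTx <= t.upperLimit)
  (accountId, tier.map(_.transactionFees * numTx).getOrElse(0.0))
}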
Hi All,
I am using spark-0.9.0 and am able to run my program successfully if the
Spark master and worker are on the same machine.
If I run the same program with the Spark master on machine A and the worker
on machine B, I get the exception below.
I am running the program with java -cp "..." instead of the scala command.
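In case it is relevant, a sketch of one thing I have not tried yet: passing
the application jar explicitly so it is shipped to the executors, since
java -cp only puts the jar on the driver's classpath (the hostname and jar
path below are placeholders):

import org.apache.spark.{SparkConf, SparkContext}

// Sketch for the cross-machine case: the application jar must reach
// workers on other hosts; hostname and jar path are placeholders.
val conf = new SparkConf()
  .setMaster("spark://machineA:7077")
  .setAppName("CrossMachineExample")
  .setJars(Seq("/path/to/my-app.jar")) // distributed to executors at startup
val sc = new SparkContext(conf)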
Hi,
This will work nicely unless you're using spot instances; in that case
"start" does not work, as the slaves are lost on shutdown.
I feel the spark-ec2 script needs a major refactor to cope with new
features and more users using it in dynamic environments.
Are there any current plans to migrate it to
Venkat, correct, though to be sure, I'm referring to I/O related to
loading/saving data from/to their persistence locations, and not I/O
related to local operations like RDD caching or shuffling.