Dear Spark developers,
What happens if an RDD does not fit into memory and cache does not work in the
code below? Will all previous iterations be repeated on each new iteration of the
iterative RDD update (as described below)?
Also, could you clarify regarding DataFrames and GC overhead: does setting s
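For reference, a minimal sketch of the iterative-update pattern in question (all names and numbers are illustrative, assuming an active SparkContext sc): without persist(), each iteration is recomputed from the full lineage; a disk-backed storage level plus a periodic checkpoint() keeps recomputation bounded when the RDD does not fit in memory.

import org.apache.spark.storage.StorageLevel

// Illustrative iterative RDD update; names and numbers are assumptions.
sc.setCheckpointDir("/tmp/spark-checkpoints")
var rdd = sc.parallelize(1 to 1000000).map(_.toDouble)
for (i <- 1 to 50) {
  val updated = rdd.map(_ * 1.01)
  // MEMORY_AND_DISK spills evicted partitions to disk, so they are re-read
  // rather than recomputed from the whole lineage on the next iteration.
  updated.persist(StorageLevel.MEMORY_AND_DISK)
  if (i % 10 == 0) updated.checkpoint() // truncate the lineage periodically
  updated.count()  // materialize before dropping the previous iteration
  rdd.unpersist()
  rdd = updated
}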
Hi Burak,
If I change the code as you suggested, it fails (given that blockSize is 1) with:
“org.apache.spark.SparkException: The MatrixBlock at (3, 3) has dimensions
different than rowsPerBlock: 2, and colsPerBlock: 2. Blocks on the
right and bottom edges can have smaller dimen
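For context, a minimal sketch of the constraint that exception enforces, assuming the matrix is built via toBlockMatrix and an active SparkContext sc (sizes are illustrative): every block except those on the right and bottom edges must be exactly rowsPerBlock x colsPerBlock.

import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.linalg.distributed.{IndexedRow, IndexedRowMatrix}

// Illustrative: a 4x4 matrix split into 2x2 blocks.
val rows = sc.parallelize(0L until 4L).map { i =>
  IndexedRow(i, Vectors.dense(Array.fill(4)(i.toDouble)))
}
val mat = new IndexedRowMatrix(rows).toBlockMatrix(2, 2)
mat.validate() // throws the exception above if an interior block has the wrong shape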
I have tested on another PC, which has 8 CPU cores.
But it hangs when the default parallelism level is greater than 4, e.g.
sparkConf.setMaster("local[*]")
local[1] through local[3] work well.
4 is the mysterious boundary.
It seems I am not the only one who has encountered this problem:
https://issues.apache.org/jira/browse/S
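A hedged workaround sketch while that issue is open, assuming the hang really does track the core count that local[*] resolves to (app name is illustrative):

import org.apache.spark.SparkConf

val sparkConf = new SparkConf()
  .setAppName("repro")
  .setMaster("local[3]") // pin the core count below the observed boundary of 4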
Can you provide a bit more information, such as:
the release of Spark you are using
a snippet of your Spark SQL query
Thanks
On Thu, Jul 16, 2015 at 5:31 AM, nipun wrote:
> I have a dataframe. I register it as a temp table and run a spark sql query
> on it to get another dataframe. Now when I run groupBy on
I have a dataframe. I register it as a temp table and run a Spark SQL query
on it to get another dataframe. Now when I run groupBy on it, it gives me
this exception:
e: Lost task 1.3 in stage 21.0 (TID 579, 172.28.0.162):
java.lang.ClassCastException: java.lang.String cannot be cast to
org.apache.s
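For context, a minimal sketch of the flow being described, assuming Spark 1.4-era APIs, an active SparkContext sc, and purely illustrative table/column names; a java.lang.String cast failure at this point usually means the data in some column does not match the schema the query inferred for it.

import org.apache.spark.sql.SQLContext

val sqlContext = new SQLContext(sc)
val df = sqlContext.read.json("events.json") // input path is illustrative
df.registerTempTable("events")
val result = sqlContext.sql("SELECT category, value FROM events")
// groupBy forces the column values through their declared types; a mismatch
// between the inferred schema and the actual data can surface here.
result.groupBy("category").count().show()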
Hi, some time ago we found that it's better to use the Kryo serializer instead of
the Java one.
So we turned it on and now use it everywhere.
I have pretty complex objects, which I can't change. Previously my algorithm was
building such objects and then storing them in external storage. It was
not require
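For reference, a minimal sketch of that setup; MyComplexObject is a hypothetical stand-in for the objects described above, and the app name is illustrative:

import org.apache.spark.SparkConf

val conf = new SparkConf()
  .setAppName("kryo-example")
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  // Registering classes up front avoids writing full class names per record;
  // MyComplexObject is hypothetical, standing in for the real classes.
  .registerKryoClasses(Array(classOf[MyComplexObject]))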
One of my pull requests is failing in a test that I have gone nowhere near:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37491/testReport/junit/org.apache.spark/DistributedSuite/_It_is_not_a_test_/
This isn't the only pull request that's failing, and I've merged in the maste
Given the following code, which just reads from S3 and then saves the files back to S3:
val inputFileName: String = "s3n://input/file/path"
val outputFileName: String = "s3n://output/file/path"
val conf = new SparkConf().setAppName(this.getClass.getName).setMaster("local[4]")
val sc = new SparkContext(conf)
// completion of the truncated snippet: the read-then-save described above
sc.textFile(inputFileName).saveAsTextFile(outputFileName)
Patching Hadoop's build will fix this long term, but not until Hadoop 2.7.2.
I think just adding the OpenStack JAR to the Spark classpath should be enough
to pick this up, which the --jars option can do with ease.
On that topic, one thing I would like to see (knowing what it takes to get
azure
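For illustration, a hedged sketch of that --jars route; the JAR path, version, class, and application names are assumptions, not from this thread:

spark-submit \
  --jars /path/to/hadoop-openstack-2.7.1.jar \
  --class com.example.MyApp \
  my-app-assembly.jar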
You could raise a JIRA ticket for the feature and start working on it;
once done, you can send a pull request with the code changes.
Thanks
Best Regards
On Wed, Jul 15, 2015 at 7:30 PM, Joel Zambrano wrote:
> Thanks Akhil! For the one where I change the rest client, how likely
> would it be
To move this forward, I think one of two things needs to happen:
1. Move this guidance to the wiki. The people gathered here seem to
believe that resolves the issue. Done.
2. Put disclaimers on the current downloads page. This may resolve the
issue, but then we bring it up on the right mailing lis