Thanks a lot for the info on it.
Does this explain the generation of two temp files per task (one temp file that is
renamed to another)?
I understand why there is one temp file per task, but I'm still not sure why
there were two per task.
Thanks
Gil.
From: Imran Rashid
To: Gil Vernik/Haifa/
Ran tests on OS X
+1
Sean
> On Apr 14, 2015, at 10:59 PM, Patrick Wendell wrote:
>
> I'd like to close this vote to coincide with the 1.3.1 release,
> however, it would be great to have more people test this release
> first. I'll leave it open for a bit longer and see if others can give
> a +
+1
On Wed, Apr 15, 2015 at 5:40 PM, Tom Graves wrote:
> +1 tested on spark on yarn on hadoop 2.6 cluster with security.
> Tom
+1 tested on spark on yarn on hadoop 2.6 cluster with security.
Tom
On Sunday, April 5, 2015 6:25 PM, Patrick Wendell wrote:
Please vote on releasing the following candidate as Apache Spark version 1.2.2!
The tag to be voted on is v1.2.2-rc1 (commit 7531b50):
https://git-wip-us.apa
It does not work right now; could you file a JIRA for it?
On Wed, Apr 15, 2015 at 9:29 AM, Suraj Shetiya wrote:
Thank you :)
That worked. I had another query regarding a date being used as a filter.
With the new df, which has the column cast as date, I am unable to apply a
filter that compares the dates.
The query I am using is:
df.filter(df.Datecol > datetime.date(2015,1,1)).show()
I do not want to use date a
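For reference, here is a plain-Python analogue of that comparison (this is not the Spark API, and the sample values are made up): once the string values are actually parsed into dates, a `>` comparison against `datetime.date(2015, 1, 1)` behaves as expected, which is the effect the cast to date is meant to achieve.

```python
import datetime

# Hypothetical sample values standing in for the raw string column
rows = ["2015-03-01", "2014-12-31", "2015-06-15"]
cutoff = datetime.date(2015, 1, 1)

# Parse each string into a date before comparing (mirroring the cast)
parsed = [datetime.datetime.strptime(s, "%Y-%m-%d").date() for s in rows]
after_cutoff = [d for d in parsed if d > cutoff]
print(after_cutoff)
```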
The temp file creation is controlled by a Hadoop OutputCommitter, which is
normally FileOutputCommitter by default. It's used in SparkHadoopWriter
(which in turn is used by PairRDDFunctions.saveAsHadoopDataset).
You could change the output committer to not use tmp files (e.g. use this
from Aaron Da
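As a rough sketch of what swapping the committer could look like: the old mapred API reads the committer class from the `mapred.output.committer.class` property, so a direct-writing committer can be plugged in through the Hadoop configuration. `com.example.DirectOutputCommitter` below is a hypothetical class name standing in for such a committer, and `sc._jsc` is PySpark's (private) handle to the underlying JavaSparkContext.

```python
# Sketch only: com.example.DirectOutputCommitter is a hypothetical committer
# that writes straight to the final output path instead of a temp file.
# sc is an existing SparkContext.
sc._jsc.hadoopConfiguration().set(
    "mapred.output.committer.class",
    "com.example.DirectOutputCommitter",
)
# Subsequent saveAsHadoopDataset / saveAsHadoopFile calls on the old
# mapred API would then pick up this committer instead of FileOutputCommitter.
```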