-1 because of https://issues.apache.org/jira/browse/SPARK-16121.
This JIRA was resolved after 2.0.0-RC1 was cut. Without the fix, Spark
SQL effectively uses only the driver to list files when loading
datasets, and the driver-side file listing is very slow for datasets
with many files and partitions.
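For context, a minimal sketch of the kind of load that hits this
listing path; the dataset path and session setup here are hypothetical:

    # Hypothetical example: reading a dataset laid out across many
    # partition directories, e.g. /data/events/date=2016-06-01/part-*.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("listing-example").getOrCreate()

    # Loading requires enumerating every file under the root path; without
    # the SPARK-16121 fix that enumeration happens on the driver alone, so
    # it gets very slow as the file/partition count grows.
    df = spark.read.parquet("/data/events")
    print(df.count())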
Hi,
I am running a PySpark app with thousands of cores (the partition count
is a small multiple of the core count) and overall application
performance is fine. However, I noticed that at the end of the job,
PySpark initiates clean-up procedures, and as part of this procedure it
executes a job shown
Hi,
After reviewing makeOffers and launchTasks in
CoarseGrainedSchedulerBackend, I came to the following conclusion:
scheduling in Spark relies on cores only (not memory), i.e. the number
of tasks Spark can run on an executor is constrained only by the number
of available cores. When submitting Spark
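To illustrate, a hedged sketch of how that constraint shows up in
configuration (the numbers are made up, but the properties are the
standard ones):

    # Illustrative config only. With these settings each executor runs at
    # most spark.executor.cores / spark.task.cpus = 4 tasks concurrently;
    # executor memory is not part of the slot calculation.
    from pyspark import SparkConf, SparkContext

    conf = (SparkConf()
            .setAppName("cores-only-scheduling")
            .set("spark.executor.cores", "4")    # CPU slots per executor
            .set("spark.task.cpus", "1")         # slots each task occupies
            .set("spark.executor.memory", "8g")) # ignored when counting slots
    sc = SparkContext(conf=conf)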
Hi,
There are a lot of moving parts and a lot of unknowns in your
description, besides the version details.
How many executors, and how many cores? How much memory?
Are you persisting (memory and disk) or just caching (memory)?
During the execution… same tables… are you seeing a lot of shuffling?
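For reference, the distinction I mean, in RDD terms (a sketch only; the
rdd here is a stand-in for whatever your app actually builds):

    # cache() is shorthand for persist(StorageLevel.MEMORY_ONLY): partitions
    # that don't fit in memory are recomputed when needed, not spilled.
    from pyspark import SparkContext, StorageLevel

    sc = SparkContext(appName="persist-vs-cache")
    rdd = sc.parallelize(range(1000000))

    rdd.cache()  # memory only
    # rdd.persist(StorageLevel.MEMORY_AND_DISK)  # overflow spills to disk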
Vote passed. Please see below. I will work on packaging the release.
+1 (9 votes, 4 binding)
Reynold Xin*
Sean Owen*
Tim Hunter
Michael Armbrust*
Sean McNamara*
Kousuke Saruta
Sameer Agarwal
Krishna Sankar
Vaquar Khan
0
none
-1
Maciej Bryński
* binding votes
On Sun, Jun 19, 2016 at 9:24 PM,
Maciej, let's fix SPARK-13283. It won't block 1.6.2, though.
On Thu, Jun 23, 2016 at 5:45 AM, Maciej Bryński wrote:
> -1
>
> I need SPARK-13283 to be solved.
>
> Regards,
> Maciek Bryński
>
> 2016-06-23 0:13 GMT+02:00 Krishna Sankar :
>
>> +1 (non-binding, of course)
>>
>> 1. Compiled OSX 10.10 (Yosemite) OK
+1 (non-binding)
Regards,
Vaquar khan
On 23 Jun 2016 07:50, "Sean Owen" wrote:
> I don't think that qualifies as a blocker; not even clear it's a
> regression. Even non-binding votes here should focus on whether this
> is OK to release as a maintenance update to 1.6.1.
>
> On Thu, Jun 23, 2016 at 1:45 PM, Maciej Bryński wrote:
I don't think that qualifies as a blocker; not even clear it's a
regression. Even non-binding votes here should focus on whether this
is OK to release as a maintenance update to 1.6.1.
On Thu, Jun 23, 2016 at 1:45 PM, Maciej Bryński wrote:
> -1
>
> I need SPARK-13283 to be solved.
>
> Regards,
>
-1
I need SPARK-13283 to be solved.
Regards,
Maciek Bryński
2016-06-23 0:13 GMT+02:00 Krishna Sankar:
> +1 (non-binding, of course)
>
> 1. Compiled OSX 10.10 (Yosemite) OK Total time: 37:11 min
> mvn clean package -Pyarn -Phadoop-2.6 -DskipTests
> 2. Tested pyspark, mllib (iPython 4.0)
>
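(Aside: a hypothetical minimal version of the pyspark/mllib check
quoted above; the model and data are illustrative, not the actual test
script:)

    # Hypothetical smoke test: train a tiny MLlib model to confirm the
    # RC build works end to end.
    from pyspark import SparkContext
    from pyspark.mllib.classification import LogisticRegressionWithLBFGS
    from pyspark.mllib.regression import LabeledPoint

    sc = SparkContext(appName="rc-smoke-test")
    data = sc.parallelize([LabeledPoint(0.0, [0.0, 1.0]),
                           LabeledPoint(1.0, [1.0, 0.0])])
    model = LogisticRegressionWithLBFGS.train(data, iterations=10)
    print(model.predict([1.0, 0.0]))  # expect 1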
Hi All,
On submitting 20 parallel copies of the same SQL query to the Spark
Thrift Server, some queries execute in under a second and some take
more than 2 seconds. The Spark Thrift Server logs show all 20 queries
were submitted at the same time (16/06/23 12:12:01) but the result
schemas are at
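A sketch of one way to drive such a test (the client library,
host/port, and query below are my assumptions, not from the original
report):

    # Hypothetical load driver: submit 20 identical queries concurrently
    # to the Thrift Server and time each one.
    import time
    from concurrent.futures import ThreadPoolExecutor
    from pyhive import hive  # any HiveServer2-protocol client would do

    def run_query(i):
        conn = hive.connect(host="localhost", port=10000)  # default port
        cursor = conn.cursor()
        start = time.time()
        cursor.execute("SELECT count(*) FROM some_table")  # placeholder
        cursor.fetchall()
        return i, time.time() - start

    with ThreadPoolExecutor(max_workers=20) as pool:
        for i, elapsed in pool.map(run_query, range(20)):
            print("query %2d took %.2fs" % (i, elapsed))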
I'm also seeing some of these same failures:
- spilling with compression *** FAILED ***
I have seen this occasionally.
- to UTC timestamp *** FAILED ***
This was fixed yesterday in branch-2.0 (
https://issues.apache.org/jira/browse/SPARK-16078)
- offset recovery *** FAILED ***
Haven't seen this failure
First pass of feedback on the RC: all the sigs, hashes, etc. are fine.
Licensing is up to date to the best of my knowledge.
I'm hitting test failures, some of which may be spurious. Just putting
them out there to see if they ring bells. This is Java 8 on Ubuntu 16.
- spilling with compression *** FAILED ***