I added @since version tags for all public DataFrame/SQL methods/classes in
this patch: https://github.com/apache/spark/pull/6101/files
From now on, if you merge anything related to DF/SQL, please make sure the
public functions have @since tags. Thanks.
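For anyone merging such patches, a minimal sketch (illustrative only, not
the exact code in that PR) of what a since-style tag can look like on the
Python side:

    def since(version):
        """Hypothetical decorator noting the version an API was added in."""
        def deco(f):
            f.__doc__ = (f.__doc__ or "") + "\n.. versionadded:: %s" % version
            return f
        return deco

    class DataFrame(object):
        @since("1.4.0")
        def sort(self, *cols):
            """Returns a new DataFrame sorted by the given columns."""
            pass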
Hey Kevin and Ron,
So is the main shortcoming of the launcher library the inability to
get an app ID back from YARN? Or are there other issues here that
fundamentally regress things for you?
It seems like adding a way to get back the appID would be a reasonable
addition to the launcher.
- Patrick
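For reference, a hedged sketch of the kind of workaround the missing app ID
forces today: launch spark-submit as a subprocess and scan its output for
the YARN application ID pattern. The jar path and master below are
placeholders, and parsing log output like this is fragile by design:

    import re
    import subprocess

    proc = subprocess.Popen(
        ["spark-submit", "--master", "yarn-cluster", "/path/to/app.jar"],
        stdout=subprocess.PIPE, stderr=subprocess.STDOUT)

    app_id = None
    for raw in proc.stdout:
        # YARN app IDs follow the pattern application_<timestamp>_<seq>.
        match = re.search(r"application_\d+_\d+", raw.decode("utf-8", "replace"))
        if match:
            app_id = match.group(0)
            break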
The class (called Row) for rows from Spark SQL is created on the fly and is
different from pyspark.sql.Row (which is a public API for users to create
Rows). The reason we did it this way is that we want better performance
when accessing the columns. Basically, the rows are just named tuples.
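To make the distinction concrete, a short example with the public API (the
internal on-the-fly class is separate from this):

    from pyspark.sql import Row

    # pyspark.sql.Row behaves like a named tuple: fields are accessible
    # by name, and a Row really is a tuple underneath.
    row = Row(name="Alice", age=1)
    print(row.name)                 # 'Alice'
    print(isinstance(row, tuple))   # True
    # Note: in these Spark versions, kwargs-created Rows sort their
    # fields alphabetically, so this is Row(age=1, name='Alice').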
Due to an ASF infrastructure change (bug?) [1] the default JIRA
resolution status has switched to "Pending Closed". I've made a change
to our merge script to coerce the correct status of "Fixed" when
resolving [2]. Please upgrade the merge script to master.
I've manually corrected JIRAs that were
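For the curious, the fix boils down to passing an explicit resolution when
transitioning the issue. A rough sketch with the jira Python client; the
server, credentials, issue key, and transition name here are placeholders,
not the merge script's actual values:

    from jira.client import JIRA

    asf_jira = JIRA({"server": "https://issues.apache.org/jira"},
                    basic_auth=("user", "password"))
    transitions = asf_jira.transitions("SPARK-0000")
    resolve = [t for t in transitions if t["name"] == "Resolve Issue"][0]
    # Coerce the resolution to "Fixed" instead of the new default.
    asf_jira.transition_issue("SPARK-0000", resolve["id"],
                              resolution={"name": "Fixed"})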
Hi there,
Which version are you using? Actually, the problem seems to be gone after we
changed our Spark version from 1.2.0 to 1.3.0.
Not sure what internal changes did it.
Best,
Sun.
fightf...@163.com
From: Night Wolf
Date: 2015-05-12 22:05
To: fightf...@163.com
CC: Patrick Wendell; user; dev
Su
i will need to restart jenkins to finish a plugin install and resolve
https://issues.apache.org/jira/browse/SPARK-7561
this will be very brief, and i'll retrigger any errant jobs i kill.
please let me know if there are any comments/questions/concerns.
thanks!
shane
On Tue, May 12, 2015 at 11:34 AM, Kevin Markey
wrote:
> I understand that SparkLauncher was supposed to address these issues, but
> it really doesn't. Yarn already provides indirection and an arm's length
> transaction for starting Spark on a cluster. The launcher introduces yet
> another layer
Hello Spark community,
I am currently trying to implement a proof-of-concept RDD that will allow
integrating Apache Spark with Apache Ignite (incubating) [1]. My original
idea was to embed an Ignite node in Spark's worker process, in order for
the user code to have direct access to in-memory data
We have the same issue. As a result, we are stuck back on 1.0.2.
Not being able to programmatically interface directly with the Yarn
client to obtain the application id is a show stopper for us, which is a
real shame given the Yarn enhancements in 1.2, 1.3, and 1.4.
I understand that SparkLauncher
We have a small Mesos cluster, and these slaves need a VFS set up on them so
that they can pull down the data they need from S3 when Spark runs.
There doesn't seem to be any obvious guidance online on how to do this or how
to easily accomplish it. Does anyone have some best practices or so
> I tend to find that any large project has a lot of walking dead JIRAs, and
> pretending they are simply Open causes problems. Any state is better for
> these, so I favor this.
Agreed.
1. Inactive: A way to clear out inactive/dead JIRAs without
indicating a decision has been made one way or the other
It could also be that your hash function is expensive. What is the key class
you have for the reduceByKey / groupByKey?
Matei
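As a hypothetical illustration of that point (not code from this thread): a
key whose hash walks a large payload on every call makes each probe of the
shuffle map expensive, and caching the hash once is the usual fix.

    class SlowKey(object):
        """Key whose hash rescans a large payload on every call."""
        def __init__(self, parts):
            self.parts = parts
        def __eq__(self, other):
            return self.parts == other.parts
        def __hash__(self):
            return hash(tuple(self.parts))  # recomputed on every probe

    class FastKey(SlowKey):
        """Same key, but the hash is computed once and cached."""
        def __init__(self, parts):
            SlowKey.__init__(self, parts)
            self._hash = hash(tuple(parts))
        def __hash__(self):
            return self._hash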
> On May 12, 2015, at 10:08 AM, Night Wolf wrote:
>
> I'm seeing a similar thing with a slightly different stack trace. Ideas?
>
> org.apache.spark.util.collection.App
I'm seeing a similar thing with a slightly different stack trace. Ideas?
org.apache.spark.util.collection.AppendOnlyMap.changeValue(AppendOnlyMap.scala:150)
org.apache.spark.util.collection.SizeTrackingAppendOnlyMap.changeValue(SizeTrackingAppendOnlyMap.scala:32)
org.apache.spark.util.collection.E
Seeing similar issues; did you find a solution? One option would be to
increase the number of partitions if you're doing lots of object creation.
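If it helps, the partition count can be raised right at the shuffle. A small
PySpark sketch; the data and the count of 400 are made-up values:

    from pyspark import SparkContext

    sc = SparkContext(appName="partition-sketch")
    pairs = sc.parallelize([("a", 1), ("b", 2), ("a", 3)])

    # More partitions means each reduce task builds a smaller
    # in-memory map during the shuffle.
    counts = pairs.reduceByKey(lambda a, b: a + b, numPartitions=400)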
On Thu, Feb 12, 2015 at 7:26 PM, fightf...@163.com
wrote:
> Hi, Patrick
>
> Really glad to get your reply.
> Yes, we are doing group by operations for our wo
Maybe you should check where exactly it's throwing the permission-denied
error (possibly while trying to write to some directory). You can also try
manually cloning the git repo to a directory and then opening that in Eclipse.
Thanks
Best Regards
On Tue, May 12, 2015 at 3:46 PM, Chandrashekhar Kotekar <
s
Hi,
I am trying to clone the Spark source using Eclipse. After providing the
Spark source URL, Eclipse downloads some code, which I can see in the
download location, but as soon as downloading reaches 99%, Eclipse throws a
"Git repository clone failed. Access is denied" error.
Has anyone encountered such a problem?
I tend to find that any large project has a lot of walking dead JIRAs, and
pretending they are simply Open causes problems. Any state is better for
these, so I favor this.
The possible objection is that this will squash or hide useful issues, but
in practice we have the opposite problem. Resolved
In Spark we sometimes close issues as something other than "Fixed",
and this is an important part of maintaining our JIRA.
The current resolution types we use are the following:
Won't Fix - bug fix or (more often) feature we don't want to add
Invalid - issue is underspecified or not appropriate f