Mridul,
I may have added some confusion by giving examples in completely different
areas. For example, the number of cores available for tasking on each worker
machine is a resource-controller level configuration variable. In standalone
mode (i.e. using Spark's home-grown resource manager) the
conf
Let me try to rephrase my query.
How can a user specify, for example, what the executor memory should
be or what the number of cores should be?
I don't want a situation where some variables can be specified using
one set of idioms (from this PR, for example) and another set cannot
be.
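For concreteness, the existing idioms look roughly like this (a minimal
sketch for Spark 1.x; the same options can also be given in
spark-defaults.conf or as spark-submit flags):

    import org.apache.spark.SparkConf

    // Programmatic form of the usual executor sizing options.
    val conf = new SparkConf()
      .set("spark.executor.memory", "4g")  // or: spark-submit --executor-memory 4g
      .set("spark.executor.cores", "2")    // or: --executor-cores 2 (support varies by cluster manager)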
Regards,
Mridul
Thanks for your questions, Mridul.
I assume you are referring to how the functionality to query system state works
in YARN and Mesos?
The APIs used are the standard JVM APIs, so the functionality will work
without change. There is no real use case for using 'physicalMemoryBytes' in
these case
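For reference, a value like total physical memory is reachable through the
standard JVM management API, independent of the cluster manager (a sketch; it
assumes the com.sun.management extension present on the usual
Oracle/OpenJDK JVMs):

    import java.lang.management.ManagementFactory
    import com.sun.management.OperatingSystemMXBean

    // Total physical memory of the machine, in bytes, via the platform MXBean.
    val osBean = ManagementFactory.getOperatingSystemMXBean
      .asInstanceOf[OperatingSystemMXBean]
    val physicalMemoryBytes = osBean.getTotalPhysicalMemorySize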
Hi Reynold,
Those are some very good questions.
Re: Known libraries
There are a number of well-known libraries that we could use to implement this
feature, including MVEL, OGNL and JBoss EL, or even Spring's EL. I looked at
using them to prototype this feature in the beginning, but they all ende
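To give a flavour of what those prototypes looked like (a sketch only; the
expression and the bound variable name here are hypothetical, not the syntax
the PR settled on), MVEL evaluates a string expression against a map of
variables:

    import java.util.{HashMap => JHashMap}
    import org.mvel2.MVEL

    // Evaluate a config-style expression; physicalMemoryBytes is bound by the caller.
    val vars = new JHashMap[String, Object]()
    vars.put("physicalMemoryBytes", Long.box(64L * 1024 * 1024 * 1024))
    val halfOfRam = MVEL.eval("physicalMemoryBytes * 0.5", vars)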
i'm thinking that this was something transient, and hopefully won't happen
again. a ton of weird stuff happened around the time of this failure (see
my flaky httpd email), and this was the only build exhibiting this behavior.
i'll keep an eye out for this failure over the weekend...
On Fri, Ma
ok, things seem to have stabilized... httpd hasn't flaked since ~noon, the
hanging PRB job on amp-jenkins-worker-06 was removed w/the restart and
things are now building.
i cancelled and retriggered a bunch of PRB builds, btw:
4848 (https://github.com/apache/spark/pull/3699)
5922 (https://github.
i tried a couple of things, but will also be doing a jenkins reboot as soon
as the current batch of builds finish.
On Fri, Mar 13, 2015 at 12:40 PM, shane knapp wrote:
> ok we have a few different things happening:
>
> 1) httpd on the jenkins master is randomly (though not currently) flaking
>
When Kerberos is enabled, I get the following exceptions when starting the
Spark ThriftServer (Spark 1.2.1, git commit
b6eaf77d4332bfb0a698849b1f5f917d20d70e97; Hive 0.13.1; Apache Hadoop 2.4.1).
Command to start the Thrift server:
./start-thriftserver.sh --hiveconf hive.server2.thrift.port=2
ok we have a few different things happening:
1) httpd on the jenkins master is randomly (though not currently) flaking
out and causing visits to the site to return a 503. nothing in the logs
shows any problems.
2) there are some github timeouts, which i tracked down and think it's a
problem with
Here is the JIRA: https://issues.apache.org/jira/browse/SPARK-6315
On Thu, Mar 12, 2015 at 11:00 PM, Michael Armbrust
wrote:
> We are looking at the issue and will likely fix it for Spark 1.3.1.
>
> On Thu, Mar 12, 2015 at 8:25 PM, giive chen wrote:
>
>> Hi all
>>
>> My team has the same issue.
Here you are:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28571/consoleFull
On Fri, Mar 13, 2015 at 11:58 AM, shane knapp wrote:
> link to a build, please?
>
> On Fri, Mar 13, 2015 at 11:53 AM, Hari Shreedharan <
> hshreedha...@cloudera.com> wrote:
>
>> Looks like somethin
we just started having issues when visiting jenkins and getting 503 service
unavailable errors.
i'm on it and will report back with an all-clear.
link to a build, please?
On Fri, Mar 13, 2015 at 11:53 AM, Hari Shreedharan <
hshreedha...@cloudera.com> wrote:
> Looks like something is causing the PR Builder to timeout since this
> morning with the ivy cache being locked.
>
> Any idea what is happening?
>
Looks like something is causing the PR Builder to timeout since this
morning with the ivy cache being locked.
Any idea what is happening?
This is an interesting idea.
Are there well known libraries for doing this? Config is the one place
where it would be great to have something ridiculously simple, so it is
more or less bug free. I'm concerned about the complexity in this patch and
subtle bugs that it might introduce to config opti
Hey Sean,
Yes, go crazy. Once we close the release vote, it's open season to
merge backports into that release.
- Patrick
On Fri, Mar 13, 2015 at 9:31 AM, Mridul Muralidharan wrote:
> Who is managing 1.3 release ? You might want to coordinate with them before
> porting changes to branch.
>
> Re
i'll be taking jenkins down for some much-needed plugin updates, as well as
potentially upgrading jenkins itself.
this will start at 730am PDT, and i'm hoping to have everything up by noon.
the move to the anaconda python will take place in the next couple of weeks
as i'm in the process of rebuil
Kudos to the whole team for such a significant achievement!
On Fri, Mar 13, 2015 at 10:00 AM, Patrick Wendell
wrote:
> Hi All,
>
> I'm happy to announce the availability of Spark 1.3.0! Spark 1.3.0 is
> the fourth release on the API-compatible 1.X line. It is Spark's
> largest release ever, with
Hi All,
I'm happy to announce the availability of Spark 1.3.0! Spark 1.3.0 is
the fourth release on the API-compatible 1.X line. It is Spark's
largest release ever, with contributions from 172 developers and more
than 1,000 commits!
Visit the release notes [1] to read about the new features, or
d
Hi Reynold,
I left Chester with a copy of the slides, so I assume they'll be posted
on the SF ML or Big Data sites. We have a draft paper under review. I
can ask the co-authors about arxiv'ing it.
We have a few heuristics for power-law data. One of them is to keep the
feature set sorted by freq
I use the spark-submit script and the config files in a conf directory. I
see the memory settings reflected in the stdout, as well as in the web UI.
(It prints all variables from spark-defaults.conf, and mentions I have 540GB
free memory available when trying to store a broadcast variable or RDD). I
a
Who is managing the 1.3 release? You might want to coordinate with them before
porting changes to the branch.
Regards
Mridul
On Friday, March 13, 2015, Sean Owen wrote:
> Yeah, I'm guessing that is all happening quite literally as we speak.
> The Apache git tag is the one of reference:
>
> https://git
Yeah, I'm guessing that is all happening quite literally as we speak.
The Apache git tag is the one of reference:
https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=4aaf48d46d13129f0f9bdafd771dd80fe568a7dc
Open season on 1.3 branch then...
On Fri, Mar 13, 2015 at 4:20 PM, Nicholas Cha
Looks like the release is out:
http://spark.apache.org/releases/spark-release-1-3-0.html
Though, interestingly, I think we are missing the appropriate v1.3.0 tag:
https://github.com/apache/spark/releases
Nick
On Fri, Mar 13, 2015 at 6:07 AM Sean Owen wrote:
> Is the release certain enough that
I am curious how you are going to support these over Mesos and YARN.
Any configuration change like this should be applicable to all of them, not
just local and standalone modes.
Regards
Mridul
On Friday, March 13, 2015, Dale Richardson wrote:
> PR#4937 ( https://github.com/apa
Reynold,
Prof. Canny gave me the slides yesterday. I will post the link to the
slides to both the SF Big Analytics and SF Machine Learning meetups.
Chester
Sent from my iPad
On Mar 12, 2015, at 22:53, Reynold Xin wrote:
> Thanks for chiming in, John. I missed your meetup last night - do yo
PR#4937 (https://github.com/apache/spark/pull/4937) is a feature to allow
Spark configuration options (whether given on the command line, in an
environment variable or in a configuration file) to be specified via a simple
expression language.
Such a feature has the following end-user benefits:
- A
Is the release certain enough that we can resume merging into
branch-1.3 at this point? I have a number of back-ports queued up and
didn't want to merge in case another last RC was needed. I see a few
commits to the branch though.
---
How did you run the Spark command? Maybe the memory setting didn't actually
apply? How much memory does the web ui say is available?
BTW - I don't think any JVM can actually handle 700G heap ... (maybe Zing).
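One quick way to check whether a setting actually took effect (a small sketch
using standard Spark and JVM calls; sc is an existing SparkContext):

    // Compare what was requested with what the driver JVM actually got.
    val requested = sc.getConf.getOption("spark.driver.memory")  // None if never set
    val maxHeapGb = Runtime.getRuntime.maxMemory / (1024.0 * 1024 * 1024)
    println(s"spark.driver.memory = $requested, driver max heap = $maxHeapGb GB")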
On Thu, Mar 12, 2015 at 4:09 PM, Tom Hubregtsen
wrote:
> Hi all,
>
> I'm running the t
Hi all,
RDD.toLocalIterator() creates as many jobs as the # of partitions, and it spams
the Spark UI, especially when the method is used on an RDD with hundreds or
thousands of partitions.
Does anyone have a way to work around this issue? What do people think about
introducing a SparkContext local prope
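One workaround in that direction is to pull several partitions back per job
instead of one, so the UI shows far fewer jobs (a rough sketch, not a drop-in
replacement for toLocalIterator; it assumes each batch of partitions fits in
driver memory):

    import scala.reflect.ClassTag
    import org.apache.spark.rdd.RDD

    // Each collect() is a single job covering batchSize partitions.
    def localIteratorBatched[T: ClassTag](rdd: RDD[T], batchSize: Int): Iterator[T] = {
      (0 until rdd.partitions.length).grouped(batchSize).flatMap { ids =>
        val wanted = ids.toSet
        rdd.mapPartitionsWithIndex { (i, it) =>
          if (wanted.contains(i)) it else Iterator.empty
        }.collect().iterator
      }
    }

With 1,000 partitions and batchSize = 50, that is 20 jobs in the UI instead of 1,000.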