Hi Alexander, Joseph, Evan,
I just wanted to weigh in with an empirical result that we've had on a
standalone cluster with 16 nodes and 256 cores.
Typically we run optimization tasks with 256 partitions, i.e. 1
partition per core, and find that performance worsens with more
partitions than physical cores.
See this thread
http://search-hadoop.com/m/q3RTtV3VFNdgNri2&subj=Re+Build+spark+1+5+1+branch+fails
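For anyone who wants to repeat the comparison, a minimal sketch of the setup
(assuming an existing SparkContext named sc; the input path is hypothetical):

// Repartition to one partition per physical core;
// sc.defaultParallelism normally reflects the total cores available.
val data = sc.textFile("hdfs:///path/to/input") // hypothetical path
val perCore = data.repartition(sc.defaultParallelism)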
> On Oct 19, 2015, at 6:59 PM, Annabel Melongo
> wrote:
>
> I tried to build Spark according to the build directions and it failed
> due to the following error:
Seems to be a heap space issue for Maven. Have you configured Maven's
memory according to the instructions on the web page?
export MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m"
On Mon, Oct 19, 2015 at 6:59 PM, Annabel Melongo
<melongo_anna...@yahoo.com.invalid> wrote:
> I
I tried to build Spark according to the build directions and it failed due
to the following error:
[link preview: Building Spark - Spark 1.5.1 Documentation]
things are green, nice catch on the job config, josh.
On Mon, Oct 19, 2015 at 1:57 PM, shane knapp wrote:
> ++joshrosen
>
> some of those 1.4 builds were incorrectly configured and launching on
> a reserved executor... josh fixed them and we're looking a lot better
> (meaning that we're building
Hi all
I feel like this question is more Spark dev related than Spark user
related. Please correct me if I'm wrong.
My project's data flow involves sampling records from data stored as a
Parquet dataset.
I've checked the DataFrames API and it doesn't support user-defined
predicate/projection pushdown
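To make the question concrete, here is roughly what I can do today, sampling
after the scan (a sketch, assuming a SQLContext named sqlContext; the path and
column are hypothetical):

// Read the Parquet dataset and sample ~1% of the rows without replacement.
// A plain column filter can still be pushed down to the Parquet reader,
// but the sampling itself only happens after rows are produced.
val df = sqlContext.read.parquet("hdfs:///data/events.parquet") // hypothetical path
val sampled = df.filter(df("value") > 0).sample(withReplacement = false, fraction = 0.01)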
Hi, Michael,
Thank you again! Just found the function that generates the "!" mark:
/**
 * A prefix string used when printing the plan.
 *
 * We use "!" to indicate an invalid plan, and "'" to indicate an
 * unresolved plan.
 */
protected def statePrefix = if (missingInput.nonEmpty && children.nonEmpty) "!" else ""
++joshrosen
some of those 1.4 builds were incorrectly configured and launching on
a reserved executor... josh fixed them and we're looking a lot better
(meaning that we're building and not failing at launch).
shane
On Mon, Oct 19, 2015 at 1:49 PM, Patrick Wendell wrote:
> I think many of them
I think many of them are coming from the Spark 1.4 builds:
https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/Spark-1.4-Maven-pre-YARN/3900/console
On Mon, Oct 19, 2015 at 1:44 PM, Patrick Wendell wrote:
> This is what I'm looking at:
>
>
> https://amplab.cs.berkele
This is what I'm looking at:
https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/
On Mon, Oct 19, 2015 at 12:58 PM, shane knapp wrote:
> all we did was reboot -05 and -03... i'm seeing a bunch of green
> builds. could you provide me w/some specific failures so i can
all we did was reboot -05 and -03... i'm seeing a bunch of green
builds. could you provide me w/some specific failures so i can look
into them more closely?
On Mon, Oct 19, 2015 at 12:27 PM, Patrick Wendell wrote:
> Hey Shane,
>
> It also appears that every Spark build is failing right now. Co
Hey Shane,
It also appears that every Spark build is failing right now. Could it be
related to your changes?
- Patrick
On Mon, Oct 19, 2015 at 11:13 AM, shane knapp wrote:
> worker 05 is back up now... looks like the machine OOMed and needed
> to be kicked.
>
> On Mon, Oct 19, 2015 at 9:39 AM
It means that there is an invalid attribute reference (i.e. a #n where the
attribute is missing from the child operator).
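For illustration, a made-up fragment of what such a plan could look like (the
operators and attribute ids are hypothetical):

!Project [value#3]
 LocalRelation [id#1, name#2]

Here Project asks for value#3, which is not among the child's output
attributes, so the node is printed with the "!" prefix.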
On Sun, Oct 18, 2015 at 11:38 PM, Xiao Li wrote:
> Hi, all,
>
> After turning on the trace, I saw a strange exclamation mark in
> the intermediate plans. This happened in cat
worker 05 is back up now... looks like the machine OOMed and needed
to be kicked.
On Mon, Oct 19, 2015 at 9:39 AM, shane knapp wrote:
> i'll have to head down to the colo and see what's up with it... it
> seems to be wedged (pings ok, can't ssh in) and i'll update the list
> when i figure out w
Evan, Joseph,
Thank you for the valuable suggestions. It would be great to improve TreeAggregate
(if possible).
Making fewer updates would certainly make sense, though that would mean using a
batch gradient method such as LBFGS. It seems that, as of today, it is the only
viable option in Spark.
I will also take a look
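For reference, a minimal sketch of the kind of treeAggregate call I have in
mind (the points RDD and numFeatures are hypothetical; depth controls how many
levels of partial aggregation run on the executors):

// Sum per-record gradients with two levels of tree aggregation, so partial
// results are combined on executors before anything reaches the driver.
val gradientSum = points.treeAggregate(Array.fill(numFeatures)(0.0))(
  (acc, p) => acc.zip(p).map { case (a, b) => a + b }, // seqOp: fold one point in
  (a, b) => a.zip(b).map { case (x, y) => x + y },     // combOp: merge partials
  depth = 2)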
Can you reproduce it on master?
I can't reproduce it with the following code:
>>> t2 = sqlContext.range(50).selectExpr("concat('A', id) as id")
>>> t1 = sqlContext.range(10).selectExpr("concat('A', id) as id")
>>> t1.join(t2).where(t1.id == t2.id).explain()
ShuffledHashJoin [id#21], [id#19], Buil
i'll have to head down to the colo and see what's up with it... it
seems to be wedged (pings ok, can't ssh in) and i'll update the list
when i figure out what's wrong.
i don't think it caught fire (#toosoon?), because everything else is
up and running. :)
shane
Hey all,
tl;dr: I built Spark with Java 1.8 even though my JAVA_HOME pointed to 1.7.
Then it failed with binary incompatibilities.
I couldn’t find any mention of this in the docs, so it might be a known
thing, but it’s definitely too easy to do the wrong thing.
The problem is that Maven is using
Just testing spark v1.5.0 (on mesos v0.23) and we saw something
unexpected (according to the event timeline) - when a spark task failed
(intermittent S3 connection failure), the whole executor was removed and
never recovered, so the job proceeded slower than normal.
Looking at the code I sa
Hi Rohith,
Do you have multiple interfaces on the machine hosting the master ?
If so, can you try to force it to use the public interface with:
sbin/start-master.sh --ip xxx.xxx.xxx.xxx
Regards
JB
On 10/19/2015 02:05 PM, Rohith Parameshwara wrote:
Hi all,
I am doing some experime
Hi all,
I am doing some experiments on a Spark standalone cluster setup
and I am facing the following issue:
I have a 4-node cluster setup. As per
http://spark.apache.org/docs/latest/spark-standalone.html#starting-a-cluster-manually
I tried to start the cluster with the scripts but
Hi Hao,
Each table is created with the following Python code snippet:
import json
from math import ceil
from random import random
data = [{'id': 'A%d' % i, 'value': ceil(random() * 10)} for i in range(0, 50)]
with open('A.json', 'w+') as output:
    json.dump(data, output)
Tables A and B contain 10 and 50 tuples respectively.
In the Spark shell I type
sq
This is a deliberate kill request by the heartbeat mechanism and has nothing
to do with dynamic allocation. Because you're running in yarn mode,
"supportDynamicAllocation" will be true, but there is actually no relation
to dynamic allocation.
From my understanding "doRequestTotalExecutors" is
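If the goal is to make the heartbeat-based kill less aggressive, the usual
knobs are the heartbeat timeouts (a sketch; the values are arbitrary, and very
old Spark versions may expect plain milliseconds instead of time strings):

// Relax heartbeat-related timeouts so slow executors are not killed as eagerly.
val conf = new org.apache.spark.SparkConf()
  .set("spark.executor.heartbeatInterval", "30s") // how often executors report in
  .set("spark.network.timeout", "300s")           // fallback timeout, incl. lost heartbeats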
Wojciech,
I am a programmer with over 30 years of programming experience, most
recently in Java, and with lots of experience in languages like LISP
(functional), and R (array/list). I'm currently learning Haskell, and
working in an environment where I need to apply Spark to "large data". I'd
be v
Hey all,
Thanks in advance. I ran into a situation where the Spark driver reduced the
total executor count for my job even with dynamic allocation disabled, which
caused the job to hang forever.
Setup:
Spark 1.3.1 on a hadoop-yarn-2.4.0 cluster.
All servers in the cluster run Linux version 2.6.32.
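In case it helps reproduce, the relevant settings look roughly like this (a
sketch; the executor count is an arbitrary example):

// Dynamic allocation off, fixed executor count requested up front.
val conf = new org.apache.spark.SparkConf()
  .set("spark.dynamicAllocation.enabled", "false")
  .set("spark.executor.instances", "10") // arbitrary example count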