Re: Recent heartbeats
Thanks Patrick... I searched the archives and found the answer: tuning the Akka and GC params.

On Fri, Apr 4, 2014 at 10:35 PM, Patrick Wendell wrote:
> I answered this over on the user list...
>
> On Fri, Apr 4, 2014 at 6:13 PM, Debasish Das wrote:
> > Hi,
> >
> > Also posted it on user, but then I realized it might be more involved.
> >
> > In my ALS runs I am noticing messages that complain about heartbeats:
> >
> > 14/04/04 20:43:09 WARN BlockManagerMasterActor: Removing BlockManager BlockManagerId(17, machine1, 53419, 0) with no recent heart beats: 48476ms exceeds 45000ms
> > 14/04/04 20:43:09 WARN BlockManagerMasterActor: Removing BlockManager BlockManagerId(12, machine2, 60714, 0) with no recent heart beats: 45328ms exceeds 45000ms
> > 14/04/04 20:43:09 WARN BlockManagerMasterActor: Removing BlockManager BlockManagerId(19, machine3, 39496, 0) with no recent heart beats: 53259ms exceeds 45000ms
> >
> > Is this some issue with the underlying JVM that Akka runs on? Can I increase the heartbeat somehow to resolve these messages?
> >
> > Any more insight into the possible cause of the heartbeat warnings would be helpful...
> >
> > Thanks.
> > Deb
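For reference, the archived answer amounts to raising the block-manager heartbeat timeout and keeping executor GC pauses short. A minimal sketch of what that tuning can look like; the property names are those documented around Spark 0.9/1.0 and the values are purely illustrative assumptions, not recommendations:

import org.apache.spark.{SparkConf, SparkContext}

// Sketch only: raise the BlockManager heartbeat timeout above the 45000 ms
// default seen in the warning, and pass GC flags that keep executor pauses
// short. Property names are from the Spark 0.9/1.0 docs; values are
// illustrative. On 0.9, GC flags were usually passed via SPARK_JAVA_OPTS
// rather than spark.executor.extraJavaOptions.
val conf = new SparkConf()
  .setAppName("ALS with a longer heartbeat timeout")
  .set("spark.storage.blockManagerSlaveTimeoutMs", "120000") // default 45000
  .set("spark.akka.timeout", "300")                          // seconds
  .set("spark.executor.extraJavaOptions",
    "-XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode")

val sc = new SparkContext(conf)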
Master compilation
I am synced with apache/spark master but I am getting an error in the spark/sql compilation... Is the master broken?

[info] Compiling 34 Scala sources to /home/debasish/spark_deploy/sql/core/target/scala-2.10/classes...
[error] /home/debasish/spark_deploy/sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetRelation.scala:106: value getGlobal is not a member of object java.util.logging.Logger
[error] logger.setParent(Logger.getGlobal)
[error]                                 ^
[error] one error found
[error] (sql/compile:compile) Compilation failed
[error] Total time: 171 s, completed Apr 5, 2014 4:58:41 PM

Thanks.
Deb
Re: Master compilation
That method was added in Java 7. The project is on Java 6, so I think this was just an inadvertent error in a recent PR (it was the 'Spark parquet improvements' one).

I'll open a hot-fix PR after looking for other stuff like this that might have snuck in.
--
Sean Owen | Director, Data Science | London

On Sat, Apr 5, 2014 at 10:04 PM, Debasish Das wrote:
> I am synced with apache/spark master but I am getting an error in the spark/sql compilation... Is the master broken?
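For what it's worth, a Java 6-compatible alternative for the failing line would look something like the sketch below: Logger.getGlobal was only added in Java 7, but the GLOBAL_LOGGER_NAME constant has existed since Java 6, so the same logger can be looked up by name. The logger name used here is an assumption for illustration; this is not necessarily what the hot-fix PR actually does.

import java.util.logging.Logger

// Sketch of a Java 6-compatible replacement for logger.setParent(Logger.getGlobal):
// look the global logger up by name instead of calling the Java-7-only accessor.
val logger = Logger.getLogger("org.apache.spark.sql.parquet")  // assumed logger name
logger.setParent(Logger.getLogger(Logger.GLOBAL_LOGGER_NAME))  // GLOBAL_LOGGER_NAME exists since Java 6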
Re: Master compilation
I can compile with Java 7... let me try that...

On Sat, Apr 5, 2014 at 2:19 PM, Sean Owen wrote:
> That method was added in Java 7. The project is on Java 6, so I think this was just an inadvertent error in a recent PR (it was the 'Spark parquet improvements' one).
Re: Master compilation
I verified this is happening for both CDH4.5 and 1.0.4... My deploy environment is Java 6, so Java 7 compilation is not going to help...

Is this the PR which caused it?

Commit fbebaed (SPARK-1383), "Spark parquet improvements", Author: Andre Schumacher, 2 days ago:
A few improvements to the Parquet support for SQL queries:
- Instead of files, a ParquetRelation is now backed by a directory, which simplifies importing data from other sources
- InsertIntoParquetTable operation now supports switching between overwriting or appending (at least in HiveQL)
- tests now use the new API
- Parquet logging can be set to WARNING level (default)
- Default compression for Parquet files (GZIP, as in parquet-mr)

I will go back to a stable check-in before this one.

On Sat, Apr 5, 2014 at 2:22 PM, Debasish Das wrote:
> I can compile with Java 7... let me try that...
Re: Master compilation
If you want to submit a hot fix for this issue specifically, please do. I'm not sure why it didn't fail our build...

On Sat, Apr 5, 2014 at 2:30 PM, Debasish Das wrote:
> I verified this is happening for both CDH4.5 and 1.0.4... My deploy environment is Java 6, so Java 7 compilation is not going to help...
Re: Master compilation
Will do. I'm just finishing a recompile to check for anything else like this.

The reason is that the tests run with Java 7 (like lots of us do, including me), so it used the Java 7 classpath and found the class. It's possible to use Java 7 with the Java 6 -bootclasspath. Or just use Java 6.
--
Sean Owen | Director, Data Science | London

On Sat, Apr 5, 2014 at 11:06 PM, Patrick Wendell wrote:
> If you want to submit a hot fix for this issue specifically, please do. I'm not sure why it didn't fail our build...
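A sketch of what "Java 7 with the Java 6 -bootclasspath" could look like as sbt build settings. The JAVA6_HOME environment variable and the rt.jar path are assumptions, and these flags are illustrative rather than what the Spark build itself uses:

// Build-settings sketch (sbt): compile on JDK 7 but resolve the Java class
// library against a Java 6 rt.jar, so Java-7-only APIs such as Logger.getGlobal
// fail at compile time instead of at runtime on a JRE 6 cluster.
// JAVA6_HOME is assumed to point at a local JDK 6 install.
javacOptions ++= Seq(
  "-source", "1.6", "-target", "1.6",
  "-bootclasspath", sys.env("JAVA6_HOME") + "/jre/lib/rt.jar")

scalacOptions ++= Seq(
  "-target:jvm-1.6",
  "-javabootclasspath", sys.env("JAVA6_HOME") + "/jre/lib/rt.jar")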
Re: Master compilation
@Patrick our cluster still has Java 6 deployed... and I compiled using JDK 6...

Sean is looking into it... this API is in Java 7 but not Java 6...

On Sat, Apr 5, 2014 at 3:06 PM, Patrick Wendell wrote:
> If you want to submit a hot fix for this issue specifically, please do. I'm not sure why it didn't fail our build...
ephemeral storage level in spark ?
Hi,

We have a requirement to use (potentially) ephemeral storage that is not within the JVM but is strongly tied to a worker node. The source of truth for a block would still be within Spark, but to actually do the computation we would need to copy data to the external device (where it might lie around for a while: so data locality really helps, since we can avoid a subsequent copy if the data is already present when computing on the same block again).

I was wondering if the recently added storage level for Tachyon would help in this case (note, Tachyon itself won't help; just the storage level might). What sort of guarantees does it provide? How extensible is it? Or is it strongly tied to Tachyon with only a generic name?

Thanks,
Mridul
Re: ephemeral storage level in spark ?
Hi Mridul,

Do you mean the scenario where different Spark applications need to read the same raw data, which is stored in a remote cluster or on remote machines, and the goal is to load the remote raw data only once?

Haoyuan

On Sat, Apr 5, 2014 at 4:30 PM, Mridul Muralidharan wrote:
> We have a requirement to use (potentially) ephemeral storage that is not within the JVM but is strongly tied to a worker node.

--
Haoyuan Li
Algorithms, Machines, People Lab, EECS, UC Berkeley
http://www.cs.berkeley.edu/~haoyuan/
Re: ephemeral storage level in spark ?
No, I am thinking along the lines of writing to an accelerator card, or a dedicated card with its own memory.

Regards,
Mridul

On Apr 6, 2014 5:19 AM, "Haoyuan Li" wrote:
> Do you mean the scenario where different Spark applications need to read the same raw data, which is stored in a remote cluster or on remote machines, and the goal is to load the remote raw data only once?
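For context, the Tachyon-backed level the thread refers to is used like any other persistence level. A minimal sketch, assuming the level is exposed as StorageLevel.OFF_HEAP in the then-current master; whether the store behind it is pluggable enough to target device-local memory is exactly the open question above.

import org.apache.spark.SparkContext
import org.apache.spark.storage.StorageLevel

// Minimal sketch of persisting an RDD with the off-heap (Tachyon-backed)
// storage level; blocks are kept outside the JVM heap of the executors.
object OffHeapPersistSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext("local[2]", "off-heap-sketch")
    val data = sc.parallelize(1 to 1000000)
    data.persist(StorageLevel.OFF_HEAP)  // assumed name of the recently added level
    println(data.count())
    sc.stop()
  }
}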
Re: Master compilation
With JDK 7 I could compile it fine:

java version "1.7.0_51"
Java(TM) SE Runtime Environment (build 1.7.0_51-b13)
Java HotSpot(TM) 64-Bit Server VM (build 24.51-b03, mixed mode)

What happens if I take that jar and try to deploy it on the ancient CentOS 6 default on the cluster?

java -version
java version "1.6.0_31"
Java(TM) SE Runtime Environment (build 1.6.0_31-b04)
Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode)

Breeze compilation also fails with JDK 6 but runs fine with JDK 7, and the Breeze jar is already included in spark mllib with Xiangrui's sparse vector check-in.

Does that mean that classes compiled and generated using JDK 7 will run fine on JRE 6? I am confused.

On Sat, Apr 5, 2014 at 3:09 PM, Sean Owen wrote:
> Will do. I'm just finishing a recompile to check for anything else like this.
>
> The reason is that the tests run with Java 7 (like lots of us do, including me), so it used the Java 7 classpath and found the class. It's possible to use Java 7 with the Java 6 -bootclasspath. Or just use Java 6.
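To make the failure mode concrete: bytecode generated on JDK 7 with a 1.6 target will load on a Java 6 JRE, but any call to a Java-7-only API still blows up at runtime. A small illustrative sketch (not from the Spark code base):

import java.util.logging.Logger

// Compiled on JDK 7 (even with -target:jvm-1.6 / -target 1.6) this builds,
// because the compiler resolves getGlobal against the JDK 7 class library.
// On a Java 6 JRE the class loads fine, but the call below throws
// java.lang.NoSuchMethodError, since Java 6's java.util.logging.Logger has
// no getGlobal method. Pinning -bootclasspath to a Java 6 rt.jar turns this
// into a compile-time error instead.
object Java7ApiOnJava6 {
  def main(args: Array[String]): Unit = {
    val global = Logger.getGlobal  // Java-7-only API
    println(global.getName)
  }
}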
ALS array index out of bounds with 50 factors
Hi,

I deployed apache/spark master today; recently there were many ALS-related check-ins and enhancements. I am running ALS with explicit feedback, and I remember most of the enhancements were related to implicit feedback...

With 25 factors my runs were successful, but with 50 factors I am getting an array index out of bounds error...

Note that I was hitting GC errors before with an older version of Spark, but it seems like the sparse matrix partitioning scheme has changed now... data caching looks much more balanced now... earlier one node was becoming the bottleneck, although I ran with 64g memory per node...

There are around 3M products and 25M users...

Has anyone noticed this bug or something similar?

14/04/05 23:03:15 WARN TaskSetManager: Loss was due to java.lang.ArrayIndexOutOfBoundsException
java.lang.ArrayIndexOutOfBoundsException: 81029
        at org.apache.spark.mllib.recommendation.ALS$$anonfun$org$apache$spark$mllib$recommendation$ALS$$updateBlock$1$$anonfun$apply$mcVI$sp$1.apply$mcVI$sp(ALS.scala:450)
        at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
        at org.apache.spark.mllib.recommendation.ALS$$anonfun$org$apache$spark$mllib$recommendation$ALS$$updateBlock$1.apply$mcVI$sp(ALS.scala:446)
        at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
        at org.apache.spark.mllib.recommendation.ALS.org$apache$spark$mllib$recommendation$ALS$$updateBlock(ALS.scala:445)
        at org.apache.spark.mllib.recommendation.ALS$$anonfun$org$apache$spark$mllib$recommendation$ALS$$updateFeatures$2.apply(ALS.scala:416)
        at org.apache.spark.mllib.recommendation.ALS$$anonfun$org$apache$spark$mllib$recommendation$ALS$$updateFeatures$2.apply(ALS.scala:415)
        at org.apache.spark.rdd.MappedValuesRDD$$anonfun$compute$1.apply(MappedValuesRDD.scala:31)
        at org.apache.spark.rdd.MappedValuesRDD$$anonfun$compute$1.apply(MappedValuesRDD.scala:31)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
        at org.apache.spark.rdd.CoGroupedRDD$$anonfun$compute$4.apply(CoGroupedRDD.scala:149)
        at org.apache.spark.rdd.CoGroupedRDD$$anonfun$compute$4.apply(CoGroupedRDD.scala:147)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
        at org.apache.spark.rdd.CoGroupedRDD.compute(CoGroupedRDD.scala:147)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:229)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:220)
        at org.apache.spark.rdd.MappedValuesRDD.compute(MappedValuesRDD.scala:31)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:229)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:220)
        at org.apache.spark.rdd.FlatMappedValuesRDD.compute(FlatMappedValuesRDD.scala:31)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:229)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:220)
        at org.apache.spark.rdd.FlatMappedRDD.compute(FlatMappedRDD.scala:33)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:229)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:220)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:161)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:102)
        at org.apache.spark.scheduler.Task.run(Task.scala:52)
        at org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:211)
        at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:43)
        at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:42)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
        at org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:42)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:176)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)

Thanks.
Deb
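For anyone trying to reproduce this, the run is a plain explicit-feedback ALS call. A minimal sketch of the setup described above; the input path, rank, iteration count, lambda, and block count are illustrative assumptions, not the exact values from the failing job:

import org.apache.spark.SparkContext
import org.apache.spark.mllib.recommendation.{ALS, Rating}

// Sketch of the kind of explicit-feedback ALS run described above
// (~25M users x ~3M products): rank 25 completes, while rank 50 hits the
// ArrayIndexOutOfBoundsException inside updateBlock. All values illustrative.
object ALSRankRepro {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext("local[4]", "als-rank-repro")
    val ratings = sc.textFile("hdfs:///path/to/ratings")  // assumed input location
      .map { line =>
        val Array(user, product, rating) = line.split(',')
        Rating(user.toInt, product.toInt, rating.toDouble)
      }

    val rank = 50        // 25 worked; 50 triggered the exception above
    val iterations = 10
    val lambda = 0.065
    val blocks = 64      // number of user/product blocks

    val model = ALS.train(ratings, rank, iterations, lambda, blocks)
    println(model.userFeatures.count())
    sc.stop()
  }
}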