[mllib] Are there any bugs when dividing a Breeze sparse vector in Spark v1.3.0-rc3?

2015-03-15 Thread Yu Ishikawa
Hi all,

Are there any known bugs when dividing a Breeze sparse vector in Spark v1.3.0-rc3? When
I tried to divide a sparse vector in Spark v1.3.0-rc3, I got a wrong result
if the target vector contains any zero values.

Spark v1.3.0-rc3 depends on Breeze v0.11.1, and Breeze v0.11.1 seems to have
a bug when dividing a sparse vector by a scalar value. When dividing a Breeze
sparse vector that contains any zero values, the result seems to be an empty
(all-zero) vector. The same code works correctly on Spark v1.2.x.

Multiplying a Breeze sparse vector, however, works without problems. I asked
the Breeze community about this problem in the issue below:
https://github.com/scalanlp/breeze/issues/382

For example,
```
test("dividing a breeze spark vector") {
val vec = Vectors.sparse(6, Array(0, 4), Array(0.0, 10.0)).toBreeze
val n = 60.0
val answer1 = vec :/ n
val answer2 = vec.toDenseVector :/ n
println(vec)
println(answer1)
println(answer2)
assert(answer1.toDenseVector === answer2)
}

SparseVector((0,0.0), (4,10.0))
SparseVector()
DenseVector(0.0, 0.0, 0.0, 0.0, 0.1, 0.0)

DenseVector(0.0, 0.0, 0.0, 0.0, 0.0, 0.0) did not equal DenseVector(0.0, 0.0, 0.0, 0.0, 0.1, 0.0)
org.scalatest.exceptions.TestFailedException: DenseVector(0.0, 0.0, 0.0, 0.0, 0.0, 0.0) did not equal DenseVector(0.0, 0.0, 0.0, 0.0, 0.1, 0.0)
```
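
Until the fixed Breeze release is available, a possible workaround sketch is to multiply
by the reciprocal instead of dividing, relying on the observation above that scalar
multiplication on sparse vectors is unaffected (the direct Breeze constructor below is
only for illustration):

```
import breeze.linalg.SparseVector

// The same sparse vector built directly in Breeze: indices, values, length.
val vec = new SparseVector[Double](Array(0, 4), Array(0.0, 10.0), 6)
val n = 60.0

// Workaround sketch: vec :/ n triggers the bug, so multiply by 1/n instead,
// using the multiplication path that is reported to work correctly.
val divided = vec :* (1.0 / n)
println(divided) // expected to keep the non-zero entry at index 4
```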

Thanks,
Yu Ishikawa






Re: [mllib] Are there any bugs when dividing a Breeze sparse vector in Spark v1.3.0-rc3?

2015-03-15 Thread DB Tsai
It's a bug on Breeze's side. Once David fixes it and publishes the fix to
Maven, we can upgrade to Breeze 0.11.2. Please file a JIRA ticket for
this issue. Thanks.

Sincerely,

DB Tsai
---
Blog: https://www.dbtsai.com


On Sun, Mar 15, 2015 at 12:45 AM, Yu Ishikawa
 wrote:
> Hi all,
>
> Is there any bugs to divide a Breeze sparse vector at Spark v1.3.0-rc3? When
> I tried to divide a sparse vector at Spark v1.3.0-rc3, I got a wrong result
> if the target vector has any zero values.
>
> Spark v1.3.0-rc3 depends on Breeze v0.11.1. And Breeze v0.11.1 seems to have
> any bugs to divide a sparse vector by a scalar value. When dividing a breeze
> sparse vector which has any zero values, the result seems to be a zero
> vector. However, we can run the same code on Spark v1.2.x.
>
> However, there is no problem to multiply a breeze sparse vector. I asked the
> breeze community this problem on the below issue.
> https://github.com/scalanlp/breeze/issues/382
>
> For example,
> ```
> test("dividing a breeze spark vector") {
> val vec = Vectors.sparse(6, Array(0, 4), Array(0.0, 10.0)).toBreeze
> val n = 60.0
> val answer1 = vec :/ n
> val answer2 = vec.toDenseVector :/ n
> println(vec)
> println(answer1)
> println(answer2)
> assert(answer1.toDenseVector === answer2)
> }
>
> SparseVector((0,0.0), (4,10.0))
> SparseVector()
> DenseVector(0.0, 0.0, 0.0, 0.0, 0.1, 0.0)
>
> DenseVector(0.0, 0.0, 0.0, 0.0, 0.0, 0.0) did not equal DenseVector(0.0,
> 0.0, 0.0, 0.0, 0.1, 0.0)
> org.scalatest.exceptions.TestFailedException: DenseVector(0.0, 0.0, 0.0,
> 0.0, 0.0, 0.0) did not equal DenseVector(0.0, 0.0, 0.0, 0.0,
> 0.1, 0.0)
> ```
>
> Thanks,
> Yu Ishikawa
>
>
>
>




Re: [mllib] Are there any bugs when dividing a Breeze sparse vector in Spark v1.3.0-rc3?

2015-03-15 Thread Yu Ishikawa
David Hall, the creator of Breeze, told me that it's a bug, so I filed a JIRA
ticket for this issue. We need to upgrade Breeze from 0.11.1 to 0.11.2 or
later in order to fix the bug, once the new version of Breeze is released.

[SPARK-6341] Upgrade breeze from 0.11.1 to 0.11.2 or later - ASF JIRA
https://issues.apache.org/jira/browse/SPARK-6341

Thanks,
Yu Ishikawa






Re: Spark ThriftServer encounters java.lang.IllegalArgumentException: Unknown auth type: null Allowed values are: [auth-int, auth-conf, auth]

2015-03-15 Thread Cheng Lian

Hey Andrew,

Would you please create a JIRA ticket for this? To preserve
compatibility with existing Hive JDBC/ODBC drivers, Spark SQL's
HiveThriftServer2 intercepts some HiveServer2 components and injects
Spark-specific logic into them. This makes the implementation somewhat
hacky (e.g. a bunch of reflection tricks are used). We haven't included
Kerberos tests in Spark's unit/integration test suites, and it's possible
that HiveThriftServer2 somehow breaks Hive's Kerberos support.


Cheng

On 3/14/15 3:43 AM, Andrew Lee wrote:

When Kerberos is enabled, I get the following exception (Spark 1.2.1, git commit
b6eaf77d4332bfb0a698849b1f5f917d20d70e97, Hive 0.13.1, Apache Hadoop 2.4.1)
when starting the Spark ThriftServer.
Command used to start the ThriftServer:
./start-thriftserver.sh --hiveconf hive.server2.thrift.port=2 --hiveconf 
hive.server2.thrift.bind.host=$(hostname) --master yarn-client
Error message in spark.log

2015-03-13 18:26:05,363 ERROR 
org.apache.hive.service.cli.thrift.ThriftCLIService 
(ThriftBinaryCLIService.java:run(93)) - Error:
java.lang.IllegalArgumentException: Unknown auth type: null Allowed values are: 
[auth-int, auth-conf, auth]
 at org.apache.hive.service.auth.SaslQOP.fromString(SaslQOP.java:56)
 at 
org.apache.hive.service.auth.HiveAuthFactory.getSaslProperties(HiveAuthFactory.java:118)
 at 
org.apache.hive.service.auth.HiveAuthFactory.getAuthTransFactory(HiveAuthFactory.java:133)
 at 
org.apache.hive.service.cli.thrift.ThriftBinaryCLIService.run(ThriftBinaryCLIService.java:43)
 at java.lang.Thread.run(Thread.java:744)
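
For what it's worth, the top of the stack trace is Hive's SaslQOP enum parser, and the
message above is exactly what that parser produces when it is handed a null QOP string,
which would suggest that the hive.server2.thrift.sasl.qop value is not reaching
HiveAuthFactory in this setup. A minimal sketch of that parse step (assuming Hive 0.13.1
on the classpath):

```
// Minimal sketch, assuming Hive 0.13.1 on the classpath: the enum parser at the
// top of the stack trace rejects a null or unknown QOP string.
import org.apache.hive.service.auth.SaslQOP

SaslQOP.fromString("auth") // resolves to the "auth" QOP level
SaslQOP.fromString(null)   // throws IllegalArgumentException: Unknown auth type: null ...
```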

I'm wondering if this is the same problem described in HIVE-8154 and HIVE-7620,
caused by an older code base in the Spark ThriftServer?
Any insights are appreciated. Currently, I can't get the Spark ThriftServer to run
against a Kerberos cluster (Apache Hadoop 2.4.1).

My hive-site.xml looks like the following for spark/conf.









<configuration>
  <property>
    <name>hive.semantic.analyzer.factory.impl</name>
    <value>org.apache.hcatalog.cli.HCatSemanticAnalyzerFactory</value>
  </property>
  <property>
    <name>hive.metastore.execute.setugi</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.stats.autogather</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.session.history.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.querylog.location</name>
    <value>/home/hive/log/${user.name}</value>
  </property>
  <property>
    <name>hive.exec.local.scratchdir</name>
    <value>/tmp/hive/scratch/${user.name}</value>
  </property>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://somehostname:9083</value>
  </property>
  <property>
    <name>hive.server2.authentication</name>
    <value>KERBEROS</value>
  </property>
  <property>
    <name>hive.server2.authentication.kerberos.principal</name>
    <value>***</value>
  </property>
  <property>
    <name>hive.server2.authentication.kerberos.keytab</name>
    <value>***</value>
  </property>
  <property>
    <name>hive.server2.thrift.sasl.qop</name>
    <value>auth</value>
    <description>Sasl QOP value; one of 'auth', 'auth-int' and 'auth-conf'</description>
  </property>
  <property>
    <name>hive.server2.enable.impersonation</name>
    <description>Enable user impersonation for HiveServer2</description>
    <value>true</value>
  </property>
  <property>
    <name>hive.metastore.sasl.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.metastore.kerberos.keytab.file</name>
    <value>***</value>
  </property>
  <property>
    <name>hive.metastore.kerberos.principal</name>
    <value>***</value>
  </property>
  <property>
    <name>hive.metastore.cache.pinobjtypes</name>
    <value>Table,Database,Type,FieldSchema,Order</value>
  </property>
  <property>
    <name>hdfs_sentinel_file</name>
    <value>***</value>
  </property>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/hive</value>
  </property>
  <property>
    <name>hive.metastore.client.socket.timeout</name>
    <value>600</value>
  </property>
  <property>
    <name>hive.warehouse.subdir.inherit.perms</name>
    <value>true</value>
  </property>
</configuration>
   






Wrong version on the Spark documentation page

2015-03-15 Thread Cheng Lian

It's still marked as 1.2.1 here http://spark.apache.org/docs/latest/

But this page is updated (1.3.0) 
http://spark.apache.org/docs/latest/index.html


Cheng




Re: Wrong version on the Spark documentation page

2015-03-15 Thread Patrick Wendell
Cheng - what if you hold shift+refresh? For me the /latest link
correctly points to 1.3.0

On Sun, Mar 15, 2015 at 10:40 AM, Cheng Lian  wrote:
> It's still marked as 1.2.1 here http://spark.apache.org/docs/latest/
>
> But this page is updated (1.3.0)
> http://spark.apache.org/docs/latest/index.html
>
> Cheng
>
>




Re: Wrong version on the Spark documentation page

2015-03-15 Thread Ted Yu
When I enter http://spark.apache.org/docs/latest/ into the Chrome address bar,
I see 1.3.0.

Cheers

On Sun, Mar 15, 2015 at 11:12 AM, Patrick Wendell 
wrote:

> Cheng - what if you hold shift+refresh? For me the /latest link
> correctly points to 1.3.0
>
> On Sun, Mar 15, 2015 at 10:40 AM, Cheng Lian 
> wrote:
> > It's still marked as 1.2.1 here http://spark.apache.org/docs/latest/
> >
> > But this page is updated (1.3.0)
> > http://spark.apache.org/docs/latest/index.html
> >
> > Cheng
> >
>
>


Re: SparkSQL 1.3.0 (RC3) failed to read parquet file generated by 1.1.1

2015-03-15 Thread Pei-Lun Lee
Thanks!

On Sat, Mar 14, 2015 at 3:31 AM, Michael Armbrust 
wrote:

> Here is the JIRA: https://issues.apache.org/jira/browse/SPARK-6315
>
> On Thu, Mar 12, 2015 at 11:00 PM, Michael Armbrust  >
> wrote:
>
> > We are looking at the issue and will likely fix it for Spark 1.3.1.
> >
> > On Thu, Mar 12, 2015 at 8:25 PM, giive chen  wrote:
> >
> >> Hi all
> >>
> >> My team has the same issue. It looks like Spark 1.3's Spark SQL cannot read
> >> Parquet files generated by Spark 1.1. It will cost a lot of migration work
> >> when we want to upgrade to Spark 1.3.
> >>
> >> Can anyone help me?
> >>
> >>
> >> Thanks
> >>
> >> Wisely Chen
> >>
> >>
> >> On Tue, Mar 10, 2015 at 5:06 PM, Pei-Lun Lee  wrote:
> >>
> >> > Hi,
> >> >
> >> > I found that if I try to read a Parquet file generated by Spark 1.1.1 using
> >> > 1.3.0-rc3 with default settings, I get this error:
> >> >
> >> > com.fasterxml.jackson.core.JsonParseException: Unrecognized token 'StructType': was expecting ('true', 'false' or 'null')
> >> >  at [Source: StructType(List(StructField(a,IntegerType,false))); line: 1, column: 11]
> >> >   at com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1419)
> >> >   at com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:508)
> >> >   at com.fasterxml.jackson.core.json.ReaderBasedJsonParser._reportInvalidToken(ReaderBasedJsonParser.java:2300)
> >> >   at com.fasterxml.jackson.core.json.ReaderBasedJsonParser._handleOddValue(ReaderBasedJsonParser.java:1459)
> >> >   at com.fasterxml.jackson.core.json.ReaderBasedJsonParser.nextToken(ReaderBasedJsonParser.java:683)
> >> >   at com.fasterxml.jackson.databind.ObjectMapper._initForReading(ObjectMapper.java:3105)
> >> >   at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:3051)
> >> >   at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2161)
> >> >   at org.json4s.jackson.JsonMethods$class.parse(JsonMethods.scala:19)
> >> >   at org.json4s.jackson.JsonMethods$.parse(JsonMethods.scala:44)
> >> >   at org.apache.spark.sql.types.DataType$.fromJson(dataTypes.scala:41)
> >> >   at org.apache.spark.sql.parquet.ParquetRelation2$$anonfun$readSchema$1$$anonfun$25.apply(newParquet.scala:675)
> >> >   at org.apache.spark.sql.parquet.ParquetRelation2$$anonfun$readSchema$1$$anonfun$25.apply(newParquet.scala:675)
> >> >
> >> >
> >> >
> >> > This is how I saved the Parquet file with 1.1.1:
> >> >
> >> > sql("select 1 as a").saveAsParquetFile("/tmp/foo")
> >> >
> >> >
> >> >
> >> > And this is the metadata of the 1.1.1 Parquet file:
> >> >
> >> > creator: parquet-mr version 1.4.3
> >> > extra:   org.apache.spark.sql.parquet.row.metadata =
> >> > StructType(List(StructField(a,IntegerType,false)))
> >> >
> >> >
> >> >
> >> > By comparison, this is the 1.3.0 metadata:
> >> >
> >> > creator: parquet-mr version 1.6.0rc3
> >> > extra:   org.apache.spark.sql.parquet.row.metadata =
> >> > {"type":"struct","fields":[{"name":"a","type":"integer","nullable":t
> >> > [more]...
> >> >
> >> >
> >> >
> >> > It looks like ParquetRelation2 is now used to load Parquet files by default,
> >> > and it only recognizes the JSON schema format, whereas the 1.1.1 schema was
> >> > stored as a case-class string.
> >> >
> >> > Setting spark.sql.parquet.useDataSourceApi to false will fix it, but I
> >> > don't know what the differences are. Is this considered a bug? We have a
> >> > lot of Parquet files from 1.1.1; should we disable the data source API in
> >> > order to read them if we want to upgrade to 1.3?
> >> >
> >> > Thanks,
> >> > --
> >> > Pei-Lun
> >> >
> >>
> >
> >
>
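
For reference, a minimal sketch of the spark.sql.parquet.useDataSourceApi workaround
mentioned above, assuming a Spark 1.3.0-rc3 shell or application with an existing
SparkContext named `sc`:

```
import org.apache.spark.sql.SQLContext

// Sketch of the workaround discussed above: fall back to the old Parquet code path
// so schemas written by Spark 1.1.x (case-class string format) can still be parsed.
val sqlContext = new SQLContext(sc) // assumes an existing SparkContext `sc`
sqlContext.setConf("spark.sql.parquet.useDataSourceApi", "false")
val df = sqlContext.parquetFile("/tmp/foo") // path from the example above
df.printSchema()
```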


broadcast hang out

2015-03-15 Thread lonely Feb
Hi all, I have run into a problem where torrent broadcast hangs in my
Spark cluster (1.2, standalone), and it is particularly serious when the driver
and executors are cross-region. When I read the broadcast code I found
a synchronous block read here:

  def fetchBlockSync(host: String, port: Int, execId: String, blockId: String): ManagedBuffer = {
    // A monitor for the thread to wait on.
    val result = Promise[ManagedBuffer]()
    fetchBlocks(host, port, execId, Array(blockId),
      new BlockFetchingListener {
        override def onBlockFetchFailure(blockId: String, exception: Throwable): Unit = {
          result.failure(exception)
        }
        override def onBlockFetchSuccess(blockId: String, data: ManagedBuffer): Unit = {
          val ret = ByteBuffer.allocate(data.size.toInt)
          ret.put(data.nioByteBuffer())
          ret.flip()
          result.success(new NioManagedBuffer(ret))
        }
      })

    Await.result(result.future, Duration.Inf)
  }

It seems that the fetchBlockSync method does not have a timeout limit but waits
forever. Can anybody show me how to control the timeout here?


Re: broadcast hang out

2015-03-15 Thread Chester Chen
Can you just replace "Duration.Inf" with a shorter duration? How about:

  import scala.concurrent.duration._
  import akka.util.Timeout  // assuming Akka's Timeout here
  val timeout = Timeout(10.seconds)
  Await.result(result.future, timeout.duration)

  or

  import java.util.concurrent.TimeUnit
  import scala.concurrent.duration.FiniteDuration
  val timeout = new FiniteDuration(10, TimeUnit.SECONDS)
  Await.result(result.future, timeout)

  or simply

  import scala.concurrent.duration._
  Await.result(result.future, 10.seconds)



On Sun, Mar 15, 2015 at 8:08 PM, lonely Feb  wrote:

> Hi all, i meet up with a problem that torrent broadcast hang out in my
> spark cluster (1.2, standalone) , particularly serious when driver and
> executors are cross-region. when i read the code of broadcast i found that
> a sync block read here:
>
>   def fetchBlockSync(host: String, port: Int, execId: String, blockId:
> String): ManagedBuffer = {
> // A monitor for the thread to wait on.
> val result = Promise[ManagedBuffer]()
> fetchBlocks(host, port, execId, Array(blockId),
>   new BlockFetchingListener {
> override def onBlockFetchFailure(blockId: String, exception:
> Throwable): Unit = {
>   result.failure(exception)
> }
> override def onBlockFetchSuccess(blockId: String, data:
> ManagedBuffer): Unit = {
>   val ret = ByteBuffer.allocate(data.size.toInt)
>   ret.put(data.nioByteBuffer())
>   ret.flip()
>   result.success(new NioManagedBuffer(ret))
> }
>   })
>
> Await.result(result.future, Duration.Inf)
>   }
>
> it seems that fetchBlockSync method does not have a timeout limit but wait
> forever ? Anybody can show me how to control the timeout here?
>


Re: broadcast hang out

2015-03-15 Thread lonely Feb
Thanks. But this method is in Spark's BlockTransferService.scala, which I cannot
change without rewriting the core code. I wonder if the timeout is already handled
somewhere.
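
If modifying Spark itself is not an option, a rough application-level sketch (plain
Scala, not a Spark API; the helper name and the deadline are illustrative only) is to
bound how long a driver-side action may run and react when the deadline passes, for
example by cancelling the job group:

```
import java.util.concurrent.TimeoutException
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Hypothetical guard: run the blocking Spark action on another thread and stop
// waiting for it after `deadline`. The caller decides what to do next
// (retry, cancel the job group, fail the application, ...).
def runWithDeadline[T](deadline: FiniteDuration)(body: => T): Option[T] =
  try {
    Some(Await.result(Future(body), deadline))
  } catch {
    case _: TimeoutException => None
  }
```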

2015-03-16 11:27 GMT+08:00 Chester Chen :

> can you just replace "Duration.Inf" with a shorter duration  ? how about
>
>   import scala.concurrent.duration._
>   val timeout = new Timeout(10 seconds)
>   Await.result(result.future, timeout.duration)
>
>   or
>
>   val timeout = new FiniteDuration(10, TimeUnit.SECONDS)
>   Await.result(result.future, timeout)
>
>   or simply
>   import scala.concurrent.duration._
>   Await.result(result.future, 10 seconds)
>
>
>
> On Sun, Mar 15, 2015 at 8:08 PM, lonely Feb  wrote:
>
>> Hi all, i meet up with a problem that torrent broadcast hang out in my
>> spark cluster (1.2, standalone) , particularly serious when driver and
>> executors are cross-region. when i read the code of broadcast i found that
>> a sync block read here:
>>
>>   def fetchBlockSync(host: String, port: Int, execId: String, blockId:
>> String): ManagedBuffer = {
>> // A monitor for the thread to wait on.
>> val result = Promise[ManagedBuffer]()
>> fetchBlocks(host, port, execId, Array(blockId),
>>   new BlockFetchingListener {
>> override def onBlockFetchFailure(blockId: String, exception:
>> Throwable): Unit = {
>>   result.failure(exception)
>> }
>> override def onBlockFetchSuccess(blockId: String, data:
>> ManagedBuffer): Unit = {
>>   val ret = ByteBuffer.allocate(data.size.toInt)
>>   ret.put(data.nioByteBuffer())
>>   ret.flip()
>>   result.success(new NioManagedBuffer(ret))
>> }
>>   })
>>
>> Await.result(result.future, Duration.Inf)
>>   }
>>
>> it seems that fetchBlockSync method does not have a timeout limit but wait
>> forever ? Anybody can show me how to control the timeout here?
>>
>
>


Re: broadcast hang out

2015-03-15 Thread Mridul Muralidharan
Cross-region as in different data centers?

- Mridul

On Sun, Mar 15, 2015 at 8:08 PM, lonely Feb  wrote:
> Hi all, i meet up with a problem that torrent broadcast hang out in my
> spark cluster (1.2, standalone) , particularly serious when driver and
> executors are cross-region. when i read the code of broadcast i found that
> a sync block read here:
>
>   def fetchBlockSync(host: String, port: Int, execId: String, blockId:
> String): ManagedBuffer = {
> // A monitor for the thread to wait on.
> val result = Promise[ManagedBuffer]()
> fetchBlocks(host, port, execId, Array(blockId),
>   new BlockFetchingListener {
> override def onBlockFetchFailure(blockId: String, exception:
> Throwable): Unit = {
>   result.failure(exception)
> }
> override def onBlockFetchSuccess(blockId: String, data:
> ManagedBuffer): Unit = {
>   val ret = ByteBuffer.allocate(data.size.toInt)
>   ret.put(data.nioByteBuffer())
>   ret.flip()
>   result.success(new NioManagedBuffer(ret))
> }
>   })
>
> Await.result(result.future, Duration.Inf)
>   }
>
> it seems that fetchBlockSync method does not have a timeout limit but wait
> forever ? Anybody can show me how to control the timeout here?




Re: broadcast hang out

2015-03-15 Thread lonely Feb
yes

2015-03-16 11:43 GMT+08:00 Mridul Muralidharan :

> Cross region as in different data centers ?
>
> - Mridul
>
> On Sun, Mar 15, 2015 at 8:08 PM, lonely Feb  wrote:
> > Hi all, i meet up with a problem that torrent broadcast hang out in my
> > spark cluster (1.2, standalone) , particularly serious when driver and
> > executors are cross-region. when i read the code of broadcast i found
> that
> > a sync block read here:
> >
> >   def fetchBlockSync(host: String, port: Int, execId: String, blockId:
> > String): ManagedBuffer = {
> > // A monitor for the thread to wait on.
> > val result = Promise[ManagedBuffer]()
> > fetchBlocks(host, port, execId, Array(blockId),
> >   new BlockFetchingListener {
> > override def onBlockFetchFailure(blockId: String, exception:
> > Throwable): Unit = {
> >   result.failure(exception)
> > }
> > override def onBlockFetchSuccess(blockId: String, data:
> > ManagedBuffer): Unit = {
> >   val ret = ByteBuffer.allocate(data.size.toInt)
> >   ret.put(data.nioByteBuffer())
> >   ret.flip()
> >   result.success(new NioManagedBuffer(ret))
> > }
> >   })
> >
> > Await.result(result.future, Duration.Inf)
> >   }
> >
> > it seems that fetchBlockSync method does not have a timeout limit but
> wait
> > forever ? Anybody can show me how to control the timeout here?
>


Re: [mllib] Are there any bugs when dividing a Breeze sparse vector in Spark v1.3.0-rc3?

2015-03-15 Thread David Hall
The snapshot is pushed. If you verify it, I'll publish the new artifacts.
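
For anyone who wants to test locally, a hypothetical sbt stanza for building against the
snapshot; the version string and the assumption that it is published to the Sonatype
snapshots repository are guesses, so adjust them to whatever was actually pushed:

```
// Hypothetical build.sbt fragment: depend on the Breeze snapshot for local verification only.
resolvers += "Sonatype Snapshots" at "https://oss.sonatype.org/content/repositories/snapshots/"
libraryDependencies += "org.scalanlp" %% "breeze" % "0.11.2-SNAPSHOT" // version is a guess
```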

On Sun, Mar 15, 2015 at 1:14 AM, Yu Ishikawa 
wrote:

> David Hall who is a breeze creator told me that it's a bug. So, I made a
> jira
> ticket about this issue. We need to upgrade breeze from 0.11.1 to 0.11.2 or
> later in order to fix the bug, when the new version of breeze will be
> released.
>
> [SPARK-6341] Upgrade breeze from 0.11.1 to 0.11.2 or later - ASF JIRA
> https://issues.apache.org/jira/browse/SPARK-6341
>
> Thanks,
> Yu Ishikawa
>
>
>
>
>


Re: broadcast hang out

2015-03-15 Thread lonely Feb
Anyone can help? Thanks a lot !

2015-03-16 11:45 GMT+08:00 lonely Feb :

> yes
>
> 2015-03-16 11:43 GMT+08:00 Mridul Muralidharan :
>
>> Cross region as in different data centers ?
>>
>> - Mridul
>>
>> On Sun, Mar 15, 2015 at 8:08 PM, lonely Feb  wrote:
>> > Hi all, i meet up with a problem that torrent broadcast hang out in my
>> > spark cluster (1.2, standalone) , particularly serious when driver and
>> > executors are cross-region. when i read the code of broadcast i found
>> that
>> > a sync block read here:
>> >
>> >   def fetchBlockSync(host: String, port: Int, execId: String, blockId:
>> > String): ManagedBuffer = {
>> > // A monitor for the thread to wait on.
>> > val result = Promise[ManagedBuffer]()
>> > fetchBlocks(host, port, execId, Array(blockId),
>> >   new BlockFetchingListener {
>> > override def onBlockFetchFailure(blockId: String, exception:
>> > Throwable): Unit = {
>> >   result.failure(exception)
>> > }
>> > override def onBlockFetchSuccess(blockId: String, data:
>> > ManagedBuffer): Unit = {
>> >   val ret = ByteBuffer.allocate(data.size.toInt)
>> >   ret.put(data.nioByteBuffer())
>> >   ret.flip()
>> >   result.success(new NioManagedBuffer(ret))
>> > }
>> >   })
>> >
>> > Await.result(result.future, Duration.Inf)
>> >   }
>> >
>> > it seems that fetchBlockSync method does not have a timeout limit but
>> wait
>> > forever ? Anybody can show me how to control the timeout here?
>>
>
>