Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-04 Thread Kan Zhang
+1 Compiled, ran newly-introduced PySpark Hadoop input/output examples. On Thu, Sep 4, 2014 at 1:10 PM, Egor Pahomov wrote: > +1 > > Compiled, ran on yarn-hadoop-2.3 simple job. > > > 2014-09-04 22:22 GMT+04:00 Henry Saputra : > > > LICENSE and NOTICE files are good > > Hash files are good > >

Re: [VOTE] Release Apache Spark 1.1.0 (RC3)

2014-09-02 Thread Kan Zhang
+1 Verified PySpark InputFormat/OutputFormat examples. On Tue, Sep 2, 2014 at 4:10 PM, Reynold Xin wrote: > +1 > > > On Tue, Sep 2, 2014 at 3:08 PM, Cheng Lian wrote: > > > +1 > > > >- Tested Thrift server and SQL CLI locally on OSX 10.9. > >- Checked datanucleus dependencies in distr

Re: Markdown viewer for the docs

2014-08-18 Thread Kan Zhang
If you are willing to compile it, "The markdown code can be compiled to HTML using the [Jekyll tool](http://jekyllrb.com)." More in docs/README.md. On Mon, Aug 18, 2014 at 9:00 AM, Stephen Boesch wrote: > Which viewer is capable of seeing all of the content in the spark docs > -including the (a

Re: Calling Scala/Java methods which operates on RDD

2014-07-11 Thread Kan Zhang
Hi Jai, Your suspicion is correct. In general, Python RDDs are pickled into byte arrays and stored in Java land as RDDs of byte arrays. union/zip operate on the byte arrays directly, without deserializing them. Currently, Python byte arrays only get unpickled into Java objects in special cases, like SQL fu
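
A minimal sketch of the idea described above, in plain Python and without Spark: each record is pickled into a byte array, and union/zip-style operations only move those opaque byte arrays around, never unpickling them. Plain lists stand in for RDDs here, and the helper names (`to_pickled`, `union_bytes`, `zip_bytes`) are hypothetical, chosen for illustration only; this is not how Spark itself implements these operations.

```python
import pickle


def to_pickled(records):
    """Pickle each record into a bytes object (a stand-in for the byte
    arrays that hold a Python RDD's elements on the Java side)."""
    return [pickle.dumps(r) for r in records]


def union_bytes(left, right):
    """Concatenate two collections of byte arrays without unpickling."""
    return left + right


def zip_bytes(left, right):
    """Pair up byte arrays positionally, again without unpickling."""
    return list(zip(left, right))


a = to_pickled([1, "two", {"three": 3}])
b = to_pickled([4.0, None, (5, 6)])

combined = union_bytes(a, b)   # still only bytes objects
pairs = zip_bytes(a, b)        # tuples of (bytes, bytes)

# Deserialization happens only when a consumer actually needs the values.
decoded = [pickle.loads(x) for x in combined]
```

Because `union_bytes` and `zip_bytes` never inspect the payload, a consumer that expects deserialized objects rather than raw bytes would have to unpickle the elements itself.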

Re: Add my JIRA username (hsaputra) to Spark's contributor's list

2014-06-03 Thread Kan Zhang
Same here please, username (kzhang). Thanks! On Tue, Jun 3, 2014 at 11:39 AM, Henry Saputra wrote: > Thanks Matei! > > - Henry > > On Tue, Jun 3, 2014 at 11:36 AM, Matei Zaharia > wrote: > > Done. Looks like this was lost in the JIRA import. > > > > Matei > > > > On Jun 3, 2014, at 11:33 AM, H

Re: Why does spark REPL not embed scala REPL?

2014-05-30 Thread Kan Zhang
One reason is that the standard Scala REPL uses object-based wrappers, and their static initializers will be run on remote worker nodes, which may fail due to differences between driver and worker nodes. See the discussion here: https://groups.google.com/d/msg/scala-internals/h27CFLoJXjE/JoobM6NiUMQJ On Fri, Ma

Re: [VOTE] Release Apache Spark 1.0.0 (rc5)

2014-05-17 Thread Kan Zhang
+1 on the running commentary here, non-binding of course :-) On Sat, May 17, 2014 at 8:44 AM, Andrew Ash wrote: > +1 on the next release feeling more like a 0.10 than a 1.0 > On May 17, 2014 4:38 AM, "Mridul Muralidharan" wrote: > > > I had echoed similar sentiments a while back when there was