Re: PySpark on PyPi

2015-06-05 Thread Jey Kottalam
Couldn't we have a pip-installable "pyspark" package that just serves as a shim to an existing Spark installation? Or it could even download the latest Spark binary if SPARK_HOME isn't set during installation. Right now, Spark doesn't play very well with the usual Python ecosystem. For example, why …

Re: Experience using binary packages on various Hadoop distros

2015-03-24 Thread Jey Kottalam
Could we gracefully fall back to an in-tree Hadoop binary (e.g. 1.0.4) in that case? I think many new Spark users are confused about why Spark has anything to do with Hadoop; e.g., I could see myself being confused when the download page asks me to select a "package type". I know that what I want is …

Re: Making RDDs Covariant

2014-03-21 Thread Jey Kottalam
That would be awesome. I support this! On Fri, Mar 21, 2014 at 7:28 PM, Michael Armbrust wrote: > Hey Everyone, > > Here is a pretty major (but source compatible) change we are considering > making to the RDD API for 1.0. Java and Python APIs would remain the same, > but users of Scala would lik…
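The proposal under discussion concerns Scala's RDD[T], where covariance means an RDD[Dog] can be used wherever an RDD[Animal] is expected. As a rough analogue only (the real change is in Scala, and this toy RDD class is invented for illustration), the same idea can be expressed with a covariant TypeVar in Python's typing module:

```python
from typing import Generic, TypeVar

# Declaration-site covariance: T_co may only appear in "output" positions.
T_co = TypeVar("T_co", covariant=True)


class RDD(Generic[T_co]):
    """Toy stand-in for Spark's RDD, used only to illustrate variance."""

    def __init__(self, items):
        self.items = list(items)


class Animal: ...
class Dog(Animal): ...


def count_animals(rdd: RDD[Animal]) -> int:
    return len(rdd.items)


# Because RDD is covariant in T_co, a type checker accepts RDD[Dog]
# where RDD[Animal] is required (at runtime Python allows it regardless).
dogs: RDD[Dog] = RDD([Dog(), Dog()])
print(count_animals(dogs))  # → 2
```

The thread's claim that the change is source-compatible matches this picture: call sites that already pass the "more specific" RDD simply start type-checking, while existing well-typed code is unaffected.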