Re: Building spark 1.2 from source requires more dependencies

2015-03-27 Thread Sean Owen
This is not a compile error in the code, but an error from the scalac compiler itself. That is, the code and build are fine, but scalac is failing while compiling it. Usually when this happens, a clean build fixes it. On Fri, Mar 27, 2015 at 7:09 PM, Pala M Muthaia wrote: > No, I am running from the root directory, parent of

Re: Iterative pyspark / scala codebase development

2015-03-27 Thread Davies Liu
On Fri, Mar 27, 2015 at 4:16 PM, Stephen Boesch wrote: > Thx much! This works. > > My workflow is making changes to files in IntelliJ and running ipython to > execute pyspark. > > Is there any way for IPython to "see" the updated class files without first > exiting? No, the IPython shell is stateful,

Re: RDD.count

2015-03-27 Thread Sean Owen
I assume it is because map() could have side effects, even if that's not generally a good idea. The expectation, or contract, is that it is still invoked. In this program the caller could also call count() on the parent. On Mar 28, 2015 1:00 AM, "jimfcarroll" wrote: > Hi all, > > I was wondering why the

RDD.count

2015-03-27 Thread jimfcarroll
Hi all, I was wondering why the RDD.count call recomputes the RDD in all cases? In most cases it can simply ask the next dependent RDD. I have several RDD implementations and was surprised to see a call like the following never call my RDD's count method but instead recompute/traverse the entire d
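A minimal sketch (not from the thread, with made-up object and app names) of the behavior being asked about, using the Spark 1.x API: calling count() on a derived RDD runs a job over the whole lineage rather than delegating to an upstream count, and the map function really is invoked for every element, as the accumulator shows.

    import org.apache.spark.{SparkConf, SparkContext}

    object CountSemantics {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setAppName("count-semantics").setMaster("local[2]"))

        val invocations = sc.accumulator(0)   // counts how many times the map function runs

        val parent = sc.parallelize(1 to 1000, 4)
        val mapped = parent.map { x => invocations += 1; x * 2 }

        // count() on the derived RDD triggers a full job over its lineage; it does not
        // simply ask the parent for its size, because the map function may have side
        // effects (observable here through the accumulator).
        println(mapped.count())      // 1000
        println(invocations.value)   // 1000: map was invoked once per element

        sc.stop()
      }
    }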

Re: Support for Hive 0.14 in secure mode on hadoop 2.6.0

2015-03-27 Thread Zhan Zhang
Hi Doug, SPARK-5111 is to make Spark work with a secure Hadoop cluster on 2.6. There is a compatibility issue which needs the SPARK-5111 patch. In an insecure cluster, current Spark can connect to Hive 0.14 without problems. By the way, I am really glad to hear that "an adaption layer in Spar

Re: Iterative pyspark / scala codebase development

2015-03-27 Thread Stephen Boesch
Thx much! This works. My workflow is making changes to files in IntelliJ and running ipython to execute pyspark. Is there any way for IPython to "see" the updated class files without first exiting? 2015-03-27 10:21 GMT-07:00 Davies Liu : > put these lines in your ~/.bash_profile > > export SPARK

Re: Support for Hive 0.14 in secure mode on hadoop 2.6.0

2015-03-27 Thread Doug Balog
Is there a JIRA for this adaption layer? It sounds like a better long-term solution. If anybody knows what is required to get the current shim layer working with Hive 0.14, please post what you know. I’m willing to spend some time on it, but I’m still learning how things fit together and it mig

Re: LogisticGradient Design

2015-03-27 Thread Joseph Bradley
Makes sense! On Wed, Mar 25, 2015 at 2:46 PM, Debasish Das wrote: > Cool...Thanks...It will be great if they move in two code paths just for > the sake of code clean-up > > On Wed, Mar 25, 2015 at 2:37 PM, DB Tsai wrote: > >> I did the benchmark when I used the if-else statement to switch the >

Re: Building spark 1.2 from source requires more dependencies

2015-03-27 Thread Pala M Muthaia
No, I am running from the root directory, parent of core. Here is the first set of errors that I see when I compile from source (sorry the error message is very long, but adding it in case it helps in diagnosis). After I manually add the javax.servlet dependency for version 3.0, this set of errors g
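For reference, a hypothetical sketch (not from the thread) of what manually pinning the servlet API could look like in an sbt build definition; the exact artifact, version, and scope used in the original build are not shown in the message:

    // Hypothetical sbt setting; the message only says "javax.servlet dependency for version 3.0".
    libraryDependencies += "javax.servlet" % "javax.servlet-api" % "3.0.1" % "provided"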

Re: Iterative pyspark / scala codebase development

2015-03-27 Thread Davies Liu
put these lines in your ~/.bash_profile

    export SPARK_PREPEND_CLASSES=true
    export SPARK_HOME=path_to_spark
    export PYTHONPATH="${SPARK_HOME}/python/lib/py4j-0.8.2.1-src.zip:${SPARK_HOME}/python:${PYTHONPATH}"

    $ source ~/.bash_profile
    $ build/sbt assembly
    $ build/sbt ~compile  # do not stop this T

Re: Iterative pyspark / scala codebase development

2015-03-27 Thread Stephen Boesch
Compile alone did not show the scala code changes AFAICT. I will reverify. 2015-03-27 10:16 GMT-07:00 Davies Liu : > I usually just open a terminal to do `build/sbt ~compile`, coding in > IntelliJ, then run python tests in another terminal once it compiled > successfully. > > On Fri, Mar 27, 2015

Re: Iterative pyspark / scala codebase development

2015-03-27 Thread Davies Liu
I usually just open a terminal to do `build/sbt ~compile`, coding in IntelliJ, then run python tests in another terminal once it compiled successfully. On Fri, Mar 27, 2015 at 10:11 AM, Reynold Xin wrote: > Python is tough if you need to change Scala at the same time. > > sbt/sbt assembly/assembl

Re: Iterative pyspark / scala codebase development

2015-03-27 Thread Davies Liu
see https://cwiki.apache.org/confluence/display/SPARK/Useful+Developer+Tools On Fri, Mar 27, 2015 at 10:02 AM, Stephen Boesch wrote: > I am iteratively making changes to the scala side of some new pyspark code > and re-testing from the python/pyspark side. > > Presently my only solution is to reb

Re: Iterative pyspark / scala codebase development

2015-03-27 Thread Reynold Xin
Python is tough if you need to change Scala at the same time. sbt/sbt assembly/assembly can be slightly faster than just assembly. On Fri, Mar 27, 2015 at 10:02 AM, Stephen Boesch wrote: > I am iteratively making changes to the scala side of some new pyspark code > and re-testing from the pyt

Iterative pyspark / scala codebase development

2015-03-27 Thread Stephen Boesch
I am iteratively making changes to the Scala side of some new pyspark code and re-testing from the python/pyspark side. Presently my only solution is a complete rebuild (sbt assembly) after any Scala-side change, no matter how small. Any better / expedited way for pyspark to see small s

Re: Building spark 1.2 from source requires more dependencies

2015-03-27 Thread Sean Owen
I built from the head of branch-1.2 and spark-core compiled correctly with your exact command. You have something wrong with how you are building. For example, you're not trying to run this from the core subdirectory, are you? On Thu, Mar 26, 2015 at 10:36 PM, Pala M Muthaia wrote: > Hi, > > We ar

Re: Support for Hive 0.14 in secure mode on hadoop 2.6.0

2015-03-27 Thread Cheng Lian
We're planning to replace the current Hive version profiles and shim layer with an adaption layer in Spark SQL in 1.4. This adaption layer allows Spark SQL to connect to any Hive version greater than or equal to 0.12.0 (or maybe 0.13.1, not decided yet). However, it's not a promise yet,

Support for Hive 0.14 in secure mode on hadoop 2.6.0

2015-03-27 Thread Doug Balog
Hi, I'm just wondering if anybody is working on supporting Hive 0.14 in secure mode on Hadoop 2.6.0? I see one JIRA referring to it, https://issues.apache.org/jira/browse/SPARK-5111, but it mentions no effort to move to 0.14. Thanks, Doug -