pyspark and R
On Mon, Apr 4, 2016 at 9:59 PM, Marcelo Vanzin wrote:
> No, tests (except pyspark) should work without having to package anything
> first.
>
> On Mon, Apr 4, 2016 at 9:58 PM, Koert Kuipers wrote:
> > do I need to run sbt package before doing tests?
> >
> > On Mon, Apr 4, 2016 at 1
No, tests (except pyspark) should work without having to package anything first.
On Mon, Apr 4, 2016 at 9:58 PM, Koert Kuipers wrote:
> do I need to run sbt package before doing tests?
>
> On Mon, Apr 4, 2016 at 11:00 PM, Marcelo Vanzin wrote:
>>
>> Hey all,
>>
>> We merged SPARK-13579 today, a
do I need to run sbt package before doing tests?
On Mon, Apr 4, 2016 at 11:00 PM, Marcelo Vanzin wrote:
> Hey all,
>
> We merged SPARK-13579 today, and if you're like me and have your
> hands automatically type "sbt assembly" anytime you're building Spark,
> that won't work anymore.
>
> You sho
Nope, I didn't have a chance to track the root cause, and IIRC we didn't
observe it when dyn. alloc. is off.
On Mon, Apr 4, 2016 at 6:16 PM Reynold Xin wrote:
> BTW do you still see this when dynamic allocation is off?
>
> On Mon, Apr 4, 2016 at 6:16 PM, Reynold Xin wrote:
>
>> Nezih,
>>
>> Hav
Hey all,
We merged SPARK-13579 today, and if you're like me and have your
hands automatically type "sbt assembly" anytime you're building Spark,
that won't work anymore.
You should now use "sbt package"; you'll still need "sbt assembly" if
you require one of the remaining assemblies (streaming c
can you try:
spark.shuffle.reduceLocality.enabled=false
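For anyone following along, here is a minimal sketch of setting that flag from code; the app name and local master below are placeholders, and passing --conf spark.shuffle.reduceLocality.enabled=false to spark-submit works just as well:

import org.apache.spark.{SparkConf, SparkContext}

// Sketch: turn off reduce-task locality preferences for shuffle reads.
val conf = new SparkConf()
  .setAppName("shuffle-locality-repro")   // placeholder app name
  .setMaster("local[*]")                  // placeholder; use your real master
  .set("spark.shuffle.reduceLocality.enabled", "false")
val sc = new SparkContext(conf)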
On Mon, Apr 4, 2016 at 8:17 PM, Mike Hynes <91m...@gmail.com> wrote:
> Dear all,
>
> Thank you for your responses.
>
> Michael Slavitch:
> Just to be sure: Have spark-env.sh and spark-defaults.conf been
> correctly propagated to all nodes?
Nezih,
Have you had a chance to figure out why this is happening?
On Tue, Mar 22, 2016 at 1:32 AM, james wrote:
> I guess different workloads cause different results?
>
>
>
> --
> View this message in context:
> http://apache-spark-developers-list.1001551.n3.nabble.com/java-lang-OutOfMemoryError-Una
BTW do you still see this when dynamic allocation is off?
On Mon, Apr 4, 2016 at 6:16 PM, Reynold Xin wrote:
> Nezih,
>
> Have you had a chance to figure out why this is happening?
>
>
> On Tue, Mar 22, 2016 at 1:32 AM, james wrote:
>
>> I guess different workloads cause different results?
>>
>>
>>
>
Looks like the import comes from
repl/scala-2.11/src/main/scala/org/apache/spark/repl/SparkILoop.scala:
processLine("import sqlContext.sql")
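Which is why a bare sql(...) call works in a freshly started spark-shell without any explicit import; a trivial check (nothing beyond the shell itself is assumed):

// In spark-shell only: sql is already in scope thanks to the REPL's
// "import sqlContext.sql", so no import statement is needed here.
sql("select 1 as x").show()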
On Mon, Apr 4, 2016 at 5:16 PM, Jacek Laskowski wrote:
> Hi Spark devs,
>
> I'm unsure if what I'm seeing is correct. I'd appreciate any input
> to
Dear all,
Thank you for your responses.
Michael Slavitch:
> Just to be sure: Have spark-env.sh and spark-defaults.conf been correctly
> propagated to all nodes? Are they identical?
Yes; these files are stored on a shared memory directory accessible to
all nodes.
Koert Kuipers:
> we ran into si
Hi Spark devs,
I'm unsure if what I'm seeing is correct. I'd appreciate any input
to...rest my nerves :-) I did `import org.apache.spark._` by mistake,
but since it's valid, I'm wondering why the Spark shell imports sql
at all, since it's available after the import?!
(it's today's build)
scala>
Thanks, that was the command. :thumbsup:
On Mon, Apr 4, 2016 at 6:28 PM Jakob Odersky wrote:
> I just found out how the hash is calculated:
>
> gpg --print-md sha512 <file>.tgz
>
> you can use that to check if the resulting output matches the contents
> of <file>.tgz.sha
>
> On Mon, Apr 4, 2016 at 3:19 PM,
I just found out how the hash is calculated:
gpg --print-md sha512 <file>.tgz
you can use that to check if the resulting output matches the contents
of <file>.tgz.sha
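If you'd rather not depend on gpg's grouped output format at all, here is a minimal plain-JVM sketch that computes the SHA-512 of a downloaded archive as continuous lowercase hex (the same format sha512sum prints) for manual comparison against the published hash; the file name is a placeholder:

import java.nio.file.{Files, Paths}
import java.security.MessageDigest

// Sketch: read the whole archive into memory (fine for a few hundred MB)
// and print its SHA-512 digest as lowercase hex.
val bytes  = Files.readAllBytes(Paths.get("spark-1.6.1-bin-hadoop2.6.tgz"))  // placeholder path
val sha512 = MessageDigest.getInstance("SHA-512").digest(bytes)
println(sha512.map("%02x".format(_)).mkString)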
On Mon, Apr 4, 2016 at 3:19 PM, Jakob Odersky wrote:
> The published hash is a SHA512.
>
> You can verify the integrity of the packages by r
The published hash is a SHA512.
You can verify the integrity of the packages by running `sha512sum` on
the archive and comparing the computed hash with the published one.
Unfortunately however, I don't know what tool is used to generate the
hash and I can't reproduce the format, so I ended up manu
Curveball: Is there a need to use lambdas quite yet?
On Mon, Apr 4, 2016 at 10:58 PM, Ofir Manor wrote:
> I think that a backup plan could be to announce that JDK7 is deprecated in
> Spark 2.0 and support for it will be fully removed in Spark 2.1. This gives
> admins enough warning to install JDK
An additional note: The Spark packages being served off of CloudFront (i.e.
the “direct download” option on spark.apache.org) are also corrupt.
Btw what’s the correct way to verify the SHA of a Spark package? I’ve tried
a few commands on working packages downloaded from Apache mirrors, but I
can’t
I think that a backup plan could be to announce that JDK7 is deprecated in
Spark 2.0 and support for it will be fully removed in Spark 2.1. This gives
admins enough warning to install JDK8 alongside their "main" JDK (or fully
migrate to it), while allowing the project to merge JDK8-specific change
It's possible this was caused by incorrect Graph creation, fixed in
[SPARK-13355].
Could you retry your dataset using the current master to see if the problem
is fixed? Thanks!
On Tue, Jan 19, 2016 at 5:31 AM, Li Li wrote:
> I have modified my codes. I can get the total vocabulary size and
> i
It is called groupByKey now. Similar to joinWith, the schema produced by
relational joins and aggregations is different than what you would expect
when working with objects. So, when combining DataFrame+Dataset we renamed
these functions to make this distinction clearer.
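A minimal sketch of the distinction, assuming a Spark 2.0-style session named spark and a throwaway case class; both are placeholders rather than anything from this thread:

import spark.implicits._
case class Person(name: String, age: Int)   // hypothetical example class

val people = Seq(Person("a", 30), Person("b", 30), Person("c", 40)).toDS()

// Object-style grouping: the key is computed from each element, and the
// grouped values keep their element type for later mapGroups, count, etc.
val byAge = people.groupByKey(_.age).count()      // Dataset[(Int, Long)]

// Relational grouping on the same data yields a column-based schema instead.
val relational = people.groupBy("age").count()    // DataFrame with columns [age, count]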
On Sun, Apr 3, 2016 at 1
Thanks to all who have responded.
It turned out that the following maven command line caused the error (I
forgot to include this in my first email):
eclipse:eclipse
Once I omitted the above, 'explain codegen' works.
On Mon, Apr 4, 2016 at 9:37 AM, Reynold Xin wrote:
> Why don't you wipe every
Why don't you wipe everything out and try again?
On Monday, April 4, 2016, Ted Yu wrote:
> The commit you mentioned was made on Friday.
> I refreshed my workspace on Sunday, so it was included.
>
> Maybe this was related:
>
> $ bin/spark-shell
> Failed to find Spark jars directory
> (/home/hbase/spark/a
we ran into similar issues and it seems related to the new memory
management. can you try:
spark.memory.useLegacyMode = true
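For reference, a minimal sketch of what that looks like from code; the same key can equally go into spark-defaults.conf. The two extra fractions are the old pre-1.6 knobs that are only read when legacy mode is on, shown here with their former defaults for illustration:

import org.apache.spark.SparkConf

// Sketch: fall back to the pre-1.6 static memory manager.
val conf = new SparkConf()
  .set("spark.memory.useLegacyMode", "true")
  // The deprecated settings below are only honored when legacy mode is enabled
  // (0.2 and 0.6 were the old defaults; adjust as needed):
  .set("spark.shuffle.memoryFraction", "0.2")
  .set("spark.storage.memoryFraction", "0.6")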
On Mon, Apr 4, 2016 at 9:12 AM, Mike Hynes <91m...@gmail.com> wrote:
> [ CC'ing dev list since nearly identical questions have occurred in
> user list recently w/o resoluti
Reynold,
Considering the performance improvements you mentioned in your original
e-mail, and also considering that a few other big data projects have already
abandoned JDK 7 or are in the process of doing so, I think it would benefit
Spark if we go with JDK 8.0 only.
Are there users that will be less aggressiv
Maybe temporarily take out the artifacts on S3 before the root cause is
found.
On Thu, Mar 24, 2016 at 7:25 AM, Nicholas Chammas <nicholas.cham...@gmail.com> wrote:
> Just checking in on this again as the builds on S3 are still broken. :/
>
> Could it have something to do with us moving release-
Thanks.
Of course, I verified the checksum and it didn't match.
Kousuke
On 2016/04/05 0:39, Jitendra Shelar wrote:
We can think of using checksums for this kind of issue.
On Mon, Apr 4, 2016 at 8:32 PM, Kousuke Saruta <saru...@oss.nttdata.co.jp> wrote:
Oh, I overlooked that. Thank
Many open source projects are aggressive, such as Oracle JDK and Ubuntu, but
they provide stable commercial support.
In other words, enterprises that don't drop JDK7 might also not drop Spark
1.x to adopt an early Spark 2.x version.
On Sun, Apr 3, 2016 at 10:29 PM -0700, "Reynold Xi
We can think of using checksums for this kind of issue.
On Mon, Apr 4, 2016 at 8:32 PM, Kousuke Saruta
wrote:
> Oh, I overlooked that. Thanks.
>
> Kousuke
>
>
> On 2016/04/04 22:58, Nicholas Chammas wrote:
>
> This is still an issue. The Spark 1.6.1 packages on S3 are corrupt.
>
> Is anyone loo
Oh, I overlooked that. Thanks.
Kousuke
On 2016/04/04 22:58, Nicholas Chammas wrote:
This is still an issue. The Spark 1.6.1 packages on S3 are corrupt.
Is anyone looking into this issue? Is there anything contributors can
do to help solve this problem?
Nick
On Sun, Mar 27, 2016 at 8:49 PM
bq. the modifications do not touch the scheduler
If the changes can be ported over to 1.6.1, do you mind reproducing the
issue there?
I ask because the master branch changes very fast. It would be good to narrow
down where the behavior you observed first started showing up.
On Mon, Apr 4, 2016 at 6:12
Just to be sure: Have spark-env.sh and spark-defaults.conf been correctly
propagated to all nodes? Are they identical?
> On Apr 4, 2016, at 9:12 AM, Mike Hynes <91m...@gmail.com> wrote:
>
> [ CC'ing dev list since nearly identical questions have occurred in
> user list recently w/o resolution;
The commit you mentioned was made on Friday.
I refreshed my workspace on Sunday, so it was included.
Maybe this was related:
$ bin/spark-shell
Failed to find Spark jars directory
(/home/hbase/spark/assembly/target/scala-2.10).
You need to build Spark before running this program.
Then I did:
$ ln -s /ho
This is still an issue. The Spark 1.6.1 packages on S3 are corrupt.
Is anyone looking into this issue? Is there anything contributors can do to
help solve this problem?
Nick
On Sun, Mar 27, 2016 at 8:49 PM Nicholas Chammas
wrote:
> Pingity-ping-pong since this is still a problem.
>
>
> On Thu,
[ CC'ing dev list since nearly identical questions have occurred in
user list recently w/o resolution;
c.f.:
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-work-distribution-among-execs-tt26502.html
http://apache-spark-user-list.1001560.n3.nabble.com/Partitions-are-get-placed-on-the-sing
Hi all,
I noticed the pre-built binaries for Hadoop 2.6, which we can download from
spark.apache.org/downloads.html (Direct Download), may be broken.
I couldn't decompress at least the following 4 tgzs with the "tar xfzv"
command, and the md5 checksums didn't match.
* spark-1.6.1-bin-hadoop2.6.tgz
* spark-1.6.1-bin
No, it can't. You only need implicits when you are using the catalyst DSL.
The error you get is due to the fact that the parser does not recognize the
CODEGEN keyword (which was the case before we introduced this in
https://github.com/apache/spark/commit/fa1af0aff7bde9bbf7bfa6a3ac74699734c2fd8a).
Could the error I encountered be due to missing import(s) of implicits?
Thanks
On Sun, Apr 3, 2016 at 9:42 PM, Reynold Xin wrote:
> Works for me on latest master.
>
>
>
> scala> sql("explain codegen select 'a' as a group by 1").head
> res3: org.apache.spark.sql.Row =
> [Found 2 WholeStageCodege