This might not be the easiest way, but it's pretty easy: you can use
Row(field_1, ..., field_n) as a pattern in a case match. So if you have a data
frame with foo as an int column and bar as a String column and you want to
construct instances of a case class that wraps these up, you can do so with a
pattern match.
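Concretely, a minimal sketch of that pattern (this assumes a Spark version where Row pattern matching and an RDD-of-Row view are available; `df`, the column names, and `FooBar` are made-up names for illustration):

```scala
import org.apache.spark.sql.Row

// Hypothetical case class wrapping the two columns from the example
case class FooBar(foo: Int, bar: String)

// `df` is assumed to be a data frame with an int column `foo` and a
// string column `bar`; Row(...) destructures each row in the match
val wrapped = df.rdd.map {
  case Row(foo: Int, bar: String) => FooBar(foo, bar)
}
```

The match will throw at runtime if a row's types don't line up, so you may want a catch-all case in practice.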
- Original Message -
> From: "Patrick Wendell"
> To: "Sean Owen"
> Cc: "dev" , "jay vyas" ,
> "Paolo Platter"
> , "Nicholas Chammas" ,
> "Will Benton"
> Sent: Wednesday, January 21, 2015 2:09:3
Hey Nick,
I did something similar with a Docker image last summer; I haven't updated the
images to cache the dependencies for the current Spark master, but it would be
trivial to do so:
http://chapeau.freevariable.com/2014/08/jvm-test-docker.html
best,
wb
- Original Message -
> From
It's declared here:
https://github.com/apache/spark/blob/master/core/src/test/scala/org/apache/spark/LocalSparkContext.scala
I assume you're already importing LocalSparkContext, but since the test classes
aren't included in Spark packages, you'll also need to package them up in order
to use
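One way to get those test classes onto your classpath is to depend on Spark's published test artifacts; a sketch for an sbt build (the version string is an assumption; match it to your Spark):

```scala
// build.sbt fragment: the "tests" classifier artifact of spark-core
// contains LocalSparkContext and the other test helpers
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.2.0",
  "org.apache.spark" %% "spark-core" % "1.2.0" % "test" classifier "tests"
)
```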
I'll chime in as yet another user who is extremely happy with sbt and a text
editor. (In my experience, running "ack" from the command line is usually just
as easy and fast as using an IDE's find-in-project facility.) You can, of
course, extend editors with Scala-specific IDE-like functionality.
spark/pull/1567
As far as windowing, I'll be developing my own test cases but would appreciate
it if you could also share some kinds of queries you're interested in so that I
can incorporate them as well.
best,
wb
- Original Message -
> From: "Yi Tian"
>
Hi Yi,
I've had some interest in implementing windowing and rollup in particular for
some of my applications but haven't had them on the front of my plate yet. If
you need them as well, I'm happy to start taking a look this week.
best,
wb
- Original Message -
> From: "Yi Tian"
> To
+1
Tested Scala/MLlib apps on Fedora 20 (OpenJDK 7) and OS X 10.9 (Oracle JDK 8).
best,
wb
- Original Message -
> From: "Patrick Wendell"
> To: dev@spark.apache.org
> Sent: Saturday, August 30, 2014 5:07:52 PM
> Subject: [VOTE] Release Apache Spark 1.1.0 (RC3)
>
> Please vote on rele
I can't reproduce those now but will take another look later this
week.
best,
wb
- Original Message -
> From: "Sean Owen"
> To: "Will Benton"
> Cc: "Patrick Wendell" , dev@spark.apache.org
> Sent: Sunday, August 31, 2014 12:18:42 PM
>
- Original Message -
> dev/run-tests fails two tests (1 Hive, 1 Kafka Streaming) for me
> locally on 1.1.0-rc3. Does anyone else see that? It may be my env.
> Although I still see the Hive failure on Debian too:
>
> [info] - SET commands semantics for a HiveContext *** FAILED ***
> [info]
Hi all,
What's the preferred environment for generating golden test outputs for new
Hive tests? In particular:
* what Hadoop version and Hive version should I be using,
* are there particular distributions people have run successfully, and
* are there any system properties or environment variables I need to set?
tests with YourKit (or something else)
>
> Would you mind filing a JIRA for this? That does sound like something bogus
> happening on the JVM/YourKit level, but this sort of diagnosis is
> sufficiently important that we should be resilient against it.
>
>
> On Mon,
- Original Message -
> From: "Aaron Davidson"
> To: dev@spark.apache.org
> Sent: Monday, July 14, 2014 5:21:10 PM
> Subject: Re: Profiling Spark tests with YourKit (or something else)
>
> Out of curiosity, what problems are you seeing with Utils.getCallSite?
Aaron, if I enable call site
> can do this in the SBT build file). Maybe they are very close to full and
> profiling pushes them over the edge.
>
> Matei
>
> On Jul 14, 2014, at 9:51 AM, Will Benton wrote:
>
> > Hi all,
> >
> > I've been evaluating YourKit and would like to
Hi all,
I've been evaluating YourKit and would like to profile the heap and CPU usage
of certain tests from the Spark test suite. In particular, I'm very interested
in tracking heap usage by allocation site. Unfortunately, I get a lot of
crashes running Spark tests with profiling (and thus al
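For reference, the way I'm wiring the profiler in is just a JVM agent flag on the forked test JVMs (a sketch for the sbt build; the install path is an assumption, and YourKit's startup options vary by version):

```scala
// build.sbt fragment: attach the YourKit agent to forked test JVMs,
// starting in CPU-sampling mode
javaOptions in Test += "-agentpath:/opt/yourkit/bin/linux-x86-64/libyjpagent.so=sampling"
```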
Hi all,
I was testing an addition to Catalyst today (reimplementing a Hive UDF) and ran
into some odd failures in the test suite. In particular, it seems that what
most of these have in common is that an array is spuriously reversed somewhere.
For example, the stddev tests in the HiveCompatib
Hey, sorry to reanimate this thread, but just a quick question: why do the
examples (on http://spark.apache.org/examples.html) use "spark" for the
SparkContext reference? This is minor, but it seems like it could be a little
confusing for people who want to run them in the shell and need to change it.
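For anyone bitten by this in the meantime: the shell pre-binds the context under the name `sc`, so an example written against a reference named `spark` can be adapted with a one-line alias (a sketch; it assumes nothing else is bound to `spark` in your session):

```scala
// In spark-shell the SparkContext is pre-bound as `sc`; aliasing it
// lets examples written against `spark` run unchanged
val spark = sc
val counts = spark.textFile("README.md")
  .flatMap(_.split(" "))
  .map((_, 1))
  .reduceByKey(_ + _)
```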
> I assume you are adding tests? because that is the only time you should
> see that message.
Yes, I had added the HAVING test to the whitelist.
> That error could mean a couple of things:
> 1) The query is invalid and hive threw an exception
> 2) Your Hive setup is bad.
>
> Regarding #2, you
Hi all,
Does a "Failed to generate golden answer for query" message from
HiveComparisonTests indicate that it isn't possible to run the query in
question under Hive from Spark's test suite, rather than anything about Spark's
implementation of HiveQL? The stack trace I'm getting implicates Hive
This is an interesting approach, Nilesh!
Someone will correct me if I'm wrong, but I don't think this could go into
ClosureCleaner as a default behavior (since Kryo apparently breaks on some
classes that depend on custom Java serializers, as has come up on the list
recently). But it does seem
Friends,
For context (so to speak), I did some work in the 0.9 timeframe to fix
SPARK-897 (provide immediate feedback when closures aren't serializable) and
SPARK-729 (make sure that free variables in closures are captured when the RDD
transformations are declared).
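For anyone unfamiliar with the symptom behind SPARK-897: the underlying check is just Java serialization of the closure object, and the fix is to run it eagerly, when the transformation is declared, instead of later on a worker. A standalone sketch of that check (no Spark required; `ensureSerializable` and `Handle` are made-up names):

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

object ClosureCheck {
  // Mimics the immediate-feedback check: attempt to serialize a closure
  // at declaration time rather than failing later on a worker
  def ensureSerializable(f: AnyRef): Unit = {
    val oos = new ObjectOutputStream(new ByteArrayOutputStream())
    oos.writeObject(f) // throws NotSerializableException on bad captures
  }

  class Handle // deliberately not Serializable

  def main(args: Array[String]): Unit = {
    val h = new Handle
    // The closure body references `h`, so `h` is captured and must serialize
    val bad = (x: Int) => { val _ = h; x + 1 }
    try {
      ensureSerializable(bad)
      println("serializable")
    } catch {
      case _: NotSerializableException => println("not serializable")
    }
  }
}
```

Running this prints "not serializable", which is exactly the feedback we'd like Spark to surface at the point where the closure is declared.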
I currently have a branch
+1
I made the necessary interface changes to my apps that use MLlib and tested all
of my code against rc11 on Fedora 20 and OS X 10.9.3. (The Fedora Rawhide
package remains at 0.9.1 pending some additional dependency packaging work.)
best,
wb
- Original Message -
> From: "Tathagata
RC3 works with the applications I'm working on now and MLlib performance is
indeed perceptibly improved over 0.9.0 (although I haven't done a real
evaluation). Also, from the downstream perspective, I've been tracking the
0.9.1 RCs in Fedora and have no issues to report there either:
http:/
- Original Message -
> At last, I worked around this issue by updating my local SBT to 0.13.2-RC1.
> If any of you are experiencing similar problem, I suggest you upgrade your
> local SBT version.
If this issue is causing grief for anyone on Fedora 20, know that you can
install sbt via yum.