I think I might have figured it out myself. Here's a pull request for you
guys to check out:
https://github.com/apache/spark/pull/3855
I successfully tested this code on my cluster.
On Tue, Dec 30, 2014 at 11:01 PM, Alessandro Baretta wrote:
> Here's a more meaningful exception:
>
> java.lang.C
Here's a more meaningful exception:
java.lang.ClassCastException: org.apache.spark.sql.catalyst.types.DateType$
cannot be cast to org.apache.spark.sql.catalyst.types.PrimitiveType
at
org.apache.spark.sql.parquet.RowWriteSupport.writeValue(ParquetTableSupport.scala:188)
at
org.apach
You sent this to the dev list. Please send it to the user list instead.
We use the dev list to discuss development on Spark itself: new features,
fixes to known bugs, and so forth.
The user list is for discussing issues that come up while using Spark, which
I believe is what you are looking for.
Nick
On Tue Dec 30 2
I extracted org/apache/hadoop/hive/common/CompressionUtils.class from the
jar and used hexdump to view the class file.
Bytes 6 and 7 (the class-file major version) are 00 and 33 in hex, i.e. major version 51.
According to http://en.wikipedia.org/wiki/Java_class_file, the jar was
produced using Java 7.
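For reference, here is a small Scala sketch that reads those same header bytes programmatically (the file name is just a placeholder for the extracted class):

import java.io.{DataInputStream, FileInputStream}

// Sketch: read a class file's header to see which JDK produced it.
// Bytes 0-3 are the magic number (0xCAFEBABE), bytes 4-5 the minor version,
// and bytes 6-7 the major version; major version 51 (0x33) means Java 7.
val in = new DataInputStream(new FileInputStream("CompressionUtils.class"))
try {
  val magic = in.readInt()
  val minor = in.readUnsignedShort()
  val major = in.readUnsignedShort()
  println(f"magic=0x$magic%08X  version=$major.$minor")
} finally {
  in.close()
}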
FYI
On Tue, Dec 30, 2014 at 8:09 PM, Shi
Hi All,
I am trying to run a sample Spark program using Scala and SBT.
Below is the program:
def main(args: Array[String]) {
  val logFile = "E:/ApacheSpark/usb/usb/spark/bin/README.md" // Should be some file on your system
  val sc = new SparkContext("local", "Simple App",
    "E:/ApacheSpark/
The major.minor version of the new org.spark-project.hive hive-exec is
51.0, so it will require people to use JDK 7. Is this intentional?
<dependency>
  <groupId>org.spark-project.hive</groupId>
  <artifactId>hive-exec</artifactId>
  <version>0.12.0-protobuf-2.5</version>
</dependency>
You can use the following steps to reproduce it (you need to use JDK 6):
1. Create a Test.java file with the follo
Thanks Davies, it works in 1.2.
This should be fixed in 1.2, could you try it?
On Mon, Dec 29, 2014 at 8:04 PM, guoxu1231 wrote:
> Hi pyspark guys,
>
> I have a json file, and its structure is like below:
>
> {"NAME":"George", "AGE":35, "ADD_ID":1212, "POSTAL_AREA":1,
> "TIME_ZONE_ID":1, "INTEREST":[{"INTEREST_NO":1, "INFO":"x"},
> {
On Mon, Dec 29, 2014 at 7:39 PM, Jeremy Freeman wrote:
> Hi Stephen, it should be enough to include
>
>> --jars /path/to/file.jar
>
> in the command line call to either pyspark or spark-submit, as in
>
>> spark-submit --master local --jars /path/to/file.jar myfile.py
Unfortunately, you also need
Sorry! My bad. I had stale spark jars sitting on the slave nodes...
Alex
On Tue, Dec 30, 2014 at 4:39 PM, Alessandro Baretta wrote:
> Gents,
>
> I tried #3820. It doesn't work. I'm still getting the following exceptions:
>
> Exception in thread "Thread-45" java.lang.RuntimeException: Unsupporte
Gents,
I tried #3820. It doesn't work. I'm still getting the following exceptions:
Exception in thread "Thread-45" java.lang.RuntimeException: Unsupported
datatype DateType
at scala.sys.package$.error(package.scala:27)
at
org.apache.spark.sql.parquet.ParquetTypesConverter$anonfun$
This is timely, since I just ran into this issue myself while trying to
write a test to reproduce a bug related to speculative execution (I wanted
to configure a job so that the first attempt to compute a partition would
run slow so that a second, fast speculative copy would be launched).
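A minimal local-mode sketch of that kind of setup (my own illustration with assumed names and configuration values, not code from Spark itself) could look like:

import java.util.concurrent.ConcurrentHashMap

import org.apache.spark.{SparkConf, SparkContext}

// Records which partitions have already been attempted; this only works
// because local mode runs every task attempt inside a single JVM.
object FirstAttemptTracker {
  val seen = ConcurrentHashMap.newKeySet[Int]()
}

object SpeculationDemo {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setMaster("local[4]")
      .setAppName("speculation-demo")
      .set("spark.speculation", "true")          // enable speculative execution
      .set("spark.speculation.multiplier", "2")  // assumed tuning values
      .set("spark.speculation.quantile", "0.5")
    val sc = new SparkContext(conf)

    val result = sc.parallelize(1 to 8, 8).mapPartitionsWithIndex { (pid, iter) =>
      // Sleep only on the very first attempt of partition 0, so that a
      // speculative second attempt can be launched and finish first.
      if (pid == 0 && FirstAttemptTracker.seen.add(pid)) {
        Thread.sleep(10000)
      }
      iter
    }.count()

    println(s"count = $result")
    sc.stop()
  }
}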
I've ope
It looks like taskContext.attemptId doesn't mean what one thinks it might
mean, based on
http://apache-spark-developers-list.1001551.n3.nabble.com/Get-attempt-number-in-a-closure-td8853.html
and the unresolved
https://issues.apache.org/jira/browse/SPARK-4014
Is there any alternative way to te
Hi,
Did you find a way to do this, or are you still working on it?
I am trying to find a way to do this as well, but haven't been able to find
one.
Hi all,
I recently wrote a blog post comparing the MapReduce model with that of Apache
Spark, trying to explain some important questions I think a beginner might have
while exploring Spark.
The blog can be found here:
http://rahulkavale.github.io/blog/2014/11/16/scrap-your-map-reduce/
The blog received quite
Hi,
The GMMSpark.py you mentioned is the old one. The new code has now been added to
spark-packages and is available at http://spark-packages.org/package/11 . Have
a look at the new code.
We have used numpy functions in our code and didn't notice any slowdown because
of this.
Thanks & Regards,
Meethu M