I'm aiming for 1.3.

On Mon, Jan 26, 2015 at 3:05 PM, Manoj Samel <manojsamelt...@gmail.com> wrote:
> Thanks Michael. I am sure there have been many requests for this support.
>
> Is any release targeted for this?
>
> Thanks,
>
> On Sat, Jan 24, 2015 at 11:47 AM, Michael Armbrust <mich...@databricks.com> wrote:
>
>> Those annotations actually don't work, because the timestamp in SQL has
>> optional nanosecond precision.
>>
>> However, there is a PR to add support using Parquet's INT96 type:
>> https://github.com/apache/spark/pull/3820
>>
>> On Fri, Jan 23, 2015 at 12:08 PM, Manoj Samel <manojsamelt...@gmail.com> wrote:
>>
>>> Looking further at the trace and ParquetTypes.scala, it seems there is no
>>> support for Timestamp and Date in fromPrimitiveDataType(ctype: DataType):
>>> Option[ParquetTypeInfo]. Since Parquet supports these types with some
>>> decoration over its INT primitives
>>> (https://github.com/Parquet/parquet-format/blob/master/LogicalTypes.md),
>>> is there any reason why Date / Timestamp are not supported right now?
>>>
>>> Thanks,
>>>
>>> Manoj
>>>
>>> On Fri, Jan 23, 2015 at 11:40 AM, Manoj Samel <manojsamelt...@gmail.com> wrote:
>>>
>>>> Using Spark 1.2.
>>>>
>>>> I read a CSV file, apply a schema to convert it to a SchemaRDD, and then
>>>> call schemaRDD.saveAsParquetFile.
>>>>
>>>> If the schema includes TimestampType, the save fails with the following
>>>> trace:
>>>>
>>>> Exception in thread "main" java.lang.RuntimeException: Unsupported datatype TimestampType
>>>>   at scala.sys.package$.error(package.scala:27)
>>>>   at org.apache.spark.sql.parquet.ParquetTypesConverter$$anonfun$fromDataType$2.apply(ParquetTypes.scala:343)
>>>>   at org.apache.spark.sql.parquet.ParquetTypesConverter$$anonfun$fromDataType$2.apply(ParquetTypes.scala:292)
>>>>   at scala.Option.getOrElse(Option.scala:120)
>>>>   at org.apache.spark.sql.parquet.ParquetTypesConverter$.fromDataType(ParquetTypes.scala:291)
>>>>   at org.apache.spark.sql.parquet.ParquetTypesConverter$$anonfun$4.apply(ParquetTypes.scala:363)
>>>>   at org.apache.spark.sql.parquet.ParquetTypesConverter$$anonfun$4.apply(ParquetTypes.scala:362)
>>>>   at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>>>>   at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>>>>   at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>>>>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>>>>   at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
>>>>   at scala.collection.AbstractTraversable.map(Traversable.scala:105)
>>>>   at org.apache.spark.sql.parquet.ParquetTypesConverter$.convertFromAttributes(ParquetTypes.scala:361)
>>>>   at org.apache.spark.sql.parquet.ParquetTypesConverter$.writeMetaData(ParquetTypes.scala:407)
>>>>   at org.apache.spark.sql.parquet.ParquetRelation$.createEmpty(ParquetRelation.scala:166)
>>>>   at org.apache.spark.sql.parquet.ParquetRelation$.create(ParquetRelation.scala:145)
>>>>   at org.apache.spark.sql.execution.SparkStrategies$ParquetOperations$.apply(SparkStrategies.scala:204)
>>>>   at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:58)
>>>>   at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:58)
>>>>   at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
>>>>   at org.apache.spark.sql.catalyst.planning.QueryPlanner.apply(QueryPlanner.scala:59)
>>>>   at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan$lzycompute(SQLContext.scala:418)
>>>>   at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan(SQLContext.scala:416)
>>>>   at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan$lzycompute(SQLContext.scala:422)
>>>>   at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan(SQLContext.scala:422)
>>>>   at org.apache.spark.sql.SQLContext$QueryExecution.toRdd$lzycompute(SQLContext.scala:425)
>>>>   at org.apache.spark.sql.SQLContext$QueryExecution.toRdd(SQLContext.scala:425)
>>>>   at org.apache.spark.sql.SchemaRDDLike$class.saveAsParquetFile(SchemaRDDLike.scala:76)
>>>>   at org.apache.spark.sql.SchemaRDD.saveAsParquetFile(SchemaRDD.scala:108)
>>>>   at bdrt.MyTest$.createParquetWithDate(MyTest.scala:88)
>>>>   at bdrt.MyTest$delayedInit$body.apply(MyTest.scala:54)
>>>>   at scala.Function0$class.apply$mcV$sp(Function0.scala:40)
>>>>   at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
>>>>   at scala.App$$anonfun$main$1.apply(App.scala:71)
>>>>   at scala.App$$anonfun$main$1.apply(App.scala:71)
>>>>   at scala.collection.immutable.List.foreach(List.scala:318)
>>>>   at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:32)
>>>>   at scala.App$class.main(App.scala:71)
>>>>   at bdrt.MyTest$.main(MyTest.scala:10)
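To make the failing flow concrete, here is a minimal sketch of the 1.2 code path described in the original report (CSV -> applySchema -> saveAsParquetFile). The file name, column names, local master, and the epoch-millis workaround are illustrative assumptions on my part, not from the report:

import java.sql.Timestamp

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql._  // SQLContext, Row, StructType, StructField, StringType, LongType, TimestampType

object TimestampRepro extends App {
  val sc = new SparkContext(new SparkConf().setAppName("ts-repro").setMaster("local[2]"))
  val sqlContext = new SQLContext(sc)

  // Hypothetical two-column CSV: id,event_time (e.g. "2015-01-23 11:40:00")
  val rows = sc.textFile("events.csv").map(_.split(",")).map { f =>
    Row(f(0), Timestamp.valueOf(f(1)))
  }

  val schema = StructType(Seq(
    StructField("id", StringType, nullable = false),
    StructField("event_time", TimestampType, nullable = true)))

  val schemaRdd = sqlContext.applySchema(rows, schema)

  // On 1.2 this throws: java.lang.RuntimeException: Unsupported datatype TimestampType
  // schemaRdd.saveAsParquetFile("events.parquet")

  // Interim workaround: write the timestamp as epoch milliseconds (LongType),
  // which Parquet handles, and convert back to Timestamp on read.
  val asMillis = schemaRdd.map(r => Row(r(0), r(1).asInstanceOf[Timestamp].getTime))
  val millisSchema = StructType(Seq(
    StructField("id", StringType, nullable = false),
    StructField("event_time_ms", LongType, nullable = true)))
  sqlContext.applySchema(asMillis, millisSchema).saveAsParquetFile("events.parquet")
}

Until the INT96 support lands, storing the value as a long (or a string) and converting at the read/write boundaries is the usual interim approach.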
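For context on the fix Michael references: the annotations in LogicalTypes.md (e.g. int32 annotated as DATE, int64 as TIMESTAMP_MILLIS) top out at millisecond precision, which is why they can't carry a SQL timestamp's optional nanoseconds. INT96 can: the convention Impala established packs a nanosecond timestamp into 12 bytes, 8 little-endian bytes of nanoseconds within the day followed by a 4-byte little-endian Julian day. A sketch of that packing follows; this is the general INT96 convention, not the actual code in the PR:

import java.nio.{ByteBuffer, ByteOrder}
import java.sql.Timestamp

object Int96Sketch {
  val JulianDayOfEpoch = 2440588L             // Julian day number of 1970-01-01
  val NanosPerDay = 86400L * 1000000000L

  // Pack a Timestamp into Parquet's 12-byte INT96 layout:
  // nanoseconds within the day (8 bytes), then Julian day (4 bytes), little-endian.
  def toInt96(ts: Timestamp): Array[Byte] = {
    // getTime already includes the sub-second millis, so rebuild total nanos
    // from whole seconds plus getNanos to avoid counting the millis twice.
    val epochNanos = Math.floorDiv(ts.getTime, 1000L) * 1000000000L + ts.getNanos
    val julianDay  = JulianDayOfEpoch + Math.floorDiv(epochNanos, NanosPerDay)
    val nanosOfDay = Math.floorMod(epochNanos, NanosPerDay)
    ByteBuffer.allocate(12)
      .order(ByteOrder.LITTLE_ENDIAN)
      .putLong(nanosOfDay)
      .putInt(julianDay.toInt)
      .array()
  }
}

As a sanity check, a timestamp at the Unix epoch (getTime == 0, getNanos == 0) packs to nanosOfDay = 0 and julianDay = 2440588.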