I can confirm that the commit is included in the 1.0.0 release candidates
(it was committed before branch-1.0 split off from master), but I can't
confirm that it works in PySpark.  Generally the Python and Java interfaces
lag a little behind the Scala interface to Spark, but we're working to keep
that gap much smaller going forward.

Can you try the same thing in Scala?
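
For reference, something along these lines in spark-shell should exercise the
same code path (just a rough sketch using the table name from your example; I
haven't run it against your data):

    // Spark 1.0-era API: create a HiveContext from the existing SparkContext (sc)
    val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
    // hql() returns a SchemaRDD; collect() forces execution of the query
    hiveContext.hql("SELECT COUNT(*) FROM aol").collect()

If that works, the problem is likely isolated to the Python wrapper rather
than the underlying timestamp support.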


On Thu, May 29, 2014 at 8:54 AM, dataginjaninja <rickett.stepha...@gmail.com>
wrote:

> Can anyone verify which rc includes [SPARK-1360] Add Timestamp Support for
> SQL #275 <https://github.com/apache/spark/pull/275>? I am running rc3, but
> I am getting errors with TIMESTAMP as a datatype in my Hive tables when
> trying to use them in pyspark.
>
> *The error I get:*
> 14/05/29 15:44:47 INFO ParseDriver: Parsing command: SELECT COUNT(*) FROM
> aol
> 14/05/29 15:44:48 INFO ParseDriver: Parse Completed
> 14/05/29 15:44:48 INFO metastore: Trying to connect to metastore with URI
> thrift:
> 14/05/29 15:44:48 INFO metastore: Waiting 1 seconds before next connection
> attempt.
> 14/05/29 15:44:49 INFO metastore: Connected to metastore.
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/opt/spark-1.0.0-rc3/python/pyspark/sql.py", line 189, in hql
>     return self.hiveql(hqlQuery)
>   File "/opt/spark-1.0.0-rc3/python/pyspark/sql.py", line 183, in hiveql
>     return SchemaRDD(self._ssql_ctx.hiveql(hqlQuery), self)
>   File
> "/opt/spark-1.0.0-rc3/python/lib/py4j-0.8.1-src.zip/py4j/java_gateway.py",
> line 537, in __call__
>   File
> "/opt/spark-1.0.0-rc3/python/lib/py4j-0.8.1-src.zip/py4j/protocol.py", line
> 300, in get_return_value
> py4j.protocol.Py4JJavaError: An error occurred while calling o14.hiveql.
> : java.lang.RuntimeException: Unsupported dataType: timestamp
>
> *The table I loaded:*
> DROP TABLE IF EXISTS aol;
> CREATE EXTERNAL TABLE aol (
>         userid STRING,
>         query STRING,
>         query_time TIMESTAMP,
>         item_rank INT,
>         click_url STRING)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '\t'
> LOCATION '/tmp/data/aol';
>
> *The pyspark commands:*
> from pyspark.sql import HiveContext
> hctx = HiveContext(sc)
> results = hctx.hql("SELECT COUNT(*) FROM aol").collect()
>
