@Shahab, based on https://issues.apache.org/jira/browse/HIVE-5472, current_date was added in Hive *1.2.0 (not 0.12.0)*. In my previous email, I meant that current_date is available in neither Hive 0.12.0 nor Hive 0.13.1 (the two Hive versions Spark SQL currently supports).
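
As a possible workaround until a Hive version with current_date is supported: Hive 0.13.1 does ship unix_timestamp(), from_unixtime(), and to_date(), so the current date can be computed from those. A minimal sketch, assuming Spark 1.2 with a plain HiveContext (the object name, app name, and local master are illustrative):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

// Illustrative only: emulate current_date on Hive 0.13.1, which lacks it.
object CurrentDateWorkaround {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("CurrentDateWorkaround").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val hiveContext = new HiveContext(sc)

    // unix_timestamp(), from_unixtime() and to_date() all exist in Hive 0.13.1,
    // so the current date can be derived without current_date().
    val today = hiveContext.sql("SELECT to_date(from_unixtime(unix_timestamp()))")
    today.collect().foreach(println)
  }
}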
On Tue, Mar 3, 2015 at 8:55 AM, Rohit Rai <ro...@tuplejump.com> wrote:

> The Hive dependency comes from spark-hive.
>
> It does work with Spark 1.1; we will have the 1.2 release later this month.
>
> On Mar 3, 2015 8:49 AM, "shahab" <shahab.mok...@gmail.com> wrote:
>
>> Thanks Rohit,
>>
>> I am already using Calliope and am quite happy with it, well done! Two questions, though:
>> 1- It seems that it does not support Hive 0.12 or higher, am I right? For example, you cannot use the current_time() UDF or the new UDFs added in Hive 0.12. Are they supported? Any plan for supporting them?
>> 2- It does not support Spark 1.1 and 1.2. Any plan for a new release?
>>
>> best,
>> /Shahab
>>
>> On Tue, Mar 3, 2015 at 5:41 PM, Rohit Rai <ro...@tuplejump.com> wrote:
>>
>>> Hello Shahab,
>>>
>>> I think CassandraAwareHiveContext
>>> <https://github.com/tuplejump/calliope/blob/develop/sql/hive/src/main/scala/org/apache/spark/sql/hive/CassandraAwareHiveContext.scala>
>>> in Calliope is what you are looking for. Create a CAHC instance and you
>>> should be able to run Hive functions against the SchemaRDD you create
>>> from there.
>>>
>>> Cheers,
>>> Rohit
>>>
>>> *Founder & CEO, **Tuplejump, Inc.*
>>> ____________________________
>>> www.tuplejump.com
>>> *The Data Engineering Platform*
>>>
>>> On Tue, Mar 3, 2015 at 6:03 AM, Cheng, Hao <hao.ch...@intel.com> wrote:
>>>
>>>> A temp table in the metastore cannot be shared across SQLContext
>>>> instances. Since HiveContext is a subclass of SQLContext (it inherits
>>>> all of its functionality), why not use a single HiveContext globally?
>>>> Is there any specific requirement in your case that needs multiple
>>>> SQLContext/HiveContext instances?
>>>>
>>>> *From:* shahab [mailto:shahab.mok...@gmail.com]
>>>> *Sent:* Tuesday, March 3, 2015 9:46 PM
>>>> *To:* Cheng, Hao
>>>> *Cc:* user@spark.apache.org
>>>> *Subject:* Re: Supporting Hive features in Spark SQL Thrift JDBC server
>>>>
>>>> You are right, CassandraAwareSQLContext is a subclass of SQLContext.
>>>>
>>>> But I did another experiment: I queried Cassandra using
>>>> CassandraAwareSQLContext, registered the resulting "rdd" as a temp
>>>> table, and then tried to query it using HiveContext. It seems that the
>>>> Hive context cannot see the table registered through the SQL context.
>>>> Is this normal?
>>>>
>>>> best,
>>>> /Shahab
>>>>
>>>> On Tue, Mar 3, 2015 at 1:35 PM, Cheng, Hao <hao.ch...@intel.com> wrote:
>>>>
>>>> Hive UDFs are only applicable for HiveContext and its subclass
>>>> instances. Is the CassandraAwareSQLContext a direct subclass of
>>>> HiveContext or of SQLContext?
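
For reference, a minimal sketch of the single-HiveContext approach suggested above, using only stock Spark 1.2 APIs rather than the Calliope contexts; the Event case class, the sample data, and the table name are made up for illustration:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

// Placeholder row type standing in for whatever the Cassandra query returns.
case class Event(id: Long, createdAt: Long)

object SingleHiveContextExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("SingleHiveContextExample").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // One HiveContext used everywhere: temp tables registered on it are
    // visible to every query issued through this same instance, and Hive
    // UDFs resolve because HiveContext is a subclass of SQLContext.
    val hiveContext = new HiveContext(sc)

    val events = sc.parallelize(Seq(Event(1L, 1425370000000L), Event(2L, 1425380000000L)))
    val schemaRDD = hiveContext.createSchemaRDD(events)
    schemaRDD.registerTempTable("events")

    // from_unixtime is a Hive UDF; it resolves here because the query goes
    // through the same HiveContext that also owns the temp table.
    hiveContext.sql("SELECT from_unixtime(floor(createdAt / 1000)) FROM events")
      .collect()
      .foreach(println)
  }
}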
>>>>
>>>> *From:* shahab [mailto:shahab.mok...@gmail.com]
>>>> *Sent:* Tuesday, March 3, 2015 5:10 PM
>>>> *To:* Cheng, Hao
>>>> *Cc:* user@spark.apache.org
>>>> *Subject:* Re: Supporting Hive features in Spark SQL Thrift JDBC server
>>>>
>>>> val sc: SparkContext = new SparkContext(conf)
>>>> val sqlCassContext = new CassandraAwareSQLContext(sc) // I used the Calliope Cassandra Spark connector
>>>> val rdd: SchemaRDD = sqlCassContext.sql("select * from db.profile")
>>>> rdd.cache
>>>> rdd.registerTempTable("profile")
>>>> rdd.first // enforce caching
>>>> val q = "select from_unixtime(floor(createdAt/1000)) from profile where sampling_bucket=0"
>>>> val rdd2 = rdd.sqlContext.sql(q)
>>>> println("Result: " + rdd2.first)
>>>>
>>>> And I get the following error:
>>>>
>>>> Exception in thread "main"
>>>> org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Unresolved
>>>> attributes: 'from_unixtime('floor(('createdAt / 1000))) AS c0#7, tree:
>>>> Project ['from_unixtime('floor(('createdAt / 1000))) AS c0#7]
>>>>  Filter (sampling_bucket#10 = 0)
>>>>   Subquery profile
>>>>    Project [company#8,bucket#9,sampling_bucket#10,profileid#11,createdat#12L,modifiedat#13L,version#14]
>>>>     CassandraRelation localhost, 9042, 9160, normaldb_sampling, profile,
>>>> org.apache.spark.sql.CassandraAwareSQLContext@778b692d, None, None, false,
>>>> Some(Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml)
>>>>
>>>> at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$$anonfun$apply$1.applyOrElse(Analyzer.scala:72)
>>>> at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$$anonfun$apply$1.applyOrElse(Analyzer.scala:70)
>>>> at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:165)
>>>> at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:183)
>>>> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>>>> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>>>> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>>>> at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
>>>> at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
>>>> at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
>>>> at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
>>>> at scala.collection.AbstractIterator.to(Iterator.scala:1157)
>>>> at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
>>>> at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
>>>> at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
>>>> at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
>>>> at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:212)
>>>> at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:168)
>>>> at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:156)
>>>> at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$.apply(Analyzer.scala:70)
>>>> at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$.apply(Analyzer.scala:68)
>>>> at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:61)
>>>> at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:59)
>>>> at scala.collection.IndexedSeqOptimized$class.foldl(IndexedSeqOptimized.scala:51)
>>>> at scala.collection.IndexedSeqOptimized$class.foldLeft(IndexedSeqOptimized.scala:60)
>>>> at scala.collection.mutable.WrappedArray.foldLeft(WrappedArray.scala:34)
>>>> at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:59)
>>>> at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:51)
>>>> at scala.collection.immutable.List.foreach(List.scala:318)
>>>> at org.apache.spark.sql.catalyst.rules.RuleExecutor.apply(RuleExecutor.scala:51)
>>>> at org.apache.spark.sql.SQLContext$QueryExecution.analyzed$lzycompute(SQLContext.scala:402)
>>>> at org.apache.spark.sql.SQLContext$QueryExecution.analyzed(SQLContext.scala:402)
>>>> at org.apache.spark.sql.SQLContext$QueryExecution.optimizedPlan$lzycompute(SQLContext.scala:403)
>>>> at org.apache.spark.sql.SQLContext$QueryExecution.optimizedPlan(SQLContext.scala:403)
>>>> at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan$lzycompute(SQLContext.scala:407)
>>>> at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan(SQLContext.scala:405)
>>>> at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan$lzycompute(SQLContext.scala:411)
>>>> at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan(SQLContext.scala:411)
>>>> at org.apache.spark.sql.SchemaRDD.collect(SchemaRDD.scala:438)
>>>> at org.apache.spark.sql.SchemaRDD.take(SchemaRDD.scala:440)
>>>> at org.apache.spark.sql.SchemaRDD.take(SchemaRDD.scala:103)
>>>> at org.apache.spark.rdd.RDD.first(RDD.scala:1091)
>>>> at boot.SQLDemo$.main(SQLDemo.scala:65) // my code
>>>> at boot.SQLDemo.main(SQLDemo.scala) // my code
>>>>
>>>> On Tue, Mar 3, 2015 at 8:57 AM, Cheng, Hao <hao.ch...@intel.com> wrote:
>>>>
>>>> Can you provide the detailed failure call stack?
>>>>
>>>> *From:* shahab [mailto:shahab.mok...@gmail.com]
>>>> *Sent:* Tuesday, March 3, 2015 3:52 PM
>>>> *To:* user@spark.apache.org
>>>> *Subject:* Supporting Hive features in Spark SQL Thrift JDBC server
>>>>
>>>> Hi,
>>>>
>>>> According to the Spark SQL documentation, "...Spark SQL supports the vast
>>>> majority of Hive features, such as User Defined Functions (UDF)", and one
>>>> of these UDFs is the current_date() function, which should therefore be
>>>> supported.
>>>>
>>>> However, I get an error when I use this UDF in my SQL query. There are
>>>> a couple of other UDFs that cause similar errors.
>>>>
>>>> Am I missing something in my JDBC server?
>>>>
>>>> /Shahab
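
Putting the thread together: from_unixtime and current_date are Hive UDFs, so they only resolve through HiveContext or one of its subclasses, not through a plain SQLContext such as CassandraAwareSQLContext, which is what the "Unresolved attributes" error above reflects. Below is a rough sketch of routing the same query through the CassandraAwareHiveContext Rohit linked; its constructor is assumed here to take a SparkContext, like the CassandraAwareSQLContext used earlier in the thread, so treat this as illustrative rather than tested:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.CassandraAwareHiveContext // from Calliope, per the link above

object HiveUdfOverCassandra {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("HiveUdfOverCassandra"))

    // Assumption: constructor mirrors CassandraAwareSQLContext(sc) used earlier
    // in the thread; check the Calliope sources linked above.
    val cassHiveContext = new CassandraAwareHiveContext(sc)

    val profiles = cassHiveContext.sql("SELECT * FROM db.profile")
    profiles.registerTempTable("profile")

    // from_unixtime is a Hive UDF, so it resolves because this context is
    // HiveContext-derived; the same query fails on a plain SQLContext with
    // the "Unresolved attributes" error shown above.
    val result = cassHiveContext.sql(
      "SELECT from_unixtime(floor(createdAt / 1000)) FROM profile WHERE sampling_bucket = 0")
    println("Result: " + result.first())
  }
}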