Basically, I want to run the following query:
select 'a\'b', cast(null as Array<String>)
However, neither HiveContext nor SQLContext can execute it without throwing an
exception.
I have tried
sql("select 'a\\'b', cast(null as Array<String>)")
and
df.selectExpr("'a\\'b'", "cast(null as Array<String>)")
Neither of them works.
From the exceptions, I gather that the two contexts parse the query differently.
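
To be precise about what the parsers actually receive, here is a minimal sketch in plain Scala (no Spark needed). The object name EscapeCheck is only for illustration. It shows that the doubled backslash used in my test program hands the parser a backslash-escaped quote, whereas a single backslash is consumed by the Scala compiler, because \' is itself a Scala escape sequence.

```scala
object EscapeCheck {
  // Doubled backslash in Scala source: the resulting string keeps one backslash.
  val escaped: String = "select 'a\\'b'"

  // Single backslash in Scala source: \' is a Scala escape for a plain quote,
  // so no backslash survives in the resulting string.
  val unescaped: String = "select 'a\'b'"

  def main(args: Array[String]): Unit = {
    println(escaped)   // prints: select 'a\'b'
    println(unescaped) // prints: select 'a'b'
  }
}
```

So the failures in my test program concern the first form, the one in which the backslash does reach the SQL parser.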
On Fri, May 13, 2016 at 8:00 AM, Yong Zhang <[email protected]> wrote:
> Not sure what you mean. Do you want exactly the same query to run fine
> in both SQLContext and HiveContext? The query parsers are different; why do
> you want this feature? Do I understand your question correctly?
>
> Yong
>
> ------------------------------
> Date: Thu, 12 May 2016 13:09:34 +0200
> Subject: SQLContext and HiveContext parse a query string differently ?
> From: [email protected]
> To: [email protected]
>
>
> Hi,
>
> I just want to figure out why the two contexts behave differently even
> on a simple query.
> In a nutshell, I have a query that contains a string with a single
> quote and a cast to Array/Map.
> I have tried every combination of SQL context type and query
> API (sql, df.select, df.selectExpr).
> I can't find one that works in all cases.
>
> Here is the code for reproducing the problem.
>
> -----------------------------------------------------------------------------
>
> import org.apache.spark.sql.SQLContext
> import org.apache.spark.sql.hive.HiveContext
> import org.apache.spark.{SparkConf, SparkContext}
>
> object Test extends App {
>
> val sc = new SparkContext("local[2]", "test", new SparkConf)
> val hiveContext = new HiveContext(sc)
> val sqlContext = new SQLContext(sc)
>
> val context = hiveContext
> // val context = sqlContext
>
> import context.implicits._
>
> val df = Seq((Seq(1, 2), 2)).toDF("a", "b")
> df.registerTempTable("tbl")
> df.printSchema()
>
> // case 1
> context.sql("select cast(a as array<string>) from tbl").show()
> // HiveContext => org.apache.spark.sql.AnalysisException: cannot recognize
> // input near 'array' '<' 'string' in primitive type specification; line 1 pos 17
> // SQLContext => OK
>
> // case 2
> context.sql("select 'a\\'b'").show()
> // HiveContext => OK
> // SQLContext => failure: ``union'' expected but ErrorToken(unclosed string
> // literal) found
>
> // case 3
> df.selectExpr("cast(a as array<string>)").show()
> // OK with HiveContext and SQLContext
>
> // case 4
> df.selectExpr("'a\\'b'").show()
> // HiveContext, SQLContext => failure: end of input expected
> }
>
> -----------------------------------------------------------------------------
>
> Any clarification or workaround is highly appreciated.
>
> --
> Hao Ren
>
> Data Engineer @ leboncoin
>
> Paris, France
>
--
Hao Ren
Data Engineer @ leboncoin
Paris, France