Hi,

I am trying to execute the query below in Spark SQL, but it throws an exception:

select n_name from NATION where n_regionkey = (select r_regionkey from REGION where r_name='ASIA')

Exception:
Exception in thread "main" java.lang.RuntimeException: [1.55] failure: ``)'' expected but identifier r_regionkey found

select n_name from NATION where n_regionkey = (select r_regionkey from REGION where r_name='ASIA')
                                                      ^
    at scala.sys.package$.error(package.scala:27)
    at org.apache.spark.sql.catalyst.AbstractSparkSQLParser.parse(AbstractSparkSQLParser.scala:36)
    at org.apache.spark.sql.SnappyParserDialect.parse(snappyParsers.scala:65)
    at org.apache.spark.sql.SQLContext$$anonfun$3.apply(SQLContext.scala:169)
    at org.apache.spark.sql.SQLContext$$anonfun$3.apply(SQLContext.scala:169)
    at org.apache.spark.sql.SparkSQLParser$$anonfun$org$apache$spark$sql$SparkSQLParser$$others$1.apply(SparkSQLParser.scala:115)
    at org.apache.spark.sql.SparkSQLParser$$anonfun$org$apache$spark$sql$SparkSQLParser$$others$1.apply(SparkSQLParser.scala:114)
    at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:136)
    at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:135)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
    at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$Failure.append(Parsers.scala:202)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
    at scala.util.parsing.combinator.Parsers$$anon$2$$anonfun$apply$14.apply(Parsers.scala:891)
    at scala.util.parsing.combinator.Parsers$$anon$2$$anonfun$apply$14.apply(Parsers.scala:891)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
    at scala.util.parsing.combinator.Parsers$$anon$2.apply(Parsers.scala:890)
    at scala.util.parsing.combinator.PackratParsers$$anon$1.apply(PackratParsers.scala:110)
    at org.apache.spark.sql.catalyst.AbstractSparkSQLParser.parse(AbstractSparkSQLParser.scala:34)


The same query works in MySQL as well as MemSQL.

The expected result is:

memsql> select n_name from NATION where n_regionkey = (select r_regionkey from REGION where r_name='ASIA');
+-----------+
| n_name    |
+-----------+
| INDIA     |
| INDONESIA |
| JAPAN     |
| CHINA     |
| VIETNAM   |
+-----------+
5 rows in set (0.71 sec)

How can I make this work in Spark SQL?
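
From the error it looks like the parser does not recognize a SELECT inside the WHERE clause, so I suspect subqueries are simply not supported there in this version. As a workaround, I suppose the scalar subquery could be rewritten as a plain join, something like the sketch below (this assumes r_name uniquely identifies a row in REGION, so the join cannot duplicate NATION rows):

-- scalar subquery replaced by a join; r_name assumed unique in REGION
select n_name
from NATION, REGION
where n_regionkey = r_regionkey
  and r_name = 'ASIA';

Is a join rewrite like this the recommended approach, or is there a way to enable subqueries in the WHERE clause?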

The failing query above is actually a simplified version of the Minimum Cost Supplier query (Q2) from TPC-H, which has the same nested-select structure. I am working through the TPC-H queries; if anybody has a set of TPC-H queries modified for Spark SQL, kindly let me know. It would be very useful to me.

select
    s_acctbal,
    s_name,
    n_name,
    p_partkey,
    p_mfgr,
    s_address,
    s_phone,
    s_comment
from
    part,
    supplier,
    partsupp,
    nation,
    region
where
    p_partkey = ps_partkey
    and s_suppkey = ps_suppkey
    and p_size = [SIZE]
    and p_type like '%[TYPE]'
    and s_nationkey = n_nationkey
    and n_regionkey = r_regionkey
    and r_name = '[REGION]'
    and ps_supplycost = (
        select
            min(ps_supplycost)
        from
            partsupp,
            supplier,
            nation,
            region
        where
            p_partkey = ps_partkey
            and s_suppkey = ps_suppkey
            and s_nationkey = n_nationkey
            and n_regionkey = r_regionkey
            and r_name = '[REGION]'
    )
order by
    s_acctbal desc,
    n_name,
    s_name,
    p_partkey;
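
For Q2 itself, the only rewrite I can think of is to de-correlate the inner query into a derived table in the FROM clause, grouped by ps_partkey, and join it back on the part key. This is a rough sketch only: the [SIZE], [TYPE] and [REGION] placeholders still need to be substituted, the min_cost, min_partkey and min_supplycost names are my own, and it assumes Spark SQL accepts subqueries in the FROM clause.

select
    s_acctbal,
    s_name,
    n_name,
    p_partkey,
    p_mfgr,
    s_address,
    s_phone,
    s_comment
from
    part,
    supplier,
    partsupp,
    nation,
    region,
    -- derived table: minimum supply cost per part within the region
    (select
        ps_partkey as min_partkey,
        min(ps_supplycost) as min_supplycost
    from
        partsupp,
        supplier,
        nation,
        region
    where
        s_suppkey = ps_suppkey
        and s_nationkey = n_nationkey
        and n_regionkey = r_regionkey
        and r_name = '[REGION]'
    group by ps_partkey) min_cost
where
    p_partkey = ps_partkey
    and s_suppkey = ps_suppkey
    and p_size = [SIZE]
    and p_type like '%[TYPE]'
    and s_nationkey = n_nationkey
    and n_regionkey = r_regionkey
    and r_name = '[REGION]'
    -- join back to the pre-aggregated minimum instead of the correlated subquery
    and p_partkey = min_partkey
    and ps_supplycost = min_supplycost
order by
    s_acctbal desc,
    n_name,
    s_name,
    p_partkey;

Can anyone confirm that this form is equivalent to the original Q2 and that it parses in Spark SQL?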


Regards
  Kishor
