Hi, I am trying to execute the query below in Spark SQL, but it throws an exception:
select n_name
from NATION
where n_regionkey = (select r_regionkey from REGION where r_name = 'ASIA')

Exception:

Exception in thread "main" java.lang.RuntimeException: [1.55] failure: ``)'' expected but identifier r_regionkey found

select n_name from NATION where n_regionkey = (select r_regionkey from REGION where r_name='ASIA')
                                                      ^
    at scala.sys.package$.error(package.scala:27)
    at org.apache.spark.sql.catalyst.AbstractSparkSQLParser.parse(AbstractSparkSQLParser.scala:36)
    at org.apache.spark.sql.SnappyParserDialect.parse(snappyParsers.scala:65)
    at org.apache.spark.sql.SQLContext$$anonfun$3.apply(SQLContext.scala:169)
    at org.apache.spark.sql.SQLContext$$anonfun$3.apply(SQLContext.scala:169)
    at org.apache.spark.sql.SparkSQLParser$$anonfun$org$apache$spark$sql$SparkSQLParser$$others$1.apply(SparkSQLParser.scala:115)
    at org.apache.spark.sql.SparkSQLParser$$anonfun$org$apache$spark$sql$SparkSQLParser$$others$1.apply(SparkSQLParser.scala:114)
    at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:136)
    at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:135)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
    at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$Failure.append(Parsers.scala:202)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
    at scala.util.parsing.combinator.Parsers$$anon$2$$anonfun$apply$14.apply(Parsers.scala:891)
    at scala.util.parsing.combinator.Parsers$$anon$2$$anonfun$apply$14.apply(Parsers.scala:891)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
    at scala.util.parsing.combinator.Parsers$$anon$2.apply(Parsers.scala:890)
    at scala.util.parsing.combinator.PackratParsers$$anon$1.apply(PackratParsers.scala:110)
    at org.apache.spark.sql.catalyst.AbstractSparkSQLParser.parse(AbstractSparkSQLParser.scala:34)

The same query works in both MySQL and MemSQL. The expected result is:

memsql> select n_name from NATION where n_regionkey = (select r_regionkey from REGION where r_name='ASIA');
+-----------+
| n_name    |
+-----------+
| INDIA     |
| INDONESIA |
| JAPAN     |
| CHINA     |
| VIETNAM   |
+-----------+
5 rows in set (0.71 sec)

How can I make this work in Spark SQL?

The query above is actually a simplified version of the minimum cost supplier query (Q2) of TPC-H, which has this nested-select structure. I am working on the TPC-H queries, so if anybody has a set of TPC-H queries modified for Spark SQL, kindly let me know; it would be very useful for me.

select
    s_acctbal, s_name, n_name, p_partkey, p_mfgr, s_address, s_phone, s_comment
from
    part, supplier, partsupp, nation, region
where
    p_partkey = ps_partkey
    and s_suppkey = ps_suppkey
    and p_size = [SIZE]
    and p_type like '%[TYPE]'
    and s_nationkey = n_nationkey
    and n_regionkey = r_regionkey
    and r_name = '[REGION]'
    and ps_supplycost = (
        select min(ps_supplycost)
        from partsupp, supplier, nation, region
        where p_partkey = ps_partkey
          and s_suppkey = ps_suppkey
          and s_nationkey = n_nationkey
          and n_regionkey = r_regionkey
          and r_name = '[REGION]'
    )
order by
    s_acctbal desc, n_name, s_name, p_partkey;

Regards,
Kishor
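Update: for the simplified query, a workaround I am trying is rewriting the scalar subquery as a plain join, which should parse in Spark SQL versions that reject subqueries in WHERE. To convince myself the two forms are equivalent, here is a small sqlite3 sketch with toy data (the rows are made up for illustration; real TPC-H keys differ):

```python
import sqlite3

# Toy NATION/REGION tables to check that
#   where n_regionkey = (select r_regionkey from REGION where r_name='ASIA')
# can be replaced by a plain join on r_regionkey.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table REGION (r_regionkey int, r_name text);
    create table NATION (n_nationkey int, n_name text, n_regionkey int);
    insert into REGION values (1, 'ASIA'), (2, 'EUROPE');
    insert into NATION values
        (1, 'INDIA', 1), (2, 'JAPAN', 1), (3, 'FRANCE', 2);
""")

# Original form: scalar subquery in the WHERE clause.
sub = conn.execute("""
    select n_name from NATION
    where n_regionkey = (select r_regionkey from REGION where r_name = 'ASIA')
""").fetchall()

# Rewritten as a join; no subquery needed.
join = conn.execute("""
    select n_name
    from NATION, REGION
    where n_regionkey = r_regionkey
      and r_name = 'ASIA'
""").fetchall()

assert sorted(sub) == sorted(join)
print(sorted(n for (n,) in join))  # ['INDIA', 'JAPAN']
```

The join form is only equivalent because r_name uniquely identifies one region row; if the subquery could return multiple rows, the two queries would differ.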
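For Q2's correlated min() subquery, the rewrite I am experimenting with is a grouped derived table joined back on ps_partkey (as far as I understand, older Spark SQL accepts subqueries in FROM even though it rejects them in WHERE). A minimal sqlite3 check of the pattern, with the region/supplier filters of the real Q2 omitted and made-up costs:

```python
import sqlite3

# Toy partsupp table to check that the correlated pattern
#   where ps_supplycost = (select min(ps_supplycost) ... where p2.ps_partkey = ps.ps_partkey)
# matches a group-by derived table joined back on the part key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table partsupp (ps_partkey int, ps_suppkey int, ps_supplycost real);
    insert into partsupp values
        (1, 10, 5.0), (1, 11, 3.0), (2, 10, 7.0), (2, 12, 7.0), (2, 13, 9.0);
""")

# Original Q2 pattern: correlated scalar subquery in WHERE.
correlated = conn.execute("""
    select ps_partkey, ps_suppkey from partsupp ps
    where ps_supplycost = (select min(ps_supplycost)
                           from partsupp p2
                           where p2.ps_partkey = ps.ps_partkey)
""").fetchall()

# Rewrite: compute the per-part minimum once in a derived table, then join.
rewritten = conn.execute("""
    select ps.ps_partkey, ps.ps_suppkey
    from partsupp ps,
         (select ps_partkey as pk, min(ps_supplycost) as min_cost
          from partsupp group by ps_partkey) m
    where ps.ps_partkey = m.pk
      and ps.ps_supplycost = m.min_cost
""").fetchall()

assert sorted(correlated) == sorted(rewritten)
print(sorted(rewritten))  # [(1, 11), (2, 10), (2, 12)]
```

In the full Q2 the derived table would also carry the supplier/nation/region joins and the r_name filter from the original subquery, and the outer query would join to it on both ps_partkey and ps_supplycost.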