pin_zhang created SPARK-17398:
---------------------------------

             Summary: Failed to query external JSON partitioned table
                 Key: SPARK-17398
                 URL: https://issues.apache.org/jira/browse/SPARK-17398
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.0.0
            Reporter: pin_zhang


1. Create an external JSON partitioned table using the SerDe in hive-hcatalog-core-1.2.1.jar, downloaded from
https://mvnrepository.com/artifact/org.apache.hive.hcatalog/hive-hcatalog-core/1.2.1
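For reference, the reproduction assumes partition data under the table location in an index=1 subdirectory, with one JSON object per line matching the (id string, seq string) schema. A minimal sketch that writes such a sample partition file; the output directory, file name, and row values here are hypothetical, not from the original report:

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Paths}

object MakeSampleData {
  def main(args: Array[String]): Unit = {
    // Hypothetical partition directory mirroring the layout the test expects:
    // <table location>/index=1/
    val partDir = Paths.get("target/test-json/index=1")
    Files.createDirectories(partDir)

    // One JSON object per line, fields matching the table schema (id, seq),
    // as the HCatalog JsonSerDe expects.
    val rows = Seq(
      """{"id": "1", "seq": "a"}""",
      """{"id": "2", "seq": "b"}"""
    )
    Files.write(
      partDir.resolve("part-00000.json"),
      rows.mkString("\n").getBytes(StandardCharsets.UTF_8))

    println(Files.exists(partDir.resolve("part-00000.json")))
  }
}
```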
2. Querying the table throws the exception below; the same query works in Spark 1.5.2:
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1.0 (TID 1, localhost): java.lang.ClassCastException: java.util.ArrayList cannot be cast to org.apache.hive.hcatalog.data.HCatRecord
        at org.apache.hive.hcatalog.data.HCatRecordObjectInspector.getStructFieldData(HCatRecordObjectInspector.java:45)
        at org.apache.spark.sql.hive.HadoopTableReader$$anonfun$fillObject$2.apply(TableReader.scala:430)
        at org.apache.spark.sql.hive.HadoopTableReader$$anonfun$fillObject$2.apply(TableReader.scala:426)
 

3. Test code

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.apache.spark.sql.hive.HiveContext

object JsonBugs {

  def main(args: Array[String]): Unit = {
    val table = "test_json"
    val location = "file:///g:/home/test/json"

    // External JSON table using the HCatalog JsonSerDe
    // from hive-hcatalog-core-1.2.1.jar
    val create = s"""CREATE EXTERNAL TABLE ${table} (id string, seq string)
                     PARTITIONED BY (index int)
                     ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
                     LOCATION '${location}'"""
    val addPart = s"""ALTER TABLE ${table} ADD
                      PARTITION (index=1) LOCATION '${location}/index=1'"""

    val conf = new SparkConf().setAppName("scala").setMaster("local[2]")
    conf.set("spark.sql.warehouse.dir", "file:///g:/home/warehouse")
    val ctx = new SparkContext(conf)
    val hctx = new HiveContext(ctx)

    // Create the table and partition on the first run only
    val exists = hctx.tableNames().map(_.toLowerCase()).contains(table)
    if (!exists) {
      hctx.sql(create)
      hctx.sql(addPart)
    } else {
      hctx.sql("show partitions " + table).show()
    }

    // Fails in Spark 2.0.0 with the ClassCastException above; works in 1.5.2
    hctx.sql(s"select * from ${table}").show()
  }
}




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
