Hi,
   I stumbled on the following issue today. We have Parquet files that expose a
column as a Map. This is very convenient, since parts of our data vary over
time: not knowing in advance what the data will contain, we simply split it
into key/value pairs and store it as a map inside a single column.

Retrieving the data is very easy; the syntax looks like this:

select column.key1.key2 from table;

Column values look like this:
{}
{"key1":"value"}
{"key1":{"key2":"value"}}

But when I try to apply basic operators to that column, I get the following
error:

query: select (column.key1.key2 / 30 < 1) from table

ERROR processing query/statement. Error Code: 0, SQL state:
TStatus(statusCode:ERROR_STATUS,
infoMessages:[*org.apache.hive.service.cli.HiveSQLException:java.lang.ClassCastException:
org.apache.spark.sql.types.NullType$ cannot be cast to
org.apache.spark.sql.types.MapType:26:25,
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation:runInternal:SparkExecuteStatementOperation.scala:259,
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation:run:SparkExecuteStatementOperation.scala:144,
org.apache.hive.service.cli.session.HiveSessionImpl:executeStatementInternal:HiveSessionImpl.java:388,
org.apache.hive.service.cli.session.HiveSessionImpl:executeStatement:HiveSessionImpl.java:369,
sun.reflect.GeneratedMethodAccessor115:invoke::-1,
sun.reflect.DelegatingMethodAccessorImpl:invoke:DelegatingMethodAccessorImpl.java:43,
java.lang.reflect.Method:invoke:Method.java:497,
org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:78,
org.apache.hive.service.cli.session.HiveSessionProxy:access$000:HiveSessionProxy.java:36,
org.apache.hive.service.cli.session.HiveSessionProxy$1:run:HiveSessionProxy.java:63,
java.security.AccessController:doPrivileged:AccessController.java:-2,
javax.security.auth.Subject:doAs:Subject.java:422,
org.apache.hadoop.security.UserGroupInformation:doAs:UserGroupInformation.java:1628,
org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:59,
com.sun.proxy.$Proxy39:executeStatement::-1,
org.apache.hive.service.cli.CLIService:executeStatement:CLIService.java:261,
org.apache.hive.service.cli.thrift.ThriftCLIService:ExecuteStatement:ThriftCLIService.java:486,
org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement:getResult:TCLIService.java:1313,
org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement:getResult:TCLIService.java:1298,
org.apache.thrift.ProcessFunction:process:ProcessFunction.java:39,
org.apache.thrift.TBaseProcessor:process:TBaseProcessor.java:39,
org.apache.hive.service.auth.TSetIpAddressProcessor:process:TSetIpAddressProcessor.java:56,
org.apache.thrift.server.TThreadPoolServer$WorkerProcess:run:TThreadPoolServer.java:285,
java.util.concurrent.ThreadPoolExecutor:runWorker:ThreadPoolExecutor.java:1142,
java.util.concurrent.ThreadPoolExecutor$Worker:run:ThreadPoolExecutor.java:617,
java.lang.Thread:run:Thread.java:745], errorCode:0,
errorMessage:java.lang.ClassCastException:
org.apache.spark.sql.types.NullType$ cannot be cast to
org.apache.spark.sql.types.MapType), Query:

Applying any logical or arithmetic operator to a value extracted from the map
yields this error.
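
To illustrate, going by that behaviour, all of these variants should fail with
the same ClassCastException, while the plain lookup succeeds (same
hypothetical table as above):

select column.key1.key2 from table;       -- works
select column.key1.key2 + 1 from table;   -- fails
select column.key1.key2 > 0 from table;   -- fails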

The workaround I found was to cast the value as an int and change my logical
evaluation to an equality test. (In this case, I wanted TRUE when the result
of the division was less than 1, and casting to float or double yielded the
same error.)

select cast(cast(column.key1.key2 as int) / 30 as int) = 0 from table
(Yes, the result of the division needs to be cast as well, for some reason...)
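
If the explicit cast is what sidesteps the problem, a slightly simpler variant
might also work (an untested sketch; it is only equivalent for non-negative
values, since x / 30 < 1 is the same as x < 30 there):

select cast(column.key1.key2 as int) < 30 from table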

Can anyone shed some light on why map types have trouble with basic
operators?

Thanks!



