Hello,

I have a dataset in which some of the rows for a numeric column contain
values in scientific notation. When I enable schema inference, any operation
involving those rows fails with a decimal precision error while setting the
decimal value. An example of a literal that triggers the issue is 1.225e+006.
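If it helps, I can reproduce the mismatch with plain java.math.BigDecimal, no Spark involved — this is only my guess at what is going on, but the literal parses with a small precision while its expanded plain value needs seven digits:

```scala
// Self-contained sketch of the suspected cause: the scientific-notation
// literal parses with precision 4 (unscaled value 1225, scale -3), but
// the same value written out in full needs 7 digits of precision, which
// would exceed an inferred max precision of 6.
import java.math.BigDecimal

object DecimalPrecisionSketch {
  def main(args: Array[String]): Unit = {
    val sci = new BigDecimal("1.225e+006")
    println(s"parsed precision = ${sci.precision()}, scale = ${sci.scale()}")

    val plain = new BigDecimal(sci.toPlainString)
    println(s"plain value = ${plain.toPlainString}, precision = ${plain.precision()}")
  }
}
```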

scala> val df = spark.read.option("header", "true").option("inferSchema",
"true").csv("sales_data.csv")

df: org.apache.spark.sql.DataFrame = [id: bigint, date: string ... 19 more
fields]


scala> df.select(sum("price")).show

16/10/09 05:46:01 ERROR Executor: Exception in task 0.0 in stage 63.0 (TID
68)

java.lang.IllegalArgumentException: requirement failed: Decimal precision 7
exceeds max precision 6

at scala.Predef$.require(Predef.scala:224)

at org.apache.spark.sql.types.Decimal.set(Decimal.scala:112)
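As a workaround I am currently supplying an explicit schema instead of relying on inference, reading the affected column as a double. A sketch below — the column names are trimmed to three for illustration (my real file has 21 fields), so treat them as placeholders. Is this the recommended approach, or is there an option to make the inferencer handle scientific notation?

```scala
import org.apache.spark.sql.types._

// Hypothetical explicit schema, trimmed for illustration; DoubleType
// sidesteps the decimal precision check on scientific-notation literals.
val schema = StructType(Seq(
  StructField("id", LongType),
  StructField("date", StringType),
  StructField("price", DoubleType)
))

val df = spark.read
  .option("header", "true")
  .schema(schema)
  .csv("sales_data.csv")
```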
Many thanks

-- 
*Meeraj Kunnumpurath*


Director and Executive Principal
Service Symphony Ltd
00 44 7702 693597
00 971 50 409 0169
mee...@servicesymphony.com
