Hi, I am starting a new thread as a follow-up to the old one. It looks like the code needed to compile callUdf("percentile_approx", col("mycol"), lit(0.25)) is not merged into the Spark 1.5.1 source, but I don't understand why this function call works in the Spark 1.5.1 bin/spark-shell. Please guide.

---------- Forwarded message ----------
From: "Ted Yu" <yuzhih...@gmail.com>
Date: Oct 14, 2015 3:26 AM
Subject: Re: How to calculate percentile of a column of DataFrame?
To: "Umesh Kacha" <umesh.ka...@gmail.com>
Cc: "Michael Armbrust" <mich...@databricks.com>, "<saif.a.ell...@wellsfargo.com>" <saif.a.ell...@wellsfargo.com>, "user" <user@spark.apache.org>

I modified DataFrameSuite, in master branch, to call percentile_approx instead of simpleUDF :

- deprecated callUdf in SQLContext
- callUDF in SQLContext *** FAILED ***
  org.apache.spark.sql.AnalysisException: undefined function percentile_approx;
  at org.apache.spark.sql.catalyst.analysis.SimpleFunctionRegistry$$anonfun$2.apply(FunctionRegistry.scala:64)
  at org.apache.spark.sql.catalyst.analysis.SimpleFunctionRegistry$$anonfun$2.apply(FunctionRegistry.scala:64)
  at scala.Option.getOrElse(Option.scala:120)
  at org.apache.spark.sql.catalyst.analysis.SimpleFunctionRegistry.lookupFunction(FunctionRegistry.scala:63)
  at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$10$$anonfun$applyOrElse$5$$anonfun$applyOrElse$24.apply(Analyzer.scala:506)
  at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$10$$anonfun$applyOrElse$5$$anonfun$applyOrElse$24.apply(Analyzer.scala:506)
  at org.apache.spark.sql.catalyst.analysis.package$.withPosition(package.scala:48)
  at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$10$$anonfun$applyOrElse$5.applyOrElse(Analyzer.scala:505)
  at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$10$$anonfun$applyOrElse$5.applyOrElse(Analyzer.scala:502)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:227)

SPARK-10671 is included.
For 1.5.1, I guess the absence of SPARK-10671 means that SparkSQL treats percentile_approx as normal UDF.

Experts can correct me, if there is any misunderstanding.

Cheers