Try count(distinct columnname)
In SQL, distinct is not part of the function name.

On Tuesday, October 27, 2015, Shagun Sodhani <sshagunsodh...@gmail.com> wrote:

> Oops seems I made a mistake. The error message is : Exception in thread
> "main" org.apache.spark.sql.AnalysisException: undefined function
> countDistinct
> On 27 Oct 2015 15:49, "Shagun Sodhani" <sshagunsodh...@gmail.com> wrote:
>
>> Hi! I was trying out some aggregate functions in SparkSql and I noticed
>> that certain aggregate operators are not working. This includes:
>>
>> approxCountDistinct
>> countDistinct
>> mean
>> sumDistinct
>>
>> For example using countDistinct results in an error saying
>> *Exception in thread "main" org.apache.spark.sql.AnalysisException:
>> undefined function cosh;*
>>
>> I had a similar issue with cosh operator
>> <http://apache-spark-developers-list.1001551.n3.nabble.com/Exception-when-using-cosh-td14724.html>
>> as well some time back and it turned out that it was not registered in the
>> registry:
>> https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
>>
>> *I think it is the same issue again and would be glad to send over a
>> PR if someone can confirm if this is an actual bug and not some mistake on
>> my part.*
>>
>> Query I am using: SELECT countDistinct(`age`) as `data` FROM `table`
>> Spark Version: 10.4
>> SparkSql Version: 1.5.1
>>
>> I am using the standard example of (name, age) schema (though I am
>> setting age as Double and not Int as I am trying out maths functions).
>>
>> The entire error stack can be found here <http://pastebin.com/G6YzQXnn>.
>>
>> Thanks!
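
For reference, here is a minimal sketch of the count(distinct ...) form suggested at the top of this reply. It is not Spark: it uses Python's built-in sqlite3 module purely as a stand-in to show the SQL syntax, and the table name `people` and the sample rows are made up for illustration. The schema follows the (name, age) example from the quoted message, with age stored as a floating-point value.

```python
import sqlite3

# Assumption: SQLite stands in for Spark SQL here, only to illustrate
# that DISTINCT is a keyword inside count(), not part of a function name.
conn = sqlite3.connect(":memory:")

# Hypothetical table and rows; only the (name, age) shape comes from the thread.
conn.execute("CREATE TABLE people (name TEXT, age REAL)")
conn.executemany(
    "INSERT INTO people (name, age) VALUES (?, ?)",
    [("a", 30.0), ("b", 30.0), ("c", 25.0), ("d", None)],
)

# count(DISTINCT age) counts each non-NULL age value once.
distinct_count = conn.execute(
    "SELECT count(DISTINCT age) AS data FROM people"
).fetchone()[0]

# Plain count(age) counts every non-NULL age, duplicates included.
total_count = conn.execute("SELECT count(age) FROM people").fetchone()[0]

print(distinct_count, total_count)
```

Applied to the query in the quoted message, the same pattern would presumably read: SELECT count(distinct `age`) as `data` FROM `table`. I have not run that against Spark 1.5.1 myself.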