Thanks, Kevin.

This works for aggregating one or two columns,
but it does not work for this:

val expr = Map("forCount" -> "count") ++ features.map(_ -> "mean")
val averageDF = originalDF
  .withColumn("forCount", lit(0))
  .groupBy(col("..."))
  .agg(expr)
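
For reference, one way to keep explicit control of the output names with many columns is to build aliased Column expressions instead of the Map form (a sketch; the `features: Seq[String]` value and the `_mean` naming are illustrative, and the helper `forCount` column is no longer needed since `count` can take a literal):

```scala
import org.apache.spark.sql.functions._

// One aliased aggregate per feature column, plus a row count.
// agg(Column, Column*) takes the rest of the list via varargs expansion.
val aggs = count(lit(1)).alias("forCount") +:
  features.map(f => mean(col(f)).alias(s"${f}_mean"))

val averageDF = originalDF
  .groupBy(col("..."))
  .agg(aggs.head, aggs.tail: _*)
```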

Yu Wenpei.



From:   Kevin Mellott <kevin.r.mell...@gmail.com>
To:     Wen Pei Yu <yuw...@cn.ibm.com>
Cc:     user <user@spark.apache.org>
Date:   03/24/2017 09:48 AM
Subject:        Re: Aggregated column name



I'm not sure of the answer to your question; however, when performing
aggregates I find it useful to specify an alias for each column. That will
give you explicit control over the name of the resulting column.

In your example, that would look something like:

df.groupBy(col("...")).agg(count("number").alias("ColumnNameCount"))
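
With multiple aggregates, each one can carry its own alias (the column names here are just illustrative):

```scala
df.groupBy(col("..."))
  .agg(
    count("number").alias("NumberCount"),
    avg("number").alias("NumberAvg")
  )
```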

Hope that helps!
Kevin

On Thu, Mar 23, 2017 at 2:41 AM, Wen Pei Yu <yuw...@cn.ibm.com> wrote:
  Hi All

  I found that some Spark versions (Spark 1.4) return the aggregated
  column name in upper case, and some return it in lower case.
  For the code below,
  df.groupBy(col("...")).agg(count("number"))
  may return

  COUNT(number)  ------ Spark 1.4
  count(number)  ------ Spark 1.6

  Does anyone know if there is a configuration parameter for this, or
  which PR changed it?

  Thank you very much.
  Yu Wenpei.


