Xu Yang created FLINK-12671:
-------------------------------

             Summary: Summarizer: summary statistics for Table
                 Key: FLINK-12671
                 URL: https://issues.apache.org/jira/browse/FLINK-12671
             Project: Flink
          Issue Type: Sub-task
            Reporter: Xu Yang
            Assignee: Xu Yang

Summarizer provides summary statistics for a Table. Users can easily get the total count and the basic column-wise metrics: max, min, mean, variance, standardDeviation, normL1, normL2, the number of missing values, and the number of valid values.

Spark ML provides a similar Summarizer: [http://spark.apache.org/docs/latest/ml-statistics.html#summarizer]

Example:

    Table input = …
    TableSummary summary = new Summarizer(input).collectResult();
    System.out.println(summary.mean("age")); // print the mean of the column named "age"
    System.out.println(summary);

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
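
For illustration only, the column-wise metrics listed in the description could be computed along the following lines. This is a sketch in plain Java and not the proposed Flink implementation: the class name ColumnSummary, its fields, and the method of(...) are hypothetical. It assumes that a missing value is represented by null, that variance means the sample variance (denominator n - 1), and that the variance of a column with fewer than two valid values is reported as 0.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not a Flink class): summary statistics for one numeric column. */
final class ColumnSummary {
    final long numValid;    // number of non-null values
    final long numMissing;  // number of null values
    final double min, max, mean, variance, standardDeviation, normL1, normL2;

    private ColumnSummary(long numValid, long numMissing, double min, double max,
                          double mean, double variance, double normL1, double normL2) {
        this.numValid = numValid;
        this.numMissing = numMissing;
        this.min = min;
        this.max = max;
        this.mean = mean;
        this.variance = variance;
        this.standardDeviation = Math.sqrt(variance);
        this.normL1 = normL1;
        this.normL2 = normL2;
    }

    /** Total row count, valid and missing together. */
    long count() {
        return numValid + numMissing;
    }

    /** Single pass over the column; the mean and variance use Welford's update. */
    static ColumnSummary of(List<Double> column) {
        long valid = 0, missing = 0;
        double min = Double.POSITIVE_INFINITY, max = Double.NEGATIVE_INFINITY;
        double mean = 0.0, m2 = 0.0, sumAbs = 0.0, sumSq = 0.0;
        for (Double v : column) {
            if (v == null) {          // assumption: null marks a missing value
                missing++;
                continue;
            }
            valid++;
            min = Math.min(min, v);
            max = Math.max(max, v);
            double delta = v - mean;
            mean += delta / valid;
            m2 += delta * (v - mean); // running sum of squared deviations
            sumAbs += Math.abs(v);
            sumSq += v * v;
        }
        if (valid == 0) {             // no valid values: min, max and mean are undefined
            min = max = mean = Double.NaN;
        }
        // assumption: sample variance, reported as 0 for fewer than two valid values
        double variance = valid > 1 ? m2 / (valid - 1) : 0.0;
        return new ColumnSummary(valid, missing, min, max, mean, variance,
                                 sumAbs, Math.sqrt(sumSq));
    }

    public static void main(String[] args) {
        ColumnSummary age = ColumnSummary.of(Arrays.asList(1.0, null, 2.0, 3.0, null, 4.0));
        System.out.println("count=" + age.count() + " valid=" + age.numValid
                + " missing=" + age.numMissing);
        System.out.println("mean=" + age.mean + " variance=" + age.variance);
    }
}
```

A single pass keeps the per-column state small (a handful of counters and running sums), which is the kind of state that could be merged across partitions in a distributed job. How the actual Summarizer aggregates is not specified in this issue.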