Do Le Quoc created FLINK-4205:
---------------------------------

             Summary: Implement stratified sampling for DataSet
                 Key: FLINK-4205
                 URL: https://issues.apache.org/jira/browse/FLINK-4205
             Project: Flink
          Issue Type: New Feature
            Reporter: Do Le Quoc

A DataSet might consist of data from disparate sources. To obtain a representative sample, every data source should be given fair consideration. Stratified sampling is needed for this: it ensures that data from every source (stratum) is selected and that no minority stratum is excluded.
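
To illustrate the idea, here is a minimal sketch in plain Python. It does not use the Flink API and is not the proposed implementation. The function name `stratified_sample`, the `key_fn` and `fraction` parameters, the "at least one record per stratum" rule, and the example data are all assumptions made for illustration.

```python
import math
import random
from collections import defaultdict


def stratified_sample(records, key_fn, fraction, seed=None):
    """Sample about `fraction` of each stratum, keeping at least one record per stratum.

    Hypothetical helper for illustration only; not part of Flink.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)

    # Group records by stratum (here: the data source each record came from).
    strata = defaultdict(list)
    for record in records:
        strata[key_fn(record)].append(record)

    sample = []
    for members in strata.values():
        # Round up so that small strata still contribute at least one record.
        k = min(len(members), max(1, math.ceil(fraction * len(members))))
        # Sample without replacement inside the stratum.
        sample.extend(rng.sample(members, k))
    return sample


# Example: one large source and one very small (minority) source.
records = [("sensor-a", i) for i in range(1000)] + [("sensor-b", i) for i in range(5)]
sample = stratified_sample(records, key_fn=lambda r: r[0], fraction=0.01, seed=42)

counts = defaultdict(int)
for source, _ in sample:
    counts[source] += 1
print(dict(counts))
```

With a plain uniform 1% sample over all 1005 records, the five `sensor-b` records would often be missed entirely. Sampling within each stratum guarantees that the minority source is still represented.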