Re: some questions about data skew

2021-05-10 Thread Dawid Wysakowicz
Hi, What you could do to improve processing of a skewed data is to introduce an artificial preaggregation. You could add some artificial uniformly distributed secondary key and calculate your aggregates on (original key, secondary uniform key) and then do the final aggregation in an additional ste

some questions about data skew

2021-05-06 Thread jester jim
Hi, I have run a program to monitor the sum of the delay in every minutes of a stream,this is my code: .map(new RichMapFunction[String,(Long,Int)] { override def map(in: String): (Long,Int) = { var str:String = "" try { val arr = in.split("\\|") ((System.currentTime