Re: Want to avoid groupByKey as its running for ever

2015-06-30 Thread ๏̯͡๏
I modified it to:
detailInputsToGroup.map { case (detailInput, dataRecord) =>
  val key: StringBuilder = new StringBuilder
  dimensions.foreach { dimension =>
    key ++= { Option(dataRecord.get(dimension)).getOrElse(Option(detailInput.get(dimensi

Re: Want to avoid groupByKey as its running for ever

2015-06-30 Thread Daniel Siegmann
If the number of items is very large, have you considered using probabilistic counting? The HyperLogLogPlus class from stream-lib
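To make the probabilistic-counting suggestion concrete, here is a minimal sketch of per-key approximate distinct counts with stream-lib's HyperLogLogPlus. It is not the code from this thread: the record shape (key, value), the precision of 14, and the names ApproxDistinctPerKey, add, combine and approxDistinct are all assumptions for illustration, and it runs on plain Scala collections rather than an RDD. It only needs stream-lib on the classpath.

```scala
import com.clearspring.analytics.stream.cardinality.HyperLogLogPlus

object ApproxDistinctPerKey {
  // Assumed precision (2^14 registers); not a value taken from the thread.
  val Precision = 14

  // Fold one value into a key's sketch (the role a seqOp would play).
  def add(sketch: HyperLogLogPlus, value: String): HyperLogLogPlus = {
    sketch.offer(value)
    sketch
  }

  // Merge two partial sketches for the same key (the role a combOp would play).
  def combine(a: HyperLogLogPlus, b: HyperLogLogPlus): HyperLogLogPlus = {
    a.addAll(b)
    a
  }

  // Keep one fixed-size sketch per key instead of collecting every value per key.
  def approxDistinct(records: Seq[(String, String)]): Map[String, Long] = {
    val sketches = records.foldLeft(Map.empty[String, HyperLogLogPlus]) {
      case (acc, (key, value)) =>
        val sketch = acc.getOrElse(key, new HyperLogLogPlus(Precision))
        acc.updated(key, add(sketch, value))
    }
    sketches.map { case (key, sketch) => key -> sketch.cardinality() }
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical sample data.
    val records = Seq(
      ("US", "item-1"), ("US", "item-2"), ("US", "item-1"),
      ("DE", "item-3")
    )
    approxDistinct(records).foreach { case (key, n) =>
      println(s"$key: about $n distinct values")
    }
  }
}
```

On an RDD of pairs, the same two functions could plausibly be passed to aggregateByKey as the sequence and combine operations, so that only small sketches are shuffled rather than the full groups. Spark's countApproxDistinctByKey may also cover this case directly. Either way the counts are estimates, not exact values.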