If you know the key distribution, you may first split the data into N
non-intersecting buckets, where N is the number of CPU cores, and then run
Sort on each bucket in parallel.
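
A minimal sketch of that idea, assuming integer keys with a known range (the
bucket boundaries would come from whatever key distribution your data really
has):

package main

import (
	"fmt"
	"runtime"
	"sort"
	"sync"
)

// splitSortConcat splits data into non-intersecting buckets by value range
// (assuming roughly uniform int keys in [min, max]), sorts each bucket in
// its own goroutine, and concatenates the already-ordered buckets.
func splitSortConcat(data []int, min, max int) []int {
	n := runtime.NumCPU()
	buckets := make([][]int, n)
	span := max - min + 1
	for _, v := range data {
		i := (v - min) * n / span // bucket index; assumes min <= v <= max
		buckets[i] = append(buckets[i], v)
	}

	var wg sync.WaitGroup
	for i := range buckets {
		wg.Add(1)
		go func(b []int) {
			defer wg.Done()
			sort.Ints(b)
		}(buckets[i])
	}
	wg.Wait()

	// Buckets cover disjoint, increasing ranges, so concatenation is sorted.
	out := data[:0]
	for _, b := range buckets {
		out = append(out, b...)
	}
	return out
}

func main() {
	fmt.Println(splitSortConcat([]int{42, 7, 13, 99, 1, 56}, 0, 99))
}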
The general way this is done is to extract the keys needed for the sorting
into an index, sort this index, and then either use the index as an access
mechanism for the unmodified bulk data, or do a single-pass reordering.
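
For example, something along these lines (the Record type and its Key field
are just placeholders for whatever the real bulk data looks like):

package main

import (
	"fmt"
	"sort"
)

// Record stands in for a large item whose sort key is cheap to extract.
type Record struct {
	Key  string
	Blob []byte // bulk payload we never want to move while sorting
}

func main() {
	records := []Record{
		{Key: "carol"}, {Key: "alice"}, {Key: "bob"},
	}

	// Extract the keys and an index once, then sort only the index.
	idx := make([]int, len(records))
	keys := make([]string, len(records))
	for i, r := range records {
		idx[i] = i
		keys[i] = r.Key
	}
	sort.Slice(idx, func(a, b int) bool { return keys[idx[a]] < keys[idx[b]] })

	// Use the index to read the unmodified bulk data in sorted order...
	for _, i := range idx {
		fmt.Println(records[i].Key)
	}

	// ...or reorder the records themselves in a single pass.
	sorted := make([]Record, len(records))
	for pos, i := range idx {
		sorted[pos] = records[i]
	}
	fmt.Println(sorted[0].Key, sorted[1].Key, sorted[2].Key)
}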
On Thu, Nov 30, 2017 at 8:36 AM, Subramanian K wrote:
Thanks all, did some metrics on the comparison function "Less". I had JSON
data in this huge file, so I had to unmarshal the JSON and then compare based
on the obtained data. Unmarshalling took 98% of the time in this comparison.
Looking to avoid unmarshalling and find an alternative way to do this.
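
One way around that is to unmarshal (or otherwise extract) the sort key once
per record before sorting, so Less only compares pre-computed keys. A sketch,
assuming each record is a JSON line with a numeric "id" field as the key:

package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// line holds one raw JSON record plus its pre-extracted sort key, so the
// comparison function never unmarshals anything.
type line struct {
	raw []byte
	key int64 // assumed numeric "id" field; adjust to the real key
}

func sortByID(rawLines [][]byte) ([]line, error) {
	lines := make([]line, len(rawLines))
	for i, raw := range rawLines {
		// Decode only the key field, once per record, before sorting.
		var probe struct {
			ID int64 `json:"id"`
		}
		if err := json.Unmarshal(raw, &probe); err != nil {
			return nil, err
		}
		lines[i] = line{raw: raw, key: probe.ID}
	}
	sort.Slice(lines, func(a, b int) bool { return lines[a].key < lines[b].key })
	return lines, nil
}

func main() {
	data := [][]byte{
		[]byte(`{"id": 3, "payload": "c"}`),
		[]byte(`{"id": 1, "payload": "a"}`),
	}
	sorted, err := sortByID(data)
	if err != nil {
		panic(err)
	}
	for _, l := range sorted {
		fmt.Println(string(l.raw))
	}
}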
It should be very simple if you have an additional 2 GB of memory. You divide
the data into X parts, where X is a power of 2 and X needs to be less than
the number of cores available, e.g. for 2000 MB it can be 250 MB x 8. Then you
sort the parts in parallel using the built-in sorting function, and at the end
you just concatenate the sorted parts.
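
A rough sketch of that scheme for integer data. Note that unless the parts are
split by key range (as in the earlier bucket suggestion), the independently
sorted parts still need a final merge pass rather than a plain concatenation:

package main

import (
	"fmt"
	"sort"
	"sync"
)

// parallelSort splits data into roughly equal parts, sorts each part
// concurrently with the built-in sort, then merges the sorted runs pairwise.
func parallelSort(data []int, parts int) []int {
	chunk := (len(data) + parts - 1) / parts
	runs := make([][]int, 0, parts)
	for i := 0; i < len(data); i += chunk {
		end := i + chunk
		if end > len(data) {
			end = len(data)
		}
		runs = append(runs, data[i:end])
	}

	var wg sync.WaitGroup
	for _, r := range runs {
		wg.Add(1)
		go func(r []int) {
			defer wg.Done()
			sort.Ints(r) // each goroutine sorts a disjoint sub-slice
		}(r)
	}
	wg.Wait()

	// Merge the sorted runs pairwise until a single run remains.
	for len(runs) > 1 {
		merged := make([][]int, 0, (len(runs)+1)/2)
		for i := 0; i < len(runs); i += 2 {
			if i+1 == len(runs) {
				merged = append(merged, runs[i])
				break
			}
			merged = append(merged, merge(runs[i], runs[i+1]))
		}
		runs = merged
	}
	return runs[0]
}

// merge combines two sorted slices into one sorted slice.
func merge(a, b []int) []int {
	out := make([]int, 0, len(a)+len(b))
	for len(a) > 0 && len(b) > 0 {
		if a[0] <= b[0] {
			out = append(out, a[0])
			a = a[1:]
		} else {
			out = append(out, b[0])
			b = b[1:]
		}
	}
	return append(append(out, a...), b...)
}

func main() {
	fmt.Println(parallelSort([]int{9, 3, 7, 1, 8, 2, 6, 4, 5}, 4))
}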