Re: [go-nuts] Re: Sort a huge slice of data around 2GB

2017-12-01 Thread Sokolov Yura
If you know the key distribution, you can first split the data into N non-intersecting buckets, where N is the number of CPU cores, and then run Sort on each bucket in parallel.
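
A minimal sketch of that bucket idea, assuming the keys are uint32 values spread roughly evenly over their range; the key type and the partitioning rule are placeholders I chose for illustration, not something stated in the thread:

    package main

    import (
        "fmt"
        "runtime"
        "sort"
        "sync"
    )

    func parallelBucketSort(data []uint32) []uint32 {
        n := runtime.NumCPU()
        buckets := make([][]uint32, n)
        // Partition by the high bits of the key, so every value in bucket i
        // is <= every value in bucket i+1 once the buckets are sorted.
        for _, v := range data {
            i := int(uint64(v) * uint64(n) >> 32)
            buckets[i] = append(buckets[i], v)
        }
        var wg sync.WaitGroup
        for i := range buckets {
            wg.Add(1)
            go func(b []uint32) {
                defer wg.Done()
                sort.Slice(b, func(x, y int) bool { return b[x] < b[y] })
            }(buckets[i])
        }
        wg.Wait()
        // The buckets are already ordered relative to each other, so plain
        // concatenation yields the fully sorted result.
        out := data[:0]
        for _, b := range buckets {
            out = append(out, b...)
        }
        return out
    }

    func main() {
        data := []uint32{42, 7, 3000000000, 1, 99, 2000000000}
        fmt.Println(parallelBucketSort(data))
    }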

Re: [go-nuts] Re: Sort a huge slice of data around 2GB

2017-11-30 Thread Michael Jones
The general way this is done is to extract the keys needed for the sorting, together with an index of the records, sort that index, and then either use the index as an access mechanism into the unmodified bulk data or do a single-pass reordering.
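
A short sketch of the "sort an index, not the data" idea; the record type and its key field are invented for illustration:

    package main

    import (
        "fmt"
        "sort"
    )

    type record struct {
        Key  string
        Blob [64]byte // stand-in for the bulky part that never needs to move
    }

    func main() {
        records := []record{{Key: "pear"}, {Key: "apple"}, {Key: "mango"}}

        // Build an index of positions and sort it by the extracted key;
        // the large records themselves stay where they are.
        idx := make([]int, len(records))
        for i := range idx {
            idx[i] = i
        }
        sort.Slice(idx, func(a, b int) bool {
            return records[idx[a]].Key < records[idx[b]].Key
        })

        // Access the data in sorted order through the index.
        for _, i := range idx {
            fmt.Println(records[i].Key)
        }
    }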

[go-nuts] Re: Sort a huge slice of data around 2GB

2017-11-30 Thread Subramanian K
Thanks all. I gathered some metrics on the comparison function "Less": the huge file holds JSON data, so each comparison had to unmarshal the JSON and then compare the decoded values. Unmarshalling took 98% of the time spent in the comparison. Looking to avoid unmarshalling and find an alternative way to do this.
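
One way to keep json.Unmarshal out of Less is to decode the sort key once per record up front and compare the cached keys; this is only a sketch, and the field name "id" and the record layout are assumptions about the data:

    package main

    import (
        "encoding/json"
        "fmt"
        "sort"
    )

    type keyed struct {
        key int             // decoded once, compared many times
        raw json.RawMessage // original JSON, left untouched
    }

    func main() {
        lines := []string{`{"id":3,"v":"c"}`, `{"id":1,"v":"a"}`, `{"id":2,"v":"b"}`}

        items := make([]keyed, len(lines))
        for i, l := range lines {
            var k struct {
                ID int `json:"id"`
            }
            if err := json.Unmarshal([]byte(l), &k); err != nil {
                panic(err)
            }
            items[i] = keyed{key: k.ID, raw: json.RawMessage(l)}
        }

        // Less now touches only pre-decoded integers, never the JSON decoder.
        sort.Slice(items, func(a, b int) bool { return items[a].key < items[b].key })

        for _, it := range items {
            fmt.Println(string(it.raw))
        }
    }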

[go-nuts] Re: Sort a huge slice of data around 2GB

2017-11-30 Thread Slawomir Pryczek
It should be very simple if you have an additional 2GB of memory. You divide the data into X parts, where X is a power of 2 and X is no more than the number of cores available, e.g. for 2000MB that can be 8 parts of 250MB each. Then you sort the parts in parallel using the built-in sorting function, and at the end you just concatenate…
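
A sketch of that chunk-and-sort-in-parallel approach for a plain []int, with one caveat made explicit: unless the chunks are partitioned by key range, the final step has to merge the sorted chunks rather than just concatenate them. The helper below is my own illustration, not code from the thread:

    package main

    import (
        "fmt"
        "sort"
        "sync"
    )

    func parallelChunkSort(data []int, chunks int) {
        var wg sync.WaitGroup
        size := (len(data) + chunks - 1) / chunks
        for lo := 0; lo < len(data); lo += size {
            hi := lo + size
            if hi > len(data) {
                hi = len(data)
            }
            wg.Add(1)
            go func(part []int) {
                defer wg.Done()
                sort.Ints(part) // each chunk sorted on its own goroutine
            }(data[lo:hi])
        }
        wg.Wait()

        // Merge pairs of sorted chunks until one sorted run remains.
        for width := size; width < len(data); width *= 2 {
            for lo := 0; lo < len(data); lo += 2 * width {
                mid, hi := lo+width, lo+2*width
                if mid > len(data) {
                    mid = len(data)
                }
                if hi > len(data) {
                    hi = len(data)
                }
                merged := make([]int, 0, hi-lo)
                a, b := data[lo:mid], data[mid:hi]
                for len(a) > 0 && len(b) > 0 {
                    if a[0] <= b[0] {
                        merged, a = append(merged, a[0]), a[1:]
                    } else {
                        merged, b = append(merged, b[0]), b[1:]
                    }
                }
                merged = append(merged, a...)
                merged = append(merged, b...)
                copy(data[lo:hi], merged)
            }
        }
    }

    func main() {
        data := []int{9, 1, 8, 2, 7, 3, 6, 4, 5, 0}
        parallelChunkSort(data, 4)
        fmt.Println(data)
    }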