Hi All,

It is known, that  collation "C" significantly speeds up string comparisons and 
as a result sorting. I was wondering, whether it is possible to use it 
regardless of collation set on a column in sorts not visible to users?

Example I have in  mind is sorting performed for GroupAggregate. Purpose of 
that sort is to bring equal values next to each other, so as long as:
   1) user didn't request ORDER BY in addition to GROUP BY
   2) source column has any deterministic collation (as per docs all builtin 
collations are deterministic)

it seems to be possible to do sorting with any deterministic collation, 
regardless of what user specifid for the column being sorted. "C" collation is 
deterministic and fastest.

In other words, couldn't PostgreSQL convert this:
->  GroupAggregate  (cost=15726557.87..22944558.69 rows=7200001 width=176) 
(actual time=490103.209..771536.389 rows=36000000 loops=1)
      Group Key: ec_180days.msn, ec_180days.to_date_time
      ->  Sort  (cost=15726557.87..15906557.89 rows=72000008 width=113) (actual 
time=490094.849..524854.662 rows=72000000 loops=1)
            Sort Key: ec_180days.msn, ec_180days.to_date_time
            Sort Method: external merge  Disk: 7679136kB

To this:
->  GroupAggregate  (cost=14988274.87..22206275.69 rows=7200001 width=155) 
(actual time=140497.729..421510.001 rows=36000000 loops=1)

      Group Key: ec_180days.msn, ec_180days.to_date_time

      ->  Sort  (cost=14988274.87..15168274.89 rows=72000008 width=92) (actual 
time=140489.807..174228.722 rows=72000000 loops=1)

            Sort Key: ec_180days.msn COLLATE "C", ec_180days.to_date_time

            Sort Method: external merge  Disk: 7679136kB

which is 3 times faster in my tests.

Reply via email to