Optimize COPY FROM (FORMAT {text,csv}) using SIMD.

Presently, such commands scan the input buffer one byte at a time
looking for special characters.  This commit adds a new path that
uses SIMD instructions to skip over chunks of data without any
special characters.  This can be much faster on input that contains
few special characters.
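
As a rough illustration of the idea (not the code in this commit), the
sketch below skips over chunks that contain no special characters and
falls back to a byte-at-a-time loop otherwise.  It uses a portable
word-at-a-time (SWAR) test on 8-byte chunks as a stand-in for real SIMD
instructions.  The names has_byte() and skip_plain_bytes(), and the
choice of newline, carriage return, and backslash as the special
characters, are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ONES  UINT64_C(0x0101010101010101)
#define HIGHS UINT64_C(0x8080808080808080)

/*
 * Nonzero if any byte of v equals c.  XOR turns matching bytes into zero
 * bytes, and the classic "has zero byte" expression detects those.
 */
static inline uint64_t
has_byte(uint64_t v, unsigned char c)
{
    uint64_t x = v ^ (ONES * c);

    return (x - ONES) & ~x & HIGHS;
}

/*
 * Hypothetical helper: return the offset of the first special character
 * in buf[0..len), or len if there is none.  The special set used here
 * ('\n', '\r', '\\') is an assumption for illustration only.
 */
static size_t
skip_plain_bytes(const char *buf, size_t len)
{
    size_t      i = 0;

    assert(buf != NULL || len == 0);

    /* Fast path: test 8 bytes at once, skip chunks with no specials. */
    while (i + sizeof(uint64_t) <= len)
    {
        uint64_t    chunk;

        memcpy(&chunk, buf + i, sizeof(chunk));     /* safe unaligned load */
        if (has_byte(chunk, '\n') | has_byte(chunk, '\r') |
            has_byte(chunk, '\\'))
            break;
        i += sizeof(chunk);
    }

    /* Slow path: finish the chunk that held a special, or the tail. */
    while (i < len)
    {
        char        c = buf[i];

        if (c == '\n' || c == '\r' || c == '\\')
            break;
        i++;
    }
    return i;
}
```

A real vectorized version would compare 16 or more bytes per
instruction, but the control flow (skip clean chunks, then fall back to
the per-byte loop) would look much the same.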

To avoid regressions, SIMD processing is disabled for the remainder
of the COPY FROM command as soon as we encounter a short line or a
special character (except for end-of-line characters, else we'd
always disable it after the first line).  This is perhaps too
conservative, but it could probably be made more lenient in the
future via fine-tuned heuristics.
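
The heuristic described above might be sketched as follows.  This is a
simplified model, not the committed code: LineScanState, its
simd_enabled flag, CHUNK_SIZE, and the "shorter than one chunk"
definition of a short line are all assumptions, the chunk test is a
plain loop standing in for a vector comparison, and escape handling is
omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CHUNK_SIZE 16           /* hypothetical vector width in bytes */

typedef struct LineScanState
{
    bool        simd_enabled;   /* hypothetical flag, true at COPY start */
} LineScanState;

static bool
is_eol(char c)
{
    return c == '\n' || c == '\r';
}

/* Stand-in for a vector compare over one chunk of CHUNK_SIZE bytes. */
static bool
chunk_has_special(const char *p)
{
    for (size_t i = 0; i < CHUNK_SIZE; i++)
        if (is_eol(p[i]) || p[i] == '\\')
            return true;
    return false;
}

/*
 * Scan one line in buf[0..len) and return the offset of its end-of-line
 * character (or len if there is none).  Once a short line or a non-EOL
 * special character is seen, the fast path stays off for all later calls.
 */
static size_t
scan_line(LineScanState *st, const char *buf, size_t len)
{
    size_t      i = 0;

    assert(st != NULL);

    if (st->simd_enabled)
    {
        /* Fast path: skip whole chunks that hold no special characters. */
        while (i + CHUNK_SIZE <= len && !chunk_has_special(buf + i))
            i += CHUNK_SIZE;
    }

    /* Byte-at-a-time path, also used for the rest of a partial chunk. */
    for (; i < len; i++)
    {
        if (is_eol(buf[i]))
            break;              /* EOL alone never disables the fast path */
        if (buf[i] == '\\')
            st->simd_enabled = false;   /* special character seen */
    }

    if (i < CHUNK_SIZE)
        st->simd_enabled = false;       /* short line seen */

    return i;
}
```

Keeping the decision in a single per-command flag means the cost of a
wrong guess is bounded: after the first sign that the input does not
suit the fast path, every later line takes the byte-at-a-time path.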

Author: Nazir Bilal Yavuz <[email protected]>
Co-authored-by: Shinya Kato <[email protected]>
Reviewed-by: Ayoub Kazar <[email protected]>
Reviewed-by: Andrew Dunstan <[email protected]>
Reviewed-by: Neil Conway <[email protected]>
Reviewed-by: Greg Burd <[email protected]>
Tested-by: Manni Wood <[email protected]>
Tested-by: Mark Wong <[email protected]>
Discussion: https://postgr.es/m/CAOzEurSW8cNr6TPKsjrstnPfhf4QyQqB4tnPXGGe8N4e_v7Jig%40mail.gmail.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/e0a3a3fd5361913502ff696ecf47770ca55975ae

Modified Files
--------------
src/backend/commands/copyfrom.c          |   1 +
src/backend/commands/copyfromparse.c     | 185 ++++++++++++++++++++++++++++++-
src/include/commands/copyfrom_internal.h |   1 +
3 files changed, 184 insertions(+), 3 deletions(-)
