On Wed, May 5, 2021, at 21:51, Tom Lane wrote:
> Andrew Dunstan <and...@dunslane.net> writes:
> > On 5/5/21 2:45 PM, Tom Lane wrote:
> >> Yeah, that's because of the conversion to "chr".  But a regexp
> >> is overkill for that anyway.  Don't we have something that will
> >> split on simple substring matches?
>
> > Not that I know of. There is split_part but I don't think that's fit for
> > purpose here. Do we need one, or have I missed something?
>
> [ checks manual ... ]
>
> string_to_array or string_to_table would do, I think.
>
> regards, tom lane
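(For anyone following along: both functions split on an exact substring rather than a regexp. A minimal sketch, assuming PostgreSQL 14 where string_to_table() was added:

```sql
-- string_to_table() returns one row per piece:
SELECT string_to_table('foo||bar||baz', '||');
-- foo, bar, baz as three rows

-- string_to_array() returns the same pieces as a text[]:
SELECT string_to_array('foo||bar||baz', '||');
-- {foo,bar,baz}
```
)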
Thanks for these new functions, they seem really useful for a lot of cases.

However, I see two problems with using string_to_table() for this
particular use-case:

- It doesn't work with files larger than 1GB, due to pg_read_file()'s limit.
- It is 68% slower than using the COPY hack.

% ls -lah foo.txt
-rw-r--r--  1 joel  staff   623M May  6 07:31 foo.txt

% wc -l foo.txt
 6771864 foo.txt

# \d txt
               Table "public.txt"
 Column | Type | Collation | Nullable | Default
--------+------+-----------+----------+---------
 line   | text |           |          |

# COPY txt (line) FROM 'foo.txt' DELIMITER '"';
COPY 6771864
Time: 9829.707 ms (00:09.830)
Time: 9552.286 ms (00:09.552)
Time: 9483.115 ms (00:09.483)

# TRUNCATE txt;
TRUNCATE TABLE

# INSERT INTO txt (line) SELECT string_to_table(pg_read_file('foo.txt'),E'\n');
INSERT 0 6771865
Time: 16556.078 ms (00:16.556)
Time: 14720.343 ms (00:14.720)
Time: 17266.088 ms (00:17.266)

/Joel