The following review has been posted through the commitfest application:
make installcheck-world: tested, failed
Implements feature: tested, failed
Spec compliant: tested, failed
Documentation: tested, failed
Hi Joel,
After testing the patch, I observed that for single-
Thread renamed to: New "single" COPY format [1]
[1] https://postgr.es/m/1db18e33-f1cf-4f2c-9d52-b6d7ff242...@app.fastmail.com
/Joel
On Mon, Nov 4, 2024 at 7:22 PM Joel Jacobson wrote:
>
> On Mon, Nov 4, 2024, at 19:34, Masahiko Sawada wrote:
> > On Sat, Nov 2, 2024 at 4:08 AM Joel Jacobson wrote:
> >>
> >> On Fri, Nov 1, 2024, at 22:28, Masahiko Sawada wrote:
> >> > As I mentioned in a separate email, if we use the OS default
On Mon, Nov 4, 2024, at 19:34, Masahiko Sawada wrote:
> On Sat, Nov 2, 2024 at 4:08 AM Joel Jacobson wrote:
>>
>> On Fri, Nov 1, 2024, at 22:28, Masahiko Sawada wrote:
>> > As I mentioned in a separate email, if we use the OS default EOL as
>> > the default EOL in raw format, it would not be neces
On Sat, Nov 2, 2024 at 4:08 AM Joel Jacobson wrote:
>
> On Fri, Nov 1, 2024, at 22:28, Masahiko Sawada wrote:
> > As I mentioned in a separate email, if we use the OS default EOL as
> > the default EOL in raw format, it would not be necessary to allow it
> > to be multi characters. I think it's wo
On Fri, Nov 1, 2024, at 22:28, Masahiko Sawada wrote:
> As I mentioned in a separate email, if we use the OS default EOL as
> the default EOL in raw format, it would not be necessary to allow it
> to be multi characters. I think it's worth considering it.
I like the idea, but not sure I understand
On Wed, Oct 30, 2024 at 4:54 AM Joel Jacobson wrote:
>
> On Wed, Oct 30, 2024, at 09:14, Joel Jacobson wrote:
> > $ psql -f bench_result.sql
>
> Oops, I realized I benchmarked a debug build,
> reran the benchmark with `meson setup build --buildtype=release`,
> and also added benchmarking of HEAD:
>
On Tue, Oct 29, 2024 at 9:48 AM Joel Jacobson wrote:
>
> > ---
> > It's a bit odd to me to use the delimiter as an EOL marker in raw
> > format, but probably it's okay.
> >
> > ---
> > - if (cstate->opts.format != COPY_FORMAT_BINARY)
> > + if (cstate->opts.format == COPY_FORMAT_
On Mon, Oct 28, 2024, at 18:50, Masahiko Sawada wrote:
> Thank you for updating the patch. Here are review comments on the v15
> 0002 patch:
Thanks for review.
> When testing the patch with an empty delimiter, I got the following failure:
>
> postgres(1:903898)=# copy hoge from '/tmp/tmp.raw' wit
On Mon, Oct 28, 2024 at 3:21 AM Joel Jacobson wrote:
>
> On Mon, Oct 28, 2024, at 10:30, Joel Jacobson wrote:
> > On Mon, Oct 28, 2024, at 08:56, jian he wrote:
> >> /* Check force_quote */
> >> - if (!opts_out->csv_mode && (opts_out->force_quote ||
> >> opts_out->force_quote_all))
> >> + if (op
On Mon, Oct 28, 2024, at 10:30, Joel Jacobson wrote:
> On Mon, Oct 28, 2024, at 08:56, jian he wrote:
>> /* Check force_quote */
>> - if (!opts_out->csv_mode && (opts_out->force_quote ||
>> opts_out->force_quote_all))
>> + if (opts_out->format != COPY_FORMAT_CSV && (opts_out->force_quote ||
>> +
On Mon, Oct 28, 2024, at 08:56, jian he wrote:
> /* Check force_quote */
> - if (!opts_out->csv_mode && (opts_out->force_quote ||
> opts_out->force_quote_all))
> + if (opts_out->format != COPY_FORMAT_CSV && (opts_out->force_quote ||
> + opts_out->force_quote_all))
> ereport(ERROR,
> (errcode(
On Thu, Oct 24, 2024 at 2:30 PM Joel Jacobson wrote:
>
> On Thu, Oct 24, 2024, at 03:54, Masahiko Sawada wrote:
> > I have one question:
> >
> > From the 0001 patch's commit message:
> >
> > No behavioral changes are intended; this is a pure refactoring to improve
> > code
> > clarity and maintai
On Thu, Oct 24, 2024, at 03:54, Masahiko Sawada wrote:
> I have one question:
>
> From the 0001 patch's commit message:
>
> No behavioral changes are intended; this is a pure refactoring to improve code
> clarity and maintainability.
>
> Does the reorganization of the option validation done by this
Hi,
On Sat, Oct 19, 2024 at 8:33 AM Joel Jacobson wrote:
>
> On Sat, Oct 19, 2024, at 12:13, jian he wrote:
> > We already make RAW and can only have one column.
> > if RAW has no default delimiter, then COPY FROM a text file will
> > become one datum value;
> > which makes it look like importin
On Mon, Oct 21, 2024, at 16:35, jian he wrote:
> make ProcessCopyOptions process the options in the following order:
> 1. Extract options from the statement node tree
> 2. check each option; if it is not set, apply the default value.
> 3. checking for interdependent options
>
> I still think
> making step2 aligned wit
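The three-step ordering proposed above (extract, apply defaults, then check interdependent options) can be sketched roughly as follows. This is only an illustration under stated assumptions: DemoFormat, DemoOption, DemoCopyOptions and demo_process_options are hypothetical stand-ins, not PostgreSQL's actual definitions, and the error strings are illustrative rather than the real ereport() messages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not PostgreSQL's real types. */
typedef enum DemoFormat { DEMO_TEXT, DEMO_BINARY, DEMO_CSV } DemoFormat;

typedef struct DemoOption {
    const char *name;
    const char *value;
} DemoOption;

typedef struct DemoCopyOptions {
    DemoFormat  format;
    const char *delim;        /* NULL until specified or defaulted */
    bool        force_quote;
} DemoCopyOptions;

/* Returns NULL on success, or a static (illustrative) error message. */
const char *
demo_process_options(const DemoOption *opts, size_t n, DemoCopyOptions *out)
{
    assert(out != NULL);
    out->format = DEMO_TEXT;
    out->delim = NULL;
    out->force_quote = false;

    /* Step 1: extract the options that were explicitly given. */
    for (size_t i = 0; i < n; i++)
    {
        if (strcmp(opts[i].name, "format") == 0)
        {
            if (strcmp(opts[i].value, "text") == 0)
                out->format = DEMO_TEXT;
            else if (strcmp(opts[i].value, "binary") == 0)
                out->format = DEMO_BINARY;
            else if (strcmp(opts[i].value, "csv") == 0)
                out->format = DEMO_CSV;
            else
                return "unrecognized format";
        }
        else if (strcmp(opts[i].name, "delimiter") == 0)
            out->delim = opts[i].value;
        else if (strcmp(opts[i].name, "force_quote") == 0)
            out->force_quote = true;
        else
            return "unrecognized option";
    }

    /* Step 2: per-option defaults for anything left unset. */
    if (out->delim == NULL && out->format != DEMO_BINARY)
        out->delim = (out->format == DEMO_CSV) ? "," : "\t";

    /* Step 3: checks that involve more than one option. */
    if (out->format == DEMO_BINARY && out->delim != NULL)
        return "delimiter is not allowed with binary format";
    if (out->format != DEMO_CSV && out->force_quote)
        return "force_quote requires csv format";

    return NULL;
}
```

Keeping the cross-option checks in a final step means each one sees fully defaulted values, which is one possible reading of why the ordering matters.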
On Sat, Oct 19, 2024 at 11:33 PM Joel Jacobson wrote:
>
> > ProcessCopyOptions
> > /* Extract options from the statement node tree */
> > foreach(option, options)
> > {
> > }
> > /* --- DELIMITER option --- */
> > /* --- NULL option --- */
> > /* --- QUOTE option --- */
> > Currently the regress t
On Sat, Oct 19, 2024, at 12:13, jian he wrote:
> We already make RAW and can only have one column.
> if RAW has no default delimiter, then COPY FROM a text file will
> become one datum value;
> which makes it look like importing a Large Object.
> (https://www.postgresql.org/docs/17/lo-funcs.html)
On Sat, Oct 19, 2024 at 1:24 AM Joel Jacobson wrote:
>>
> Handling e.g. JSON and other structured text files that could contain
> newlines in a seamless way seems important; therefore the default is
> no delimiter for the raw format, so that the entire input is read as one data
> value for
On Fri, Oct 18, 2024, at 19:24, Joel Jacobson wrote:
> Attachments:
> * v11-0001-Refactor-ProcessCopyOptions-introduce-CopyFormat-enu.patch
> * v11-0002-Add-raw-format-to-COPY-command.patch
Here is a demo of importing a decently sized real text file,
that can't currently be imported without the
On Fri, Oct 18, 2024, at 15:52, jian he wrote:
> Raw Format is duplicated
> Raw Format didn't mention the special handling of
> end-of-data marker.
Thanks for reviewing, above fixed.
Here is a summary of the changes since v10, thanks to the feedback:
Handling of e.g. JSON and other structured t
On Wed, Oct 16, 2024 at 2:37 PM Joel Jacobson wrote:
>
> On Wed, Oct 16, 2024, at 05:31, jian he wrote:
> > Hi.
> > I only checked 0001, 0002, 0003.
> > the raw format patch is v9-0016.
> > 0003-0016 are a lot of small patches; maybe you can consolidate them to
> > make the review easier.
>
> Tha
On Wed, Oct 16, 2024, at 21:13, Joel Jacobson wrote:
> Therefore, maybe DELIMITER NONE would be a better default
> for RAW? Especially since it's then also more honest in being "raw".
>
> If needing to import an unstructured text file that is just newline
> delimited, and not wanting the entire fil
On Wed, Oct 16, 2024, at 20:30, Joel Jacobson wrote:
> A final thought is to maybe consider just skipping
> the automagical newline detection for RAW?
>
> Instead of the automagical detection,
> the default newline delimiter could be the OS default,
> similar to how COPY TO works.
>
> That way, it
On Wed, Oct 16, 2024, at 18:34, Daniel Verite wrote:
> Joel Jacobson wrote:
>
>> However, I think rejecting such column data seems like the
>> better alternative, to ensure data exported with COPY TO
>> can always be imported back using COPY FROM,
>> for the same format.
>
> On the other hand,
On Wed, Oct 16, 2024, at 18:04, Jacob Champion wrote:
> A hypothetical type whose text representation can contain '\r' but not
> '\n' still can't be unambiguously round-tripped under this scheme:
> COPY FROM will see the "mixed" line endings and complain, even though
> there's no ambiguity.
Yeah,
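To make the round-trip concern above concrete, here is a minimal sketch of a scanner that reports which line-ending styles occur in a buffer. It is a hypothetical illustration only (DemoEol and demo_detect_eol are invented names), and it is not how COPY FROM actually detects or validates line endings.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of the line endings seen in a buffer. */
typedef enum DemoEol {
    DEMO_EOL_NONE,
    DEMO_EOL_LF,
    DEMO_EOL_CR,
    DEMO_EOL_CRLF,
    DEMO_EOL_MIXED
} DemoEol;

DemoEol
demo_detect_eol(const char *buf, size_t len)
{
    DemoEol seen = DEMO_EOL_NONE;

    assert(buf != NULL || len == 0);

    for (size_t i = 0; i < len; i++)
    {
        DemoEol cur;

        if (buf[i] == '\r')
        {
            if (i + 1 < len && buf[i + 1] == '\n')
            {
                cur = DEMO_EOL_CRLF;
                i++;            /* consume the '\n' of the CRLF pair */
            }
            else
                cur = DEMO_EOL_CR;
        }
        else if (buf[i] == '\n')
            cur = DEMO_EOL_LF;
        else
            continue;           /* ordinary data byte */

        if (seen == DEMO_EOL_NONE)
            seen = cur;
        else if (seen != cur)
            return DEMO_EOL_MIXED;
    }
    return seen;
}
```

With this sketch, a value containing a bare '\r' written out with a '\n' terminator (for example the four bytes "a\rb\n") is classified as mixed, even though a reader that knew the terminator was '\n' could parse it without ambiguity.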
Joel Jacobson wrote:
> However, I think rejecting such column data seems like the
> better alternative, to ensure data exported with COPY TO
> can always be imported back using COPY FROM,
> for the same format.
On the other hand, that might prevent cases where we
want to export, for i
On Tue, Oct 15, 2024 at 1:38 PM Joel Jacobson wrote:
>
> However, I think rejecting such column data seems like the
> better alternative, to ensure data exported with COPY TO
> can always be imported back using COPY FROM,
> for the same format. If text column data contains newlines,
> users pro
On Wed, Oct 16, 2024, at 05:31, jian he wrote:
> Hi.
> I only checked 0001, 0002, 0003.
> the raw format patch is v9-0016.
> 0003-0016 are a lot of small patches; maybe you can consolidate them to
> make the review easier.
Thanks for reviewing.
OK, I've consolidated the v9 0003-0016 into a singl
On Tue, Oct 15, 2024 at 8:50 PM Joel Jacobson wrote:
>
Hi.
I only checked 0001, 0002, 0003.
the raw format patch is v9-0016.
0003-0016 are a lot of small patches; maybe you can consolidate them to
make the review easier.
-COPY x to stdin (format TEXT, force_quote(a));
+COPY x to stdout (format
On Tue, Oct 15, 2024, at 19:30, Jacob Champion wrote:
> Hi,
>
> Idle thoughts from a design perspective -- feel free to ignore, since
> I'm not the target audience for the feature:
Many thanks for looking at this!
> - If the column data stored in Postgres contains newlines, it seems
> like COPY T
Hi,
Idle thoughts from a design perspective -- feel free to ignore, since
I'm not the target audience for the feature:
- If the column data stored in Postgres contains newlines, it seems
like COPY TO won't work "correctly". Is that acceptable?
- RAW seems like an okay-ish label, but for something
On Mon, Oct 14, 2024, at 10:51, Joel Jacobson wrote:
> On Mon, Oct 14, 2024, at 10:07, Joel Jacobson wrote:
>> Attached is a first draft implementation of the new proposed COPY "raw"
>> format.
>>
>> The first two patches are just the bug fix in HEAD, reported separately:
>> https://commitfest.pos
On Mon, Oct 14, 2024, at 10:07, Joel Jacobson wrote:
> Attached is a first draft implementation of the new proposed COPY "raw"
> format.
>
> The first two patches are just the bug fix in HEAD, reported separately:
> https://commitfest.postgresql.org/50/5297/
I forgot about adding support for the
opyFormat, with options for the three
current formats.
* v4-0004-Reorganize-ProcessCopyOptions-for-clarity-and-consis.patch
The fourth patch reorganizes ProcessCopyOptions for clarity and consistent
option handling.
* v4-0005-Add-raw-COPY-format-support-for-unstructured-text-da.patch
Finally, the fi
On Sun, Oct 13, 2024, at 11:52, Tatsuo Ishii wrote:
> After copy imported the "unstructured text file" in "raw" COPY format,
> what is the column type? text? or bytea? If it's text, how do you
> handle encoding conversion if the "unstructured text file" is encoded
> in server side unsafe encoding
> Hi hackers,
>
> This thread is about implementing a new "raw" COPY format.
>
> This idea came up in a different thread [1], moved here.
>
> [1]
> https://postgr.es/m/47b5c6a7-5c0e-40aa-8ea2-c7b95ccf296f%40app.fastmail.com
>
> The main use-case for t
On Sat, Oct 12, 2024, at 02:48, jian he wrote:
> git version 2.34.1
> cannot do `git apply`
Sorry about that, fixed.
> typedef enum CopyFormat
> {
> COPY_FORMAT_TEXT,
> COPY_FORMAT_BINARY,
> COPY_FORMAT_CSV
> } CopyFormat;
Thanks, fixed.
> CopyFormat should add to
> src/tools/pginde
On Sat, Oct 12, 2024 at 5:02 AM Joel Jacobson wrote:
>
> On Fri, Oct 11, 2024, at 22:29, Joel Jacobson wrote:
> > Hi hackers,
> >
> > This thread is about implementing a new "raw" COPY format.
> ...
> > The attached patch implements the above ideas.
>
On Fri, Oct 11, 2024, at 22:29, Joel Jacobson wrote:
> Hi hackers,
>
> This thread is about implementing a new "raw" COPY format.
...
> The attached patch implements the above ideas.
>
> I think with these changes, it would be easier to hack on new and existin
Hi hackers,
This thread is about implementing a new "raw" COPY format.
This idea came up in a different thread [1], moved here.
[1] https://postgr.es/m/47b5c6a7-5c0e-40aa-8ea2-c7b95ccf296f%40app.fastmail.com
The main use-case for the raw format is when needing to import arbitrary
un