small cleanup in unicode_norm.c

2020-12-07 Thread John Naylor
We've had get_canonical_class() for a while as a backend-only function. There is some ad-hoc code elsewhere that implements the same logic in a couple places, so it makes sense for all sites to use this function instead, as in the attached. -- John Naylor EnterpriseDB:

Re: small cleanup in unicode_norm.c

2020-12-08 Thread John Naylor
On Tue, Dec 8, 2020 at 5:45 AM Michael Paquier wrote: > > On Mon, Dec 07, 2020 at 03:24:56PM -0400, John Naylor wrote: > > We've had get_canonical_class() for a while as a backend-only function. > > There is some ad-hoc code elsewhere that implements the same logic in a

Re: cutting down the TODO list thread

2020-12-10 Thread John Naylor
to work in multiple-argument aggregate calls Tom suggested this in 2006 for the sake of orthogonality. Given the amount of time passed, it seems not very important. - Allow DELETE and UPDATE to be used with LIMIT and ORDER BY Some use cases mentioned, but nearly all have some kind of worka

Re: cutting down the TODO list thread

2020-12-14 Thread John Naylor
On Thu, Dec 10, 2020 at 3:29 PM John Naylor wrote: > > *Views and Rules > *SQL Commands Hearing no objections, the items mentioned have been moved over. -- John Naylor EDB: http://www.enterprisedb.com

Re: Perform COPY FROM encoding conversions in larger chunks

2020-12-22 Thread John Naylor
out upgrade compatibility. Is that because user-defined conversions would no longer have the right signature? -- John Naylor EDB: http://www.enterprisedb.com

Re: Perform COPY FROM encoding conversions in larger chunks

2020-12-23 Thread John Naylor
re-create it with > the new function signature afterwards. A note in the release notes and a > check in pg_upgrade, with instructions to drop and recreate the > conversion, are probably enough. > That was my thought as well. -- John Naylor EDB: http://www.enterprisedb.com

Re: [POC] verifying UTF-8 using SIMD instructions

2021-02-12 Thread John Naylor
dded by the noError argument patch) 2. Add SSE4 validator -- it turns out the demo I referred to earlier doesn't match the algorithm in the paper. I plan to only copy the lookup tables from simdjson verbatim, but the code will basically be written from scratch, using simdjson as a hint. 3. Adjus

Re: Preventing free space from being reused

2021-02-13 Thread John Naylor
N opclass multi-minmax currently in development. It's designed to address that exact situation, and more review would be welcome: https://commitfest.postgresql.org/32/2523/ -- John Naylor EDB: http://www.enterprisedb.com

Re: [POC] verifying UTF-8 using SIMD instructions

2021-02-15 Thread John Naylor
warnings in the magic constants macros. That needs some polish. I also attached a C file that visually demonstrates every step of the algorithm following the example found in Table 9 in the paper. That contains the skeleton coding I started with and got abandoned early, so it might differ from the actual patch. -- John Naylor EDB: http://www.enterprisedb.com v3-SSE4-with-autoconf-support.patch Description: Binary data test-utf8.c Description: Binary data

Re: [POC] verifying UTF-8 using SIMD instructions

2021-02-16 Thread John Naylor
tually completely broken if you tried to pass the special flags to configure. I redesigned this part and it seems to work now. -- John Naylor EDB: http://www.enterprisedb.com v4-SSE4-with-autoconf-support.patch Description: Binary data

Re: [POC] verifying UTF-8 using SIMD instructions

2021-02-18 Thread John Naylor
On Mon, Feb 15, 2021 at 9:32 PM John Naylor wrote: > > On Mon, Feb 15, 2021 at 9:18 AM Heikki Linnakangas wrote: > > > > I'm guessing that's because the unaligned access in check_ascii() is > > expensive on this platform. > Some possible remedies: >

Re: WIP: BRIN multi-range indexes

2021-02-22 Thread John Naylor
d make sure it looks sane in page_inspect, but that's about it. -- John Naylor EDB: http://www.enterprisedb.com

Re: non-HOT update not looking at FSM for large tuple update

2021-02-24 Thread John Naylor
to take your -50 idea and make it more general and safe, by scaling the fudge factor based on fillfactor, such that if fillfactor is less than 100, the requested freespace is a bit smaller than the max. It's still a bit of a hack, though. I've attached a draft of this idea. -- John Naylor E

Re: non-HOT update not looking at FSM for large tuple update

2021-02-24 Thread John Naylor
example, we expect 1.5% of the page could be line items, then: > > targetFreeSpace = MaxHeapTupleSize * 0.985 That makes sense, although the exact number seems precisely tailored to your use case. 2% gives 164 bytes of slack and doesn't seem too large. Updated patch attached. -- John Naylor EDB: h
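The 164-byte figure above can be checked with a little arithmetic. The sketch below assumes the default 8 kB block size, where MaxHeapTupleSize works out to 8160 bytes (an assumption on my part; the excerpt does not state it), and uses integer percentages. The names sketch_padded_request and SKETCH_MAX_HEAP_TUPLE_SIZE are hypothetical and are not taken from the patch.

```c
#include <stddef.h>

/* Assumed value of MaxHeapTupleSize for the default 8 kB block size. */
#define SKETCH_MAX_HEAP_TUPLE_SIZE 8160

/*
 * Hypothetical helper: the amount of free space to request from the FSM
 * when slack_percent of the maximum tuple size is left as slack.
 * Integer division truncates, so the slack is rounded up.
 */
static size_t
sketch_padded_request(size_t max_tuple_size, unsigned slack_percent)
{
    return max_tuple_size * (100 - slack_percent) / 100;
}
```

Under these assumptions, sketch_padded_request(SKETCH_MAX_HEAP_TUPLE_SIZE, 2) is 7996, which leaves 8160 - 7996 = 164 bytes of slack.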

Re: Removing support for COPY FROM STDIN in protocol version 2

2021-02-25 Thread John Naylor
but that's not relevant here) and got the expected message when trying to connect: master: Welcome to psql 7.3.21, the PostgreSQL interactive terminal. patch: psql: FATAL: unsupported frontend protocol 2.0: server supports 3.0 to 3.0 I couldn't find any traces of version 2 in the tr

Re: non-HOT update not looking at FSM for large tuple update

2021-02-26 Thread John Naylor
e as I could, but 2% is obviously great too. :-) I can't think of any large drawbacks either of having a slightly larger value. > Thanks for posting the patch! I've added this to the commitfest as a bug fix and added you as an author. -- John Naylor EDB: http://www.enterprisedb.com

our use of popcount

2021-03-03 Thread John Naylor
stions/25078285/replacing-a-32-bit-loop-counter-with-64-bit-introduces-crazy-performance-deviati -- John Naylor EDB: http://www.enterprisedb.com v1-0001-Use-direct-function-calls-for-pg_popcount-32-64.patch Description: Binary data v1-0002-Use-platform-specific-implementations-of-pg_popco.patch Desc

Re: Speeding up GIST index creation for tsvectors

2021-03-03 Thread John Naylor
, so the compiler is not going to inline it on x86 anyway. That just confuses things. (I did make sure to remove indirect calls from the retail functions in [1], in case we want to go that route). [1] https://www.postgresql.org/message-id/CAFBsxsFCWys_yfPe4PoF3%3D2_oxU5fFR2H%2BmtM6njUA8nBiCYug%40mail.gmail.com -- John Naylor EDB: http://www.enterprisedb.com

Re: Force lookahead in COPY FROM parsing

2021-03-04 Thread John Naylor
s something not affected until the patch to do the encoding conversion in larger chunks. -- John Naylor EDB: http://www.enterprisedb.com

Re: WIP: BRIN multi-range indexes

2021-03-04 Thread John Naylor
d I assumed that was applicable here. -- John Naylor EDB: http://www.enterprisedb.com

Re: [POC] verifying UTF-8 using SIMD instructions

2021-03-09 Thread John Naylor
09176236caf. This way it would benefit other platforms > as well. I'm fairly certain that the author of a compiler capable of doing that in this case would be eligible for some kind of AI prize. :-) [1] https://www.postgresql.org/message-id/06d45421-61b8-86dd-e765-f1ce527a5...@iki.fi -- John Naylor EDB: http://www.enterprisedb.com

Re: non-HOT update not looking at FSM for large tuple update

2021-03-09 Thread John Naylor
> to clean up trailing unused line pointers. As in, can't we trim the line pointer > > array when vacuum detects that the trailing line pointers on the page are all > > unused? That seems like the proper fix, and I see you've started a thread for that. I don't think that

Re: non-HOT update not looking at FSM for large tuple update

2021-03-09 Thread John Naylor
the 2% slack logic (and has other benefits), but the rest of this patch would be needed regardless. -- John Naylor EDB: http://www.enterprisedb.com

Re: WIP: BRIN multi-range indexes

2021-03-09 Thread John Naylor
'm thinking the number of ranges that need to be scanned will increase regardless. Maybe rather than ignoring correlation, we could clamp it or otherwise tweak it. Not sure about the details, though, that would require some testing. -- John Naylor EDB: http://www.enterprisedb.com

Re: get rid of tags in the docs?

2021-03-10 Thread John Naylor
On Thu, Feb 4, 2021 at 11:31 AM Tom Lane wrote: > > John Naylor writes: > > While looking at the proposed removal of the v2 protocol, I noticed that we > > italicize some, but not all, instances of 'per se', 'pro forma', and 'ad > > hoc'. I

Re: Speeding up GIST index creation for tsvectors

2021-03-10 Thread John Naylor
On Mon, Mar 8, 2021 at 8:43 AM Amit Khandekar wrote: > > On Wed, 3 Mar 2021 at 23:32, John Naylor wrote: > > 0001: > > > > + /* > > + * We can process 64-bit chunks only if both are mis-aligned by the same > > + * number of bytes. > > + */ > > +

Re: Speeding up GIST index creation for tsvectors

2021-03-10 Thread John Naylor
uncomfortable with the fact that we can't rely on alignment, but maybe there's a simple fix somewhere in the gist code. -- John Naylor EDB: http://www.enterprisedb.com src/backend/access/heap/visibilitymap.c | 13 +- src/backend/nodes/bitmapset.c | 23 +-- src/backend/uti

Re: non-HOT update not looking at FSM for large tuple update

2021-03-11 Thread John Naylor
len <= maxPaddedFsmRequest) { ... targetFreeSpace = maxPaddedFsmRequest; } else targetFreeSpace = len + saveFreeSpace; Also, should I write a regression test for it? The test case is already available, just no obvious place to put it. -- John Naylor EDB: http://www.enterprisedb.com

Re: [POC] verifying UTF-8 using SIMD instructions

2021-03-12 Thread John Naylor
uld have. In reality, simdjson has different files for SSE4, AVX, AVX512, NEON, and Altivec. We can incorporate any of those as needed. That's a PG15 project, though, and I'm not volunteering. -- John Naylor EDB: http://www.enterprisedb.com

Re: Move catalog toast table and index declarations

2020-10-27 Thread John Naylor
On Tue, Oct 27, 2020 at 7:43 AM Peter Eisentraut < peter.eisentr...@2ndquadrant.com> wrote: > On 2020-10-24 15:23, John Naylor wrote: > > Style: In genbki.h, "extern int no_such_variable" is now out of place. > > Also, the old comments like "The macro de

duplicate function oid symbols

2020-10-27 Thread John Naylor
exist, as well as prevent such symbols from being emitted into pg_proc_d.h. But then again there is no guarantee the standard symbol is not being used elsewhere. Thoughts? -- John Naylor EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company

cutting down the TODO list thread

2020-10-27 Thread John Naylor
nd old - Change walsender so that it applies per-role settings Old and possibly obsolete -- [1] https://www.postgresql.org/message-id/CAFBsxsHbqMzDoGB3eAGmpcpB%2B7uae%2BLLi_G%2Bo8HMEECM9CbQcQ%40mail.gmail.com -- John Naylor EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company

Re: cutting down the TODO list thread

2020-10-27 Thread John Naylor
On Tue, Oct 27, 2020 at 4:00 PM Thomas Munro wrote: > On Wed, Oct 28, 2020 at 8:36 AM Andres Freund wrote: > > On 2020-10-27 15:24:35 -0400, John Naylor wrote: > > > - Allow WAL replay of CREATE TABLESPACE to work when the directory > > > structure on the recovery com

Re: cutting down the TODO list thread

2020-10-27 Thread John Naylor
E (I haven't looked further into this) -- John Naylor EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company

Re: duplicate function oid symbols

2020-10-27 Thread John Naylor
On Tue, Oct 27, 2020 at 9:51 AM Tom Lane wrote: > John Naylor writes: > > I noticed that the table AM abstraction introduced the symbol > > HEAP_TABLE_AM_HANDLER_OID, although we already have a convention for > > defining symbols automatically for builtin functions, wh

Re: cutting down the TODO list thread

2020-10-28 Thread John Naylor
On Tue, Oct 27, 2020 at 6:05 PM Bruce Momjian wrote: > On Tue, Oct 27, 2020 at 04:54:24PM -0400, John Naylor wrote: > > > > > > On Tue, Oct 27, 2020 at 3:52 PM Bruce Momjian wrote: > > > > > > Do any of these limitations need to be documented befor

Re: cutting down the TODO list thread

2020-10-28 Thread John Naylor
technical detail on the topic but if doing that, let's not > mark them as that inline -- create a separate page with those items on > it. > How about a section on the same page at the bottom, near "features we don't want"? -- John Naylor EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company

Re: duplicate function oid symbols

2020-10-28 Thread John Naylor
I wrote: > Ok, here is a patch to fix that, and also throw an error if pg_proc.dat > has an explicitly defined symbol. > It occurred to me I neglected to explain the error with a comment, which I've added in v2. -- John Naylor EnterpriseDB: http://www.enterprisedb.com

Re: duplicate function oid symbols

2020-10-28 Thread John Naylor
dler case is the standard macros don't already exist for these pg_type entries. The handmade macro idea could be used for all eight just as easily as for one. -- John Naylor EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company

document pg_settings view doesn't display custom options

2020-10-28 Thread John Naylor
Starting separate threads to keep from cluttering the TODO list thread. Here's a patch for the subject, as mentioned in https://www.postgresql.org/message-id/20201027220555.GS4951%40momjian.us -- John Naylor EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company d

Re: document pg_settings view doesn't display custom options

2020-10-28 Thread John Naylor
On Wed, Oct 28, 2020 at 2:15 PM John Naylor wrote: > Starting separate threads to keep from cluttering the TODO list thread. > > Here's a patch for the subject, as mentioned in > https://www.postgresql.org/message-id/20201027220555.GS4951%40momjian.us > I just realized I i

Re: duplicate function oid symbols

2020-10-28 Thread John Naylor
On Wed, Oct 28, 2020 at 3:24 PM Tom Lane wrote: > and then the negotiation here is only about whether to make this list > longer. We don't need to complicate genbki.pl with a new facility. > Agreed, and reformat_dat_files.pl must also know about these special attributes. -

Re: duplicate function oid symbols

2020-10-28 Thread John Naylor
> */ > #define CASHOID MONEYOID > #define LSNOID PG_LSNOID > > #endif > Here is a quick patch implementing this much. -- John Naylor EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company forbid-custom-pg-type-symbols.patch Description: Binary data

Re: document pg_settings view doesn't display custom options

2020-10-29 Thread John Naylor
On Wed, Oct 28, 2020 at 11:38 PM Fujii Masao wrote: > > > On 2020/10/29 3:45, John Naylor wrote: > > On Wed, Oct 28, 2020 at 2:15 PM John Naylor < > john.nay...@enterprisedb.com <mailto:john.nay...@enterprisedb.com>> wrote: > > > > Starting separat

Re: cutting down the TODO list thread

2020-10-30 Thread John Naylor
oing", but I'm open to better ideas. Once that's agreed upon, I'll make a new page and migrate the items over, minus the two that were mentioned upthread. -- John Naylor EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company

Re: document pg_settings view doesn't display custom options

2020-10-30 Thread John Naylor
On Thu, Oct 29, 2020 at 11:51 PM Fujii Masao wrote: > > > On 2020/10/29 21:54, John Naylor wrote: > > > The pg_settings does not display > > customized options > > that have been set before the relevant extension module has been > loaded. > > I

Re: document pg_settings view doesn't display custom options

2020-10-30 Thread John Naylor
On Fri, Oct 30, 2020 at 12:07 PM Tom Lane wrote: > John Naylor writes: > > On Thu, Oct 29, 2020 at 11:51 PM Fujii Masao < > masao.fu...@oss.nttdata.com> > > wrote: > >> Also I think this note should be in the different paragraph from the > >> paragra

Re: document pg_settings view doesn't display custom options

2020-10-30 Thread John Naylor
On Fri, Oct 30, 2020 at 12:48 PM Tom Lane wrote: > John Naylor writes: > > Okay, along those lines here's a patch using "this view" in a new > paragraph > > for simplicity. > > Basically OK with me, but ... > > > It seems fairly weird to use a n

document deviation from standard on REVOKE ROLE

2020-10-30 Thread John Naylor
This is the other doc fix as suggested in https://www.postgresql.org/message-id/20201027220555.GS4951%40momjian.us There is already a compatibility section, so put there. -- John Naylor EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company v1-doc-fix-revoke-role.patch

Re: [proposal] de-TOAST'ing using a iterator

2020-11-02 Thread John Naylor
s, so I > move it to "Waiting on author". > > As I understand, the patch he posted is fine -- it only crashes when he > tried a change I suggested. That's my recollection as well. -- John Naylor EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company

Re: cutting down the TODO list thread

2020-11-03 Thread John Naylor
rejected, in a way. Ultimately, it comes down to "when time permits". [1] https://www.postgresql.org/message-id/flat/CAFBsxsGsBZsG%3DcLM0Op5HFb2Ks6SzJrOc_eRO_jcKSNuqFRKnQ%40mail.gmail.com [2] https://www.postgresql.org/message-id/CAFBsxsEmg=kqrekxrlygy0ujcfyck4vgxzkalrwh_olfj8o...@mail.gmail.com -- John Naylor EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company

Re: Move catalog toast table and index declarations

2020-11-05 Thread John Naylor
On Thu, Nov 5, 2020 at 4:24 AM Peter Eisentraut < peter.eisentr...@2ndquadrant.com> wrote: > On 2020-10-27 13:12, John Naylor wrote: > > There's nothing wrong; it's just a minor point of consistency. For the > > first part, I mean defined symbols in this file

Re: Move catalog toast table and index declarations

2020-11-06 Thread John Naylor
On Thu, Nov 5, 2020 at 2:20 PM Peter Eisentraut < peter.eisentr...@2ndquadrant.com> wrote: > On 2020-11-05 12:59, John Naylor wrote: > > I think we're talking past each other. Here's a concrete example: > > > > #define BKI_ROWTYPE_OID(oid,oidmacro) > > #

Re: speed up unicode decomposition and recomposition

2020-11-06 Thread John Naylor
There is a latent bug in the way code pairs for recomposition are sorted due to a copy-pasto on my part. Makes no difference now, but it could in the future. While looking, it seems pg_bswap.h should actually be backend-only. Both fixed in the attached. -- John Naylor EnterpriseDB: http

Re: document pg_settings view doesn't display custom options

2020-11-09 Thread John Naylor
On Mon, Nov 9, 2020 at 2:12 AM Fujii Masao wrote: > Pushed. Thanks! > Thank you! -- John Naylor EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company

Re: WIP: BRIN multi-range indexes

2020-11-09 Thread John Naylor
ld have it check only odd numbers. That's a common feature of sieves, but also makes the code a bit harder to understand if you haven't seen it before. Also to fill in something I left for later, the reference for this /* upper bound of number of primes below limit */ /* WIP: reference for t
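For readers who have not seen the odd-only trick mentioned above, here is a minimal sketch of a sieve of Eratosthenes that stores flags for odd numbers only. It is a generic illustration, not the code from the patch under review; sketch_count_primes is a hypothetical name.

```c
#include <stdbool.h>
#include <stdlib.h>

/*
 * Count the primes <= limit with a sieve that tracks odd numbers only.
 * Array index i stands for the odd number 2*i + 1, which halves the
 * memory needed. Returns -1 if the allocation fails.
 */
static int
sketch_count_primes(int limit)
{
    int     nodd;
    int     count;
    bool   *composite;

    if (limit < 2)
        return 0;

    nodd = (limit + 1) / 2;     /* number of odd values 1, 3, ... <= limit */
    composite = calloc((size_t) nodd, sizeof(bool));
    if (composite == NULL)
        return -1;

    count = 1;                  /* 2 is the only even prime */
    for (int i = 1; i < nodd; i++)
    {
        long    p = 2L * i + 1;

        if (composite[i])
            continue;
        count++;

        /* Mark odd multiples of p, starting at p*p and stepping by 2*p. */
        for (long m = p * p; m <= limit; m += 2 * p)
            composite[m / 2] = true;
    }

    free(composite);
    return count;
}
```

The index mapping is the part that makes such code harder to read at first: for an odd number m, integer division m / 2 gives its slot in the array.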

Re: WIP: BRIN multi-range indexes

2020-11-09 Thread John Naylor
> but the whole index row is too large). But it's probably better to do at > least something, and maybe improve that later with some whole-row check. A whole-row check would be nice, but I don't know how hard that would be. As a Devil's advocate proposal, how awful would it be to not allow multicolumn brin-bloom indexes? -- John Naylor EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company

Re: Clean up optional rules in grammar

2020-11-11 Thread John Naylor
e work down the line. > > The attached patch cleans this up to make them all look like the first > style. > +1 for standardizing in this area. It's worth noting that Bison 3.0 introduced %empty for this situation, which is self-documenting. Until we get there, this is a good step

Re: cutting down the TODO list thread

2020-11-11 Thread John Naylor
On Tue, Nov 10, 2020 at 7:08 PM Bruce Momjian wrote: > On Tue, Nov 3, 2020 at 02:06:13PM -0400, John Naylor wrote: > > I was thinking of not having the next updates during commitfest, but it > could > > also be argued this is a type of review, and the things here will be

Re: cutting down the TODO list thread

2020-11-11 Thread John Naylor
ay Unclear path forward - better handling of XPath data types - Improve handling of PIs and DTDs in xmlconcat() Zero interest - Restructure XML and /contrib/xml2 functionality As discussed in the thread, it's an unrealistically large project -- John Naylor EnterpriseDB: http://www.en

Re: cutting down the TODO list thread

2020-11-16 Thread John Naylor
On Wed, Nov 11, 2020 at 4:45 PM John Naylor wrote: > Here is the next section on data types, proposed to be moved to the "not > worth doing" page. As before, if there are any objections, do speak up. > I'll make the move in a few days. > Hearing no objection,

Re: cutting down the TODO list thread

2020-11-18 Thread John Naylor
discussion thread. - More sensible support for Unicode combining characters, normal forms We have normalization as of PG13, so I propose to mark this Done rather than move it. -- John Naylor EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company

Re: cutting down the TODO list thread

2020-11-19 Thread John Naylor
On Wed, Nov 18, 2020 at 2:42 PM Bruce Momjian wrote: > On Wed, Nov 18, 2020 at 02:26:46PM -0400, John Naylor wrote: > > Here are the next couple of sections with items proposed to be moved to > the > > "not worth doing" page. As before, if there are any objections,

Re: cutting down the TODO list thread

2020-11-19 Thread John Naylor
On Wed, Nov 18, 2020 at 3:05 PM Tom Lane wrote: > John Naylor writes: > > Here are the next couple of sections with items proposed to be moved to > the > > "not worth doing" page. As before, if there are any objections, let me > > know. I'll make the m

Re: Should we document IS [NOT] OF?

2020-11-20 Thread John Naylor
his at any point since then. > > Pushed. > Documenting or improving IS OF was a TODO, so I've removed that entry. -- John Naylor EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company

Re: cutting down the TODO list thread

2020-11-23 Thread John Naylor
With the exception of "Fix /contrib/btree_gist's implementation of inet indexing", all items above have been either moved over, or removed if they were done already by PG13. -- John Naylor EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company

Re: truncating timestamps on arbitrary intervals

2020-11-23 Thread John Naylor
On Thu, Nov 12, 2020 at 9:56 AM Peter Eisentraut < peter.eisentr...@enterprisedb.com> wrote: > > On 2020-06-30 06:34, John Naylor wrote: > > In v9, I've simplified the patch somewhat to make it easier for future > > work to build on. > > > > - When truncati

Re: WIP: BRIN multi-range indexes

2021-01-12 Thread John Naylor
magine something algorithmic is involved. Is it worth digging further to see if some code path is taking more time than we would expect? -- John Naylor EDB: http://www.enterprisedb.com

outdated references to replication timeout

2021-01-12 Thread John Naylor
Hi, The parameter replication_timeout was retired in commit 6f60fdd701 back in 2012, but some comments and error messages seem to refer to that old setting instead of wal_sender_timeout or wal_receiver_timeout. The attached patch replaces the old language with more specific references. -- John

Re: outdated references to replication timeout

2021-01-13 Thread John Naylor
be consistent within a single message. Maybe the parameter should be spelled exactly as is, with underscores? I'll take a broader look and send an updated patch. -- John Naylor EDB: http://www.enterprisedb.com

Re: outdated references to replication timeout

2021-01-14 Thread John Naylor
On Thu, Jan 14, 2021 at 1:55 AM Michael Paquier wrote: > > On Wed, Jan 13, 2021 at 11:28:55PM +0900, Fujii Masao wrote: > > On Wed, Jan 13, 2021 at 10:51 PM John Naylor < john.nay...@enterprisedb.com> wrote: > >> It is strange, now that I think about it. My thinki

Re: truncating timestamps on arbitrary intervals

2021-01-18 Thread John Naylor
On Mon, Nov 23, 2020 at 1:44 PM John Naylor wrote: > > On Thu, Nov 12, 2020 at 9:56 AM Peter Eisentraut < peter.eisentr...@enterprisedb.com> wrote: > > - After reading the discussion a few times, I'm not so sure anymore > > whether making this a cousin of date_trun

Re: WIP: BRIN multi-range indexes

2021-01-19 Thread John Naylor
mp_eq There's no one place that's pathological enough to explain the 4x slowness over traditional BRIN and nearly 3x slowness over btree when using a large number of unique values per range, so making progress here would have to involve a more holistic approach. -- John Naylor EDB: http://www.enterprisedb.com

Re: WIP: BRIN multi-range indexes

2021-01-22 Thread John Naylor
rom generate_series( '2020-01-01 0:00'::timestamptz, '2020-02-01 23:59'::timestamptz, '1 second'::interval) x; -- mono-10-asc truncate table iot; insert into iot (num, create_dt) select random(), '2020-01-01 0:00'::timestamptz + (x % 10 || ' seconds')::interval from generate_series(1,5*365*24*60*60) x; -- John Naylor EDB: http://www.enterprisedb.com

Re: WIP: BRIN multi-range indexes

2021-01-26 Thread John Naylor
for page header and tuple header at least. As the comment before says, the filter will eventually not be compressible. I remember we can't be exact here, with the possibility of multiple columns, but we can leave a little slack space. -- John Naylor EDB: http://www.enterprisedb.com

Re: WIP: BRIN multi-range indexes

2021-01-26 Thread John Naylor
On Fri, Jan 22, 2021 at 10:59 PM Tomas Vondra wrote: > > > On 1/23/21 12:27 AM, John Naylor wrote: > > Still, it would be great if multi-minmax can be a drop in replacement. I > > know there was a sticking point of a distance function not being > > available on all

Re: Perform COPY FROM encoding conversions in larger chunks

2021-01-27 Thread John Naylor
') loop +for byte3 in hex('a1')..hex('fe') loop + return next b(byte1, byte2, byte3); +end loop; + end loop; Not sure if it matters, but thought I'd mention it anyway. -- John Naylor EDB: http://www.enterprisedb.com drive_conversion.c Description: Binary data

Re: Perform COPY FROM encoding conversions in larger chunks

2021-01-30 Thread John Naylor
unconsumed bytes available in raw_buf */ #define RAW_BUF_BYTES(cstate) ((cstate)->raw_buf_len - (cstate)->raw_buf_index) It might make sense to create a CONVERSION_BUF_BYTES equivalent since the patch calculates cstate->conversion_buf_len - cstate->conversion_buf_index in a couple plac
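The suggested macro could look like the sketch below. RAW_BUF_BYTES and the four field names come from the message above; the struct SketchCopyState is a hypothetical stand-in for the real copy state, and CONVERSION_BUF_BYTES is only the proposed name, not something confirmed to exist in the tree.

```c
/* Hypothetical stand-in holding only the fields the macros touch. */
typedef struct SketchCopyState
{
    int     raw_buf_index;          /* next byte to process in raw_buf */
    int     raw_buf_len;            /* total bytes stored in raw_buf */
    int     conversion_buf_index;   /* next byte to process in conversion_buf */
    int     conversion_buf_len;     /* total bytes stored in conversion_buf */
} SketchCopyState;

/* Existing macro, as quoted in the message. */
#define RAW_BUF_BYTES(cstate) ((cstate)->raw_buf_len - (cstate)->raw_buf_index)

/* Proposed equivalent for the conversion buffer (name is the suggestion only). */
#define CONVERSION_BUF_BYTES(cstate) \
    ((cstate)->conversion_buf_len - (cstate)->conversion_buf_index)
```

Giving the subtraction a name keeps the call sites symmetric with the raw buffer handling and avoids repeating the expression.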

Re: WIP: BRIN multi-range indexes

2021-01-30 Thread John Naylor
On Tue, Jan 26, 2021 at 6:59 PM Tomas Vondra wrote: > > > > On 1/26/21 7:52 PM, John Naylor wrote: > > On Fri, Jan 22, 2021 at 10:59 PM Tomas Vondra > > mailto:tomas.von...@enterprisedb.com>> > > wrote: > > > Hmm. I think Alvaro also

[POC] verifying UTF-8 using SIMD instructions

2021-02-01 Thread John Naylor
gresql.org/message-id/06d45421-61b8-86dd-e765-f1ce527a5...@iki.fi -- John Naylor EDB: http://www.enterprisedb.com diff --git a/src/common/wchar.c b/src/common/wchar.c index 6e7d731e02..12b3a5e1a2 100644 --- a/src/common/wchar.c +++ b/src/common/wchar.c @@ -13,6 +13,10 @@ #include "c.h"

Re: Perform COPY FROM encoding conversions in larger chunks

2021-02-02 Thread John Naylor
ffer */ Lastly, it looks like pg_do_encoding_conversion_buf() ended up in 0003 accidentally? -- John Naylor EDB: http://www.enterprisedb.com

Re: Bug in COPY FROM backslash escaping multi-byte chars

2021-02-03 Thread John Naylor
n multibyte delimiters in the wild, so it's not as outlandish as it seems. The fix is simple enough, so +1. -- John Naylor EDB: http://www.enterprisedb.com

get rid of tags in the docs?

2021-02-04 Thread John Naylor
d but weren't. The other case is 'voilà', found in rules.sgml. The case for italics here is stronger, but looking at that file, I actually think a more generic-sounding phrase here would be preferable. Other opinions? -- John Naylor EDB: http://www.enterprisedb.com

Re: [POC] verifying UTF-8 using SIMD instructions

2021-02-04 Thread John Naylor
On Mon, Feb 1, 2021 at 2:01 PM Heikki Linnakangas wrote: > > On 01/02/2021 19:32, John Naylor wrote: > > It makes sense to start with the ascii subset of UTF-8 for a couple > > reasons. First, ascii is very widespread in database content, > > particularly in bulk loa

Re: [POC] verifying UTF-8 using SIMD instructions

2021-02-07 Thread John Naylor
noticeably slower on pure ascii, but still several times faster than before, so the conclusions haven't changed any. I'll run full measurements later this week, but I'll share the patch now for review. [1] https://www.postgresql.org/message-id/11d39e63-b80a-5f8d-8043-fff04201f...@i

Re: [POC] verifying UTF-8 using SIMD instructions

2021-02-08 Thread John Naylor
bytes > aligned. Use memcpy to fetch the next 8-byte chunk to fix. Will do. [1] https://github.com/lemire/fastvalidate-utf-8/tree/master/include [2] https://lemire.me/blog/2018/10/19/validating-utf-8-bytes-using-only-0-45-cycles-per-byte-avx-edition/ -- John Naylor EDB: http://www.enterprisedb.com
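The memcpy suggestion quoted above can be sketched as follows. This is an illustration of the general technique, not the code from the patch; sketch_is_ascii is a hypothetical name. Copying each 8-byte chunk into a local uint64_t sidesteps unaligned loads, and compilers commonly turn a fixed-size memcpy like this into a single load where the platform allows it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Return true if all len bytes at s are 7-bit ASCII.
 * Works on 8-byte chunks fetched with memcpy, so s need not be aligned.
 */
static bool
sketch_is_ascii(const unsigned char *s, size_t len)
{
    while (len >= sizeof(uint64_t))
    {
        uint64_t    chunk;

        memcpy(&chunk, s, sizeof(chunk));

        /* Any byte with its high bit set is not ASCII. */
        if (chunk & UINT64_C(0x8080808080808080))
            return false;

        s += sizeof(chunk);
        len -= sizeof(chunk);
    }

    /* Check the remaining tail one byte at a time. */
    while (len > 0)
    {
        if (*s & 0x80)
            return false;
        s++;
        len--;
    }

    return true;
}
```

Because the mask test only looks at the high bit of each byte, the result does not depend on the byte order of the machine.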

Re: Perform COPY FROM encoding conversions in larger chunks

2021-02-09 Thread John Naylor
On Sun, Feb 7, 2021 at 2:13 PM Heikki Linnakangas wrote: > > On 02/02/2021 23:42, John Naylor wrote: > > > > In copyfromparse.c, this is now out of date: > > > > * Read the next input line and stash it in line_buf, with conversion to > > * server encoding.

Re: WIP: BRIN multi-range indexes

2021-02-09 Thread John Naylor
> shouldn't do that either. For existing minmax indexes that's useless > (the opclass seems to be working, otherwise the index would be dropped). > But even for new indexes I'm not sure it's the right thing, so I don't > plan to change this. Okay. -- John Naylor EDB: http://www.enterprisedb.com

Re: [POC] verifying UTF-8 using SIMD instructions

2021-02-09 Thread John Naylor
's a smarter way to check for zeros in C. Or maybe be more careful about cache -- running memchr() on the whole input first might not be the best thing to do. -- John Naylor EDB: http://www.enterprisedb.com

Re: [POC] verifying UTF-8 using SIMD instructions

2021-02-09 Thread John Naylor
re updated 8-12 years ago, but that would still be something to check, in addition to more configure checks. [1] https://github.com/lemire/fastvalidate-utf-8/tree/master/include -- John Naylor EDB: http://www.enterprisedb.com utf-sse42-demo.patch Description: Binary data

Re: [POC] verifying UTF-8 using SIMD instructions

2021-02-09 Thread John Naylor
On Tue, Feb 9, 2021 at 4:22 PM Heikki Linnakangas wrote: > > On 09/02/2021 22:08, John Naylor wrote: > > Maybe there's a smarter way to check for zeros in C. Or maybe be more > > careful about cache -- running memchr() on the whole input first might > > not be the be

Re: cutting down the TODO list thread

2021-12-08 Thread John Naylor
ong and doesn't seem terribly helpful to someone trying to get up to speed on the issues that are still relevant. I don't see any more recent discussion, either. Thoughts? -- John Naylor EDB: http://www.enterprisedb.com

Re: speed up verifying UTF-8

2021-12-08 Thread John Naylor
is to export some symbols and add the counting function. That wouldn't materially affect the current patch for input verification, and would be separate, but it would be nice to get the symbol visibility right up front. I've set this to waiting on author while I experiment with tha

do only critical work during single-user vacuum?

2021-12-09 Thread John Naylor
would have a chance to get a handle on things. Thoughts? -- John Naylor EDB: http://www.enterprisedb.com

Re: do only critical work during single-user vacuum?

2021-12-09 Thread John Naylor
nts. [Peter again] > single-user mode should prompt the user about > what exact VACUUM command they ought to run to get things going. The current message is particularly bad in its vagueness because some users immediately reach for VACUUM FULL, which quite logically seems like the most complete thing to do. -- John Naylor EDB: http://www.enterprisedb.com

Re: speed up verifying UTF-8

2021-12-13 Thread John Naylor
more: - if (!IS_HIGHBIT_SET(*s) || - IS_UTF8_2B_LEAD(*s) || - IS_UTF8_3B_LEAD(*s) || - IS_UTF8_4B_LEAD(*s)) + if (!IS_HIGHBIT_SET(*s) || pg_utf_mblen(s) > 1) And I moved is_val
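To see why the shorter condition in the diff above can replace the four-way test, the sketch below compares the two forms over every possible byte value. All names here are hypothetical stand-ins: sketch_utf8_mblen assumes that the length function returns 1 for continuation and invalid bytes, which is my reading of pg_utf_mblen() rather than something stated in the excerpt.

```c
#include <stdbool.h>

/* Stand-in for pg_utf_mblen(): sequence length implied by the first byte. */
static int
sketch_utf8_mblen(unsigned char b)
{
    if ((b & 0x80) == 0)
        return 1;               /* ASCII */
    if ((b & 0xe0) == 0xc0)
        return 2;               /* 2-byte lead */
    if ((b & 0xf0) == 0xe0)
        return 3;               /* 3-byte lead */
    if ((b & 0xf8) == 0xf0)
        return 4;               /* 4-byte lead */
    return 1;                   /* continuation or invalid byte (assumed) */
}

/* Long form: ASCII, or one of the three lead-byte patterns. */
static bool
sketch_is_char_start_long(unsigned char b)
{
    return (b & 0x80) == 0 ||
        (b & 0xe0) == 0xc0 ||
        (b & 0xf0) == 0xe0 ||
        (b & 0xf8) == 0xf0;
}

/* Short form, mirroring the replacement line in the diff. */
static bool
sketch_is_char_start_short(unsigned char b)
{
    return (b & 0x80) == 0 || sketch_utf8_mblen(b) > 1;
}

/* Return true if both forms agree for all 256 byte values. */
static bool
sketch_forms_agree(void)
{
    for (int b = 0; b <= 0xff; b++)
    {
        if (sketch_is_char_start_long((unsigned char) b) !=
            sketch_is_char_start_short((unsigned char) b))
            return false;
    }
    return true;
}
```

The equivalence holds only under the stated assumption about the return value for continuation bytes; if the length function reported anything above 1 for them, the short form would accept bytes the long form rejects.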

speed up text_position() for utf-8

2021-12-13 Thread John Naylor
to/branchless-utf8/blob/master/utf8.h -- John Naylor EDB: http://www.enterprisedb.com src/backend/utils/adt/varlena.c | 112 src/common/wchar.c | 90 ++-- src/include/mb/pg_wchar.h | 53 -

Re: cutting down the TODO list thread

2021-12-14 Thread John Naylor
On Wed, Dec 8, 2021 at 1:40 PM John Naylor wrote: > > It's been a while, but here are a few more suggested > removals/edits/additions to the TODO list. Any objections or new > information, let me know: > > - Auto-fill the free space map by scanning the buffer cache or by

Re: speed up text_position() for utf-8

2021-12-15 Thread John Naylor
0 | 1080 | 920 > inline pg_utf_mblen() + ascii fast path | 382 | 470 | 918 I failed to mention that the above numbers are milliseconds, so smaller is better. -- John Naylor EDB: http://www.enterprisedb.com

Re: speed up verifying UTF-8

2021-12-17 Thread John Naylor
I plan to push v25 early next week, unless there are further comments. -- John Naylor EDB: http://www.enterprisedb.com
