Re: [BUGS] BUG #5748: Invalid oidvector data during binary recv
On 2010-11-11 17:02, Heikki Linnakangas wrote: On 11.11.2010 17:48, Tom Lane wrote: "Yeb Havinga" writes: postgres=# create table a as select ''::oidvector; SELECT 1 postgres=# copy a to '/tmp/test' with binary; COPY 1 postgres=# copy a from '/tmp/test' with binary; ERROR: invalid oidvector data The problem seems to be that array_recv passes back a zero-dimensional array, *not* a 1-D array, when it observes that the input has no elements. A zero-D array is not part of the subset of possible arrays that we allow for oidvector. Yeah, I just reached that conclusion too.. The patch below changes array_recv, so that it returns an empty 1-D array when an empty 1-D array was written binary. No changes in oidvectorrecv or int2vectorrecv are needed. diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c index 4ceb256..98e725a 100644 --- a/src/backend/utils/adt/arrayfuncs.c +++ b/src/backend/utils/adt/arrayfuncs.c @@ -1288,7 +1288,7 @@ array_recv(PG_FUNCTION_ARGS) my_extra->element_type = element_type; } - if (nitems == 0) + if (ndim == 0) { /* Return empty array ... but not till we've validated element_type */ PG_RETURN_ARRAYTYPE_P(construct_empty_array(element_type)); Make check passes. regards, Yeb Havinga -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #5748: Invalid oidvector data during binary recv
On 15.11.2010 12:08, Yeb Havinga wrote: On 2010-11-11 17:02, Heikki Linnakangas wrote: On 11.11.2010 17:48, Tom Lane wrote: "Yeb Havinga" writes: postgres=# create table a as select ''::oidvector; SELECT 1 postgres=# copy a to '/tmp/test' with binary; COPY 1 postgres=# copy a from '/tmp/test' with binary; ERROR: invalid oidvector data The problem seems to be that array_recv passes back a zero-dimensional array, *not* a 1-D array, when it observes that the input has no elements. A zero-D array is not part of the subset of possible arrays that we allow for oidvector. Yeah, I just reached that conclusion too.. The patch below changes array_recv, so that it returns an empty 1-D array when an empty 1-D array was written binary. No changes in oidvectorrecv or int2vectorrecv are needed. That seems like a bad idea. array_in() represents an empty array with zero dimensions, we're not going to change the generic array_recv() function used for all arrays to behave differently because of some corner-case in oidvectorrecv. If we want do anything about this, the right fix is to add a special case in oidvectorrecv/int2vectorrecv to handle empty array. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #5748: Invalid oidvector data during binary recv
On 2010-11-15 11:40, Heikki Linnakangas wrote: On 15.11.2010 12:08, Yeb Havinga wrote: On 2010-11-11 17:02, Heikki Linnakangas wrote: On 11.11.2010 17:48, Tom Lane wrote: "Yeb Havinga" writes: postgres=# create table a as select ''::oidvector; SELECT 1 postgres=# copy a to '/tmp/test' with binary; COPY 1 postgres=# copy a from '/tmp/test' with binary; ERROR: invalid oidvector data The problem seems to be that array_recv passes back a zero-dimensional array, *not* a 1-D array, when it observes that the input has no elements. A zero-D array is not part of the subset of possible arrays that we allow for oidvector. Yeah, I just reached that conclusion too.. The patch below changes array_recv, so that it returns an empty 1-D array when an empty 1-D array was written binary. No changes in oidvectorrecv or int2vectorrecv are needed. That seems like a bad idea. array_in() represents an empty array with zero dimensions, we're not going to change the generic array_recv() function used for all arrays to behave differently because of some corner-case in oidvectorrecv. It's possible to trigger it with any element type, as long as the use manages to construct a 1-D array. E.g. with contrib/_int.sql loaded, array substraction is possible. Now: postgres=# select ARRAY[1]::int4[] - ARRAY[1]::int4[]; ?column? -- {} (1 row) postgres=# select array_ndims(ARRAY[1]::int4[] - ARRAY[1]::int4[]); array_ndims - 1 (1 row) Binary send and recv of the value at hand would change it into a 0-D vector. If we want do anything about this, the right fix is to add a special case in oidvectorrecv/int2vectorrecv to handle empty array. Judging from the fact that it also happens with int4 arrays, this seems like the wrong place. regards, Yeb Havinga -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #5748: Invalid oidvector data during binary recv
Yeb Havinga writes: > Binary send and recv of the value at hand would change it into a 0-D vector. The reason for that is that in general we don't make a distinction between zero-size arrays of different dimensions. oidvector and int2vector are different though. Which is why this should be fixed locally to those two types, rather than changing the behavior of regular arrays. regards, tom lane -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #5748: Invalid oidvector data during binary recv
I wrote: > Yeb Havinga writes: >> Binary send and recv of the value at hand would change it into a 0-D vector. > The reason for that is that in general we don't make a distinction > between zero-size arrays of different dimensions. Actually, after consuming a bit more caffeine, I see what Yeb is on about. Even though the system in general doesn't make much of a distinction between zero-element arrays of different dimensionalities, there *are* functions that can distinguish --- array_ndims() being the most obvious one. Shouldn't we ensure that binary dump and reload of an array value doesn't change the value in any SQL-observable way? If so, I think his patch is correct, even though it's changing more than just the originally-complained-of behavior. While I'm looking at this ... why is it that array_ndims returns NULL and not 0 for a zero-dimensional array? 0-D arrays might have been unsupported at one time, but they're certainly considered valid now. regards, tom lane -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #5748: Invalid oidvector data during binary recv
On Mon, Nov 15, 2010 at 4:51 PM, Tom Lane wrote: > Actually, after consuming a bit more caffeine, I see what Yeb is on about. > Even though the system in general doesn't make much of a distinction > between zero-element arrays of different dimensionalities, there *are* > functions that can distinguish --- array_ndims() being the most obvious > one. Shouldn't we ensure that binary dump and reload of an array value > doesn't change the value in any SQL-observable way? If so, I think his > patch is correct, even though it's changing more than just the > originally-complained-of behavior. We went to a lot of effort to preserve lower bounds for dumped arrays so I would agree. I was actually one of the few people that actually ran into this prior to the fix. We had arrays generated by the intagg functions which were 0-based and after dumping and reloading were 1-based causing our functions which calculated the array sizes to misbehave. > > While I'm looking at this ... why is it that array_ndims returns NULL > and not 0 for a zero-dimensional array? 0-D arrays might have been > unsupported at one time, but they're certainly considered valid now. Is this the same question as split() on enmpty strings? Do we have a problem distinguishing between a 0-dimensional array and a 1-dimensional empty array? -- greg -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #5748: Invalid oidvector data during binary recv
Greg Stark writes: > Is this the same question as split() on enmpty strings? Do we have a > problem distinguishing between a 0-dimensional array and a > 1-dimensional empty array? Internally they're certainly different. I've just been sniffing around the code a bit, and the main problem I see with trying to be more rigorous about this is that array_out just prints '{}' for any zero-element array, regardless of the recorded dimensionality. We could reserve that notation to mean a 0-D array (which is what array_in reads it as). But then we have to figure out what to print for other cases. The only good idea that comes to mind is [1:0]={} (or variants of that depending on what the stored dimensions actually are). array_in currently rejects that: regression=# select '[1:0]={}'::int[]; ERROR: upper bound cannot be less than lower bound but perhaps it wouldn't be too painful to tweak it to allow the case. It appears to be fairly hard to actually get a non-zero-D empty array into the system, so while this is pretty ugly I think it wouldn't affect common usage. Comments? regards, tom lane -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
[BUGS] Bug in concurrency control in temporary GiST indexes
While hacking on the GiST insertion algorithm, I noticed a bug with temporary GiST indexes. The scan algorithm uses LSNs to detect a concurrent page split, but for a temporary index, we just use a constant LSN on the assumption that there can't be anyone else modifying the index concurrently. But that assumption is not true. While a temporary index can't be modified by other backends, it's entirely possible to insert new tuples to the index within the same backend while a scan is in progress. Here's a test case, using btree_gist. If you modify it to use a non-temporary table, it works fine, but with a temporary table the 2nd SELECT misses some rows, when new rows are inserted in the middle of the scan. Operations on temporary tables are not WAL-logged, so we don't have LSNs to use, but we just need a monotonically increasing sequence of numbers for this purpose. Attached patch provides a function to generate such fake LSNs. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Initialize a temporary table with 1000 rows and a gist index. CREATE TEMPORARY TABLE tt (a integer); CREATE INDEX i_tt ON tt USING gist (a); INSERT INTO tt SELECT generate_series(1,1000); -- Fetch n tuples from "tt", and write them to table called "resultset" CREATE OR REPLACE FUNCTION fetchfunc(n int) RETURNS bool AS $$ declare r record; c refcursor; begin c:= 'foocur'; FOR i IN 1..n LOOP FETCH FROM c INTO r; IF r IS NULL THEN RETURN true; END IF; INSERT INTO resultset VALUES (r.a); END LOOP; RETURN false; end; $$ LANGUAGE plpgsql; -- ensure that we use a plain index scan. set enable_bitmapscan=off; set enable_seqscan=off; CREATE TEMPORARY TABLE resultset (a integer); -- Fetch all rows from tt into resultset, with no inserts in the middle BEGIN; DECLARE foocur CURSOR FOR SELECT a FROM tt WHERE a > 100; SELECT fetchfunc(100); SELECT fetchfunc(1000); COMMIT; SELECT COUNT(*) FROM resultset; -- Display result truncate resultset; -- Do the same, but insert new tuples in the middle of the scan. The result -- should be the same. BEGIN; DECLARE foocur CURSOR FOR SELECT a FROM tt WHERE a > 100; SELECT fetchfunc(100); INSERT INTO tt SELECT a% 10 FROM generate_series(1,1000) a; SELECT fetchfunc(1000); COMMIT; SELECT COUNT(*) FROM resultset; -- Display result diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c index 3054f98..8c2dbc9 100644 --- a/src/backend/access/gist/gist.c +++ b/src/backend/access/gist/gist.c @@ -22,8 +22,6 @@ #include "storage/indexfsm.h" #include "utils/memutils.h" -const XLogRecPtr XLogRecPtrForTemp = {1, 1}; - /* Working state for gistbuild and its callback */ typedef struct { @@ -132,7 +130,7 @@ gistbuild(PG_FUNCTION_ARGS) PageSetTLI(page, ThisTimeLineID); } else - PageSetLSN(page, XLogRecPtrForTemp); + PageSetLSN(page, GetXLogRecPtrForTemp()); UnlockReleaseBuffer(buffer); @@ -423,7 +421,7 @@ gistplacetopage(GISTInsertState *state, GISTSTATE *giststate) { for (ptr = dist; ptr; ptr = ptr->next) { -PageSetLSN(ptr->page, XLogRecPtrForTemp); +PageSetLSN(ptr->page, GetXLogRecPtrForTemp()); } } @@ -491,7 +489,7 @@ gistplacetopage(GISTInsertState *state, GISTSTATE *giststate) PageSetTLI(state->stack->page, ThisTimeLineID); } else - PageSetLSN(state->stack->page, XLogRecPtrForTemp); + PageSetLSN(state->stack->page, GetXLogRecPtrForTemp()); if (state->stack->blkno == GIST_ROOT_BLKNO) state->needInsertComplete = false; @@ -1027,7 +1025,7 @@ gistnewroot(Relation r, Buffer buffer, IndexTuple *itup, int len, ItemPointer ke PageSetTLI(page, ThisTimeLineID); } else - PageSetLSN(page, XLogRecPtrForTemp); + PageSetLSN(page, GetXLogRecPtrForTemp()); END_CRIT_SECTION(); } diff --git a/src/backend/access/gist/gistutil.c b/src/backend/access/gist/gistutil.c index 28ab575..9de08f9 100644 --- a/src/backend/access/gist/gistutil.c +++ b/src/backend/access/gist/gistutil.c @@ -677,3 +677,22 @@ gistoptions(PG_FUNCTION_ARGS) PG_RETURN_BYTEA_P(result); PG_RETURN_NULL(); } + +/* + * Temporary GiST indexes are not WAL-logged, but we need LSNs to detect + * concurrent page splits anyway. GetXLogRecPtrForTemp() provides a fake + * monotonically increasing sequence of LSNs for that purpose. + */ +XLogRecPtr +GetXLogRecPtrForTemp(void) +{ + static XLogRecPtr counter = {0, 1}; + + counter.xrecoff++; + if (counter.xrecoff == 0) + { + counter.xlogid++; + counter.xrecoff++; + } + return counter; +} diff --git a/src/backend/access/gist/gistvacuum.c b/src/backend/access/gist/gistvacuum.c index 0ff5ba8..dbe9406 100644 --- a/src/backend/access/gist/gistvacuum.c +++ b/src/backend/access/gist/gistvacuum.c @@ -268,7 +268,7 @@ gistbulkdelete(PG_FUNCTION_ARGS) pfree(rdata); } else - PageSetLSN(page, XLogRecPtrForTemp); + PageSetLSN(page, GetXLogRecPtrForTemp()); END_CRIT_SECTION(); } diff --git a/src/include/access/gist_private.h b/src/include/acce
[BUGS] BUG #5754: CTE optimization fails to account for side effects
The following bug has been logged online: Bug reference: 5754 Logged by: David Fetter Email address: da...@fetter.org PostgreSQL version: 8.4+ Operating system: All Description:CTE optimization fails to account for side effects Details: Here's how to reproduce: BEGIN; CREATE SEQUENCE my_seq; WITH t AS (SELECT nextval('my_seq')) VALUES(1); SELECT currval('my_seq'); ERROR: currval of sequence "my_seq" is not yet defined in this session What's happened is that the optimization didn't account for the idea that a SELECT might have a side effect, and if we're going with the "CTEs execute exactly once and (equivalent to) fully," this is a bug. CTEs should not optimize away (parts of) SELECTs that have side effects. -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #5754: CTE optimization fails to account for side effects
"David Fetter" writes: > CREATE SEQUENCE my_seq; > WITH t AS (SELECT nextval('my_seq')) VALUES(1); > SELECT currval('my_seq'); > ERROR: currval of sequence "my_seq" is not yet defined in this session > What's happened is that the optimization didn't account for the idea that a > SELECT might have a side effect, and if we're going with the "CTEs execute > exactly once and (equivalent to) fully," this is a bug. The reason it's not a bug is that we have not adopted that position. There is a proposal to make it so for wCTEs, but that doesn't mean we should change the existing, documented and useful behavior of regular CTEs. (If you're wondering where it's documented, I cite section 7.8's statement that only as much of a CTE query is evaluated as is read by the parent query. The limiting case of that is no reference -> no rows read.) regards, tom lane -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs