Hi,
I'm having trouble with range types and btree_gist - after some playing
I believe it's
caused by a bug in how btree_gist handles text columns. All this is on
freshly compiled
9.2.2.
I'm trying to achieve almost exactly what's described in the second
example on
http://www.postgresql.org/docs/9.2/interactive/rangetypes.html#RANGETYPES-CONSTRAINT
i.e. I maintaining a list of ranges for each ID, except that I'm using
text instead of
integers for an ID. so the table looks like this:
=========================================================================================
CREATE TABLE test (
id TEXT,
validity TSRANGE NOT NULL DEFAULT tsrange('-infinity'::timestamp,
'infinity'::timestamp),
CONSTRAINT test_validity_check EXCLUDE USING GIST (id WITH =,
validity WITH &&)
);
=========================================================================================
This table is repeatedly filled with new versions of the data (which
were removed from
the demo for sake of simplicity), so I've defined a trigger that checks
if there's a
range with overlapping range, and split the range accordingly.
Each record starts with validity=[-infinity, infinity). On the first
update this would
be split into [-infinity, now()) and [now(), infinity) and so on. This
is what the following
trigger should do:
=========================================================================================
CREATE OR REPLACE FUNCTION test_close() RETURNS trigger AS $$
BEGIN
-- close the previous record (set upper bound of the range)
UPDATE test SET validity = tsrange(lower(validity),
now()::timestamp)
WHERE id = NEW.id AND (upper(validity) = 'infinity'::timestamp);
-- if there was a preceding record, set the lower bound (otherwise
use unbounded range)
IF FOUND THEN
NEW.validity := tsrange(now()::timestamp,
'infinity'::timestamp);
END IF;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER test_close BEFORE INSERT ON test FOR EACH ROW EXECUTE
PROCEDURE test_close();
=========================================================================================
To generate the sample data, do this:
=========================================================================================
echo "SimpleTestString" > /tmp/data.csv
for f in `seq 1 20000`; do
echo $f > /tmp/x.log;
md5sum /tmp/x.log | awk '{print $1}' >> /tmp/data.csv;
done;
=========================================================================================
The first line (combination of upper and lower-case letters) is what
seems to trigger the
behavior. Now load the file into the table repeatedly, and you'll
eventually get this error
=========================================================================================
db=# copy test(id) from '/tmp/data.csv';
COPY 10001
db=# copy test(id) from '/tmp/data.csv';
COPY 10001
db=# copy test(id) from '/tmp/data.csv';
ERROR: conflicting key value violates exclusion constraint
"test_validity_check"
DETAIL: Key (id, validity)=(SimpleTestString, ["2013-02-01
23:32:04.329975",infinity))
conflicts with existing key (id, validity)=(SimpleTestString,
[-infinity,infinity)).
CONTEXT: COPY test, line 1: "SimpleTestString"
=========================================================================================
The number of necessary COPY executions varies - what's even stranger
is the result of
this select once it fails:
=========================================================================================
test=# select * from test where id = 'SimpleTestString';
id | validity
------------------+----------------------
SimpleTestString | [-infinity,infinity)
SimpleTestString | [-infinity,infinity)
(2 rows)
=========================================================================================
Yup, there are two overlapping ranges for the same ID. Moreover after
disabling bitmap and
index scans, the COPY takes much longer but works just fine (including
the trigger).
Creating a plain b-tree index on the "ID" column seems to fix that too.
That leads me to the belief that this is a bug in the GIST indexing,
and the variations
are probably caused by the index scan kicking in after one of the COPY
executions (and
reaching some threshold). I'm using en_US.UTF-8 for the database.
By replacing the "infinity" with a plain NULL (in the table and
trigger), it fails too,
but in a slightly different way. For example I'm seeing this after the
failure:
=========================================================================================
test=# select * from test where id = 'SimpleTest';
id | validity
------------+---------------------------------
SimpleTest | (,"2013-02-02 00:07:07.038324")
(1 row)
test=# set enable_bitmapscan to off;
SET
test=# set enable_indexscan to off;
SET
test=# select * from test where id = 'SimpleTest';
id | validity
------------+---------------------------------
SimpleTest | (,"2013-02-02 00:07:07.038324")
SimpleTest | ["2013-02-02 00:07:07.038324",)
(2 rows)
=========================================================================================
I've been unable to achieve this using a generated sample, therefore
prepared sample
scripts and CSV files
1) with-infinity.sql + sample-1.csv (this is described in the text
above)
2) with-nulls.sql + sample-2.csv (this is the NULL version)
available for download at http://www.fuzzy.cz/tmp/samples.tgz (~1MB).
kind regards
Tomas
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers