On 4/27/15 6:08 PM, Dave Jones wrote:
>> (Though, I dislike using timestamps to do change/history tracking, but
>> that's just me...)
> I've been playing around with history tracking (in the context of BI,
> typically with batch-loaded reporting databases) for about 7-8 years now
> and have always found timestamps perfect for the purpose. But are you
> perhaps referring to using them for audit purposes? If that's the case
> I'd agree entirely - they are absolutely the wrong tool for such things
> (which is something I need to make a bit more prominent in the docs -
> it's buried in the design section at the moment).

Most warehouses dumb things down to a day level, so it's probably OK there.

What I specifically don't like is that using a timestamp to try to determine the order in which things happened is fraught with gotchas. For starters, now() is fixed at transaction start, but a newer transaction may modify a table before an older one does - now the ordering is *backwards*. You have the same problem with using an XID. The only way I've thought of to make this guaranteed safe is to somehow serialize the logging with something like
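To make the gotcha concrete, here's a sketch of two interleaved sessions (table and column names are hypothetical) where the row written *first* carries the *later* timestamp, so ordering by the timestamp column reverses what actually happened:

```sql
-- Session A                          -- Session B
BEGIN;
-- now() frozen at 10:00:00
                                      BEGIN;
                                      -- now() frozen at 10:00:05
                                      INSERT INTO customer_history (...)
                                        VALUES (..., now());  -- writes first
                                      COMMIT;
INSERT INTO customer_history (...)
  VALUES (..., now());  -- writes second, but its timestamp sorts earlier
COMMIT;
```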

CREATE TABLE customer_history(
  customer_hid serial PRIMARY KEY -- hid == history_id
  , previous_customer_hid int REFERENCES customer_history
  , customer_id int NOT NULL REFERENCES customer
...
);
-- Each history record can have at most one successor...
CREATE UNIQUE INDEX ... ON customer_history(previous_customer_hid)
  WHERE previous_customer_hid IS NOT NULL;
-- ...and each customer can have at most one chain head.
CREATE UNIQUE INDEX ... ON customer_history(customer_id)
  WHERE previous_customer_hid IS NULL;

and then have a before trigger enforce

NEW.previous_customer_hid := customer_history__get_latest(NEW.customer_id)

where customer_history__get_latest() will 'walk the chain' starting with the first link (customer_id = blah AND previous_customer_hid IS NULL).
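A minimal sketch of what that trigger machinery could look like in plpgsql (the function names come from the text above, but the bodies here are one hypothetical implementation, not tested code):

```sql
CREATE OR REPLACE FUNCTION customer_history__get_latest(p_customer_id int)
RETURNS int LANGUAGE plpgsql AS $$
DECLARE
  v_hid  int;
  v_next int;
BEGIN
  -- Start at the head of the chain: the one row per customer
  -- with no predecessor.
  SELECT customer_hid INTO v_hid
    FROM customer_history
   WHERE customer_id = p_customer_id
     AND previous_customer_hid IS NULL;

  IF v_hid IS NULL THEN
    RETURN NULL;  -- no history yet; this insert becomes the head
  END IF;

  -- Walk the chain link by link until there is no successor.
  LOOP
    SELECT customer_hid INTO v_next
      FROM customer_history
     WHERE previous_customer_hid = v_hid;
    EXIT WHEN v_next IS NULL;
    v_hid := v_next;
  END LOOP;

  RETURN v_hid;
END;
$$;

CREATE OR REPLACE FUNCTION customer_history__set_previous()
RETURNS trigger LANGUAGE plpgsql AS $$
BEGIN
  NEW.previous_customer_hid := customer_history__get_latest(NEW.customer_id);
  RETURN NEW;
END;
$$;

CREATE TRIGGER customer_history_set_previous
  BEFORE INSERT ON customer_history
  FOR EACH ROW EXECUTE PROCEDURE customer_history__set_previous();
```

(In practice you could skip the walk entirely and just select the row that has no successor, but the chain walk matches the description above.)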

Because of those indexes, inserts are serialized on a per-customer basis. You could still run into problems with a transaction on a newer snapshot creating a history record before one on an older snapshot does, though. :( If you included txid_current_snapshot() with each record you could probably detect when that happens.

Or did you mean ranges would be better? They certainly looked intriguing
when I started moving this stuff to Postgres, and I'd like to revisit
them in the near future since they offer capabilities I don't have with
timestamps (such as guaranteeing non-overlapping ranges via exclusion
constraints), but my initial tests suggested some rather major
performance degradation, so I put them on the back-burner at first.
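For reference, the kind of exclusion constraint alluded to above might look like this (table and column names are hypothetical; the btree_gist extension is needed to mix = and && in one GiST constraint):

```sql
CREATE EXTENSION IF NOT EXISTS btree_gist;

CREATE TABLE customer_history_ranged (
  customer_id  int NOT NULL,
  valid_during tstzrange NOT NULL,
  -- Reject two history rows for the same customer
  -- whose validity periods overlap.
  EXCLUDE USING gist (customer_id WITH =, valid_during WITH &&)
);
```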

If you're going to keep both a start and end for each record you'd definitely want to do it with a range. If you're only keeping the change time then you can handle it differently.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers