[HACKERS] native bi-temporal support?

2003-06-09 Thread Andrew Piskorski
Do you know of anyone working on bi-temporal support for PostgreSQL,
especially native support in PostgreSQL itself?  Any comments from
anyone who's looked into or been interested in this in the past?

A while back, some folks pointed out[1] an excellent book[2] by
Snodgrass[3], Developing Time-Oriented Database Applications in SQL.
However, the central thing I took from that book is that _no way_ do I
ever want to try to do any serious, practical bi-temporal work in
standard SQL-92.  I'd want either a good translation layer to take
nice temporal DDL and DML and convert it to standard SQL, or
preferably, native temporal support in the RDBMS itself.
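
To give a flavor of why: below is a toy sketch (my own made-up table and
column names, not the book's) of what hand-rolled bi-temporal tracking looks
like, run through Python's stdlib sqlite3 just so it's self-contained.  Every
query has to repeat the same valid-time and transaction-time interval
plumbing by hand, and updates are worse still, since each one also has to
close out the old row's transaction-time interval.

  import sqlite3

  db = sqlite3.connect(":memory:")
  db.executescript("""
      CREATE TABLE price (
          item        TEXT NOT NULL,
          amount      NUMERIC NOT NULL,
          valid_from  TEXT NOT NULL,   -- when the fact holds in the real world
          valid_to    TEXT NOT NULL,
          tx_from     TEXT NOT NULL,   -- when we recorded the fact in the db
          tx_to       TEXT NOT NULL
      );
      INSERT INTO price VALUES
          ('widget', 10, '2003-01-01', '2003-03-01', '2003-01-01', '9999-12-31'),
          ('widget', 12, '2003-03-01', '9999-12-31', '2003-02-15', '9999-12-31');
  """)

  # "What did we believe, as of 2003-02-20, the widget price would be on 2003-03-05?"
  rows = db.execute("""
      SELECT item, amount FROM price
      WHERE item = 'widget'
        AND valid_from <= '2003-03-05' AND '2003-03-05' < valid_to
        AND tx_from    <= '2003-02-20' AND '2003-02-20' < tx_to
  """).fetchall()
  print(rows)   # [('widget', 12)]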

Does any such thing exist?  Is anyone out there working on temporal
support for, say, PostgreSQL?  Anyone know of anything interesting?

The only thing I really found on the net was BtPgsql[4], a Ruby 
bi-temporal emulation layer for PostgreSQL.

There are or were some proposed standards for this temporal stuff,
TSQL2[5] and SQL3/Temporal[6], but their current status sounds rather
confused.  Note that last link says in part, "Due to disagreements
within the ISO committee as to where temporal support in SQL should
go, the project responsible for temporal support was canceled near the
end of 2001.  Hence, the working draft, "Part 7, SQL/Temporal" is in
limbo."!

[1] http://openacs.org/forums/message-view?message_id=31761 
[2] http://www.amazon.com/exec/obidos/ASIN/1558604367/ 
[3] http://www.cs.arizona.edu/people/rts/timecenter/timecenter.html 
[4] http://raa.ruby-lang.org/list.rhtml?name=btpgsql 
[5] http://www.cs.arizona.edu/people/rts/tsql2.html 
[6] http://www.cs.arizona.edu/people/rts/sql3.html 

FYI, I also asked this question in another forum, here:

http://openacs.org/forums/message-view?message_id=105737

-- 
Andrew Piskorski <[EMAIL PROTECTED]>
http://www.piskorski.com



Re: [HACKERS] Testing of MVCC

2005-08-16 Thread Andrew Piskorski
On Mon, Aug 15, 2005 at 06:01:20PM -0400, Tom Lane wrote:

> What we really need is a test program that can issue a command on one
> connection (perhaps waiting for it to finish, perhaps not) and then
> issue other commands on other connections, all according to a script.

Well, using Tcl with its Tcl Threads Extension should make it easy to
control multiple concurrent PostgreSQL connections.  (The Threads
Extension's APIs are particularly nice for multi-threaded
programming.)  Its docs are here:

  http://cvs.sourceforge.net/viewcvs.py/tcl/thread/doc/html/
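
For what it's worth, the same scripted, multi-connection pattern is easy to
mock up in other languages too.  Here is a very rough sketch in Python (not
Tcl; the DSN and table name are made up) that drives two backend connections
from one controlling script via the psycopg2 driver -- a real harness would
of course need to decide when to wait for each command and when not to:

  import threading, queue
  import psycopg2   # assumes psycopg2 and a reachable test database

  def worker(dsn, commands, results):
      conn = psycopg2.connect(dsn)       # one backend per worker thread
      cur = conn.cursor()
      while True:
          sql = commands.get()
          if sql is None:                # shutdown marker
              break
          cur.execute(sql)
          if cur.description:            # a SELECT: report its rows back
              results.put((threading.current_thread().name, cur.fetchall()))
          conn.commit()
      conn.close()

  dsn = "dbname=regression"              # made-up connection string
  cmds = [queue.Queue(), queue.Queue()]
  out = queue.Queue()
  threads = [threading.Thread(target=worker, args=(dsn, q, out), name="conn%d" % i)
             for i, q in enumerate(cmds)]
  for t in threads:
      t.start()

  # The "script": interleave commands across the two connections.
  cmds[0].put("UPDATE t SET x = x + 1 WHERE id = 1")
  cmds[1].put("SELECT x FROM t WHERE id = 1")
  for q in cmds:
      q.put(None)
  for t in threads:
      t.join()
  while not out.empty():
      print(out.get())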

> I am unsure that the existing pg_regress infrastructure is the right
> place to start from.  Perhaps we should look at Expect or something
> similar.

I don't have any clear idea of what sort of tests you want to run
"according to a script" though, so I'm not sure whether the Tcl
Threads Extension, or Expect, or some other tool would best meet your
needs.

-- 
Andrew Piskorski <[EMAIL PROTECTED]>
http://www.piskorski.com/



[HACKERS] SAN, clustering, MPI, Backplane Re: Postgresql on SAN

2004-07-08 Thread Andrew Piskorski
On Wed, Jul 07, 2004 at 12:39:34PM +0200, Yannick Lecaillez wrote:
> Thanks a lot for all people which answer.
> 
> I have this "clustering on SAN" problem today and i think it could be
> less harder to implement this today than it was for Oracle in 1993

You want to do clustering for failover/reliability reasons, for
performance/scalability reasons, or for both?

For some stuff to read, see the dozen or so links I posted here:

  http://openacs.org/forums/message-view?message_id=128060

E.g., the Lustre cluster file system claims full POSIX file system
semantics (locking, etc.), so you should be able to run PostgreSQL on
it.  I have no idea how well that works in practice, but it should at
least let you do failover.

Perhaps you could even somehow, eventually, get multiple PostgreSQL
instances on different machines to all cooperate with read/write
access to the same database files over the network.  (And without
using super-expensive SCI hardware the way Clusgres does.)  That might
get you a true cluster RDBMS, if it worked well.

Another thing I've been wondering about, but haven't been able to find
any discussion of:

Just how closely tied is PostgreSQL to its use of shared memory?  At
least in principle, could all direct use of shared memory be replaced
with MPI?  Could that even work at all?  And in the degenerate case
where the MPI implementation is itself actually using shared memory
underneath (rather than sending messages across the network), would
performance still suck?

In other words, if MPI is unsuitable for PostgreSQL, is it because the
SEMANTICS of MPI are inherently unsuitable, or is it just
implementation or performance issues?
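
Just to make the semantic difference concrete, here is a toy mpi4py sketch
(purely my own illustration, nothing to do with PostgreSQL's actual buffer
manager): what shared memory does as a direct, lock-guarded memory access,
message passing has to do as an explicit request/reply round trip with some
process acting as the owner of the data.

  # Run with e.g.:  mpiexec -n 2 python this_file.py
  from mpi4py import MPI

  comm = MPI.COMM_WORLD
  rank = comm.Get_rank()

  if rank == 0:
      # Rank 0 "owns" the buffer pool (a plain dict here).
      buffers = {42: b"page contents ..."}
      wanted = comm.recv(source=1)            # which page does rank 1 want?
      comm.send(buffers.get(wanted), dest=1)  # ship the page back over the wire
  else:
      comm.send(42, dest=0)                   # ask the owner for page 42
      page = comm.recv(source=0)              # ...and block until it arrives
      print("got", page)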

What about PostgreSQL specifically makes message passing no good, and
is the same also true for ALL RDBMSs?  What about systems like
Backplane, which claims to be "the only open-source, transactional,
truly distributed database."?

  http://www.backplane.com/

-- 
Andrew Piskorski <[EMAIL PROTECTED]>
http://www.piskorski.com/



Re: [HACKERS] Contrib -- PostgreSQL shared variables

2004-08-29 Thread Andrew Piskorski
On Sun, Aug 29, 2004 at 09:24:59AM -0400, [EMAIL PROTECTED] wrote:

> Think about a website shopping cart. The session and the things in the
> cart don't matter after 10 or 15 minutes. Most of the time, it will just
> be abandoned. The only time it is important is at checkout. At which point
> it is transfered to an ACID compliant system.

Much of the rest of what you said makes sense, but this is a TERRIBLE
example.  As a user, I have been burned repeatedly by crappy,
non-persistent shopping carts built by developers who, apparently,
agree with you that, "Oh, it's just a shopping cart, no one cares if
it randomly disappears and the user has to start over from scratch."

Nonsense!  Damn it, if I put 12 different items in my shopping cart and
then go to lunch, I expect my shopping cart to STILL BE THERE when I
get back!  Note that Amazon does this right, so any ecommerce site
doing it WRONG does so at severe peril of losing its customers.

> My session manager on a modern dual PIII can handle 8000 full "get vars,
> set vars" cycles a second.

I don't really see why you need or want to do this in PostgreSQL,
though.  Can't you do it in your web or application server?  (I
certainly can in AOLserver with its nsv API, for example.)

What's the advantage of instead stuffing these in-memory variables
into PostgreSQL?  That it still works correctly and simply even if you
have a farm of 10 separate web servers sitting in front of it?  That
could be useful for certain applications, but is there any more to it
than that?

I've occasionally REALLY wanted a small in-memory ACId (no "D" for
Durability) RDBMS for tracking transient data that doesn't need to go
to disk (and one of these days I will experiment with using SQLite for
that), but that's not what you're talking about, right?  The shared
variables you're talking about here are just simple scalar values, no
tables, no integrity constraints, no MVCC, none of that, right?
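
For what it's worth, here is the sort of thing I mean by "ACId" -- purely
illustrative, using Python's stdlib sqlite3: tables, constraints, and
transactions, all living only in RAM:

  import sqlite3

  db = sqlite3.connect(":memory:")          # nothing ever touches disk
  db.execute("""CREATE TABLE session (
                    id    TEXT PRIMARY KEY,
                    hits  INTEGER NOT NULL CHECK (hits >= 0))""")

  with db:                                  # one atomic transaction
      db.execute("INSERT INTO session VALUES ('abc123', 0)")
      db.execute("UPDATE session SET hits = hits + 1 WHERE id = 'abc123'")

  print(db.execute("SELECT * FROM session").fetchall())   # [('abc123', 1)]
  # Kill the process and it's all gone -- which here is exactly the point.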

-- 
Andrew Piskorski <[EMAIL PROTECTED]>
http://www.piskorski.com/



[HACKERS] OpenORB Re: 2-phase commit

2003-10-11 Thread Andrew Piskorski
> From: "Dann Corbit" <[EMAIL PROTECTED]>
> Subject: Re: 2-phase commit
> Date: Fri, 10 Oct 2003 21:37:53 -0700

> Why not apply the effort to something already done and compatibly
> licensed?

I'm not sure, what is your point here exactly?

To use OpenORB as a transaction monitor, you'd still need XA
protocol support (which means 2-phase commit) in each of the
PostgreSQL databases being overseen by that transaction monitor, no?
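
(To spell out what I mean: in two-phase commit the monitor only coordinates;
each participant database has to implement the prepare step itself.  A
generic sketch in Python of the coordinator's side -- this is not OpenORB's
API, just the shape of the protocol:)

  class Participant:
      """Stand-in for one XA-capable database."""
      def __init__(self, name):
          self.name = name
      def prepare(self):
          # A real RDBMS must durably promise here that a later commit
          # cannot fail -- this is the part each PostgreSQL instance
          # would need to provide.
          print(self.name, "prepared")
          return True
      def commit(self):
          print(self.name, "committed")
      def rollback(self):
          print(self.name, "rolled back")

  def coordinator(participants):
      # Phase 1: ask every participant to prepare.
      if all(p.prepare() for p in participants):
          # Phase 2: all promised, so it is now safe to commit everywhere.
          for p in participants:
              p.commit()
      else:
          for p in participants:
              p.rollback()

  coordinator([Participant("db1"), Participant("db2")])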

Or are you saying that OpenORB includes XA support for both the master
and the slave (I am probably using the wrong terminology here), such
that the code could be linked in or otherwise used as-is for PostgreSQL?

Either way, have you actually used OpenORB with an RDBMS, or heard
feedback from anyone else who personally has?

> This:
> http://dog.intalio.com/ots.html
> 
> Appears to be a Berkeley style licensed:
> http://dog.intalio.com/license.html
> 
> Transaction monitor.
> 
> "Overview
> The OpenORB Transaction Service is a very scalable transaction monitor
> which also provides several extensions like XA management, a management
> interface to control all transaction processes and a high reliable
> recovery system. 

> Here is a sourceforge version of the same thing
> http://openorb.sourceforge.net/

-- 
Andrew Piskorski <[EMAIL PROTECTED]>
http://www.piskorski.com



Re: [HACKERS] Compression and on-disk sorting

2006-05-16 Thread Andrew Piskorski
On Tue, May 16, 2006 at 12:31:07PM -0500, Jim C. Nasby wrote:
> On Tue, May 16, 2006 at 12:27:42PM -0400, Andrew Dunstan wrote:
> > Rod Taylor wrote:
> > >>I habitually turn off all compression on my Windows boxes, because it's 
> > >>a performance hit in my experience. Disk is cheap ...
> > >
> > >Disk storage is cheap. Disk bandwidth or throughput is very expensive.

> > Sure, but in my experience using Windows File System compression is not 
> > a win here. Presumably if it were an unqualified win they would have it 

> Does anyone have time to hack some kind of compression into the on-disk
> sort code just to get some benchmark numbers? Unfortunately, doing so is
> beyond my meager C abilitiy...

Folks, first of all, I'm in no way an expert on data compression in
RDBMSs, but other databases DO include data compression features and
claim it as a SIGNIFICANT win in I/O reduction.

Looking at the performance of Windows File System compression, etc.,
doesn't make much sense when there are actual RDBMS compression
implementations to compare to: on the commercial market, in open
source code, and in the academic literature.

Oracle has included "table compression" since 9iR2.  They report table
size reductions of 2x to 4x as typical, with proportional reductions
in I/O, and supposedly, usually low to negligible overhead for writes:

  http://download-east.oracle.com/docs/cd/B19306_01/server.102/b14211/build_db.htm#sthref289

  Decision Speed: Table Compression In Action by Meikel Poess and Hermann Baer (2003):
  http://www.oracle.com/technology/oramag/webcolumns/2003/techarticles/poess_tablecomp.html

  Compressing Data for Space and Speed by Sanjay Mishra (2004):
  http://www.oracle.com/technology/oramag/oracle/04-mar/o24tech_data.html

  Order For Maximum Compression:
  http://oramossoracle.blogspot.com/2005/11/table-compression-order-for-maximum.html

I don't remember whether the current (open source) MonetDB includes
table compression or not, but they've published papers with lots of
interesting detail on the compression and other high-performance OLAP
features in their latest (not yet released) "X100" MonetDB research
codebase:

  http://monetdb.cwi.nl/
  http://homepages.cwi.nl/~mk/MonetDB/
  http://sourceforge.net/projects/monetdb/
  ftp://ftp.research.microsoft.com/pub/debull/A05june/issue1.htm

Now, the docs and papers above are all focused on query performance;
they say nothing directly about using compression for on-disk sorts.
But I'd bet that similar rules of thumb will apply in both cases.

The main tricks seem to be:  One, EXTREMELY lightweight compression
schemes - basically table lookups designed to be as CPU friendly as
possible.  Two, keep the data compressed in RAM as well, so that you
can also cache more of the data, and indeed keep it compressed until
as late in the CPU processing pipeline as possible.

A corollary of that is to forget compression schemes like gzip - it
reduces data size nicely but is far too slow on the CPU to be
particularly useful in improving overall throughput rates.
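
As a crude, toy illustration of the table-lookup style (my own code, not how
Oracle or MonetDB actually implement it): dictionary-encode a low-cardinality
column into one small code per row, so that decoding is just a list index,
and compare against running the raw values through zlib, which shrinks the
data more but costs far more CPU to decode:

  import zlib

  column = ["OPEN", "CLOSED", "OPEN", "OPEN", "PENDING"] * 20000   # fake data

  # Encode: one small lookup table plus one 1-byte code per row.
  codes, dictionary, encoded = {}, [], bytearray()
  for v in column:
      if v not in codes:
          codes[v] = len(dictionary)
          dictionary.append(v)
      encoded.append(codes[v])

  # Decode: a single list index per row -- about as CPU-friendly as it gets.
  decoded = [dictionary[c] for c in encoded]
  assert decoded == column

  raw = "\n".join(column).encode()
  print("raw bytes:   ", len(raw))
  print("dict-encoded:", len(encoded), "bytes + a tiny dictionary")
  print("zlib:        ", len(zlib.compress(raw)), "bytes, but much slower to decompress")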

Note, I have not really tested ANY of the above myself, your mileage
may well vary from what I recall from those various articles...

-- 
Andrew Piskorski <[EMAIL PROTECTED]>
http://www.piskorski.com/



Re: [HACKERS] Compression and on-disk sorting

2006-05-17 Thread Andrew Piskorski
On Tue, May 16, 2006 at 11:48:21PM -0400, Greg Stark wrote:

> There are some very fast decompression algorithms:
> 
> http://www.oberhumer.com/opensource/lzo/

Sure, and for some tasks in PostgreSQL perhaps it would be useful.
But at least as of July 2005, Sandor Heman, one of the MonetDB guys,
had looked at zlib, bzip2, lzrw, and LZO, and claimed that:

  "... in general, it is very unlikely that we could achieve any
  bandwidth gains with these algorithms. LZRW and LZO might increase
  bandwidth on relatively slow disk systems, with bandwidths up to
  100MB/s, but this would induce high processing overheads, which
  interferes with query execution. On a fast disk system, such as our
  350MB/s 12 disk RAID, all the generic algorithms will fail to achieve
  any speedup."

  http://www.google.com/search?q=MonetDB+LZO+Heman&btnG=Search
  http://homepages.cwi.nl/~heman/downloads/msthesis.pdf

> I think most of the mileage from "lookup tables" would be better implemented
> at a higher level by giving tools to data modellers that let them achieve
> denser data representations. Things like convenient enum data types, 1-bit
> boolean data types, short integer data types, etc.

Things like enums and 1 bit booleans certainly could be useful, but
they cannot take advantage of duplicate values across multiple rows at
all, even if 1000 rows have the exact same value in their "date"
column and are all in the same disk block, right?

Thus I suspect that the exact opposite is true, a good table
compression scheme would render special denser data types largely
redundant and obsolete.
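
For instance (toy code again, not any real engine's on-disk format), if a
block holds 1000 rows with the same date, a block-level scheme can store that
value once with a run length, no matter how wide the per-row datatype is:

  from itertools import groupby

  block = ["2006-05-16"] * 1000 + ["2006-05-17"] * 24     # a fake "date" column

  # Run-length encode the column within one block.
  rle = [(value, sum(1 for _ in run)) for value, run in groupby(block)]
  print(rle)            # [('2006-05-16', 1000), ('2006-05-17', 24)]

  # Decoding back into rows is trivial and cheap.
  decoded = [value for value, count in rle for _ in range(count)]
  assert decoded == block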

Good table compression might be a lot harder to do, of course.
Certainly Oracle's implementation of it had some bugs which made it
difficult to use reliably in practice (in certain circumstances
updates could fail, or if not fail perhaps have pathological
performance), bugs which are supposed to be fixed in 10.2.0.2, which
was only released within the last few months.

-- 
Andrew Piskorski <[EMAIL PROTECTED]>
http://www.piskorski.com/



Re: [HACKERS] ice-broker scan thread

2005-11-29 Thread Andrew Piskorski
On Tue, Nov 29, 2005 at 03:14:38PM +1100, Gavin Sherry wrote:
> On Mon, 28 Nov 2005, David Boreham wrote:
> > Gavin Sherry wrote:
> > > MySQL, Oracle and others implement read-ahead threads to simulate async IO
> >
> > I always believed that Oracle used async file I/O. Not that I've seen their

> The paper I linked to seemed to suggest that they weren't using async IO
> in 9.2 -- which is fairly old.

  http://www.vldb2005.org/program/paper/wed/p1116-hall.pdf
  "Getting Priorities Straight: Improving Linux Support for Database I/O"
  by Hall and Bonnet
  Proceedings of the 31st VLDB Conference, Trondheim, Norway, 2005

I think you've misread that paper.  AFAICT it neither says nor even
suggests that Oracle 9.2 does not use asynchronous I/O on Linux.  In
fact, it seems to strongly suggest exactly the opposite, that Oracle
does use async I/O wherever it can.

Note that they also reference this document, which, as of 2002 and
Linux kernel 2.4.x, was urging Oracle DBAs to use Oracle's kernel-based
asynchronous I/O support whenever possible:

  http://www.ixora.com.au/tips/use_asynchronous_io.htm

What Hall and Bonnet's paper DOES say is that both Oracle and MySQL
InnoDB appear to use a "conservative" I/O submission policy, but
Oracle does so more efficiently.  They also argue that both Oracle and
MySQL fail to utilize the "full potential" of Linux async I/O because
of their conservative submission policies, and that an "aggressive"
I/O submission policy would work better, but only if support for
Prioritized I/O is added to Linux.  They then proceed to add that
support, and make some basic changes to InnoDB to partially take
advantage of it.
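
Roughly speaking (my own toy sketch, not from the paper or from any
database's code), an "aggressive" policy means telling the kernel about every
block you already know you will need, as early as possible, rather than
issuing one blocking read at a time.  Here Python's os.posix_fadvise stands
in for a real async or prioritized I/O interface; Linux/Unix only, and the
file name is made up:

  import os

  BLOCK = 8192
  path = "/tmp/some_big_table_file"        # made-up file

  fd = os.open(path, os.O_RDONLY)
  size = os.fstat(fd).st_size

  # "Submit" everything we know we will need, up front...
  for off in range(0, size, BLOCK):
      os.posix_fadvise(fd, off, BLOCK, os.POSIX_FADV_WILLNEED)

  # ...then do the actual reads, which now mostly hit the page cache.
  for off in range(0, size, BLOCK):
      data = os.pread(fd, BLOCK, off)

  os.close(fd)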

Also interesting is their casual mention that for RDBMS workloads, the
default Linux 2.6 disk scheduler "anticipatory" is inferior to the
"deadline" scheduler.  They base their (simple sounding) Prioritized
I/O support on the deadline scheduler.

-- 
Andrew Piskorski <[EMAIL PROTECTED]>
http://www.piskorski.com/
