[GENERAL] pg_restore load data

2017-11-16 Thread Ron Johnson

Hi,

v9.2.7  (Yes, I know, it's old.  Nothing I can do about it.)

During a "whole database" restore using pg_restore of a custom dump, when is 
the data actually loaded?  I've looked in the list output and don't see any 
"load" statements.


Thanks

--
World Peace Through Nuclear Pacification



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] pg_restore load data

2017-11-16 Thread Ron Johnson

On 11/16/2017 03:13 PM, bricklen wrote:


On Thu, Nov 16, 2017 at 1:07 PM, Ron Johnson <ron.l.john...@cox.net> wrote:


v9.2.7  (Yes, I know, it's old.  Nothing I can do about it.)

During a "whole database" restore using pg_restore of a custom dump,
when is the data actually loaded?  I've looked in the list output and
don't see any "load" statements.


Look for COPY lines; that's how the data is restored.


$ pg_restore -l CDSHA01.dump > CDSHA01.txt
$ grep --color -i copy CDSHA01.txt
$ echo $?
1

There are lots of "restoring data" lines, though.  I should have thought to 
grep for that.
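In a custom-format archive the rows aren't stored as literal COPY statements; they show up in the `pg_restore -l` listing as "TABLE DATA" entries, which pg_restore replays via COPY at restore time. A small Python sketch over a made-up listing (the sample entries are illustrative, not from a real dump):

```python
# Count "TABLE DATA" entries in a `pg_restore -l` listing.
# The sample listing below is invented for illustration.
sample_listing = """\
;
; Archive created at 2017-11-16 14:00:00
;
2; 1262 16384 DATABASE - cdsha01 postgres
196; 1259 16402 TABLE public accounts postgres
2840; 0 16402 TABLE DATA public accounts postgres
2841; 0 16410 TABLE DATA public transactions postgres
2300; 1259 16420 INDEX public accounts_pkey postgres
"""

def count_table_data_entries(listing: str) -> int:
    """Each 'TABLE DATA' line is one table whose rows pg_restore
    will load (internally via COPY)."""
    return sum(1 for line in listing.splitlines() if "TABLE DATA" in line)

print(count_table_data_entries(sample_listing))  # 2
```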


One thing that puzzles me is how fast the tables (even large ones) loaded 
compared to how slow the pg_dump -Fc was.  Granted, I'm running -j4 but 
still, these were some really large, poorly compressible tables (the dump 
file was about as big as du -mc data/base).





Re: [GENERAL] mild modification to pg_dump

2017-11-17 Thread Ron Johnson

On 11/17/2017 02:23 PM, John R Pierce wrote:

On 11/17/2017 12:19 PM, marcelo wrote:

Sorry, I was not exact.
I neither need nor want to change pg_dump. Rather, based on the pg_dump 
code, I need to develop a daemon which can receive a TCP message (from a 
privileged app) containing a few elements: the database to dump, the user 
under which to do it, and that user's password. (My apps use that same 
data, of course, encrypted from the ordinary users.)



I would just fork pg_dump to do the actual dump rather than try and 
incorporate its source code into your app.




Specifically, do you mean to write a simple daemon which forks pg_dump at 
the appropriate time?
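John's suggestion (fork pg_dump rather than embed its source) can be sketched roughly like this in Python; the function names are my own, and only pg_dump's documented -Fc/-U/-f flags and libpq's PGPASSWORD variable are assumed:

```python
import os
import subprocess

def build_pg_dump_cmd(dbname: str, user: str, outfile: str) -> list:
    """Assemble a pg_dump invocation; -Fc produces a custom-format dump."""
    return ["pg_dump", "-Fc", "-U", user, "-f", outfile, dbname]

def run_dump(dbname: str, user: str, password: str, outfile: str) -> int:
    """Fork pg_dump as a child process.  The password is handed over in
    the PGPASSWORD environment variable so no prompt appears (a
    ~/.pgpass file is the usual alternative)."""
    env = dict(os.environ, PGPASSWORD=password)
    return subprocess.call(build_pg_dump_cmd(dbname, user, outfile), env=env)
```

A daemon would call `run_dump()` after parsing the TCP message; the dump itself runs in the child, so the daemon stays free to serve other requests.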





Re: [GENERAL] mild modification to pg_dump

2017-11-17 Thread Ron Johnson


What about the pgpass file?

https://www.postgresql.org/docs/9.2/static/libpq-pgpass.html
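For reference, a ~/.pgpass entry is one hostname:port:database:username:password line, and libpq ignores the file unless group/other access is revoked. A small Python sketch of writing such an entry (the path and values are illustrative):

```python
import os
import stat

def write_pgpass_entry(path, host, port, dbname, user, password):
    """Append a libpq password-file entry.  Format, one per line:
        hostname:port:database:username:password
    libpq skips the file unless its mode forbids group/other access."""
    line = ":".join([host, str(port), dbname, user, password])
    with open(path, "a") as f:
        f.write(line + "\n")
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)  # 0600

# Illustrative call; in practice the path would be ~/.pgpass.
write_pgpass_entry("/tmp/demo_pgpass", "localhost", 5432, "mydb", "marcelo", "s3cret")
```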

On 11/17/2017 03:06 PM, marcelo wrote:
I need to "emulate" the pg_dump code because of the password prompt. Years 
ago I wrote a program (for the QNX environment) that caught a prompt 
and emulated the standard input. I don't want to do that again.


On 17/11/17 17:23, John R Pierce wrote:

On 11/17/2017 12:19 PM, marcelo wrote:

Sorry, I was not exact.
I neither need nor want to change pg_dump. Rather, based on the pg_dump 
code, I need to develop a daemon which can receive a TCP message (from a 
privileged app) containing a few elements: the database to dump, the user 
under which to do it, and that user's password. (My apps use that same 
data, of course, encrypted from the ordinary users.)



I would just fork pg_dump to do the actual dump rather than try and 
incorporate its source code into your app.









Re: [GENERAL] Will a DELETE violate an FK?

2007-05-29 Thread Ron Johnson

On 05/29/07 09:48, Robert James wrote:
I'd like to be able to detect if a record has associations. I don't want 
to actually delete it, just know if it could be deleted. (This is to 
build an intelligent menu on a GUI)


Are you wanting to know this in a generic way or for a specific 
database?


P.S. - Please don't top-post.

On 5/29/07, *Albe Laurenz* <[EMAIL PROTECTED]> wrote:


 > Is there any way to know if a DELETE will violate an FK
 > without actually trying it?

I don't know what you mean by 'without trying it', but does the
following answer your question?

CREATE TABLE a (id integer PRIMARY KEY);
CREATE TABLE b (id integer PRIMARY KEY,
  a_id integer NOT NULL CONSTRAINT b_fkey REFERENCES a(id));

INSERT INTO a (id) VALUES (1);
INSERT INTO b (id, a_id) VALUES (42, 1);

DELETE FROM a WHERE id=1;
ERROR:  update or delete on table "a" violates foreign key constraint
"b_fkey" on table "b"
DETAIL:  Key (id)=(1) is still referenced from table "b".

Yours,
Laurenz Albe
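Since the goal is enabling or disabling a GUI menu item, a non-destructive probe for referencing rows avoids attempting the DELETE at all. A hedged psycopg2-style sketch: the table and column names come from Laurenz's example, the helper is my own, and note the check is racy unless you lock the parent row (another transaction can add a reference between the check and the delete):

```python
def references_exist_sql(child_table, fk_column):
    """Build a query reporting whether any child rows reference a given
    parent id -- cheap enough to drive a GUI menu, without attempting
    the DELETE.  Identifiers here are illustrative and must be trusted
    (they are interpolated, not bound)."""
    return (
        "SELECT EXISTS (SELECT 1 FROM {t} WHERE {c} = %s)"
        .format(t=child_table, c=fk_column)
    )

# With the a/b tables above, a GUI would run something like:
#   cur.execute(references_exist_sql("b", "a_id"), (1,))
#   can_delete = not cur.fetchone()[0]
print(references_exist_sql("b", "a_id"))
```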





--
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [GENERAL] hundreds of schema vs hundreds of databases

2007-05-29 Thread Ron Johnson

On 05/29/07 04:02, Albe Laurenz wrote:
I have an application with some hundreds users, each one 
having the same 
data definitions, and each one storing up to 2 GB of data.
A user has access only to his own data. His data will have its own 
tablespace.


Therefore, it seems to me I have a choice between "one database per 
user" and "one schema per user in the same database".


What is the best practice here ? Which solution will be the 
easiest to manage ?


Advantages of many databases:
- Each database is smaller.
- No danger of one user accessing another user's data (because of
  misconfigured permissions and similar).
- Guaranteed independence of each user's data.
- More scalable: If you decide that one machine or one cluster


You could always dump a schema then drop it and restore it in a new 
database.  At 2GB, that should be quick.



  is not enough to handle the load, you can easily transfer some
  of the databases somewhere else.


Advantages of one database with many schemata:
- Fewer databases to administrate.


But since they all have to have the same schema, you'd still have 
the same DDL overhead whether it's one DB or many.


Does PG set up buffers at the postmaster level or the database level?

If at the database level, then you'll be allocating memory to 
databases that might not be in use at any one time, thus wasting it. 
 One database buffer pool would make more efficient use of RAM.




Re: [GENERAL] Rookie Questions: Storing the results of calculations vs. not?

2007-05-29 Thread Ron Johnson

On 05/24/07 15:20, cjl wrote:

PG:


Sorry it's taken so long for anyone to answer you, but it appears 
that some emails were hung up for a while.



I am playing around with some historical stock option data, and have
decided to use a database to make my life easier.  The data is "end-of-
day" for all equitable options, so something like 17 columns and
approximately 200,000 rows a day.  I have several months of data in
"csv" format, one file for each trading day.

After some simple experiments, I found it was easier to import this
data directly into postgresql than into mysql, because of the
expiration date format being un-friendly to mysql.  I'm using the COPY
command to load the data.

I realize I have a lot of reading to do, but I would like to ask a few
questions to help guide my reading.

1)  The data contains the price of the underlying stock, the strike
price of the option, and the option premium. From this I can calculate
the "cost basis" and the "maximum potential profit", which are
elements I would like to be able to SELECT and ORDER.  Should I store
the results of these calculation with the data, or is this "business
logic" which doesn't belong in the database.  Is this what views are
for?


I'd say "business logic", and yes, views are good for that.


2)  For each underlying stock there are lots of options, each having
unique strike prices and expirations.  For example, AAPL (apple
computer) have stock options (calls and puts) that expire in June, at
various strike prices.  Lets say that apple stock is trading at $112.
I would like to be able to select the options with strikes just above
and below this price, for example $110 and $115.  The data contains
options with strikes from $60 through $125, every $5.  Is this
something I need to do programmatically, or can I create a complex SQL
query to extract this information?


I'd have a table with one row per option.  Then make this kind of query:
(SELECT *
 FROM T_OPTION
 WHERE TICKER = 'AAPL'
   AND EXPIRE_DT BETWEEN '2007-06-01' AND '2007-06-30'
   AND PRICE <= 112.0
 ORDER BY PRICE DESC
 LIMIT 1)
UNION
(SELECT *
 FROM T_OPTION
 WHERE TICKER = 'AAPL'
   AND EXPIRE_DT BETWEEN '2007-06-01' AND '2007-06-30'
   AND PRICE >= 112.0
 ORDER BY PRICE ASC
 LIMIT 1);
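The same just-below/just-above bracketing is easy to sanity-check outside the database; a small Python sketch (the strike list is invented from the thread's $60 through $125, every $5 range):

```python
def bracket_strikes(strikes, spot):
    """Return the strike just at/below and just at/above the spot price,
    mirroring the two halves of the UNION query."""
    below = max((s for s in strikes if s <= spot), default=None)
    above = min((s for s in strikes if s >= spot), default=None)
    return below, above

# Strikes $60..$125 every $5, underlying trading at $112:
strikes = list(range(60, 130, 5))
print(bracket_strikes(strikes, 112))  # (110, 115)
```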


I have rudimentary python skills, and I'm getting the hang of
psycopg2.  After reading the postgresql manual, what should I read
next?




Re: [GENERAL] hundreds of schema vs hundreds of databases

2007-05-30 Thread Ron Johnson

On 05/30/07 01:38, Albe Laurenz wrote:

Ron Johnson wrote:

Does PG set up buffers at the postmaster level or the database level?

If at the database level, then you'll be allocating memory to 
databases that might not be in use at any one time, thus wasting it. 
  One database buffer pool would make more efficient use of RAM.


Shared memory is allocated at the cluster level.
See
http://www.postgresql.org/docs/current/static/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-MEMORY


I read that page, but don't see any references to "cluster level". 
Maybe I am misinterpreting "cluster"?




Re: [GENERAL] Geographic data sources, queries and questions

2007-05-30 Thread Ron Johnson

On 05/29/07 17:46, Michael Glaesemann wrote:


On May 29, 2007, at 15:28 , John D. Burger wrote:


Even ISO country codes are not guaranteed to be stable


I'm not sure where the idea that primary keys must be stable comes from. 
There's nothing necessarily wrong with updating a primary key. All a 
primary key does is uniquely identify a row in a table. If that id 
changes over time, that's fine, as long as the primary key columns 
continue to uniquely identify each row in the table.


And any archived data (for example, transaction detail that you must 
keep for 7 years but don't still want in your database, since it 
doubles your backup/restore times) will still have the old codes.


"Static" data needs to be static.

 SQL even provides 
ON UPDATE CASCADE to make this convenient. There may be performance 
arguments against updating a primary key (as the changes need to 
propagate), but that depends on the needs of a particular (benchmarked 
and tested) application environment.




Re: [GENERAL] problem calling psql multiple times from a script ?

2007-05-30 Thread Ron Johnson

On 05/30/07 10:23, Tom Lane wrote:

Paul Tilles <[EMAIL PROTECTED]> writes:
I am wondering if one of the previous executions of psql is doing some 
"back room" work in the database while allowing the script to continue.  


It takes finite time for the backend to quit, and psql doesn't wait
around for that to happen.  I've noticed that on many systems it seems
that the kernel scheduler actively discriminates against an exiting
backend --- maybe it thinks it's a low-priority background process?
The amount of work needed to start a new psql and a new backend
vastly exceeds what it takes to quit (unless you've got lots of temp
tables to drop, or some such), and yet people report cases like this
pretty often.

We could fix it by making PQfinish() wait for the connection to drop,
but that cure could be worse than the disease; most apps would just
see this as useless delay.

In the meantime, a sleep(1) or some such before trying to drop a
recently-used database ought to be a usable workaround.


The proprietary DB that I use at work implemented something similar 
as a command-line qualifier, and I guarantee you that it's been 
tremendously useful.
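Tom's sleep-before-retry workaround generalizes to a tiny helper around the "drop a recently-used database" step; a hedged Python sketch (the helper and its defaults are my own, not anything psql or libpq provides):

```python
import time

def retry(action, attempts=5, delay=1.0):
    """Re-run an action that can fail transiently (e.g. DROP DATABASE
    while the previous backend is still exiting), sleeping between
    tries and re-raising the last error if all attempts fail."""
    for i in range(attempts):
        try:
            return action()
        except Exception:
            if i == attempts - 1:
                raise
            time.sleep(delay)
```

A script would wrap the psql invocation: `retry(lambda: drop_database("test_db"))`, where `drop_database` is whatever shells out to psql.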




Re: [GENERAL] Geographic data sources, queries and questions

2007-05-30 Thread Ron Johnson

On 05/30/07 11:01, John D. Burger wrote:

Even ISO country codes are not guaranteed to be stable


I'm not sure where the idea that primary keys must be stable comes 
from. There's nothing necessarily wrong with updating a primary key. 
All a primary key does is uniquely identify a row in a table. If that 
id changes over time, that's fine, as long as the primary key columns 
continue to uniquely identify each row in the table.


And any archived data (for example, transaction detail that you must 
keep for 7 years but don't still want in your database, since it 
doubles your backup/restore times) will still have the old codes.


"Static" data needs to be static.


Yes, and then there is the question of what such a recycled code 
actually =means= as a foreign key.


For example, CS used to be the code for Czechoslovakia, then it was for 
Serbia and Montenegro, now it is in "transition" before being deleted.  
Czechoslovakia no longer has a code, since it no longer exists, as far 
as ISO is concerned.  What do you want to do with your biography 
database for 19th century Slavic poets, which indicate that some people 
were born in Czechoslovakia.  Did those people move (briefly) to Serbia 
and Montenegro?  Or did their birthplace change to NULL?  If you want to 
give them a code, you have to find out what part of Czechoslovakia they 
actually lived in, and what country that region's now in.  Do you really 
want some external agency forcing you to muck with your data like this?


In situations like this (a toll road schedule) we add beginning and 
expiring timestamps, and the expiring timestamp is part of the 
natural PK.


Anyway, regardless of one's feelings along these lines, I thought many 
might be implicitly assuming that all of these standards guarantee such 
stability, and I wanted to disabuse folks of that.


It's very interesting and useful to know.



[GENERAL] Continuous PITR (was Re: multimaster)

2007-06-01 Thread Ron Johnson

On 06/01/07 11:16, Andrew Sullivan wrote:
[snip]


My real question in all this is, "What is the problem you are trying
to solve?"  Hot failover using combinations of hardware and software,
and a disk array that can be mounted across two machines, is actually
probably good enough for most cases, assuming it is implemented
correctly (see recent discussion on this topic).  So the availability
piece is mostly solved.  What else do you want?


What you need are disk blocks to be mirrored to a machine at the DR 
site.  Or "continuous PITR" to the DR machine.


How difficult would it be to modify the process (the postmaster?) 
that writes the xlogs(?) to tee them to a listening process across 
the cloud on the DR machine, which then applies them to the DR database?


This then begs the question: are CREATE|ALTER TABLESPACE commands 
stored in the xlogs?




Re: [GENERAL] why postgresql over other RDBMS

2007-06-01 Thread Ron Johnson

On 06/01/07 11:22, Bruce Momjian wrote:

PFC wrote:

On Thu, 31 May 2007 22:20:09 +0200, Vivek Khera <[EMAIL PROTECTED]> wrote:


On May 25, 2007, at 5:28 PM, Tom Lane wrote:


That's true at the level of DDL operations, but AFAIK we could
parallelize table-loading and index-creation steps pretty effectively
--- and that's where all the time goes.
I would be happy with parallel builds of the indexes of a given table.   
That way you have just one scan of the whole table to build all its  
indexes.
	Will the synchronized seq scan patch be able to do this by issuing all  
the CREATE INDEX commands at the same time from several different database  
connections ?


No, but it could someday.


Or would a "CREATE MANY INDEXES" (where in one statement you specify 
all the indexes on a single table) command be easier to implement?


This way also the process reads the table once, building separate 
sortwork files on-the-fly.  Too bad child processes can't inherit 
transaction state.




Re: [GENERAL] Continuous PITR (was Re: multimaster)

2007-06-01 Thread Ron Johnson

On 06/01/07 16:25, Andrew Sullivan wrote:

On Fri, Jun 01, 2007 at 03:58:01PM -0500, Ron Johnson wrote:
What you need are disk blocks to be mirrored to a machine at the DR 
site.  Or "continuous PITR" to the DR machine.


I thought you could already do this?  (I'm not, but I was pretty sure
someone reported doing it already.)


From within PG, or "manually"?

This then begs the question: are CREATE|ALTER TABLESPACE commands 
stored in the xlogs?


(I'll spare the rant about begging the question.)  Since they're
transactional, they must be, no?


I wasn't sure whether they were a special case or not.



Re: [GENERAL] why postgresql over other RDBMS

2007-06-01 Thread Ron Johnson

On 06/01/07 16:38, PFC wrote:


Will the synchronized seq scan patch be able to do this by 
issuing all  the CREATE INDEX commands at the same time from several 
different database  connections ?

 No, but it could someday.


Actually I tested it; it does this right now, albeit by accident (pg 
doesn't do anything to synchronize the scans, but if you launch the 
concurrent connections at the same time and issue all your "create 
index" statements at the same time, only one table scan is needed). Maybe 
if the tables were bigger, it would lose sync between the 3 concurrent 
scans and end up going slower. That's why I mentioned the 
"synchronized scan" patch.


How much of this, though, is from the OS's disk cache?  Or do Seq 
Scans use O_DIRECT and bypass the OS cache?


Or would a "CREATE MANY INDEXES" (where in one statement you specify 
all the indexes on a single table) command be easier to implement?


You can get the effect right now by using concurrent connections it 
seems. Not very practical in a psql script, though...




One last Slony question (was Re: [GENERAL] Slightly OT.)

2007-06-01 Thread Ron Johnson

On 06/01/07 17:31, Andrew Sullivan wrote:

On Sat, Jun 02, 2007 at 12:23:44AM +0200, Alexander Staubo wrote:

Could you not (I ask naively) detect the first DDL statement is
submitted in a transaction 


Maybe.


on the master, then start a transaction on
each slave, then funnel this and all subsequent statements
synchronously to every nodes, then prepare and commit everyone?


You could if 2PC was ubiquitous, which it certainly wasn't when the
code was designed (remember, it was originally compatible all the way
back to 7.3).  Some people suggested using 2PC "if it's there", but
that just seems to me to be asking for really painful problems.  It
also entails that all DDL has to happen on every node at the same
time, which imposes a bottleneck not actually currently in the
system.


Since DDL is infrequent, is that bottleneck an acceptable trade-off?


It is probably the case, however, that version 2 of the system will
break some of these backwards compatibility attempts in order to
depend on some new back end features -- putting this entirely in user
space turns out to be awful.  It's how we got the monstrous catalog
corruption hack.

This is getting pretty Slony specific, though, so if we're to
continue this thread, I suggest we do it on the Slony list.






Re: One last Slony question (was Re: [GENERAL] Slightly OT.)

2007-06-01 Thread Ron Johnson

On 06/01/07 18:35, Joshua D. Drake wrote:

Ron Johnson wrote:

On 06/01/07 17:31, Andrew Sullivan wrote:

On Sat, Jun 02, 2007 at 12:23:44AM +0200, Alexander Staubo wrote:

Could you not (I ask naively) detect the first DDL statement is
submitted in a transaction 


Maybe.


on the master, then start a transaction on
each slave, then funnel this and all subsequent statements
synchronously to every nodes, then prepare and commit everyone?


You could if 2PC was ubiquitous, which it certainly wasn't when the
code was designed (remember, it was originally compatible all the way
back to 7.3).  Some people suggested using 2PC "if it's there", but
that just seems to me to be asking for really painful problems.  It
also entails that all DDL has to happen on every node at the same
time, which imposes a bottleneck not actually currently in the
system.


Since DDL is infrequent, is that bottleneck an acceptable trade-off?


Define infrequent? I have customers that do it every day in prod. They 
do it willingly and refuse to change that habit.


Even 2 or 3 ALTER TABLE or CREATE INDEX or CREATE TABLE statements 
per day is a drop in the bucket compared to the number of I/U/D 
statements, no?




Re: One last Slony question (was Re: [GENERAL] Slightly OT.)

2007-06-01 Thread Ron Johnson

On 06/01/07 19:17, Joshua D. Drake wrote:

Ron Johnson wrote:

On 06/01/07 18:35, Joshua D. Drake wrote:


Since DDL is infrequent, is that bottleneck an acceptable trade-off?


Define infrequent? I have customers that do it every day in prod. 
They do it willingly and refuse to change that habit.


Even 2 or 3 ALTER TABLE or CREATE INDEX or CREATE TABLE statements per 
day is a drop in the bucket compared to the number of I/U/D 
statements, no?


True.


So Alexander Staubo's idea of synchronous DDL replication via 2PC 
has some merit?




Partitioning (was Re: [GENERAL] Slightly OT.)

2007-06-01 Thread Ron Johnson

On 06/01/07 19:29, Jeff Davis wrote:
[snip]

You shouldn't use a volatile function in a check constraint. Use a
trigger instead, but even that is unlikely to work for enforcing
constraints correctly.

In general, for partitioning, you have to make some sacrifices. It's
very challenging (and/or expensive) to ensure uniqueness across
partitions.


Are partitioned databases the same as federated databases?



Re: [GENERAL] Transactional DDL

2007-06-02 Thread Ron Johnson


You were politely asked not to top-post.

On 06/02/07 11:46, Harpreet Dhaliwal wrote:

So, while writing any technical document, would it be wrong to mention
stored procedures in PostgreSQL?
What is the general convention?


Did I miss something?  What does "stored procedures" have to do with 
"Transactional DDL"?



On 6/2/07, Dawid Kuroczko <[EMAIL PROTECTED]> wrote:


On 6/2/07, Jasbinder Singh Bali <[EMAIL PROTECTED]> wrote:
> But it's said that transactions in any RDBMS follow the ACID properties.
> So if I put a CREATE TABLE and an INSERT statement in the same BEGIN/END
> block as one single transaction, won't both the CREATE and the INSERT
> follow the ACID properties, being in one single transaction, and either
> both get committed or none? Talking about Oracle, let's say.

Actually, Oracle inserts an implicit COMMIT after each DDL statement.

So, if you have:

BEGIN;
INSERT INTO foo (bar) VALUES (1);
CREATE INDEX foo_bar ON foo (bar);
-- Here Oracle will insert an implicit COMMIT, thus your foo table will
-- have value 1 committed.
-- And here Oracle will BEGIN a new transaction.
INSERT INTO foo (bar) VALUES (2);
ROLLBACK;
-- And you will roll back the insert of value 2.  Value 1 remains in
-- the table, because it is already committed.

   Regards,
   Dawid




Re: [GENERAL] Transactional DDL

2007-06-02 Thread Ron Johnson

On 06/02/07 13:35, Jasbinder Singh Bali wrote:

On 6/2/07, Michael Glaesemann <[EMAIL PROTECTED]> wrote:



On Jun 2, 2007, at 11:08 , Harpreet Dhaliwal wrote:

> What's so novel about postgresql here?
> This would happen in any RDBMS, right?
> You induced a divide-by-zero exception that crashed the whole
> transaction and it did not create the table bar?

[Please don't top-post. It makes the discussion hard to follow.]

I used the divide by zero to raise an error to show that both the
CREATE TABLE and the INSERT were rolled back when the transaction
failed. If there's another definition of transactional DDL, I'd like
to know what it is.

Michael Glaesemann
grzm seespotcode net



This is what happens in every RDBMS. What's so special about Postgres then?


But it's NOT what happens in every RDBMS.  Oracle implicitly 
executes a COMMIT after every DDL statement.




Re: [GENERAL] general features

2007-06-03 Thread Ron Johnson

On 06/03/07 09:08, Badawy, Mohamed wrote:
[snip]


Thanks for all of the replies,,,

What I mean by clustering is to put the database on many machines.


Single database on many machines?

Do you mean federation/horizontal scaling, DR replication, or 
something different?  PostgreSQL has master-slave replication using 
Slony-I.


If you mean simultaneous access to the same disks from multiple 
machines without corrupting the data, you'll have to go to a 
proprietary system.  10K (15K if you've got the scratch) RPM RAID10, 
8GB RAM and a couple of dual-core Opterons is a *really* fast 
database box.  You might not need more than that.


But if you *do* need the continuous uptime that shared-disk 
clustering and rolling in-place upgrades gives you, then Rdb/VMS 
can't be beat.  It'll *cost*, though.




Re: [GENERAL] Multiple customers sharing one database?

2007-06-03 Thread Ron Johnson

On 06/03/07 12:45, Lew wrote:
[snip]


I am always confused by the overloaded term "database" in such 
discussions. Do we mean the RDBMS engine, i.e., run separate instances 
of PG?  I sure would recommend against that.  Or do we mean it as the PG 
documentation does, e.g.,

<http://www.postgresql.org/docs/8.1/interactive/ddl-schemas.html>


A (relational) database is a database, and an RDBMS is what 
manipulates that (relational) database.


"The" schema "defines" the database.  "A" schema defines a specific 
logical sub-set of "the" schema.




Re: [GENERAL] Database design wisdom needed

2007-06-05 Thread Ron Johnson
 engine for different kinds of tables help -- engines that were
better for INSERT and SELECT, while others that were good for UPDATE?
We ran MySQL until a year ago but we have switched to Pg since we had
data corruption issues. I am sure I'm missing some tricks in the Pg
world, and would truly appreciate any ideas.

If you are reading until here, well done! And so many thanks in
advance for any insight you can shed into this matter.


What index(es) is/are on GAME and GAME_COUNTS?

What version of PG are you running?

Are you pushing the box too hard?



Re: [GENERAL] Encrypted column

2007-06-05 Thread Ron Johnson

On 06/04/07 17:54, Guy Rouillier wrote:

Ranieri Mazili wrote:

Hello,

I need to store users and passwords in a table and I want to store them 
encrypted, but I haven't found documentation about it. How can I create 
a table with columns "user" and "password", with the "password" column 
encrypted, and how can I check whether "user" and "password" are correct 
using a SQL query?


Many people consider two-way encryption to be insecure; two-way 
encryption means you can decrypt a value if you know the key, and it is 
insecure because you usually have to put the key into the source code. 
That means at least one person in your company, the programmer 
maintaining the source code, can learn all of your users' passwords.


Two-way encryption is needed for companies that store customer 
credit cards.


But yes, I've always worried about that.

 One 
way around that is to hash the value instead.  Then to validate, at 
runtime you hash the user-entered password using the same hash function, 
and validate that it matches the stored hash.  No one in your company 
ever knows end-user passwords.
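A minimal sketch of the hash-and-verify scheme Guy describes, using Python's standard library (the choice of PBKDF2 with a random per-user salt is mine; the thread doesn't prescribe an algorithm):

```python
import hashlib
import hmac
import os

def hash_password(password: str, salt: bytes = None):
    """Salted, iterated hash: what gets stored is (salt, digest),
    never the password itself."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest

def check_password(password: str, salt: bytes, digest: bytes) -> bool:
    """Re-hash the attempt with the stored salt and compare in
    constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return hmac.compare_digest(candidate, digest)

salt, digest = hash_password("s3cret")
print(check_password("s3cret", salt, digest))   # True
print(check_password("wrong", salt, digest))    # False
```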







Re: [GENERAL] Encrypted column

2007-06-05 Thread Ron Johnson

On 06/05/07 08:59, Alvaro Herrera wrote:

Ron Johnson wrote:

On 06/04/07 17:54, Guy Rouillier wrote:


Many people consider two-way encryption to be insecure; two-way 
encryption means you can decrypt a value if you know the key, and it is 
insecure because you usually have to put the key into the source code. 
That means at least one person in your company, the programmer 
maintaining the source code, can learn all of your users' passwords.
Two-way encryption is needed for companies that store customer 
credit cards.


I thought that the advice for companies storing customer CCs was: don't.


Sometimes you "must".

An example from my industry: transponder "toll tags" and toll roads. 
 The customer pre-pays a certain amount based on expected usage, 
and every time he drives thru a plaza, his balance decreases.  Once 
it drops to a certain threshold, more money needs to be added to the 
account.


If he is a CASH/CHEK customer, a light at the lane flashes yellow 
and (depending on the Agency) a message pops up saying, "Balance 
low", so he drives over to the customer service center, stands in 
line and pays his cash.


If he is a CC customer, the system (which I am DBA of) bills his 
card directly, saving the customer much time and effort.


--
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] pointer to feature comparisons, please

2007-06-13 Thread Ron Johnson

On 06/13/07 15:02, Kevin Hunter wrote:
[snip]


'To enforce this rule without integrity constraints, you can use a 
trigger to query the department table and test that each new employee's 
department is valid. But this method is less reliable than the integrity 
constraint. SELECT in Oracle Database uses "consistent read", so the 
query might miss uncommitted changes from other transactions.'


Isn't it *supposed* to miss UNcommitted changes from other transactions?

--
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [GENERAL] pointer to feature comparisons, please

2007-06-13 Thread Ron Johnson

On 06/13/07 16:59, PFC wrote:

Isn't it *supposed* to miss UNcommitted changes from other transactions?


Well, if the "uncommitted change" is a DELETE of the row that allowed 
the constraint check to pass, then when this delete is committed, your 
data is no longer consistent.


The DELETE should block, no?

--
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org/


Re: [GENERAL] pointer to feature comparisons, please

2007-06-13 Thread Ron Johnson

On 06/13/07 17:23, PFC wrote:
On Thu, 14 Jun 2007 00:09:20 +0200, Ron Johnson <[EMAIL PROTECTED]> 
wrote:



On 06/13/07 16:59, PFC wrote:

Isn't it *supposed* to miss UNcommitted changes from other transactions?
 Well, if the "uncommitted change" is a DELETE of the row that 
allowed the constraint check to pass, then when this delete is 
committed, your data is no longer consistent.


The DELETE should block, no?


Why ?

Foreign keys put an ON DELETE trigger on the referenced table 


Foreign keys that silently, automatically DELETE records?

Did I read that correctly?

besides checking the referencing column on insert/update... If you just 
implement a constraint, you only get half the functionality.


But when I define a FK *constraint*, that's all I *want*!

--
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] pointer to feature comparisons, please

2007-06-14 Thread Ron Johnson

On 06/14/07 02:24, PFC wrote:



The DELETE should block, no?

 Why ?
 Foreign keys put an ON DELETE trigger on the referenced table


Foreign keys that silently, automatically DELETE records?
Did I read that correctly?


Isn't that the point of ON DELETE CASCADE ?


Where'd that come from?  Did I miss something in the thread?

besides checking the referencing column on insert/update... If you 
just implement a constraint, you only get half the functionality.


But when I define a FK *constraint*, that's all I *want*!


When you add a REFERENCE foo( bar ) foreign key and don't specify ON 
DELETE clause, I believe you get ON DELETE NO ACTION, which means the 
referenced row can't be deleted if there are referencing rows...  so 
when you have a FK constraint, it actually constrains two tables, and pg 
uses share locking to ensure race conditions can't cause an inconsistent 
database.
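
PFC's point about the default behavior can be demonstrated with any SQL engine; here is a self-contained sketch using Python's bundled sqlite3 (table names are made up for illustration; Postgres behaves the same way with a plain REFERENCES clause):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # sqlite3 enforces FKs only when enabled
conn.execute("CREATE TABLE foo (bar INTEGER PRIMARY KEY)")
# No ON DELETE clause: the default is NO ACTION, so a referenced
# row cannot be deleted while referencing rows still exist.
conn.execute("CREATE TABLE child (ref INTEGER REFERENCES foo(bar))")
conn.execute("INSERT INTO foo VALUES (1)")
conn.execute("INSERT INTO child VALUES (1)")

def try_delete():
    # Attempt to delete the referenced parent row.
    try:
        conn.execute("DELETE FROM foo WHERE bar = 1")
        return "deleted"
    except sqlite3.IntegrityError:
        return "blocked"
```

The delete is rejected, not silently cascaded, which is exactly the constraint-only behavior being discussed.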


--
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org/


Re: [GENERAL] PHP sucks!! - was: persistent db connections in PHP

2007-06-16 Thread Ron Johnson

On 06/16/07 11:24, PFC wrote:
[snip]


It's a matter of mindset. PHP and Postgres have really opposite 
mindsets. Python is a lot more similar to Postgres, for instance :


- Postgres, Python : strongly typed, throws an error rather than 
doing funny stuff with your data, your code does what you think it 
should do.


I wouldn't call Python *strongly* typed, but I do know what you 
mean.  I think.
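
A quick illustration of what is likely meant: Python is dynamically but fairly strictly typed, refusing to coerce across types where PHP would silently juggle values:

```python
def add_mixed():
    # Python raises rather than guessing; PHP would coerce "1" + 1 to 2.
    try:
        return "1" + 1          # str + int
    except TypeError as exc:
        return f"TypeError: {exc}"
```

This "throw an error rather than do funny stuff with your data" behavior is the mindset being compared to Postgres.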


--
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [GENERAL] What O/S or hardware feature would be useful for databases?

2007-06-16 Thread Ron Johnson

On 06/16/07 10:47, Lincoln Yeoh wrote:

Hi,

I've been wondering, what O/S or hardware feature would be useful for 
databases?


If Postgresql developers could get the CPU and O/S makers to do things 
that would make certain things easier/faster (and in the long term) what 
would they be?


By long term I mean it's not something that's only useful for a few 
years. Not something "gimmicky".


For example - something like virtual memory definitely made many things 
easier. Hardware support for virtualization also makes stuff like vmware 
easier and better.


What's the purpose of a multi-processing OS if you're just going to 
run a bunch of single-task VMs?


Seems CPU makers currently have more transistors than they know what to 
do with, so they're adding cores and doing a lot of boring stuff like 
SSE2, SSE3, SSE4, etc.


So is there anything else useful that they (and the O/S ppl) can do that 
they aren't doing already?


Reducing memory latency always helps.  That's AMD's strong point and 
now Intel is doing it.


They've both got more cache.  While I can't see the big use in PCs 
with quad-cores, multi-core can't help but benefit database servers.


AMD, Intel & IBM are always profiling code, to find bottlenecks in 
their microarchitectures.


POWER6 can run at 4GHz and is multi-core.

Anyway... databases are always(?) IO bound.  I'd try to figure out 
how to make a bigger hose (or more hoses) between the spindles and 
the mobo.


The Alpha 8400 had multiple PCI *buses*, so as not to have a 133MBps 
chokepoint.  A server with multiple PCI-e buses, 10Gb Ethernet, and 
lots of 4Gb HBAs attached to a big, fat SAN chock full of 15K SCSI 
disks could suck up a heck of a lot of data.


Better support for distributed locking (across cluster nodes etc)? OK 
that's old stuff, but the last I checked HP was burying VMS and Tandem.


AMD's HyperTransport could probably be used similar to Memory 
Channel.  However, nowadays, gigabit Ethernet is the CI of choice, 
meaning that it's all done in software.


Hardware acceleration for quickly counting the number of 
set/unset/matching bits?


x86 doesn't already do that?

--
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] PHP sucks!! - was: persistent db connections in PHP

2007-06-16 Thread Ron Johnson

On 06/16/07 15:04, Andrej Ricnik-Bay wrote:

On 6/17/07, John Smith <[EMAIL PROTECTED]> wrote:

guys,
love both tools but php @ 2.5 *billion* google results is far more
popular than postgresql @ 25 million google results. *if* somebody's
gotto adapt it's not php. php does what it does best in a way that
stuffy academics don't get.


That's like saying that BSD or Linux should be more like
Windows because there's more Windows than Linux stuff
to be found on the web 


You've not used KDE lately, have you?  :)

And while *ix might not get viruses, it's got its share of worms 
and rootkits.



The following is a bit out of proportion, but you may get
the idea"
"If the majority of the population had cancer it would still be
an illness and not the norm."


--
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org/


Re: [GENERAL] PHP sucks!! - was: persistent db connections in PHP

2007-06-16 Thread Ron Johnson

On 06/16/07 15:34, John Smith wrote:

On 6/16/07, Uwe C. Schroeder <[EMAIL PROTECTED]> wrote:

Mhhh - what does PHP have to do with Postgresql? Lots of pages just end
in .php, which is why the google results are so high - guess what, the
tool "html" hits 3.2 billion :-)

show me a database that doesn't respect html and i'll show you one i
don't want :).


Respect html?

WTH does that mean?

--
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org/


Re: [GENERAL] What O/S or hardware feature would be useful for databases?

2007-06-16 Thread Ron Johnson

On 06/16/07 17:05, Alexander Staubo wrote:

On 6/16/07, Ron Johnson <[EMAIL PROTECTED]> wrote:

> Hardware acceleration for quickly counting the number of
> set/unset/matching bits?

x86 doesn't already do that?


I don't think so. The fastest way, I believe, is to use precomputed
lookup tables. Same for finding the least/most significant set/unset
bit, and other operations useful for dealing with bit vectors.


A couple of new AMD Barcelona opcodes might help do that:

http://www.anandtech.com/showdoc.aspx?i=2939&p=6

While on the topic of instructions, AMD also introduced
a few new extensions to its ISA with Barcelona. There are
two new bit manipulation instructions: LZCNT and POPCNT.
Leading Zero Count (LZCNT) counts the number of leading
zeros in an op, while Pop Count counts the leading 1s in
an op. Both of these instructions are targeted at
    cryptography applications.
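
The lookup-table technique Alexander mentions is easy to sketch: a 256-entry precomputed table lets you count set bits a byte at a time, which is what POPCNT collapses into a single instruction:

```python
# Precomputed popcount for every possible byte value (0..255).
POPCOUNT8 = [bin(i).count("1") for i in range(256)]

def popcount32(x: int) -> int:
    # Sum the per-byte counts of a 32-bit value.
    return (POPCOUNT8[x & 0xFF]
            + POPCOUNT8[(x >> 8) & 0xFF]
            + POPCOUNT8[(x >> 16) & 0xFF]
            + POPCOUNT8[(x >> 24) & 0xFF])
```

Four table lookups and three adds per word; the hardware instruction does the same work in one cycle.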

--
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] What O/S or hardware feature would be useful for databases?

2007-06-17 Thread Ron Johnson

On 06/17/07 00:19, Greg Smith wrote:

On Sat, 16 Jun 2007, Ron Johnson wrote:

Anyway... databases are always(?) IO bound.  I'd try to figure out how 
to make a bigger hose (or more hoses) between the spindles and the mobo.


What I keep waiting for is the drives with flash memory built-in to 
mature.  I would love to get reliable writes that use the drive's cache 
for instant fsyncs, instead of right now where you have to push all that 
to the controller level.


But drive-based flash memory will always be a fixed size, and only 
for that drive.


Controller-based cache is expandable and caches the whole RAID set 
(besides being battery-backed).  And if you *still* need more cache, 
rip out that controller and put in a more expensive one, or 
transition to "plain" SCSI cards and a storage controller.


--
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org/


Re: [GENERAL] What O/S or hardware feature would be useful for databases?

2007-07-02 Thread Ron Johnson

On 06/18/07 08:05, Merlin Moncure wrote:
[snip]


That being said, it's pretty clear to me we are in the last days of
the disk drive.


Oh, puhleeze.  Seagate, Hitachi, Fuji and WD aren't sitting around 
with their thumbs up their arses.  In 3-4 years, large companies 
and spooky TLAs will be stuffing SANs with hundreds of 2TB drives.


My (young) kids will be out of college before the density/dollar of 
RAM gets anywhere near that of disks.  If it ever does.


What we are in, though, is the last decade of tape.


When solid state drives become prevalent in server
environments, database development will enter a new era...physical
considerations will play less and less a role in how systems are
engineered.


"Oh, puhleeze" redux.

There will always be physical considerations.  Why?

Even if static RAM drives *do* overtake spindles, you'll still need 
to engineer them properly.  Why?



1) There's always a bottleneck.

2) There's always more data to "find" the bottleneck.


So, to answer the OP, my answer would be to 'get rid of
the spinning disk!' :-)


--
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [GENERAL] Experiences of PostgreSQL on-disk bitmap index patch

2007-07-02 Thread Ron Johnson

On 06/25/07 09:58, Tom Lane wrote:
[snip]


The fly in the ointment is that if the column value is so high
cardinality as all that, it's questionable whether you want an index
search at all rather than just seqscanning; and it's definite that
the index access cost will be only a fraction of the heap access cost.
So the prospects for actual net performance gain are a lot less than
the index-size argument makes them look.


Well they definitely are for data warehouses, in which many 
high-cardinality columns each have an index.


Because of their small disk size, ANDing them is fast and winnows 
down the result set.  That's the theory, of course.


--
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] What O/S or hardware feature would be useful for databases?

2007-07-03 Thread Ron Johnson

On 07/03/07 13:03, Merlin Moncure wrote:

On 7/2/07, Ron Johnson <[EMAIL PROTECTED]> wrote:

On 06/18/07 08:05, Merlin Moncure wrote:
[snip]
>
> That being said, it's pretty clear to me we are in the last days of
> the disk drive.

Oh, puhleeze.  Seagate, Hitachi, Fuji and WD aren't sitting around
with their thumbs up their arses.  In 3-4 years, large companies
and spooky TLAs will be stuffing SANs with hundreds of 2TB drives.


haven't we had this debate before?

I don't know if you've been paying attention to what's going on in the
storage industry...Apple, Dell, Fuji, Sandisk, Intel, and others are
all making strategic plays in the flash market.  At the outset of
2007, flash was predicted to decline 50% for the year...so far, prices
have dropped 65% in the first two quarters.  Right now it's all about
the high end notebooks and media players but the high margin, high
rotation speed drives are next.


Technological nay-sayers have been wrong before, but I just can't 
see a *database* server full of static RAM in the next 10 years.



 I admit the high density low speed
cold storage d2d backup systems will be the last to fall and will be
quite some ways off.

note by, 'next', and 'last days', i mean that pretty loosely...within
the next 5 years or so.  'dead' as well...there are many stages of
death to an enterprise legacy product.  I consider tape backups to be
nearly dead already, although there are many still in use.  d2d is
where it's at though.


Mainframers (and various other oldsters like me) think about
1) shock resistance,
2) media costs,
3) Iron Mountain,
4) media longevity.

You can drop a SuperDLT tape from "man height" and recover the data 
(even if it has to be restrung into a new housing).  I wouldn't drop 
a disk full of data and have any expectation of survival.


A 160GB ("320"GB compressed) SATA drive is about $60 plus a $10 
carrier.  That compares very well to tapes, I think.


An Iron Mountain delivery truck will drive over some nasty bumps. 
How shock resistant is a disk drive in an external carrier?  Not as 
resistant as a drive in a padded shipping box.  But is it resistant 
"enough"?


"Enterprise-level" tapes can sit in storage for 7-15 years and then 
still be readable.  Can a disk drive sit un-used for 7 years?  Would 
the motor freeze up?  Will we still be able to connect SATA drives 
in 7 years?


--
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] What O/S or hardware feature would be useful for databases?

2007-07-04 Thread Ron Johnson

On 07/04/07 16:00, Andrej Ricnik-Bay wrote:

On 7/4/07, Ron Johnson <[EMAIL PROTECTED]> wrote:


"Enterprise-level" tapes can sit in storage for 7-15 years and then
still be readable.  Can a disk drive sit un-used for 7 years?  Would
the motor freeze up?  Will we still be able to connect SATA drives
in 7 years?


I was a bit harsh about connecting to SATA drives.  IDE has been 
around for 21 years and ATA-133 is backwards compatible with 20MB 
drives of that era, so I predict that you'll be able to plug SATA-1 
drives into machines with SATA-9 interfaces.


But then, the motor might still not spin up... :(


Same with a tape-drive, no?  I've seen so many standard changes
in drives and SCSI connectors ... if you don't keep spares of all the
equipment involved you'll face the same issue with tapes that you'd
face with SATA disks.


No.

Enterprise tape drives are not "flavor of the month", and can always 
read the previous one or two generations of tape.


And if you've switched from, for example, SuperDLT to LTO, then 
you'll still be able to buy some drives on the used market (either 
eBay or from a dealer).


--
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] Direct I/O

2007-07-08 Thread Ron Johnson

On 07/08/07 22:07, lai yoke hman wrote:

Hello,
Is there any direct I/O instead of buffered I/O for postgresql?
Thank you.


Why do you want raw IO?  Performance would tank.

--
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] how does a temp table work?

2007-07-11 Thread Ron Johnson

On 07/11/07 12:06, Andrew Sullivan wrote:

On Wed, Jul 11, 2007 at 10:28:25PM +0530, Joshua N Pritikin wrote:
and foo is executed from different transactions in parallel then each 
transaction will see its own private temp table, right?


They will see their own private temp tables _no matter what_. 
There's no "shared temp table" support in Postgres.


Will GLOBAL TEMP tables be added any time soon?

I know the docs say that the standards implies needing modules, but 
"shared definition" is really handy.  We use regularly such GLOBAL 
TEMP tables without modules.


--
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] One Large Table or Multiple DBs?

2007-07-12 Thread Ron Johnson

On 07/09/07 16:18, Mike wrote:

Hi,

What is efficient approach toward storing a web application's user
data. How do applications such as basecamp, SalesForce or QuickBooks
online store their data? Is it in one DB with huge tables each record
having a user account's foreign key or do they create a separate
database for each of their accounts? Which one is more efficient? My
guess was in one large DB with large tables.


How big is "big"?

What is "efficient"?  Speed, management, upgrades, backups, scalability?

If each customer has 80GB of data, then separate databases are the 
way to go, since it eases scalability and allows for parallel 
backups.  But then upgrades must be applied to each of thousands of 
databases.  If each customer has 100MB of data, then unified tables 
keyed off of account number would be simpler.


In between is schema-per-account.  But upgrades are still a chore.

--
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] createing indexes on large tables and int8

2007-07-18 Thread Ron Johnson
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 07/17/07 17:12, [EMAIL PROTECTED] wrote:
> On Tuesday 17 July 2007 17:47:01 Tom Lane wrote:
>> [EMAIL PROTECTED] writes:
>>> i think i got it fixed as i saw that i pushed my maintenance_work_mem too
>>> high. It was higher than physical ram :-(
>> Ooops, that will definitely cause problems.
> 
> yes it did! I ran it again. And now it takes 10 minutes per index instead of 
> 10 hours (still 8.1). maybe something postgres should complain about if 
> setting maintance_work_mem too high. 

Unless it makes some really OS-specific calls, *can* PostgreSQL know
how much *physical* RAM is in a box?

- --
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFGnk+zS9HxQb37XmcRAsDtAKCCadB0CF8ATeHCtO79wcTD3lER7wCgttoF
E9Rndryd/IhZEP2FY7yIr/A=
=bDSf
-END PGP SIGNATURE-

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] Delete/update with limit

2007-07-23 Thread Ron Johnson
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 07/23/07 10:56, Csaba Nagy wrote:
> Hi all,
> 
> This subject was touched a few times in the past, I looked into the
> archives... the result is invariably key developers saying such a
> feature is unsafe because the result is unpredictable, while the people
> requesting is saying it is OK that way, it is expected... but no
> compelling use case for it.
> 
[snip]
> 
> Now I don't put too much hope I can convince anybody that the limit on
> the delete/update commands has valid usage scenarios, but then can
> anybody help me find a good solution to chunk-wise process such a buffer
> table where insert speed is the highest priority (thus no indexes, the
> minimum of fields), and batch processing should still work fine with big
> table size, while not impacting at all the inserts, and finish in short
> time to avoid long running transactions ? Cause I can't really think of
> one... other than our scheme with the delete with limit + trigger +
> private temp table thing.

Maybe add OIDs to the table, and delete based on the OID number?
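
The chunk-wise delete pattern behind that suggestion can be sketched as follows. The example uses Python's bundled sqlite3 so it is self-contained and runnable; sqlite3's implicit rowid stands in for the OID (or ctid) you would key on in a Postgres buffer table, and the table name is hypothetical:

```python
import sqlite3

def delete_in_chunks(conn, table, chunk=1000):
    """Delete all rows in small batches, keeping each transaction short
    so long-running transactions never block concurrent inserts."""
    deleted = 0
    while True:
        cur = conn.execute(
            f"DELETE FROM {table} WHERE rowid IN "
            f"(SELECT rowid FROM {table} LIMIT ?)", (chunk,))
        conn.commit()  # commit each batch; the next batch starts fresh
        if cur.rowcount == 0:
            break
        deleted += cur.rowcount
    return deleted
```

Each iteration touches at most `chunk` rows, so inserts proceed between batches and no transaction stays open long.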

- --
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFGpQ7zS9HxQb37XmcRArXQAJ9qcrWphVgtINdGlcwGubg/SEsjMgCeKyLt
I8xPs0NEGqg22Cvgf4awNVQ=
=l/yz
-END PGP SIGNATURE-

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] Will partial index creation use existing index?

2007-07-24 Thread Ron Johnson
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 07/24/07 14:48, Gregory Stark wrote:
[snip]
> 
> The main use case for it is actually REINDEX. Since you already have an index
> which contains precisely the records you want to index and already in order
> too.
> 
> The main disadvantage is that it's not clear when it would actually be faster.
> Generally index scans are slower than reading the whole table and sorting.
> Probably it would have to run an SPI query to use the planner to find the best
> way to get the rows it wants.

I believe you, but it's totally counter to prima facie logic.

Scanning the whole table means that you have to read in a whole
bunch of columns that you don't really give a rat's arse about, and
thus is a waste, whereas directly reading an existing index means
that you've got perfect locality of data, since you're only reading
what you care about.

> Another problem is that presumably you're reindexing because the existing
> index *isn't* in such good shape. You may even be doing it because the
> existing index is corrupt.

That, of course, is an excellent point.

- --
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFGpqHkS9HxQb37XmcRAu93AKC04YXUWvMI6YiLfBNoy2BYtQw28ACdHqE/
kVqHiPwBONv0Tudy5OnA/SE=
=Fbuw
-END PGP SIGNATURE-

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] Linux distro

2007-08-01 Thread Ron Johnson
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 08/01/07 10:37, Owen Hartnett wrote:
> At 4:52 PM +0200 8/1/07, Leif B. Kristensen wrote:
>> On Wednesday 1. August 2007 16:15, Madison Kelly wrote:
>>
>>> /Personally/, I love Debian on servers.
>>>
>>> It's not quite as 'hardcore' as Gentoo (a great distro, but not one to
>>> start with!). It's the foundation of many of the popular distros
>>> (Ubuntu, Mepis, Knoppix, etc) and the Debian crew is very careful
>>>  about what they put into the 'stable' repositories.
>>
>> I agree totally. Debian in a server configuration is quite easy to get
>> started with, and is rock solid. My first Linux "test server" (my old
>> Pentium 133 MHz desktop) way back in 2002 ran Debian Woody. I kept it
>> running until it died from old age a couple of years ago. Later I fell
>> in love with Gentoo. But if I'd have to run a server with maximum
>> stability and uptime, I think that I'd still prefer Debian.
> 
> As an alternative viewpoint, I've been running the latest postgres on
> Mac OS X Server 10.4, and it's been great for me.  It was my first time
> using a server, and my first serious use of postgres (although I have
> had a lot of previous unix experience.)  All the power of unix, all the
> ease of the Macintosh (and it's server installation gives you lots of

Pardon me for being the contrarian, but why does a server need a
GUI?  Isn't that just extra RAM & CPU overhead that could be more
profitably put to use powering the application?

> great things for free and already installed - granted most is publicly
> available, but it's already installed and ready for use that's the big
> advantage).  Not only that, but I can run windoze in Parallels (or even
> Boot Camp if I desired).

- --
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFGsUJ4S9HxQb37XmcRAggtAKCfNxfv4/+Qmwt/89jiun4jWsM/FACeMMld
zyDp6ec4t12RF6XGnEE3uUg=
=E95M
-END PGP SIGNATURE-

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] Linux distro

2007-08-01 Thread Ron Johnson
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 08/01/07 21:44, Andrej Ricnik-Bay wrote:
> On 8/2/07, Ron Johnson <[EMAIL PROTECTED]> wrote:
> 
>>> As an alternative viewpoint, I've been running the latest postgres on
>>> Mac OS X Server 10.4, and it's been great for me.  It was my first time
>>> using a server, and my first serious use of postgres (although I have
>>> had a lot of previous unix experience.)  All the power of unix, all the
>>> ease of the Macintosh (and it's server installation gives you lots of
>> Pardon me for being the contrarian, but why does a server need a
>> GUI?  Isn't that just extra RAM & CPU overhead that could be more
>> profitably put to use powering the application?
> Amen =)
> 
> Unnecessary waste of resources, plus artificial introduction of stuff
> that can potentially make the machine go belly-up...  a dedicated
> server should have the bare minimum install; as much as necessary
> and as little as possible.

That's (one reason) why I like Debian.  Its packages are so
granular that you can only install what you want to install.

I.e., Python without the GNOME or KDE language bindings, which would
also drag in GNOME/KDE and X.org.

> /me cringes at the idea of flying toasters slowing down a query ... :D

- --
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFGsUokS9HxQb37XmcRAvYQAKC/ju2KOmb3ZOt8rCPd/CEc6cJJrgCfXOVM
aS08rA2LtbDNf/h1HUqbSI8=
=WIP6
-END PGP SIGNATURE-

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] Linux distro

2007-08-01 Thread Ron Johnson
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 08/01/07 21:58, Merlin Moncure wrote:
[snip]
> 
> 3. binary packaging
> While I like the debian distros generally, I dislike the debian
> packaging of PostgreSQL.  IMO, it's over engineered.  If you plan to

How so?

- --
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFGsUqHS9HxQb37XmcRAg0+AKCiY79IVE0APk6YcF6HSlIoRmvshACfTH+2
hTCDwDo40q4+VPlHBAjxLIc=
=ysHU
-END PGP SIGNATURE-

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] Linux distro

2007-08-01 Thread Ron Johnson
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 08/01/07 22:05, Merlin Moncure wrote:
> On 8/2/07, Ron Johnson <[EMAIL PROTECTED]> wrote:
>> -BEGIN PGP SIGNED MESSAGE-
>> Hash: SHA1
>>
>> On 08/01/07 10:37, Owen Hartnett wrote:
>>> At 4:52 PM +0200 8/1/07, Leif B. Kristensen wrote:
>>>> On Wednesday 1. August 2007 16:15, Madison Kelly wrote:
>>>>
>>>>> /Personally/, I love Debian on servers.
>>>>>
>>>>> It's not quite as 'hardcore' as Gentoo (a great distro, but not one to
>>>>> start with!). It's the foundation of many of the popular distros
>>>>> (Ubuntu, Mepis, Knoppix, etc) and the Debian crew is very careful
>>>>>  about what they put into the 'stable' repositories.
>>>> I agree totally. Debian in a server configuration is quite easy to get
>>>> started with, and is rock solid. My first Linux "test server" (my old
>>>> Pentium 133 MHz desktop) way back in 2002 ran Debian Woody. I kept it
>>>> running until it died from old age a couple of years ago. Later I fell
>>>> in love with Gentoo. But if I'd have to run a server with maximum
>>>> stability and uptime, I think that I'd still prefer Debian.
>>> As an alternative viewpoint, I've been running the latest postgres on
>>> Mac OS X Server 10.4, and it's been great for me.  It was my first time
>>> using a server, and my first serious use of postgres (although I have
>>> had a lot of previous unix experience.)  All the power of unix, all the
>>> ease of the Macintosh (and it's server installation gives you lots of
>> Pardon me for being the contrarian, but why does a server need a
>> GUI?  Isn't that just extra RAM & CPU overhead that could be more
>> profitably put to use powering the application?
> 
> A server with a GUI sitting on a login screen is wasting zero
> resources.  Some enterprise management tools are in java which require
> a GUI to use so there is very little downside to installing X, so IMO
> a lightweight window manager is appropriate...a full gnome is maybe
> overkill.  Obviously, you want to turn of the 3d screen saver :-)

X is network-transparent.

Load the few necessary X libraries (Debian's packages are granular
enough to do this), and then use the GUI on your workstation to run
all those foolish GUI-based server (in a room down the hall, across
town or across the country, with compressed X) apps.



Re: [GENERAL] python api to backup/restore data

2007-08-04 Thread Ron Johnson

On 08/04/07 10:58, Tom Lane wrote:
> Acm <[EMAIL PROTECTED]> writes:
>> I would like to know whether an API for accessing Postgresql from
>> Python exists and whether or not it supports commands for backing up a
>> database (or schema) and restoring a backup file.
> 
> Run pg_dump or pg_restore as a subprocess.

But that doesn't give you the capability of getting customized output.



Re: [GENERAL] Moving to postgresql and some ignorant questions

2007-08-14 Thread Ron Johnson

On 08/14/07 10:09, Phoenix Kiula wrote:
> On 14/08/07, Gregory Stark <[EMAIL PROTECTED]> wrote:
>> "Phoenix Kiula" <[EMAIL PROTECTED]> writes:
[snip]
>>>
>>> QUESTION1:
>>> Is this it? Or am I missing something in terms of execution?  We use
>>> Perl on our site but may be gradually switching to PHP at some point.
>>> Will the above plan of execution be ok?
>> A transaction is a bunch of queries which you want to all get committed or
>> aborted together. The expensive step in Postgres is the COMMIT. Postgres does
>> an fsync which forces the data to go to disk (if you're using good hardware,
>> and don't have fsync=off). That takes from 5-15ms depending on how much data
>> and how fast your drives are.
> 
> 
> 
> 
> If I am reading this right, does this mean it is probably better to
> leave fsync as "fsync=off" on production machines? Also, is COMMIT
> automatic for my queries? In some minor testing I did (not scientific
> I did at all) some queries through Perl did not update the database at
> all. I had "fsync=off" in my conf file, and there was no COMMIT etc in
> my SQL, just plain SQL. So I am a bit confused. What's a good starting
> point?
> 
[snip]
> 
> So, again, in the conf file, is this what you recommend:
> 
> fsync=off

I seriously doubt that's what he means.

"fsync=off" *is* faster than "fsync=on", but leaves your data at
risk in case of a hardware crash.

Turning it off during initial data load is quite common, though.
Just remember to turn it back on!!!
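
A sketch of that bulk-load dance in postgresql.conf (comments are mine, not from the thread):

```
# postgresql.conf -- during the initial bulk load ONLY
fsync = off          # skips the per-commit flush to disk; a crash here can corrupt data

# ...run the load, then restore the safe setting and reload the server:
fsync = on
```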



Re: [GENERAL] MVCC cons

2007-08-14 Thread Ron Johnson

On 08/14/07 14:34, Kenneth Downs wrote:
> Tom Lane wrote:
>> Kenneth Downs <[EMAIL PROTECTED]> writes:
>>  
>>> Speaking as an end-user, I can give only one I've ever seen, which is
>>> performance.  Because of MVCC, Postgres's write performance (insert
>>> and update) appears on my systems to be almost exactly linear to row
>>> size.  Inserting 1000 rows into a table with row size 100 characters
>>> takes twice as long as inserting 1000 rows into a table with row size
>>> 50 characters.
>>> 
>>
>> Not sure why you'd think that's specific to MVCC.  It sounds like purely
>> an issue of disk write bandwidth.
>>
>> regards, tom lane
>>   
> 
> I did not see this in MS SQL Server.

It is only logical that it will take 2x as long to insert 2x as much
data.

Maybe SQL Server is compressing out white space?  Or (shudder)
heavily caching writes?



Re: [GENERAL] MVCC cons

2007-08-14 Thread Ron Johnson

On 08/14/07 20:36, Merlin Moncure wrote:
[snip]
> 
> PostgreSQL wins in terms of better concurrency (especially in long
> transactions or transactions that touch a lot of records), cheap
> rollbacks, and all the advantages of a sophisticated locking engine
> (transactional ddl for example).

Although MVCC is not a /sine qua non/ of transactional ddl.



Re: [GENERAL] Transactional DDL

2007-08-14 Thread Ron Johnson

On 08/15/07 00:05, Harpreet Dhaliwal wrote:
> And this feature i.e. transactional DDL is not there in other major
> RDBMS like sql server, oracle etc?

Define "major".  Does it mean "popular" or "used on very large systems"?



Re: [GENERAL] Strange inconsistency with UPDATE

2007-08-16 Thread Ron Johnson

On 08/16/07 21:58, Phoenix Kiula wrote:
> I am trying to force a column to have lowercase because Postgresql is
> case-sensitive in queries. For the time being I've made an expression
> index on lower(KEY). But I would like to have just lower case data and
> then drop this expression index.
> 
> However, I see some inconsisent behavior from Postgresql. When I issue
> an UPDATE command , it shows me a duplicate violation (which could be
> correct) --
> 
> -# update TABLE set ACOLUMN = lower(ACOLUMN);
> ERROR:  duplicate key violates unique constraint "TABLE_ACOLUMN_key"
> 
> So I try to find out the offending values of this ACOLUMN that become
> duplicated when lower(ACOLUMN) is issued:
> 
> -# SELECT lower(ACOLUMN), count(*)  FROM TABLE
>  GROUP BY lower(ACOLUMN) HAVING count(*) > 1 ;
> ---+---
> lower | count
> ---+---
> (0 rows)
> 
> But this doesn't make sense! If there are no columns that get
> repeated, how can it violate the UNIQUE constraint?
> 
> I am not sure if the following helps, but I'm including the EXPLAIN on
> this table. Penny for your thoughts!

Whatever the issue, you can bet your car that it's not a bug in
PostgreSQL, but that you are misunderstanding how PG works.

Write a script that loops thru the records one by one, updating only
one record per loop iteration.  That will find the problem record.
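
Another way to hunt for the colliding pair is a self-join on the lowered
value (a sketch; "mytable"/"acolumn" stand in for the poster's TABLE/ACOLUMN).
If this also returns zero rows, I'd next suspect trailing whitespace or
concurrent changes:

```sql
-- Pairs of rows that differ only in case, listed once per pair
SELECT a.acolumn AS first, b.acolumn AS second
FROM mytable a
JOIN mytable b
  ON lower(a.acolumn) = lower(b.acolumn)
 AND a.acolumn < b.acolumn;

-- Rows carrying invisible leading/trailing whitespace
SELECT acolumn FROM mytable WHERE acolumn <> trim(acolumn);
```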



Re: [GENERAL] Writing most code in Stored Procedures

2007-08-17 Thread Ron Johnson

On 08/17/07 18:00, Steve Manes wrote:
> Guy Rouillier wrote:
>> I have a thread I started ages ago over on the PERFORM list that I'm
>> sadly just now being able to provide some insight on.  I'll be
>> replying on that thread in more detail, but the short of it turns out
>> to be that at least in this one application, using stored procs for
>> inserts is slowing down the app considerably.  The app does high
>> volume inserts and updates, about 16 million rows a day.  By switching
>> from stored procs to inline inserts, elapsed time dropped from 2595
>> seconds to 991 seconds for a test run.
>>
>> So the moral of the story is that, as anyone who has worked
>> professionally for a number of years knows, no magic bullets exist.
>> General guidelines can be helpful, but each scenario must be
>> individually investigated.
> 
> Absolutely.  You can't assume that every application is going to fit
> neatly into the same development jig.  Hope the MVC frameworks zealots
> don't read that. 
> 
> I worked on a fairly high volume site using PG, an ad trafficking
> workflow application, which imported ~2 million placements daily from
> DoubleClick, OAS and Accipiter.  Everything had to be imported and the
> reports run and cached by 8am so the clients stare blankly at 10,000 row
> Excel charts over their morning coffee.
> 
> Moving all the application-bound inserts into stored procedures didn't
> achieve nearly the performance enhancement I'd assumed I'd get, which I
> figured was due to the overhead of the procs themselves.

Would that be because the original app was written in a compiled
language, but the SPs in an interpreted language?



Re: [GENERAL] Blobs in Postgresql

2007-08-18 Thread Ron Johnson

On 08/17/07 23:16, Merlin Moncure wrote:
> On 8/18/07, Ron Olson <[EMAIL PROTECTED]> wrote:
>> The language is Java. I've made some tests and they work very well for 25meg
>> filesworks exactly the way it should, first time. MySQL had all kinds of
>> nasty surprises for me when I first started working with blobs, but I can
>> say that I took my code, changed the driver, and it all works like a champ
>> (mind you, this was a quick test app).
>>
>> I haven't looked at encryption at the database levelis such a thing
>> available? I know Oracle has some form of data encryption at the database
>> level so the nefarious DBA with the wide mustache and black brimmed hat
>> always going "ah ha ha ha ha" can't make off with the data, but does
>> Postgres have something similar?
>>
>> BTW, to put into context, the database will be designed to hold evidence
>> (well, photos and videos of). Thus the compelling need for some security, as
>> well as the variation in file sizes.
> 
> Well, my assumption was that you would encrypt the data on the client
> side and store it that way.
> 
> PostgreSQL has open architecture.  If you wanted to do the encryption
> on the server, one possible approach that jumps out at me is to write
> a small C function which receives the data, encrypts the image using a
> key sent by the client all (but not stored), and either stores the
> encrypted image back in the database via SPI or writes it out to a
> file.
> 
> There are many strategies to encrypting data...first thing to think
> about is where the encryption happens, where the keys are stored, etc.

Client-side encryption is important, because with server-side
encryption you are sending the Valuable Data across the wire (or,
even worse, wirelessly) in cleartext form.

It's more likely that there's a packet sniffer on the network than
an Evil DBA snooping around.



Re: [GENERAL] Writing most code in Stored Procedures

2007-08-18 Thread Ron Johnson

On 08/17/07 21:45, Steve Manes wrote:
> Ron Johnson wrote:
>>> Moving all the application-bound inserts into stored procedures didn't
>>> achieve nearly the performance enhancement I'd assumed I'd get, which I
>>> figured was due to the overhead of the procs themselves.
>>
>> Would that be because the original app was written in a compiled
>> language, but the SPs in an interpreted language?
> 
> No, because the application language was Perl5 for both.  I think it was
> just the overhead of 2 million inserts via procs versus 2 million inline
> inserts (without the proc overhead).

Interesting.  Does PG have to initiate the Perl interpreter every
time you call a Perl-written SP?



Re: [GENERAL] Automating logins for mundane chores

2007-08-18 Thread Ron Johnson

On 08/18/07 06:02, Phoenix Kiula wrote:
[snip]
> 
> Thanks for this. I am logged in as root. Put it there and it works. I

Well, that's your first problem.

And second.  And third.



Re: [GENERAL] Writing most code in Stored Procedures

2007-08-18 Thread Ron Johnson

On 08/18/07 11:08, Joshua D. Drake wrote:
> Josh Tolley wrote:
>> On 8/18/07, Ron Johnson <[EMAIL PROTECTED]> wrote:
>>> Interesting.  Does PG have to initiate the Perl interpreter every
>>> time you call a Perl-written SP?
>> IIRC PostgreSQL should only load the perl interpreter once per session.
> 
> Right.

I'd have been stunned if you did it any other way.

So why is Perl-SP-INSERT so much slower than Perl-SQL-INSERT?

(I can imagine that the SP code path would be longer, but since IO
is the slowest part of the system, I'm surprised that it's *that*
much slower.)



Re: [GENERAL] Automating logins for mundane chores

2007-08-18 Thread Ron Johnson

On 08/18/07 21:10, Phoenix Kiula wrote:
> On 18/08/07, Ron Johnson <[EMAIL PROTECTED]> wrote:
>>
>> On 08/18/07 06:02, Phoenix Kiula wrote:
>> [snip]
>>> Thanks for this. I am logged in as root. Put it there and it works. I
>> Well, that's your first problem.
>>
>> And second.  And third.
> 
> 
> 
> Thanks for the kick in the derierre. Have set it all up to operate as
> user postgres. Or is that insecure too?

Whenever thinking about security, the question to ask yourself is:
am I doing anything which would make it easier for a Bad Guy to gain
access to my data or systems.  Then, do the opposite.

Examples:

Using a powerful account for mundane activities?  Use a mundane
account instead.  (As Joshua pointed out, "postgres" is a powerful
account.)

Sending important data over the wire (or worse, wireless) in clear
text?  Encrypt it.

Vulnerable to SQL injection attacks by sending fully formed SQL
statements across the wire?  Use prepared statements instead.
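
At the SQL level that looks like this (names are illustrative, not from
the thread) -- the statement is parsed once, and user input travels only
as a parameter, so it can never be reinterpreted as SQL:

```sql
PREPARE find_user(text) AS
    SELECT user_id, user_name
    FROM users
    WHERE user_name = $1;

EXECUTE find_user('alice');   -- the literal is data, not SQL text
DEALLOCATE find_user;
```

Most client libraries expose the same mechanism through placeholder
parameters rather than literal PREPARE statements.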



Re: [GENERAL] Postgres, fsync and RAID controller with 100M of internal cache & dedicated battery

2007-08-22 Thread Ron Johnson

On 08/22/07 17:45, Dmitry Koterov wrote:
> Also, the controller is configured to use 75% of its memory for write
> caching and 25% - for read caching. So reads cannot flood writes.

That seems to be a very extreme ratio.  Most databases do *many*
times more reads than writes.



Re: [GENERAL] Geographic High-Availability/Replication

2007-08-23 Thread Ron Johnson

On 08/23/07 17:22, Bill Moran wrote:
> Decibel! <[EMAIL PROTECTED]> wrote:
>> On Aug 22, 2007, at 3:37 PM, Joshua D. Drake wrote:
>>> You can not do multi master cross continent reliably.
>> I'm pretty sure that credit card processors and some other companies  
>> do it... it just costs a LOT to actually do it well.
> 
> Isn't this sort of requirement the entire reason for 2-phase commit?

Entire reason?  Not that I've heard.



Re: [GENERAL] Geographic High-Availability/Replication

2007-08-24 Thread Ron Johnson

On 08/24/07 08:21, Markus Schiltknecht wrote:
> Hi,
> 
> Matthew wrote:
>> Hey all, new postgres user here. We are trying to setup/research an
>> HA/Replicated solution with Postrgresql between a datacenter in LA and a
>> d.c. in NY.
>>
>> We have a private LAN link between the two D.C.'s with a max round-trip
>> of 150ms.
>>
>> We will have a web server at each d.c. (among other servers) that will
>> write/read to/from the local LAN DB. On writes, that data should be
>> xmited to the other data center so that if, for whatever reason, my
>> website request was sent to LA instead of NY, all my session information
>> etc will still exist.
> 
> This is commonly known as synchronous replication. As that involves
> *at-least* one round-trip *before* committing, it's quite expensive. Can
> you live with a delay of ~150ms before COMMIT confirmation?

Which caps each connection at roughly 6-7 serialized transactions per
second (1 s / 0.15 s round trip).  Blech.

> Another issue is the reliability of your failure detectors. How does
> server B know that server A is really down (and not only the link?).
> Normally, that's solved with a quorum device. So that you have to have
> at least three servers - preferably in different locations.



Re: [GENERAL] SQL Diff ?

2007-08-25 Thread Ron Johnson

On 08/25/07 21:51, Kevin Kempter wrote:
> Hi List;
> 
> I have a very large table (52million rows) - I'm creating a copy of it to rid 
> it of 35G worth of dead space, then I'll do a sync, drop the original table 
> and rename table2.

What is your definition of "dead space"?

Bad rows, duplicate rows, old rows?  Something else?

> Once I have the table2 as a copy of table1 what's the best way to select all 
> rows that have been changed or modified in table1 since the initial load from
> table1 into table2?
> 
> Also I'll need to delete any rows in table2 that no longer remain in table1.
> 
> There is no change date column
> I could do something like select * from table1 where col1 || col2 || col3 etc 
> not in (select col1 || col2 || col3 etc from table2)
> 
> but this would be inefficient & slow.
> 
> Anyone have a suggestion to do this in an efficient manner?
> 
> Thanks in advance
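
One set-based alternative to the concatenation trick, assuming the two
tables have identical column lists, is EXCEPT:

```sql
-- Rows in table1 that are new or changed relative to table2:
SELECT * FROM table1
EXCEPT
SELECT * FROM table2;

-- Rows gone from table1 that should be deleted from table2:
SELECT * FROM table2
EXCEPT
SELECT * FROM table1;
```

On 52 million rows both sides will still be scanned and sorted/hashed,
but it avoids the false matches that string concatenation can produce.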



Re: [GENERAL] SQL Diff ?

2007-08-25 Thread Ron Johnson

On 08/25/07 22:21, Kevin Kempter wrote:
> On Saturday 25 August 2007 21:10:19 Ron Johnson wrote:
>> On 08/25/07 21:51, Kevin Kempter wrote:
>>> Hi List;
>>>
>>> I have a very large table (52million rows) - I'm creating a copy of it to
>>> rid it of 35G worth of dead space, then I'll do a sync, drop the original
>>> table and rename table2.
>> What is your definition of "dead space"?
>>
>> Bad rows, duplicate rows, old rows?  Something else?
> 
> deleted rows that should have been cleaned up with vacuum, problem is the 
> client let it go so long that now I cant get a vacuum to finish cause it 
> impacts the day2day operations too much.  Long story, see my recent questions 
> on the performance list for more info.

OK.

> 
>>> Once I have the table2 as a copy of table1 what's the best way to select
>>> all rows that have been changed or modified in table1 since the initial
>>> load from table1 into table2?

Is this a 24x7 database?

>>> Also I'll need to delete any rows in table2 that no longer remain in
>>> table1.
>>>
>>> There is no change date column
>>> I could do something like select * from table1 where col1 || col2 || col3
>>> etc not in (select col1 || col2 || col3 etc from table2)
>>>
>>> but this would be ineffecient & slow.
>>>
>>> Anyone have a suggestion to do this in an efficient manner?
>>>
>>> Thanks in advance



Re: [GENERAL] Install on 32 or 64 bit Linux?

2007-08-28 Thread Ron Johnson

On 08/28/07 16:21, Ralph Smith wrote:
> Hello,
> 
> We're at a crossroads here and it's time to upgrade boxes and versions
> of PG.
> This eMail query is about the first step.
> 
> Are there any recommendations on whether to install onto 32 vs 64 bit
> Linux?
> We're going to be using virtual machines.

Which hypervisor?

> Our application consists mostly of near-sequential inserts into a LARGE
> table, accompanied by updates into much smaller tables.
> I perform the occasional queries on the large table, requiring
> sequential scans.
> 
> Are there any indications whether 32 or 64 bit Linux would be preferable?

You haven't told us the important stuff yet.  Primarily:

   How *big* is the database?

   How many inserts/minute?

   What else will this box be doing besides inserting records?

   How fast are the disks?

   SCSI or SATA/IDE?

   How much RAM?



Re: [GENERAL] Is there a better way to do this?

2007-08-28 Thread Ron Johnson

On 08/28/07 20:06, Tom Lane wrote:
[snip]
> 
> As Michael says, the speed argument is really kinda minor compared
> to the other ones, but it's real enough.

Every little bit counts, though.  For example, if it's part of an
otherwise computationally-intensive operation that you're performing
on 20 million rows, you'll definitely see the difference.



Re: [GENERAL] Reliable and fast money transaction design

2007-08-29 Thread Ron Johnson

On 08/29/07 07:27, cluster wrote:
> OK, thanks. But what with the second question in which the UPDATE is
> based on a SELECT max(...) statement on another table? How can I ensure
> that no other process inserts a row between my SELECT max() and UPDATE -
> making my SELECT max() invalid?
> 
> A table lock could be an option but I am only interested in blocking for
> row insertions for this particular account_id. Insertions for other
> account_ids will not make the SELECT max() invalid and should therefore
> be allowed.

Well, concurrency and transactional consistency *allow* other
processes to update the table after you start your transaction.  You
just won't *see* their updates while you're inside of a transaction.

Of course, if you truly want exclusive access, you could LOCK the
table.  It's well explained in the documentation...
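
A sketch of both options (table and column names here are placeholders,
not from the poster's schema):

```sql
-- Option 1: serialize all inserts against the whole table
BEGIN;
LOCK TABLE transactions IN SHARE ROW EXCLUSIVE MODE;
SELECT max(trans_no) FROM transactions WHERE account_id = 42;
-- ...INSERT/UPDATE based on that max...
COMMIT;

-- Option 2: lock only the one parent account row, so inserts for
-- other account_ids proceed unblocked
BEGIN;
SELECT * FROM accounts WHERE account_id = 42 FOR UPDATE;
SELECT max(trans_no) FROM transactions WHERE account_id = 42;
-- ...INSERT/UPDATE based on that max...
COMMIT;
```

Option 2 matches the poster's wish to block insertions only for the
particular account_id, provided every inserter takes the same row lock
first.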



Re: [GENERAL] Reliable and fast money transaction design

2007-08-29 Thread Ron Johnson

On 08/29/07 09:34, Decibel! wrote:
> On Wed, Aug 29, 2007 at 08:37:26AM -0500, Ron Johnson wrote:
>> -BEGIN PGP SIGNED MESSAGE-
>> Hash: SHA1
>>
>> On 08/29/07 07:27, cluster wrote:
>>> OK, thanks. But what with the second question in which the UPDATE is
>>> based on a SELECT max(...) statement on another table? How can I ensure
>>> that no other process inserts a row between my SELECT max() and UPDATE -
>>> making my SELECT max() invalid?
>>>
>>> A table lock could be an option but I am only interested in blocking for
>>> row insertions for this particular account_id. Insertions for other
>>> account_ids will not make the SELECT max() invalid and should therefore
>>> be allowed.
>> Well, concurrency and transactional consistency *allows* other
>> processes to update the table after you start your transaction.  You
>> just won't *see* their updates while you're inside of a transaction.
> 
> Just make sure and read up about transaction isolation... in the default
> of READ COMMITTED mode, you can sometimes see changes made by other
> transactions.

Argh!!!  The RDBMS that I typically use defaults to SERIALIZABLE.



ACID (was Re: [GENERAL] Reliable and fast ...)

2007-08-29 Thread Ron Johnson

On 08/29/07 10:40, Joshua D. Drake wrote:
> Ron Johnson wrote:
>> On 08/29/07 09:34, Decibel! wrote:
>>> On Wed, Aug 29, 2007 at 08:37:26AM -0500, Ron Johnson wrote:
>>>>
>>>> On 08/29/07 07:27, cluster wrote:
>>>>> OK, thanks. But what with the second question in which the UPDATE is
>>>>> based on a SELECT max(...) statement on another table? How can I ensure
>>>>> that no other process inserts a row between my SELECT max() and UPDATE -
>>>>> making my SELECT max() invalid?
>>>>>
>>>>> A table lock could be an option but I am only interested in blocking for
>>>>> row insertions for this particular account_id. Insertions for other
>>>>> account_ids will not make the SELECT max() invalid and should therefore
>>>>> be allowed.
>>>> Well, concurrency and transactional consistency *allows* other
>>>> processes to update the table after you start your transaction.  You
>>>> just won't *see* their updates while you're inside of a transaction.
>>> Just make sure and read up about transaction isolation... in the default
>>> of READ COMMITTED mode, you can sometimes see changes made by other
>>> transactions.
>> Argh!!!  The RDBMS that I typically use defaults to SERIALIZABLE.
> 
> SERIALIZABLE is really slow :). You should look into SERIALIZABLE only
> for those transactions that need it. There is also SELECT FOR UPDATE.

We use SERIALIZABLE (with all its locking "issues") to guarantee
the I in ACID.  ISTM that READ COMMITTED can only deliver "ACD".



Re: [GENERAL] Reliable and fast money transaction design

2007-08-29 Thread Ron Johnson

On 08/29/07 10:47, Tom Lane wrote:
> Ron Johnson <[EMAIL PROTECTED]> writes:
>> On 08/29/07 07:27, cluster wrote:
>>> Just make sure and read up about transaction isolation... in the default
>>> of READ COMMITTED mode, you can sometimes see changes made by other
>>> transactions.
> 
>> Argh!!!  The RDBMS that I typically use defaults to SERIALIZABLE.
> 
> You can change default_transaction_isolation if you like.

You misunderstand: we do that on purpose, and I had forgotten that
most RDBMSs don't do that.



Re: ACID (was Re: [GENERAL] Reliable and fast ...)

2007-08-29 Thread Ron Johnson

On 08/29/07 11:20, Joshua D. Drake wrote:
> Ron Johnson wrote:
>> On 08/29/07 10:40, Joshua D. Drake wrote:
>>> Ron Johnson wrote:
> 
>>>> Argh!!!  The RDBMS that I typically use defaults to SERIALIZABLE.
>>> SERIALIZABLE is really slow :). You should look into SERIALIZABLE only
>>> for those transactions that need it. There is also SELECT FOR UPDATE.
>> We use SERIALIZABLE (with all its locking "issues") to guarantee
>> the I in ACID.  ISTM that READ COMMITTED can only deliver "ACD".
> 
> You are using serializable for select statements?

READ ONLY, which defaults to SERIALIZABLE.

(It's not PostgreSQL...)



Re: [GENERAL] counting columns

2007-08-29 Thread Ron Johnson

On 08/28/07 10:46, pere roca wrote:
> hi! 
> 
> I want different users to insert data and perform different analysis which
> will be inserted in an interactively created new column. The name of this
> new column should follow a logic name such as: fxanalysis_1 for the first
> user that perform the analysis called fx, fxanalysis_2 for the second... 
> To get it, I think I should construct a function that counts how many
> columns have their name starting with fxanalysis. 
>If count >=1, then alter table adding a new column with name fxanalysis_2
>and so on...
>One of the questions: how to COUNT COLUMNS? I can rename, add, alter
> columns... but count?

You'll need to go into the catalog.

Your idea, though, is poor.  You should make a second table, where
the key is the same as the master table, plus a "sequence number".

If you want to do it your way, though, PostgreSQL supports array types.
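For illustration, here is a sketch of "counting columns" by name prefix. It uses SQLite from Python purely as a runnable stand-in (the table layout and the `fxanalysis` names are taken from the question); in PostgreSQL the "go into the catalog" advice would mean a query against `information_schema.columns` or `pg_attribute` instead of SQLite's `PRAGMA table_info`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE results (id INTEGER PRIMARY KEY,"
    " fxanalysis_1 REAL, fxanalysis_2 REAL, other TEXT)"
)

# Count columns whose name starts with 'fxanalysis'.
# In PostgreSQL the same idea would be a catalog query, roughly:
#   SELECT count(*) FROM information_schema.columns
#   WHERE table_name = 'results' AND column_name LIKE 'fxanalysis%';
cols = [row[1] for row in conn.execute("PRAGMA table_info(results)")]
n = sum(1 for c in cols if c.startswith("fxanalysis"))
print(n)  # 2
```

As the reply notes, though, a child table keyed by (master key, sequence number) avoids needing this kind of DDL-driven counting at all.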



Re: [GENERAL] Export data to MS Excel

2007-09-01 Thread Ron Johnson

On 09/01/07 02:16, Ashish Karalkar wrote:
> Hello All,
> I want to export data from PostgreSQL tables to MS Excel.
> Is there any way?

Extract the data to a CSV (comma or tab) file.

http://www.postgresql.org/docs/
http://www.postgresql.org/docs/8.2/interactive/index.html
http://www.postgresql.org/docs/8.2/interactive/sql-copy.html
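A minimal sketch of the extract-to-CSV step, using Python's csv module against SQLite purely for illustration (the `sales` table and its columns are hypothetical). On the PostgreSQL side the linked COPY documentation covers the server-side equivalent, roughly `COPY sales TO '/path/sales.csv' WITH CSV HEADER` (exact options vary by version):

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", [(1, 9.5), (2, 3.25)])

# Dump a query result as CSV, a format Excel opens directly.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["id", "amount"])  # header row, like COPY ... CSV HEADER
writer.writerows(conn.execute("SELECT id, amount FROM sales"))
print(buf.getvalue())
```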



Re: [GENERAL] Bigtime scaling of Postgresql (cluster and stuff I suppose)

2007-09-01 Thread Ron Johnson

On 09/01/07 08:12, chris smith wrote:
>> Ever read anything on how myspace is laid out?  The big ones need
>> replication to handle the traffic.
> 
> Actually no.
> 
> http://highscalability.com/livejournal-architecture
> 
> "Using MySQL replication only takes you so far." (Yeh it's mysql but
> the point is valid regardless).
> "You can't keep adding read slaves and scale."
> 
> A lot use sharding now to keep scaling (limiting to "X" users/accounts
> per database system and just keep adding more database servers for the
> next "X" accounts).

Hmmm.  Horizontally partitioning your "database" into multiple
physical databases is 10+ years old.  At least.  This is how DEC
implemented the billing database for DirecTV, and how we implemented
*large* toll systems in the US Northeast.

In addition to the account databases, you need a "reference"
database for tables that can't be partitioned by account, be able to
run queries across databases, and middleware that knows how to
direct transactions to the correct database.
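A hypothetical sketch of that middleware routing step (the shard names and the modulo rule are illustrative only; real systems often use key ranges or a lookup table in the "reference" database instead):

```python
# Route each account to one of N shard databases.
SHARDS = ["db0", "db1", "db2", "db3"]


def shard_for(account_id: int) -> str:
    """Pick the shard holding this account's rows (modulo routing)."""
    return SHARDS[account_id % len(SHARDS)]


print(shard_for(12345))  # db1
```

The catch the paragraph above points at: queries that span accounts must fan out across every shard and merge results, which is exactly what the middleware layer exists to hide.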



Re: [GENERAL] Reporting services for PostgreSQL

2007-09-02 Thread Ron Johnson

On 09/01/07 06:12, Andrus wrote:
> I'm looking for a report generator which renders reports in server and sends 
> rendering result to client.

"Render" as in "run the report program on the host"?



Re: [GENERAL] Need suggestion on how best to update 3 million rows

2007-09-06 Thread Ron Johnson

On 09/06/07 01:13, Ow Mun Heng wrote:
> I have a table in PG, Pulled from SQL Server using Perl DBI (w/o using
> chopblanks) and have ended up with a column where the "space" is being
> interpreted as a value.
> 
> eg: 
> 
> "ABC " when it should be "ABC"
> 
> this is being defined  as varchar(4)
> 
> I've already pull the relevent columns with 
> 
> create foo as select unique_id, rtrim(number) from org_column
> 
> I've tried to do the update using
> 
> update org_column set number = foo.number where foo.unique_id =
> org_column=unique_id.

Number?  Where does "number" come from?  Unless you've got weird
field names, that doesn't sound like a very good name for a
VARCHAR(4) column.

> The update is taking a few hours and still hasn't ended.
> 
> I've killed it already and rolled back the changes.
> 
> what's the easiest way to update these fields?

Is it only *some* tuples that have the "extra space" problem?



Re: [GENERAL] Need suggestion on how best to update 3 million rows

2007-09-06 Thread Ron Johnson

On 09/06/07 04:20, Ow Mun Heng wrote:
> On Thu, 2007-09-06 at 04:07 -0500, Ron Johnson wrote:
> 
>> On 09/06/07 01:13, Ow Mun Heng wrote:
> 
>>> update org_column set number = foo.number where foo.unique_id =
>>> org_column=unique_id.
>> Number?  Where does "number" come from?  Unless you've got weird
>> field names, that doesn't sound like a very good name for a
>> VARCHAR(4) column.
> 
> "number" is just a fictitious column name. I use sane column names of
> course :-)

OK.

Maybe there's an English language "issue", or maybe I'm just
excessively picky, but using "number" in this context is confusing.

In any event, it didn't stop the Earth from spinning or the sun from
shining, so it can't be that bad of a problem...

>>> The update is taking a few hours and still hasn't ended.
>>>
> 
>> Is it only *some* tuples that have the "extra space" problem?
> 
> Actually, it's all of it 

Then I agree with Alban:
  update table set number = trim(number);
or, if you need the leading spaces:
  update table set number = rtrim(number)

Then:
  update table set number = NULL where number = '';
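A runnable sketch of that cleanup, using SQLite from Python purely for illustration (`rtrim` and these UPDATE statements behave the same in PostgreSQL; `org_column` and `number` echo the fictitious names from the thread):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE org_column (unique_id INTEGER PRIMARY KEY, number VARCHAR(4))"
)
conn.executemany(
    "INSERT INTO org_column VALUES (?, ?)",
    [(1, "ABC "), (2, "  "), (3, "XY")],
)

# Strip trailing blanks in place, then NULL out rows left empty.
conn.execute("UPDATE org_column SET number = rtrim(number)")
conn.execute("UPDATE org_column SET number = NULL WHERE number = ''")

rows = conn.execute(
    "SELECT unique_id, number FROM org_column ORDER BY unique_id"
).fetchall()
print(rows)  # [(1, 'ABC'), (2, None), (3, 'XY')]
```

On a 3-million-row table this is still one full-table rewrite per UPDATE, so doing it once, correctly, beats repeated partial fixes.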



Re: [GENERAL] a provocative question?

2007-09-06 Thread Ron Johnson

On 09/06/07 10:43, TJ O'Donnell wrote:
> I am getting in the habit of storing much of my day-to-day
> information in postgres, rather than "flat" files.
> I have not had any problems of data corruption or loss,
> but others have warned me against abandoning files.
> I like the benefits of enforced data types, powerful searching,
> data integrity, etc.
> But I worry a bit about the "safety" of my data, residing
> in a big scary database, instead of a simple friendly
> folder-based files system.
> 
> I ran across this quote on Wikipedia at
> http://en.wikipedia.org/wiki/Eudora_%28e-mail_client%29
> 
> "Text files are also much safer than databases, in that should disk
> corruption occur, most of the mail is likely to be unaffected, and any
> that is damaged can usually be recovered."
> 
> How naive (optimistic?) is it to think that "the database" can
> replace "the filesystem"?

Text file are *simple*.  When fsck repairs the disk and creates a
bunch of recovery files, just fire up $EDITOR (or cat, for that
matter) and piece your text files back together.  You may lose a
block of data, but the rest is there, easy to read.

Database files are *complex*.  Pointers and half-vacuumed freespace
and binary fields and indexes and WALs, yadda yadda yadda.  And, by
design, it's all got to be internally consistent.  Any little
corruption and *poof*, you've lost a table.  A strategically placed
corruption and you've lost your database.

But... that's why database vendors create backup/restore commands.

You *do* back up your database(s), right??



Re: [GENERAL] Column as arrays.. more efficient than columns?

2007-09-06 Thread Ron Johnson

On 09/06/07 20:53, Merlin Moncure wrote:
[snip]
> 
> arrays are interesting and have some useful properties.  however, we
> must first discuss the problems...first and foremost if you need to
> read any particular item off the array you must read the entire array
> from disk and you must write all items back to disk for writes.

Reads and writes are done at the page level, so I'm not sure this is
valid.

>also,
> they cause some problems with constraints and other issues that come
> up with de-normalization tactics.

ACK.



Re: [GENERAL] Column as arrays.. more efficient than columns?

2007-09-06 Thread Ron Johnson

On 09/06/07 21:26, Ow Mun Heng wrote:
> On Thu, 2007-09-06 at 20:57 -0500, Michael Glaesemann wrote:
>> On Sep 6, 2007, at 20:46 , Ow Mun Heng wrote:
> 
>>> I would believe performance would be better it being denormalised. (in
>>> this case)
>> I assume you've arrived at the conclusion because you have 
>> (a) shown  
>> that the performance with a normalized schema does not meet your  
>> needs; 
>> (b) benchmarked the normalized schema under production  
>> conditions; 
>> (c) benchmarked the denormalized schema under production  
>> conditions; and 
>> (d) shown that performance is improved in the  
>> denormalized case to arrive at that conclusion. I'm interested to see  
>> the results of your comparisons.
> 
>> Regardless, it sounds like you've already made up your mind. Why ask  
>> for comments?
> 
> You've assumed wrong. I've not arrived at any conclusion but merely
> exploring my options on which way would be the best to thread. I'm
> asking the list because I'm new in PG and after reading all those
> articles on highscalability etc.. majority of them are all using some
> kind of denormalised tables.

Correlation != causation.

There *might* be a causal relationship between high scalability and
table denormalization, but I seriously doubt it.

> Right now, there's 8 million rows of data in this one table, and growing
> at a rapid rate of ~2 million/week. I can significantly reduce this
> number down to 200K (i think by denormalising it) and shrink the table
> size.

Even presuming you only insert data SIX hours per day, that's only
13.3 inserts per second.  Not very impressive.



Re: [GENERAL] Column as arrays.. more efficient than columns?

2007-09-07 Thread Ron Johnson

On 09/07/07 02:49, Ow Mun Heng wrote:
> On Fri, 2007-09-07 at 00:18 -0500, Ron Johnson wrote: 
>> On 09/06/07 21:26, Ow Mun Heng wrote:
>> I've not arrived at any conclusion but merely
>>> exploring my options on which way would be the best to thread. I'm
>>> asking the list because I'm new in PG and after reading all those
>>> articles on highscalability etc.. majority of them are all using some
>>> kind of denormalised tables.
>> Correlation != causation.
>>
>> There *might* be a causal relationship between high scalability and
>> table denormalization, but I seriously doubt it.
> 
> I can't refute you on this since I have no experience in this arena,
> only what I read in highscalbility.com (IIRC)
> 
>>> Right now, there's 8 million rows of data in this one table, and growing
>>> at a rapid rate of ~2 million/week. I can significantly reduce this
>>> number down to 200K (i think by denormalising it) and shrink the table
>>> size.
>> Even presuming you only insert data SIX hours per day, that's only
>> 13.3 inserts per second.  Not very impressive.
> 
> Data is inserted 24 hours a day, but not at the same rate each
> sec/minute. The problem isn't really the data-insertion, it's already
> inserted in a normalised manner. It's the selection of data. (OLTP
> datahouse) which takes a longer time and which is the area of worry.

Datahouse or "data warehouse"?



Re: [GENERAL] Column as arrays.. more efficient than columns?

2007-09-07 Thread Ron Johnson

On 09/07/07 07:49, Merlin Moncure wrote:
> On 9/7/07, Ron Johnson <[EMAIL PROTECTED]> wrote:
>> -BEGIN PGP SIGNED MESSAGE-
>> Hash: SHA1
>>
>> On 09/06/07 20:53, Merlin Moncure wrote:
>> [snip]
>>> arrays are interesting and have some useful properties.  however, we
>>> must first discuss the problems...first and foremost if you need to
>>> read any particular item off the array you must read the entire array
>>> from disk and you must write all items back to disk for writes.
>> Reads and writes are done at the page level, so I'm not sure this is
>> valid.
> 
> sure it is...since the denormalized record is much larger (especially
> in array scenarios), the tuple is much larger meaning the page will
> fill up much more quickly meaning more dead pages, more vacuuming,
> etc.   Besides that, the server has to do some work presenting the
> array as part of the read which is overhead.  I didn't go into a lot
> of detail but the reasoning is sound.  Here is a quick example showing
> the problem.

We agree.

What I meant was that reads and writes are done at the page level no
matter whether the table is normalized or not.  Thus, to say "if you
need to read any particular item off the array you must read the
entire array from disk and you must write all items back to disk for
writes." is... irrelevant.

That's probably an imprecise word, but it's all I can think of at
the moment.

We also agree regarding big records filling pages faster.





Re: [GENERAL] an other provokative question??

2007-09-07 Thread Ron Johnson

On 09/07/07 01:37, Greg Smith wrote:
> On Fri, 7 Sep 2007, Ron Johnson wrote:
> 
>> Definitely a niche product.
> 
> Stonebraker's commentary was unfortunately spun by the ComputerWorld
> columnist.

Tech journalist morphing reality to make a provocative story?

Say it ain't so!

>I hope people followed the link to his actual blog entry at
> http://www.databasecolumn.com/2007/09/one-size-fits-all.html where his
> arguement is that the idea of one database approach always being right
> just isn't true anyway.  With that mindset, every technology is a niche
> product of sorts; just the size of the niche varies.

I read something similar by him a few months ago.  Very interesting.

> Given past history of this project and its relation to Stonebraker, I
> was tempted earlier today to suggest that the Postgres vs. PostgreSQL
> renaming argument be dropped in favor of renaming the database
> "Horizontica".





Re: [GENERAL] an other provokative question??

2007-09-06 Thread Ron Johnson

On 09/06/07 22:54, Tom Lane wrote:
> "Dann Corbit" <[EMAIL PROTECTED]> writes:
>>> Relational database pioneer says technology is obsolete
>>> http://www.computerworld.com/action/article.do?command=3DviewArticleBasic&articleId=3D9034619
> 
>> This bit is a hint:
>> "Column-oriented databases -- such as the one built by Stonebraker's
>> latest start-up, Andover, Mass.-based Vertica Systems Inc. -- store data
>> vertically in table columns rather than in successive rows."
> 
>> Mr. Stonebraker's company sells column oriented databases.  So of course
>> the other methods must be "obsolete".
> 
> I don't see anything in there where Stonebraker says that relational DBs
> are obsolete.  What he suggests is that column-oriented storage might

Does "column-oriented storage" mean that all of the COLUMN_A values
for all 200 million rows are stored together on adjacent pages?

If so, then doing aggregates (the bread and butter of DW) *would*
seem to be faster.  But b-tree leaf that points to "a record" would
need num_cols pointers instead of one pointer.  Very messy.  And large.

Definitely a niche product.

> beat row-oriented storage for a lot of modern applications.  He might be
> right (I'm sure not going to bet against the guy who started Postgres)
> but this has not got anything to do with the concept of a relational
> database.  It's an implementation detail --- maybe a pretty fundamental
> one, but in principle you could build a DB either way and no user could
> see a semantic difference.



Re: [GENERAL] Column as arrays.. more efficient than columns?

2007-09-07 Thread Ron Johnson

On 09/07/07 09:00, Ow Mun Heng wrote:
> 
>>Datahouse or "data warehouse"?
> 
> OLTP data warehouse.

But OLTP & DW are diametrically opposed in how you design,
structure, load and use them.



Re: [GENERAL] a provocative question?

2007-09-06 Thread Ron Johnson

On 09/06/07 20:45, Chris Browne wrote:
> [EMAIL PROTECTED] ("Trevor Talbot") writes:
>> There's also a point in regard to how modifications are made to your
>> data store.  In general, things working with text files don't go to
>> much effort to maintain durability like a real database would.  The
>> most direct way of editing a text file is to make all the changes in
>> memory, then write the whole thing out.  Some editors make backup
>> files, or use a create-delete-rename cycle, but they won't
>> necessarily force the data to disk -- if it's entirely in cache you
>> could end up losing the contents of the file anyway.
> 
> In the case of Eudora, if its filesystem access protocol involves
> writing a new text file, and completing that before unlinking the old
> version, then the risk of "utter destruction" remains fairly low
> specifically because of the nature of access protocol.

mbox is a monolithic file also, and you need to copy/delete,
copy/delete, yadda yadda yadda.  Just to do anything, you need 2x as
much free disk space as your biggest mbox file.  What a PITA.

mh and Maildir are, as has been partially mentioned, much more
efficient in that regard.

(Yes... mbox is an excellent transport format.)



Re: [GENERAL] Scalability Design Questions

2007-09-08 Thread Ron Johnson

On 09/08/07 19:58, novnov wrote:
> You're right, that's not how I understood your partitioning question. I'd not
> be eager to implement that kind of split in the db, though I can see that
> it's an example of how to spread the load around.
> 
> But basically, it seems that the answer to one of my questions is that there
> is currently no way with postgres to spread a single database over multiple
> servers, ala a loadbalanced apache cluster, where requests are forwarded to
> different boxes.
> 
> It's not going to be an issue for me soon, maybe ever, I just wanted to see
> what the score is.

Scott is correct both in (a) that the money otherwise spent on RAC
can buy a big heap of extra oomph, and (b) "middleware partitioning"
method of scaling systems works very well.

Any way you go, though, requires proper design and coding from the
very beginning.

And it never hurts to think outside the box: sometimes you can get
the results you want by doing something that seems crazy.  For
example, we had a couple of tables that were specifically designed
for OLTP.  A specific weekly report that joined the 2 tables just
would *never* finish.  Instead of fighting to create some
hairy-looking SQL-statement-from-Hell, I created WHERE-less views on the
two tables with only the columns that were needed.  Now we have a
job that unloads the views, truncates then loads them into a
*separate* database (so that backups & CREATE INDEX don't clash),
indexes them perfectly for this query, and then runs the query
against this separate database.
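A toy sketch of that unload/reload pattern, with SQLite's ATTACH standing in for the separate reporting database (all object names are hypothetical; the real job would use COPY/pg_dump between two PostgreSQL databases):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, cust INTEGER, amount REAL, note TEXT);
INSERT INTO orders VALUES (1, 10, 5.0, 'x'), (2, 11, 7.5, 'y');
-- A WHERE-less view exposing only the columns the report needs.
CREATE VIEW orders_rpt AS SELECT cust, amount FROM orders;
-- Stand-in for the separate reporting database.
ATTACH ':memory:' AS report;
CREATE TABLE report.orders (cust INTEGER, amount REAL);
""")

# Nightly job: truncate and reload the reporting copy, then index it
# exactly the way the report's query wants.
conn.execute("DELETE FROM report.orders")
conn.execute("INSERT INTO report.orders SELECT * FROM orders_rpt")
conn.execute("CREATE INDEX IF NOT EXISTS report.idx_cust ON orders(cust)")

total = conn.execute("SELECT sum(amount) FROM report.orders").fetchone()[0]
print(total)  # 12.5
```

The point of the separate database, as the message says, is that heavy report indexing and long scans stop colliding with OLTP backups and locks.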



Re: [GENERAL] Time Zone design issues

2007-09-09 Thread Ron Johnson

On 09/09/07 22:29, novnov wrote:
[snip]
> 
> But it doesn't particularly make sense to set the server's clock to UTC.
> It'd seem to skew a lot of server functionality which I'd think should
> normally be geared around local time. So I'd guess that the route to take is
> to keep the server pegged to local time, and use a function that derives UTC
> from now(), tags the 'last modified' fields with that value.

Your single-user Windows mindset is shining brightly.

Unix servers have had their internal clocks set to UTC for a decade
or more, and there have been no noticeable ill effects, since apps
all know to adjust for TZ.



Re: [GENERAL] arrays of foreign keys

2007-09-10 Thread Ron Johnson

On 09/07/07 18:47, Max wrote:
> Hello,
> 
> And pardon me if I posted this question to the wrong list, it seems
> this list is the most appropriate.
> 
> I am trying to create a table with an array containing foreign keys.
> I've searched through the documentation and couldn't find a way to do
> so.
> 
> Is this something that one can do?
> 
> Basically, I have two tables:
> 
> create table user (
>   user_id serial,
>   login varchar(50) primary key,
>   tags integer[]-- this is where the problem lies
> );
> 
> create table tag (
>   tag_id serial,
>   name varchar(50) primary key
> );
> 
> I would like the user.tags field to be a list of foreign keys (tag_ids
> specifically).
> 
> I know I can solve my problem with the following table:
> 
> create table user_tag (
>   user_id integer,
>   tag_id integer,
>   foreign key (user_id) references table user(user_id) on delete
> cascade,
>   foreign key (tag_id) references table tag(tag_id) on delete cascade,
>   primary key (user_id,tag_id)
> );
> 
> But I would really like to avoid doing that.

Why?

The (literally) cardinal rule of database normalization is
"eliminate repeating values".

> Is there a solution to this problem with arrays of foreign keys, and
> if so, how does one do that?
> 
> Thanks for any help.



Re: [GENERAL] Time Zone design issues

2007-09-10 Thread Ron Johnson

On 09/10/07 10:55, Scott Marlowe wrote:
> On 9/10/07, novnov <[EMAIL PROTECTED]> wrote:
>> Your explanation of now() and localtime() is good...but localtime() to
>> postgres acting as a web app db, wouldn't it just return the local time as
>> far as the server is concerned?
> 
> No, it would return it as the local time of the POSTGRESQL client.
> 
> For instance, suppose that someone in NY, connects to a web app, which
> asks him what TZ he's in and he picks EST5EDT.  The web app sets
> timezone='EST5EDT' and inserts a time of '2007-07-11 12:30:00'.
> 
> The database now stores that as '2007-07-11 16:30:00+00'
> 
> No matter WHERE the postgresql server is, that's what it has in it.
> 
> Now, I connect from Chicago, and your web app ascertains my timezone
> as CST6CDT.  When I ask for the same timestamp, I get '2007-07-11
> 11:30:00-05'.
> 
> Now, if you don't set a timezone, then the database will use its own.
>  If your postgresql server is in california, then it might have a
> timezone of PST8PDT set.  If your web app didn't tell it otherwise,
> then the time would be '2007-07-11 09:30:00-07'.
> 
> Note that there are IP lookup tools you can use to determine, or at
> least make a wild guess at, someone's timezone.  But you still need to
> let them pick one if you get it wrong.  Then you can store that
> timezone in a user profile and set it every time the user uses your web
> app.

In both Windows *and* Unix "you" set your TZ when you install the
system.  There are system functions to inquire how you've set it.

Browsers already report back a mountain of client data to the web
server.  I'd be stunned if FF, IE, Opera, Konq, etc don't already
expose TZ, too.
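Scott's walk-through above can be checked directly in psql (values taken from his example; the temp table is only for illustration):

```sql
SET timezone = 'EST5EDT';
CREATE TEMP TABLE ts_demo (ts timestamptz);
INSERT INTO ts_demo VALUES ('2007-07-11 12:30:00');

-- A client in Chicago sees the same instant in its own zone:
SET timezone = 'CST6CDT';
SELECT ts FROM ts_demo;   -- 2007-07-11 11:30:00-05

-- The stored value itself is UTC:
SET timezone = 'UTC';
SELECT ts FROM ts_demo;   -- 2007-07-11 16:30:00+00
```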



Re: [GENERAL] Time Zone design issues

2007-09-10 Thread Ron Johnson

On 09/10/07 15:21, Alvaro Herrera wrote:
[snip]
> 
> I wouldn't trust the browser's TZ, and you would need a way to
> override it.

Why?



Re: [GENERAL] Time Zone design issues

2007-09-10 Thread Ron Johnson

On 09/10/07 19:50, Tom Lane wrote:
> Ron Johnson <[EMAIL PROTECTED]> writes:
>> On 09/10/07 15:21, Alvaro Herrera wrote:
>>> I wouldn't trust the browser's TZ, and you would need a way to
>>> override it.
> 
>> Why?
> 
> The browser may not know the setting, or may not tell it to you,
> or you might not be able to make any sense of what it says
> (timezone names are hardly standardized).

Well that's true.  Except for numeric offsets.

> This whole sub-thread actually is predicated on an assumption not
> in evidence, which is that there is any browser anywhere that will
> tell the http server timezone information.  I'm quite sure no such
> thing is required by the http standard.

I'm really surprised.



Re: [GENERAL] Hardware recommendation: which is best

2007-09-11 Thread Ron Johnson

On 09/11/07 07:55, Phoenix Kiula wrote:
> On 11/09/2007, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
>> It depends what you want to do with your database.
>>
>> Do you have many reads (select) or a lot of writes (update,insert) ?
> 
> 
> This one will be a hugely INSERT thing, very low on UPDATEs.  The
> INSERTs will have many TEXT fields, as they are free-form data, so the
> database will grow very fast.

15,000 rows/day × 365 days = 5,475,000 rows/year.

How big are these rows?  *That* is the crucial question.

>> You should use a hardware raid controller with battery backup write cache
>> (write cache should be greater than 256 MB).
> 
> 
> I'll have a raid controller in both scenarios, but which RAID should
> be better: RAID1 or RAID10?

The striping aspects of RAID10 makes sequential reads and writes and
large writes much faster.

The more spindles you have, the faster it is.

If you are *really* concerned about speed, 4 x 147GB 10K SCSI disks
in a RAID10 would be a good place to start.

>> How much memory do you have ?
> 
> 
> 4GB to begin with..
> 
> 
>> How big is your database, tables ... ?
> 
> 
> Huge, as the two main tables will each have about ten TEXT columns
> each. They will have about 15000 new entries every day, which is quite
> a load, so I believe we will have to partition it at least by month
> but even so it will grow at a huge pace.

15,000 inserts in an 8-hour window is 31.25 inserts/minute, or ~2 seconds per insert.

If the records are 30MB each, then that could cause some stress on
the system in that 8 hour window.

If they are 3MB each, not a chance.

> While we are at it, would postgres be any different in performance
> across a single-CPU Quad Core Xeon with a dual CPU dual-core AMD
> Opteron? Or should the hard disk and RAM be the major considerations
> as usually proposed?

Opteron is the standard answer.

What is your backup/recovery strategy?



Re: [GENERAL] Hardware recommendation: which is best

2007-09-11 Thread Ron Johnson

On 09/11/07 11:26, Phoenix Kiula wrote:
> Thanks Greg.
> 
> 
>> You're not going to get a particularly useful answer here without giving
>> some specifics about the two disk controllers you're comparing, how much
>> cache they have, and whether they include a battery backup.
>>
[snip]
> 
> 
> Would appreciate any tips. From these two, Scenario 1 looks marginally
> better to me. I am requesting further information about cache and
> battery powered backup, but would appreciate first-off thoughts based
> on above info.

How (on average) large are the records you need to insert, and how
evenly spread across the 24 hour day do the inserts occur?



Re: [GENERAL] Hardware recommendation: which is best

2007-09-11 Thread Ron Johnson

On 09/11/07 12:02, Phoenix Kiula wrote:
> On 12/09/2007, Ron Johnson <[EMAIL PROTECTED]> wrote:
>> How (on average) large are the records you need to insert, and how
>> evenly spread across the 24 hour day do the inserts occur?
> 
> 
> There will be around 15,000 inserts in a day. Each insert will have
> several TEXT columns, so it is difficult to predict, but about 30,000
> to 100,000 characters in each row. And yes, the inserts will be very
> consistently timed every day.

15,000 rows/day * ~100,000 characters/row ≈ 1.5 GB/day.

1.5 GB / (24*60 minutes) ≈ 1 MiB per minute.

Any computer that can't sustain writing 1 megabyte per minute
should have been retired in 1970.

So.

Unless there's something that you aren't telling us, this should be
handleable by a Wal-Mart Special with an extra-large disk in it.



Re: [GENERAL] Database/Table Design for Global Country Statistics

2007-09-12 Thread Ron Johnson

On 09/12/07 03:28, Stefan Schwarzer wrote:
[snip]
> 
> The new design would be like this:
> 
> id   |year|value
> ---
>  1   |   1970| NULL
>  1   |   1971|  36
>  1   
>  1   |   2005|  45
>  2   |   1970|  
>  2   .
> 
> 
> Would that be considered as "good table design" then?

What Richard says, plus:

1. The PK of New_Design should be country_id/year.

2. You also should not store the records where "value is NULL".  A
left outer join would handle that.
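A minimal sketch of both points (all table and column names here are assumed for illustration; Stefan's actual schema may differ):

```sql
-- (1) Composite primary key on country_id/year;
-- (2) no rows stored for missing values, hence value NOT NULL.
CREATE TABLE country_stats (
    country_id integer NOT NULL REFERENCES countries (id),
    year       integer NOT NULL,
    value      numeric NOT NULL,
    PRIMARY KEY (country_id, year)
);

-- Reproduce the "NULL for missing years" view at query time with a
-- left outer join against a generated list of years:
SELECT c.id, y.year, s.value
FROM countries c
CROSS JOIN generate_series(1970, 2005) AS y (year)
LEFT JOIN country_stats s
       ON s.country_id = c.id
      AND s.year       = y.year;
```

Years with no stored row come back with `value` as NULL, so the application sees the same shape as the sparse design without the storage waste.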



Re: [GENERAL] Tablespaces on tertiary media

2007-09-14 Thread Ron Johnson

On 09/14/07 04:06, Mark Morgan Lloyd wrote:
> Where does PostgreSQL stand with storing /really/ large amounts of data
> offline? Specifically, if a FUSE is used to move a tablespace to
> something like a tape archiver can the planner be warned that access
> might take an extended period?
> 
> I know that at one point (v6?) there were hooks in the code for
> experimental Berkeley code to do this sort of thing but as far as I know
> there has never been anything publicly available.

While tertiary media certainly was relevant 10 years ago, is it
really necessary in 2007?  A couple of MSA-1000s stuffed with 1TB
disks would hold a *lot* of historical data.


