[GENERAL] pg_restore load data
Hi, v9.2.7 (Yes, I know, it's old. Nothing I can do about it.) During a "whole database" restore using pg_restore of a custom dump, when is the data actually loaded? I've looked in the list output and don't see any "load" statements. Thanks -- World Peace Through Nuclear Pacification -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] pg_restore load data
On 11/16/2017 03:13 PM, bricklen wrote: On Thu, Nov 16, 2017 at 1:07 PM, Ron Johnson <mailto:ron.l.john...@cox.net>> wrote: v9.2.7 (Yes, I know, it's old. Nothing I can do about it.) During a "whole database" restore using pg_restore of a custom dump, when is the data actually loaded? I've looked in the list output and don't see any "load" statements. Look for COPY lines, that's how the data is restored. $ pg_restore -l CDSHA01.dump > CDSHA01.txt $ grep --color -i copy CDSHA01.txt $ echo $? 1 There are lots of "restoring data", though. I should have thought to grep for that. One thing that puzzles me is how fast the tables (even large ones) loaded compared to how slow the pg_dump -Fc was. Granted, I'm running -j4 but still, these were some really large, poorly compressible tables (the dump file was about as big as du -mc data/base). -- World Peace Through Nuclear Pacification
Re: [GENERAL] mild modification to pg_dump
On 11/17/2017 02:23 PM, John R Pierce wrote: On 11/17/2017 12:19 PM, marcelo wrote: Sorry, I was not exact. I neither need nor want to change pg_dump. Rather, based on the pg_dump code, I need to develop a daemon which can receive a TCP message (from a privileged app) containing some elements: the database to dump, the user under which to do it, and his password. (My apps are using that same data, of course, encrypted to the common users.) I would just fork pg_dump to do the actual dump rather than try and incorporate its source code into your app. Specifically, do you mean to write a simple daemon which forks pg_dump at the appropriate time? -- World Peace Through Nuclear Pacification
Re: [GENERAL] mild modification to pg_dump
What about the pgpass file? https://www.postgresql.org/docs/9.2/static/libpq-pgpass.html On 11/17/2017 03:06 PM, marcelo wrote: I need to "emulate" the pg_dump code because of the password prompt. Years ago I wrote a program (for the QNX environment) that caught the prompt and emulated standard input. I don't want to do that again. On 17/11/17 17:23, John R Pierce wrote: On 11/17/2017 12:19 PM, marcelo wrote: Sorry, I was not exact. I neither need nor want to change pg_dump. Rather, based on the pg_dump code, I need to develop a daemon which can receive a TCP message (from a privileged app) containing some elements: the database to dump, the user under which to do it, and his password. (My apps are using that same data, of course, encrypted to the common users.) I would just fork pg_dump to do the actual dump rather than try and incorporate its source code into your app. -- World Peace Through Nuclear Pacification
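Since the goal is a daemon that forks pg_dump rather than re-implementing it, one way to avoid the password prompt entirely is a throwaway pgpass file pointed to by PGPASSFILE. A minimal Python sketch of that idea (the helper name `build_pg_dump_command` and the exact argument layout are my own assumptions, not anything from the thread):

```python
import os
import stat
import tempfile

def build_pg_dump_command(dbname, user, password, out_path):
    """Hypothetical helper: write a throwaway pgpass file and return the
    argv and environment needed to fork pg_dump without a password prompt."""
    # libpq reads the file named by PGPASSFILE instead of ~/.pgpass.
    fd, pgpass = tempfile.mkstemp()
    with os.fdopen(fd, "w") as f:
        # pgpass format: hostname:port:database:username:password
        f.write(f"*:*:{dbname}:{user}:{password}\n")
    # mkstemp already creates the file mode 0600, but libpq insists on it,
    # so set it explicitly.
    os.chmod(pgpass, stat.S_IRUSR | stat.S_IWUSR)
    env = dict(os.environ, PGPASSFILE=pgpass)
    argv = ["pg_dump", "-Fc", "-U", user, "-f", out_path, dbname]
    return argv, env

argv, env = build_pg_dump_command("sales", "app_user", "s3cret", "/tmp/sales.dump")
# subprocess.run(argv, env=env, check=True)  # the actual fork/exec step
```

The daemon would unlink the pgpass file as soon as the child has started; the password never appears on the command line or in the daemon's own prompt handling.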
Re: [GENERAL] Will a DELETE violate an FK?
On 05/29/07 09:48, Robert James wrote: I'd like to be able to detect if a record has associations. I don't want to actually delete it, just know if it could be deleted. (This is to build an intelligent menu on a GUI) Are you wanting to know this in a generic way or for a specific database? P.S. - Please don't top-post. On 5/29/07, *Albe Laurenz* <[EMAIL PROTECTED] <mailto:[EMAIL PROTECTED]>> wrote: > Is there anyway to know if a DELETE will violate an FK > without actually trying it? I don't know what you mean by 'without trying it', but does the following answer your question? CREATE TABLE a (id integer PRIMARY KEY); CREATE TABLE b (id integer PRIMARY KEY, a_id integer NOT NULL CONSTRAINT b_fkey REFERENCES a(id)); INSERT INTO a (id) VALUES (1); INSERT INTO b (id, a_id) VALUES (42, 1); DELETE FROM a WHERE id=1; ERROR: update or delete on table "a" violates foreign key constraint "b_fkey" on table "b" DETAIL: Key (id)=(1) is still referenced from table "b". Yours, Laurenz Albe -- Ron Johnson, Jr. Jefferson LA USA Give a man a fish, and he eats for a day. Hit him with a fish, and he goes away for good! ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
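For the GUI case, one approach is simply to ask each referencing table whether any rows point at the candidate key, without attempting the DELETE at all. A sketch using Python's sqlite3 as a lightweight stand-in for the two-table example above (table and column names follow Laurenz's example; `is_deletable` is an illustrative helper, not a built-in):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE a (id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE b (id INTEGER PRIMARY KEY,"
             " a_id INTEGER NOT NULL REFERENCES a(id))")
conn.execute("INSERT INTO a (id) VALUES (1), (2)")
conn.execute("INSERT INTO b (id, a_id) VALUES (42, 1)")

def is_deletable(conn, a_id):
    # A row in "a" can be deleted only if no row in "b" references it.
    row = conn.execute(
        "SELECT EXISTS (SELECT 1 FROM b WHERE a_id = ?)", (a_id,)).fetchone()
    return row[0] == 0

print(is_deletable(conn, 1))  # False: b(42) still references a(1)
print(is_deletable(conn, 2))  # True: nothing references a(2)
```

Note this only answers the question at one instant: under concurrent writers the answer can be stale by the time the user clicks, so the real DELETE still needs error handling for a foreign-key violation.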
Re: [GENERAL] hundreds of schema vs hundreds of databases
On 05/29/07 04:02, Albe Laurenz wrote: I have an application with some hundreds users, each one having the same data definitions, and each one storing up to 2 GB of data. A user have just access to his own data. His data will have its own tablespace. Therefore, it seems to me I have a choice between "one database per user" and "one schema per user in the same database". What is the best practice here ? Which solution will be the easiest to manage ? Advantages of many databases: - Each database is smaller. - No danger of one user accessing another user's data (because of misconfigured permissions and similar). - Guaranteed independence of each user's data. - More scalable: If you decide that one machine or one cluster You could always dump a schema then drop it and restore it in a new database. At 2GB, that should be quick. is not enough to handle the load, you can easily transfer some of the databases somewhere else. > Advantages of one database with many schemata: - Fewer databases to administrate. But since they all have to have the same schema, you'd still have the same DDL overhead whether it's one DB or many. Does PG set up buffers at the postmaster level or the database level? If at the database level, then you'll be allocating memory to databases that might not be in use at any one time, thus wasting it. One database buffer pool would make more efficient use of RAM. -- Ron Johnson, Jr. Jefferson LA USA Give a man a fish, and he eats for a day. Hit him with a fish, and he goes away for good! ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] Rookie Questions: Storing the results of calculations vs. not?
On 05/24/07 15:20, cjl wrote: PG: Sorry it's taken so long for anyone to answer you, but it appears that some emails were hung up for a while. I am playing around with some historical stock option data, and have decided to use a database to make my life easier. The data is "end-of- day" for all equitable options, so something like 17 columns and approximately 200,000 rows a day. I have several months of data in "csv" format, one file for each trading day. After some simple experiments, I found it was easier to import this data directly into postgresql than into mysql, because of the expiration date format being un-friendly to mysql. I'm using the COPY command to load the data. I realize I have a lot of reading to do, but I would like to ask a few questions to help guide my reading. 1) The data contains the price of the underlying stock, the strike price of the option, and the option premium. From this I can calculate the "cost basis" and the "maximum potential profit", which are elements I would like to be able to SELECT and ORDER. Should I store the results of these calculation with the data, or is this "business logic" which doesn't belong in the database. Is this what views are for? I'd say "business logic", and yes, views are good for that. 2) For each underlying stock there are lots of options, each having unique strike prices and expirations. For example, AAPL (apple computer) have stock options (calls and puts) that expire in June, at various strike prices. Lets say that apple stock is trading at $112. I would like to be able to select the options with strikes just above and below this price, for example $110 and $115. The data contains options with strikes from $60 through $125, every $5. Is this something I need to do programatically, or can I create a complex SQL query to extract this information? I'd have a table with one row per option. 
Then make this kind of query: (SELECT * FROM T_OPTION WHERE TICKER = 'AAPL' AND EXPIRE_DT BETWEEN '2007-06-01' AND '2007-06-30' AND PRICE <= 112.0 ORDER BY PRICE DESC LIMIT 1) UNION (SELECT * FROM T_OPTION WHERE TICKER = 'AAPL' AND EXPIRE_DT BETWEEN '2007-06-01' AND '2007-06-30' AND PRICE >= 112.0 ORDER BY PRICE ASC LIMIT 1); I have rudimentary python skills, and I'm getting the hang of psycopg2. After reading the postgresql manual, what should I read next? -- Ron Johnson, Jr. Jefferson LA USA Give a man a fish, and he eats for a day. Hit him with a fish, and he goes away for good! ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
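The bracketing trick (nearest strike at or below the price, UNIONed with nearest strike at or above) can be sanity-checked outside Postgres. Here is a sketch using Python's sqlite3, with T_OPTION reduced to the columns that matter; since SQLite won't accept ORDER BY/LIMIT directly inside parenthesized UNION branches, each branch is wrapped in a FROM subquery:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_option (ticker TEXT, expire_dt TEXT, price REAL)")
# AAPL strikes every $5 from $60 through $125, as in the original post.
conn.executemany("INSERT INTO t_option VALUES (?, ?, ?)",
                 [("AAPL", "2007-06-16", float(p)) for p in range(60, 130, 5)])

rows = conn.execute("""
    SELECT * FROM (SELECT price FROM t_option
                   WHERE ticker = 'AAPL' AND price <= 112.0
                   ORDER BY price DESC LIMIT 1)
    UNION
    SELECT * FROM (SELECT price FROM t_option
                   WHERE ticker = 'AAPL' AND price >= 112.0
                   ORDER BY price ASC LIMIT 1)
""").fetchall()
print(sorted(r[0] for r in rows))  # strikes bracketing 112: [110.0, 115.0]
```

In Postgres the parenthesized form with per-branch ORDER BY/LIMIT works directly, so the FROM-subquery wrapping is only an SQLite accommodation.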
Re: [GENERAL] hundreds of schema vs hundreds of databases
On 05/30/07 01:38, Albe Laurenz wrote: Ron Johnson wrote: Does PG set up buffers at the postmaster level or the database level? If at the database level, then you'll be allocating memory to databases that might not be in use at any one time, thus wasting it. One database buffer pool would make more efficient use of RAM. Shared memory is allocated at the cluster level. See http://www.postgresql.org/docs/current/static/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-MEMORY I read that page, but don't see any references to "cluster level". Maybe I am misinterpreting "cluster"? -- Ron Johnson, Jr. Jefferson LA USA Give a man a fish, and he eats for a day. Hit him with a fish, and he goes away for good! ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] Geographic data sources, queries and questions
On 05/29/07 17:46, Michael Glaesemann wrote: On May 29, 2007, at 15:28 , John D. Burger wrote: Even ISO country codes are not guaranteed to be stable I'm not sure where the idea that primary keys must be stable comes from. There's nothing necessarily wrong with updating a primary key. All a primary key does is uniquely identify a row in a table. If that id changes over time, that's fine, as long as the primary key columns continue to uniquely identify each row in the table. And any archived data (for example, transaction detail that you must keep for 7 years but don't still want in your database, since it doubles your backup/restore times) will still have the old codes. "Static" data needs to be static. SQL even provides ON UPDATE CASCADE to make this convenient. There may be performance arguments against updating a primary key (as the changes need to propagate), but that depends on the needs of a particular (benchmarked and tested) application environment. -- Ron Johnson, Jr. Jefferson LA USA Give a man a fish, and he eats for a day. Hit him with a fish, and he goes away for good! ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [GENERAL] problem calling psql multiple times from a script ?
On 05/30/07 10:23, Tom Lane wrote: Paul Tilles <[EMAIL PROTECTED]> writes: I am wondering if one of the previous executions of psql is doing some "back room" work in the database while allowing the script to continue. It takes finite time for the backend to quit, and psql doesn't wait around for that to happen. I've noticed that on many systems it seems that the kernel scheduler actively discriminates against an exiting backend --- maybe it thinks it's a low-priority background process? The amount of work needed to start a new psql and a new backend vastly exceeds what it takes to quit (unless you've got lots of temp tables to drop, or some such), and yet people report cases like this pretty often. We could fix it by making PQfinish() wait for the connection to drop, but that cure could be worse than the disease; most apps would just see this as useless delay. In the meantime, a sleep(1) or some such before trying to drop a recently-used database ought to be a usable workaround. The proprietary DB that I use at work implemented something similar as a command-line qualifier, and I guarantee you that it's been tremendously useful. -- Ron Johnson, Jr. Jefferson LA USA Give a man a fish, and he eats for a day. Hit him with a fish, and he goes away for good! ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] Geographic data sources, queries and questions
On 05/30/07 11:01, John D. Burger wrote: Even ISO country codes are not guaranteed to be stable I'm not sure where the idea that primary keys must be stable comes from. There's nothing necessarily wrong with updating a primary key. All a primary key does is uniquely identify a row in a table. If that id changes over time, that's fine, as long as the primary key columns continue to uniquely identify each row in the table. And any archived data (for example, transaction detail that you must keep for 7 years but don't still want in your database, since it doubles your backup/restore times) will still have the old codes. "Static" data needs to be static. Yes, and then there is the question of what such a recycled code actually =means= as a foreign key. For example, CS used to be the code for Czechoslovakia, then it was for Serbia and Montenegro, now it is in "transition" before being deleted. Czechoslovakia no longer has a code, since it no longer exists, as far as ISO is concerned. What do you want to do with your biography database for 19th century Slavic poets, which indicates that some people were born in Czechoslovakia? Did those people move (briefly) to Serbia and Montenegro? Or did their birthplace change to NULL? If you want to give them a code, you have to find out what part of Czechoslovakia they actually lived in, and what country that region's now in. Do you really want some external agency forcing you to muck with your data like this? In situations like this (a toll road schedule) we add beginning and expiring timestamps, and the expiring timestamp is part of the natural PK. Anyway, regardless of one's feelings along these lines, I thought many might be implicitly assuming that all of these standards guarantee such stability, and I wanted to disabuse folks of that. It's very interesting and useful to know. -- Ron Johnson, Jr. Jefferson LA USA Give a man a fish, and he eats for a day. Hit him with a fish, and he goes away for good! 
---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[GENERAL] Continuous PITR (was Re: multimaster)
On 06/01/07 11:16, Andrew Sullivan wrote: [snip] My real question in all this is, "What is the problem you are trying to solve?" Hot failover using combinations of hardware and software, and a disk array that can be mounted across two machines, is actually probably good enough for most cases, assuming it is implemented correctly (see recent discussion on this topic). So the availability piece is mostly solved. What else do you want? What you need are disk blocks to be mirrored to a machine at the DR site. Or "continuous PITR" to the DR machine. How difficult would it be to modify the process (the postmaster?) that writes the xlogs(?) to tee them to a listening process across the cloud on the DR machine, which then applies them to the DR database? This then begs the question: are CREATE|ALTER TABLESPACE commands stored in the xlogs? -- Ron Johnson, Jr. Jefferson LA USA Give a man a fish, and he eats for a day. Hit him with a fish, and he goes away for good! ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] why postgresql over other RDBMS
On 06/01/07 11:22, Bruce Momjian wrote: PFC wrote: On Thu, 31 May 2007 22:20:09 +0200, Vivek Khera <[EMAIL PROTECTED]> wrote: On May 25, 2007, at 5:28 PM, Tom Lane wrote: That's true at the level of DDL operations, but AFAIK we could parallelize table-loading and index-creation steps pretty effectively --- and that's where all the time goes. I would be happy with parallel builds of the indexes of a given table. That way you have just one scan of the whole table to build all its indexes. Will the synchronized seq scan patch be able to do this by issuing all the CREATE INDEX commands at the same time from several different database connections ? No, but it could someday. Or would a "CREATE MANY INDEXES" (where in one statement you specify all the indexes on a single table) command be easier to implement? This way also the process reads the table once, building separate sortwork files on-the-fly. Too bad child processes can't inherit transaction state. -- Ron Johnson, Jr. Jefferson LA USA Give a man a fish, and he eats for a day. Hit him with a fish, and he goes away for good! ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
Re: [GENERAL] Continuous PITR (was Re: multimaster)
On 06/01/07 16:25, Andrew Sullivan wrote: On Fri, Jun 01, 2007 at 03:58:01PM -0500, Ron Johnson wrote: What you need are disk blocks to be mirrored to a machine at the DR site. Or "continuous PITR" to the DR machine. I thought you could already do this? (I'm not, but I was pretty sure someone reported doing it already.) From within PG, or "manually"? This then begs the question: are CREATE|ALTER TABLESPACE commands stored in the xlogs? (I'll spare the rant about begging the question.) Since they're transactional, they must be, no? I wasn't sure whether they were a special case or not. -- Ron Johnson, Jr. Jefferson LA USA Give a man a fish, and he eats for a day. Hit him with a fish, and he goes away for good! ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] why postgresql over other RDBMS
On 06/01/07 16:38, PFC wrote: Will the synchronized seq scan patch be able to do this by issuing all the CREATE INDEX commands at the same time from several different database connections ? No, but it could someday. Actually I tested, it does it right now, albeit unconsciously (pg doesn't do anything to synchronize the scans, but if you launch the concurrent connections at the same time and issue all your "create index" at the same time, only 1 table scan is needed). Maybe if the tables were bigger, it would lose sync between the 3 concurrent scans and would end up going slower. That's why I spoke about the "synchronized scan" patch. How much of this, though, is from the OS's disk cache? Or are Seq Scans O_DIRECT and bypass the OS cache? Or would a "CREATE MANY INDEXES" (where in one statement you specify all the indexes on a single table) command be easier to implement? You can get the effect right now by using concurrent connections it seems. Not very practical in a psql script, though... -- Ron Johnson, Jr. Jefferson LA USA Give a man a fish, and he eats for a day. Hit him with a fish, and he goes away for good! ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
One last Slony question (was Re: [GENERAL] Slightly OT.)
On 06/01/07 17:31, Andrew Sullivan wrote: On Sat, Jun 02, 2007 at 12:23:44AM +0200, Alexander Staubo wrote: Could you not (I ask naively) detect the first DDL statement is submitted in a transaction Maybe. on the master, then start a transaction on each slave, then funnel this and all subsequent statements synchronously to every nodes, then prepare and commit everyone? You could if 2PC was ubiquitous, which is certainly wasn't when the code was designed (remember, it was originally compatible all the way back to 7.3). Some people suggested using 2PC "if it's there", but that just seems to me to be asking for really painful problems. It also entails that all DDL has to happen on every node at the same time, which imposes a bottleneck not actually currently in the system. Since DDL is infrequent, is that bottleneck an acceptable trade-off? It is probably the case, however, that version 2 of the system will break some of these backwards compatibility attempts in order to depend on some new back end features -- putting this entirely in user space turns out to be awful. It's how we got the monstrous catalog corruption hack. This is getting pretty Slony specific, though, so if we're to continue this thread, I suggest we do it on the Slony list. -- Ron Johnson, Jr. Jefferson LA USA Give a man a fish, and he eats for a day. Hit him with a fish, and he goes away for good! ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: One last Slony question (was Re: [GENERAL] Slightly OT.)
On 06/01/07 18:35, Joshua D. Drake wrote: Ron Johnson wrote: On 06/01/07 17:31, Andrew Sullivan wrote: On Sat, Jun 02, 2007 at 12:23:44AM +0200, Alexander Staubo wrote: Could you not (I ask naively) detect the first DDL statement is submitted in a transaction Maybe. on the master, then start a transaction on each slave, then funnel this and all subsequent statements synchronously to every nodes, then prepare and commit everyone? You could if 2PC was ubiquitous, which is certainly wasn't when the code was designed (remember, it was originally compatible all the way back to 7.3). Some people suggested using 2PC "if it's there", but that just seems to me to be asking for really painful problems. It also entails that all DDL has to happen on every node at the same time, which imposes a bottleneck not actually currently in the system. Since DDL is infrequent, is that bottleneck an acceptable trade-off? Define infrequent? I have customers that do it, everyday in prod. They do it willingly and refuse to change that habit. Even 2 or 3 ALTER TABLE or CREATE INDEX or CREATE TABLE statements per day is a drop in the bucket compared to the number of I/U/D statements, no? -- Ron Johnson, Jr. Jefferson LA USA Give a man a fish, and he eats for a day. Hit him with a fish, and he goes away for good! ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: One last Slony question (was Re: [GENERAL] Slightly OT.)
On 06/01/07 19:17, Joshua D. Drake wrote: Ron Johnson wrote: On 06/01/07 18:35, Joshua D. Drake wrote: Since DDL is infrequent, is that bottleneck an acceptable trade-off? Define infrequent? I have customers that do it, everyday in prod. They do it willingly and refuse to change that habit. Even 2 or 3 ALTER TABLE or CREATE INDEX or CREATE TABLE statements per day is a drop in the bucket compared to the number of I/U/D statements, no? True. So Alexander Staubo's idea of synchronous DDL replication via 2PC has some merit? -- Ron Johnson, Jr. Jefferson LA USA Give a man a fish, and he eats for a day. Hit him with a fish, and he goes away for good! ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Partitioning (was Re: [GENERAL] Slightly OT.)
On 06/01/07 19:29, Jeff Davis wrote: [snip] You shouldn't use a volatile function in a check constraint. Use a trigger instead, but even that is unlikely to work for enforcing constraints correctly. In general, for partitioning, you have to make some sacrifices. It's very challenging (and/or expensive) to ensure uniqueness across partitions. Are partitioned databases the same as federated databases? -- Ron Johnson, Jr. Jefferson LA USA Give a man a fish, and he eats for a day. Hit him with a fish, and he goes away for good! ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] Transactional DDL
You were politely asked not to top-post. On 06/02/07 11:46, Harpreet Dhaliwal wrote: So, while writing any technical document, would it be wrong to mention stored procedures in postgresql? what is the general convention? Did I miss something? What does "stored procedures" have to do with "Transactional DDL"? On 6/2/07, Dawid Kuroczko <[EMAIL PROTECTED]> wrote: On 6/2/07, Jasbinder Singh Bali <[EMAIL PROTECTED]> wrote: > But its said that transactions in any RDBMS follow ACID properties. > So if i put a create table and an Insert statement in the same begin end > block as one single transactioin, won't both create and insert follow acid > property, being in one single trasaction, and either both get committed or > none, talking about oracle lets say Actually, Oracle inserts implicit COMMIT after each DDL. So, if you have: BEGIN; INSERT INTO foo (bar) VALUES (1); CREATE INDEX foo_bar ON foo (bar); -- Here Oracle will insert implicit COMMIT, thus your foo table will have value 1 commited. -- And here Oracle will BEGIN a new trasaction. INSERT INTO foo (bar) VALUES (2); ROLLBACK; -- And you will rollback the insert of value 2. Value 1 remains in the table, -- because it is already committed. Regards, Dawid -- Ron Johnson, Jr. Jefferson LA USA Give a man a fish, and he eats for a day. Hit him with a fish, and he goes away for good! ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] Transactional DDL
On 06/02/07 13:35, Jasbinder Singh Bali wrote: On 6/2/07, Michael Glaesemann <[EMAIL PROTECTED]> wrote: On Jun 2, 2007, at 11:08 , Harpreet Dhaliwal wrote: > Whats so novel about postgresql here? > This would happen in any RDBMS. right? > You induced divide by zero exception that crashed the whole > transaction and it did not create the table bar? [Please don't top-post. It makes the discussion hard to follow.] I used the divide by zero to raise an error to show that both the CREATE TABLE and the INSERT were rolled back when the transaction failed. If there's another definition of transactional DDL, I'd like to know what it is. Michael Glaesemann grzm seespotcode net This is what happens in every RDBMS. Whats so special about postgres then? But it's NOT what happens in every RDBMS. Oracle implicitly executes a COMMIT after every DDL statement. -- Ron Johnson, Jr. Jefferson LA USA Give a man a fish, and he eats for a day. Hit him with a fish, and he goes away for good! ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
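The difference is easy to demonstrate in any engine with transactional DDL. SQLite happens to roll DDL back too, so a small Python sketch can stand in for the Postgres behavior being described (in Oracle, by contrast, the CREATE TABLE would have forced an implicit commit and the table would survive the ROLLBACK):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.isolation_level = None  # autocommit mode; manage transactions explicitly

conn.execute("BEGIN")
conn.execute("CREATE TABLE bar (id INTEGER)")
conn.execute("INSERT INTO bar VALUES (1)")
conn.execute("ROLLBACK")

# The CREATE TABLE itself was rolled back, not just the INSERT.
tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE name = 'bar'").fetchall()
print(tables)  # []
```

That "DDL and DML undone together, or committed together" property is exactly what the thread means by transactional DDL.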
Re: [GENERAL] general features
On 06/03/07 09:08, Badawy, Mohamed wrote: [snip] Thanks for all of the replies,,, What I mean by clustering is to put the database on many machines. Single database on many machines? Do you mean federation/horizontal scaling, or DR replication, or something different? PostgreSQL has master-slave replication using Slony-1. If you mean simultaneous access to the same disks from multiple machines without corrupting the data, you'll have to go to a proprietary system. 10K (15K if you've got the scratch) RPM RAID10, 8GB RAM and a couple of dual-core Opterons is a *really* fast database box. You might not need more than that. But if you *do* need the continuous uptime that shared-disk clustering and rolling in-place upgrades gives you, then Rdb/VMS can't be beat. It'll *cost*, though. -- Ron Johnson, Jr. Jefferson LA USA Give a man a fish, and he eats for a day. Hit him with a fish, and he goes away for good! ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] Multiple customers sharing one database?
On 06/03/07 12:45, Lew wrote: [snip] I am always confused by the overloaded term "database" in such discussions. Do we mean the RDBMS engine, i.e., run separate instances of PG? I sure would recommend against that. Or do we mean it as the PG documentation does, e.g., <http://www.postgresql.org/docs/8.1/interactive/ddl-schemas.html> A (relational) database is a database, and an RDBMS is what manipulates that (relational) database. "The" schema "defines" the database. "A" schema defines a specific logical sub-set of "the" schema. -- Ron Johnson, Jr. Jefferson LA USA Give a man a fish, and he eats for a day. Hit him with a fish, and he goes away for good! ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] Database design wisdom needed
engine for different kinds of tables help -- engines that were better for INSERT and SELECT, while others that were good for UPDATE? We ran MySQL until a year ago but we have switched to Pg since we had data corruption issues. I am sure I'm missing some tricks in the Pg world, and would truly appreciate any ideas. If you are reading until here, well done! And so many thanks in advance for any insight you can shed into this matter. What index(es) is/are on GAME and GAME_COUNTS? What version of PG are you running? Are you pushing the box too hard? -- Ron Johnson, Jr. Jefferson LA USA Give a man a fish, and he eats for a day. Hit him with a fish, and he goes away for good! ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] Encrypted column
On 06/04/07 17:54, Guy Rouillier wrote: Ranieri Mazili wrote: Hello, I need to store users and passwords on a table and I want to store it encrypted, but I don't found documentation about it, how can I create a table with columns "user" and "password" with column "password" encrypted and how can I check if "user" and "password" are correct using a sql query ? Many people consider two-way encryption to be insecure; two-way encryption means you can decrypt a value if you know the key, and it is insecure because you usually have to put the key into the source code. That means at least one person in your company, the programmer maintaining the source code, can learn all of your users' passwords. Two-way encryption is needed for companies that store customer credit cards. But yes, I've always worried about that. One way around that is to hash the value instead. Then to validate, at runtime you hash the user-entered password using the same hash function, and validate that it matches the stored hash. No one in your company ever knows end-user passwords. -- Ron Johnson, Jr. Jefferson LA USA Give a man a fish, and he eats for a day. Hit him with a fish, and he goes away for good! ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
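The hash-instead-of-encrypt scheme Guy describes can be sketched in a few lines of Python with only the standard library (the function names and the PBKDF2 parameters here are illustrative choices, not anything mandated by the thread):

```python
import hashlib
import hmac
import os

def hash_password(password, salt=None):
    # Store (salt, digest) in the table; never the password itself.
    salt = salt if salt is not None else os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest

def check_password(password, salt, stored_digest):
    # Re-hash the attempt with the stored salt and compare.
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, stored_digest)  # constant-time compare

salt, stored = hash_password("s3cret")
print(check_password("s3cret", salt, stored))  # True
print(check_password("wrong", salt, stored))   # False
```

A per-user random salt keeps identical passwords from producing identical stored values. Inside Postgres itself, the pgcrypto extension's crypt() and gen_salt() functions serve the same purpose without leaving SQL.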
Re: [GENERAL] Encrypted column
On 06/05/07 08:59, Alvaro Herrera wrote: Ron Johnson wrote: On 06/04/07 17:54, Guy Rouillier wrote: Many people consider two-way encryption to be insecure; two-way encryption means you can decrypt a value if you know the key, and it is insecure because you usually have to put the key into the source code. That means at least one person in your company, the programmer maintaining the source code, can learn all of your users' passwords. Two-way encryption is needed for companies that store customer credit cards. I thought that the advice for companies storing customer CCs was: don't. Sometimes you "must". An example from my industry: transponder "toll tags" and toll roads. The customer pre-pays a certain amount based on expected usage, and every time he drives thru a plaza, his balance decreases. Once it drops to a certain threshold, more money needs to be added to the account. If he is a CASH/CHEK customer, a light at the lane flashes yellow and (depending on the Agency) a message pops up saying, "Balance low", so he drives over to the customer service center, stands in line and pays his cash. If he is a CC customer, the system (which I am DBA of) bills his card directly, saving the customer much time and effort. -- Ron Johnson, Jr. Jefferson LA USA Give a man a fish, and he eats for a day. Hit him with a fish, and he goes away for good! ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] pointer to feature comparisons, please
On 06/13/07 15:02, Kevin Hunter wrote: [snip] 'To enforce this rule without integrity constraints, you can use a trigger to query the department table and test that each new employee's department is valid. But this method is less reliable than the integrity constraint. SELECT in Oracle Database uses "consistent read", so the query might miss uncommitted changes from other transactions.' Isn't it *supposed* to miss UNcommitted changes from other transactions? -- Ron Johnson, Jr. Jefferson LA USA Give a man a fish, and he eats for a day. Hit him with a fish, and he goes away for good! ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] pointer to feature comparisons, please
On 06/13/07 16:59, PFC wrote: Isn't it *supposed* to miss UNcommitted changes from other transactions? Well, if the "uncommitted change" is a DELETE of the row that allowed the constraint check to pass, then when this delete is committed, your data is no longer consistent. The DELETE should block, no?
Re: [GENERAL] pointer to feature comparisons, please
On 06/13/07 17:23, PFC wrote: On Thu, 14 Jun 2007 00:09:20 +0200, Ron Johnson <[EMAIL PROTECTED]> wrote: On 06/13/07 16:59, PFC wrote: Isn't it *supposed* to miss UNcommitted changes from other transactions? Well, if the "uncommitted change" is a DELETE of the row that allowed the constraint check to pass, then when this delete is committed, your data is no longer consistent. The DELETE should block, no? Why? Foreign keys put an ON DELETE trigger on the referenced table Foreign keys that silently, automatically DELETE records? Did I read that correctly? besides checking the referencing column on insert/update... If you just implement a constraint, you only get half the functionality. But when I define a FK *constraint*, that's all I *want*!
Re: [GENERAL] pointer to feature comparisons, please
On 06/14/07 02:24, PFC wrote: The DELETE should block, no? Why? Foreign keys put an ON DELETE trigger on the referenced table Foreign keys that silently, automatically DELETE records? Did I read that correctly? Isn't that the point of ON DELETE CASCADE? Where'd that come from? Did I miss something in the thread? besides checking the referencing column on insert/update... If you just implement a constraint, you only get half the functionality. But when I define a FK *constraint*, that's all I *want*! When you add a REFERENCES foo(bar) foreign key and don't specify an ON DELETE clause, I believe you get ON DELETE NO ACTION, which means the referenced row can't be deleted if there are referencing rows... so when you have a FK constraint, it actually constrains two tables, and pg uses share locking to ensure race conditions can't cause an inconsistent database.
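The default NO ACTION behavior discussed above is easy to see in a psql session. This is just a sketch against a scratch database; the tables and columns are invented for illustration:

```sql
-- Hypothetical parent/child tables; the FK has no ON DELETE clause,
-- so the default ON DELETE NO ACTION applies.
CREATE TABLE foo (bar integer PRIMARY KEY);
CREATE TABLE baz (
    id  integer PRIMARY KEY,
    bar integer REFERENCES foo (bar)
);

INSERT INTO foo VALUES (1);
INSERT INTO baz VALUES (10, 1);

-- This DELETE fails with a foreign-key violation, because a
-- referencing row still exists in baz -- nothing is cascaded.
DELETE FROM foo WHERE bar = 1;
```

Only with an explicit ON DELETE CASCADE on the constraint would the referencing rows in baz be deleted along with the parent.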
Re: [GENERAL] PHP sucks!! - was: persistent db connections in PHP
On 06/16/07 11:24, PFC wrote: [snip] It's a matter of mindset. PHP and Postgres have really opposite mindsets. Python is a lot more similar to Postgres, for instance : - Postgres, Python : strongly typed, throws an error rather than doing funny stuff with your data, your code does what you think it should do. I wouldn't call Python *strongly* typed, but I do know what you mean. I think.
Re: [GENERAL] What O/S or hardware feature would be useful for databases?
On 06/16/07 10:47, Lincoln Yeoh wrote: Hi, I've been wondering, what O/S or hardware feature would be useful for databases? If Postgresql developers could get the CPU and O/S makers to do things that would make certain things easier/faster (and in the long term) what would they be? By long term I mean it's not something that's only useful for a few years. Not something "gimmicky". For example - something like virtual memory definitely made many things easier. Hardware support for virtualization also makes stuff like vmware easier and better. What's the purpose of a multi-processing OS if you're just going to run a bunch of single-task VMs? Seems CPU makers currently have more transistors than they know what to do with, so they're adding cores and doing a lot of boring stuff like SSE2, SSE3, SSE4, etc. So is there anything else useful that they (and the O/S ppl) can do that they aren't doing already? Reducing memory latency always helps. That's AMD's strong point and now Intel is doing it. They've both got more cache. While I can't see the big use in PCs with quad-cores, multi-core can't help but benefit database servers. AMD, Intel & IBM are always profiling code, to find bottlenecks in their microarchitectures. POWER6 can run at 4GHz and is multi-core. Anyway... databases are always(?) IO bound. I'd try to figure out how to make a bigger hose (or more hoses) between the spindles and the mobo. The Alpha 8400 had multiple PCI *buses*, so as not to have a 133MBps chokepoint. A server with multiple PCI-e buses, 10Gb Ethernet, and lots of 4Gb HBAs attached to a big, fat SAN chock full of 15K SCSI disks could suck up a heck of a lot of data. Better support for distributed locking (across cluster nodes etc)? OK that's old stuff, but the last I checked HP was burying VMS and Tandem. AMD's HyperTransport could probably be used similar to Memory Channel. However, nowadays, gigabit Ethernet is the CI of choice, meaning that it's all done in software. 
Hardware acceleration for quickly counting the number of set/unset/matching bits? x86 doesn't already do that?
Re: [GENERAL] PHP sucks!! - was: persistent db connections in PHP
On 06/16/07 15:04, Andrej Ricnik-Bay wrote: On 6/17/07, John Smith <[EMAIL PROTECTED]> wrote: guys, love both tools but php @ 2.5 *billion* google results is far more popular than postgresql @ 25 million google results. *if* somebody's gotta adapt it's not php. php does what it does best in a way that stuffy academics don't get. > That's like saying that BSD or Linux should be more like Windows because there's more Windows than Linux stuff to be found on the web You've not used KDE lately, have you? :) And while *ix might not get viruses, it's got its share of worms and rootkits. The following is a bit out of proportion, but you may get the idea: "If the majority of the population had cancer it would still be an illness and not the norm."
Re: [GENERAL] PHP sucks!! - was: persistent db connections in PHP
On 06/16/07 15:34, John Smith wrote: On 6/16/07, Uwe C. Schroeder <[EMAIL PROTECTED]> wrote: Mhhh - what does PHP have to do with Postgresql? Lots of pages just end in .php, which is why the google results are so high - guess what, the tool "html" hits 3.2 billion :-) show me a database that doesn't respect html and i'll show you one i don't want :). Respect html? WTH does that mean?
Re: [GENERAL] What O/S or hardware feature would be useful for databases?
On 06/16/07 17:05, Alexander Staubo wrote: On 6/16/07, Ron Johnson <[EMAIL PROTECTED]> wrote: > Hardware acceleration for quickly counting the number of > set/unset/matching bits? x86 doesn't already do that? I don't think so. The fastest way, I believe, is to use precomputed lookup tables. Same for finding the least/most significant set/unset bit, and other operations useful for dealing with bit vectors. A couple of new AMD Barcelona opcodes might help do that: http://www.anandtech.com/showdoc.aspx?i=2939&p=6 While on the topic of instructions, AMD also introduced a few new extensions to its ISA with Barcelona. There are two new bit manipulation instructions: LZCNT and POPCNT. Leading Zero Count (LZCNT) counts the number of leading zeros in an op, while Pop Count counts the leading 1s in an op. Both of these instructions are targeted at cryptography applications.
Re: [GENERAL] What O/S or hardware feature would be useful for databases?
On 06/17/07 00:19, Greg Smith wrote: On Sat, 16 Jun 2007, Ron Johnson wrote: Anyway... databases are always(?) IO bound. I'd try to figure out how to make a bigger hose (or more hoses) between the spindles and the mobo. What I keep waiting for is the drives with flash memory built-in to mature. I would love to get reliable writes that use the drive's cache for instant fsyncs, instead of right now where you have to push all that to the controller level. But drive-based flash memory will always be a fixed size, and only for that drive. Controller-based cache is expandable and caches the whole RAID set (besides being battery-backed). And if you *still* need more cache, rip out that controller and put in a more expensive one, or transition to "plain" SCSI cards and a storage controller.
Re: [GENERAL] What O/S or hardware feature would be useful for databases?
On 06/18/07 08:05, Merlin Moncure wrote: [snip] That being said, it's pretty clear to me we are in the last days of the disk drive. Oh, puhleeze. Seagate, Hitachi, Fuji and WD aren't sitting around with their thumbs up their arses. In 3-4 years, large companies and spooky TLAs will be stuffing SANs with hundreds of 2TB drives. My (young) kids will be out of college before the density/dollar of RAM gets anywhere near that of disks. If it ever does. What we are in, though, is the last decade of tape. When solid state drives become prevalent in server environments, database development will enter a new era...physical considerations will play less and less of a role in how systems are engineered. "Oh, puhleeze" redux. There will always be physical considerations. Even if static RAM drives *do* overtake spindles, you'll still need to engineer them properly. Why? 1) There's always a bottleneck. 2) There's always more data to "find" the bottleneck. So, to answer the OP, my answer would be to 'get rid of the spinning disk!' :-)
Re: [GENERAL] Experiences of PostgreSQL on-disk bitmap index patch
On 06/25/07 09:58, Tom Lane wrote: [snip] The fly in the ointment is that if the column value is so high cardinality as all that, it's questionable whether you want an index search at all rather than just seqscanning; and it's definite that the index access cost will be only a fraction of the heap access cost. So the prospects for actual net performance gain are a lot less than the index-size argument makes them look. Well they definitely are for data warehouses, in which many high-cardinality columns each have an index. Because of their small disk size, ANDing them is fast and winnows down the result set. That's the theory, of course.
Re: [GENERAL] What O/S or hardware feature would be useful for databases?
On 07/03/07 13:03, Merlin Moncure wrote: On 7/2/07, Ron Johnson <[EMAIL PROTECTED]> wrote: On 06/18/07 08:05, Merlin Moncure wrote: [snip] > > That being said, it's pretty clear to me we are in the last days of > the disk drive. Oh, puhleeze. Seagate, Hitachi, Fuji and WD aren't sitting around with their thumbs up their arses. In 3-4 years, large companies and spooky TLAs will be stuffing SANs with hundreds of 2TB drives. haven't we had this debate before? I don't know if you've been paying attention to what's going on in the storage industry...Apple, Dell, Fuji, Sandisk, Intel, and others are all making strategic plays in the flash market. At the outset of 2007, flash was predicted to decline 50% for the year...so far, prices have dropped 65% in the first two quarters. Right now it's all about the high end notebooks and media players but the high margin, high rotation speed drives are next. Technological nay-sayers have been wrong before, but I just can't see a *database* server full of static RAM in the next 10 years. I admit the high density low speed cold storage d2d backup systems will be the last to fall and will be quite some ways off. note by, 'next', and 'last days', i mean that pretty loosely...within the next 5 years or so. 'dead' as well...there are many stages of death to an enterprise legacy product. I consider tape backups to be nearly dead already, although there are many still in use. d2d is where it's at though. Mainframers (and various other oldsters like me) think about 1) shock resistance, 2) media costs, 3) Iron Mountain, 4) media longevity. You can drop a SuperDLT tape from "man height" and recover the data (even if it has to be restrung into a new housing). I wouldn't drop a disk full of data and have any expectation of survival. A 160GB ("320"GB compressed) SATA drive is about $60 plus a $10 carrier. That compares very well to tapes, I think. An Iron Mountain delivery truck will drive over some nasty bumps.
How shock resistant is a disk drive in an external carrier? Not as resistant as a drive in a padded shipping box. But is it resistant "enough"? "Enterprise-level" tapes can sit in storage for 7-15 years and then still be readable. Can a disk drive sit unused for 7 years? Would the motor freeze up? Will we still be able to connect SATA drives in 7 years?
Re: [GENERAL] What O/S or hardware feature would be useful for databases?
On 07/04/07 16:00, Andrej Ricnik-Bay wrote: On 7/4/07, Ron Johnson <[EMAIL PROTECTED]> wrote: "Enterprise-level" tapes can sit in storage for 7-15 years and then still be readable. Can a disk drive sit unused for 7 years? Would the motor freeze up? Will we still be able to connect SATA drives in 7 years? I was a bit harsh about connecting to SATA drives. IDE has been around for 21 years and ATA-133 is backwards compatible with 20MB drives of that era, so I predict that you'll be able to plug SATA-1 drives into machines with SATA-9 interfaces. But then, the motor might still not spin up... :( Same with a tape-drive, no? I've seen so many standard changes in drives and SCSI connectors ... if you don't keep spares of all the equipment involved you'll face the same issue with tapes that you'd face with SATA disks. No. Enterprise tape drives are not "flavor of the month", and can always read the previous one or two generations of tape. And if you've switched from, for example, SuperDLT to LTO, then you'll still be able to buy some drives on the used market (either eBay or from a dealer).
Re: [GENERAL] Direct I/O
On 07/08/07 22:07, lai yoke hman wrote: Hello, Is there any direct I/O instead of buffered I/O for postgresql? Thank you. Why do you want raw IO? Performance would tank.
Re: [GENERAL] how does a temp table work?
On 07/11/07 12:06, Andrew Sullivan wrote: On Wed, Jul 11, 2007 at 10:28:25PM +0530, Joshua N Pritikin wrote: and foo is executed from different transactions in parallel then each transaction will see its own private temp table, right? They will see their own private temp tables _no matter what_. There's no "shared temp table" support in Postgres. Will GLOBAL TEMP tables be added any time soon? I know the docs say that the standard implies needing modules, but "shared definition" is really handy. We regularly use such GLOBAL TEMP tables without modules.
Re: [GENERAL] One Large Table or Multiple DBs?
On 07/09/07 16:18, Mike wrote: Hi, What is an efficient approach toward storing a web application's user data? How do applications such as Basecamp, SalesForce or QuickBooks Online store their data? Is it in one DB with huge tables, each record having a user account's foreign key, or do they create a separate database for each of their accounts? Which one is more efficient? My guess was one large DB with large tables. How big is "big"? What is "efficient"? Speed, management, upgrades, backups, scalability? If each customer has 80GB of data, then separate databases are the way to go, since it eases scalability and allows for parallel backups. But then upgrades must be applied to each of thousands of databases. If each customer has 100MB of data, then unified tables keyed off of account number would be simpler. In between is schema-per-account. But upgrades are still a chore.
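The two shared-database layouts mentioned above can be sketched like this; every table, schema, and column name here is invented purely for illustration:

```sql
-- Unified tables: one set of tables, every row keyed by account.
CREATE TABLE invoices (
    account_id integer NOT NULL,
    invoice_id integer NOT NULL,
    total      numeric(12,2),
    PRIMARY KEY (account_id, invoice_id)
);

-- The in-between option: one schema per customer account, with the
-- same table definitions repeated in each schema.
CREATE SCHEMA acct_1001;
CREATE TABLE acct_1001.invoices (
    invoice_id integer PRIMARY KEY,
    total      numeric(12,2)
);
```

The unified layout gives you one set of DDL to upgrade; schema-per-account isolates customers (and eases per-customer dumps) at the price of running every upgrade once per schema.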
Re: [GENERAL] createing indexes on large tables and int8
On 07/17/07 17:12, [EMAIL PROTECTED] wrote: > On Tuesday 17 July 2007 17:47:01 Tom Lane wrote: >> [EMAIL PROTECTED] writes: >>> i think i got it fixed as i saw that i pushed my maintenance_work_mem too >>> high. It was higher than physical ram :-( >> Ooops, that will definitely cause problems. > > yes it did! I ran it again. And now it takes 10 minutes per index instead of > 10 hours (still 8.1). maybe something postgres should complain about if > setting maintenance_work_mem too high. Unless it does some really OS-specific calls, *can* PostgreSQL know how much *physical* RAM is in a box?
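For reference, the setting in question can be raised for just the index build and then put back, keeping it comfortably below physical RAM. This is only a sketch; the value and the table/index names are examples, not recommendations:

```sql
-- Per-session setting; keep it well under the machine's physical RAM.
SET maintenance_work_mem = '512MB';

-- The index build (big_table and col are hypothetical names):
CREATE INDEX big_table_col_idx ON big_table (col);

-- Return to the value from postgresql.conf:
RESET maintenance_work_mem;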
Re: [GENERAL] Delete/update with limit
On 07/23/07 10:56, Csaba Nagy wrote: > Hi all, > > This subject was touched a few times in the past, I looked into the > archives... the result is invariably key developers saying such a > feature is unsafe because the result is unpredictable, while the people > requesting it are saying it is OK that way, it is expected... but no > compelling use case for it. > [snip] > > Now I don't put too much hope I can convince anybody that the limit on > the delete/update commands has valid usage scenarios, but then can > anybody help me find a good solution to chunk-wise process such a buffer > table where insert speed is the highest priority (thus no indexes, the > minimum of fields), and batch processing should still work fine with big > table size, while not impacting at all the inserts, and finish in short > time to avoid long running transactions? Cause I can't really think of > one... other than our scheme with the delete with limit + trigger + > private temp table thing. Maybe add OIDs to the table, and delete based on the OID number?
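A sketch of that OID idea against a hypothetical buffer table (it assumes the table was created WITH OIDS, which the 8.x series supports; all names are invented):

```sql
BEGIN;

-- Grab one chunk of work, remembering each row's OID:
CREATE TEMP TABLE chunk AS
    SELECT oid AS row_oid, * FROM buffer LIMIT 1000;

-- ... batch-process the rows in chunk here ...

-- Delete exactly the rows that were processed, nothing more:
DELETE FROM buffer WHERE oid IN (SELECT row_oid FROM chunk);

DROP TABLE chunk;
COMMIT;
```

This keeps each transaction short and leaves the insert path (no extra indexes, no long-held locks) untouched, which was the stated priority.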
Re: [GENERAL] Will partial index creation use existing index?
On 07/24/07 14:48, Gregory Stark wrote: [snip] > > The main use case for it is actually REINDEX. Since you already have an index > which contains precisely the records you want to index and already in order > too. > > The main disadvantage is that it's not clear when it would actually be faster. > Generally index scans are slower than reading the whole table and sorting. > Probably it would have to run an SPI query to use the planner to find the best > way to get the rows it wants. I believe you, but it's totally counter to prima facie logic. Scanning the whole table means that you have to read in a whole bunch of columns that you don't really give a rat's arse about, and thus is a waste, whereas directly reading an existing index means that you've got perfect locality of data, since you're only reading what you care about. > Another problem is that presumably you're reindexing because the existing > index *isn't* in such good shape. You may even be doing it because the > existing index is corrupt. That, of course, is an excellent point.
Re: [GENERAL] Linux distro
On 08/01/07 10:37, Owen Hartnett wrote: > At 4:52 PM +0200 8/1/07, Leif B. Kristensen wrote: >> On Wednesday 1. August 2007 16:15, Madison Kelly wrote: >> >>> /Personally/, I love Debian on servers. >>> >>> It's not quite as 'hardcore' as Gentoo (a great distro, but not one to >>> start with!). It's the foundation of many of the popular distros >>> (Ubuntu, Mepis, Knoppix, etc) and the Debian crew is very careful >>> about what they put into the 'stable' repositories. >> >> I agree totally. Debian in a server configuration is quite easy to get >> started with, and is rock solid. My first Linux "test server" (my old >> Pentium 133 MHz desktop) way back in 2002 ran Debian Woody. I kept it >> running until it died from old age a couple of years ago. Later I fell >> in love with Gentoo. But if I'd have to run a server with maximum >> stability and uptime, I think that I'd still prefer Debian. > > As an alternative viewpoint, I've been running the latest postgres on > Mac OS X Server 10.4, and it's been great for me. It was my first time > using a server, and my first serious use of postgres (although I have > had a lot of previous unix experience.) All the power of unix, all the > ease of the Macintosh (and its server installation gives you lots of Pardon me for being the contrarian, but why does a server need a GUI? Isn't that just extra RAM & CPU overhead that could be more profitably put to use powering the application? > great things for free and already installed - granted most is publicly > available, but it's already installed and ready for use that's the big > advantage). Not only that, but I can run windoze in Parallels (or even > Boot Camp if I desired).
Re: [GENERAL] Linux distro
On 08/01/07 21:44, Andrej Ricnik-Bay wrote: > On 8/2/07, Ron Johnson <[EMAIL PROTECTED]> wrote: > >>> As an alternative viewpoint, I've been running the latest postgres on >>> Mac OS X Server 10.4, and it's been great for me. It was my first time >>> using a server, and my first serious use of postgres (although I have >>> had a lot of previous unix experience.) All the power of unix, all the >>> ease of the Macintosh (and its server installation gives you lots of >> Pardon me for being the contrarian, but why does a server need a >> GUI? Isn't that just extra RAM & CPU overhead that could be more >> profitably put to use powering the application? > Amen =) > > Unnecessary waste of resources, plus artificial introduction of stuff > that can potentially make the machine go belly-up... a dedicated > server should have the bare minimum install; as much as necessary > and as little as possible. That's (one reason) why I like Debian. Its packages are so granular that you can install only what you want to install. I.e., Python without the GNOME or KDE language bindings, which would also drag in GNOME/KDE and X.org. > /me cringes at the idea of flying toasters slowing down a query ... :D
Re: [GENERAL] Linux distro
On 08/01/07 21:58, Merlin Moncure wrote: [snip] > > 3. binary packaging > While I like the debian distros generally, I dislike the debian > packaging of PostgreSQL. IMO, it's over engineered. If you plan to How so?
Re: [GENERAL] Linux distro
On 08/01/07 22:05, Merlin Moncure wrote: > On 8/2/07, Ron Johnson <[EMAIL PROTECTED]> wrote: >> On 08/01/07 10:37, Owen Hartnett wrote: >>> At 4:52 PM +0200 8/1/07, Leif B. Kristensen wrote: >>>> On Wednesday 1. August 2007 16:15, Madison Kelly wrote: >>>> >>>>> /Personally/, I love Debian on servers. >>>>> >>>>> It's not quite as 'hardcore' as Gentoo (a great distro, but not one to >>>>> start with!). It's the foundation of many of the popular distros >>>>> (Ubuntu, Mepis, Knoppix, etc) and the Debian crew is very careful >>>>> about what they put into the 'stable' repositories. >>>> I agree totally. Debian in a server configuration is quite easy to get >>>> started with, and is rock solid. My first Linux "test server" (my old >>>> Pentium 133 MHz desktop) way back in 2002 ran Debian Woody. I kept it >>>> running until it died from old age a couple of years ago. Later I fell >>>> in love with Gentoo. But if I'd have to run a server with maximum >>>> stability and uptime, I think that I'd still prefer Debian. >>> As an alternative viewpoint, I've been running the latest postgres on >>> Mac OS X Server 10.4, and it's been great for me. It was my first time >>> using a server, and my first serious use of postgres (although I have >>> had a lot of previous unix experience.) All the power of unix, all the >>> ease of the Macintosh (and its server installation gives you lots of >> Pardon me for being the contrarian, but why does a server need a >> GUI? Isn't that just extra RAM & CPU overhead that could be more >> profitably put to use powering the application? > > A server with a GUI sitting on a login screen is wasting zero > resources. Some enterprise management tools are in java which require > a GUI to use so there is very little downside to installing X, so IMO > a lightweight window manager is appropriate...a full gnome is maybe > overkill.
Obviously, you want to turn off the 3D screen saver :-) X is network-transparent. Load the few necessary X libraries (Debian's packages are granular enough to do this), and then use the GUI on your workstation to run all those foolish GUI-based server apps (in a room down the hall, across town or across the country, with compressed X).
Re: [GENERAL] python api to backup/restore data
On 08/04/07 10:58, Tom Lane wrote: > Acm <[EMAIL PROTECTED]> writes: >> I would like to know whether an API for accessing Postgresql from >> Python exists and whether or not it supports commands for backing up a >> database (or schema) and restoring a backup file. > > Run pg_dump or pg_restore as a subprocess. But that doesn't give you the capability of getting customized output.
Re: [GENERAL] Moving to postgresql and some ignorant questions
On 08/14/07 10:09, Phoenix Kiula wrote: > On 14/08/07, Gregory Stark <[EMAIL PROTECTED]> wrote: >> "Phoenix Kiula" <[EMAIL PROTECTED]> writes: [snip] >>> >>> QUESTION1: >>> Is this it? Or am I missing something in terms of execution? We use >>> Perl on our site but may be gradually switching to PHP at some point. >>> Will the above plan of execution be ok? >> A transaction is a bunch of queries which you want to all get committed or >> aborted together. The expensive step in Postgres is the COMMIT. Postgres does >> an fsync which forces the data to go to disk (if you're using good hardware, >> and don't have fsync=off). That takes from 5-15ms depending on how much data >> and how fast your drives are. > > If I am reading this right, does this mean it is probably better to > leave fsync as "fsync=off" on production machines? Also, is COMMIT > automatic for my queries? In some minor testing I did (not scientific > at all) some queries through Perl did not update the database at > all. I had "fsync=off" in my conf file, and there was no COMMIT etc in > my SQL, just plain SQL. So I am a bit confused. What's a good starting > point? > [snip] > > So, again, in the conf file, is this what you recommend: > > fsync=off I seriously doubt that's what he means. "fsync=off" *is* faster than "fsync=on", but leaves your data at risk in case of a hardware crash. Turning it off during initial data load is quite common, though. Just remember to turn it back on!!!
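In postgresql.conf terms, the pattern being recommended above looks like this (a sketch; fsync = on is already the default):

```
# Durable production setting -- never leave a production server without it:
fsync = on

# Only for a one-off bulk load that you could re-run from scratch after
# a crash; turn it back on (and reload the server) as soon as the load
# finishes:
#fsync = off
```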
Re: [GENERAL] MVCC cons
On 08/14/07 14:34, Kenneth Downs wrote:
> Tom Lane wrote:
>> Kenneth Downs <[EMAIL PROTECTED]> writes:
>>> Speaking as an end-user, I can give only one I've ever seen, which is
>>> performance. Because of MVCC, Postgres's write performance (insert
>>> and update) appears on my systems to be almost exactly linear to row
>>> size. Inserting 1000 rows into a table with row size 100 characters
>>> takes twice as long as inserting 1000 rows into a table with row size
>>> 50 characters.
>>
>> Not sure why you'd think that's specific to MVCC. It sounds like purely
>> an issue of disk write bandwidth.
>>
>> regards, tom lane
>
> I did not see this in MS SQL Server.

It is only logical that it will take 2x as long to insert 2x as much
data. Maybe SQL Server is compressing out white space? Or (shudder)
heavily caching writes?
Re: [GENERAL] MVCC cons
On 08/14/07 20:36, Merlin Moncure wrote:
[snip]
> PostgreSQL wins in terms of better concurrency (especially in long
> transactions or transactions that touch a lot of records), cheap
> rollbacks, and all the advantages of a sophisticated locking engine
> (transactional DDL, for example).

Although MVCC is not a /sine qua non/ of transactional DDL.
Re: [GENERAL] Transactional DDL
On 08/15/07 00:05, Harpreet Dhaliwal wrote:
> And this feature, i.e. transactional DDL, is not there in other major
> RDBMSs like SQL Server, Oracle, etc.?

Define "major". Does it mean "popular" or "used on very large systems"?
Re: [GENERAL] Strange inconsistency with UPDATE
On 08/16/07 21:58, Phoenix Kiula wrote:
> I am trying to force a column to lowercase because PostgreSQL is
> case-sensitive in queries. For the time being I've made an expression
> index on lower(KEY). But I would like to have just lowercase data and
> then drop this expression index.
>
> However, I see some inconsistent behavior from PostgreSQL. When I issue
> an UPDATE command, it shows me a duplicate violation (which could be
> correct):
>
> -# update TABLE set ACOLUMN = lower(ACOLUMN);
> ERROR: duplicate key violates unique constraint "TABLE_ACOLUMN_key"
>
> So I try to find the offending values of ACOLUMN that become duplicated
> when lower(ACOLUMN) is applied:
>
> -# SELECT lower(ACOLUMN), count(*) FROM TABLE
>    GROUP BY lower(ACOLUMN) HAVING count(*) > 1;
>  lower | count
> -------+-------
> (0 rows)
>
> But this doesn't make sense! If there are no values that get repeated,
> how can it violate the UNIQUE constraint?
>
> I am not sure if the following helps, but I'm including the EXPLAIN on
> this table. Penny for your thoughts!

Whatever the issue, you can bet your car that it's not a bug in
PostgreSQL, but that you are misunderstanding how PG works.

Write a script that loops through the records one by one, updating only
one record per loop iteration. That will find the problem record.
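The one-record-at-a-time loop above can also be done as a client-side scan: pull the column values out and group them by their lowercased form, exactly as the UNIQUE index would see them after the UPDATE. A small sketch (the sample keys are made up):

```python
def lower_collisions(values):
    """Group values whose lower() forms collide, i.e. the rows that the
    unique constraint would reject after UPDATE ... SET col = lower(col)."""
    groups = {}
    for v in values:
        groups.setdefault(v.lower(), []).append(v)
    return {k: vs for k, vs in groups.items() if len(vs) > 1}

keys = ["AbC", "abc", "xyz", "XYZ", "unique"]
print(lower_collisions(keys))  # {'abc': ['AbC', 'abc'], 'xyz': ['xyz', 'XYZ']}
```

If a fetch of all values reports no collisions here but the UPDATE still fails, that points at something other than plain case-folding duplicates (e.g. trailing whitespace), which the record-by-record loop will expose.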
Re: [GENERAL] Writing most code in Stored Procedures
On 08/17/07 18:00, Steve Manes wrote:
> Guy Rouillier wrote:
>> I have a thread I started ages ago over on the PERFORM list that I'm
>> sadly just now being able to provide some insight on. I'll be replying
>> on that thread in more detail, but the short of it turns out to be
>> that, at least in this one application, using stored procs for inserts
>> is slowing down the app considerably. The app does high-volume inserts
>> and updates, about 16 million rows a day. By switching from stored
>> procs to inline inserts, elapsed time dropped from 2595 seconds to 991
>> seconds for a test run.
>>
>> So the moral of the story is that, as anyone who has worked
>> professionally for a number of years knows, no magic bullets exist.
>> General guidelines can be helpful, but each scenario must be
>> individually investigated.
>
> Absolutely. You can't assume that every application is going to fit
> neatly into the same development jig. Hope the MVC framework zealots
> don't read that.
>
> I worked on a fairly high-volume site using PG, an ad trafficking
> workflow application, which imported ~2 million placements daily from
> DoubleClick, OAS and Accipiter. Everything had to be imported and the
> reports run and cached by 8am so the clients could stare blankly at
> 10,000-row Excel charts over their morning coffee.
>
> Moving all the application-bound inserts into stored procedures didn't
> achieve nearly the performance enhancement I'd assumed I'd get, which I
> figured was due to the overhead of the procs themselves.

Would that be because the original app was written in a compiled
language, but the SPs in an interpreted language?
Re: [GENERAL] Blobs in Postgresql
On 08/17/07 23:16, Merlin Moncure wrote:
> On 8/18/07, Ron Olson <[EMAIL PROTECTED]> wrote:
>> The language is Java. I've made some tests and they work very well for
>> 25-meg files... works exactly the way it should, first time. MySQL had
>> all kinds of nasty surprises for me when I first started working with
>> blobs, but I can say that I took my code, changed the driver, and it
>> all works like a champ (mind you, this was a quick test app).
>>
>> I haven't looked at encryption at the database level... is such a
>> thing available? I know Oracle has some form of data encryption at the
>> database level so the nefarious DBA with the wide mustache and black
>> brimmed hat always going "ah ha ha ha ha" can't make off with the
>> data, but does Postgres have something similar?
>>
>> BTW, to put it into context, the database will be designed to hold
>> evidence (well, photos and videos of it). Thus the compelling need for
>> some security, as well as the variation in file sizes.
>
> Well, my assumption was that you would encrypt the data on the client
> side and store it that way.
>
> PostgreSQL has an open architecture. If you wanted to do the encryption
> on the server, one possible approach that jumps out at me is to write a
> small C function which receives the data, encrypts the image using a
> key sent by the client (but not stored), and either stores the
> encrypted image back in the database via SPI or writes it out to a
> file.
>
> There are many strategies for encrypting data... the first thing to
> think about is where the encryption happens, where the keys are stored,
> etc.

Client-side encryption is important, because with server-side
encryption you are sending the Valuable Data across the wire (or, even
worse, wireless!) in cleartext form. It's more likely that there's a
packet sniffer on the network than an Evil DBA snooping around.
Re: [GENERAL] Writing most code in Stored Procedures
On 08/17/07 21:45, Steve Manes wrote:
> Ron Johnson wrote:
>>> Moving all the application-bound inserts into stored procedures
>>> didn't achieve nearly the performance enhancement I'd assumed I'd
>>> get, which I figured was due to the overhead of the procs themselves.
>>
>> Would that be because the original app was written in a compiled
>> language, but the SPs in an interpreted language?
>
> No, because the application language was Perl5 for both. I think it was
> just the overhead of 2 million inserts via procs versus 2 million
> inline inserts (without the proc overhead).

Interesting. Does PG have to initiate the Perl interpreter every time
you call a Perl-written SP?
Re: [GENERAL] Automating logins for mundane chores
On 08/18/07 06:02, Phoenix Kiula wrote:
[snip]
> Thanks for this. I am logged in as root. Put it there and it works. I

Well, that's your first problem.

And second. And third.
Re: [GENERAL] Writing most code in Stored Procedures
On 08/18/07 11:08, Joshua D. Drake wrote:
> Josh Tolley wrote:
>> On 8/18/07, Ron Johnson <[EMAIL PROTECTED]> wrote:
>>> Interesting. Does PG have to initiate the Perl interpreter every
>>> time you call a Perl-written SP?
>> IIRC PostgreSQL should only load the Perl interpreter once per
>> session.
>
> Right.

I'd have been stunned if you did it any other way. So why is the
Perl-SP INSERT so much slower than the inline INSERT? (I can imagine
that the SP code path would be longer, but since I/O is the slowest
part of the system, I'm surprised that it's *that* much slower.)
Re: [GENERAL] Automating logins for mundane chores
On 08/18/07 21:10, Phoenix Kiula wrote:
> On 18/08/07, Ron Johnson <[EMAIL PROTECTED]> wrote:
>> On 08/18/07 06:02, Phoenix Kiula wrote:
>> [snip]
>>> Thanks for this. I am logged in as root. Put it there and it works. I
>> Well, that's your first problem.
>>
>> And second. And third.
>
> Thanks for the kick in the derrière. Have set it all up to operate as
> user postgres. Or is that insecure too?

Whenever thinking about security, the question to ask yourself is: am I
doing anything which would make it easier for a Bad Guy to gain access
to my data or systems? Then, do the opposite. Examples:

Using a powerful account for mundane activities? Use a mundane account
instead. (As Joshua pointed out, "postgres" is a powerful account.)

Sending important data over the wire (or worse, wireless) in clear
text? Encrypt it.

Vulnerable to SQL injection attacks because you send fully formed SQL
statements across the wire? Use prepared statements instead.
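The last point can be illustrated with Python's bundled sqlite3 module standing in for any SQL database (the table and values are made up for the demo; the same parameterized style works with PostgreSQL drivers):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

attack = "' OR '1'='1"

# Unsafe: string interpolation lets the attacker's quote break out of
# the literal, turning the WHERE clause into a tautology.
unsafe = conn.execute(
    "SELECT secret FROM users WHERE name = '%s'" % attack).fetchall()
print(unsafe)  # [('s3cret',)] -- the injection matched every row

# Safe: a prepared/parameterized statement treats the whole value as
# data, never as SQL.
safe = conn.execute(
    "SELECT secret FROM users WHERE name = ?", (attack,)).fetchall()
print(safe)    # [] -- no user is literally named "' OR '1'='1"
```

The placeholder syntax varies by driver (`?` here, `%s` or `$1` elsewhere), but the principle is the same: the statement shape crosses the wire separately from the values.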
Re: [GENERAL] Postgres, fsync and RAID controller with 100M of internal cache & dedicated battery
On 08/22/07 17:45, Dmitry Koterov wrote:
> Also, the controller is configured to use 75% of its memory for write
> caching and 25% for read caching. So reads cannot flood writes.

That seems to be a very extreme ratio. Most databases do *many* times
more reads than writes.
Re: [GENERAL] Geographic High-Availability/Replication
On 08/23/07 17:22, Bill Moran wrote:
> Decibel! <[EMAIL PROTECTED]> wrote:
>> On Aug 22, 2007, at 3:37 PM, Joshua D. Drake wrote:
>>> You can not do multi-master cross-continent reliably.
>> I'm pretty sure that credit card processors and some other companies
>> do it... it just costs a LOT to actually do it well.
>
> Isn't this sort of requirement the entire reason for 2-phase commit?

Entire reason? Not that I've heard.
Re: [GENERAL] Geographic High-Availability/Replication
On 08/24/07 08:21, Markus Schiltknecht wrote:
> Hi,
>
> Matthew wrote:
>> Hey all, new Postgres user here. We are trying to set up/research an
>> HA/replicated solution with PostgreSQL between a datacenter in LA and
>> a d.c. in NY.
>>
>> We have a private LAN link between the two d.c.'s with a max
>> round-trip of 150ms.
>>
>> We will have a web server at each d.c. (among other servers) that will
>> write/read to/from the local LAN DB. On writes, that data should be
>> transmitted to the other data center so that if, for whatever reason,
>> my website request was sent to LA instead of NY, all my session
>> information etc. will still exist.
>
> This is commonly known as synchronous replication. As that involves
> *at least* one round-trip *before* committing, it's quite expensive.
> Can you live with a delay of ~150ms before COMMIT confirmation?

Which caps a single session at about 6 TPS. Blech.

> Another issue is the reliability of your failure detectors. How does
> server B know that server A is really down (and not only the link)?
> Normally, that's solved with a quorum device, so you have to have at
> least three servers - preferably in different locations.
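The arithmetic behind that "6 TPS" remark, as a tiny sketch: with synchronous replication, each COMMIT waits out at least one full network round trip, so a single serial session can't commit faster than the reciprocal of the round-trip time.

```python
def max_commit_rate(round_trip_ms):
    """Upper bound on synchronous commits per second for ONE session,
    assuming every COMMIT waits out exactly one full round trip and
    everything else is free. Concurrent sessions can overlap waits,
    so aggregate throughput can be higher."""
    return 1000.0 / round_trip_ms

print(int(max_commit_rate(150)))  # 6
```

This is a per-session ceiling, not a server-wide one; many concurrent sessions each waiting on their own round trip can push aggregate TPS well above it.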
Re: [GENERAL] SQL Diff ?
On 08/25/07 21:51, Kevin Kempter wrote:
> Hi List;
>
> I have a very large table (52 million rows) - I'm creating a copy of it
> to rid it of 35G worth of dead space, then I'll do a sync, drop the
> original table and rename table2.

What is your definition of "dead space"?

Bad rows, duplicate rows, old rows? Something else?

> Once I have table2 as a copy of table1, what's the best way to select
> all rows that have been changed or modified in table1 since the initial
> load from table1 into table2?
>
> Also I'll need to delete any rows in table2 that no longer remain in
> table1.
>
> There is no change date column. I could do something like
>
>   select * from table1 where col1 || col2 || col3 etc
>   not in (select col1 || col2 || col3 etc from table2)
>
> but this would be inefficient & slow.
>
> Anyone have a suggestion to do this in an efficient manner?
>
> Thanks in advance
Re: [GENERAL] SQL Diff ?
On 08/25/07 22:21, Kevin Kempter wrote:
> On Saturday 25 August 2007 21:10:19 Ron Johnson wrote:
>> On 08/25/07 21:51, Kevin Kempter wrote:
>>> Hi List;
>>>
>>> I have a very large table (52 million rows) - I'm creating a copy of
>>> it to rid it of 35G worth of dead space, then I'll do a sync, drop
>>> the original table and rename table2.
>> What is your definition of "dead space"?
>>
>> Bad rows, duplicate rows, old rows? Something else?
>
> Deleted rows that should have been cleaned up with vacuum. Problem is
> the client let it go so long that now I can't get a vacuum to finish
> because it impacts the day-to-day operations too much. Long story; see
> my recent questions on the performance list for more info.

OK.

>>> Once I have table2 as a copy of table1, what's the best way to select
>>> all rows that have been changed or modified in table1 since the
>>> initial load from table1 into table2?

Is this a 24x7 database?

>>> Also I'll need to delete any rows in table2 that no longer remain in
>>> table1.
>>>
>>> There is no change date column. I could do something like
>>>
>>>   select * from table1 where col1 || col2 || col3 etc
>>>   not in (select col1 || col2 || col3 etc from table2)
>>>
>>> but this would be inefficient & slow.
>>>
>>> Anyone have a suggestion to do this in an efficient manner?
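Assuming the table has a primary key (the poster lacks a change-date column, but some unique key is implied by "rows that no longer remain"), the changed-and-deleted sets can be computed key-by-key instead of by concatenating every column. A minimal Python sketch of the idea, with made-up keys and rows:

```python
def diff_tables(table1, table2):
    """table1/table2: dicts mapping primary key -> row tuple.
    Returns (changed_or_new, deleted): the keys to re-copy into table2
    and the keys to delete from table2, to bring table2 in sync."""
    changed_or_new = [k for k, row in table1.items() if table2.get(k) != row]
    deleted = [k for k in table2 if k not in table1]
    return changed_or_new, deleted

t1 = {1: ("a", 10), 2: ("b", 20), 4: ("d", 40)}   # current table1
t2 = {1: ("a", 10), 2: ("b", 99), 3: ("c", 30)}   # stale copy (table2)
print(diff_tables(t1, t2))  # ([2, 4], [3])
```

In SQL the same shape is a key-joined comparison (e.g. a LEFT JOIN on the key with a row-inequality filter), which can use the key's index instead of scanning concatenated column expressions.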
Re: [GENERAL] Install on 32 or 64 bit Linux?
On 08/28/07 16:21, Ralph Smith wrote:
> Hello,
>
> We're at a crossroads here and it's time to upgrade boxes and versions
> of PG. This email query is about the first step.
>
> Are there any recommendations on whether to install onto 32- vs 64-bit
> Linux? We're going to be using virtual machines.

Which hypervisor?

> Our application consists mostly of near-sequential inserts into a LARGE
> table, accompanied by updates into much smaller tables. I perform the
> occasional queries on the large table, requiring sequential scans.
>
> Are there any indications whether 32- or 64-bit Linux would be
> preferable?

You haven't told us the important stuff yet. Primarily:

How *big* is the database?
How many inserts/minute?
What else will this box be doing besides inserting records?
How fast are the disks? SCSI or SATA/IDE?
How much RAM?
Re: [GENERAL] Is there a better way to do this?
On 08/28/07 20:06, Tom Lane wrote:
[snip]
> As Michael says, the speed argument is really kinda minor compared
> to the other ones, but it's real enough.

Every little bit counts, though. For example, if it's part of an
otherwise computationally-intensive operation that you're performing on
20 million rows, you'll definitely see the difference.
Re: [GENERAL] Reliable and fast money transaction design
On 08/29/07 07:27, cluster wrote:
> OK, thanks. But what about the second question, in which the UPDATE is
> based on a SELECT max(...) statement on another table? How can I ensure
> that no other process inserts a row between my SELECT max() and UPDATE,
> making my SELECT max() invalid?
>
> A table lock could be an option, but I am only interested in blocking
> row insertions for this particular account_id. Insertions for other
> account_ids will not make the SELECT max() invalid and should therefore
> be allowed.

Well, concurrency and transactional consistency *allow* other processes
to update the table after you start your transaction. You just won't
*see* their updates while you're inside of a transaction.

Of course, if you truly want exclusive access, you could LOCK the
table. It's well explained in the documentation...
Re: [GENERAL] Reliable and fast money transaction design
On 08/29/07 09:34, Decibel! wrote:
> On Wed, Aug 29, 2007 at 08:37:26AM -0500, Ron Johnson wrote:
>> On 08/29/07 07:27, cluster wrote:
>>> OK, thanks. But what about the second question, in which the UPDATE
>>> is based on a SELECT max(...) statement on another table? How can I
>>> ensure that no other process inserts a row between my SELECT max()
>>> and UPDATE, making my SELECT max() invalid?
>>>
>>> A table lock could be an option, but I am only interested in blocking
>>> row insertions for this particular account_id. Insertions for other
>>> account_ids will not make the SELECT max() invalid and should
>>> therefore be allowed.
>> Well, concurrency and transactional consistency *allow* other
>> processes to update the table after you start your transaction. You
>> just won't *see* their updates while you're inside of a transaction.
>
> Just make sure and read up about transaction isolation... in the
> default READ COMMITTED mode, you can sometimes see changes made by
> other transactions.

Argh!!! The RDBMS that I typically use defaults to SERIALIZABLE.
ACID (was Re: [GENERAL] Reliable and fast ...)
On 08/29/07 10:40, Joshua D. Drake wrote:
> Ron Johnson wrote:
>> On 08/29/07 09:34, Decibel! wrote:
>>> Just make sure and read up about transaction isolation... in the
>>> default READ COMMITTED mode, you can sometimes see changes made by
>>> other transactions.
>> Argh!!! The RDBMS that I typically use defaults to SERIALIZABLE.
>
> SERIALIZABLE is really slow :). You should look into SERIALIZABLE only
> for those transactions that need it. There is also SELECT FOR UPDATE.

We use SERIALIZABLE (with all its locking "issues") to guarantee the I
in ACID. ISTM that READ COMMITTED can only deliver "ACD".
Re: [GENERAL] Reliable and fast money transaction design
On 08/29/07 10:47, Tom Lane wrote:
> Ron Johnson <[EMAIL PROTECTED]> writes:
>> On 08/29/07 07:27, cluster wrote:
>>> Just make sure and read up about transaction isolation... in the
>>> default READ COMMITTED mode, you can sometimes see changes made by
>>> other transactions.
>
>> Argh!!! The RDBMS that I typically use defaults to SERIALIZABLE.
>
> You can change default_transaction_isolation if you like.

You misunderstand: we do that on purpose, and I had forgotten that most
RDBMSs don't do that.
Re: ACID (was Re: [GENERAL] Reliable and fast ...)
On 08/29/07 11:20, Joshua D. Drake wrote:
> Ron Johnson wrote:
>> On 08/29/07 10:40, Joshua D. Drake wrote:
>>> Ron Johnson wrote:
>>>> Argh!!! The RDBMS that I typically use defaults to SERIALIZABLE.
>>> SERIALIZABLE is really slow :). You should look into SERIALIZABLE
>>> only for those transactions that need it. There is also SELECT FOR
>>> UPDATE.
>> We use SERIALIZABLE (with all its locking "issues") to guarantee the I
>> in ACID. ISTM that READ COMMITTED can only deliver "ACD".
>
> You are using serializable for select statements?

READ ONLY, which defaults to SERIALIZABLE. (It's not PostgreSQL...)
Re: [GENERAL] counting columns
On 08/28/07 10:46, pere roca wrote:
> hi!
>
> I want different users to insert data and perform different analyses,
> which will be inserted in an interactively created new column. The name
> of this new column should follow a logical scheme, such as:
> fxanalysis_1 for the first user that performs the analysis called fx,
> fxanalysis_2 for the second...
> To get it, I think I should construct a function that counts how many
> columns have a name starting with fxanalysis. If count >= 1, then alter
> table, adding a new column with name fxanalysis_2, and so on...
> One of the questions: how to COUNT COLUMNS? I can rename, add, alter
> columns... but count?

You'll need to go into the catalog.

Your idea, though, is poor. You should make a second table, where the
key is the same as the master table's, plus a "sequence number".

If you want to do it your way, though, PostgreSQL supports array types.
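In PostgreSQL "going into the catalog" means querying something like `information_schema.columns` (or `pg_attribute`) for column names matching the prefix. A hedged sketch of the counting-and-naming logic, using Python's bundled sqlite3 as a stand-in database (its `PRAGMA table_info` plays the role of the catalog query; table and column names are made up):

```python
import sqlite3

def next_analysis_column(conn, table, prefix):
    """Count existing columns named <prefix>_N and return the next name.
    In PostgreSQL the column list would come from:
      SELECT column_name FROM information_schema.columns
      WHERE table_name = %s AND column_name LIKE %s"""
    cols = [row[1] for row in conn.execute("PRAGMA table_info(%s)" % table)]
    n = sum(1 for c in cols if c.startswith(prefix))
    return "%s_%d" % (prefix, n + 1)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (id INTEGER, fxanalysis_1 REAL)")
name = next_analysis_column(conn, "results", "fxanalysis")
print(name)  # fxanalysis_2
conn.execute("ALTER TABLE results ADD COLUMN %s REAL" % name)
```

Note this only demonstrates the counting; the advice above stands, since a keyed child table avoids DDL-per-analysis entirely.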
Re: [GENERAL] Export data to MS Excel
On 09/01/07 02:16, Ashish Karalkar wrote:
> Hello All,
> I want to export data from PostgreSQL tables to MS Excel.
> Is there any way?

Extract the data to a CSV (comma- or tab-separated) file:

http://www.postgresql.org/docs/
http://www.postgresql.org/docs/8.2/interactive/index.html
http://www.postgresql.org/docs/8.2/interactive/sql-copy.html
Re: [GENERAL] Bigtime scaling of Postgresql (cluster and stuff I suppose)
On 09/01/07 08:12, chris smith wrote:
>> Ever read anything on how MySpace is laid out? The big ones need
>> replication to handle the traffic.
>
> Actually no.
>
> http://highscalability.com/livejournal-architecture
>
> "Using MySQL replication only takes you so far." (Yeah, it's MySQL but
> the point is valid regardless.)
> "You can't keep adding read slaves and scale."
>
> A lot use sharding now to keep scaling (limiting to "X" users/accounts
> per database system and just adding more database servers for the next
> "X" accounts).

Hmmm. Horizontally partitioning your "database" into multiple physical
databases is 10+ years old. At least. This is how DEC implemented the
billing database for DirecTV, and how we implemented *large* toll
systems in the US Northeast.

In addition to the account databases, you need a "reference" database
for tables that can't be partitioned by account, the ability to run
queries across databases, and middleware that knows how to direct
transactions to the correct database.
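The "middleware that knows how to direct transactions" boils down to a deterministic routing function from account to physical database. A minimal sketch (the host names and shard count are hypothetical):

```python
import zlib

NUM_SHARDS = 4
# Hypothetical shard hosts; a real deployment would keep this map in config.
SHARD_DSNS = ["db%d.example.com" % i for i in range(NUM_SHARDS)]

def shard_for_account(account_id):
    """Deterministically route an account to one physical database.

    zlib.crc32 is used because Python's built-in hash() is salted
    per-process, which would route the same account differently on
    each app server."""
    return SHARD_DSNS[zlib.crc32(str(account_id).encode()) % NUM_SHARDS]

print(shard_for_account(12345))
```

The hard parts the reply alludes to are what this sketch omits: the unpartitionable "reference" tables, cross-shard queries, and rebalancing when NUM_SHARDS changes (which naive modulo hashing handles badly; consistent hashing is the usual fix).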
Re: [GENERAL] Reporting services for PostgreSQL
On 09/01/07 06:12, Andrus wrote:
> I'm looking for a report generator which renders reports in server and sends
> rendering result to client.

"Render" as in "run the report program on the host"?
Re: [GENERAL] Need suggestion on how best to update 3 million rows
On 09/06/07 01:13, Ow Mun Heng wrote:
> I have a table in PG, pulled from SQL Server using Perl DBI (w/o using
> chopblanks) and have ended up with a column where the "space" is being
> interpreted as a value.
>
> eg:
>
> "ABC " when it should be "ABC"
>
> This is being defined as varchar(4).
>
> I've already pulled the relevant columns with
>
> create table foo as select unique_id, rtrim(number) from org_column;
>
> I've tried to do the update using
>
> update org_column set number = foo.number where foo.unique_id =
> org_column.unique_id;

Number? Where does "number" come from? Unless you've got weird field names, that doesn't sound like a very good name for a VARCHAR(4) column.

> The update is taking a few hours and still hasn't ended.
>
> I've killed it already and rolled back the changes.
>
> What's the easiest way to update these fields?

Is it only *some* tuples that have the "extra space" problem?
Re: [GENERAL] Need suggestion on how best to update 3 million rows
On 09/06/07 04:20, Ow Mun Heng wrote:
> On Thu, 2007-09-06 at 04:07 -0500, Ron Johnson wrote:
>> On 09/06/07 01:13, Ow Mun Heng wrote:
>>> update org_column set number = foo.number where foo.unique_id =
>>> org_column=unique_id.
>> Number? Where does "number" come from? Unless you've got weird
>> field names, that doesn't sound like a very good name for a
>> VARCHAR(4) column.
>
> "number" is just a fictitious column name. I use sane column names of
> course :-)

OK. Maybe there's an English language "issue", or maybe I'm just excessively picky, but using "number" in this context is confusing. In any event, it didn't stop the Earth from spinning or the sun from shining, so it can't be that bad of a problem...

>>> The update is taking a few hours and still hasn't ended.
>
>> Is it only *some* tuples that have the "extra space" problem?
>
> Actually, it's all of it.

Then I agree with Alban:

update table set number = trim(number);

or, if you need the leading spaces:

update table set number = rtrim(number);

Then:

update table set number = NULL where number = '';
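For anyone following the thread without a test database handy, the semantics of trim()/rtrim() (and why the final '' -> NULL step is separate) can be sketched in plain Python. This is only an illustration with made-up helper names; the actual fix is the pair of UPDATE statements.

```python
# Toy stand-ins for SQL's rtrim()/trim(), to show what the UPDATEs do
# to a space-padded varchar value. Illustration only, not real SQL.

def rtrim(s: str) -> str:
    """Mimic SQL rtrim(): strip trailing spaces only."""
    return s.rstrip(" ")

def trim(s: str) -> str:
    """Mimic SQL trim(): strip leading and trailing spaces."""
    return s.strip(" ")

raw = "ABC "                      # value imported without chopblanks
assert rtrim(raw) == "ABC"        # trailing pad removed
assert trim("  ABC ") == "ABC"    # both sides removed

# An all-blank value trims down to '', which is why the separate
# "set number = NULL where number = ''" pass is still needed:
assert rtrim("    ") == ""
```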
Re: [GENERAL] a provocative question?
On 09/06/07 10:43, TJ O'Donnell wrote:
> I am getting in the habit of storing much of my day-to-day
> information in postgres, rather than "flat" files.
> I have not had any problems of data corruption or loss,
> but others have warned me against abandoning files.
> I like the benefits of enforced data types, powerful searching,
> data integrity, etc.
> But I worry a bit about the "safety" of my data, residing
> in a big scary database, instead of a simple friendly
> folder-based file system.
>
> I ran across this quote on Wikipedia at
> http://en.wikipedia.org/wiki/Eudora_%28e-mail_client%29
>
> "Text files are also much safer than databases, in that should disk
> corruption occur, most of the mail is likely to be unaffected, and any
> that is damaged can usually be recovered."
>
> How naive (optimistic?) is it to think that "the database" can
> replace "the filesystem"?

Text files are *simple*. When fsck repairs the disk and creates a bunch of recovery files, just fire up $EDITOR (or cat, for that matter) and piece your text files back together. You may lose a block of data, but the rest is there, easy to read.

Database files are *complex*. Pointers and half-vacuumed free space and binary fields and indexes and WALs, yadda yadda yadda. And, by design, it's all got to be internally consistent. Any little corruption and *poof*, you've lost a table. A strategically placed corruption and you've lost your database.

But... that's why database vendors create backup/restore commands. You *do* back up your database(s), right??
Re: [GENERAL] Column as arrays.. more efficient than columns?
On 09/06/07 20:53, Merlin Moncure wrote:
[snip]
> arrays are interesting and have some useful properties. however, we
> must first discuss the problems... first and foremost, if you need to
> read any particular item off the array you must read the entire array
> from disk, and you must write all items back to disk for writes.

Reads and writes are done at the page level, so I'm not sure this is valid.

> also, they cause some problems with constraints and other issues that
> come up with de-normalization tactics.

ACK.
Re: [GENERAL] Column as arrays.. more efficient than columns?
On 09/06/07 21:26, Ow Mun Heng wrote:
> On Thu, 2007-09-06 at 20:57 -0500, Michael Glaesemann wrote:
>> On Sep 6, 2007, at 20:46, Ow Mun Heng wrote:
>>> I would believe performance would be better it being denormalised. (in
>>> this case)
>> I assume you've arrived at that conclusion because you have
>> (a) shown that the performance with a normalized schema does not meet your needs;
>> (b) benchmarked the normalized schema under production conditions;
>> (c) benchmarked the denormalized schema under production conditions; and
>> (d) shown that performance is improved in the denormalized case.
>> I'm interested to see the results of your comparisons.
>>
>> Regardless, it sounds like you've already made up your mind. Why ask
>> for comments?
>
> You've assumed wrong. I've not arrived at any conclusion but am merely
> exploring my options on which way would be the best to tread. I'm
> asking the list because I'm new to PG, and after reading all those
> articles on highscalability etc., the majority of them are all using some
> kind of denormalised tables.

Correlation != causation. There *might* be a causal relationship between high scalability and table denormalization, but I seriously doubt it.

> Right now, there's 8 million rows of data in this one table, growing
> at a rapid rate of ~2 million/week. I can significantly reduce this
> number down to 200K (I think, by denormalising it) and shrink the table
> size.

Even presuming you only insert data SIX hours per day, that's only 13.3 inserts per second. Not very impressive.
Re: [GENERAL] Column as arrays.. more efficient than columns?
On 09/07/07 02:49, Ow Mun Heng wrote:
> On Fri, 2007-09-07 at 00:18 -0500, Ron Johnson wrote:
>> On 09/06/07 21:26, Ow Mun Heng wrote:
>>> I've not arrived at any conclusion but merely
>>> exploring my options on which way would be the best to tread. I'm
>>> asking the list because I'm new in PG and after reading all those
>>> articles on highscalability etc.. majority of them are all using some
>>> kind of denormalised tables.
>> Correlation != causation.
>>
>> There *might* be a causal relationship between high scalability and
>> table denormalization, but I seriously doubt it.
>
> I can't refute you on this since I have no experience in this arena,
> only what I read in highscalbility.com (IIRC)
>
>>> Right now, there's 8 million rows of data in this one table, and growing
>>> at a rapid rate of ~2 million/week. I can significantly reduce this
>>> number down to 200K (i think by denormalising it) and shrink the table
>>> size.
>> Even presuming you only insert data SIX hours per day, that's only
>> 13.3 inserts per second. Not very impressive.
>
> Data is inserted 24 hours a day, but not at the same rate each
> sec/minute. The problem isn't really the data insertion; it's already
> inserted in a normalised manner. It's the selection of data (OLTP
> datahouse) which takes a longer time and which is the area of worry.

Datahouse or "data warehouse"?
Re: [GENERAL] Column as arrays.. more efficient than columns?
On 09/07/07 07:49, Merlin Moncure wrote:
> On 9/7/07, Ron Johnson <[EMAIL PROTECTED]> wrote:
>> On 09/06/07 20:53, Merlin Moncure wrote:
>> [snip]
>>> arrays are interesting and have some useful properties. however, we
>>> must first discuss the problems... first and foremost, if you need to
>>> read any particular item off the array you must read the entire array
>>> from disk, and you must write all items back to disk for writes.
>> Reads and writes are done at the page level, so I'm not sure this is
>> valid.
>
> sure it is... since the denormalized record is much larger (especially
> in array scenarios), the tuple is much larger, meaning the page will
> fill up much more quickly, meaning more dead pages, more vacuuming,
> etc. Besides that, the server has to do some work presenting the
> array as part of the read, which is overhead. I didn't go into a lot
> of detail but the reasoning is sound. Here is a quick example showing
> the problem.

We agree. What I meant was that reads and writes are done at the page level no matter whether the table is normalized or not. Thus, to say "if you need to read any particular item off the array you must read the entire array from disk and you must write all items back to disk for writes" is... irrelevant. That's probably an imprecise word, but it's all I can think of at the moment.

We also agree regarding big records filling pages faster.
Re: [GENERAL] an other provokative question??
On 09/07/07 01:37, Greg Smith wrote:
> On Fri, 7 Sep 2007, Ron Johnson wrote:
>> Definitely a niche product.
>
> Stonebraker's commentary was unfortunately spun by the ComputerWorld
> columnist.

Tech journalist morphing reality to make a provocative story? Say it ain't so!

> I hope people followed the link to his actual blog entry at
> http://www.databasecolumn.com/2007/09/one-size-fits-all.html where his
> argument is that the idea of one database approach always being right
> just isn't true anyway. With that mindset, every technology is a niche
> product of sorts; just the size of the niche varies.

I read something similar by him a few months ago. Very interesting.

> Given past history of this project and its relation to Stonebraker, I
> was tempted earlier today to suggest that the Postgres vs. PostgreSQL
> renaming argument be dropped in favor of renaming the database
> "Horizontica".
Re: [GENERAL] an other provokative question??
On 09/06/07 22:54, Tom Lane wrote:
> "Dann Corbit" <[EMAIL PROTECTED]> writes:
>>> Relational database pioneer says technology is obsolete
>>> http://www.computerworld.com/action/article.do?command=viewArticleBasic&articleId=9034619
>
>> This bit is a hint:
>> "Column-oriented databases -- such as the one built by Stonebraker's
>> latest start-up, Andover, Mass.-based Vertica Systems Inc. -- store data
>> vertically in table columns rather than in successive rows."
>
>> Mr. Stonebraker's company sells column-oriented databases. So of course
>> the other methods must be "obsolete".
>
> I don't see anything in there where Stonebraker says that relational DBs
> are obsolete. What he suggests is that column-oriented storage might

Does "column-oriented storage" mean that all of the COLUMN_A values for all 200 million rows are stored together on adjacent pages? If so, then doing aggregates (the bread and butter of DW) *would* seem to be faster.

But a b-tree leaf that points to "a record" would need num_cols pointers instead of one. Very messy. And large.

Definitely a niche product.

> beat row-oriented storage for a lot of modern applications. He might be
> right (I'm sure not going to bet against the guy who started Postgres)
> but this has not got anything to do with the concept of a relational
> database. It's an implementation detail --- maybe a pretty fundamental
> one, but in principle you could build a DB either way and no user could
> see a semantic difference.
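The layout difference being described here can be sketched with a toy in-memory model (an illustration only, not any real engine's on-disk format): an aggregate over one column touches only that column's storage in a column store, while a row store must walk every record.

```python
# Toy model of row- vs column-oriented storage. Hypothetical data,
# purely to illustrate why per-column aggregates favor column stores.

rows = [  # row store: whole records kept together
    {"id": 1, "col_a": 10, "col_b": "x"},
    {"id": 2, "col_a": 20, "col_b": "y"},
    {"id": 3, "col_a": 30, "col_b": "z"},
]

columns = {  # column store: each column's values kept adjacently
    "id":    [1, 2, 3],
    "col_a": [10, 20, 30],
    "col_b": ["x", "y", "z"],
}

# Aggregating col_a in the column store reads one contiguous list:
assert sum(columns["col_a"]) == 60

# The row store must visit every record to compute the same answer
# (in a real engine: read every page of the table):
assert sum(r["col_a"] for r in rows) == 60
```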
Re: [GENERAL] Column as arrays.. more efficient than columns?
On 09/07/07 09:00, Ow Mun Heng wrote:
>> Datahouse or "data warehouse"?
>
> OLTP data warehouse.

But OLTP & DW are diametrically opposed in how you design, structure, load and use them.
Re: [GENERAL] a provocative question?
On 09/06/07 20:45, Chris Browne wrote:
> [EMAIL PROTECTED] ("Trevor Talbot") writes:
>> There's also a point in regard to how modifications are made to your
>> data store. In general, things working with text files don't go to
>> much effort to maintain durability like a real database would. The
>> most direct way of editing a text file is to make all the changes in
>> memory, then write the whole thing out. Some editors make backup
>> files, or use a create-delete-rename cycle, but they won't
>> necessarily force the data to disk -- if it's entirely in cache you
>> could end up losing the contents of the file anyway.
>
> In the case of Eudora, if its filesystem access protocol involves
> writing a new text file, and completing that before unlinking the old
> version, then the risk of "utter destruction" remains fairly low
> specifically because of the nature of the access protocol.

mbox is a monolithic file also, and you need to copy/delete, copy/delete, yadda yadda yadda. Just to do anything, you need 2x as much free disk space as your biggest mbox file. What a PITA.

mh and Maildir are, as has been partially mentioned, much more efficient in that regard. (Yes... mbox is an excellent transport format.)
Re: [GENERAL] Scalability Design Questions
On 09/08/07 19:58, novnov wrote:
> You're right, that's not how I understood your partitioning question. I'd not
> be eager to implement that kind of split in the db, though I can see that
> it's an example of how to spread the load around.
>
> But basically, it seems that the answer to one of my questions is that there
> is currently no way with postgres to spread a single database over multiple
> servers, a la a loadbalanced apache cluster, where requests are forwarded to
> different boxes.
>
> It's not going to be an issue for me soon, maybe ever, I just wanted to see
> what the score is.

Scott is correct both in (a) that the money otherwise spent on RAC can buy a big heap of extra oomph, and (b) that the "middleware partitioning" method of scaling systems works very well.

Any way you go, though, requires proper design and coding from the very beginning. And it never hurts to think outside the box: sometimes you can get the results you want by doing something that seems crazy.

For example, we had a couple of tables that were specifically designed for OLTP. A specific weekly report that joined the 2 tables just would *never* finish. Instead of fighting to create some hairy-looking SQL-statement-from-Hell, I created WHERE-less views on the two tables with only the columns that were needed. Now we have a job that unloads the views, truncates then loads them into a *separate* database (so that backups & CREATE INDEX don't clash), indexes them perfectly for this query, and then runs the query against this separate database.
Re: [GENERAL] Time Zone design issues
On 09/09/07 22:29, novnov wrote:
[snip]
> But it doesn't particularly make sense to set the server's clock to UTC.
> It'd seem to skew a lot of server functionality which I'd think should
> normally be geared around local time. So I'd guess that the route to take is
> to keep the server pegged to local time, and use a function that derives UTC
> from now(), tags the 'last modified' fields with that value.

Your single-user Windows mindset is shining brightly. Unix servers have had their internal clocks set to UTC for a decade or more, and there have been no noticeable ill effects, since apps all know to adjust for TZ.
Re: [GENERAL] arrays of foreign keys
On 09/07/07 18:47, Max wrote:
> Hello,
>
> And pardon me if I posted this question to the wrong list; it seems
> this list is the most appropriate.
>
> I am trying to create a table with an array containing foreign keys.
> I've searched through the documentation and couldn't find a way to do so.
>
> Is this something that one can do?
>
> Basically, I have two tables:
>
> create table user (
>     user_id serial,
>     login varchar(50) primary key,
>     tags integer[]  -- this is where the problem lies
> );
>
> create table tag (
>     tag_id serial,
>     name varchar(50) primary key
> );
>
> I would like the user.tags field to be a list of foreign keys (tag_ids
> specifically).
>
> I know I can solve my problem with the following table:
>
> create table user_tag (
>     user_id integer,
>     tag_id integer,
>     foreign key (user_id) references user(user_id) on delete cascade,
>     foreign key (tag_id) references tag(tag_id) on delete cascade,
>     primary key (user_id, tag_id)
> );
>
> But I would really like to avoid doing that.

Why? The (literally) cardinal rule of database normalization is "eliminate repeating values".

> Is there a solution to this problem with arrays of foreign keys, and
> if so, how does one do that?
>
> Thanks for any help.
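Beyond normalization, the practical reason for the junction table is referential integrity: an integer[] column can silently hold tag ids that no longer exist, while the foreign keys on user_tag cannot. A toy pure-Python stand-in for that constraint check (hypothetical names, not real SQL behavior):

```python
# Toy illustration of why user_tag's foreign keys matter. Nothing
# validates ids inside an integer[] column; the junction table's FK
# rejects them at insert time. Pure-Python stand-in for the constraint.

tags = {1: "postgres", 2: "arrays"}      # the tag table

# Array-style storage: tag id 99 slips in unchecked ("dangling").
user_tags_array = {"max": [1, 2, 99]}
dangling = [t for t in user_tags_array["max"] if t not in tags]
assert dangling == [99]

# Junction-style storage with an FK-like check on insert:
user_tag: set[tuple[str, int]] = set()

def add_user_tag(login: str, tag_id: int) -> None:
    if tag_id not in tags:               # the FOREIGN KEY check
        raise ValueError("tag does not exist")
    user_tag.add((login, tag_id))

add_user_tag("max", 1)
try:
    add_user_tag("max", 99)              # rejected, as the FK would do
except ValueError:
    pass
assert user_tag == {("max", 1)}
```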
Re: [GENERAL] Time Zone design issues
On 09/10/07 10:55, Scott Marlowe wrote:
> On 9/10/07, novnov <[EMAIL PROTECTED]> wrote:
>> Your explanation of now() and localtime() is good... but localtime() to
>> postgres acting as a web app db, wouldn't it just return the local time as
>> far as the server is concerned?
>
> No, it would return it as the local time of the POSTGRESQL client.
>
> For instance, suppose that someone in NY connects to a web app, which
> asks him what TZ he's in and he picks EST5EDT. The web app sets
> timezone='EST5EDT' and inserts a time of '2007-07-11 12:30:00'.
>
> The database now stores that as '2007-07-11 16:30:00+00'.
>
> No matter WHERE the postgresql server is, that's what it has in it.
>
> Now, I connect from Chicago, and your web app ascertains my timezone
> as CST6CDT. When I ask for the same timestamp, I get '2007-07-11
> 11:30:00-05'.
>
> Now, if you don't set a timezone, then the database will use its own.
> If your postgresql server is in California, then it might have a
> timezone of PST8PDT set. If your web app didn't tell it otherwise,
> then the time would be '2007-07-11 09:30:00-07'.
>
> Note that there are IP lookup tools you can use to determine, or at
> least make a wild guess at, someone's timezone. But you still need to
> let them pick one if you get it wrong. Then you can store that
> timezone in a user profile and set it every time the user uses your web
> app.

In both Windows *and* Unix, "you" set your TZ when you install the system. There are system functions to inquire how you've set it.

Browsers already report back a mountain of client data to the web server. I'd be stunned if FF, IE, Opera, Konq, etc. don't already expose TZ, too.
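Scott's walkthrough above (store in UTC, render per-client zone) can be replayed outside the database. A minimal Python sketch of the same example, assuming America/New_York and America/Chicago as stand-ins for EST5EDT and CST6CDT and that tzdata is available on the host:

```python
# Replays the timestamptz example above in Python: the value is
# normalised to UTC for storage, and each client sees it in its own zone.
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# NY user inserts '2007-07-11 12:30:00' with an Eastern timezone set;
# the server stores the UTC equivalent ('2007-07-11 16:30:00+00'):
ny_local = datetime(2007, 7, 11, 12, 30, tzinfo=ZoneInfo("America/New_York"))
stored_utc = ny_local.astimezone(timezone.utc)
assert stored_utc == datetime(2007, 7, 11, 16, 30, tzinfo=timezone.utc)

# A Chicago client reading the same row sees '2007-07-11 11:30:00-05':
chicago = stored_utc.astimezone(ZoneInfo("America/Chicago"))
assert (chicago.hour, chicago.minute) == (11, 30)
```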
Re: [GENERAL] Time Zone design issues
On 09/10/07 15:21, Alvaro Herrera wrote:
[snip]
> I wouldn't trust the browser's TZ, and you would need a way to
> override it.

Why?
Re: [GENERAL] Time Zone design issues
On 09/10/07 19:50, Tom Lane wrote:
> Ron Johnson <[EMAIL PROTECTED]> writes:
>> On 09/10/07 15:21, Alvaro Herrera wrote:
>>> I wouldn't trust the browser's TZ, and you would need a way to
>>> override it.
>
>> Why?
>
> The browser may not know the setting, or may not tell it to you,
> or you might not be able to make any sense of what it says
> (timezone names are hardly standardized).

Well, that's true. Except for numeric offsets.

> This whole sub-thread actually is predicated on an assumption not
> in evidence, which is that there is any browser anywhere that will
> tell the http server timezone information. I'm quite sure no such
> thing is required by the http standard.

I'm really surprised.
Re: [GENERAL] Hardware recommendation: which is best
On 09/11/07 07:55, Phoenix Kiula wrote:
> On 11/09/2007, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
>> It depends what you want to do with your database.
>>
>> Do you have many reads (select) or a lot of writes (update, insert)?
>
> This one will be a hugely INSERT thing, very low on UPDATEs. The
> INSERTs will have many TEXT fields as they are free-form data. So the
> database will grow very fast.

Size will grow pretty fast too. 15000 rows/day times 365 days = 5,475,000 rows. How big are these rows? *That* is the crucial question.

>> You should use a hardware raid controller with battery backup write cache
>> (write cache should be greater than 256 MB).
>
> I'll have a raid controller in both scenarios, but which RAID should
> be better: RAID1 or RAID10?

The striping aspect of RAID10 makes sequential reads and writes and large writes much faster. The more spindles you have, the faster it is. If you are *really* concerned about speed: 4 x 147GB 10K SCSI disks in RAID10.

>> How much memory do you have?
>
> 4GB to begin with..
>
>> How big is your database, tables...?
>
> Huge, as the two main tables will each have about ten TEXT columns.
> They will have about 15000 new entries every day, which is quite
> a load, so I believe we will have to partition it at least by month,
> but even so it will grow at a huge pace.

15000 inserts in an 8-hour window is 31.25 inserts/minute, or ~2 seconds/insert. If the records are 30MB each, then that could cause some stress on the system in that 8-hour window. If they are 3MB each, not a chance.

> While we are at it, would postgres be any different in performance
> across a single-CPU Quad Core Xeon with a dual-CPU dual-core AMD
> Opteron? Or should the hard disk and RAM be the major considerations
> as usually proposed?

Opteron is the standard answer.

What is your backup/recovery strategy?
Re: [GENERAL] Hardware recommendation: which is best
On 09/11/07 11:26, Phoenix Kiula wrote:
> Thanks Greg.
>
>> You're not going to get a particularly useful answer here without giving
>> some specifics about the two disk controllers you're comparing, how much
>> cache they have, and whether they include a battery backup.
>> [snip]
>
> Would appreciate any tips. From these two, Scenario 1 looks marginally
> better to me. I am requesting further information about cache and
> battery-powered backup, but would appreciate first-off thoughts based
> on the above info.

How (on average) large are the records you need to insert, and how evenly spread across the 24-hour day do the inserts occur?
Re: [GENERAL] Hardware recommendation: which is best
On 09/11/07 12:02, Phoenix Kiula wrote:
> On 12/09/2007, Ron Johnson <[EMAIL PROTECTED]> wrote:
>> How (on average) large are the records you need to insert, and how
>> evenly spread across the 24 hour day do the inserts occur?
>
> There will be around 15,000 inserts in a day. Each insert will have
> several TEXT columns, so it is difficult to predict, but about 30,000
> to 100,000 characters in each row. And yes, the inserts will be very
> consistently timed every day.

15,000 rows/day * 100,000 bytes/row = 1.5GB/day. 1.5GB / (24*60) = ~1 binary MB per minute. Any computer that can't write 1 megabyte per minute should have been retired in 1970.

So, unless there's something that you aren't telling us, this should be handleable by a Wal-Mart Special with an extra-large disk in it.
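The back-of-the-envelope arithmetic above can be checked mechanically (using the poster's upper-bound estimate of ~100,000 characters per row, treated as one byte per character):

```python
# Sanity-check of the throughput arithmetic: 15,000 rows/day at the
# worst-case row size quoted in the thread.
rows_per_day = 15_000
bytes_per_row = 100_000            # upper bound from the thread

daily_bytes = rows_per_day * bytes_per_row
assert daily_bytes == 1_500_000_000          # ~1.5 GB/day

per_minute = daily_bytes / (24 * 60)         # spread over a full day
assert round(per_minute / 2**20, 1) == 1.0   # ~1 binary MB per minute
```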
Re: [GENERAL] Database/Table Design for Global Country Statistics
On 09/12/07 03:28, Stefan Schwarzer wrote:
[snip]
> The new design would be like this:
>
>  id | year | value
> ----+------+-------
>   1 | 1970 | NULL
>   1 | 1971 |   36
>  .. |  ... |  ...
>   1 | 2005 |   45
>   2 | 1970 | NULL
>  .. |  ... |  ...
>
> Would that be considered as "good table design" then?

What Richard says, plus:

1. The PK of New_Design should be country_id/year.
2. You also should not store the records where "value is NULL". A left outer join would handle that.
Re: [GENERAL] Tablespaces on tertiary media
On 09/14/07 04:06, Mark Morgan Lloyd wrote:
> Where does PostgreSQL stand with storing /really/ large amounts of data
> offline? Specifically, if a FUSE is used to move a tablespace to
> something like a tape archiver, can the planner be warned that access
> might take an extended period?
>
> I know that at one point (v6?) there were hooks in the code for
> experimental Berkeley code to do this sort of thing, but as far as I know
> there has never been anything publicly available.

While tertiary media certainly was relevant 10 years ago, is it really necessary in 2007? A couple of MSA-1000s stuffed with 1TB disks would hold an l-o-t *lot* of historical data.