Re: [GENERAL] Geographic High-Availability/Replication
Markus Schiltknecht <[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> Gregory Stark wrote:
> > Only if your application is single-threaded. By single-threaded I don't
> > refer to operating system threads but to the architecture. If you're
> > processing a large batch file, handling records one by one and waiting
> > for each commit before proceeding, then it's single-threaded. If you
> > have a hundred independent clients on separate connections doing
> > separate things, then each one of them could get 6tps. Which you have
> > will depend on your application and your needs; it may not be something
> > you can change.
>
> Correct.
>
> Plus, as in the implementation of Postgres-R, performance is *not* bound
> to the slowest node. Instead, every node can process transactions at
> its own speed. Slower nodes might then have to queue transactions from
> the faster ones until they catch up again.

I'm curious as to how Postgres-R would handle a situation where the
constant throughput exceeded the processing speed of one of the nodes.
I can see your system working if it's just spike loads and the slow
nodes can catch up during slow periods, but I'm wondering about the
scenarios where an admin has underestimated the hardware requirements
and one or more nodes is unable to keep up.

Just musing, really.

--
Bill Moran
http://www.potentialtech.com
Re: [GENERAL] SQL Diff ?
On 8/26/07, Kevin Kempter <[EMAIL PROTECTED]> wrote:
> On Saturday 25 August 2007 21:10:19 Ron Johnson wrote:
> > On 08/25/07 21:51, Kevin Kempter wrote:
> > > Hi List;
> > >
> > > I have a very large table (52million rows) - I'm creating a copy of it to
> > > rid it of 35G worth of dead space, then I'll do a sync, drop the original
> > > table and rename table2.
> >
> > What is your definition of "dead space"?
> >
> > Bad rows, duplicate rows, old rows?  Something else?
>
> deleted rows that should have been cleaned up with vacuum, problem is the
> client let it go so long that now I can't get a vacuum to finish cause it
> impacts the day2day operations too much. Long story, see my recent questions
> on the performance list for more info.

In your place I would do something like Slony-I does when it replicates
the tables. Create ON INSERT/UPDATE/DELETE triggers on table1 which will
log operations on table1 to some table1_log table. Then copy table1 to
table2. Then replay table1_log on table2, then
BEGIN; LOCK table1; finish replaying the log; DROP TABLE table1;
ALTER TABLE ... RENAME ...; COMMIT;

Or perhaps actually use Slony-I for the above steps? Should work quite
nicely... Or perhaps use SkyTools for it (I've never used it)?

Regards,
Dawid
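For illustration, a rough sketch of that trigger-logging approach. This is
not the poster's code: the log table, function and trigger names are made
up, and it assumes table1 has an integer primary key "id".

  -- Hypothetical sketch; assumes table1 has an integer primary key "id".
  CREATE TABLE table1_log (op char(1), id integer, logged_at timestamptz DEFAULT now());

  CREATE OR REPLACE FUNCTION log_table1_changes() RETURNS trigger AS $$
  BEGIN
      IF TG_OP = 'DELETE' THEN
          INSERT INTO table1_log (op, id) VALUES ('D', OLD.id);
          RETURN OLD;
      END IF;
      INSERT INTO table1_log (op, id) VALUES (substr(TG_OP, 1, 1), NEW.id);
      RETURN NEW;
  END;
  $$ LANGUAGE plpgsql;

  CREATE TRIGGER table1_changes
      AFTER INSERT OR UPDATE OR DELETE ON table1
      FOR EACH ROW EXECUTE PROCEDURE log_table1_changes();

  -- Later, after table2 has been populated from table1:
  BEGIN;
  LOCK TABLE table1 IN EXCLUSIVE MODE;
  -- drop every row whose key appears in the log, then re-copy the current
  -- versions of the rows that still exist in table1
  DELETE FROM table2 WHERE id IN (SELECT id FROM table1_log);
  INSERT INTO table2
  SELECT t.* FROM table1 t
   WHERE t.id IN (SELECT id FROM table1_log WHERE op IN ('I', 'U'));
  DROP TABLE table1;
  ALTER TABLE table2 RENAME TO table1;
  COMMIT;

Because only the keys are logged, the replay simply re-reads the current
rows from table1 under the lock, which keeps the logic simple at the cost
of one extra pass over the changed keys.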
Re: [GENERAL] FATAL: could not reattach to shared memory (Win32)
On 8/24/07, Tom Lane <[EMAIL PROTECTED]> wrote:
> "Trevor Talbot" <[EMAIL PROTECTED]> writes:
> > On 8/23/07, Magnus Hagander <[EMAIL PROTECTED]> wrote:
> >> Not that wild a guess, really :-) I'd say it's a very good possibility -
> >> but I have no idea why it'd do that, since all backends load the same
> >> DLLs at that stage.
>
> > Not a valid assumption; you can't rely on consistent VM space among
> > multiple [non-cloned] processes without a serious amount of effort.
>
> I'm not sure if you have a specific technical meaning of "clone" in mind
> here, but these processes are all executing the identical executable,
> and taking care to map the shmem early in execution *before* they load
> any DLLs.  So it should work.  Apparently, it *does* work for awhile for
> the OP, and then stops working, which is even odder.

"Clone" in the same sense as fork(): duplicating a process instead of
regenerating it.

Even ignoring things like DLL replacement and LD_PRELOAD-style options,
there's still a lot of opportunity for dynamic behavior. All DLLs have an
initialization routine called by the loader (and on thread creation),
which tends to be used to set up things you don't want the caller to have
to explicitly initialize. DLLs that maintain global state they share with
copies of themselves in other processes can set up shared memory etc. to
do that. They can easily change their behavior based on the environment
at the time of process start. There are also all the hooks for extension
points, such as Winsock LSPs. Most such things happen only after an
explicit initialization (e.g. WSAStartup() or socket creation in the
Winsock case), but between the C runtime and third-party libraries, it
may be happening when you don't expect it.

All that said, I don't actually have a real-world example of process VM
layout changing like this, especially since you are mapping it early to
avoid this very problem. I'd love to find out exactly what's going on in
Terry's case, but I haven't come up with a good way to do it that doesn't
disturb his production environment.

> If you've got a specific suggestion for making it more reliable,
> we're all ears.

To elaborate on what I said earlier, internal_forkexec() creates the
process suspended; while it has an execution environment set up, the
loader hasn't done all the DLL linking and initialization yet, so the
address space is relatively untouched. At that point you could use
VirtualAllocEx() to reserve VM space for the shared memory at the right
address, and proceed with the rest of the setup. When the new backend
starts up, it would then VirtualFree() that space immediately before
calling MapViewOfFileEx() on it.

I can probably set up with the 8.3 tree and MSVC to create an artificial
failure, and play with the above as a fix, but I'm not quite sure when
that will be. There's still the issue of verifying that this is the
problem on Terry's machine, and figuring out a fix for him.

On 8/24/07, Terry Yapt <[EMAIL PROTECTED]> wrote:
> Yes, the Windows system log (application log section) doesn't show any
> error for several days. Then the errors suddenly come back to life and
> repeat every so often. The errors disappear again and return a few hours
> later. After a few hours the system goes down.
>
> Curiosity:
> ==========
> On the log lines I have and I sent to the list:
>   FATAL: could not reattach to shared memory (key=5432001, addr=01D8):
>   Invalid argument
> this one, "addr=01D8", is always the same in spite of the system having
> been shut down and restarted, or the error having been gone for days.

The environment is consistent then. Whatever is going on, when postgres
first starts things are normal; something just changes later, and the
change is temporary.

As vague guides, I would look at some kind of global resource
usage/tracking, and scheduled tasks. Do you see any patterns about WHEN
this happens? During high load periods? Any antivirus or other security
type tasks running on the machine? Any third-party VPN type software?
Fast User Switching or Remote Desktop use?
Re: [GENERAL] SQL Diff ?
On Saturday 25 August 2007 23:49:39 Ron Johnson wrote:
> On 08/25/07 22:21, Kevin Kempter wrote:
> > On Saturday 25 August 2007 21:10:19 Ron Johnson wrote:
> >> On 08/25/07 21:51, Kevin Kempter wrote:
> >>> Hi List;
> >>>
> >>> I have a very large table (52million rows) - I'm creating a copy of it
> >>> to rid it of 35G worth of dead space, then I'll do a sync, drop the
> >>> original table and rename table2.
> >>
> >> What is your definition of "dead space"?
> >>
> >> Bad rows, duplicate rows, old rows?  Something else?
> >
> > deleted rows that should have been cleaned up with vacuum, problem is the
> > client let it go so long that now I can't get a vacuum to finish cause it
> > impacts the day2day operations too much. Long story, see my recent
> > questions on the performance list for more info.
>
> OK.
>
> >>> Once I have the table2 as a copy of table1 what's the best way to
> >>> select all rows that have been changed or modified in table1 since the
> >>> initial load from table1 into table2?
>
> Is this a 24x7 database?

Yes, with little room for extra overhead.

> >>> Also I'll need to delete any rows in table2 that no longer remain in
> >>> table1.
> >>>
> >>> There is no change date column.
> >>> I could do something like select * from table1 where col1 || col2 ||
> >>> col3 etc not in (select col1 || col2 || col3 etc from table2)
> >>>
> >>> but this would be inefficient & slow.
> >>>
> >>> Anyone have a suggestion to do this in an efficient manner?
> >>>
> >>> Thanks in advance
Re: [GENERAL] problem Linking a TTable component to a pgsql view using BCB5
> Does the table have a unique index/primary key?

The view shows fields from two tables. One of the primary keys of one of
the tables is shown by the view.

Thanks,

JLoz
[GENERAL] simple query runs 26 seconds
I have a W2K server and a relatively small database containing all the
required indexes, and I need to sum only a few records. My query takes 26
seconds to run. How can I fix this?

Andrus.

explain analyze
select sum(taitmata) as ukogus
from rid
join dok using (dokumnr)
where toode = 'NE TR' and doktyyp = 'U'

returns

"Aggregate  (cost=47581.25..47581.26 rows=1 width=8) (actual time=26462.178..26462.178 rows=1 loops=1)"
"  ->  Hash Join  (cost=1175.91..47579.54 rows=684 width=8) (actual time=26462.002..26462.108 rows=7 loops=1)"
"        Hash Cond: (rid.dokumnr = dok.dokumnr)"
"        ->  Bitmap Heap Scan on rid  (cost=673.35..46946.99 rows=19703 width=12) (actual time=19.864..26430.100 rows=21335 loops=1)"
"              Recheck Cond: (toode = 'NE TR'::bpchar)"
"              ->  Bitmap Index Scan on rid_toode_idx  (cost=0.00..668.42 rows=19703 width=0) (actual time=12.402..12.402 rows=21335 loops=1)"
"                    Index Cond: (toode = 'NE TR'::bpchar)"
"        ->  Hash  (cost=458.47..458.47 rows=3528 width=4) (actual time=0.141..0.141 rows=6 loops=1)"
"              ->  Index Scan using dok_tasudok_unique_idx on dok  (cost=0.00..458.47 rows=3528 width=4) (actual time=0.057..0.125 rows=6 loops=1)"
"                    Index Cond: (doktyyp = 'U'::bpchar)"
"Total runtime: 26462.551 ms"

using

"PostgreSQL 8.2.4 on i686-pc-mingw32, compiled by GCC gcc.exe (GCC) 3.4.2 (mingw-special)"
Re: [GENERAL] reporting tools
Use www.fyireporting.com. It is open source, uses the excellent
PostgreSQL Npgsql drivers, and uses the standard RDL report format.

Andrus.

"Geoffrey" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
> We are looking for a reporting tool that will enable users to generate
> their own reports.  Something like Crystal Reports.
>
> Anyone using something like this with Postgresql?
>
> --
> Until later, Geoffrey
>
> Those who would give up essential Liberty, to purchase a little
> temporary Safety, deserve neither Liberty nor Safety.
>  - Benjamin Franklin
Re: [GENERAL] FATAL: could not reattach to shared memory (Win32)
Trevor Talbot wrote:
> The environment is consistent then. Whatever is going on, when postgres
> first starts things are normal; something just changes later, and the
> change is temporary.
>
> As vague guides, I would look at some kind of global resource
> usage/tracking, and scheduled tasks. Do you see any patterns about WHEN
> this happens? During high load periods? Any antivirus or other security
> type tasks running on the machine? Any third-party VPN type software?
> Fast User Switching or Remote Desktop use?

I have spent a lot of time looking for patterns in the system logs,
Apache logs, Postgres logs, etc., and I have not found any conclusive
clue. All I can say is that I get this kind of error in the PostgreSQL
logs:

'2007-08-21 15:19:21 ERROR:  could not open relation 16692/16694/17295: Invalid argument'

The next log line(s) show the statement involved. But that statement runs
fine and gives the right results when I copy and paste it into any SQL
editor connected to that database. Those errors are not linked in time
with the FATAL errors we are discussing in this thread.

I am trying to get the opportunity to migrate that database to another
server and use the current server to test anything we want, but the
customer is reluctant to hand it over for a try-and-test process because
it is also their mail and web server. :-(

Although the server is remote, far away from my location, I have a
console (UltraVNC) to it if you need to look for something.

Greetings.
Re: [GENERAL] Geographic High-Availability/Replication
Hi,

Bill Moran wrote:
> I'm curious as to how Postgres-R would handle a situation where the
> constant throughput exceeded the processing speed of one of the nodes.

Well, what do you expect to happen? This case is easily detectable, but I
can only see two possible solutions: either stop the node which is too
slow, or stop accepting new transactions for a while.

This technique is not meant to allow nodes to lag behind by several
thousand transactions - that is better avoided. Rather, it's meant to
decrease the commit delay necessary for synchronous replication.

> I can see your system working if it's just spike loads and the slow
> nodes can catch up during slow periods, but I'm wondering about the
> scenarios where an admin has underestimated the hardware requirements
> and one or more nodes is unable to keep up.

Please keep in mind that replication per se does not speed up your
database; rather, it adds a layer of reliability, which *costs* some
performance. To increase the transactional throughput you would need to
add partitioning to the mix. Or you could try to make use of the gained
reliability and abandon WAL - you won't need it as long as at least one
replica is running - which should increase the single node's throughput
and therefore the cluster's throughput, too.

When replication meets partitioning and load balancing, you get into a
whole new world, where new trade-offs need to be considered. Some look
similar to those with RAID storage - probably Sequoia's term RAIDb isn't
bad at all.

Regards

Markus
Re: [GENERAL] simple query runs 26 seconds
"Andrus" <[EMAIL PROTECTED]> writes: > My query takes 26 seconds to run. The time seems entirely spent in fetching rows from table "rid". Perhaps that table is bloated by lack of vacuuming --- can you show the output from "vacuum verbose rid"? regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] problem Linking a TTable component to a pgsql view using BCB5
On 8/23/07, JLoz <[EMAIL PROTECTED]> wrote:
> > Does the table have a unique index/primary key?
>
> The view shows fields from two tables. One of the primary keys of one of
> the tables is shown by the view.

You will probably have better luck with the TQuery component. Also, you
should try out another TDataset implementation... for example Zeos. The
BDE is old, slow, and buggy.

merlin
[GENERAL] Restore v. Running COPY/INDEX separately
Hi,

So, I built my tables which contain a TSearch2 field by:

1. Create table without indexes
2. COPY data into table
3. ALTER TABLE tblMessages ADD COLUMN idxFTI tsvector;
4. UPDATE tblMessages SET idxFTI=to_tsvector('default', strMessage);
5. Index all the fields including the TSearch2 field

The process takes several days. In contrast, if I back up the table and
restore it to a new table, it takes a fraction of the time of running the
above operations manually. I am building my indexes at the end, but I
think step 4 may be causing unneeded overhead. Can I somehow just copy
data into the idxFTI field during the COPY process? Is there anything
else I can do to get my loading process to perform similarly to
backup/restore? Does pg_dump also dump the indexes? That would explain
why it is so much faster...

Benjamin
Re: [GENERAL] Bigtime scaling of Postgresql (cluster and stuff I suppose)
On Sat, Aug 25, 2007 at 10:18:25AM -0400, Bill Moran wrote:
> "Phoenix Kiula" <[EMAIL PROTECTED]> wrote:
> >
> > We're moving from MySQL to PG, a move I am rather enjoying, but
> > we're currently running both databases. As we web-enable our
> > financial services in fifteen countries, I would like to recommend
> > to the team that we move entirely to PG.
> >
> > In doing research on big installations of the two databases, I
> > read this from a MySQL senior exec on Slashdot:
>
> Senior MySQL exec means this is a marketing blurb, which means it's
> exaggerated, lacking any honest assessment of challenges and
> difficulties, and possibly an outright lie.  I've no doubt that
> MySQL can do clusters if you know what you're doing, but if you want
> the facts, you're going to have to look deeper than that obviously
> biased quote.  I seem to remember a forum thread with someone having
> considerable difficulty with MySQL cluster, and actual MySQL
> employees jumping in to try to help and no solution ever found.
> Anyone have that link lying around?

I think this is the one.

http://forums.mysql.com/read.php?25,93181,93181

Cheers,
David.
--
David Fetter <[EMAIL PROTECTED]> http://fetter.org/
phone: +1 415 235 3778    AIM: dfetter666    Skype: davidfetter

Remember to vote!
Consider donating to PostgreSQL: http://www.postgresql.org/about/donate
Re: [GENERAL] Restore v. Running COPY/INDEX separately
On Sun, 26 Aug 2007, Benjamin Arai wrote:

> Hi,
>
> So, I built my tables which contains a TSearch2 field by
>
> 1. Create table without indexes
> 2. COPY data into table
> 3. ALTER TABLE tblMessages ADD COLUMN idxFTI tsvector;
> 4. UPDATE tblMessages SET idxFTI=to_tsvector('default', strMessage);

vacuum here

> 5. Index all the fields including the TSearch2 field
>
> The process takes several days. In contrast, if I backup the table and
> restore it to a new table it takes a fraction of the time as running the
> above operation manually. I am building my indexes at the end but I think
> the step 4 may be causing uneeded overhead. Can I somehow just copy data
> into the idxFTI field during the copy process? Is there anything else I
> can do to get my loading process to perform similar to backup/restore?
> Does pg_dump also dump the indexes? That would explain why it is so much
> faster...
>
> Benjamin

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
Re: [GENERAL] Restore v. Running COPY/INDEX separately
"Oleg Bartunov" <[EMAIL PROTECTED]> writes: > On Sun, 26 Aug 2007, Benjamin Arai wrote: > >> Hi, >> >> So, I built my tables which contains a TSearch2 field by >> >> 1. Create table without indexes >> 2. COPY data into table >> 3. ALTER TABLE tblMessages ADD COLUMN idxFTI tsvector; >> 4. UPDATE tblMessages SET idxFTI=to_tsvector('default', strMessage); > > vacuum here Or you could do something tricky and do the update like this which would avoid the need to vacuum: ALTER TABLE tblMessages ALTER COLUMN idxFTI TYPE tsvector USING to_tsvector('default, strMesage); This only works because ALTER TABLE rewrites the table from scratch any time it does an operation like this. Don't try this if there are transactions working against the table at the same time (such as a pg_dump!). Or you could set up a trigger to generate the tsvector when you first load the data instead of adding it later. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] SQL Diff ?
On Aug 26, 2007, at 9:02 AM, Dawid Kuroczko wrote:
> On 8/26/07, Kevin Kempter <[EMAIL PROTECTED]> wrote:
> > On Saturday 25 August 2007 21:10:19 Ron Johnson wrote:
> > > On 08/25/07 21:51, Kevin Kempter wrote:
> > > > Hi List;
> > > >
> > > > I have a very large table (52million rows) - I'm creating a copy of
> > > > it to rid it of 35G worth of dead space, then I'll do a sync, drop
> > > > the original table and rename table2.
> > >
> > > What is your definition of "dead space"?
> > >
> > > Bad rows, duplicate rows, old rows?  Something else?
> >
> > deleted rows that should have been cleaned up with vacuum, problem is
> > the client let it go so long that now I can't get a vacuum to finish
> > cause it impacts the day2day operations too much. Long story, see my
> > recent questions on the performance list for more info.
>
> In your place I would do something like Slony-I does when it replicates
> the tables. Create ON INSERT/UPDATE/DELETE triggers on table1 which will
> log operations on table1 to some table1_log table. Then copy table1 to
> table2. Then replay table1_log on table2, then
> BEGIN; LOCK table1; finish replaying the log; DROP TABLE table1;
> ALTER TABLE ... RENAME ...; COMMIT;
>
> Or perhaps actually use Slony-I for the above steps? Should work quite
> nicely... Or perhaps use SkyTools for it (I've never used it)?

Yeah, for trigger-based replication it'd be simpler to just use Slony-I
or SkyTools. However, if you're on 8.2, with row-wise comparisons, you
could do something like:

begin;
lock table1;
insert into table2 select * from table1 where id not in (select id from table2);
drop table table1;
alter table table2 rename to table1;
commit;

Here id is your primary key. Note that if your ids are generated by a
sequence you'll need to use setval on the sequence to get it "caught up"
before that commit or you'll get duplicate key errors immediately.

Erik Jones

Software Developer | Emma®
[EMAIL PROTECTED]
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com
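For the sequence catch-up Erik mentions, a one-liner along these lines
would do it; the sequence name here is hypothetical - use whichever
sequence actually feeds the id column of the surviving table:

  -- hypothetical sequence name; run before COMMIT
  SELECT setval('table1_id_seq', (SELECT max(id) FROM table2));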
Re: [GENERAL] Restore v. Running COPY/INDEX separately
Gregory Stark <[EMAIL PROTECTED]> writes:
>> On Sun, 26 Aug 2007, Benjamin Arai wrote:
>>> So, I built my tables which contains a TSearch2 field by
>>> 1. Create table without indexes
>>> 2. COPY data into table
>>> 3. ALTER TABLE tblMessages ADD COLUMN idxFTI tsvector;
>>> 4. UPDATE tblMessages SET idxFTI=to_tsvector('default', strMessage);

> Or you could set up a trigger to generate the tsvector when you first
> load the data instead of adding it later.

You're going to want such a trigger anyway, so installing it before the
COPY step seems like the Obviously Right Thing. Any other approach
implies rewriting the entire table after you've loaded it, with no
compensating advantage that I can see.

regards, tom lane
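Since the thread is using the contrib tsearch2 module (judging by the
to_tsvector('default', ...) calls), the trigger Greg and Tom describe
would look roughly like this - a sketch only, not posted in the thread,
with a made-up trigger name:

  -- Sketch: assumes the tsearch2 contrib module is installed; the trigger
  -- fills idxFTI from strMessage using the current (default) configuration.
  ALTER TABLE tblMessages ADD COLUMN idxFTI tsvector;

  CREATE TRIGGER tsvectorupdate
      BEFORE INSERT OR UPDATE ON tblMessages
      FOR EACH ROW EXECUTE PROCEDURE tsearch2(idxFTI, strMessage);

  -- COPY the data in afterwards; idxFTI is filled row by row as it loads,
  -- so no table-wide UPDATE (and no extra vacuum) is needed.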
Re: [GENERAL] Insert or Replace or \copy (bulkload)
On Tue, 2007-08-14 at 10:16 -0500, Scott Marlowe wrote:
> On 8/14/07, Ow Mun Heng <[EMAIL PROTECTED]> wrote:
> > I'm seeing an obstacle in my aim to migrate from MySQL to PG, mainly in
> > the manner in which PG handles duplicate entries, either from primary
> > keys or unique constraints.
> >
> > Data is taken from Perl DBI into (right now) CSV-based files to be
> > loaded into the table via psql's \copy command.
> >
> > In MySQL, I was using mysqlimport --replace, which essentially provided
> > the means to load data into the DB while providing the necessary logic
> > to replace the entire row if there was a duplicate, instead of dying.
> >
> > Under PG, I've yet to find something similar to this functionality, and
> > searches via Google have uncovered that this is one thing which _still_
> > has not found its way into PG. (Anyone know why? Standards?)
> >
> > Anyway, I found a workaround, but, to me, even though it provides a
> > means to an end, it still looks like it'll end up as a maintenance
> > nightmare each time a table has any additional columns added.
> >
> > The solution is taken from this site:
> >
> > http://www.pointwriter.com/blog/index.php?/archives/6-REPLACE-in-PostgreSQL.html
>
> Example code snipped for brevity
>
> > Can anyone tell me if this won't turn out to be a maintenance nightmare?
> > So, the pertinent question is, is there a better mousetrap available?
>
> I don't see why it would be a maintenance nightmare.  Looks pretty
> much like you just create it and go.  Once it's in place it should
> just work.  There are other ways to skin this particular cat, but that
> one seems as good as any.

I just ran into trouble with this. The rule seems to work when I do
simple inserts, but since what I will be doing is \copy bulk loads, it
balks and fails.

Is there a suitable replacement rule which can be used for this purpose
that will work for bulk loads? Now would be a good time to teach me how
to skin the cat differently.
Re: [GENERAL] Restore v. Running COPY/INDEX separately
Tom Lane wrote:
> Gregory Stark <[EMAIL PROTECTED]> writes:
> >> On Sun, 26 Aug 2007, Benjamin Arai wrote:
> >>> So, I built my tables which contains a TSearch2 field by
> >>> 1. Create table without indexes
> >>> 2. COPY data into table
> >>> 3. ALTER TABLE tblMessages ADD COLUMN idxFTI tsvector;
> >>> 4. UPDATE tblMessages SET idxFTI=to_tsvector('default', strMessage);
>
> > Or you could set up a trigger to generate the tsvector when you first
> > load the data instead of adding it later.
>
> You're going to want such a trigger anyway, so installing it before the
> COPY step seems like the Obviously Right Thing.  Any other approach
> implies rewriting the entire table after you've loaded it, with no
> compensating advantage that I can see.

Isn't the main speed advantage of the dump the fact that the
to_tsvector() results already come in the COPY data? The dump already
comes with the idxFTI column contents, instead of having to generate it
from scratch. That would depend on how expensive that function is, of
course.

--
Alvaro Herrera       http://www.amazon.com/gp/registry/DXLWNGRJD34J
"PHP is what I call the "Dumb Monkey" language. [A]ny dumb monkey can code
something in PHP. Python takes actual thought to produce something useful."
(J. Drake)
\copy ignoring Rules Was [Re: [GENERAL] Insert or Replace or \copy (bulkload)]
On Mon, 2007-08-27 at 11:55 +0800, Ow Mun Heng wrote:
> I just ran into trouble with this. The rule seems to work when I do
> simple inserts, but since what I will be doing is \copy bulk loads, it
> balks and fails.

> Now would be a good time to teach me how to skin the cat differently.

Based on the archives, it seems this was raised as BUG 2437 close to a
year ago and was marked as "correct behaviour" (rules are not applied to
COPY). The suggested alternatives are to either

1. Use pgloader (downloaded it, but I have not gone in depth to see if it
   suits my needs), or
2. COPY into a temp table, and then perform an INSERT INTO from it, as in
   the example below.

=> create table foo (a int, b int, c int, d varchar(1), e bigint);
=> alter table foo add primary key (a,b,c,e);
=> insert into foo values(1,1,1,'A',1);
=> create rule replace_foo as on insert to foo
   where (exists(select 1 from foo
                  where foo.a = new.a and foo.b = new.b
                    and foo.c = new.c and foo.e = new.e))
   do instead
   update foo set d = new.d
    where foo.a = new.a and foo.b = new.b
      and foo.c = new.c and foo.e = new.e;

=> \d foo
          Table "xmms.foo"
 Column |         Type         | Modifiers
--------+----------------------+-----------
 a      | integer              | not null
 b      | integer              | not null
 c      | integer              | not null
 d      | character varying(1) |
 e      | bigint               | not null
Indexes:
    "foo_pkey" PRIMARY KEY, btree (a, b, c, e)
Rules:
    replace_foo AS
    ON INSERT TO foo
   WHERE (EXISTS ( SELECT 1
           FROM foo
          WHERE foo.a = new.a AND foo.b = new.b AND foo.c = new.c AND foo.e = new.e)) DO INSTEAD
    UPDATE foo SET d = new.d
   WHERE foo.a = new.a AND foo.b = new.b AND foo.c = new.c AND foo.e = new.e

=> create table temp_table_foo as select * from foo where 1=0;

=> select * from foo;
 a | b | c | d | e
---+---+---+---+---
 1 | 1 | 1 | A | 1

=> insert into temp_table_foo values(1,1,1,'Z',1);
=> insert into foo select * from temp_table_foo;
=> select * from foo;
 a | b | c | d | e
---+---+---+---+---
 1 | 1 | 1 | Z | 1

=> truncate table foo

Seems like an awful lot of steps, and it may make the \copy process even
longer than it should be. Is there any other method? It seems to me
pgloader also does more or less the same thing. What about pg_bulkload?
Can anyone provide any new insights?

Thanks...

http://archives.postgresql.org/pgsql-bugs/2006-05/msg00073.php
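For what it's worth, a rough sketch of option 2 (the staging-table merge)
that avoids the rule entirely. It reuses the foo example above; the file
name is a placeholder and it assumes the same (a,b,c,e) key:

  -- Load the bulk data into a staging table, then merge it into foo.
  CREATE TEMP TABLE staging_foo AS SELECT * FROM foo WHERE 1 = 0;
  \copy staging_foo from 'data.csv' with csv

  BEGIN;
  -- Replace rows that already exist...
  UPDATE foo SET d = s.d
    FROM staging_foo s
   WHERE foo.a = s.a AND foo.b = s.b AND foo.c = s.c AND foo.e = s.e;
  -- ...and add the ones that don't.
  INSERT INTO foo
  SELECT s.*
    FROM staging_foo s
   WHERE NOT EXISTS (SELECT 1 FROM foo
                      WHERE foo.a = s.a AND foo.b = s.b
                        AND foo.c = s.c AND foo.e = s.e);
  COMMIT;

This keeps \copy itself dumb and fast, and the replace logic lives in two
ordinary statements instead of a rule, so adding a column to foo only
means touching the UPDATE's SET list.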
Re: [GENERAL] Restore v. Running COPY/INDEX separately
Why is a trigger faster than doing an ALTER after the table is created? I
thought a trigger would be slower because it would be invoked on every
iteration (each new row inserted) during the COPY process.

Benjamin

On Aug 26, 2007, at 8:43 PM, Tom Lane wrote:
> Gregory Stark <[EMAIL PROTECTED]> writes:
> > On Sun, 26 Aug 2007, Benjamin Arai wrote:
> >> So, I built my tables which contains a TSearch2 field by
> >> 1. Create table without indexes
> >> 2. COPY data into table
> >> 3. ALTER TABLE tblMessages ADD COLUMN idxFTI tsvector;
> >> 4. UPDATE tblMessages SET idxFTI=to_tsvector('default', strMessage);
>
> > Or you could set up a trigger to generate the tsvector when you first
> > load the data instead of adding it later.
>
> You're going to want such a trigger anyway, so installing it before the
> COPY step seems like the Obviously Right Thing.  Any other approach
> implies rewriting the entire table after you've loaded it, with no
> compensating advantage that I can see.
>
> regards, tom lane