[GENERAL] Autovacuum keeps vacuuming a table disabled in pg_autovacuum

2007-06-01 Thread Csaba Nagy
Hi all,

System: postgres 8.1.3p, not a released version but a CVS checkout
somewhere after 8.1.3 was released.

I have a table (a quite big one: 2 columns, 2166381 relpages and 364447136
reltuples as reported by pg_class) which is mostly inserted into and very
rarely deleted from.

I also have a primary key on the combination of the 2 columns with the
corresponding PK index.

Initially I left this table out of the pg_autovacuum table because, as I
said, it is never updated and rarely deleted from, and I have a weekly
complete DB vacuum scheduled which should theoretically be enough, so
autovacuum should never touch it except maybe for analyzing, which is OK.

Recently, however, I discovered that my weekly DB-wide vacuum started to
fail due to an error while vacuuming this table:

ERROR:  failed to re-find parent key in "pk_table"

This turned out to be a problem for autovacuum too, which spent lots of
time vacuuming this table only to (presumably) fail and start all over
the next time (I can't otherwise explain the table bloat going on on
this DB box, which translates into slow operation).

I think I have seen bug reports on this list with similar errors, and I
guess it is fixed in 8.2. I have planned a move to 8.2 anyway next month,
so I thought I would just take this table out of autovacuum's list.

So I did:

insert into pg_autovacuum (
    vacrelid, enabled,
    vac_base_thresh, vac_scale_factor,
    anl_base_thresh, anl_scale_factor,
    vac_cost_delay, vac_cost_limit
) values (
    (select oid from pg_class where relname = 'table_name'),
    false,        -- disable autovacuum for this relation
    5000, 0.2,
    1, 0.2,
    -1, -1        -- use the default cost delay/limit
);

Which indeed inserted one row in pg_autovacuum.

Trouble: the autovacuum daemon is still picking that table and vacuuming
it... how do I know? Like so:

select l.pid, c.relname
  from pg_locks l, pg_class c
 where l.pid not in (select distinct procpid from pg_stat_activity)
   and l.relation = c.oid
   and c.relkind = 'r';

  pid  |  relname
-------+------------
 16317 | table_name
(1 row)

[EMAIL PROTECTED]:~$ ps auxww|grep auto
postgres 16317  0.8  5.8 436008 240656 ?  D  13:26   0:55 postgres: autovacuum process   dbname

So the autovacuum process is locking the table, which I interpret as
autovacuum vacuuming it.

Question: is the autovacuum daemon somehow caching its table list? Can I
somehow reset this? I tried killing its process, but that doesn't work;
the next time around it took the table again.

Thanks,
Csaba.





Re: [GENERAL] Autovacuum keeps vacuuming a table disabled in pg_autovacuum

2007-06-01 Thread Csaba Nagy
> If that table has a TOAST table you might need to mark the toast table
> as disabled too.  Or maybe it's forcing a vacuum because it's worried
> about XID wraparound?

OK, that might be the case, as I guess there are tables which were not
successfully vacuumed in the last few months (the DB-wide vacuum never
finished, and this system was stable enough before that I only noticed
this problem recently, when performance started to degrade considerably).
Is there a way to check how imminent XID wraparound is?

Thanks,
Csaba.





Re: [GENERAL] Autovacuum keeps vacuuming a table disabled in pg_autovacuum

2007-06-01 Thread Csaba Nagy
On Fri, 2007-06-01 at 15:58, Alvaro Herrera wrote:
> Try reindexing the table; that should make the problem go away.  Also,
> update to the latest of the 8.1 branch ASAP.

Reindexing won't work: it would mean hours of downtime. I plan to move
the DB to 8.2 via Slony in ~2-3 weeks, and that should take care of the
bloat too.

Regarding the upgrade, I have to use the CVS version because I have some
local patches to apply. Now what is more recommended: use the latest
version on the REL8_1_STABLE branch, or stick with REL8_1_9? I checked
and there are quite a few changed files after REL8_1_9.

Thanks,
Csaba.





Re: [GENERAL] Autovacuum keeps vacuuming a table disabled in pg_autovacuum

2007-06-01 Thread Csaba Nagy
> How would reindexing a table imply hours of downtime?

Simply that all meaningful activities on this system will sooner or later
insert into this table :-)

So, given that we use a connection pool, we end up pretty soon with all
connections waiting for an insert on this table, and at that point
nothing else will work either.

Cheers,
Csaba.




Re: [GENERAL] Autovacuum keeps vacuuming a table disabled in pg_autovacuum

2007-06-01 Thread Csaba Nagy
[snip]
> You don't need to reindex, just update.

I was pretty sure I had seen the error reported, and the fix too; the
thing is relatively harmless apart from the vacuum troubles. And
considering that the table itself does not need vacuuming, it would be
even more harmless if it didn't block other stuff and if XID wraparound
weren't a problem.
 
> HEAD from the branch is certainly better as it contains some extra
> fixes (particularly one for a bug introduced in 8.1.9 which may bite you
> if you just pick that one ... see about bug #3116)

OK, I'll go with head.

Thanks,
Csaba.





Re: [GENERAL] Autovacuum keeps vacuuming a table disabled in pg_autovacuum

2007-06-01 Thread Csaba Nagy
On Fri, 2007-06-01 at 17:40, Alvaro Herrera wrote:
> Huh, why do you say that it doesn't need any vacuuming?

This table is only inserted into and rarely deleted from, so apart from
the XID wraparound issue, vacuuming it is not critical. Of course it will
need a vacuum eventually, but it can survive months without it - or until
XID wraparound occurs, and I can't yet estimate whether that is imminent
or not... how could I check that? (I can probably find the answer if I
look harder, though.)

Cheers,
Csaba.




Re: [GENERAL] Autovacuum keeps vacuuming a table disabled in pg_autovacuum

2007-06-01 Thread Csaba Nagy
> select age(datfrozenxid) from pg_database where datname = 'your database'
> 
> 2 billions and you are screwed.  Autovacuum starts panicking way before
> that, to have enough slack.

dbname=# select age(datfrozenxid) from pg_database where datname = 'dbname';
     age
------------
 1648762992
(1 row)

Sooo... looks like time to quickly upgrade to 8.1 head.

Thanks for the help,
Csaba.





Re: [GENERAL] Excell

2007-06-21 Thread Csaba Nagy
On Thu, 2007-06-21 at 16:45, Scott Marlowe wrote:
> Another option is to use your favorite scripting language and throw an 
> excel header then the data in tab delimited format.  Or even in excel 
> xml format.

Why would you need any scripting language? COPY supports CSV output
pretty well, and it can even put a header at the top. And 8.2 can COPY
from a SELECT too, so you can generate the CSV dump directly from joins
as well... on older systems I usually create a temporary table as the
result of the join and then COPY that - plain SQL is all you need... I
have always done it this way for occasional data exchange with Excel.
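For example, a minimal sketch (file path, table and column names are made
up; COPY TO a server-side file needs superuser, psql's \copy is the
client-side equivalent):

-- 8.2 and later: dump the result of an arbitrary query as CSV with a header
COPY (SELECT o.id, o.total
        FROM orders o JOIN customers c ON c.id = o.customer_id)
  TO '/tmp/report.csv' WITH CSV HEADER;

-- older versions: materialize the join first, then COPY the table
CREATE TEMP TABLE report AS
  SELECT o.id, o.total FROM orders o JOIN customers c ON c.id = o.customer_id;
COPY report TO '/tmp/report.csv' WITH CSV HEADER;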

Now if it is about regularly exchanging data with Excel, possibly using
Excel as a DB interface, then ODBC is probably the only viable choice;
but if the OP really needs a DB for the data, I would say using Excel as
the interface to it is likely a bad decision...

Cheers,
Csaba.





[GENERAL] Database corruption: finding the bad block

2007-07-12 Thread Csaba Nagy
Hi all,

I just had the following error on one of our databases:

ERROR:  could not access status of transaction 1038286848
DETAIL:  could not open file "pg_clog/03DE": No such file or directory

I researched on the mailing list and it looks like the usual suspect is
disk page corruption. There are few previous discussions about how to
dump the suspect disk page, e.g.:

http://archives.postgresql.org/pgsql-general/2006-10/msg01372.php
http://groups.google.com/group/comp.databases.postgresql.hackers/browse_frm/thread/4988752a6939f45a/fd9f12468e86dd3?hl=en&lr=&ie=UTF-8&rnum=8&prev=/groups%3Fq%3Dpg_filedump%26hl%3Den%26lr%3D%26ie%3DUTF-8%26selm%3D20030922162322.E12708%2540quartz.newn.cam.ac.uk%26rnum%3D8#fd9f12468e86dd3

You can probably find more searching for "ERROR:  could not access
status of transaction" or "pg_filedump".

What I could not find was a simple and fast way to find the bad block.
The error message itself is not useful in this regard, and the "binary
search" method is anything but fast on a big table.

So I wrote the following plpgsql function:

create or replace function
  find_bad_block(p_TableName text)
  returns tid
as $find_bad_block$
  declare
    result tid;
    crtRow record;
    count bigint := 0;
  begin
    -- walk the whole table; remember the ctid of the last row read
    -- successfully so we know where the error happened
    for crtRow
    in execute 'select ctid from ' || p_TableName
    loop
      result := crtRow.ctid;
      count := count + 1;
      if count % 50 = 0 then
        raise notice 'rows processed: %', count;
      end if;
    end loop;
    return result;
  exception
    -- on any error, report it and return the last readable ctid
    when OTHERS then
      raise notice '%: %', SQLSTATE, SQLERRM;
      return result;
  end;
$find_bad_block$
language plpgsql;


This will spit out the error plus the last valid block visited. If
there's no bad block, you won't get the error notice, just the last
block of the table... it worked fine for me, resulting in:

NOTICE:  58P01: could not access status of transaction 1038286848
 find_bad_block
----------------
 (657780,157)
(1 row)

Now to finding the file I should dump:

select oid from pg_database where datname = 'dbname';
  oid
-------
 16389
(1 row)

select relfilenode from pg_class where relname = 'table_name';
 relfilenode
-------------
       20418
(1 row)

The file is then 'db_path/base/16389/20418', or a collection of
'20418.x' files if the table's data is more than 1 GB, each file being a
1GB chunk of the table... so which one to dump?

First calculate the block count in one chunk: 1024 * 1024 / block_size,
where block_size is the block size in KB with which postgres was
compiled. That's normally 8, but we have systems where it is set to 32.
If you didn't change that yourself, it is likely 8.

So in my case the block count per chunk was: 1024 * 1024 / 8 = 131072.

So the chunk number will be: 

blockId / blocks_per_chunk = 657780 / 131072 = 5

So I should look in the file 'db_path/base/16389/20418.5'... but wait,
the block id has to be made relative to the chunk file's start:
chunk_block_id = block_id % blocks_per_chunk

So the block id to use with pg_filedump is: 657780 % 131072 = 2420
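The same arithmetic can be done in SQL (a sketch; 657780 is the block id
found above, and block_size is reported by the server in bytes):

select (1024 * 1024 * 1024) / current_setting('block_size')::int as blocks_per_chunk,
       657780 / 131072 as chunk_no,
       657780 % 131072 as block_in_chunk;
-- with the default 8K block size this gives 131072, 5 and 2420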

So my final command line was:

pg_filedump -if -R 2420 db_path/base/16389/20418.5

resulting in something like:

[snip]
 --
 Item   1 -- Length:   44  Offset: 8148 (0x1fd4)  Flags: USED
  XMIN: 1048587  CMIN: 90130188  XMAX: 0  CMAX|XVAC: 1036845056
  Block Id: 655376  linp Index: 18451   Attributes: 1375   Size: 0
  infomask: 0x0200 (XMIN_INVALID)
  Error: Computed header length not equal to header size.
 Computed <28>  Header: <0>
 
  1fd4: 0b001000 0c475f05  cd3d  .G_=
  1fe4: 0a001000 13485f05 0002 2b03  .H_.+...
  1ff4: 2d03  0100   -...
 
[snip]

So I found my bad block, and the previous and following looked OK. 

Now I want to fix just that one block even if the rows on it are lost,
as the table data is not detail-critical (massive amounts of lost data
would be critical, small loss is tolerable).

Now to the actual fixing: from my searches it looks like zeroing out the
complete block and reindexing the table is the recommended solution when
it is not possible to simply drop the table and restore from backup (in
my case that is not possible - this error has been there since last
October, all our backups failed from then until now, and nobody checked
the logs until now, when I tried to upgrade postgres via Slony and failed
because of this error - rather telling for the importance of this DB).

So, how to zero out the page ?

The following message shows a way to do it:

http://www.webservertalk.com/archive142-2004-7-319751.html

Basically set the 'zero_damaged_pages' setting to 'on':

http://www.postgresql.org/docs/8.1/interactive/runtime-config-developer.html

... and select that page again. Unfortunately this did not work for
me... it looks like if the page header is not corrupt but only individual
tuples are a problem, zero_damaged_pages doesn't help.
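For reference, the attempt looked roughly like this (a sketch using the
table from above; zero_damaged_pages only kicks in when the page header
itself fails validation on read):

set zero_damaged_pages = on;
-- force a read of every page; a block with a broken header would be
-- zeroed in memory instead of raising an error
select count(*) from table_name;
set zero_damaged_pages = off;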

Next try:

http://usr-share-man.org/ma

Re: [GENERAL] Database corruption: finding the bad block

2007-07-12 Thread Csaba Nagy
On Thu, 2007-07-12 at 16:18, Simon Riggs wrote:
> The corruption could only migrate if the WAL records themselves caused
> the damage, which is much less likely than corruption of the data blocks
> at hardware level. ISTM that both Slony and Log shipping replication
> protect fairly well against block corruption on the standby, but only
> log shipping allows you to recover the precise block, as you describe.

Well, I can only speak of what I experienced, and that is that in both of
the 2 former file-system-level corruptions the replica was corrupted
too. This time it was not...

Because of that I had the impression Slony is more immune to such
glitches, as it is not shuffling raw file data around... I mean you can
still have data corruption replicated, but the replica will be
functional. Our WAL standby did not start up at all when we had that
former file corruption issue...

Cheers,
Csaba.





[GENERAL] Delete/update with limit

2007-07-23 Thread Csaba Nagy
Hi all,

This subject was touched on a few times in the past; I looked into the
archives... the result is invariably key developers saying such a
feature is unsafe because the result is unpredictable, while the people
requesting it say that is OK and expected... but no compelling use case
for it.

OK, I have one...

We have here quite a few processes which collect user input, and put
them in "batch" tables, which in turn are processed regularly and update
other tables with summary data, reject invalid records, etc.

The insertions are unpredictable: they can happen at any time, any number
of them in parallel - they are user input... and they must be very fast;
our user experience is at stake.

The batch processing is done by a single periodic process. We have had a
few attempts at making this parallelism safe enough that we don't lose
any of the concurrent input while we do the processing step, while still
keeping minimal overhead on the table. The end result was a scheme where
the batch processor deletes from the table and a delete trigger puts the
deleted rows into a temporary table, and then the processor can do with
that private data anything it pleases without interfering with the
inserts (the processing is actually quite complex on occasions).
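A minimal sketch of the trigger part of that scheme, with made-up names
(batch_input is the insert-only table, batch_buffer the private table the
processor works on):

create or replace function move_deleted_to_buffer() returns trigger as $$
begin
    -- copy the row being deleted into the processor's private table
    insert into batch_buffer (col_a, col_b) values (OLD.col_a, OLD.col_b);
    return OLD;
end;
$$ language plpgsql;

create trigger batch_input_delete_trg
    after delete on batch_input
    for each row execute procedure move_deleted_to_buffer();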

This works fine in terms of correctness; however, it turns out to be a
problem with high bursts of incoming data, or when the processor has not
been running for a while and a lot of data has accumulated... then we
have lots of data to process at once, which leads to long-running
transactions (the whole thing runs in one transaction) and, worse,
connection timeouts.

On other databases, it is possible to limit the delete to a maximum
number of rows to be deleted. This way we can limit the size of one
batch with minimal overhead...

In postgres we're currently not chunking, because the code to do it is
simply too contorted and inefficient compared to the other DBs we use.
All the solutions we could think of for doing the chunking safely while
the inserts are running in parallel, without disturbing them, have
invariably resulted in overly complicated code compared to the simple
delete-with-limit + delete-trigger solution we have for the other DBs.

Now I don't hold out much hope that I can convince anybody that a limit
on the delete/update commands has valid usage scenarios, but can anybody
help me find a good way to process such a buffer table chunk-wise, where
insert speed is the highest priority (thus no indexes, the minimum of
fields), batch processing should still work fine with a big table, the
inserts must not be impacted at all, and it should finish in a short time
to avoid long-running transactions? Because I can't really think of
one... other than our scheme with the delete with limit + trigger +
private temp table thing.

Cheers,
Csaba.






Re: [GENERAL] Delete/update with limit

2007-07-24 Thread Csaba Nagy
First of all, thanks for all the suggestions.

> put a SERIAL primary key on the table
Or:
> Maybe add OIDs to the table, and delete based on the OID number?

No, this is not acceptable: it adds overhead to the insertions. Normally
the overhead would be small enough, but on occasion it is noticeable.

> Loop Forever
>   DELETE from incoming_table;
>   VACUUM incoming_table;
> End Loop;

Not workable either: it still won't ensure the table never gets too big.
Once the table is too big, it takes too long to process, and it gets
even bigger by the next run. The whole thing is transient (i.e. the load
will smooth out after a while), but then exactly when it should work it
doesn't... and in case you didn't guess, the users want the results
immediately, not the next day, so we can't just do the processing at
night when we have virtually no load.

> Use partitioning: don't delete, just drop the partition after a while.

OK, this could work.

It will still be completely different from the code for the other DBs,
but it will work.

Thanks again for all the suggestions,
Csaba.






Re: [GENERAL] Delete/update with limit

2007-07-24 Thread Csaba Nagy
> How about using the following?
> 
> delete from <table>
>   where ctid in (select ctid from <table> limit <N>);
> 

I actually checked this out before starting this thread, and the plan
looked like:

> explain delete from my_table where ctid in (select ctid from my_table
limit 10);
                              QUERY PLAN
------------------------------------------------------------------------
 Merge IN Join  (cost=101.68..108.03 rows=10 width=6)
   Merge Cond: (public.my_table.ctid = "IN_subquery".ctid)
   ->  Sort  (cost=101.11..104.21 rows=1240 width=6)
         Sort Key: public.my_table.ctid
         ->  Seq Scan on my_table  (cost=0.00..37.40 rows=1240 width=6)
   ->  Sort  (cost=0.57..0.59 rows=10 width=6)
         Sort Key: "IN_subquery".ctid
         ->  Limit  (cost=0.00..0.30 rows=10 width=6)
               ->  Seq Scan on my_table  (cost=0.00..37.40 rows=1240 width=6)
(9 rows)

It looked strange to me, and I thought it wouldn't work too well on a big
table... but it's true that was a toy table, so let's try it on a big one:

> explain delete from big_table where ctid in (select ctid from
big_table limit 10);
                                  QUERY PLAN
-------------------------------------------------------------------------------
 Merge IN Join  (cost=11086906.66..11404636.41 rows=10 width=60)
   Merge Cond: (public.big_table.ctid = "IN_subquery".ctid)
   ->  Sort  (cost=11086906.26..11245771.06 rows=63545920 width=66)
         Sort Key: public.big_table.ctid
         ->  Seq Scan on big_table  (cost=0.00..834103.20 rows=63545920 width=66)
   ->  Sort  (cost=0.40..0.42 rows=10 width=6)
         Sort Key: "IN_subquery".ctid
         ->  Limit  (cost=0.00..0.13 rows=10 width=6)
               ->  Seq Scan on big_table  (cost=0.00..834103.20 rows=63545920 width=6)
(9 rows)

So, while the batch table is not expected to have 60M rows, on occasion
it got to a few hundred thousand... and in that case the chunking would
slow things down even more.

I guess if the 'ctid in (...)' construct did a better job it would be the
best solution.

Regarding all the other questions: the "other DB" does the trick well
too, without any hidden cost. And the whole complicated mechanism is in
place not because of cost savings, but because I didn't find any better
way to do it so that concurrent inserts are neither slowed down nor
lost... the problem is that if you want to reliably delete only
processed rows, you must mark them somehow, and that would mean an
update plus a delete later - and I figured the delete + trigger + temp
table approach would still be cheaper. The processing code also has to
scan the processed chunk multiple times, so for that purpose it is
better to have it in a temp table anyway. And we had to make sure an
accidental second run of the processor won't corrupt the data either (it
has happened before)... the trigger approach helps there too...

We have had so many data losses on this processing with different
approaches that I'm starting to get tired of it... and this delete +
trigger + temp table scheme looks to be the one that finally works
correctly, but gives us performance problems on occasion.

Cheers,
Csaba.





Re: [GENERAL] Delete/update with limit

2007-07-24 Thread Csaba Nagy
> Unfortunately the stuff that makes a ctid= nice doesn't seem to be
> used when you're doing an in. It's possible that a function that does
> something like
>  for rec in select ctid from my_table limit 10 loop
>   delete from my_table where ctid=rec.ctid;
>  end loop
> might do okay, but I haven't tried it.

OK, I think this will work. It would be nice though to have the 'ctid
in' trick work just as well as 'ctid = ' ...
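Wrapped up as a function, the quoted loop could look roughly like this (a
sketch only; my_table and the chunk size are placeholders). Each delete
matches a single ctid, so it can use a Tid Scan instead of the merge join
shown earlier:

create or replace function delete_chunk(p_limit integer) returns integer as $$
declare
    rec record;
    deleted integer := 0;
begin
    for rec in execute 'select ctid from my_table limit ' || p_limit loop
        delete from my_table where ctid = rec.ctid;
        deleted := deleted + 1;
    end loop;
    return deleted;
end;
$$ language plpgsql;

-- usage: select delete_chunk(1000);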

Thanks,
Csaba.





Re: [GENERAL] Delete/update with limit

2007-07-24 Thread Csaba Nagy
On Tue, 2007-07-24 at 18:19, Tom Lane wrote:
> [ thinks for a bit ... ] Actually, you can do it as of 8.2 or so,
> by abusing the ScalarArrayOp stuff: turn the subquery into an array.
> An example in the regression database:
> 
> regression=# explain update tenk1 set ten=ten+1 
> regression-#   where ctid = any (array(select ctid from tenk1 limit 10));
>                            QUERY PLAN
> -------------------------------------------------------------------
>  Tid Scan on tenk1  (cost=0.46..40.71 rows=10 width=250)
>TID Cond: (ctid = ANY ($0))
>InitPlan
>  ->  Limit  (cost=0.00..0.46 rows=10 width=6)
>->  Seq Scan on tenk1  (cost=0.00..458.00 rows=1 width=6)
> (5 rows)
> 
> It even seems to get the cost estimate right...

Cool, I will use this then (we do have the relevant DB on 8.2). It would
still be nice to have it work directly...
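For the batch table discussed in this thread, the delete flavour of that
trick would be something like (a sketch; batch_input and the chunk size
are placeholders):

delete from batch_input
 where ctid = any (array(select ctid from batch_input limit 1000));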

Thanks,
Csaba.





Re: [GENERAL] Delete/update with limit

2007-07-25 Thread Csaba Nagy
On Tue, 2007-07-24 at 19:06, Stephan Szabo wrote:
> > Unfortunately I don't think this will work. Multiple backends will happily
> > pick up the same ctid in their selects and then try to delete the same
> > records.
> 
> I'm pretty sure he said that the batch processing (and the delete) would
> only be happening from one backend at a time, no concurrency on that
> portion, merely concurrency with the large volume of inserts.

Yes, it's exactly like that... except it also happened accidentally that 2
batch processes started at the same time, and they should neither
double-process the data nor lose any of it. The above scheme is OK with
that too... but the array version from Tom is even better :-)

Regarding the proposed mark/process/delete version: we've done it that
way, and we always managed to hit some corner case which lost us data...
so even if it's possible to do it well, it's definitely not easy. The
delete/copy/process-private-data version is much safer, and it can
actually be done in one transaction to assure crash safety.

Cheers,
Csaba.





Re: [GENERAL] Delete/update with limit

2007-07-26 Thread Csaba Nagy
Andrew,

Thanks for your input, comments below.

On Thu, 2007-07-26 at 13:30, Andrew Kroeger wrote:
> After reading through this thread, I have an idea that should accomplish
> what I believe are your 3 main goals (avoid any negative performance
> impact on the user's inserts, do not lose any data from those inserts,
> and do not double process any data) and possibly improve performance
> (I'm not sure what the overhead is for triggers, so there may not be any
> benefit).

The essential changes you're proposing are:

1) use a fixed table instead of a temp table -> this is a reasonable one;
temp tables are not exactly free if they are created and dropped too
often;

2) use an "insert selected/delete" scheme instead of the "delete/insert
via trigger". I doubt this would be faster... in fact I wonder if,
instead of the trigger, it is not possible to use a rule to do it - I've
never used the postgres rule system, but it seems to me it is possible
to create a rule which inserts the row into another table when deleting
it (a rough sketch below). I'm not sure how cheap that would be compared
to the trigger version.
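Such a rule might look roughly like this (an untested sketch with the
same made-up names as before; whether it is cheaper than the trigger is
exactly the open question):

create rule batch_input_archive as
    on delete to batch_input
    do also insert into batch_buffer (col_a, col_b)
            values (OLD.col_a, OLD.col_b);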

In any case, on top of all these thoughts there is one more goal I have:
the solution used for postgres should be as close to the other DBs as
possible, for obvious maintenance reasons.

Cheers,
Csaba.







Re: [GENERAL] Obtaining random rows from a result set

2007-08-31 Thread Csaba Nagy
> Dear Santa,
> 
> I'd like my database to have functionality analogue to how LIMIT works,
> but for other - non-sequential - algorithms.

There was some discussion a while ago about possibly reusing the
algorithm ANALYZE uses for sampling a given percentage of the table data
and exposing it as some kind of "SELECT SAMPLE x%" functionality. This
would be the fastest you can get for a sample big enough to be
statistically significant, but it would not be repeatable. I'm not sure
if this is the same thing you were asking for, though; I would like
something like this for statistical work, not for randomly selecting
rows.

Cheers,
Csaba.





Re: [GENERAL] Reliability of WAL replication

2007-10-23 Thread Csaba Nagy
Marc,

On Tue, 2007-10-23 at 13:58 +0200, Marc Schablewski wrote:
> We had some corrupted data files in the past (missing clog, see
> http://archives.postgresql.org/pgsql-bugs/2007-07/msg00124.php) and are
> thinking about setting up a warm standby system using WAL replication.
> 
> Would an error like the one we had appear in WAL and would it be
> replicated too? Or is there some kind of consistency check, that
> prevents broken WAL from being restored?

Here we had WAL-based replication in place some time ago, and the results
were somewhat mixed: in one case the corruption was replicated, other
times it was not... I guess it has to do with where the corruption
occurred. I have a feeling the first case (corruption replicated) was
some postgres corner case reacting badly to kill -9 and the like, and the
second case (corruption not replicated) was file system corruption. I
haven't run WAL-based replication for a while, so I don't know what has
changed in it lately...

Cheers,
Csaba.





Re: [GENERAL] Chunk Delete

2007-11-17 Thread Csaba Nagy
[snip]
> With Oracle we do it with: delete from <tname> where <cond> and rownum < Y;
> Can we have the same goody on Postgres?

The attached message is Tom's response to a similar question; in any
case it would work fine in your case too (assuming you have postgres
8.2).

HTH,
Csaba.

--- Begin Message ---
Stephan Szabo <[EMAIL PROTECTED]> writes:
> Unfortunately the stuff that makes a ctid= nice doesn't seem to be
> used when you're doing an in.

Yeah, see the header comments in tidpath.c:

 * There is currently no special support for joins involving CTID; in
 * particular nothing corresponding to best_inner_indexscan().  Since it's
 * not very useful to store TIDs of one table in another table, there
 * doesn't seem to be enough use-case to justify adding a lot of code
 * for that.

Of course, that argument is wrong for a self-join, which is what this
would essentially be.  So maybe it would be worth doing sometime.
Still, the issue doesn't come up very often.

[ thinks for a bit ... ] Actually, you can do it as of 8.2 or so,
by abusing the ScalarArrayOp stuff: turn the subquery into an array.
An example in the regression database:

regression=# explain update tenk1 set ten=ten+1 
regression-#   where ctid = any (array(select ctid from tenk1 limit 10));
                           QUERY PLAN
-------------------------------------------------------------------
 Tid Scan on tenk1  (cost=0.46..40.71 rows=10 width=250)
   TID Cond: (ctid = ANY ($0))
   InitPlan
 ->  Limit  (cost=0.00..0.46 rows=10 width=6)
   ->  Seq Scan on tenk1  (cost=0.00..458.00 rows=1 width=6)
(5 rows)

It even seems to get the cost estimate right...

regards, tom lane
--- End Message ---



Re: [GENERAL] Chunk Delete

2007-11-18 Thread Csaba Nagy
On Thu, 2007-11-15 at 17:13 +, Gregory Stark wrote:
> DELETE
>   FROM atable AS x 
>  USING (SELECT ctid FROM atable LIMIT 5) AS y
>  WHERE x.ctid = y.ctid;

Have you tried to EXPLAIN this one? Last time I tried something similar
it was going for a sequential scan on atable with a filter on ctid. The
other form, using "where ctid = any (array(select ctid from ..." (see my
previous post forwarding Tom's suggestion), was going for a ctid scan,
which should be orders of magnitude faster than the sequential scan for
big tables and small chunks.

Cheers,
Csaba.





Re: [GENERAL] Primary Key

2007-11-23 Thread Csaba Nagy
On Fri, 2007-11-23 at 11:37 +0100, Martijn van Oosterhout wrote:
> On Fri, Nov 23, 2007 at 09:33:13AM +, Peter Childs wrote:
> > I tend to agree that primary keys should be single fields if they need to be
> > referenced but should also be natural if at all possible. ie use car number
> > plates rather than some serial int.
> 
> Car number plates are unique over time? I didn't think so...

I'm not sure how it is in the US, but here in Germany I just reused the
car plate the car had under its previous owner... so now the plate is at
most uniquely associated with the car, not the owner... and I'm pretty
sure that's not unique either.

And what do you do when things shift meaning in your natural key? That's
a very common thing to happen to natural keys. Suddenly what was unique
is not unique anymore... and the headaches begin...

You're better off using synthetic keys for references between tables;
you can still keep your natural keys for lookups, just don't use them as
unique join criteria, only as search/filter criteria.

Cheers,
Csaba.





Re: [GENERAL] Making a query from 2 tables at same time

2007-11-29 Thread Csaba Nagy
This is the problem:

> ... t2.idr(p1, p4, p6, p7, p9, 'HLA-DRB5*0101')>2;

What is that t2.idr? Based on the syntax, postgres must look for a
function (because of the parentheses), and it thinks t2 is the schema in
which to look for it.

Cheers,
Csaba.





Re: [GENERAL] varchar sort ordering ignore blanks

2008-01-16 Thread Csaba Nagy
On Tue, 2008-01-15 at 16:32 +0100, Luca Arzeni wrote:
> In all cases I'm using locale LATIN9 during DB creation, but I tested also 
> with ASCII, UTF8 and LATIN1 encoding.

I guess this has nothing to do with the encoding, but with the collation
rules used, which are governed by the "lc_collate" parameter. See what
you get on both DBs for:

SHOW lc_collate ;

Quoting from the docs:

"The nature of some locale categories is that their value has to be
fixed for the lifetime of a database cluster. That is, once initdb has
run, you cannot change them anymore. LC_COLLATE and LC_CTYPE are those
categories. They affect the sort order of indexes, so they must be kept
fixed, or indexes on text columns will become corrupt. PostgreSQL
enforces this by recording the values of LC_COLLATE and LC_CTYPE that
are seen by initdb. The server automatically adopts those two values
when it is started."

See:
http://www.postgresql.org/docs/8.1/static/charset.html

HTH,
Csaba.





Re: [GENERAL] [pgsql-advocacy] PostgreSQL Certification

2008-02-04 Thread Csaba Nagy
On Mon, 2008-02-04 at 08:31 -0800, Joshua D. Drake wrote:
> On Mon, 04 Feb 2008 12:18:55 +
> Gregory Stark <[EMAIL PROTECTED]> wrote:
> > "Joshua D. Drake" <[EMAIL PROTECTED]> writes:

> > I was going to suggest pruning the mailing lists down to just 3-4
> > already. The last thing we need to be doing is creating new ones.
> >  
> 
> I don't agree in the least, I was actually going to suggest we add a
> new one for relational design questions. I like many lists that are
> contextually specific. IMO, general should be removed for example.

Why don't you go ahead and create those special lists and make general
collect all of them? Some sort of hierarchy of lists... if that is doable
at all, it could make everybody happy...

Cheers,
Csaba.




Re: [GENERAL] PostgreSQL Certification

2008-02-04 Thread Csaba Nagy
On Mon, 2008-02-04 at 12:18 +, Gregory Stark wrote:
> "Joshua D. Drake" <[EMAIL PROTECTED]> writes:
> 
> > Guys, with respect this thread does nothing for us unless it is on the
> > certification list.
> 
> Do we really need a separate mailing list for every thread? It's already kind
> of crazy with dozens of lists, many of them moribund, which most people aren't
> even aware exist.
> 
> I was going to suggest pruning the mailing lists down to just 3-4 already. The
> last thing we need to be doing is creating new ones.

+1

At least for me it's far easier to ignore threads I'm not interested in
than to subscribe to yet another list. This particular subject
(certification) would be interesting to me as a potential end user, so
I'm not really qualified to comment on the organizational side of it,
but I am ultimately interested in the end result. I suspect many of the
postgres general list subscribers are in the same situation, so why not
let them know how it evolves?

Cheers,
Csaba.





[GENERAL] Renaming a constraint

2008-02-05 Thread Csaba Nagy
I found an old post regarding the subject, where modifying the
pg_constraint entry was recommended:

http://archives.postgresql.org/pgsql-admin/2003-04/msg00339.php

Is this still safe to do ? The pertinent docs don't say anything pro or
contra:
http://www.postgresql.org/docs/8.2/static/catalog-pg-constraint.html

I need to rename a few constraints to keep our schema logically
consistent. It would of course be nice to have an ALTER TABLE option to
do it, but if updating pg_constraint is safe, that would do for now.
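For reference, the catalog update described in the linked post is
essentially this (a sketch with made-up names; as the follow-ups in this
thread point out, for foreign keys the name is also stored in the trigger
arguments, so this alone is not a complete rename):

update pg_constraint
   set conname = 'fk_orders_customer_new'
 where conname = 'fk_orders_customer'
   and conrelid = 'orders'::regclass;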

Thanks,
Csaba.





[GENERAL] Constraint name for named NOT NULL constraints is ignored

2008-02-05 Thread Csaba Nagy
While upgrading our schema between application versions, we also had a
few constraint changes. Some of those changes were dropping NOT NULL
constraints on some columns. Our schema had a few such NOT NULL
constraints which were created using the named variant of the column
constraint clause (something like '... col_name col_type CONSTRAINT
nn_col_name NOT NULL ...'). This syntax is happily accepted by postgres.
So our schema change script expected that simply dropping those named
constraints would drop the NOT NULL constraint on the relevant columns,
but the constraint is simply not there at all, so trying to drop it
gives an error.

From the description of the pg_constraint table
http://www.postgresql.org/docs/8.2/static/catalog-pg-constraint.html
it is clear that it does not hold NOT NULL constraints, which go into the
pg_attribute table - but then maybe an error should be raised if somebody
tries to create a named NOT NULL constraint? Silently ignoring integral
parts of the SQL syntax feels somewhat mySQL-ish. Or at least mention
this behavior on the CREATE TABLE page (I couldn't find it if it's there):
http://www.postgresql.org/docs/8.2/static/sql-createtable.html
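A small illustration of the behavior (a sketch with made-up names, as
observed on the 8.2-era servers discussed here):

create table t (col integer constraint nn_col not null);

-- the constraint name was accepted but never stored, so this fails:
alter table t drop constraint nn_col;

-- the NOT NULL itself has to be dropped via the column instead:
alter table t alter column col drop not null;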

Cheers,
Csaba.





Re: [GENERAL] Renaming a constraint

2008-02-06 Thread Csaba Nagy
On Tue, 2008-02-05 at 12:11 -0500, Tom Lane wrote:
> Csaba Nagy <[EMAIL PROTECTED]> writes:
> > I found an old post regarding the subject, where modifying the
> > pg_constraint entry was recommended:
> 
> > http://archives.postgresql.org/pgsql-admin/2003-04/msg00339.php
> 
> > Is this still safe to do ?
> 
> What kind of constraint?

The only ones left are foreign key constraints. I had a few not nulls
too, but those are not actually there anyway (see my other post).

Thanks,
Csaba.





Re: [GENERAL] Renaming a constraint

2008-02-06 Thread Csaba Nagy
On Wed, 2008-02-06 at 11:03 -0500, Tom Lane wrote:
> IIRC, the name of an FK constraint also appears in the arguments for its
> triggers.  If you didn't update those too, the behavior might be a bit
> inconsistent, depending on which PG version you're using.  In particular
> I'm not sure which name pg_dump will believe.

OK, I think I'll just give up on renaming them.

How hard would it be to implement renaming constraints as an ALTER TABLE
option? It sounds to me like a task requiring more legwork than
programming skill - would it be a good task for a beginner in C like me?
If it is regarded as a needed feature at all...

Thanks,
Csaba.





[GENERAL] Perceived weaknesses of postgres

2008-02-13 Thread Csaba Nagy
http://www.theserverside.com/news/thread.tss?thread_id=48339

The interesting part is where somebody asks why NOT to use postgres; the
answers could give some additional hints to those interested in what
people find missing from postgres before they would adopt it.

Just to summarize some of the answers:
* major PITA to upgrade between major versions;
* executing a single query on multiple cpus/cores;
* no direct table cache control;
* accent-insensitive text comparisons;
* fast select count(*);

Wrong assumptions (but people seem to be sure it's like this):
* no hotbackup except pg_dump (wrong: there are in fact a few different
ways to do that);
* pg_dump the only way to cleanly upgrade (wrong: slony is good for
that);
* missing 2 phase commit (wrong: it is now implemented);
* inadequate performance with really large databases (wrong: there are
known examples of really large postgres DBs);

There are other claims (quoting): "RAC, enterprise backup capabilities,
database on raw partition, compatibility with enterprise storage
(SAN,...)" for which I don't know whether there are adequate solutions
for postgres or not.

Cheers,
Csaba.





Re: [GENERAL] Perceived weaknesses of postgres

2008-02-13 Thread Csaba Nagy
On Wed, 2008-02-13 at 13:29 +0100, Dawid Kuroczko wrote:
> > * no direct table cache control;
> 
> Could you elaborate more on this one?

Well, I was just summarizing what other people wrote :-)

But I guess they refer to table-level control of how much cache memory
to use. I think there are DBMSs where you can tune the cache usage per
table or table group, so you can make sure the important tables (where
you want instant response even if the table was not accessed in the last
few hours) are always in cache. Currently that's not possible in
postgres - it is true that the cache memory will be used in a way that is
optimal for mean performance, but sometimes a merely good-enough mean
performance is acceptable if you can eliminate the worst-case scenarios
for some queries. In that case more control (over the cache strategy too)
is better...

Cheers,
Csaba.





Re: [GENERAL] Perceived weaknesses of postgres

2008-02-13 Thread Csaba Nagy
On Wed, 2008-02-13 at 13:39 +0100, Csaba Nagy wrote:
> On Wed, 2008-02-13 at 13:29 +0100, Dawid Kuroczko wrote:
> > > * no direct table cache control;
> > 
> > Could you elaborate more on this one?
> 

OK, re-reading what I just wrote makes me think it was not clear enough:
I think they mean you can _reserve_ some cache memory directly for
specific tables/table groups, so that the reserved cache is only used by
those tables and thus is not affected by activity other than access to
those tables. In particular, a sequential scan on another, big table -
or any other big sweep of data access on other tables - will not touch
that reserved cache.

Not sure I actually got it clearer this time though :-)

Cheers,
Csaba.





Re: [GENERAL] Perceived weaknesses of postgres

2008-02-13 Thread Csaba Nagy
On Wed, 2008-02-13 at 13:56 +0100, Magnus Hagander wrote:
> I don't think these people are comparing to other opensource ones... 
> They're comparing to the commercial ones (at least in this case)

Yes, that's definitely the case. And that can actually be taken as a
compliment to the already attained status of postgres...

> I would guess they're referring to the ability to "pin" a table into 
> memory, so that it always stays in the cache regardless of what else the 
> database is doing. There is a narrow use-case where this can be very 
> useful, but it can also be a very dangerous tool (hint: if you pin a 
> table that grows up to say 80-90% of your RAM size, your database will 
> not be fast for anything else)

If you can not only pin the complete table but also allocate a fixed
amount of memory for caching _only that table_, then you get a compromise
between the 2 extremes. The extremes are: the table is cached in a mixed
bag together with everything else, so it is effectively 0-100% cached
depending on all the activity on the DB; or it is always 100% cached
regardless of any activity on the DB. The intermediate solution is: the
table is always cached using x amount of memory, independently of what
happens on the DB. That makes sure the table is completely cached if the
allocated memory is big enough, and that at least the most used records
are always cached if the table grows bigger than the allocated cache,
while there is no risk of overrunning the available memory. This could
also be applied to groups of tables.

Cheers,
Csaba.




Re: [GENERAL] postgre vs MySQL

2008-03-14 Thread Csaba Nagy
On Fri, 2008-03-14 at 08:43 -0700, Steve Crawford wrote:
> >   Also, it is MVCC-safe only from 8.3 upwards; on older versions
> > it (incorrectly) deletes dead tuples that are still visible to old
> > transactions.
> >
> >   
> More interesting. I may have a broken mental-model. I *thought* that 
> CLUSTER acquired exclusive locks and that acquisition of the exclusive 
> lock would imply that there couldn't be any transactions accessing that 
> table. Where is my misunderstanding?

Here's a scenario:

 - transaction A starts to read table A;
 - transaction B starts, deletes some records from table B, and ends;
 - transaction C starts and clusters table B;
 - transaction A finishes reading table A, and now tries to read the
   records just deleted by transaction B.

Question: under MVCC rules, should transaction A see the deleted records
or not?

Unfortunately I don't know the answer for sure (in SERIALIZABLE isolation
A's snapshot predates B's delete, so presumably it should see them), but
if the answer is yes, then bad luck for transaction A, because CLUSTER
just ate them. And the locking will not help with this...

Cheers,
Csaba.





[GENERAL] Copying large object in a stored procedure

2008-04-15 Thread Csaba Nagy
Hi all,

Is there an easy way to copy a large object to a new one, with a new OID
and its content independent from the original?

The large object API functions would only allow streaming the original
to the client and streaming it back to the server for the copy, which is
not really optimal... I would like to do it all on the server side. The
server-side functions would work to export the original to a file and
then import it again, but that needs superuser privileges, which rules
it out.

I could simply copy the contents of the pg_largeobject table, but then I
need a new OID for that. I can use the server-side large object API to
create an empty large object, delete it, then use the obtained OID for
the copy... but that also sounds somewhat suboptimal.

Is there any simple way to copy a large object?

Cheers,
Csaba.






Re: [GENERAL] Copying large object in a stored procedure

2008-04-15 Thread Csaba Nagy
> Is there an easy way to copy a large object to a new one, having a new
> OID and it's content independent from the original ?

So my current solution would be:

CREATE OR REPLACE FUNCTION copy_blob(p_blobId OID)
RETURNS OID AS '
DECLARE
    v_NewOID BIGINT;
BEGIN
    -- create and immediately delete an empty large object just to
    -- obtain a fresh, unused OID
    SELECT lo_create(0) INTO v_NewOID;
    DELETE FROM pg_largeobject WHERE loid = v_NewOID;

    -- copy the pages of the original under the new OID
    INSERT INTO pg_largeobject (loid, pageno, data)
    SELECT v_NewOID, pageno, data
      FROM pg_largeobject
     WHERE loid = p_blobId;

    RETURN v_NewOID;
END;
' LANGUAGE plpgsql;


I would still be grateful if anybody knows a better solution using plain
SQL/plpgsql...

Cheers,
Csaba.





Re: [GENERAL] Copying large object in a stored procedure

2008-04-15 Thread Csaba Nagy
[snip]
> DECLARE
> v_NewOID BIGINT;
   ^^
small correction: v_NewOID should be of type OID:

DECLARE
v_NewOID OID;

BIGINT would of course work too, as long as there is an implicit cast
to/from OID, which seems to be the case on 8.2.

Cheers,
Csaba.





Re: [GENERAL] In the belly of the beast (MySQLCon)

2008-04-21 Thread Csaba Nagy
On Sun, 2008-04-20 at 11:32 -0600, Scott Marlowe wrote:
> On Sun, Apr 20, 2008 at 11:12 AM, Scott Ribe <[EMAIL PROTECTED]> wrote:
> > > I am going to play with this and see where it breaks, but it's going to be
> >  > an enormous time investment to babysit it.

One more suggestion: if you happen to use the alt_perl tools, be sure to
uncomment/change the line:

$SYNC_CHECK_INTERVAL = 1000;

In my experience it made a big difference to set that to 60000 (meaning
sync events are created once per minute instead of once per second) for
the synchronizing-after-copy phase. The number of events generated while
the copy is running can be so big that it will never get in sync again
otherwise...

Cheers,
Csaba.





Re: [GENERAL] Multithreaded queue in PgSQL

2008-06-11 Thread Csaba Nagy
We also have such a queue here, and our solution is an algorithm like
this:

 1. get the next processor_count * 2 queue ids which are not marked as
    taken;
 2. choose one of these ids at random;
 3. lock it for update with NOWAIT;
 4. if locking succeeds:
    4.1. check the item again, as it could have been processed in the
         meantime - if it is no longer available, go to 5;
    4.2. update the DB row to mark the id as taken, and process the item;
 5. if there are more ids to try, loop to 2;
 6. otherwise sleep a small random interval, and loop to 1.

This algorithm should have a small enough collision rate on a busy queue
due to the randomly chosen ids and the random sleep (it will have a high
collision rate on an almost empty queue, but then you don't really
care), while still allowing all processors to access all entries. A
minimal SQL sketch of steps 1-4 follows.
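Sketch (table and column names are made up; :candidate_id stands for the
id the application picked at random):

-- step 1: grab a handful of candidate ids
select id from job_queue
 where taken = false
 order by id
 limit 8;                      -- processor_count * 2

-- steps 2-4: pick one candidate at random in the application, then:
begin;
select id from job_queue
 where id = :candidate_id
   and taken = false
   for update nowait;          -- errors out immediately if another processor holds it
-- if the row came back, mark it taken and process it
update job_queue set taken = true where id = :candidate_id;
commit;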

Cheers,
Csaba.





Re: [GENERAL] Unreferenced temp tables disables vacuum to update xid

2008-06-27 Thread Csaba Nagy
Hi all,

I just want to report that we had almost exactly the same problem here as
reported in:

http://archives.postgresql.org/pgsql-hackers/2008-01/msg00134.php

The whole scenario repeated itself: the production DB refused to work, we
restarted in single-user mode, ran vacuum (a few hours), postgres still
complained and refused to start, more web searching, found leaked
temporary tables (this time they were a few proper temporary tables
created by our application, no toast table), dropped them, problem
solved. Net effect: ~5 hours of downtime affecting a few hundred of our
customers...
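For anyone hitting the same thing, a query along these lines (a sketch;
it needs 8.2 or later, where pg_class has relfrozenxid) can be used to
spot tables living in temporary schemas together with their transaction
age; leaked temp tables left behind by dead backends can then be dropped
with a schema-qualified DROP TABLE:

select n.nspname, c.relname, age(c.relfrozenxid) as xid_age
  from pg_class c
  join pg_namespace n on n.oid = c.relnamespace
 where c.relkind = 'r'
   and n.nspname like 'pg_temp_%'
 order by age(c.relfrozenxid) desc;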

Could this scenario be included on the doc page regarding routine
vacuuming?

http://www.postgresql.org/docs/8.3/static/routine-vacuuming.html

If it had been there, it would have saved us all this downtime; I could
have simply dropped the leaked temp tables in the first place...

In general, I found very little information in the docs about the way
temporary tables work in postgres. There are a few gotchas about
temporary tables; a dedicated page discussing them would be nice to have,
linked from all the other places which currently discuss different
aspects of this topic.

One thing I still don't know exactly what it does is vacuuming a
temporary table from a different session: it worked for me in the sense
that it did not throw any error, but as far as I can tell it did nothing
to the temp table... is there a way to vacuum/analyze temporary tables
from another session? The docs definitely don't say anything about
this... I would think it can't work if the table's data lives in the
private memory of its session, but it would be nice if the docs stated
these things clearly...

In fact I could attempt to write that page, but I would need guidance.

Cheers,
Csaba.





Re: [GENERAL] Clients disconnect but query still runs

2009-07-30 Thread Csaba Nagy
Hi all,

On Thu, 2009-07-30 at 11:02 +0200, Greg Stark wrote:
> On Thu, Jul 30, 2009 at 7:43 AM, Craig
> Ringer wrote:
> > On Wed, 2009-07-29 at 14:56 +0100, Greg Stark wrote:
> > What does work well is occasionally poking the socket with recv(...,
> > MSG_DONTWAIT) while doing other work. Program attached. TCP keepalives
> > seem to work very well at least on my Linux test system, and it's easy
> > to test for a dud connection using recv(...) with the MSG_DONTWAIT and
> > (if desired) MSG_PEEK flags. If the connection has exited cleanly it'll
> > return a zero-size read; if the connection has dropped due to keepalive
> > failure it'll return ETIMEDOUT.
> 
> 
> The problem with this is that it introduces spurious failures for
> transient network failures. Also it requires the server to
> periodically take time out from processing the query to do this. I
> think we want a zero-cost method which will interrupt processing if
> the client actively disconnects. If there's a network failure we'll
> find out about it in the normal course of events.

Sorry, I have to disagree here. If there's a spurious network error, you
usually have bigger problems. I prefer to have the connection killed even
if the network recovers, rather than risk an idle-in-transaction
connection living forever when the client or network crashes for any
reason. In case of a network failure the connection will probably be
cleaned up eventually, but it did happen to me that a client machine
crashed in the middle of a transaction while not executing any SQL, and
that connection stayed until I killed it manually. A simple ping to the
client would have revealed that the client was not there anymore. I
would also be happy to pay the cost of pinging the clients, say, once per
minute (or at a configurable interval). Considering that the connections
are one-to-one with a client, it would be enough to have a single timer
which periodically signals each backend to ping its client - but these
are implementation details for which I have no clue what would be best.
The main thing is: I would love to have this functionality. It's
extremely hard to secure all clients against crashes, and a crash of one
of the clients in the middle of a transaction can have very bad
consequences (think indefinitely stuck open transaction).

Cheers,
Csaba.





Re: [GENERAL] Clients disconnect but query still runs

2009-07-30 Thread Csaba Nagy
On Thu, 2009-07-30 at 11:41 +0200, Greg Stark wrote:
> I know this is a popular feeling. But you're throwing away decades of
> work in making TCP reliable. You would change feelings quickly if you
> ever faced this scenario too. All it takes is some bad memory or a bad
> wire and you would be turning a performance drain into random
> connection drops.

But if I get bad memory or bad wire I'll get much worse problems
already, and don't tell me it will work more reliably if you don't kill
the connection. It's a lot better to find out sooner that you have those
problems and fix them than having spurious errors which you'll get even
if you don't kill the connection in case of such problems.

> Well it ought to have eventually died. Your patience may have ran out
> before the keep-alive timeouts fired though.

Well, it lived for at least one hour (could be more, I don't remember for
sure), keeping vacuum from doing its job on a heavily updated DB. It was
not so much about my patience as about getting abysmal performance AFTER
we fixed the initial cause of the crash, and without any warning - except
of course I did find out immediately that bloat was happening, found the
idle transactions and killed them, but I can imagine the hair-pulling for
a less experienced postgres DBA. I would also have preferred that
postgres solve this issue on its own - the network stack is clearly not
fast enough in resolving it.

Cheers,
Csaba.





Re: [GENERAL] Clients disconnect but query still runs

2009-07-30 Thread Csaba Nagy
On Thu, 2009-07-30 at 13:22 +0200, Craig Ringer wrote:
> So, barring network breaks (wifi down / out of range, ethernet cable 
> fell out, etc etc) how is the OP managing to leave backends running 
> queries? Hard-resetting the machine?

It happened to us when a client box ran out of memory and started
swapping to the point that it was inaccessible even for console login.
The connections of that machine were still live but unusable, as the
client box would never get out of that state until a hard reset... which
we would promptly do, but the connections on the postgres side would
still live on. I will probably have to check out the network connection
parameters in the postgres configuration now; I never had a look at them
before... in any case the >2 hours mentioned in an earlier post seems a
bad default to me.
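The knobs in question are the TCP keepalive GUCs (available since 8.1); a
sketch with purely illustrative values, settable per session or in
postgresql.conf:

set tcp_keepalives_idle = 60;      -- seconds of idle time before the first probe
set tcp_keepalives_interval = 10;  -- seconds between probes
set tcp_keepalives_count = 5;      -- connection dropped after this many lost probes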

Cheers,
Csaba.





Re: [GENERAL] Clients disconnect but query still runs

2009-07-30 Thread Csaba Nagy
On Thu, 2009-07-30 at 13:40 +0200, Craig Ringer wrote:
>  > A simple ping to the client would have
> > cleared the fact that the client is not there anymore.
> 
> Yep. It'd also stop PostgreSQL working for clients with software 
> firewalls, since most of them drop ICMP ECHO ("ping").

I didn't mean a TCP-level 'ping', but a higher-level one...

> TCP keepalives are designed to do the same thing, but do it reliably and 
> properly. Why not configure your tcp keepalive intervals instead?

Will do - normally we have good networking and never had to touch those
settings before (and have no experience with network problems anyway)...

> > the main thing
> > is: I would love to have this functionality. It's extremely hard to
> > secure all clients against crash, and a crash of one of the clients in
> > the middle of a transaction can have very bad consequences (think
> > indefinitely stucked open transaction).
> 
> Nope. Just tune your keepalives if you have hopelessly flakey clients.

On the contrary, we do have very stable networking here, the problem was
never a networking one...

> Even if the client _program_ crashes, though, you shouldn't have 
> anything left lying around. It's only if the client _OS_ crashes or the 
> machine is hard-reset that you should be left with a backend lying 
> around until tcp keepalives notice.

As explained in an earlier email, the client box's OS went down in swap
hell.

Cheers,
Csaba.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Clients disconnect but query still runs

2009-07-30 Thread Csaba Nagy
[just to make things clear, I'm not the one who brought up this
discussion - it's just that I was also bitten once by zombie connections]

On Thu, 2009-07-30 at 13:29 +0200, Craig Ringer wrote:
> Idle? I thought your issue was _active_ queries running, servicing 
> requests from clients that'd since ceased to care?

No, the problem was zombie connections idle in transaction.

> How did you manage to kill the client in such a way as that the OS on 
> the client didn't send a FIN to the server anyway? Hard-reset the client 
> machine(s)?

It was a runaway client box in swapping hell. It had plenty of open
connections, most of them idle in transaction (because it started to
take very long to do its processing between the queries).

> I thought your issue was the backend not terminating a query when the 
> client died while the backend was in the middle of a long-running query. 
> Keepalives alone won't solve that one.

No, that is not so dangerous for us; we usually chunk the queries coming
from applications so that they can't run for too long. But it seems that
if I lower the network timeout then even the crashed-client scenario
could be OK, I will have to test that...

Cheers,
Csaba.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Clients disconnect but query still runs

2009-07-30 Thread Csaba Nagy
[this is getting off topic]

On Thu, 2009-07-30 at 13:44 +0200, Craig Ringer wrote:
> A host with a runaway process hogging memory shouldn't be dying. It 
> should really be killing off the problem process, or the problem process 
> should be dying its self after failing to allocate requested memory. If 
> this isn't the case, YOU HAVE TOO MUCH SWAP.
> 
> After all, swap is useless if there's so much that using it brings the 
> system to a halt.

In theory you're right; in practice I can't control any of this - those
are the client boxes, I only control the DB. The most I can do is ask the
colleagues in charge of them nicely to make sure it won't happen again,
and even then there will still be cases like a virtual machine simply
crashing.

> > I will probably have to check out now the network connection
> > parameters in the postgres configuration, never had a look at them
> > before... in any case >2 hours mentioned in an earlier post seems a bad
> > default to me.
> 
> It's the OS's default. PostgreSQL just doesn't change it.

Well, then it looks like I will have to learn a bit about TCP keep-alive
and how Linux handles it...
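
(For the record, the >2 hours figure comes from the Linux kernel's
keepalive defaults, which can be inspected and changed with sysctl - the
first one is the 2 hours:)

net.ipv4.tcp_keepalive_time = 7200    # seconds of idle before the first probe
net.ipv4.tcp_keepalive_intvl = 75     # seconds between probes
net.ipv4.tcp_keepalive_probes = 9     # probes before the connection is declared dead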

Cheers,
Csaba.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Update Query doesn't affect all records

2009-08-05 Thread Csaba Nagy
Hi Andor,

On Wed, 2009-08-05 at 14:15 +0200, Schindler Andor wrote:
> Can anyone tell me, how this is possible? If we insert 12 on the end, then
> it decreases, but 11 remains the same. The problem only occurs, when the
> where condition contains "sorrend > 9" or less.

I bet the "sorrend" column is of some text type, and the "sorrend > 9"
comparison is a text comparison. Try "sorrend::integer > 9" and it
should work ;-)
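
A quick illustration of the difference, if you want to convince yourself
(the casts just make the comparison types explicit):

SELECT '11'::text > '9'::text;        -- false: text comparison, '1' sorts before '9'
SELECT '11'::integer > '9'::integer;  -- true: numeric comparison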

Cheers,
Csaba.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL driver for Joomla review

2009-10-20 Thread Csaba Nagy
Hi Thom,

I would like to review it, but I get "403 - Forbidden" when clicking:

http://downloads.joomlacode.org/trackeritem/4/5/0/45041/postgresql.php

Not sure what that means; probably I need some kind of login to the
joomla tracker system, which I don't have and would prefer not to
create... is it possible to access that somehow without full access to
the joomla tracker ?

Cheers,
Csaba.

On Tue, 2009-10-20 at 14:02 +0200, Thom Brown wrote:
> If anyone has a moment, could they review the PostgreSQL driver I
> wrote for Joomla's next major release?  The developers at Joomla have
> listened to the persistent noise created about only having MySQL as an
> option and are now accepting submissions for alternative database
> systems. (see 
> http://groups.google.com/group/joomla-dev-cms/browse_thread/thread/1382dc6f4af56278#msg_9b95648941ef6fa7
> for this development)
> 
> My submission can be found at:
> http://joomlacode.org/gf/project/joomla/tracker/?action=TrackerItemEdit&tracker_item_id=18384
> 
> Things to watch out for a version-dependent features and clumsy 
> implementations.
> 
> Thanks
> 
> Thom Brown
> 


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL driver for Joomla review

2009-10-21 Thread Csaba Nagy
Hi Thom,

Sorry for the delay, I got sick in the meantime. I see that others have
already done some review; I will do a quick one too, and later maybe I'll
actually try it out... so after a quick review:

* on line 218, the " ENCODING '$DBname')" part looks wrong - you probably
want a hardcoded UTF8 encoding there ?
* as Merlin already commented, transactions are always safe in postgres,
this is no mysql ;-)
* again, as Merlin commented, getTableList is fetching the databases,
which doesn't make sense - but if you actually wanted the tables, then
it's not a typo and you need to change the query instead ;-)

If I get some time I'll test it too, but likely not this week...

Cheers,
Csaba.


On Tue, 2009-10-20 at 15:28 +0200, Thom Brown wrote:
> 2009/10/20 Reid Thompson :
> > your attachment contains this...
> >
> > 
> >  > "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd";>
> > http://www.w3.org/1999/xhtml"; xml:lang="en" lang="en">
> >  
> >  403 - Forbidden
> >  
> >  
> >  403 - Forbidden
> >  
> > 
> >
> 
> Erk.. that's weird.  I got that too even after being logged in.  I'm
> not sure how anyone can review it if no-one has access to it.
> 
> I've attached my working version which differs only slightly to
> conform with coding-styles required by Joomla.
> 
> Apologies
> 
> Thom


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] [HACKERS] Switching between terminals

2008-07-03 Thread Csaba Nagy
On Thu, 2008-07-03 at 19:56 +0530, cinu wrote:
> Could anyone please tell me where I am going wrong and if there is a
> way I can get the same behaviour that I am getting while I am
> executing the through psql prompt.

Your mistake is that you think a transaction is tied to your terminal,
when it is in fact tied to the psql session you are running...

Your first example runs one psql instance per terminal, hence one
transaction per terminal, while in your second example the transaction is
terminated each time psql finishes running. Basically what you're asking
for is to keep a transaction opened by one session (the first psql
execution), connect to it from a second session (the second psql call)
and continue the transaction opened by the first one... which I'm pretty
sure is the wrong thing to want. It is likely possible to do (using
PREPARE TRANSACTION), but even likelier that it is the wrong thing to do
in normal circumstances. If you tell us what you really want to achieve,
I bet you'll get much more useful advice...

Cheers,
Csaba.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Switching between terminals

2008-07-03 Thread Csaba Nagy
On Thu, 2008-07-03 at 16:59 +0200, Csaba Nagy wrote:
> If you'll say what you really want
> to do, I bet you'll get a lot more useful advices...

Oh, and you should use the general list only for this kind of question;
hackers is for discussion about hacking on the postgres code itself.
Cross-posting will also not help much - the subscribers on hackers who
are likely to answer you are subscribed to the general list too.

Cheers,
Csaba.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Switching between terminals

2008-07-03 Thread Csaba Nagy
On Thu, 2008-07-03 at 16:59 +0200, Csaba Nagy wrote:
> [snip] It is likely possible to do (using PREPARE TRANSACTION) [snip]

I was wrong, you can't do it with that either, see:
http://www.postgresql.org/docs/8.2/static/sql-prepare-transaction.html

Maybe there is some feature to attach/detach to/from a session, but I
might just be confused...

Cheers,
Csaba.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Feature: FOR UPDATE SKIP LOCKED

2008-07-09 Thread Csaba Nagy
On Wed, 2008-07-09 at 00:48 -0400, Tom Lane wrote:
> "Jonathan Bond-Caron" <[EMAIL PROTECTED]> writes:
> > It would be quite useful to implement a database queue. Although FOR UPDATE
> > NOWAIT and trying again can work as well as other techniques, 
> 
> > just skipping over the locks has its advantages (simplicity and zero wait) 
> 
> And disadvantages, such as complete lack of predictability or failure
> detection.

Well, it's not like SQL is completely predictable in general... think
about the ordering of results. Such a feature would definitely help
queue-like table processing, and the fact that it is predictably
unpredictable should not be a surprise to anybody using such a feature...

Cheers,
Csaba.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Feature: FOR UPDATE SKIP LOCKED

2008-07-09 Thread Csaba Nagy
On Wed, 2008-07-09 at 16:23 +0800, Craig Ringer wrote:
> Especially if it returned an updated row count or supported the
> RETURNING clause, so you could find out after the fact what was or
> wasn't done.

Well, it is supposed to be used as "SELECT ... FOR UPDATE SKIP LOCKED",
so you can in fact put the locked rows' ids in the target list. With a
"LIMIT 1" appended it would be the perfect way to check out the next
queue item to process...
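
Just as a sketch of what I mean - the feature doesn't exist yet, and the
table and columns below are made up - a queue consumer could then simply
do:

SELECT id, payload
FROM job_queue
WHERE processed = false
ORDER BY id
LIMIT 1
FOR UPDATE SKIP LOCKED;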

Cheers,
Csaba.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] interesting trigger behaviour in 8.3

2008-07-29 Thread Csaba Nagy
On Tue, 2008-07-29 at 19:25 +0400, Ivan Zolotukhin wrote:
> Any clues? Can anybody suggest how to debug this? Is it possible to
> get an explain of the query within the trigger?

I bet it's the difference between prepared and unprepared plans. The
trigger prepares the plan without considering the actual parameter
values, while at the psql prompt you give the parameter values explicitly
in the SQL. Use the PREPARE command to prepare the plan at the psql
prompt, and EXPLAIN EXECUTE it to see how it behaves inside the
trigger...
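
Something along these lines (table and column names are placeholders for
whatever the trigger actually queries):

PREPARE trigger_like_query(integer) AS
    SELECT * FROM some_table WHERE some_column = $1;

EXPLAIN EXECUTE trigger_like_query(42);

DEALLOCATE trigger_like_query;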

Cheers,
Csaba.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Fwd: Copying Blobs between two tables using Insert stmt

2008-10-20 Thread Csaba Nagy
On Fri, 2008-10-17 at 13:40 -0700, John Skillings wrote:
> Hello Csaba,
> 
> Back in the month of April, I noticed that you posted a similar
> request on copying blobs between two tables, having separate OID.
> 
> Can you let me know your final solution please.

The final version I'm using is this one:

CREATE OR REPLACE FUNCTION copy_blob(p_blobId oid)
RETURNS oid
AS '
DECLARE
    v_NewOID  OID;
    v_LODesc  INTEGER;
BEGIN
    -- create a new, empty large object
    SELECT lo_create(0) INTO v_NewOID;

    -- open the new large object for writing
    SELECT lo_open(v_NewOID, -1) INTO v_LODesc;

    -- copy the pages of the source large object, in page order
    PERFORM lowrite(v_LODesc, data)
    FROM pg_largeobject
    WHERE loid = p_blobId
    ORDER BY pageno;

    PERFORM lo_close(v_LODesc);

    RETURN v_NewOID;
END;
' LANGUAGE 'plpgsql';

The only drawback is that you must enable reading of pg_largeobject for
the application user, as by default only the postgres super user can
access it. You would do that with something like (as postgres super
user):

GRANT SELECT ON pg_largeobject TO my_app_user;
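
A typical usage, just as an illustration (the table and column below are
hypothetical), is to give a row its own private copy of a blob:

UPDATE documents SET content_oid = copy_blob(content_oid) WHERE id = 42;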

Cheers,
Csaba.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] return MAX and when it happened

2008-11-19 Thread Csaba Nagy
Hi Scara,

This should work just fine:

select num, min(mydate) from mytab group by num order by num desc limit
1;

If you have an index on 'num' it will also be fast.

Cheers,
Csaba.

On Wed, 2008-11-19 at 08:47 -0600, Scara Maccai wrote:
> Hi all,
> 
> suppose I have a table like:
> 
> CREATE TABLE mytab
> (
>  num integer,
>  mydate timestamp
> );
> 
> and I want to find MAX(num) and the "mydate" where it first happened.
> 
> I guess I could use
> 
> select * from mytab where num = (select MAX(num) from mytab) order by
> mydate limit 1;
> 
> but that would scan the data twice (I guess...)
> 
> Do I have to write my own MAX function, something like:
> 
> select MYMAX(num, timestamp) from mytab
> 
> which would return a custom type?
> Or is there a better way?
> 
> 
> 
> 
> 
> 


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Using postgres.log file for replication

2008-11-27 Thread Csaba Nagy
On Thu, 2008-11-27 at 09:20 -0800, Ioana Danes wrote:
> I've been wondering if anybody tried to use the postgresql csv log file to 
> replicate sql statements.
> I've been looking into it in the past days and after a brief testing it 
> doesn't look bad at all...

Try to execute something like:

UPDATE some_table SET some_timestamp_field = now();

The replica is now different from the master :-)

Cheers,
Csaba.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Foreign Keys and Deadlocks

2011-11-05 Thread Csaba Nagy
Hi David,

On Thu, 2011-11-03 at 15:30 -0700, David Kerr wrote:
> I suspect that it has to be a transaction, and that further up in the TX is 
> an update to one of
> the reference tables in each TX.

This is your cause - updating the referenced table in the same
transaction. The update wants an exclusive lock on the referenced row,
which conflicts with the shared lock taken by the foreign key check in
another process; you get a deadlock when that other process in turn wants
to update some row in the referenced table which is locked by a foreign
key check in this process.

While the lock on the referenced row was weakened to a shared lock
(older postgres versions took an exclusive lock - see
http://archives.postgresql.org/pgsql-general/2002-11/msg00397.php for
the original problem, which is relaxed now), the lock is still too strong
and the deadlock problem remains. A proper solution is not trivial at
all; it would involve locking the row only against changes to the
referenced columns, which postgres currently can't do.

While getting rid of the foreign key would solve your problem, I don't
think it's the best solution - you can perhaps redesign things so that
the referenced tables are not updated in the same transaction.

Here we adopted a different solution - we run a patched postgres which
skips that lock altogether, which means partially broken foreign key code
that mostly works but can leave orphans. I would not recommend doing
that, though - we only did it because it was the path of least
resistance: the application was also running on other DBs (which were the
primary ones at the time) and there was no way to make extensive changes
to the application code.

If I could change the code, I would separate the updated fields from the
parent table into yet another child table, so that the parent table
itself is never updated. That still leaves some potential for deadlock
(if you don't order the inserts/updates properly), but much less.
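
Roughly what I mean, with completely made-up names - the parent stays
append-only and the volatile counters live in a separate table:

CREATE TABLE parent (
    id   bigint PRIMARY KEY,
    name text
);

CREATE TABLE parent_counters (
    parent_id bigint PRIMARY KEY REFERENCES parent (id),
    hits      bigint NOT NULL DEFAULT 0
);

CREATE TABLE child (
    id        bigint PRIMARY KEY,
    parent_id bigint NOT NULL REFERENCES parent (id)
);

The foreign key checks on child then only ever touch parent rows that
nobody updates, so they can't collide with the counter updates.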

Cheers,
Csaba.





-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Foreign Keys and Deadlocks

2011-11-10 Thread Csaba Nagy
Hi David,

On Wed, 2011-11-09 at 09:52 -0800, David Kerr wrote:
> So, aside from removing the PKs do i have any other options?

Sure you have: order the inserts by primary key inside each transaction.
Then you will not get deadlocks, though inserting the same key again will
of course still fail (but that's the purpose of the primary key, right ?)

Ordering inserts/updates by the columns which cause the locks is the
first thing to do to avoid deadlocks...

Cheers,
Csaba.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] DELETE running at snail-speed

2008-12-18 Thread Csaba Nagy
On Thu, 2008-12-18 at 05:29 -0800, gerhard wrote:
> I suspect the foreign key constraint of downtime_detail to slow down
> the delete process. Is this a bug, probably fixed in latest version
> (8.1.x) or should I drop the constraint and recreate after deletion -
> which I only see as workaround ?

The foreign key is indeed the cause, but if you put an index on
downtime_detail(downtimeid) it will work fine. What happens is that for
each row you delete from the 'downtime' table, the foreign key triggers
execute the following to cascade the deletion to the child table:

delete from downtime_detail where downtimeid = $1

You can try to see what kind of plan you get for that by:

prepare test_001(integer) as delete from downtime_detail where
downtimeid = $1;

explain execute test_001(0);

Now multiply whatever you get there by the number of rows deleted from
'downtime' and you'll see why it is slow... then try it again with the
above-mentioned index in place.
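
For completeness, the index in question would be something like (the
index name is arbitrary):

CREATE INDEX downtime_detail_downtimeid_idx ON downtime_detail (downtimeid);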

Cheers,
Csaba.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] a strange error

2009-05-27 Thread Csaba Nagy
Hi Ray,

On Wed, 2009-05-27 at 14:24 +0200, Iv Ray wrote:
> When the db is refreshed, the first click inside the web application  
> that uses it (php 5.x, Apache 2.x), and some of the next clicks (i. e.  
> the 3rd, 5th, 8th, 12th) result in the following error -
> PGSQL ERROR: server closed the connection unexpectedly
>   This probably means the server terminated abnormally
>   before or while processing the request.
> After two dozen of clicks and several such messages, the application  
> operates normally until the next refresh.

Although I don't know the exact mechanics of how this happens, I guess
it has to do with the "persistent connections" used by PHP (a kind of
connection caching): the reused connections discover that the database
backend they were connected to is gone. Given that an idle connection
will not notice the backend is gone until the next query, this can happen
well after the "database refresh" if the cached connection stays idle for
that long. So you probably also want to clear PHP's connection cache, but
I have no idea how that is done.

BTW, I think it would be less disruptive to truncate your tables
(assuming you don't need the content)... that can also be done using a
plpgsql script which iterates over all tables in the public schema and
truncates them, if you're concerned about changing the schema... of
course you would need to lock all tables exclusively for a short time,
but that would surely be less disruptive than deleting the DB.
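
A rough sketch of such a script, assuming a reasonably recent version
(TRUNCATE ... CASCADE needs 8.2 or later; on older versions drop the
CASCADE and order the tables yourself):

CREATE OR REPLACE FUNCTION truncate_public_tables() RETURNS void AS $$
DECLARE
    t record;
BEGIN
    -- loop over all ordinary tables in the public schema
    FOR t IN SELECT tablename FROM pg_tables WHERE schemaname = 'public' LOOP
        EXECUTE 'TRUNCATE TABLE ' || quote_ident(t.tablename) || ' CASCADE';
    END LOOP;
END;
$$ LANGUAGE plpgsql;

SELECT truncate_public_tables();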

Cheers,
Csaba.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


pgsql-general@postgresql.org

2005-02-21 Thread Csaba Nagy
Hi Markus,

Have you considered installing Slony and replicating from the old version
to the new one ? It can do that, and once the replica is up to date the
switchover should take only minutes, I guess.
Note that I have never done this myself, so please ask others about it
too, or research it for yourself ;-)

Cheers,
Csaba.


On Mon, 2005-02-21 at 07:11, Markus Wollny wrote:
> Hello!
> 
> Finally the time has come for us to upgrade our PostgreSQL 7.4.5-servers to 
> 8.0.1 - and though I'm very much looking forward to some of the new features, 
> the dump&reload process is worrying me a bit this time. I've got one cluster 
> in particular which is roughly 9GB in size and features some dreadfully large 
> indices of the GiST-type for Tsearch2.
> 
> I have already scheduled a nightshift for this upgrade, but I'm not all too 
> certain that I'll be up and running again in the morning, so I wondered if 
> there might be some preparations that would allow for some speed-up during 
> the reload process? What kind of tuning could be done in postgresql.conf in 
> respect to just this particular workload (COPY and finally the CREATE INDEX 
> stuff) with no other concurrent access going on?
> 
> The machine in question features a RAID10 disk array which hosts the 
> DB-cluster, some 2GB RAM and four processors and I've tuned postgresql.conf 
> for our ordinary everyday workload with lots and lots of concurrent reads and 
> writes, but I've got the feeling that this configuration might not be the 
> optimum for dump&reload, so it might make some sense to use some sort of 
> "maintenance"-configuration. What kind of tuning would you recommend?
> 
> Thank you very much for your advice and lots of cheers to the developers for 
> 8.0!
> 
> Kind regards
> 
>Markus
> 
> ---(end of broadcast)---
> TIP 5: Have you checked our extensive FAQ?
> 
>http://www.postgresql.org/docs/faq


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [GENERAL] Deadlock and FK triggers

2005-02-21 Thread Csaba Nagy
[snip]
> We hope to have shared row locks in 8.1 for this usage.

This feature is one I'm really waiting for ! Is there any estimate when
8.1 comes out ? 
I know, I know, the preferred answer is "whenever it's ready", but if
8.1 is not that far away, and shared row locking will be in it, I won't
bother patching 8.0 to skip FK parent row locking altogether (which is
not the most orthodox way to avoid the FK related deadlocks)...

Thanks to all the developers,
Csaba.



---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [GENERAL] Replication from other SQL Server

2005-03-02 Thread Csaba Nagy
You might want to take a look at
http://www.daffodildb.com/replicator.html
It claims to be able to replicate between different databases, including
"SQL Server" and postgres. If you try it, please share your experience
with us - I haven't yet seen a review of replication from a different DB
server to postgres using Daffodil in a real-world application.

Cheers,
Csaba.

On Wed, 2005-03-02 at 06:15, Keith Tsao wrote:
> Hi,
> 
> I am new to postgresql!
> We have a M$SQL server and would like to do a replication from this
> server to postgresql.
> Would this be possible? 
> 
> If so, what would be the appropiate method.
> Any suggestion?
> 
> Thx!
> 
> ---(end of broadcast)---
> TIP 5: Have you checked our extensive FAQ?
> 
>http://www.postgresql.org/docs/faq


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [GENERAL] Triggers and User Defined Trigger Functions

2005-03-09 Thread Csaba Nagy

> DELETE FROM Temp1 WHERE Test = 'test3';
> ERROR:  syntax error at or near "$2" at character 44
> QUERY:  INSERT INTO Temp2 (ID, test) VALUES ( $1   $2 )
> CONTEXT:  PL/pgSQL function "temp1_trigger_delete" line 2 at SQL statement
> LINE 1: INSERT INTO Temp2 (ID, test) VALUES ( $1   $2 )
  ^^^
> 
> What did I miss?

A comma in the indicated position I guess...

HTH,
Csaba.



---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [GENERAL] [JDBC] java is locked when select for update

2005-03-09 Thread Csaba Nagy
Please post to the list too when replying...

On Wed, 2005-03-09 at 17:29, Mican Bican wrote:
> Thank you for your answer. The reason of so long transactions is that
> I first Insert a row than get the last ID (primary key) and select it
> for update from the database and wait for user to commit it..
> I need the key becouse I will insert rows with references to this new
> row (like adresses to user etc.) ..so the user cann add some adresses
> to a user and when he press ok I commit the statement...
> 
I can't see why you couldn't insert the user in a first transaction and
the addresses in further transactions... if the user gets deleted in the
meantime by somebody else, then inserting the address will fail and you
will tell the user about the error. And if you don't want other people to
see the new user before it is committed with all its addresses, then
first collect the user data plus all the address data, and only then open
a transaction and insert the user and the addresses when the user presses
OK. The point is that there should not be any GUI activity between the
start and end of a transaction.
If you really need a lock between updates (I doubt it), implement it
using some application logic; don't use DB row locking for long-lived
locks. I think there is a locking helper in the contrib modules of
postgres, and if not, read up on resource locking on google.

Cheers,
Csaba.



---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [GENERAL] [JDBC] java is locked when select for update

2005-03-09 Thread Csaba Nagy
Whoops, posted to the wrong mailing list... but it might fit anyway.


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [GENERAL] using limit with delete

2005-04-07 Thread Csaba Nagy
Hi Chris,

Just a thought: if you have to clear the table anyway, wouldn't it work
for you to use truncate ? That should be faster than delete.

HTH,
Csaba.

On Thu, 2005-04-07 at 04:11, Chris Smith wrote:
> I don't care about the order in my particular case, just that I have to 
> clear the table.
> 
> I'll try the subquery and see how I go :)
> 
> Thanks!
> 
> Neil Conway wrote:
> > Chris Smith wrote:
> > 
> >> I'm trying to use a limit clause with delete, but it doesn't work at 
> >> the moment
> > 
> > 
> > It isn't in the SQL standard, and it would have undefined behavior: the 
> > sort order of a result set without ORDER BY is unspecified, so you would 
> > have no way to predict which rows DELETE would remove.
> > 
> >> delete from table where x='1' limit 1000;
> > 
> > 
> > You could use a subquery to achieve this:
> > 
> > DELETE FROM table WHERE x IN
> > (SELECT x FROM table ... ORDER BY ... LIMIT ...);
> > 
> > -Neil
> > 
> > ---(end of broadcast)---
> > TIP 2: you can get off all lists at once with the unregister command
> >(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
> > 


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [GENERAL] Time and date functions give me headaches!!

2005-04-11 Thread Csaba Nagy
Costin,

You could probably use one of the functions listed here:
file:///usr/share/doc/postgresql-7.4.2/html/functions-formatting.html
If you want the varchar -> date conversion to happen automatically, you
should study this chapter:
file:///usr/share/doc/postgresql-7.4.2/html/typeconv.html
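
For your concrete case something along these lines should work (assuming
the format you quoted; HH24 and MI are the 24-hour and minutes patterns):

SELECT to_timestamp(setuptime, 'YYYY-MM-DD HH24:MI:SS') FROM billing;

-- and for a unix timestamp:
SELECT extract(epoch FROM to_timestamp(setuptime, 'YYYY-MM-DD HH24:MI:SS'))::integer
FROM billing;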

HTH,
Csaba.

On Mon, 2005-04-11 at 11:40, Costin Manda wrote:
>   Hello,
> 
> I have a table with columns defines as varchar that have values in format
> 'YYYY-MM-DD HH:MM:SS'.
> 
> Now, if I do a
> select timestamp '2005-10-10 10:10:10';
> I get the value just fine. I can use abstime()::integer on the result to
> find the unix timestamp. (This is the simplest way I could find, are there
> any others?)
> 
> However, when I try doing the same with the values in the table I can't
> get it to work.
> 
> select timestamp setuptime from billing; - error
> select timestamp(setuptime) from billing; - error
> select setuptime::timestamp from billing; - error (cannot cast type
> character varying to timestamp without timezone!?)
> 
> So, how can I convert this string into a timestamp, pleeease! :(
> 
> 
> 
> 
> -
> E-Mail powered by MadNet.
> http://www.madnet.ro/
> 
> 
> ---(end of broadcast)---
> TIP 7: don't forget to increase your free space map settings


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


[GENERAL] Fixing a too long column value in a before insert trigger or rule

2005-05-11 Thread Csaba Nagy
Hi all,

We have a situation where a varchar column was limited to a too small
maximum length in the design phase... shit happens, we will fix our db
schema in the next database downtime or software version deployment,
whichever comes first (we are using 7.4, and on-the-fly column type
changes only arrived in 8.0 I guess).
But in the meantime it would be nice to have a trigger or something in
place which could intercept inserts and fix the offending value by
truncating it, so that the insert doesn't fail because of it.

I've tried with a before-insert trigger, but the column size seems to be
checked before the trigger ever gets a chance to be called; I suppose
it's a parse-time check.

A rule will probably not work either, as I can't replace an insert with
another insert into the same table - at least that's how I understand the
docs - otherwise it causes recursive rule processing...

Is there any way to truncate the value so the row fits in the table, be
it a rule or a trigger ? Or will the length check kick in before either
gets a chance to fix it ?

TIA,
Csaba.



---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [GENERAL] Fixing a too long column value in a before insert

2005-05-11 Thread Csaba Nagy
Searching the archives led me to this link:
http://archives.postgresql.org/pgsql-general/2001-10/msg01208.php
(I also searched before asking, but for the wrong keywords...)

Is that advice still valid ?

I will try it in any case :-)
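
For the record, the hack referred to there looks roughly like this (the
table, column and length are made up - atttypmod for varchar(N) is N + 4
- and you really do want to practice on a test database first):

UPDATE pg_attribute
SET atttypmod = 200 + 4
WHERE attrelid = 'my_table'::regclass
  AND attname = 'my_column';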

Thanks,
Csaba.


On Wed, 2005-05-11 at 15:44, Tom Lane wrote:
> Csaba Nagy <[EMAIL PROTECTED]> writes:
> > We have a situation where a varchar column was limited to a too small
> > maximum length in the design phase... shit happens, we will fix our db
> > schema in our next database downtime or software version deployment,
> > whichever comes first (we are using 7.4 and on the fly column type
> > change is only from 8.0 on I guess).
> 
> Why don't you just hack the column's atttypmod?  See the archives
> (and practice on a test database ;-))
> 
>   regards, tom lane


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [GENERAL] Newbie question

2005-05-18 Thread Csaba Nagy
Dear Hugo,

It is possible using cron and psql. Just "man cron" on unix if you're
not already familiar with that. The command line you should use is
something like:

psql [connection options] -f sql_commands_file.sql

Also "man psql" for the options you can use... 
Starting the procedure should be done by something like:

SELECT procedure_name();

in the sql commands file given to psql.
Alternatively you can do:

echo "SELECT procedure_name();" | psql [connection options]

but I don't know how well that works with cron... I'm not actually using
it...
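
For completeness, a crontab entry along these lines (path, time and
connection options are placeholders) would run the file every night at
3am:

0 3 * * * psql -d mydb -U myuser -f /path/to/sql_commands_file.sql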

HTH,
Csaba.



On Wed, 2005-05-18 at 17:23, Hugo wrote:
> hi, 
> 
> is it possible to schedule the execution of an sql stored procedure in
> postgress on linux?
> 
> thanks
> 
> Hugo
> 
> ---(end of broadcast)---
> TIP 8: explain analyze is your friend


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [GENERAL] blocking INSERTs

2005-06-08 Thread Csaba Nagy
On Wed, 2005-06-08 at 05:31, Joseph Shraibman wrote:
> I want to do the following:
> 
> BEGIN;
> SELECT ... FROM table WHERE a = 1 FOR UPDATE;
> UPDATE table SET ... WHERE a = 1;
> if that resturns zero then
> INSERT INTO table (...) VALUES (...);
> END;
> 
> The problem is that I need to avoid race conditions.  Sometimes I get 
> primary key exceptions on the INSERT.
> 
> I think I need to lock the table in share mode to keep inserts from 
> happening, but that blocks vacuums, and blocked vacuums block other 
> things behind them.  So how do I get around this?
> 

I think there's no way to avoid the race condition; I came to this
conclusion while following past discussions (of which there were many -
look for "insert or update" for example).
There is, however, a solution starting with 8.0: place a savepoint before
the inserts which are susceptible to failing, and roll back to that
savepoint on failure. Then you can update while still keeping the
transaction running. The strategy would then be "insert first, update if
it failed".
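
In plain SQL the idea looks roughly like this (table and values are made
up):

BEGIN;
-- ... other work in the same transaction ...
SAVEPOINT pre_insert;
INSERT INTO mytable (id, val) VALUES (1, 'x');
-- if the INSERT fails with a unique violation, the application issues:
ROLLBACK TO SAVEPOINT pre_insert;
UPDATE mytable SET val = 'x' WHERE id = 1;
COMMIT;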

HTH,
Csaba.




---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [GENERAL] deadlocks in multiple-triggers environment

2005-06-09 Thread Csaba Nagy
[snip]
> Do say, are there foreign keys on those tables?
> 
> If there are, that may explain the deadlocks.  This is a known problem,
> fixed in the development version, for which there is no complete
  

Wow, that's a good news :-)
Time to drop that nasty patch we're using...
I'm not on the developers list, I guess it was discussed there, I'm
curios what kind of solution it is ? A special foreign key thing or row
level shared locks were implemented ?

[snip]

Cheers,
Csaba.



---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [GENERAL] duplicate key violates unique constraint

2005-06-13 Thread Csaba Nagy
Your problem is that the trigger's "found" check will not see the row
inserted by a concurrent transaction. In other words, your insert
actually fails; the record you see was inserted by another concurrent
transaction, and the "found" check didn't help because the other
transaction started after yours but was quicker, and your transaction
can't see its results.

What you are trying to do is the classic "insert-or-update" problem,
which cannot be done safely the way you tried. Don't even bother to try;
there are lots of discussions on the list, and the conclusion is that you
can't avoid a race condition between the concurrent inserts. There will
always be a way for one of them to fail with an error.

You could actually ignore the error if the insert is not part of a bigger
transaction, which would of course be aborted by the error.
The only way to avoid the error completely is to place a savepoint before
the insert, catch the error, roll back to the savepoint, and then
continue your transaction as you need.
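
Inside your plpgsql function (8.0 or later) that would look roughly like
this - the exception block uses an implicit savepoint:

    UPDATE table1 SET hits = hits + 1 WHERE ip = some_ip;
    IF NOT FOUND THEN
        BEGIN
            INSERT INTO table1 (ip) VALUES (some_ip);
        EXCEPTION WHEN unique_violation THEN
            -- somebody inserted the same ip concurrently; count the hit instead
            UPDATE table1 SET hits = hits + 1 WHERE ip = some_ip;
        END;
    END IF;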

HTH,
Csaba.


On Mon, 2005-06-13 at 18:22, ON.KG wrote:
> Hi All!
> 
> I have table:
> 
> CREATE TABLE table1 (
>ip char(15) NOT NULL,
>hits integer NOT NULL default '1',
>PRIMARY KEY (ip)
> );
> 
> So it's counting hits per each IP for current day and every day
> trancated by cron:
> TRUNCATE TABLE table1;
> 
> before inserting or updating this table there're some checkings,
> logs, etc., so I'm using PL/PgSQL for that
> 
> after all checkings and logs I have:
> 
>   UPDATE table1
>   SET hits = hits + 1
>   WHERE ip = some_ip;
> 
>   IF NOT FOUND THEN
>  INSERT INTO table1
> (ip)
>  VALUES
> (some_ip);
>   END IF;
> 
> when IP is not found in table it inserts new record into table
> but in logs i see error
> ERROR:  duplicate key violates unique constraint "table1"
> CONTEXT:  PL/pgSQL function "insert_table1" line 68 at SQL statement
> 
> But record is inserted into table
> 
> what may be the problem?
> 
> i also tried before:
>   SELECT INTO cnt hits
>   FROM table1
>   WHERE ip = some_ip;
> 
>   IF FOUND THEN
>  UPDATE table1
>  SET hits = hits + 1
>  WHERE ip = some_ip;
>   ELSE
>  INSERT INTO table1
> (ip)
>  VALUES
> (some_ip);
>   END IF;
> 
> But same error still appears
> 
> Thank You
> 
> 
> ---(end of broadcast)---
> TIP 4: Don't 'kill -9' the postmaster


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [GENERAL] duplicate key violates unique constraint

2005-06-13 Thread Csaba Nagy
[snip]
> If you have more than one client, this can always happen. You have two 
> choices:
>   1. Use a lock to stop two clients interacting like this

This won't work unless you make all the clients serialized, or you have
all the IPs already inserted in the database... you can't lock on an
unknown key; otherwise the locking would also need to insert, and you're
back to the same race condition ;-)

Cheers,
Csaba.




---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] duplicate key violates unique constraint

2005-06-13 Thread Csaba Nagy
That would work indeed. But I guess the savepoint solution will be the
simplest and fastest if the OP has, or can install, the 8.0 version.

Cheers,
Csaba.

On Mon, 2005-06-13 at 17:49, Richard Huxton wrote:
> Csaba Nagy wrote:
> > [snip]
> > 
> >>If you have more than one client, this can always happen. You have two 
> >>choices:
> >>  1. Use a lock to stop two clients interacting like this
> > 
> > 
> > This won't work unless you make all the clients serialized, or you have
> > all the ip's already inserted in the data base... you can't lock on an
> > unknown key, otherwise the locking will also need to insert, and you're
> > back to the same race condition ;-)
> 
> You can, however, have something more finely-grained than whole-table 
> locking (assuming one IP updated/inserted at a time) by filling a dummy 
> table with e.g. integers 0..255 and locking a row there based on (e.g.) 
> the last octet of your target IP.
> 
> --
>Richard Huxton
>Archonet Ltd


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] Deadlock Problem

2005-06-15 Thread Csaba Nagy
Gavin,

Are you ordering the updates by id inside one transaction ? You should
order the execution of the statements by id inside a transaction, and
the deadlocks should go away.

HTH,
Csaba.

On Wed, 2005-06-15 at 13:10, Gavin Love wrote:
> I am getting a number of deadlock errors in my log files and I was 
> wondering if anyone knows how I can stop them.
> 
> Query failed: ERROR:  deadlock detected DETAIL:  Process 11931 waits for 
> ShareLock on transaction 148236867; blocked by process 11932. Process 
> 11932 waits for ShareLock on transaction 148236866; blocked by process 
> 11931.
> 
> This is for a web application. Whenever a search result is displayed I 
> need to update a counter to say it has been viewed which is done with 
> between 1 and 15 updates in one transaction of the form.
> 
> BEGIN;
> UPDATE stats SET click_count = click_count+1 WHERE id = '122'
> UPDATE stats SET click_count = click_count+1 WHERE id = '123'
> UPDATE stats SET click_count = click_count+1 WHERE id = '124'
> etc...
> COMMIT;
> 
> My lock management config is:
> deadlock_timeout = 2000 # in milliseconds
> #max_locks_per_transaction = 64 # min 10, ~200*max_connections bytes
> 
> I am using Postgres 8.0.3
> 
> Does anyone know how I can stop these deadlocks from occurring?
> 
> They are not a big problem as losing a few it only happens a couple of 
> times a day but I prefer to have everything working as it should.
> 
> Thanks
> 
> Gavin
> 
> 
> ---(end of broadcast)---
> TIP 3: if posting/reading through Usenet, please send an appropriate
>   subscribe-nomail command to [EMAIL PROTECTED] so that your
>   message can get through to the mailing list cleanly


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [GENERAL] Deadlock Problem

2005-06-15 Thread Csaba Nagy
Because I guess id is your primary key, and ordering by it would make
your transactions always update the same rows in the same order. It
really doesn't matter that you have other criteria in the where clause;
the issuing order of the statements is the problem.
Deadlock occurs when 2 concurrent transactions update the same rows in a
different order, so consistent ordering eliminates the deadlocks.

A typical deadlock scenario:

T1=transaction 1;
T2=transaction 2;

T1 starts;
T2 starts;
T1 updates id 1;
T2 updates id 2;
T1 wants to update id 2, but the row is locked by T2;
T2 wants to update id 1, but the row is locked by T1;
deadlock: both transactions wait for the other one to finish.

Now if both will try to update id 1 first and then id 2, it would be:

T1 starts;
T2 starts;
T1 updates id 1;
T2 wants to update id 1, but the row is locked by T1;
T1 updates id 2;
...
T1 finishes;
T2 updates id 1;
T2 updates id 2;
...
T2 finishes;


Generally, ordering in the same way the rows being updated in all
transactions in a system will eliminate most of the deadlocks.

HTH,
Csaba.


On Wed, 2005-06-15 at 13:58, Gavin Love wrote:
> Hi Csaba,
> 
> I am not ordering them by ID as in reality the where condition is more 
> complex than in my example
> 
> UPDATE stats SET click_count = click_count+1 WHERE month = '06' AND year 
> = '2005' AND type = 'a' AND id = '123' AND count_type = 'b'";
> 
> Can you explain why sorting by ID would make a difference as I don't 
> understand why what you are suggesting should make a difference?
> 
> Thanks,
> 
> Gavin
> 
> Csaba Nagy wrote:
> > Gavin,
> > 
> > Are you ordering the updates by id inside one transaction ? You should
> > order the execution of the statements by id inside a transaction, and
> > the deadlocks should go away.
> > 
> > HTH,
> > Csaba.
> > 
> > On Wed, 2005-06-15 at 13:10, Gavin Love wrote:
> > 
> >>I am getting a number of deadlock errors in my log files and I was 
> >>wondering if anyone knows how I can stop them.
> >>
> >>Query failed: ERROR:  deadlock detected DETAIL:  Process 11931 waits for 
> >>ShareLock on transaction 148236867; blocked by process 11932. Process 
> >>11932 waits for ShareLock on transaction 148236866; blocked by process 
> >>11931.
> >>
> >>This is for a web application. Whenever a search result is displayed I 
> >>need to update a counter to say it has been viewed which is done with 
> >>between 1 and 15 updates in one transaction of the form.
> >>
> >>BEGIN;
> >>UPDATE stats SET click_count = click_count+1 WHERE id = '122'
> >>UPDATE stats SET click_count = click_count+1 WHERE id = '123'
> >>UPDATE stats SET click_count = click_count+1 WHERE id = '124'
> >>etc...
> >>COMMIT;
> >>
> >>My lock management config is:
> >>deadlock_timeout = 2000 # in milliseconds
> >>#max_locks_per_transaction = 64 # min 10, ~200*max_connections bytes
> >>
> >>I am using Postgres 8.0.3
> >>
> >>Does anyone know how I can stop these deadlocks from occurring?
> >>
> >>They are not a big problem as losing a few it only happens a couple of 
> >>times a day but I prefer to have everything working as it should.
> >>
> >>Thanks
> >>
> >>Gavin
> >>
> >>
> >>---(end of broadcast)---
> >>TIP 3: if posting/reading through Usenet, please send an appropriate
> >>  subscribe-nomail command to [EMAIL PROTECTED] so that your
> >>  message can get through to the mailing list cleanly
> > 
> > 
> > 
> > ---(end of broadcast)---
> > TIP 4: Don't 'kill -9' the postmaster
> > 


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


[GENERAL] Bad plan when null is in an "in" list

2005-07-26 Thread Csaba Nagy
Hi all,

Jumping in directly to the subject, this is what I get:

explain SELECT bigint_col_1, bigint_col_2 FROM big_table WHERE
bigint_col_2 in (12132131, null, null, null,
null);

QUERY PLAN
---
 Seq Scan on big_table  (cost=0.00..2447201.85 rows=448 width=16)
   Filter: ((bigint_col_2 = 12132131::bigint) OR NULL::boolean)
(2 rows)
 

Compared to:


explain SELECT bigint_col_1, bigint_col_2 FROM big_table WHERE
bigint_col_2 in (12132131, 123781, 1297839032, 123667123);

QUERY PLAN

 Index Scan using dom_idx_member_bigint_col_2,
dom_idx_member_bigint_col_2, dom_idx_member_bigint_col_2,
dom_idx_member_bigint_col_2 on big_table  (cost=0.00..6427.28 rows=1789
width=16)
   Index Cond: ((bigint_col_2 = 12132131) OR (bigint_col_2 = 123781) OR
(bigint_col_2 = 1297839032) OR (bigint_col_2 = 123667123))
(2 rows)
 

big_table has ~ 100 million rows.


Considering that NULL::boolean is never true, I don't quite understand
why the first query goes for a sequential scan instead of just ignoring
the nulls in the IN clause...
We have a lot of queries like this for chunking activities, and they work
fine on oracle too. We've just migrated a bigger database from oracle to
postgres, and this is the first thing to slow our system down to a
crawl... we will fix it on our side, but the planner could be smarter I
guess.

Cheers,
Csaba.



---(end of broadcast)---
TIP 6: explain analyze is your friend


[GENERAL] Bad plan for queries with IN clause

2005-07-29 Thread Csaba Nagy
Hi all,

Sorry for the cross-post, I need the bigger exposure, and I'm not sure
if it's JDBC issue or not.

This is a follow-up from an earlier post, with a new twist.

We have a Java application which was initially designed to work with
Oracle and then gradually adapted to also work with postgres, sometimes
even connecting at the same time to both oracle and postgres databases.

Until now we didn't have any big postgres database; that has just changed
by moving a big one from oracle to postgres... so now we are starting to
notice bad query plans.

We have quite a lot of data import/exporting, and we use the following
type of query to chunk these:

... WHERE primarykey IN (?, ?, , ?) ...

We use JDBC prepared statements, and fill the parameters with chunks of
the looked up data. The last chunk is smaller of course, and the
remaining parameters are filled up with nulls.

On oracle this works just fine.

Before my last post (reporting a bad plan on this type of query when
nulls are in the IN list) we were using the 7.4 JDBC driver, which AFAICT
does not actually use server-side prepared statements but sends each
query with the parameters inlined. I guess that's why the nulls were
taken into account in the query plan.

Now we have upgraded the JDBC driver (to 8.0), and I think this one
actually does use server-side prepared statements for java
PreparedStatements. This eliminates the null problem, and indeed most of
our queries now take index scans.

However, some of the queries still take the sequential scan route. The
most puzzling thing is that I've tried to "prepare" the same query in
psql and then "explain execute" the prepared query, and it gave me an
index scan plan... so now I'm clueless, and have no idea why the same
query prepared by java would yield a different plan than when prepared
manually... I thought the query plan is created when you prepare the
statement, not on each execution, right ? And I would expect the same
query prepared multiple times to give the same plan, provided the tables
didn't change significantly...

The sequential scan on those queries is even more surprising as the
execution time is so radically different, taking milliseconds for an
index scan, and even hours for the sequential scan (the involved tables
are from a few million rows to a few hundred million rows size).

BTW, I've tried to set "enable_seqscan = false", but those queries were
still going the sequential scan way. Could it be that the JDBC driver is
preparing with wrong parameter types ? I thought 8.0 is more forgiving
in this respect anyway.

I also tried to tweak the memory settings per
http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html
, but it didn't help either.

I have enabled plan logging at one point, I can provide an actual plan
for one of those queries if that would help diagnose what's happening...
for me the only relevant thing was:
"DETAIL: {SEQSCAN :startup_cost 0.00 :total_cost 1832574.05 :plan_rows 1
:plan_width 8 ... "
the rest is too low level and I don't understand it.

I would appreciate any help, as I am mostly clueless what the problem
could be.

Thanks,
Csaba.



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] [JDBC] Bad plan for queries with IN clause

2005-07-29 Thread Csaba Nagy
OK, found the source of the problem: our code was setting nulls via
PreparedStatement.setNull(int parameterIndex, int sqlType)
using Types.NUMERIC as the type, while the database type is actually
BIGINT. Using Types.BIGINT in this call makes the problem go away.
Hopefully there are no more bad surprises coming for us from the
oracle/postgres incompatibility twilight zone... (crossing fingers).

Cheers,
Csaba.


On Fri, 2005-07-29 at 12:29, Oliver Jowett wrote:
> Csaba Nagy wrote:
> 
> > However, some of the queries still take the sequential scan route. The
> > most puzzling in all this is that I've tried to "prepare" the same query
> > in psql, and then "explain execute" the prepared query, and it gave me
> > an index scan plan... so now I'm clueless, and have no idea why would
> > the same query prepared by java yield a different plan than prepared
> > manually... I thought that the query plan is created when you prepare
> > the statement, and not on each execution, right ? And I would expect
> > that the same query prepared multiple times would give the same plan,
> > provided that the tables didn't change significantly...
> 
> One gotcha is that by default the JDBC driver will use an unnamed
> statement for the first few executions of a particular
> PreparedStatement, then switch to using a (reused) named statement
> thereafter. The unnamed statement path can result in different plans to
> what you'd get with PREPARE or a named statement, as it delays planning
> until the first execution, then takes the actual parameter values into
> account when doing selectivity estimates. In contrast PREPARE and named
> statements plan immediately using placeholder estimates.
> 
> You can tweak the threshold for this on a per-connection or
> per-statement basis via PGConnection.setPrepareThreshold() and
> PGStatement.setPrepareThreshold() (statements inherit the connection's
> value on creation by default). The connection default is also settable
> via the prepareThreshold URL parameter. If you set it to 1, *every*
> PreparedStatement execution uses a named statement. If you set it to 0,
> named statements are never used.
> 
> > Could it be that the JDBC driver is
> > preparing with wrong parameter types ? I thought 8.0 is more forgiving
> > in this respect anyway.
> 
> One thing that may be useful for debugging this: if you set logLevel=2
> as a URL parameter the JDBC driver will log the protocol messages it
> sends and receives to the JDBC log writer (stderr by default) -- that
> includes the type OIDs and whether it's using an unnamed or a named
> statement.
> 
> -O


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[GENERAL] Query stucked in pg_stat_activity

2005-08-09 Thread Csaba Nagy
Hi all,

I have a postgres system to which we just migrated a fairly big data
set. The application accessing it is a cluster of servers which do
burst-like processing, i.e. when they have work to do it is distributed
across the cluster and the database comes under fairly high load.
On our first test run everything went fine; the only strange thing is a
row in pg_stat_activity about a query which is long gone - the process
pointed to by the procpid field no longer exists.
I've executed "select pg_stat_reset();" as superuser, and everything went
away except the offending row... I wonder what can cause this behavior,
and how to get rid of it ? I'm using the pg_stat_activity view to detect
long running queries, and I do expect a few more of these to pop up over
time...
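
The kind of query I use for that, in case it helps anyone (column names
as of 8.0; the 5 minute threshold is just an example):

SELECT procpid, usename, query_start, current_query
FROM pg_stat_activity
WHERE current_query <> '<IDLE>'
  AND query_start < now() - interval '5 minutes'
ORDER BY query_start;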

Thanks,
Csaba.




---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] Query stucked in pg_stat_activity

2005-08-09 Thread Csaba Nagy
Oh, I forgot to mention that the DB is version 8.0.3 running on linux.

Cheers,
Csaba.


On Tue, 2005-08-09 at 15:19, Csaba Nagy wrote:
> Hi all,
> 
> I have a postgres system where we just migrated a fairly big data set.
> The application accessing it is a cluster of servers which do burst-like
> processing, i.e. when they have some work to do, it will be distributed
> in the cluster and the data base will be under fairly high load.
> On our first test run everything went fine, the only strange thing is a
> row in the pg_stat_activity, which has a row about a query which is long
> gone, the process pointed by the procpid field is not existing.
> I've executed a "select pg_stat_reset();" as superuser, and all went
> away except the offending row... I wonder what can cause this behavior,
> and how to get rid of it ? I'm using the pg_stat_activity view to detect
> long running queries, and I do expect a few more to pop up in time...
> 
> Thanks,
> Csaba.
> 
> 
> 
> 
> ---(end of broadcast)---
> TIP 1: if posting/reading through Usenet, please send an appropriate
>subscribe-nomail command to [EMAIL PROTECTED] so that your
>message can get through to the mailing list cleanly


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [GENERAL] Query stucked in pg_stat_activity

2005-08-09 Thread Csaba Nagy
Michael, 

I've read the message you referred to, and it's probably what happens
here.
In fact the original row I complained about is gone, and I now have 10
other dead processes listed in pg_stat_activity... one of the entries
shows an "<IDLE>" query, still listed after 25 minutes, and the
associated process is gone, so it's clearly an inconsistent state of the
stats collector. I wonder if there's a way to fix that without affecting
performance too much ?
The logs don't show the "statistics buffer is full" message suggested by
Tom, but OTOH "log_min_messages = info", and that message might be a
debug-level one.

In any case it seems my system can readily reproduce the issue whenever I
place a bigger load on it...
Cheers,
Csaba.



On Tue, 2005-08-09 at 15:51, Michael Fuhr wrote:
> On Tue, Aug 09, 2005 at 03:19:46PM +0200, Csaba Nagy wrote:
> > I have a postgres system where we just migrated a fairly big data set.
> > The application accessing it is a cluster of servers which do burst-like
> > processing, i.e. when they have some work to do, it will be distributed
> > in the cluster and the data base will be under fairly high load.
> > On our first test run everything went fine, the only strange thing is a
> > row in the pg_stat_activity, which has a row about a query which is long
> > gone, the process pointed by the procpid field is not existing.
> 
> I ran across this situation a while ago, where high load caused
> pg_stat_activity to have stale entries.  Tom Lane wondered if the
> stats subsystem was under a high enough load that it was dropping
> messages, as it's designed to do.
> 
> http://archives.postgresql.org/pgsql-bugs/2004-10/msg00163.php


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] Query stuck in pg_stat_activity

2005-08-09 Thread Csaba Nagy
[snip]
> > I've executed a "select pg_stat_reset();" as superuser, and all went
> > away except the offending row...
> 
> That only resets the I/O counts (and only for one database), not the
> backend activity info.
> 
>   regards, tom lane

This reminds me, I forgot to ask: is there any other way of getting rid
of those ghost entries than another big load ? The next big load will
leave another set of ghosts behind it...

Cheers,
Csaba.


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [GENERAL] World-wide Stop Order on PERL,Python,Java::->> Use

2005-08-11 Thread Csaba Nagy
[snip]
> He's not spam.

But anybody could easily forge his address when sending to the group,
so that makes the mail itself no less spam.

The best way to handle such mails currently is to dump and ignore them.
Otherwise threads like this one pop up...

Cheers,
Csaba.


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] Postgresql server "crash" after some day

2005-08-17 Thread Csaba Nagy
Is it possible that your application is not closing connections, so that
the server's connection limit is reached after a few days ? Actually I'm
not sure whether the "max_connections" parameter applies to local
connections too, because in that case you shouldn't be able to connect
locally either...

Just a thought.
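
If you want to check whether the connection limit is really the problem,
something along these lines should show it (just a rough check; run it
locally while the remote connections are being refused):

SELECT count(*) AS open_connections FROM pg_stat_activity;
SHOW max_connections;   -- the configured limit

If open_connections sits at (or near) the limit, a connection leak in
the application is the likely culprit.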

Cheers,
Csaba.

On Wed, 2005-08-17 at 14:55, Stefano B. wrote:
> Hi all,
>  
> I have some strange problems with postgres.
> After some days it works in local but it doesn't listen on port 5432
> for remote request.
>  
> I can make a connection from local host but I can't from remote host.
> If I try to make a connection from remote host the server returns me
> the classic remote error:
>  
> could not connect to server ... is the server running on host ...
> and accepting TCP/IP connections on port 5432?
>  
> This problem happens after some day that I have start the server.
> There is no particular error in the log file (the server seems to
> work)
>  
> PostgreSql version: 8.0.3
> on FreeBSD 5.3
>  
> Thanks in advance.


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [GENERAL] Finding nearest numeric value

2005-08-17 Thread Csaba Nagy
[snip]
> To find the nearest value in number_column to some CONSTANT (where you
> replace constant with a number), try:
> 
> select *,(number_column - CONSTANT)^2 as d from tablename order by d limit
> 1;
> 
This will scan the whole table and sort the results... only to pick
just one row. Watch this:

db=> prepare test_01(bigint) as select *, (pk_col - $1) ^ 2 as d from
big_table order by d limit 1;
PREPARE
db=> explain execute test_01(27163619);
                                     QUERY PLAN
--------------------------------------------------------------------------------------
 Limit  (cost=31239164.71..31239164.72 rows=1 width=59)
   ->  Sort  (cost=31239164.71..31505657.00 rows=106596914 width=59)
         Sort Key: (((pk_col - $1))::double precision ^ 2::double precision)
         ->  Seq Scan on big_table  (cost=0.00..3149688.00 rows=106596914 width=59)
(4 rows)
 

The names were changed (this is a production DB), but the idea is:
big_table has around 100 million rows, and pk_col is the primary key on
it. Running the above query would take forever.

If you don't have an index on the numeric column, or if the table is
small, this might be your best choice... but if your table is big and
you have an index on the numeric column, you should use something along
these lines:

select * from big_table where number_col < CONSTANT order by
number_col desc limit 1;

select * from big_table where number_col > CONSTANT order by
number_col limit 1;

You execute the 2 queries, which are very fast even for big tables if
you have an index on number_col, and then choose the row with the
smallest difference (you do this in your client program).

HTH,
Csaba.


> Does that do it for you?
> 
> Sean
> 
> 
> ---(end of broadcast)---
> TIP 3: Have you checked our extensive FAQ?
> 
>http://www.postgresql.org/docs/faq


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] Finding nearest numeric value

2005-08-17 Thread Csaba Nagy
The only problem is that you can't use the order by/limit syntax inside
the union queries, I guess, because the query you proposed gives a
syntax error. I also first thought to do it like this, but it won't
work. If it did, you could wrap the thing in another query which
orders by the difference and limits to the first row ;-)

Cheers,
Csaba.

On Wed, 2005-08-17 at 17:10, Richard Huxton wrote:
> Poul Møller Hansen wrote:
> > Does anyone know how to find the row with the nearest numeric value, not 
> > necessarily an exact match ?
> 
> While the other answers all do their job, and in one go too, I'd be 
> surprised if you found anything faster than:
> 
> SELECT myval FROM mytable WHERE myval > 1234 ORDER BY myval LIMIT 1
> UNION ALL
> SELECT myval FROM mytable WHERE myval < 1234 ORDER BY myval DESC LIMIT 1
> 
> That gives you (up to) two values to look at, but should use any index 
> you have on myval.
> 
> You can always sort the results by abs(myval) then if you don't want to 
> handle two values in the application layer.


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [GENERAL] Finding nearest numeric value

2005-08-17 Thread Csaba Nagy
Yep, you're right. The following works and uses the index on pk_col:

prepare test_01 (bigint) as
select * from
  (
(SELECT * FROM big_table WHERE pk_col > $1 ORDER BY pk_col LIMIT 1)
UNION ALL
(SELECT * FROM big_table WHERE pk_col < $1 ORDER BY pk_col DESC
LIMIT 1)
  ) as nearest
order by abs(pk_col - $1)
limit 1;


db=> explain execute test_01(12321);
                                                    QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
 Limit  (cost=2.12..2.12 rows=1 width=112)
   ->  Sort  (cost=2.12..2.13 rows=2 width=112)
         Sort Key: abs((pk_col - $1))
         ->  Subquery Scan nearest  (cost=0.00..2.11 rows=2 width=112)
               ->  Append  (cost=0.00..2.08 rows=2 width=59)
                     ->  Subquery Scan "*SELECT* 1"  (cost=0.00..1.04 rows=1 width=59)
                           ->  Limit  (cost=0.00..1.03 rows=1 width=59)
                                 ->  Index Scan using idx_pk_col on big_table  (cost=0.00..36639172.72 rows=35532914 width=59)
                                       Index Cond: (pk_col > $1)
                     ->  Subquery Scan "*SELECT* 2"  (cost=0.00..1.04 rows=1 width=59)
                           ->  Limit  (cost=0.00..1.03 rows=1 width=59)
                                 ->  Index Scan Backward using idx_pk_col on big_table  (cost=0.00..36639172.72 rows=35532914 width=59)
                                       Index Cond: (pk_col < $1)
(13 rows)
 

Cheers,
Csaba.

On Wed, 2005-08-17 at 17:57, Bruno Wolff III wrote:
> On Wed, Aug 17, 2005 at 17:35:37 +0200,
>   Csaba Nagy <[EMAIL PROTECTED]> wrote:
> > The only problem is that you can't use the order by/limit syntax inside
> > the union queries I guess, cause the query you proposed is giving a
> > syntax error. I also thought first to do it like this, but it won't
> > work. If it would, then you could wrap the thing in another query which
> > orders by the difference and limits to the first one ;-)
> 
> You probably can just add parenthesis. I think that the second ORDER BY
> and LIMIT may be being applied to the UNION results which would be a
> problem. Putting the second subquery in parens will take care of this if
> that is the problem.
> 
> > 
> > On Wed, 2005-08-17 at 17:10, Richard Huxton wrote:
> > > Poul Møller Hansen wrote:
> > > > Does anyone know how to find the row with the nearest numeric value, 
> > > > not 
> > > > necessarily an exact match ?
> > > 
> > > While the other answers all do their job, and in one go too, I'd be 
> > > surprised if you found anything faster than:
> > > 
> > > SELECT myval FROM mytable WHERE myval > 1234 ORDER BY myval LIMIT 1
> > > UNION ALL
> > > SELECT myval FROM mytable WHERE myval < 1234 ORDER BY myval DESC LIMIT 1
> > > 
> > > That gives you (up to) two values to look at, but should use any index 
> > > you have on myval.
> > > 
> > > You can always sort the results by abs(myval) then if you don't want to 
> > > handle two values in the application layer.


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] Support for Limit in Update, Insert...

2005-09-09 Thread Csaba Nagy
Well, I do have a use case for it. 

Context:

We have data coming in from web requests, which must be fast, so we just
insert it into temporary tables without any verification. It is then
periodically processed by a background task, but even that one will
process just a chunk at a time, to avoid long-running queries and the
socket timeouts that can come with them. Now, for identifying a chunk we
use a "chunkid" field in those temporary tables, which is initially
null. When a chunk is selected for processing, we update the chunkid
field with the next value of a sequence, and then all further processing
has a where clause which selects only records with that chunkid.

Use case:

To set the chunkid for just 1000 rows, where we actually don't care
which ones. The idea is to uniquely mark a chunk of data; we really
don't care which rows are selected in each chunk, as they will all be
processed eventually.

Of course, right now we do it with something similar to what you
proposed, using a subselect with a limit clause (roughly like the sketch
below); I wonder if a simple UPDATE with LIMIT could be faster ?
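
For illustration, the current pattern is roughly this (a sketch only;
the table name "incoming_data", its "id" primary key and the sequence
"chunk_seq" are made-up names):

-- 1) grab a chunk id once (on the client side)
SELECT nextval('chunk_seq');          -- say it returns 42

-- 2) mark an arbitrary batch of unprocessed rows with it
UPDATE incoming_data
   SET chunkid = 42
 WHERE id IN (SELECT id FROM incoming_data
               WHERE chunkid IS NULL
               LIMIT 1000);

-- 3) all further processing filters on chunkid = 42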

Cheers,
Csaba.

On Fri, 2005-09-09 at 04:49, Tom Lane wrote:
> Alvaro Herrera <[EMAIL PROTECTED]> writes:
> > This has been discussed before, and rejected.  Please see the archives.
> 
> For SELECT, both LIMIT and OFFSET are only well-defined in the presence
> of an ORDER BY clause.  (One could argue that we should reject them when
> no ORDER BY, but given that the database isn't getting changed as a side
> effect, that's probably too anal-retentive.  When the database *is*
> going to be changed, however, I for one like well-defined results.)
> 
> If this proposal included adding an ORDER BY to UPDATE/DELETE, then it
> would at least be logically consistent.  I have not seen the use-case
> for it though.  In any case you can usually get the equivalent result
> with something like
> 
>   UPDATE foo SET ...
>   WHERE pkey IN (SELECT pkey FROM foo ORDER BY ... LIMIT ...);
> 
>   regards, tom lane
> 
> ---(end of broadcast)---
> TIP 2: Don't 'kill -9' the postmaster


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [GENERAL] Problem using NULLIF in a CASE expression

2005-09-09 Thread Csaba Nagy
You're misusing the CASE construct, trying to compare a text result
(from the NULLIF) with a boolean result (from the NOT NULL). You
probably should use the other form of CASE:

SELECT CASE WHEN NULLIF(btrim(' A string', ' '), '') IS NULL THEN NULL
ELSE 6 END AS type_id;

I guess you were looking for something like:

db=> SELECT CASE WHEN NULLIF(btrim(' A string', ' '), '') IS NULL THEN
NULL ELSE 6 END AS type_id;
 type_id
---------
       6
(1 row)

db=> SELECT CASE WHEN NULLIF(btrim(' ', ' '), '') IS NULL THEN NULL ELSE
6 END AS type_id;
 type_id
---------

(1 row)
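
Applied to your actual goal (show the INT8 column when the trimmed
varchar is empty or NULL, and NULL otherwise), it would look roughly
like this; "my_table", "text_col" and "int8_col" are made-up names:

SELECT CASE
         WHEN NULLIF(btrim(text_col, ' '), '') IS NULL THEN int8_col
         ELSE NULL
       END AS result
  FROM my_table;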
 

Cheers,
Csaba.


On Fri, 2005-09-09 at 13:45, Bruno BAGUETTE wrote:
> Hello,
> 
> I have a problem using NULLIF inside a CASE expression.
> 
> I explain the goal of my query : I have a table where there is two 
> fields : one varchar field and one INT8 field.
> When the varchar field is empty I have to display the INT8 field.
> If the varchar field is not empty I have to display a NULL value (even 
> if the INT8 field contains something).
> 
> BUT : that table is used by several applications and theses applications 
> does not always store NULL in the varchar field. Some applications store 
> an empty string '', or stores  char(s).
> SO, I have to do a btrim and to get a NULL value when I just have '' 
> after the btrim.
> 
> I wrote another (and quite shorter!) SQL query to resume the problem :
> 
> SELECT CASE NULLIF(btrim(' A string', ' '), '')
> WHEN NOT NULL
>   THEN NULL
> ELSE 6
> END AS type_id;
> 
> ERROR:  operator does not exist: text = boolean
> HINT:  No operator matches the given name and argument type(s). You may 
> need to add explicit type casts.
> 
> Why this query does not accept the NULLIF ?
> What can I do to solve that problem ?
> 
> Thanks in advance !
> 
> Bruno BAGUETTE.
> 
> ---(end of broadcast)---
> TIP 3: Have you checked our extensive FAQ?
> 
>http://www.postgresql.org/docs/faq


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] Replication

2005-09-13 Thread Csaba Nagy
Well, AFAICT this kind of replication in postgres can hardly be called
"out of the box". Setting the whole thing up and testing that it really
works reliably is not exactly trivial, and you can run into a number of
surprises (like when there's little activity the last WAL segment stays
on the server, so the stand-by can lag by a considerable amount of time,
even if in terms of data quantity the difference is not that big). You
must be quite good at scripting to write reliable scripts to handle the
process; I think there are some examples around, but you still need to
customize them for sure.

What is missing is an easy way to set up the hot stand-by, without
requiring the user to do too much side-work.

I wonder if there is a way to set up a direct link between the standby
and the server, and instead of copying files just send the WAL data over
directly via some background process ? The overhead of sending the data
directly should be similar to copying the files. Then the easiest way to
set up a standby would be to start the stand-by server with some standby
options, giving the server's connection params; the standby would then
make a special connection to the server requesting a data dump plus the
WAL from the start of that dump onwards. Would this be reasonably simple
to implement ? I'm not at all familiar with the postgres code or the C
language, so this is just wishful thinking.

Cheers,
Csaba.



On Tue, 2005-09-13 at 17:45, Russ Brown wrote:
> Simon Riggs wrote:
> > Barry,
> > 
> > You can use PITR to archive transaction logs to a second server that is
> > kept in standby mode.
> > 
> > This will cope with any number of tables and cope with dynamic changes
> > to tables.
> > 
> > This is fairly straightforward and very low overhead.
> > Set archive_command to a program that transfers xlog files to second
> > server.
> > Then set restore_command on the second server to a program that loops
> > until the next file is available.
> > Switchover time is low.
> > 
> 
> Apologies for going slighly off topic, but isn't this basically how 
> MySQL does replication? I ask because one of the arguments against 
> moving to PostgreSQL in my organisation is 'no replication out of the 
> box'. But if the above is true it seems that all that is required are a 
> couple of scripts to handle log transfer and you have a form of 
> replication out of the box right there.
> 
> Or am I missing something?


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[GENERAL] Inserting slows down

2005-09-19 Thread Csaba Nagy
Hi all,

I am running a data import in a Postgres 8.0.1 data base. The target
table is one with ~ 100 million rows. The imported data is around 40
million rows. The import is inserting 1 rows per transaction. The
table has a few indexes on it, a few foreign key constraints, and one
insert trigger which normally inserts a row in another table for each
insert on this one, but for the duration of the data transfer it was
modified to not insert anything for the transferred data; it does,
however, do an extra lookup on the index of a ~ 1 big table to get the
exclusion condition.

Given this context, at transfer start I had a transfer rate of ~ 1,3
million rows per hour, which dropped progressively to ~ 30 now after
transferring ~ 24 million rows.

My question is: what could cause this progressive slow-down ? I would
have expected a more or less constant transfer rate, as the table was
not empty in the first place, so it's not simply a matter of the indexes
slowing down as they grow... after a growth of ~ 20% I see a slow-down
of more than 4x for inserts. What should I suspect here ? I can't just
drop the indexes/foreign keys/triggers, the db is in production use.

TIA for any ideas,
Csaba.



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] Inserting slows down

2005-09-19 Thread Csaba Nagy
OK, I've investigated a bit further and made some charts of the
insert speed.

It looks like it fluctuates in a very wide range even when it is
fast on average, and there's no "progressive" slow-down visible; it's
more like a sudden drop in average speed at one point. Note that even
after the average speed drops there are still spikes of high speed for
short periods.

That makes me wonder whether this is somehow dependent on the data
itself. Is it possible that updating the b-tree indexes is heavily
dependent on the ordering/randomness of the keys ? If so, is there any
way I could order the incoming data so that index update performance is
optimal (see the sketch below for the kind of thing I mean) ? Or is the
server perhaps not optimally tuned and I should look into some tuning ?
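
Just to make the "ordering" idea concrete, what I have in mind is
something along these lines, assuming the imported rows first land in a
staging table ("staging_table", "big_table" and the column names are
made up, and whether this actually helps the b-tree insert pattern is
exactly my question):

INSERT INTO big_table (col_a, col_b)
SELECT col_a, col_b
  FROM staging_table
 ORDER BY col_a, col_b;   -- same order as the index / primary key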

TIA,
Csaba.


On Mon, 2005-09-19 at 11:18, Csaba Nagy wrote:
> Hi all,
> 
> I am running a data import in a Postgres 8.0.1 data base. The target
> table is one with ~ 100 million rows. The imported data is around 40
> million rows. The import is inserting 1 rows per transaction. The
> table has a few indexes on it, a few foreign constraints, and one insert
> trigger which normally inserts a row in another table for each insert on
> this one, but for the duration of the data transfer it was modified to
> not insert anything for the data which is transfered, but it does an
> extra lookup on the index of a ~ 1 big table to get the exclusion
> condition.
> 
> Given this context, at transfer start I've had a transfer rate at ~ 1,3
> million rows per hour, which dropped progressively to ~ 30 now after
> transferring ~ 24 million rows. 
> 
> My question is what could cause this progressive slow-down ? I would
> have expected a more or less constant transfer rate, as the table was
> not empty in the first place to say that the indexes slow down as they
> grow... so after a growth of ~ 20% I have a slow-down for inserts of
> more than 4 times. What should I suspect here ? I can't just drop the
> indexes/foreign keys/triggers, the db is in production use.
> 
> TIA for any ideas,
> Csaba.
> 
> 
> 
> ---(end of broadcast)---
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>choose an index scan if your joining column's datatypes do not
>match


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] running vacuum in scripts

2005-09-20 Thread Csaba Nagy
You could use the system-wide crontab, which is configured via
/etc/crontab; there you can specify which user should execute the
command, without needing passwords. The system-wide crontab is executed
by root and will run the command as the user you specify.

For vacuuming, you could try the autovacuum daemon; it is included in
the contrib part of postgres. If you installed from source, you will
likely need to install autovacuum separately; if you installed a
prepackaged postgres, chances are that you already have the
pg_autovacuum executable installed. It is fairly easy to set up.

HTH,
Csaba.


On Mon, 2005-09-19 at 20:27, Belinda M. Giardine wrote:
> This seems like it should be a frequently asked question, but I am having
> trouble finding the answer.  I am in the process of switching to using
> Postgres, and realize that I need to run vacuum analyze regularly on the
> tables.  This is on a Unix system so cron is the obvious choice.  The
> problem is I don't want to put the user name and password in the script.
> As far as I can tell vacuum must be run by the table or database owner.
> It wouldn't be as bad to have the password in the script if it was a
> limited permissions user.  Any suggestions on the best methods?
> 
> Thanks,
> Belinda
> 
> 
> ---(end of broadcast)---
> TIP 6: explain analyze is your friend


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] deactivating/activating constraint

2005-09-20 Thread Csaba Nagy
You might be looking for a DEFERRED constraint.
You can declare your constraint to be checked only at transaction end,
and then make all your data changes in one transaction.

You will find details here:
http://www.postgresql.org/docs/8.0/static/sql-createtable.html
Search for DEFERRED.
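
For example, here is a minimal sketch loosely based on the Table2/Table3
example from your mail (names lower-cased, other details omitted):

CREATE TABLE table2 (c integer PRIMARY KEY);
CREATE TABLE table3 (
    d integer,
    e integer REFERENCES table2 (c)
              DEFERRABLE INITIALLY DEFERRED
);

BEGIN;
-- the FK on table3.e is only checked at COMMIT,
-- so the insert order no longer matters
INSERT INTO table3 VALUES (2, 3);
INSERT INTO table2 VALUES (3);
COMMIT;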

HTH,
Csaba.


On Tue, 2005-09-20 at 16:26, Oleg wrote:
> Dear All,
> is it possible to temporary deactivate a constraint in PostgreSQL?
> There is a constraint that does not allow me to write some data (see 
> e-mail below). But after all datasets are written the constraint is 
> valid. So I was wondering wether it is possible to deactivate a 
> constraint write all records in all tables then activate constraint 
> again. Somebody told me that it is possible in Oracle.
> Thanks a lot in advance
> Oleg
> 
> > Dear All
> > I am PostgreSQL beginner. I am trying to write some values from 
> > external source table “Table1” (using PostGIS plugin) to 2 destination 
> > PostgreSQL tables “Table2” and “Table3” that describes relations of data:
> >
> > Table1:
> > | A | B |
> > -
> > | 1 | 1 |
> > | 2 | 3 |
> > | 3 | 1 |
> >
> > Table2
> > | C |
> > -
> > | 1 |
> > | 2 |
> >
> > Table3
> > | D | E |
> > -
> > | 1 | 1 | case 1
> > | 2 | 3 | case 2
> >
> > Table3 has constraint:
> > FOREIGN KEY E REFERENCES Table2 (C) ON UPDATE NO ACTION ON DELETE 
> > RESTRICT;
> >
> > I use updateable view “View1” (due to PostGIS interface) with columns 
> > “C” and “E” and rule:
> > AS ON INSERT TO View1 DO INSTEAD (INSERT INTO Table2 VALUES (NEW.C); 
> > INSERT INTO Table3 VALUES (NEW.D, NEW.E));
> >
> > I faced following problem: As it is shown in Table3 while trying to 
> > write data in case 1 everything works fine, but case 2 is not possible 
> > due to constraint in Table3. However my aim is to write a column A to 
> > column C, column A to column D and column B to column E not removing 
> > constraint for table3. May be there is a way to adjust constraint? Or 
> > may be to adjust rule somehow to make it to write all data from column 
> > A to column C first and after that fill in Table3?
> > Thanks a lot in advance
> > Oleg
> 
> 
> 
> ---(end of broadcast)---
> TIP 5: don't forget to increase your free space map settings


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

