Re: [GENERAL] High resolution PostgreSQL Logo

2008-05-03 Thread Sanjaya Kumar Patel

Thanks!
Sanjay


> Date: Fri, 2 May 2008 16:00:32 +0200
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> CC: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] High resolution PostgreSQL Logo
> 
> On Fri, May 02, 2008 at 03:32:29PM +0530, Sanjaya Kumar Patel wrote:
>> I would like to print the PostgreSQL logo on our company brochure. I think
>> that is permissible, is it not? Is there a high-resolution logo somewhere?
> 
> try here:
> http://pgfoundry.org/docman/?group_id=189
> 
> best regards,
> 
> depesz
> 
> -- 
> quicksil1er: "postgres is excellent, but like any DB it requires a
> highly paid DBA.  here's my CV!" :)
> http://www.depesz.com/ - blog dla ciebie (i moje CV)



[GENERAL] Unloading a table consistently

2008-05-03 Thread Christophe

Hi,

I will have a log table which, once a day or so, is copied to a file  
(for movement to a data warehouse), and the log table emptied.  For  
performance, the log table on the production system has no indexes,  
and is write-only.  (The unload process is the only reader.)


To unload it, I will be doing:

BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
COPY log TO 'filename-path';
TRUNCATE log;
COMMIT;

My understanding is that I need the SERIALIZABLE isolation level so  
that the COPY and TRUNCATE see exactly the same view of the table.   
Obviously, I don't want to lose data by having the TRUNCATE delete  
records that appeared while the COPY was executing.  Is that  
correct?  Is there a better way to handle this kind of thing that I'm  
missing?


Thanks!
-- Xof



Re: [GENERAL] Unloading a table consistently

2008-05-03 Thread Tom Lane
Christophe <[EMAIL PROTECTED]> writes:
> I will have a log table which, once a day or so, is copied to a file  
> (for movement to a data warehouse), and the log table emptied.  For  
> performance, the log table on the production system has no indexes,  
> and is write-only.  (The unload process is the only reader.)

> To unload it, I will be doing:

> BEGIN;
> SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
> COPY log TO 'filename-path';
> TRUNCATE log;
> COMMIT;

> My understanding is that I need the SERIALIZABLE isolation level so  
> that the COPY and TRUNCATE see exactly the same view of the table.   

No, the SERIALIZABLE bit is useless if not actually counterproductive
here.  What you'll actually need is an exclusive lock:

BEGIN;
LOCK TABLE log;
COPY log TO 'filename-path';
TRUNCATE log;
COMMIT;

to prevent anyone from inserting new rows during the COPY.  Otherwise
any such rows would be lost at the TRUNCATE.

You might be able to get away with a lock that only locks out writers
and not readers (offhand I think IN SHARE MODE is the right thing),
but from your description that would be a useless refinement anyway.
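
A sketch of that variant, on the assumption that SHARE mode is in fact the
right lock level (it blocks writers but not readers; note that the TRUNCATE
itself will still take an exclusive lock when it runs):

BEGIN;
LOCK TABLE log IN SHARE MODE;  -- locks out concurrent INSERT/UPDATE/DELETE
COPY log TO 'filename-path';
TRUNCATE log;
COMMIT;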

If you can't tolerate locking out writers for that long, you won't
be able to use TRUNCATE.  The operation I think you were imagining is

BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
COPY log TO 'filename-path';
DELETE FROM log;
COMMIT;
VACUUM log; -- highly recommended at this point

Here, the serializable mode is exactly what you want to guarantee
that the DELETE deletes the same rows the COPY copied.  This is
a great deal less efficient than TRUNCATE, but it's secure for
concurrent insertions, which TRUNCATE is definitely not.

regards, tom lane



Re: [GENERAL] Unloading a table consistently

2008-05-03 Thread Christophe

On May 3, 2008, at 9:56 AM, Tom Lane wrote:

> This is
> a great deal less efficient than TRUNCATE, but it's secure for
> concurrent insertions, which TRUNCATE is definitely not.


Exactly my question; thank you!

-- Xof



Re: [GENERAL] clustering without locking

2008-05-03 Thread Tom Lane
Craig Ringer <[EMAIL PROTECTED]> writes:
> Later on, though, less new space would have to be allocated because more 
> and more of the space allocated earlier to hold moved tuples would be 
> being freed up in useful chunks that could be reused.

I don't see how that works.  If the minimum size of the table is X
pages, ISTM that the first pass has to push everything up to pages above
X.  You can't put any temporary copies in pages <= X because you might
need that space when it comes time to make the clustering happen.  So
the table is going to bloat to (at least) 2X pages.  The extra pages
will be *mostly* empty when you're done, but probably not *entirely*
empty if there have been concurrent insertions --- and you'll never be
able to clean them out without taking exclusive lock.

If you could accurately predict a tuple's final position, you could
maybe get away with putting it temporarily in a page above that one
but still less than X.  I don't see how you do that though, especially
not in the face of concurrent insertions.  (In fact, given concurrent
insertions I don't even see how to know what X is.)

regards, tom lane



[GENERAL] custom C function problem

2008-05-03 Thread Dan "Heron" Myers
I'm creating some custom C functions to load dynamically from a dll 
(this is Postgres 8.3.1 on Windows XP SP2).  I have two that work fine, 
but any time I try to write one that uses a text*, postgres crashes. 
This is true even for the example functions like "copytext" given in the 
documentation here:

http://developer.postgresql.org/pgdocs/postgres/xfunc-c.html

I'm compiling my DLL with MinGW.  I do CREATE FUNCTION as described in 
the documentation, and then I do a "SELECT copytext(colname) FROM 
tablename" (colname is a text column), at which point Visual Studio's 
"Something in postgres.exe crashed.  Do you want to debug?" dialog pops 
up.  (Of course, I click no and postgres appears to restart itself.)


Again, I have two other functions (neither of which uses a text*), both 
of which I can successfully load into postgres (from the same dll file 
as the function that crashes) and use in queries without problems.


Does anyone have any idea why this is happening?

- Dan



Re: [GENERAL] custom C function problem

2008-05-03 Thread Tom Lane
"Dan \"Heron\" Myers" <[EMAIL PROTECTED]> writes:
> I'm creating some custom C functions to load dynamically from a dll 
> (this is Postgres 8.3.1 on Windows XP SP2).  I have two that work fine, 
> but any time I try to write one that uses a text*, postgres crashes. 

What cases have you gotten to work correctly?

My guess is that you're either messed up about V0 vs V1 calling
convention (ie you forgot PG_FUNCTION_INFO_V1, or added it when you
shouldn't have), or you've got some kind of problem with not detoasting
toasted input values.  There's not enough info here to venture more.

regards, tom lane



Re: [GENERAL] custom C function problem

2008-05-03 Thread Dan "Heron" Myers

Tom Lane wrote:

> What cases have you gotten to work correctly?
>
> My guess is that you're either messed up about V0 vs V1 calling
> convention (ie you forgot PG_FUNCTION_INFO_V1, or added it when you
> shouldn't have), or you've got some kind of problem with not detoasting
> toasted input values.  There's not enough info here to venture more.
>
> regards, tom lane


This one works correctly:

PG_FUNCTION_INFO_V1(event_duration);

Datum
event_duration(PG_FUNCTION_ARGS)
{
    int32 state = PG_GETARG_INT32(0);
    int32 target = PG_GETARG_INT32(1);
    int32 event = PG_GETARG_INT32(2);
    Timestamp start = PG_GETARG_TIMESTAMP(3);
    Timestamp end = PG_GETARG_TIMESTAMP(4);

    /* If this event is the correct type we need to add the event time
       to the total event time (state) */
    if (target == event) {
        state += (end - start);
    }

    PG_RETURN_INT32(state);
}

I can use event_duration in this query without problems:

SELECT call_id, event_duration(4,event_type,start_time,end_time) AS 
talking_duration FROM event GROUP BY call_id;
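
For that query to work with GROUP BY, event_duration has presumably also
been registered as the state-transition function of an aggregate (the
registration isn't shown here); a hypothetical sketch of it:

CREATE AGGREGATE event_duration (integer, integer, timestamp, timestamp) (
    sfunc = event_duration,  -- the C function above; state is its first argument
    stype = integer,
    initcond = '0'
);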


One case that fails is essentially copied from the V1 section in the 
documentation:


PG_FUNCTION_INFO_V1(copytext);

Datum
copytext(PG_FUNCTION_ARGS)
{
    text *t = PG_GETARG_TEXT_P(0);
    text *new_t = (text *) palloc(VARSIZE(t));
    SET_VARSIZE(new_t, VARSIZE(t));

    memcpy((void *) VARDATA(new_t), (void *) VARDATA(t),
           VARSIZE(t) - VARHDRSZ);
    PG_RETURN_TEXT_P(new_t);
}

Attempting to use copytext in a query results in Postgres crashing.
For example:

SELECT copytext(calling_party) FROM event;

crashes.

- Dan



[GENERAL] Problem revoking a user's 'create' privilege on schema public

2008-05-03 Thread jdietrch
I am having trouble revoking a user's create privilege on
schema public.

Here is the sequence of commands that demonstrates the problem:

[EMAIL PROTECTED]:~$ su
Password: 
saturn:/home/jdietrch# su postgres
[EMAIL PROTECTED]:/home/jdietrch$ psql 
Welcome to psql 8.3.1, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
   \h for help with SQL commands
   \? for help with psql commands
   \g or terminate with semicolon to execute query
   \q to quit

postgres=# revoke all on schema public from public;
REVOKE
postgres=# create role user1 password 'user1' login createdb;
CREATE ROLE
postgres=# create role user2 password 'user2' login;
CREATE ROLE
postgres=# revoke all on schema public from user2;
REVOKE
postgres=# grant usage on schema public to user2;
GRANT
postgres=# \q
[EMAIL PROTECTED]:/home/jdietrch$ psql -U user1 template1
Password for user user1: 
Welcome to psql 8.3.1, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
   \h for help with SQL commands
   \? for help with psql commands
   \g or terminate with semicolon to execute query
   \q to quit

template1=> create database db1;
CREATE DATABASE
template1=> \q
[EMAIL PROTECTED]:/home/jdietrch$ psql -U user1 db1
Password for user user1: 
Welcome to psql 8.3.1, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
   \h for help with SQL commands
   \? for help with psql commands
   \g or terminate with semicolon to execute query
   \q to quit

db1=> create table table1(id integer);
CREATE TABLE
db1=> select has_schema_privilege('public', 'create');
 has_schema_privilege 
----------------------
 t
(1 row)

db1=> \q
[EMAIL PROTECTED]:/home/jdietrch$ psql -U user2 db1
Password for user user2: 
Welcome to psql 8.3.1, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
   \h for help with SQL commands
   \? for help with psql commands
   \g or terminate with semicolon to execute query
   \q to quit

db1=> create table table2(id integer);
CREATE TABLE
db1=> select has_schema_privilege('public', 'create');
 has_schema_privilege 
----------------------
 t
(1 row)

db1=> \q
[EMAIL PROTECTED]:/home/jdietrch$ 

Notice that both user1 and user2 were allowed to create
a table in the database.

Why does user2 still have create privilege on schema public?
I am expecting that user2 should not be permitted to
create a table in the database that user1 created.

If someone could point out to me what I'm doing wrong,
I'd be very grateful.
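
One likely explanation: schema privileges are tracked per database, and
CREATE DATABASE clones its template. The REVOKE above was issued while
connected to the postgres database, so db1 (cloned from template1) kept
template1's copy of schema public, on which CREATE is still granted to
PUBLIC. Assuming that is the cause, the fix is to revoke in template1
before creating db1, or in db1 afterwards:

REVOKE CREATE ON SCHEMA public FROM PUBLIC;  -- run in template1 and/or db1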

Thank you,
James Dietrich

P.S. I'm running Debian GNU/Linux:
[EMAIL PROTECTED]:~$ uname -a
Linux saturn 2.6.22-3-vserver-k7 #1 SMP Mon Nov 12 11:47:04 UTC 2007
i686 GNU/Linux
[EMAIL PROTECTED]:~$ psql -U user1 template1
Password for user user1: 
Welcome to psql 8.3.1, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
   \h for help with SQL commands
   \? for help with psql commands
   \g or terminate with semicolon to execute query
   \q to quit

template1=> select version();
                                        version
-----------------------------------------------------------------------------------------
 PostgreSQL 8.3.1 on i486-pc-linux-gnu, compiled by GCC cc (GCC) 4.2.3 (Debian 4.2.3-2)
(1 row)

template1=> \q
[EMAIL PROTECTED]:~$ 
-- 
  
  [EMAIL PROTECTED]



[GENERAL] Speed up repetitive queries

2008-05-03 Thread Javier Olazaguirre
> I have an application developed by a third party which takes very
> long to process all the queries.
> 
> I use Red Hat 4 and Postgre 8.2.7 on a 64 bit machine.
> 
> Checking the log files created by postgre I see that the program is
> running always the same query:
> 
> execute : select connection0_.id as id35_5_,
> connection0_.pc_name as pc2_35_5_, connection0_.gw_name as gw3_35_5_,
> connection0_.active as active35_5_, connection0_.pc as pc35_5_,
> connection0_.gw as gw35_5_, connection0_.carrier as carrier35_5_,
> connection0_.cic as cic35_5_, pointcode1_.id as id45_0_,
> pointcode1_.value as value45_0_, pointcode1_.ni as ni45_0_,
> pointcode1_.active as active45_0_, gateway2_.id as id41_1_,
> gateway2_.value as value41_1_, gateway2_.ni as ni41_1_,
> gateway2_.active as active41_1_, carrier3_.id as id33_2_,
> carrier3_.name as name33_2_, carrier3_.active as active33_2_, cic4_.id
> as id34_3_, cic4_.low as low34_3_, cic4_.high as high34_3_,
> cic4_.active as active34_3_, cic4_.producttype as productt5_34_3_,
> producttyp5_.id as id46_4_, producttyp5_.name as name46_4_,
> producttyp5_.active as active46_4_ from connection connection0_ left
> outer join pointcode pointcode1_ on connection0_.pc=pointcode1_.id
> left outer join gateway gateway2_ on connection0_.gw=gateway2_.id left
> outer join carrier carrier3_ on connection0_.carrier=carrier3_.id left
> outer join cic cic4_ on connection0_.cic=cic4_.id left outer join
> producttype producttyp5_ on cic4_.producttype=producttyp5_.id where
> connection0_.id=$1
> 
> 
> parameters: $1 = '141508'
> 
> The only thing that changes is the parameter at the end ($1).
> This query is executed at least a million times (sometimes several
> million times) just changing the value of the $1 parameter. Each query
> takes between 1 and 2 milliseconds to execute in my system. So running
> a million queries takes quite a lot of minutes.
> 
> Is there any way to speed up the execution of this query?
> 
> I cannot change the code of the application, I already got it
> compiled, so as far as I can think of, all I can do is tune the
> system, change parameters in postgre, etc.
> I already tried changing shared buffers and other parameters in
> postgresql.conf, but performance is still the same.
> When I run a Explain statement with the select I see indices are being
> used by all subqueries.
> I see my cpu is at 100%, so I believe my bottleneck is not IO or
> memory (8GB, and in "top" I see almost all of it is free). My problem
> is that of all the cores of my processors, postgre is just using one,
> but I guess this can only be fixed changing the code of the
> application running the queries on postgre, so this is a different
> story.
> 
> Thanx!!
> Javier


[GENERAL] User Defined C Function with minGW

2008-05-03 Thread Nathan Thatcher
I have been creating some user defined C functions using minGW and
postgreSQL 8.3. Everything works great when I use integers,
timestamps, points, etc. I have compiled, linked, created, and tested
multiple function and aggregates.

The problem occurs when I have a text parameter and need to use
PG_GETARG_TEXT_P(n). The code compiles just fine but linking fails.
Here is an example:

#include "postgres.h"
#include "fmgr.h"

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(function);

Datum
function(PG_FUNCTION_ARGS)
{
    text *t = PG_GETARG_TEXT_P(1);
    int32 i = PG_GETARG_INT32(0);
    PG_RETURN_INT32(i + 1);
}

The error:
X)function_name.o:function_name.c: undefined reference to `pg_detoast_datum'
If I comment out the text *t line then it compiles, links, and
executes just fine.

The problem (I am assuming it is the same problem) also manifests
itself when I make a call to palloc. I get 2 errors

X) finalagent.o:finalagent.c: undefined reference to
`_imp__CurrentMemoryContext'
X) finalagent.o:finalagent.c: undefined reference to `MemoryContextAlloc'

Some people have said that this error should just be ignored and it
will sort itself out at run-time. That sounds swell but how do I get
my linker to ignore these errors?

Other people have said that pg_detoast_datum is a backend function and
shouldn't be referenced at all, but it is right there in fmgr.h which
is obviously required for creating functions under the V1 calling
convention.

For reference, I am on Windows XP SP2. I installed postgreSQL 8.3 from
the msi and selected to include the Development components (Include
Files, Library Files, Tools and utilities). The compiler is including
the following directories:

C:\PostgreSQL\8.3\include
C:\PostgreSQL\8.3\include\server
C:\PostgreSQL\8.3\include\server\port\win32

I added both C:\PostgreSQL\8.3\lib and C:\PostgreSQL\8.3\bin to my
linker hoping that maybe it would work. It didn't.

Any information, insight, tips, or criticism would be welcomed and appreciated.

-Subordin8



[GENERAL] large query by offset and limt

2008-05-03 Thread finecur
Hi, I am running a database behind a webserver and there is a table
which is huge. I need to pull data from this table and send to user
through http. If I use

select * from huge_table where userid = 100

It will return millions of records which exhausts my server's memory.
So I do this:

select * from huge_table where userid = 100 limit 1000 offset 0
and then send the results to user, then

select * from huge_table where userid = 100 limit 1000 offset 1000
and then send the results to user, then

select * from huge_table where userid = 100 limit 1000 offset 2000
and then send the results to user,

Continue this until there are no records available

It runs great but it is kind of slow. I think it is because even though I
need only 1000 records, the query scans the whole table every time.

Is there a better way to do this?

Thank you.

ff






[GENERAL] Feature request

2008-05-03 Thread Scott Miller
One problem I've had in development recently is the inability to get the
aliased name of a table from a query.  We're using a PHP framework for
querying, which internally uses pg_field_name to retrieve the select list
field name, which is great.  There is also pg_table_name, to retrieve the
table the field originated from.  The problem is that this returns the name
of the table, unaliased.  If a query does a self join, you can't retrieve a
distinguishing table alias name for that field.  For example:

SELECT a.mycolumn, b.mycolumn
   FROM mytable a, mytable b
  WHERE ...

If I controlled query generation at all times, I could simply alias
a.mycolumn and b.mycolumn differently in the select list.  But if for
example this is SELECT * FROM, those columns are indistinguishable.

Cheers,
  Scott


[GENERAL] Executing dynamic procedure call

2008-05-03 Thread tekwiz
I am using a PL/pgSQL procedure. I am trying to hold the procedure
name in a table and then based on certain selection criteria get the
procedure name out of the table and execute it. I would like to pass a
row record, currently NEW, and retrieve the same rowtype. The
following code will compile but yields the result below it for either
of the two statements listed. Note that the procedure is triggered
from an insert running from a Perl script.

EXECUTE '"RuleRec.rule_procedure_name"(NEW)' INTO NEW;
 or
NEW := "RuleRec.rule_procedure_name"(NEW);

Result:

DBD::Pg::st execute failed: ERROR: operator does not exist: money <>
integer
HINT: No operator matches the given name and argument type(s). You may
need to add explicit type casts.
CONTEXT: SQL statement "SELECT (( $1 - $2 - $3 - $4 - $5) <> 0)"

Can you help me code this correctly and, if possible, explain what's
happening here?

PG Ver: 8.1.3
DBD Ver: 1.48

Thanks.

Robert Young



Re: [GENERAL] large query by offset and limt

2008-05-03 Thread Steve Atkins


On May 2, 2008, at 2:01 PM, finecur wrote:


> Hi, I am running a database behind a webserver and there is a table
> which is huge. I need to pull data from this table and send to user
> through http. If I use
>
> select * from huge_table where userid = 100
>
> It will return millions of records which exhausts my server's memory.
> So I do this:
>
> select * from huge_table where userid = 100 limit 1000 offset 0
> and then send the results to user, then
>
> select * from huge_table where userid = 100 limit 1000 offset 1000
> and then send the results to user, then
>
> select * from huge_table where userid = 100 limit 1000 offset 2000
> and then send the results to user,
>
> Continue this until there are no records available
>
> It runs great but it is kind of slow. I think it is because even though I
> need only 1000 records, the query scans the whole table every time.


Not quite - if you do a "limit 1000 offset 5000" it'll stop after
retrieving the first 6000 rows from the table. A bigger problem with
doing it this way is that the results aren't particularly well defined
unless there's an order by clause in the query.




> Is there a better way to do this?


You want a cursor. See 
http://www.postgresql.org/docs/8.3/static/sql-declare.html

Cheers,
  Steve




Re: [GENERAL] clustering without locking

2008-05-03 Thread Craig Ringer

Tom Lane wrote:

> Craig Ringer <[EMAIL PROTECTED]> writes:
>> Later on, though, less new space would have to be allocated because more
>> and more of the space allocated earlier to hold moved tuples would be
>> being freed up in useful chunks that could be reused.
>
> I don't see how that works.  If the minimum size of the table is X
> pages, ISTM that the first pass has to push everything up to pages above
> X.


What I was suggesting was to essentially cluster the table in small
parts. Rather than two huge passes (move all tuples to free / newly
allocated space at the end of the table; move back into old locations in
order) it'd be done in a series of smaller moves.


After moving a chunk out of the way and into free/new space at the end
of the table, data would be copied from later in the table into the freed
space. That space could then be re-used to hold data from the next chunk
that needs to be moved out of the way.


I'm just trying to understand if it can actually work. Sorry if my 
attempted explanations are unclear; I'm probably doing a terrible job, 
and it's probably actually a stupid idea anyway (if nothing else it 
might just be too slow). Nonetheless I'm curious. Maybe I can explain 
another way.


Visually:

`0' to `9' : tuples. Desired eventual cluster order is face value.
`.' : Dead/obsoleted tuple not yet marked reusable by VACUUM
` ' : free space

Initial state:

-
584736120
-

Begin a transaction and free the first chunk (2 tuples in this case, but 
obviously many more in a real case):


---
..473612058
---

Use that freed space to store the first ordered tuples:

---
014736.2.58
---

Commit, and when the last transaction to which the "." tuples above are 
still visible completes mark them as free with VACUUM or similar.


---
014736 2 58
---

Repeat, but now use the freed space to store the next set of tuples that 
must be moved rather than extending the table:


---
01..3642758
---
---
0123.64.758
---
---
0123 64 758
---

During the next pass someone inserts `9' after tuples have been moved to 
 make a hole and others have been moved into the hole, but before the 
old locations of the moved tuples are marked as free:


---
0123 .46758
---
---
012345.67.8
---

012345.67.89  <- INSERT 9


012345 67 89


You'd never land up with this sort of convenient ordering half way 
through in a real case with realistic numbers of tuples, so it'd keep 
going, doing small chunks each time, until the whole table had been 
processed.


So, the table does grow, and its final state does contain dead space at 
the end, but not *too* much of it:



0123456789




If "low" values are inserted late in the progressive cluster they can 
just stay at the end of the table. They'll get moved if the user runs a 
progressive cluster operation again later. However, since you're ideally 
doing this with a non-100% fillfactor, they should land up in roughly 
the right place on initial insert rather than at the end of the table, 
avoiding the issue (and helping avoid having newly inserted tuples 
prevent table truncation by vacuum when the progressive clustering 
finishes).


Say a table containing the range 1-9 was being clustered with a 50% 
fillfactor and was about half way through the process:


-
 1 2 3 47986
-

and someone inserts `0' it should ideally land up in the right place 
anyway (as a bit of reading suggests that insert tries to respect 
cluster order):


-
01 2 3 47986
-


If you're re-clustering a table with a fillfactor set you may not need 
to extend it at all, because you can use already allocated free space to 
store tuples temporarily while moving them around.
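
For reference, a heap fillfactor is set like this (8.2 and later); a sketch
using the 50% figure from the example above and a hypothetical table name:

ALTER TABLE some_table SET (fillfactor = 50);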



> You can't put any temporary copies in pages <= X because you might
> need that space when it comes time to make the clustering happen.  So
> the table is going to bloat to (at least) 2X pages.


Yes, if you perform the whole process in a single huge chunk. What I was 
hoping was that it was possible to do it in a series of smaller 
operations instead, avoiding the need for such a huge amount of bloat at 
the end of the table.


Say you're clustering in 10 steps. You need X*0.1 worth of scratch space 
created by extending the table if there's not already appropriate free 
space late in the table. Assuming you can reclaim all space you've used 
for temporary copies of tuples after each pass your ideal final table 
size is X*1.1+(space used by new inserts), of which X*0.1 is free space 
at the end of the table. That free space probably has scattered rows 
from concurrent inserts, but if you're using a non-100% fillfactor it 
might not.


It seems like it should work, *if* space used for temporary storage can 
be efficiently reclaimed for reuse (or new inserts) after each pass. 
Even if it can

Re: [GENERAL] Executing dynamic procedure call

2008-05-03 Thread Craig Ringer

tekwiz wrote:


> Result:
>
> DBD::Pg::st execute failed: ERROR: operator does not exist: money <>
> integer
> HINT: No operator matches the given name and argument type(s). You may
> need to add explicit type casts.
> CONTEXT: SQL statement "SELECT (( $1 - $2 - $3 - $4 - $5) <> 0)"


craig=# SELECT '0'::money <> 0;
ERROR:  operator does not exist: money <> integer
LINE 1: SELECT '0'::money <> 0;
  ^
HINT:  No operator matches the given name and argument type(s). You 
might need to add explicit type casts.



You are trying to compare a value of `money' data type with a value of 
`integer' data type. This is intentionally not permitted. Since, in your 
case, it looks like the 0 is an integer literal you should be doing:


( moneyvar1 - moneyvar2 - moneyvar3etc ) <> '0'::money;

In the longer run you might want to look into using the NUMERIC data 
type instead of the MONEY type.
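
A quick illustration of the difference (numeric compares cleanly against a
plain integer literal, while money does not):

SELECT ('1.00'::money - '0.25'::money) <> 0;  -- fails with the same error
SELECT (1.00::numeric - 0.25) <> 0;           -- works, returns true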


--
Craig Ringer



Re: [GENERAL] clustering without locking

2008-05-03 Thread Tom Lane
Craig Ringer <[EMAIL PROTECTED]> writes:
> Begin a transaction and free the first chunk (2 tuples in this case, but 
> obviously many more in a real case):

> ---
> ..473612058
> ---

> Use that freed space to store the first ordered tuples:

> ---
> 014736.2.58
> ---

> Commit, and when the last transaction to which the "." tuples above are 
> still visible completes mark them as free with VACUUM or similar.

> ---
> 014736 2 58
> ---

Oh, the problem is that you're misexplaining this.  You can't do it like
that: you can't overwrite the moved-up "." tuples until after they
aren't visible to any other transaction.  So you need two transactions
to do the above.  I'm not sure if you need two "wait for all others" or
just one --- it's not clear to me what's the urgency of clearing out the
moved-down tuples after they're moved down.  (You *would* want to vacuum
every so often, but this is to reclaim index space, not so much heap
space because you'll reuse that anyway.)

Anyway I think the main practical problem would be with deadlocks
against other transactions trying to update/delete tuples at the same
times you need to move them.  Dealing with uncommitted insertions would
be tricky too --- I think you'd need to wait out the inserting
transaction, which would add more possibilities of deadlock.

regards, tom lane



Re: [GENERAL] large query by offset and limt

2008-05-03 Thread Craig Ringer

finecur wrote:

> Hi, I am running a database behind a webserver and there is a table
> which is huge. I need to pull data from this table and send to user
> through http. If I use
>
> select * from huge_table where userid = 100
>
> It will return millions of records which exhausts my server's memory.
>
> Is there a better way to do this?


Consider using database cursors. Set up by beginning a transaction and 
issuing the statement:


DECLARE huge_table_curs CURSOR FOR
SELECT * FROM huge_table WHERE userid = 100;

... then to get results, just execute:

FETCH 1000 FROM huge_table_curs;
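
When a FETCH finally returns fewer rows than requested you have reached
the end of the result set; presumably you would then finish up with:

CLOSE huge_table_curs;
COMMIT;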

--
Craig Ringer



Re: [GENERAL] clustering without locking

2008-05-03 Thread Craig Ringer

Tom Lane wrote:


> Anyway I think the main practical problem would be with deadlocks
> against other transactions trying to update/delete tuples at the same
> times you need to move them.  Dealing with uncommitted insertions would
> be tricky too --- I think you'd need to wait out the inserting
> transaction, which would add more possibilities of deadlock.


I really appreciate your taking the time to think about and explain 
this. It's very helpful, as I'm trying to understand some of the basics 
of PostgreSQL's underlying operation.


I'd completely missed thinking about uncommitted inserts - I never
normally need to think about them so they just didn't cross my mind. I 
guess it'd either have to do the equivalent of a SELECT FOR UPDATE 
NOWAIT on all tuples in the pages to be freed before doing anything 
else, or would have to take out an EXCLUSIVE table lock while freeing a 
chunk of pages.


I can also vaguely see how problems would arise with concurrent 
multi-tuple updates grabbing locks in a different order to the 
progressive cluster and deadlocking, and again hadn't even thought about 
that.


I guess it might be OK if the progressive cluster attempted to get row 
exclusive locks on all tuples in the contiguous range of pages to be 
freed, and if it failed to get even one it released them all and retried 
that whole step. It sounds like it could be slow and inefficient, 
though, possibly so much so as to defeat the point of the clustering 
operation in the first place.


Thanks again for taking the time to go over that - it's extremely 
helpful and much appreciated.


--
Craig Ringer



Re: [GENERAL] custom C function problem

2008-05-03 Thread Tom Lane
"Dan \"Heron\" Myers" <[EMAIL PROTECTED]> writes:
> One case that fails is essentially copied from the V1 section in the 
> documentation:

Well, there's nothing wrong with that C code, so the problem is
someplace else.

Did you remember to declare the function STRICT?  If not, and if
there are any nulls in your test table, a crash would be expected;
there's nothing in this function that's guarding against a null
pointer dereference.
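
For reference, STRICT is declared at the SQL level when the function is
created; a sketch, with a placeholder module path:

CREATE FUNCTION copytext(text) RETURNS text
    AS 'DIRECTORY/funcs', 'copytext'
    LANGUAGE C STRICT;

With STRICT, the backend returns NULL for a NULL input without calling
the C code at all.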

regards, tom lane



Re: [GENERAL] Feature request

2008-05-03 Thread Tom Lane
"Scott Miller" <[EMAIL PROTECTED]> writes:
> One problem I've had in development recently is the inability to get the
> aliased name of a table from a query.  We're using a PHP framework for
> querying, which internally uses pg_field_name to retrieve the select list
> field name, which is great.  There is also pg_table_name, to retrieve the
> table the field originated from.  The problem is that this returns the name
> of the table, unaliased.  If a query does a self join, you can't retrieve a
> distinguishing table alias name for that field.

Supporting this would require a protocol change (to allow an additional
field in the result description message).  So don't hold your breath :-(

regards, tom lane



Re: [GENERAL] large query by offset and limt

2008-05-03 Thread Ge Cong
Thank you very much. Could you show me how to do it in JDBC?


Craig Ringer wrote:
> finecur wrote:
> > Hi, I am running a database behind a webserver and there is a table
> > which is huge. I need to pull data from this table and send to user
> > through http. If I use
> >
> > select * from huge_table where userid = 100
> >
> > It will return millions of records which exhausts my server's memory.
>
> > Is there a better way to do this?
>
> Consider using database cursors. Set up by beginning a transaction and
> issuing the statement:
>
> DECLARE huge_table_curs CURSOR FOR
> SELECT * FROM huge_table WHERE userid = 100;
>
> ... then to get results, just execute:
>
> FETCH 1000 FROM huge_table_curs;
>
> --
> Craig Ringer



[GENERAL] Interesting comments about fsync on Linux

2008-05-03 Thread Bruno Wolff III
I was looking for some information on how write barriers interact with
software raid and ran across the following kernel thread referenced on LWN.
The suggestion is that fsync isn't really safe on Linux as it is currently
implemented. (The thread was from February 2008, so it probably still
applies.)
http://lwn.net/Articles/270891/



[GENERAL] output array elements in rows

2008-05-03 Thread Rainer . Zaiss

Dear list,

I'm looking for a function to output the elements of my array column in
rows. Let's say I have the following table:

id  = integer
nam = character varying[]

id | nam
---+------------------------
 1 | {"nam1","nam2","nam3"}
 2 | {"n1","n2"}

I think there should be a function available to expand the elements of a
character array into a set of rows, like this:

id | nam
---+------
 1 | nam1
 1 | nam2
 1 | nam3
 2 | n1
 2 | n2


THANK YOU VERY MUCH FOR YOUR HELP
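
For what it's worth, one way to do this in 8.x (there is no built-in
unnest in 8.3) is to index the array with generate_series in the select
list; a sketch, assuming the table is named mytab:

SELECT id, nam[generate_series(1, array_upper(nam, 1))] AS nam
FROM mytab;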




[GENERAL] Problems in queries

2008-05-03 Thread David Anderson
Hi all
I have this function:
def checkNameDob(self, name, dob):
    cur = self.conn.cursor()

    sql = "SELECT * from patient WHERE fn_pat = %s"
    cur.execute(sql,(name))
    rows = cur.fetchall()

It seems to work fine, But I'm getting this exception:
psycopg2.ProgrammingError: current transaction is aborted, commands ignored
until end of transaction block
at: cur.execute(sql)

What's the problem?
thx

ps: fn_pat is the column of the db, name is the string passed in the
function parameter.
if I type:
"SELECT * from patient WHERE fn_pat = 'david' "
in the psql on console it works well...


Re: [GENERAL] Problems in queries

2008-05-03 Thread Adrian Klaver
On Saturday 03 May 2008 4:05 pm, David Anderson wrote:
> Hi all
> I have this function:
>  def checkNameDob(self, name, dob):
> cur = self.conn.cursor();
>
> sql = "SELECT * from patient WHERE fn_pat = %s"
> cur.execute(sql,(name))
> rows = cur.fetchall()
>
> It seems to work fine, But I'm getting this exception:
> psycopg2.ProgrammingError: current transaction is aborted, commands ignored
> until end of transaction block
> at: cur.execute(sql)
If I remember right
cur.execute(sql,(name))
should be
cur.execute(sql,(name,))
note the comma after name.

>
> What's the problem?
> thx
>
> ps: fn_pat is the column of the db, name is the string passed in the
> function parameter.
> if I type:
> "SELECT * from patient WHERE fn_pat = 'david' "
> in the psql on console it works well...

-- 
Adrian Klaver
[EMAIL PROTECTED]



Re: [GENERAL] custom C function problem

2008-05-03 Thread Dan "Heron" Myers



Tom Lane wrote:

> Well, there's nothing wrong with that C code, so the problem is
> someplace else.
>
> Did you remember to declare the function STRICT?  If not, and if
> there are any nulls in your test table, a crash would be expected;
> there's nothing in this function that's guarding against a null
> pointer dereference.


I did declare the function STRICT, yes.

I'm wondering if maybe there is a dependency somewhere I'm missing.  I 
link with postgres.lib to create the dll; I basically made an educated 
guess as to what to link with to get rid of linker errors, since the 
documentation didn't really say I needed to link with anything, but I 
suppose it's possible that there is another .lib somewhere that I should 
link with instead.  Is there anyone who has experience writing custom C 
functions for Postgres that could weigh in here?


Thanks,
Dan



Re: [GENERAL] large query by offset and limt

2008-05-03 Thread Craig Ringer

Ge Cong wrote:

> Thank you very much. Could you show me how to do it in JDBC?


Here's one example. As I haven't been using JDBC directly it's probably 
horrible, but it'll do the job. Any exception will terminate this 
example, but in practice you'd want to catch and handle exceptions 
appropriately.


Sorry about the ugly formatting - mail client line wrapping and all.

The example uses a dummy "customer" table, scrolling through it in 
chunks of 1000 records and printing the primary key `id' for each record.



import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class Main {

    private static final int BATCH_SIZE = 1000;

    public static void main(String[] args)
            throws ClassNotFoundException, SQLException {

        // Load the JDBC driver
        Class.forName("org.postgresql.Driver");

        // Initialize a read only connection
        Connection c = DriverManager.getConnection(
                "jdbc:postgresql:DBNAME", "USERNAME", "PASSWORD");
        c.setReadOnly(true);
        c.setAutoCommit(false);

        // Declare an open cursor attached to a query for the
        // desired information
        Statement s = c.createStatement();
        s.execute("DECLARE customer_curs CURSOR FOR"
                  + " SELECT id FROM customer");

        // and fetch BATCH_SIZE records from the cursor until fewer
        // than the requested number of records are returned (ie
        // until we've run out of results).
        int nresults;
        do {
            // reset per batch, otherwise the loop would stop after the
            // second batch regardless of how many rows remain
            nresults = 0;
            s.execute("FETCH " + BATCH_SIZE + " FROM customer_curs");
            ResultSet rs = s.getResultSet();
            while (rs.next()) {
                nresults++;
                // Do something with the current record at `rs'
                System.out.println("CustomerID: " + rs.getString(1));
            }
        } while (nresults == BATCH_SIZE);

        // Clean up.
        c.close();
    }

}




Re: [GENERAL] custom C function problem

2008-05-03 Thread Tom Lane
"Dan \"Heron\" Myers" <[EMAIL PROTECTED]> writes:
> I'm wondering if maybe there is a dependency somewhere I'm missing.  I 
> link with postgres.lib to create the dll;

Oh, you're using Windows :-(.  I make it my business to not know
anything about that platform, but perhaps you could get a clue by
looking at how the contrib modules are built (using whichever tool
chain you're using).

Or you could put in a few debugging printf's, or poke at it with a
debugger.

regards, tom lane



Re: [GENERAL] Speed up repetitive queries

2008-05-03 Thread Scott Marlowe
On Fri, May 2, 2008 at 9:13 AM, Javier Olazaguirre
<[EMAIL PROTECTED]> wrote:
>
> I have an application developed by a third party which takes very long to
> process all the queries.
>
> I use Red Hat 4 and Postgre 8.2.7 on a 64 bit machine.
>
> Checking the log files created by postgre I see that the program is running
> always the same query:

> The only thing that changes is the parameter at the end ($1).
> This query is executed at least a million times (sometimes several million
> times) just changing the value of the $1 parameter. Each query takes between
> 1 and 2 milliseconds to execute in my system. So running a million queries
> takes quite a lot of minutes.

Is the application preparing and re-executing the same query, or
repreparing each time it executes it?  Preparation might be a pretty
significant amount of overhead here.
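
For comparison, the prepared-statement pattern looks like this at the SQL
level; a sketch against a hypothetical cut-down version of the query:

PREPARE conn_by_id (integer) AS
    SELECT * FROM connection WHERE id = $1;

EXECUTE conn_by_id(141508);

Parsing and planning are paid once at PREPARE time; each EXECUTE then only
pays the execution cost.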

> Is there any way to speed up the execution of this query?

1-2 milliseconds is pretty fast for an individual query.

> I cannot change the code of the application, I already got it compiled, so
> as far as I can think of, all I can do is tune the system, change parameters
> in postgre, etc.

Yeah, we've all been there.  Sometimes you can come up with a workaround.

> I already tried changing shared buffers and other parameters in
> postgresql.conf, but performance is still the same.

Yeah, I can't imagine there's a lot of low hanging fruit for tuning
the db for such a simple query.

> When I run a Explain statement with the select I see indices are being used
> by all subqueries.
> I see my cpu is at 100%, so I believe my bottleneck is not IO or memory
> (8GB, and in "top" I see almost all of it is free).

Yeah, I can't see using more memory helping with this operation.  It's
a single small bit at a time.  In fact, using more memory would just
mean more to keep track of, likely slowing things down.

> My problem is that of
> all the cores of my processors, postgre is just using one, but I guess this
> can only be fixed changing the code of the application running the queries
> on postgre, so this is a different story.

Of course pgsql is using just one.  You're only giving it one thing to
do at a time.  (btw, it's PostgreSQL, postgres, pgsql, or pg.  Postgre
is generally not preferred.  No big.  and no matter how you spell it,
it's pronounced "Post-Gres-Q-L" :)  )

What you might look at doing is having the application run in multiple
instances, each instance covering a specific data range.  This will
likely move your disk heads all over the place.  OTOH, if the app could be
rewritten to send >1 query at a time through multiple connections, it
could likely get faster.

However, running multiple update queries will very quickly saturate
your I/O and you'll suddenly be I/O bound.  That can be worked on with
more discs, RAID-10, battery backed RAID controllers, etc...



Re: [GENERAL] Feature request

2008-05-03 Thread Scott Marlowe
On Fri, May 2, 2008 at 9:34 AM, Scott Miller <[EMAIL PROTECTED]> wrote:
> One problem I've had in development recently is the inability to get the
> aliased name of a table from a query.  We're using a PHP framework for
> querying, which internally uses pg_field_name to retrieve the select list
> field name, which is great.  There is alwo pg_table_name, to retrieve the
> table the field originated from.  The problem is that this returns the name
> of the table, unaliased.  If a query does a self join, you can't retrieve a
> distinguishing table alias name for that field.  For example:
>
> SELECT a.mycolumn, b.mycolumn
>FROM mytable a, mytable b
>   WHERE ...
>
> If I controlled query generation at all times, I could simply alias
> a.mycolumn and b.mycolumn differently in the select list.  But if for
> example this is SELECT * FROM, those columns are indistinguishable.

You have the same type of problem if you have this query:

select count(id), count(int2) from table.

They both are named count.  The simple answer is to always alias your
select fields.

select count(id) as idcount, count(int2) as intcount from table.

or

SELECT a.mycolumn as a_mycol, b.mycolumn as b_mycol
   FROM mytable a, mytable b
  WHERE ...
