Re: [GENERAL] dataset lock

2013-04-17 Thread Albe Laurenz
Philipp Kraus wrote:
> My PG database is connected to different cluster nodes (MPI). Each
> program / process on each node is independent and runs the SQL
> select * from table where status = 'waiting'
> and after that I update the row with an update statement (set status = 'working').
>
> In this case one process can run the select, then another process
> comes along and also runs the select, and both processes get the same row.
> That must not be allowed: the second process must not see the row
> that has been taken by the first process. Can I suppress a select
> seeing a row that is locked by a transaction? I would like to do
> this with a stored procedure that runs the select and the update and
> then returns the PK of the selected dataset. If two (or more) processes
> run the SP at the same time, the update can raise an error, so the
> stored procedure is stopped and must be called again.
> I need a solution so that a row taken by one process is not shown
> to any other process.

Do you want to implement something like a queue?

I can think of two techniques:

1) Locking
--
In a transaction, you get a few rows for processing by
SELECT * FROM table WHERE status = 'waiting'
   ORDER BY id LIMIT 5 FOR UPDATE;
("id" is the primary key here).
Then you process and update the rows and commit.

This will cause concurrent SELECT FOR UPDATE operations
to block until the transaction is committed, effectively
serializing the processing.
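
As a compact sketch of the whole cycle (using a hypothetical table named
"jobs", since "table" itself is a reserved word, with an integer primary
key "id" and a text "status" column):

BEGIN;
SELECT * FROM jobs WHERE status = 'waiting'
    ORDER BY id LIMIT 5 FOR UPDATE;
-- ... process the selected rows in the application ...
UPDATE jobs SET status = 'done' WHERE id IN (1, 2, 3, 4, 5);
COMMIT;

A second session running the same SELECT ... FOR UPDATE blocks at the
first locked row until this COMMIT.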

2) Set a marker
---
You get a few rows by
UPDATE table SET status = 'processing' WHERE id IN
   (SELECT id FROM table WHERE status = 'waiting'
   ORDER BY id LIMIT 5) RETURNING *;
Then process and update the rows.

This won't block concurrent processes for the whole
time it takes to process the rows, so it's probably
closer to what you want.
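
Since a stored procedure was mentioned: the marker technique wraps up
nicely in a server-side function. A minimal sketch, assuming a
hypothetical "jobs" table with an integer primary key "id" and a text
"status" column:

CREATE OR REPLACE FUNCTION take_jobs(integer)
RETURNS SETOF jobs AS $$
    -- mark up to $1 waiting rows as 'processing' and hand them back
    UPDATE jobs SET status = 'processing'
    WHERE id IN (SELECT id FROM jobs
                 WHERE status = 'waiting'
                 ORDER BY id LIMIT $1)
    RETURNING *;
$$ LANGUAGE sql;

-- each worker then just runs:
SELECT * FROM take_jobs(5);

Note that concurrent callers can still contend briefly for the same
candidate rows, so a robust queue usually adds a retry loop around the
call.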

Yours,
Laurenz Albe


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Most efficient way to insert without duplicates

2013-04-17 Thread François Beausoleil

Le 2013-04-16 à 22:51, François Beausoleil a écrit :

> Hi all!
> 
> I track Twitter followers in my database. I have the following table:
> 
> # \d persona_followers
>   Table "public.persona_followers"
>   Column|Type | Modifiers
> -+-+---
> service_id  | bigint  | not null
> follower_id | bigint  | not null
> valid_at| timestamp without time zone |
> Indexes:
>"persona_followers_pkey" PRIMARY KEY, btree (service_id, follower_id)
> 
> The table IS NOT partitioned.

A run with the following query:

INSERT INTO "persona_followers"
  SELECT "service_id", "follower_id", now()
  FROM (
SELECT *
FROM (
SELECT DISTINCT "service_id", "follower_id"
FROM "persona_followers_import"
WHERE "service_id" IS NOT NULL OR "follower_id" IS NOT NULL
  EXCEPT
 SELECT "service_id", "follower_id" FROM "persona_followers") AS "t1") 
AS "t1"

results in http://explain.depesz.com/s/Y1c

 Insert on public.persona_followers  (cost=139261.12..20483497.65 rows=6256498 width=16) (actual time=4729255.535..4729255.535 rows=0 loops=1)
   Buffers: shared hit=33135295 read=4776921
   ->  Subquery Scan on t1  (cost=139261.12..20483497.65 rows=6256498 width=16) (actual time=562265.156..578844.999 rows=6819520 loops=1)
         Output: t1.service_id, t1.follower_id, now()
         Buffers: shared hit=36891 read=3572263
         ->  HashSetOp Except  (cost=139261.12..20389650.18 rows=6256498 width=16) (actual time=562265.127..566513.759 rows=6819520 loops=1)
               Output: "*SELECT* 1".service_id, "*SELECT* 1".follower_id, (0)
               Buffers: shared hit=36891 read=3572263
               ->  Append  (cost=139261.12..17054024.97 rows=667125042 width=16) (actual time=4090.462..320879.545 rows=667689321 loops=1)
                     Buffers: shared hit=36891 read=3572263
                     ->  Subquery Scan on "*SELECT* 1"  (cost=139261.12..264391.09 rows=6256498 width=16) (actual time=4090.461..7798.334 rows=6820784 loops=1)
                           Output: "*SELECT* 1".service_id, "*SELECT* 1".follower_id, 0
                           Buffers: shared hit=36891
                           ->  HashAggregate  (cost=139261.12..201826.11 rows=6256498 width=16) (actual time=4090.459..6795.009 rows=6820784 loops=1)
                                 Output: persona_followers_import.service_id, persona_followers_import.follower_id
                                 Buffers: shared hit=36891
                                 ->  Seq Scan on francois.persona_followers_import  (cost=0.00..105137.75 rows=6824675 width=16) (actual time=0.017..1344.916 rows=6824700 loops=1)
                                       Output: persona_followers_import.service_id, persona_followers_import.follower_id
                                       Filter: ((persona_followers_import.service_id IS NOT NULL) OR (persona_followers_import.follower_id IS NOT NULL))
                                       Buffers: shared hit=36891
                     ->  Subquery Scan on "*SELECT* 2"  (cost=0.00..16789633.88 rows=660868544 width=16) (actual time=6.694..238761.499 rows=660868537 loops=1)
                           Output: "*SELECT* 2".service_id, "*SELECT* 2".follower_id, 1
                           Buffers: shared read=3572263
                           ->  Seq Scan on public.persona_followers  (cost=0.00..10180948.44 rows=660868544 width=16) (actual time=6.693..137929.808 rows=660868537 loops=1)
                                 Output: public.persona_followers.service_id, public.persona_followers.follower_id
                                 Buffers: shared read=3572263
 Total runtime: 4729338.157 ms

1h20m for 6.8 million rows inserted. The estimates are spot on, since I had 
just run VACUUM ANALYZE on both tables prior to doing this test. Running the 
original query now, SELECT * FROM ... WHERE NOT EXISTS(...).

Bye,
François



[GENERAL] How large can a PostgreSQL database get?

2013-04-17 Thread Aleksey Tsalolikhin
Hi.  I was promoting PostgreSQL to an AIX/Oracle shop yesterday; they are
looking to switch to open source to cut their licensing costs, and I was
asked how large a database PostgreSQL supports.  Is there an upper
bound on database size, and if so, what is it?

Aleksey Tsalolikhin


Re: [GENERAL] How large can a PostgreSQL database get?

2013-04-17 Thread Adrian Klaver

On 04/17/2013 06:23 AM, Aleksey Tsalolikhin wrote:

Hi.  I was promoting PostgreSQL to an AIX/Oracle shop yesterday; they
are looking to switch to open source to cut their licensing costs, and
I was asked how large a database PostgreSQL supports.  Is there an
upper bound on database size, and if so, what is it?


For a guide to capacities see:

http://www.postgresql.org/about/

Of course the above depends on resources allocated.



Aleksey Tsalolikhin




--
Adrian Klaver
adrian.kla...@gmail.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How large can a PostgreSQL database get?

2013-04-17 Thread Mark Felder
On Wed, 17 Apr 2013 08:23:41 -0500, Aleksey Tsalolikhin  
 wrote:



Hi.  I was promoting PostgreSQL to an AIX/Oracle shop yesterday; they are
looking to switch to open source to cut their licensing costs, and I was
asked how large a database PostgreSQL supports.  Is there an upper
bound on database size, and if so, what is it?


According to Yahoo...:

http://glinden.blogspot.com/2008/05/yahoo-builds-two-petabyte-postgresql.html

...pretty big. But Yahoo threw some programmers at it, I believe.

Straight out of the box? Not sure, but I'd expect many on this list have
databases larger than "enterprise Oracle" shops.



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How large can a PostgreSQL database get?

2013-04-17 Thread Bruce Momjian
On Wed, Apr 17, 2013 at 06:23:41AM -0700, Aleksey Tsalolikhin wrote:
> Hi.  I was promoting PostgreSQL to an AIX/Oracle shop yesterday; they are
> looking to switch to open source to cut their licensing costs, and I was
> asked how large a database PostgreSQL supports.  Is there an upper bound
> on database size, and if so, what is it?

Well, anything over hundreds of gigabytes is going to require tuning,
and above a terabyte is going to require partitioning.  There isn't a
hard limit, but we don't see many servers over tens of terabytes.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How large can a PostgreSQL database get?

2013-04-17 Thread Albe Laurenz
Aleksey Tsalolikhin wrote:
> Hi.  I was promoting PostgreSQL to an AIX/Oracle shop yesterday; they are
> looking to switch to open source to cut their licensing costs, and I was
> asked how large a database PostgreSQL supports.  Is there an upper bound
> on database size, and if so, what is it?

There is no real upper bound.

I think that backup will be a limiting factor, and you'll want backups.
Of course, sequential scans of really large tables will be very
painful, but that's the same for all database systems.

Yours,
Laurenz Albe


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Most efficient way to insert without duplicates

2013-04-17 Thread Chris Curvey
> INSERT INTO persona_followers(service_id, follower_id, valid_at)
>   SELECT service_id, follower_id, NOW()
>   FROM (SELECT DISTINCT service_id, follower_id FROM import) AS import
>   WHERE NOT EXISTS(SELECT * FROM persona_followers WHERE import.service_id
> = persona_followers.service_id AND import.follower_id =
> persona_followers.follower_id);
>
>
I'm wondering if you have an unneeded level of nesting. (I don't know if it
would make any difference, but it might).

INSERT INTO persona_followers(service_id, follower_id, valid_at)
SELECT DISTINCT service_id, follower_id, now()
FROM import
WHERE NOT EXISTS(SELECT * FROM persona_followers WHERE import.service_id =
persona_followers.service_id AND import.follower_id =
persona_followers.follower_id);


Re: [GENERAL] dataset lock

2013-04-17 Thread Philipp Kraus

On 2013-04-17 09:18:13 +0200, Albe Laurenz said:


Philipp Kraus wrote:

My PG database is connected to different cluster nodes (MPI). Each
program / process on each node is independent and runs the SQL
select * from table where status = 'waiting'
and after that I update the row with an update statement (set status = 'working').

In this case one process can run the select, then another process
comes along and also runs the select, and both processes get the same row.
That must not be allowed: the second process must not see the row
that has been taken by the first process. Can I suppress a select
seeing a row that is locked by a transaction? I would like to do
this with a stored procedure that runs the select and the update and
then returns the PK of the selected dataset. If two (or more) processes
run the SP at the same time, the update can raise an error, so the
stored procedure is stopped and must be called again.
I need a solution so that a row taken by one process is not shown
to any other process.


Do you want to implement something like a queue?


Yes



I can think of two techniques:

1) Locking
--
In a transaction, you get a few rows for processing by
SELECT * FROM table WHERE status = 'waiting'
   ORDER BY id LIMIT 5 FOR UPDATE;
("id" is the primary key here).
Then you process and update the rows and commit.

This will cause concurrent SELECT FOR UPDATE operations
to block until the transaction is committed, effectively
serializing the processing.

2) Set a marker
---
You get a few rows by
UPDATE table SET status = 'processing' WHERE id IN
   (SELECT id FROM table WHERE status = 'waiting'
   ORDER BY id LIMIT 5) RETURNING *;
Then process and update the rows.

This won't block concurrent processes for the whole
time it takes to process the rows, so it's probably
closer to what you want.


The marker solution seems to be the correct idea; I must think about it.
Would you wrap the call in a stored procedure (and call the SP
from the client), or would you run the statement directly from the client?

Thx

Phil




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Can you spot the difference?

2013-04-17 Thread Moshe Jacobson
On Tue, Apr 16, 2013 at 7:29 PM, Adrian Klaver wrote:

> "
> The autovacuum daemon, if enabled, will automatically issue ANALYZE
> commands whenever the content of a table has changed sufficiently. However,
> administrators might prefer to rely on manually-scheduled ANALYZE
> operations, particularly if it is known that update activity on a table
> will not affect the statistics of "interesting" columns. The daemon
> schedules ANALYZE strictly as a function of the number of rows inserted or
> updated; it has no knowledge of whether that will lead to meaningful
> statistical changes.
> "
>
> So at a guess there has not been enough churn on the table.
>

So pg_restore's COPY would not trigger the ANALYZE? That seems wrong.


-- 
Moshe Jacobson
Nead Werx, Inc. | Manager of Systems Engineering
2323 Cumberland Parkway, Suite 201 | Atlanta, GA 30339
mo...@neadwerx.com | www.neadwerx.com

"Quality is not an act, it is a habit." -- Aristotle


Re: [GENERAL] Mysterious table that exists but doesn't exist

2013-04-17 Thread Dale Fukami
On Tue, Apr 16, 2013 at 3:04 PM, Tom Lane  wrote:

> Dale Fukami  writes:
> > I'm having a problem on a standby server (streaming replication) where a
> > table seems to exist but is not queryable. Essentially a select statement
> > (and drop/insert/etc) fails but \d and pg_tables show it exists. The
> table
> > exists on the master (and is queryable) and replication is still working
> in
> > that changes to the master table don't cause errors on the standby and
> > changes to other tables are replicated and verified to be on the standby.
>
> That's peculiar.  The most likely theories seem to be
> (1) corruption in the standby's pg_class indexes, such that an index
> search for relname = 'tracked_deductibles' fails to find the row,
> although it is found by seqscans; or
> (2) some type of visibility issue causing SnapshotNow operations to
> think the row is invalid, though it is valid to MVCC queries.
>
> Either way, if it's working on the master, then you've had a replication
> failure since the standby's files evidently don't match the master's.
>
> What PG version is this (and which versions have been installed since
> the replication was set up)?  Have you had any system-level crashes on
> the standby?
>

My apologies: PostgreSQL 9.0.5 on x86_64-pc-linux-gnu, compiled by GCC
gcc-4.4.real (Ubuntu 4.4.3-4ubuntu5) 4.4.3, 64-bit
This is the only version we've had since we've set up streaming replication.

We believe we've found the moment of corruption. It appears that the db was
shut down, then restarted, and didn't quite catch up to the master before
it was shut down again. We use this standby to take snapshots on a regular
basis, but it looks like this one didn't quite get into a consistent state
before the next one started. Logs are pasted at the end of this email.

At this point I'm most concerned about:

1) We were quite lucky, I think, to have discovered this issue. I think the
only other way we'd have noticed would have been if we'd failed over to it
and our app stopped working. I'm worried now that we'll end up in a similar
situation and won't know it for many weeks. At some point we lose the
ability to PITR based on how many backups are kept. Is there a way to be
more confident in our standby machines?

2) Is there a way to recover from this? In the extreme case where we might
have failed over to this standby, could we somehow figure out where exactly
the corruption is and recover or are we talking about going back to an
older snapshot and using our archived WALs? Note: we've already rebuilt
this standby to be consistent with master last night. This is more a
hypothetical at this point. We've kept the bad slave for experimentation in
case anyone has thoughts on this.

Just to clear up the confusion that I had caused John. I had obscured the
previous schema name by restoring from snapshots and altering the schema
name itself. So, the queries and output are an exact copy/paste from my
terminal. I get the same results on the Live standby when using the actual
Live schema name but, obviously, it shows the Live schema name rather than
'someschema'.

Thanks,
Dale

Log snippet from what appears to be the time of corruption. We've confirmed
this by restoring snapshots just prior and just following this time range.
Note the lovely HINT that there may be corruption :)

2013-04-15 15:38:22 UTC @ DETAIL:  last completed transaction was at log
time 2013-04-15 15:37:07.621579+00
2013-04-15 15:38:22 UTC @ LOG:  shutting down
2013-04-15 15:38:22 UTC @ LOG:  database system is shut down
2013-04-15 15:39:21 UTC @ LOG:  database system was shut down in recovery
at 2013-04-15 15:38:22 UTC
2013-04-15 15:39:21 UTC [unknown]@[unknown] LOG:  incomplete startup packet
cp: cannot stat `/var/lib/postgresql-archives/0005.history': No such
file or directory
2013-04-15 15:39:21 UTC @ LOG:  entering standby mode
cp: cannot stat `/var/lib/postgresql-archives/000516480088': No
such file or directory
2013-04-15 15:39:21 UTC postgres@postgres FATAL:  the database system is
starting up
2013-04-15 15:39:22 UTC postgres@postgres FATAL:  the database system is
starting up
cp: cannot stat `/var/lib/postgresql-archives/000516480086': No
such file or directory
2013-04-15 15:39:22 UTC @ LOG:  redo starts at 1648/86ED4360
cp: cannot stat `/var/lib/postgresql-archives/000516480087': No
such file or directory
2013-04-15 15:39:23 UTC postgres@postgres FATAL:  the database system is
starting up
cp: cannot stat `/var/lib/postgresql-archives/000516480088': No
such file or directory
cp: cannot stat `/var/lib/postgresql-archives/000516480089': No
such file or directory
cp: cannot stat `/var/lib/postgresql-archives/00051648008A': No
such file or directory
2013-04-15 15:39:23 UTC postgres@postgres FATAL:  the database system is
starting up
2013-04-15 15:39:23 UTC @ LOG:  consistent recovery state reached at
1648/8A6460E0
2013-04-15 15:39:23 UTC @ LOG:  invalid record 

Re: [GENERAL] Can you spot the difference?

2013-04-17 Thread Adrian Klaver

On 04/17/2013 07:49 AM, Moshe Jacobson wrote:


On Tue, Apr 16, 2013 at 7:29 PM, Adrian Klaver mailto:adrian.kla...@gmail.com>> wrote:

"
The autovacuum daemon, if enabled, will automatically issue ANALYZE
commands whenever the content of a table has changed sufficiently.
However, administrators might prefer to rely on manually-scheduled
ANALYZE operations, particularly if it is known that update activity
on a table will not affect the statistics of "interesting" columns.
The daemon schedules ANALYZE strictly as a function of the number of
rows inserted or updated; it has no knowledge of whether that will
lead to meaningful statistical changes.
"

So at a guess there has not been enough churn on the table.


So pg_restore's COPY would not trigger the ANALYZE? That seems wrong.


Well, the argument most often heard is that the command has a cost and it
is left to the discretion of the user as to when to incur that cost. For
instance I, and others, often use COPY to transfer data from some
external data source to a holding table, from which the data is then
manipulated/transferred to one or more tables. I have no interest in
having the holding table ANALYZEd, as I am going to hit all the rows
anyway. Generally what people do in your situation is include a manual
ANALYZE in a script that is part of, or follows, the COPY.
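
For example, such a load script typically boils down to (hypothetical
table and file names):

-- bulk-load into a holding table, then refresh planner statistics
COPY holding_table FROM '/tmp/import.csv' WITH CSV;
ANALYZE holding_table;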





--
Moshe Jacobson
Nead Werx, Inc. | Manager of Systems Engineering
2323 Cumberland Parkway, Suite 201 | Atlanta, GA 30339
mo...@neadwerx.com  | www.neadwerx.com


"Quality is not an act, it is a habit." -- Aristotle



--
Adrian Klaver
adrian.kla...@gmail.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] SQLNestedException: Cannot get a connection, pool error Timeout waiting for idle object

2013-04-17 Thread itishree sukla
Dear All,

Can anyone please help me fix this issue? I am getting this error from
our application; the database is currently running on 9.2.

2013-04-17 11:37:25:151 - {ERROR} database.ConnectionManager Thread
[http-8080-1];  --- getConnection() Exception:
org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool
error Timeout waiting for idle object
at
org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:114)
at
org.apache.commons.dbcp.BasicDataSource.getConnection(BasicDataSource.java:1044)
at
com.tenkinfo.b2g.database.ConnectionManager.getConnection(ConnectionManager.java:39)
at
com.tenkinfo.b2g.usermanagement.dao.UserManagementDAOImpl.getSessionData(UserManagementDAOImpl.java:228)
at
com.tenkinfo.mapnsav.common.action.BaseAction.execute(BaseAction.java:156)
at
org.apache.struts.action.RequestProcessor.processActionPerform(RequestProcessor.java:431)
at
org.apache.struts.action.RequestProcessor.process(RequestProcessor.java:236)
at
org.apache.struts.action.ActionServlet.process(ActionServlet.java:1196)
at org.apache.struts.action.ActionServlet.doGet(ActionServlet.java:414)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at
org.tuckey.web.filters.urlrewrite.RuleChain.handleRewrite(RuleChain.java:176)
at
org.tuckey.web.filters.urlrewrite.RuleChain.doRules(RuleChain.java:145)

Do I have to set the tcp_keepalive* parameters to fewer seconds, or do I
need to kill the idle connections?

Regards,
Itishree


Re: [GENERAL] How large can a PostgreSQL database get?

2013-04-17 Thread Scott Marlowe
My experience, doing production and dev dba work on both postgresql
and oracle, is that either works well, as long as you partition
properly or even break things into silos. Oracle isn't magic pixie
dust that suddenly gets hardware with 250MB/s seq read arrays to read
at 1GB/s, etc.

With oracle partitioning is easier, and everything else on the
freaking planet is harder.

On Wed, Apr 17, 2013 at 8:15 AM, Albe Laurenz  wrote:
> Aleksey Tsalolikhin wrote:
>> Hi.  I was promoting PostgreSQL to an AIX/Oracle shop yesterday; they are
>> looking to switch to open source to cut their licensing costs, and I was
>> asked how large a database PostgreSQL supports.  Is there an upper bound
>> on database size, and if so, what is it?
>
> There is no real upper bound.
>
> I think that backup will be a limiting factor, and you'll want backups.
> Of course, sequential scans of really large tables will be very
> painful, but that's the same for all database systems.
>
> Yours,
> Laurenz Albe
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general



-- 
To understand recursion, one must first understand recursion.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Mysterious table that exists but doesn't exist

2013-04-17 Thread Tom Lane
Dale Fukami  writes:
> On Tue, Apr 16, 2013 at 3:04 PM, Tom Lane  wrote:
>> Either way, if it's working on the master, then you've had a replication
>> failure since the standby's files evidently don't match the master's.
>> 
>> What PG version is this (and which versions have been installed since
>> the replication was set up)?  Have you had any system-level crashes on
>> the standby?

> My apologies: PostgreSQL 9.0.5 on x86_64-pc-linux-gnu, compiled by GCC
> gcc-4.4.real (Ubuntu 4.4.3-4ubuntu5) 4.4.3, 64-bit
> This is the only version we've had since we've set up streaming replication.

> We believe we've found the moment of corruption. It appears that the db was
> shutdown then restarted and didn't quite catch up to the master and then
> was shut down again. We use this standby to take snapshots on a regular
> basis but it looks like this one didn't quite get into a consistent state
> before the next one started. Logs pasted at the end of this email.

Hm ... there was a fix in 9.0.12 that might be relevant to this:
http://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=5840e3181b7e6c784fdb3aff708c4dcc2dfe551d
Whether that explains it or not, 9.0.5 is getting long in the tooth;
you really need to think about an update.  Especially in view of
CVE-2013-1899.

> 1) We were quite lucky I think to have discovered this issue. I think the
> only other way we'd have noticed would have been if we'd failed over to it
> and our app stopped working. I'm worried now that we'll end up in a similar
> situation and won't have known it for many weeks. At some point we lose the
> ability to PITR based on how many backups are kept. Is there a way to be
> more confident in our standby machines?

I think you should update your PG version and then resync your standbys
(ie, apply a fresh base backup) at the next convenient maintenance
window.

> Just to clear up the confusion that I had caused John. I had obscured the
> previous schema name by restoring from snapshots and altering the schema
> name itself. So, the queries and output are an exact copy/paste from my
> terminal. I get the same results on the Live standby when using the actual
> Live schema name but, obviously, it shows the Live schema name rather than
> 'someschema'.

Well, in that case there's the question of whether you'd duplicated the
standby's state accurately ...

regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Roadmap for Postgres on AIX

2013-04-17 Thread Thomas Munro
On 19 March 2013 01:00, Tom Lane  wrote:

> Wasim Arif  writes:
> > What is the road map for Postgres on the AIX platform? I understand that
> > the pg build farm contains an AIX 5.3 server; are there any plans to
> > upgrade to 6.1 and 7.1?
>
> The reason there's an AIX 5.3 buildfarm member is that someone cares
> enough about preserving portability to that platform to provide a
> buildfarm member.  If you're worried about other AIX releases, I suggest
> you do likewise.  It's not a huge burden.  Basic info about it is here:
> http://buildfarm.postgresql.org/cgi-bin/register-form.pl
>
> By and large, our approach to AIX is the same as to any other platform:
> we'll support it as long as users of the platform provide testing and
> any necessary portability fixes.  That burden might fall more directly
> on you, as one of a relatively small number of users, than it would on
> somebody using say Linux or BSD.  But as long as you are willing to do
> some of the work you need not fear that we're planning to abandon you.
>

For those organisations running PostgreSQL on AIX, I wonder if it would be
possible for a member of 'IBM PartnerWorld' to use the free 'Virtual
Loaner' programme[1] to set up build farm members on a range of AIX
versions and POWER chips.

My employer is in this category, but I am not in a position to contribute
company time patches or resources currently (although that might change).
From memory we've had to make a couple of local tweaks to makefiles for
our particular versions of things, although we are mostly focussed on
clients on AIX, not the server.  But perhaps some of the other (rare!) AIX
users from the list might be able to look into VPL farm animals?

[1] http://www-304.ibm.com/partnerworld/wps/pub/systems/vlp/index


Re: [GENERAL] Most efficient way to insert without duplicates

2013-04-17 Thread Jeff Janes
On Wed, Apr 17, 2013 at 4:26 AM, François Beausoleil
wrote:

>
>
>  Insert on public.persona_followers  (cost=139261.12..20483497.65
> rows=6256498 width=16) (actual time=4729255.535..4729255.535 rows=0 loops=1)
>Buffers: shared hit=33135295 read=4776921
>->  Subquery Scan on t1  (cost=139261.12..20483497.65 rows=6256498
> width=16) (actual time=562265.156..578844.999 rows=6819520 loops=1)
>


It looks like 12% of the time is being spent figuring out what rows to
insert, and 88% actually doing the insertions.

So I think that index maintenance is killing you.  You could try adding a
sort to your select so that rows are inserted in index order, or inserting
in batches in which the batches are partitioned by service_id (which is
almost the same thing as sorting, since service_id is the lead column)
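
Concretely, that first suggestion amounts to something like the following
sketch (adapted from the NOT EXISTS query upthread; the ORDER BY feeds
rows to the insert in primary-key order, which keeps the index inserts
localized rather than scattered):

INSERT INTO persona_followers (service_id, follower_id, valid_at)
SELECT i.service_id, i.follower_id, now()
FROM (SELECT DISTINCT service_id, follower_id
      FROM persona_followers_import) AS i
WHERE NOT EXISTS (SELECT 1 FROM persona_followers pf
                  WHERE pf.service_id = i.service_id
                    AND pf.follower_id = i.follower_id)
ORDER BY i.service_id, i.follower_id;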

Cheers,

Jeff


Re: [GENERAL] How large can a PostgreSQL database get?

2013-04-17 Thread Aleksey Tsalolikhin
Thanks for all the great answers, folks, I'll pass this along.

Cheers!
Aleksey


On Wed, Apr 17, 2013 at 9:45 AM, Scott Marlowe wrote:

> My experience, doing production and dev dba work on both postgresql
> and oracle, is that either works well, as long as you partition
> properly or even break things into silos. Oracle isn't magic pixie
> dust that suddenly gets hardware with 250MB/s seq read arrays to read
> at 1GB/s, etc.
>
> With oracle partitioning is easier, and everything else on the
> freaking planet is harder.
>
> On Wed, Apr 17, 2013 at 8:15 AM, Albe Laurenz 
> wrote:
> > Aleksey Tsalolikhin wrote:
> >> Hi.  I was promoting PostgreSQL to an AIX/Oracle shop yesterday; they
> >> are looking to switch to open source to cut their licensing costs, and
> >> I was asked how large a database PostgreSQL supports.  Is there an
> >> upper bound on database size, and if so, what is it?
> >
> > There is no real upper bound.
> >
> > I think that backup will be a limiting factor, and you'll want backups.
> > Of course, sequential scans of really large tables will be very
> > painful, but that's the same for all database systems.
> >
> > Yours,
> > Laurenz Albe
> >
> >
> > --
> > Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> > To make changes to your subscription:
> > http://www.postgresql.org/mailpref/pgsql-general
>
>
>
> --
> To understand recursion, one must first understand recursion.
>



-- 
CFEngine Training:
New Jersey, Apr 29 - May 2.  http://cf3.eventbrite.com/
New Jersey, May 3.  http://lopsa-east.org/2013/lopsa-east-training/


Re: [GENERAL] GSL onto postgresql server 9.2

2013-04-17 Thread Yuriy Rusinov
Thanks a lot.

I have compiled gsl with prefix=/usr and this is quite acceptable for us.



On Mon, Apr 15, 2013 at 2:30 PM, Albe Laurenz wrote:

> Yuriy Rusinov wrote:
> > I have to put some C-language functions onto PostgreSQL server 9.2.
> > These functions use the GSL software library
> > http://www.gnu.org/software/gsl/. In the Makefile for these functions
> > I wrote LD_FLAGS = ... -lgsl. On some source-based Linux distributions
> > such as Gentoo Linux these functions install successfully onto the
> > server, but on rpm-based distributions such as Fedora and others they
> > do not. The gsl library has to be compiled from sources, because some
> > distributions do not ship it and rpm installation of this library is
> > not allowed. The error message is "error .so cannot load
> > libgsl.so.0 no such file", even though the gsl library was installed
> > under /usr/local, with the library files in /usr/local/lib. Is the
> > problem in PostgreSQL or in the distribution?
>
> Read "man ld", the description of "search paths to locate
> required shared libraries".
>
> /usr/local/lib is not automatically searched.
> I would either install the library in /usr/lib,
> link the shared object with -Wl,-rpath,/usr/local/lib
> or put /usr/local/lib into /etc/ld.so.conf and run
> ldconfig.
>
> Yours,
> Laurenz Albe
>



-- 
Best regards,
Sincerely yours,
Yuriy Rusinov.


Re: [GENERAL] Most efficient way to insert without duplicates

2013-04-17 Thread Moshe Jacobson
On Tue, Apr 16, 2013 at 10:51 PM, François Beausoleil
wrote:

> INSERT INTO persona_followers(service_id, follower_id, valid_at)
>   SELECT service_id, follower_id, NOW()
>   FROM (SELECT DISTINCT service_id, follower_id FROM import) AS import
>   WHERE NOT EXISTS(SELECT * FROM persona_followers WHERE import.service_id
> = persona_followers.service_id AND import.follower_id =
> persona_followers.follower_id);
>

Try this for your insert query instead:

insert into persona_followers( service_id, follower_id, valid_at )
select i.service_id, i.follower_id, now()
from import i
left join persona_followers pf on i.service_id = pf.service_id and
i.follower_id = pf.follower_id
where pf.service_id is null
order by 1,2;


This will insert only those rows that are not already present, and involves
no subqueries and only one join.

-- 
Moshe Jacobson
Nead Werx, Inc. | Manager of Systems Engineering
2323 Cumberland Parkway, Suite 201 | Atlanta, GA 30339
mo...@neadwerx.com | www.neadwerx.com

"Quality is not an act, it is a habit." -- Aristotle


Re: [GENERAL] SQLNestedException: Cannot get a connection, pool error Timeout waiting for idle object

2013-04-17 Thread Alfonso Afonso
Hi Itishree,

From the machine where Tomcat is running, have you checked that you can
connect to the PostgreSQL server (remember to check the connection
parameters: user, password, IP)?

Once this first step is verified:

Do you have the correct connection pool configured in Catalina (Tomcat),
and have you let it manage the pool of database connections?

Good luck :)

El 17/04/2013, a las 17:01, itishree sukla  escribió:

> Dear All,
>  
> Can anyone please help me fix this issue? I am getting this error from
> our application; the database is currently running on 9.2.
>  
> 2013-04-17 11:37:25:151 - {ERROR} database.ConnectionManager Thread 
> [http-8080-1];  --- getConnection() Exception:
> org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool 
> error Timeout waiting for idle object
> at 
> org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:114)
> at 
> org.apache.commons.dbcp.BasicDataSource.getConnection(BasicDataSource.java:1044)
> at 
> com.tenkinfo.b2g.database.ConnectionManager.getConnection(ConnectionManager.java:39)
> at 
> com.tenkinfo.b2g.usermanagement.dao.UserManagementDAOImpl.getSessionData(UserManagementDAOImpl.java:228)
> at 
> com.tenkinfo.mapnsav.common.action.BaseAction.execute(BaseAction.java:156)
> at 
> org.apache.struts.action.RequestProcessor.processActionPerform(RequestProcessor.java:431)
> at 
> org.apache.struts.action.RequestProcessor.process(RequestProcessor.java:236)
> at org.apache.struts.action.ActionServlet.process(ActionServlet.java:1196)
> at org.apache.struts.action.ActionServlet.doGet(ActionServlet.java:414)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
> at 
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
> at 
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> at 
> org.tuckey.web.filters.urlrewrite.RuleChain.handleRewrite(RuleChain.java:176)
> at org.tuckey.web.filters.urlrewrite.RuleChain.doRules(RuleChain.java:145)
>  
> Do I have to set the tcp_keepalive* parameters to fewer seconds, or do I
> need to kill the idle connections?
>  
> Regards,
> Itishree

Alfonso Afonso
(personal)







-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How large can a PostgreSQL database get?

2013-04-17 Thread Michael Nolan
On 4/17/13, Scott Marlowe  wrote:
> My experience, doing production and dev dba work on both postgresql
> and oracle, is that either works well, as long as you partition
> properly or even break things into silos. Oracle isn't magic pixie
> dust that suddenly gets hardware with 250MB/s seq read arrays to read
> at 1GB/s, etc.
>
> With oracle partitioning is easier, and everything else on the
> freaking planet is harder.


Scott, thank you for the best laugh I've had all day!

I started out on Oracle (some 20 years ago) and have been running both
MySQL and PostgreSQL databases for the last 10 years or so.  I'd take
PostgreSQL over the other two in a heartbeat!

Data integrity/data preservation issues (backup is just one aspect of
that) are going to be your biggest problems with VERY large databases,
no matter how much money you throw at it.
--
Mike Nolan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Mysterious table that exists but doesn't exist

2013-04-17 Thread Dale Fukami
>
> Hm ... there was a fix in 9.0.12 that might be relevant to this:
>
> http://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=5840e3181b7e6c784fdb3aff708c4dcc2dfe551d
> Whether that explains it or not, 9.0.5 is getting long in the tooth;
> you really need to think about an update.  Especially in view of
> CVE-2013-1899.
>
>
Yeah, we've been in the process of planning the update in light of the
security fix. We're likely just making the jump to 9.2 in the process so
that'll be handled soon.


> I think you should update your PG version and then resync your standbys
> (ie, apply a fresh base backup) at the next convenient maintenance
> window.


We did apply a fresh base backup last night to ensure our current standbys
are in a good state. I was mostly wondering if there were some better way
to continually monitor the state of our standbys on an ongoing basis aside
from checking pg_last_xlog_receive_location, etc. I realize it's no small
task but thought maybe there was something I had missed.

In the end, I think this just helps nudge our upgrade along :)

Thanks for all the help!
Dale


[GENERAL] Fetching Server configured port from C Module

2013-04-17 Thread Mike Roest
Hi There,
   I'm having a bit of an issue finding a C function to fetch the
configured server port from a C module.

We have written a C module to allow for remote clients to call a function
to run pg_dump/pg_restore remotely but create files locally on the db
server.

Currently it works fine if the client connects through a network socket, as
we're using inet_server_port to get the port to pass on to pg_dump/restore.
But if the client is connected through a unix socket (actually a remote
client connecting to pgbouncer, which connects to postgres through the
unix socket), inet_server_port is null.  I've looked for a function that we
can use to get the configured server port but haven't had any luck.

I could hard code the port in the module when we build it but it would be
nice to be able to change the configured postgres port and not have to
rebuild the module.

Anyone have any suggestions?

I've posted the code for our backup module here:

http://pastebin.com/wQ6VidWn


Re: [GENERAL] Fetching Server configured port from C Module

2013-04-17 Thread Bruce Momjian
On Wed, Apr 17, 2013 at 01:08:18PM -0600, Mike Roest wrote:
> Hi There,
>I'm having a bit of an issue finding a C function to fetch the configured
> server port from a C module.
> 
> We have written a C module to allow for remote clients to call a function to
> run pg_dump/pg_restore remotely but create files locally on the db server.  
> 
> Currently it works fine if the client connects through a network socket, as
> we're using inet_server_port to get the port to pass on to pg_dump/restore.
> But if the client is connected through a unix socket (actually a remote client
> connecting to pgbouncer, which connects to postgres through the unix socket),
> inet_server_port is null.  I've looked for a function that we can use to get
> the configured server port but haven't had any luck.
> 
> I could hard code the port in the module when we build it but it would be nice
> to be able to change the configured postgres port and not have to rebuild the
> module.

Well, there are technically no _ports_ in unix-domain sockets.  However,
the TCP port number is used to construct the socket file;  I think you
can use the simple "port" server-side variable for this;  does this help
you?

test=> SELECT setting FROM pg_settings WHERE name = 'port';
 setting
-
 5432
(1 row)

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Fetching Server configured port from C Module

2013-04-17 Thread John R Pierce

On 4/17/2013 12:08 PM, Mike Roest wrote:
I could hard code the port in the module when we build it but it would 
be nice to be able to change the configured postgres port and not have 
to rebuild the module.


Anyone have any suggestions?



SHOW PORT;

?

works in 9.2, anyways.



--
john r pierce  37N 122W
somewhere on the middle of the left coast



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Fetching Server configured port from C Module

2013-04-17 Thread Mike Roest
> SHOW PORT;
>
 test=> SELECT setting FROM pg_settings WHERE name = 'port';
 setting
-
 5432

Both of these are from a query context.  This is in a C module; I suppose I
could run a query, but there has to be a direct C function to get this data.


Re: [GENERAL] Fetching Server configured port from C Module

2013-04-17 Thread Bruce Momjian
On Wed, Apr 17, 2013 at 01:32:00PM -0600, Mike Roest wrote:
> 
> 
> 
> SHOW PORT;
> 
>  test=> SELECT setting FROM pg_settings WHERE name = 'port';
>  setting
> -
>  5432
>   
> Both of these are from a query context.  This is in a C module, I suppose I
> could run a query but there has to be a direct C function to get this data. 

Well, you could look at how to access the PostPortNumber global
variable.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How large can a PostgreSQL database get?

2013-04-17 Thread Scott Marlowe
On Wed, Apr 17, 2013 at 12:53 PM, Michael Nolan  wrote:
> On 4/17/13, Scott Marlowe  wrote:
>> My experience, doing production and dev dba work on both postgresql
>> and oracle, is that either works well, as long as you partition
>> properly or even break things into silos. Oracle isn't magic pixie
>> dust that suddenly gets hardware with 250MB/s seq read arrays to read
>> at 1GB/s, etc.
>>
>> With oracle partitioning is easier, and everything else on the
>> freaking planet is harder.
>
>
> Scott, thank you for the best laugh I've had all day!
>
> I started out on Oracle (some 20 years ago) and have been running both
> MySQL and PostgreSQL databases for the last 10 years or so.  I'd take
> PostgreSQL over the other two in a heartbeat!
>
> Data integrity/data preservation issues (backup is just one aspect of
> that) are going to be your biggest problems with VERY large databases,
> no matter how much money you throw at it.

Good points. No matter what db engine you're using, whether it's super
fast transactional systems, monstrous data mining systems, or 24/7
can't ever go down dbs, data integrity, hardware acceptance,
monitoring, and disaster recovery are the most important subjects to
be proficient in. Now some db engines do nothing but get in your way,
but both postgresql and oracle seem to be reasonably good platforms
for largish deployments, say a few terabytes, without a lot of
futzing. After that planning becomes king. You're never gonna just
deploy a single server with 100 petabytes storage and expect it to
scale.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Most efficient way to insert without duplicates

2013-04-17 Thread François Beausoleil

Le 2013-04-17 à 14:15, Jeff Janes a écrit :

> On Wed, Apr 17, 2013 at 4:26 AM, François Beausoleil  
> wrote:
> 
> 
>  Insert on public.persona_followers  (cost=139261.12..20483497.65 
> rows=6256498 width=16) (actual time=4729255.535..4729255.535 rows=0 loops=1)
>Buffers: shared hit=33135295 read=4776921
>->  Subquery Scan on t1  (cost=139261.12..20483497.65 rows=6256498 
> width=16) (actual time=562265.156..578844.999 rows=6819520 loops=1)
> 
> 
> It looks like 12% of the time is being spent figuring out what rows to 
> insert, and 88% actually doing the insertions.
> 
> So I think that index maintenance is killing you.  You could try adding a 
> sort to your select so that rows are inserted in index order, or inserting in 
> batches in which the batches are partitioned by service_id (which is almost 
> the same thing as sorting, since service_id is the lead column)

In that case, partitioning the original table by service_id % N would help, 
since the index would be much smaller, right?

N would have to be reasonable - 10, 100, 256, or something similar.

Thanks,
François





[GENERAL] Single Row Mode in psql

2013-04-17 Thread Christopher Manning
I'm using psql to extract data from a Redshift (based on Postgres)
instance, but psql/libpq collects the result in memory before writing it to
a file, which causes out-of-memory problems for large results. Using COPY TO
STDOUT or FETCH_COUNT isn't an option since Redshift doesn't support those.

[Single Row Mode is available in postgres 9.2](
http://www.postgresql.org/docs/9.2/static/libpq-single-row-mode.html) but
[it doesn't look like](
http://www.postgresql.org/docs/9.2/static/app-psql.html) you can tell psql
to use it when writing to a file (using --output).

Is it currently possible to have psql --output to a file using single row
mode?

Thank you,
Christopher

P.S. The ruby-pg gem supports single row mode:
https://bitbucket.org/ged/ruby-pg/src/de1cdb0f7ba625ad7cec02bb871ae8fdf7de68c8/ext/pg_connection.c?at=default#cl-1514
but it has the overhead of being a ruby script instead of psql supporting
it.


Re: [GENERAL] Fetching Server configured port from C Module

2013-04-17 Thread Mike Roest
Perfect thanks Bruce that worked.

I just extern'd PostPortNumber in my module and everything seems to be
working.

--Mike


Re: [GENERAL] Most efficient way to insert without duplicates

2013-04-17 Thread Jeff Janes
On Wed, Apr 17, 2013 at 1:19 PM, François Beausoleil
wrote:

>
> Le 2013-04-17 à 14:15, Jeff Janes a écrit :
>
>
> It looks like 12% of the time is being spent figuring out what rows to
> insert, and 88% actually doing the insertions.
>
> So I think that index maintenance is killing you.  You could try adding a
> sort to your select so that rows are inserted in index order, or inserting
> in batches in which the batches are partitioned by service_id (which is
> almost the same thing as sorting, since service_id is the lead column)
>
>
This analysis is based on your example, which inserted 7 million rows.  But
I just noticed you also said you only have a few thousand rows to insert
per day.  So if you make your example better match your use case, perhaps
that analysis would no longer hold.


>
> In that case, partitioning the original table by service_id % N would
> help, since the index would be much smaller, right?
>

Probably not.  If you partition the table but do not change your loading
method, then the relevant thing would be the sum of the index sizes over
all partitions, which would be about the same as now.

On the other hand, if you change the method to load the data in batches,
you don't need to partition the table, you just need to align the batches
with the index order.  You could use partitioning as a way to do that, but
it is just as easy (or easier) to do so without partitioning.

Once you solve the index maintenance problem, partitioning might help solve
the select part of the deduplication problem, though.  You would only need
to check against existing data for the partition into which you already
know the current batch is going to be loaded.

Also, the constraint_exclusion code is usually not smart enough to deal
with constraints that use the modulus (unless the modulus itself appears in
the where clause).  You would have to use range partitioning instead.
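
For what it's worth, here is a minimal sketch of range partitioning as it
works in this era (inheritance plus CHECK constraints; the bounds and
partition names are hypothetical, and the trigger or load logic that
routes rows into the right child is omitted):

CREATE TABLE persona_followers_p0 (
    CHECK (service_id >= 0 AND service_id < 1000000)
) INHERITS (persona_followers);

ALTER TABLE persona_followers_p0
    ADD PRIMARY KEY (service_id, follower_id);

-- with constraint exclusion, queries that constrain service_id
-- directly can skip the other partitions entirely:
SET constraint_exclusion = partition;
SELECT count(*) FROM persona_followers WHERE service_id < 1000000;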

Cheers,

Jeff


[GENERAL] Inherit Superuser Role Help

2013-04-17 Thread Carlos Mennens
What am I missing here?

postgres=> SELECT current_user;
 current_user
--
 carlos
(1 row)

postgres=> CREATE DATABASE carlosdb;
ERROR:  permission denied to create database

postgres=> \du
                                 List of roles
 Role name |                          Attributes                           | Member of
-----------+---------------------------------------------------------------+-----------
 carlos    |                                                               | {dba}
 chris     |                                                               | {web}
 dba       | Superuser, Create role, Create DB, Cannot login, Replication | {}
 postgres  | Superuser, Create role, Create DB, Replication                | {}
 web       | Cannot login                                                  | {}

Shouldn't 'carlos' be a superuser based on him being a member of a
role which has createdb and superuser rights granted to it?

--
Carlos Mennens


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Inherit Superuser Role Help

2013-04-17 Thread Tom Lane
Carlos Mennens  writes:
> Shouldn't 'carlos' be a superuser based on him being a member of a
> role which has createdb and superuser rights granted to it?

No.  Superuserness is quite intentionally not inheritable.

It's perhaps a bit more debatable whether other role privilege bits such
as CREATEDB ought to be inheritable.  The position we've taken though
is that only grant-able rights inherit via GRANT ROLE.

This is documented under CREATE ROLE:

The INHERIT attribute governs inheritance of grantable privileges (that
is, access privileges for database objects and role memberships). It
does not apply to the special role attributes set by CREATE ROLE and
ALTER ROLE. For example, being a member of a role with CREATEDB
privilege does not immediately grant the ability to create databases,
even if INHERIT is set; it would be necessary to become that role via
SET ROLE before creating a database.
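
Concretely, using the roles from the original post:

-- membership alone is not enough; explicitly assume the role first
SET ROLE dba;
CREATE DATABASE carlosdb;  -- now allowed: dba's CREATEDB applies
RESET ROLE;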

regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general