Re: [GENERAL] shared_buffers formula

2015-03-04 Thread Scott Marlowe
On Wed, Mar 4, 2015 at 4:50 AM, Bill Moran  wrote:
> On Wed, 4 Mar 2015 14:05:09 +0400
> Alexander Shutyaev  wrote:
>
>> Thanks for the answer. Now, given this info I've calculated that our
>> postgresql should occupy approx. 30,53 GB while the server has 125 GB of
>> RAM. However we often see in top that there is very little free memory and
>> even swap is used. What could be the reason of postgres using so much
>> memory?
>
> Memory usage depends much more on the OS than on Postgres, more than
> you might realize. I don't see where you state the OS, but I'll assume
> it's Linux for now.
>
> Linux default NUMA policy seems to be tuned toward applications
> that don't use a lot of RAM. If your 128G server has 8 CPUs, then
> Linux will allow a single process to use 16G of RAM before deciding
> that it has to use swap for that process. This is one of the
> advantages I find with FreeBSD.
>
> Read up on how NUMA works a bit, and do some research into how to
> tune the NUMA policies ... assuming, of course, that you _are_ using
> Linux. Or switch to FreeBSD where the default NUMA policy is more
> friendly to programs that use a lot of RAM.

The single most important step on a NUMA pg machine is to make sure
that zone_reclaim_mode = 0. If the kernel detects a largish internode
communication cost, it will turn it on at boot time, and after you've
been running for a while the machine will suddenly slow to a crawl as it
tries to move things around in memory. Another important step is to lower
the background dirty bytes settings (vm.dirty_background_bytes and
vm.dirty_bytes) so you don't get massive background writes.
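
For reference, a quick sketch of checking and setting those knobs on
Linux (the byte values here are just illustrative starting points):

cat /proc/sys/vm/zone_reclaim_mode
sysctl -w vm.zone_reclaim_mode=0
sysctl -w vm.dirty_background_bytes=67108864   # ~64MB
sysctl -w vm.dirty_bytes=536870912             # ~512MB
# persist the values in /etc/sysctl.conf so they survive a reboot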




Re: [GENERAL] Index corruption

2015-03-30 Thread Scott Marlowe
On Tue, Mar 24, 2015 at 12:49 PM, Bankim Bhavsar
 wrote:
> Hello postgres experts,
>
> We are running a test that periodically abruptly kills postgres
> process(equivalent to kill -9) and restarts it.
> After running this test for 24 hrs or so, we see duplicate primary key
> entries in postgres table.
>

Are you killing the POSTMASTER process, deleting the pid file, and
then restarting it without killing its children? That's a recipe for
corruption.
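
If the goal is just to exercise crash recovery, a safer sketch (the data
directory path is a placeholder) is to take the whole cluster down hard
with pg_ctl rather than killing only the postmaster:

pg_ctl -D /path/to/data stop -m immediate   # simulates a crash, kills all backends
pg_ctl -D /path/to/data start               # runs normal crash recovery on startup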




Re: [GENERAL] unexpected (to me) sorting order

2015-04-09 Thread Scott Marlowe
On Wed, Apr 8, 2015 at 3:33 AM, Glyn Astill  wrote:
>
>> From: Björn Lundin 
>>To: pgsql-general@postgresql.org
>>Sent: Wednesday, 8 April 2015, 10:09
>>Subject: [GENERAL] unexpected (to me) sorting order
>>
>>select * from T_SORT order by NAME ;
>>
>>rollback;
>> id |name
>>+
>>  1 | FINISH_110_150_1
>>  2 | FINISH_110_200_1
>>  3 | FINISH_1.10_20.0_3
>>  4 | FINISH_1.10_20.0_4
>>  5 | FINISH_1.10_30.0_3
>>  6 | FINISH_1.10_30.0_4
>>  7 | FINISH_120_150_1
>>  8 | FINISH_120_200_1
>>(8 rows)
>>
>>why is FINISH_1.10_20.0_3 between
>> FINISH_110_200_1 and
>> FINISH_120_150_1
>>?
>>
>>That is why is '.' between 1 and 2 as in 110/120 ?
>>
>>
>>pg_admin III reports the database is created like
>>CREATE DATABASE bnl
>>  WITH OWNER = bnl
>>   ENCODING = 'UTF8'
>>   TABLESPACE = pg_default
>>   LC_COLLATE = 'en_US.UTF-8'
>>   LC_CTYPE = 'en_US.UTF-8'
>>   CONNECTION LIMIT = -1;
>>
>>
>
>
>
> The collation of your "bnl" database is utf8, so the "." punctuation 
> character is seen as a "variable element" and given a lower weighting in the 
> sort to the rest of the characters.  That's just how the collate algorithm 
> works in UTF8.

utf8 is an encoding method, not a collation. The collation is en_US,
encoded in utf8. You can use C collation with utf8 encoding just fine.
So just replace UTF8 with en_US in your sentence and you've got it
right.
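
To illustrate with the table from the post above, you can override the
collation per query (or per column) to get byte-order sorting, where the
punctuation sorts before the digits:

select * from T_SORT order by NAME collate "C";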




Re: [GENERAL] moving to PostgreSQL from MS-SQL and from Oracle, looking for feature comparison information

2015-05-10 Thread Scott Marlowe
On Sat, May 9, 2015 at 11:20 PM, Albe Laurenz  wrote:
> Maxim Boguk wrote:
>> It depends on where the corruption happens; if pages become corrupted due to
>> some problems with the physical storage (filesystem), then the replica's data
>> should be OK.
>
> I would not count on that.
> I have had a case where a table file got corrupted due to hardware problems.
> Pages that contained data were suddenly zeroed.
> PostgreSQL recognizes such a block as empty, so the user got no error, but
> data were suddenly missing. What does a user do in such a case? He/she 
> grumbles
> and enters the data again. This insert will be replicated to the standby 
> (which was
> fine up to then) and will cause data corruption there (duplicate primary 
> keys).

You had zero_damaged_pages turned on. PostgreSQL by default does NOT
DO THIS. That setting is for recovering a corrupted database, not for
everyday use!




Re: [GENERAL] moving to PostgreSQL from MS-SQL and from Oracle, looking for feature comparison information

2015-05-10 Thread Scott Marlowe
On Sun, May 10, 2015 at 7:50 AM, Albe Laurenz  wrote:
> Scott Marlowe wrote:
>> On Sat, May 9, 2015 at 11:20 PM, Albe Laurenz  
>> wrote:
>>> Maxim Boguk wrote:
>>>> It depends on where the corruption happens; if pages become corrupted due to
>>>> some problems with the physical storage (filesystem), then the replica's
>>>> data should be OK.
>
>>> I would not count on that.
>>> I have had a case where a table file got corrupted due to hardware problems.
>>> Pages that contained data were suddenly zeroed.
>>> PostgreSQL recognizes such a block as empty, so the user got no error, but
>>> data were suddenly missing. What does a user do in such a case? He/she 
>>> grumbles
>>> and enters the data again. This insert will be replicated to the standby 
>>> (which was
>>> fine up to then) and will cause data corruption there (duplicate primary 
>>> keys).
>
>> You had zero_damaged_pages turned on. PostgreSQL by default does NOT
>> DO THIS. That setting is for recovering a corrupted database, not for
>> everyday use!
>
> No, I didn't.
>
> It was not PostgreSQL that zeroed the page, but the hardware or operating 
> system.
> The problem was a flaky fibre channel cable that intermittently was connected 
> and disconnected.
> That corrupted the file system, and I guess it must have been file system 
> recovery
> that zeroed the pages.  I'm not 100% certain, at any rate the symptoms were 
> silently missing data.

Ahh OK. So broken hardware. I've seen some RAID controllers do that.
Sorry, but your post didn't make it clear where the zeroing came from.




Re: [GENERAL] Index on integer or on string field

2015-05-15 Thread Scott Marlowe
On Fri, May 15, 2015 at 9:18 AM, Job  wrote:
> Hello,
>
> i have a table of about 10 millions of records, with the index on a string 
> field.
> Actually is alphabetical; since queries are about 100/200 per seconds, i was 
> looking for a better way to improve performance and reduce workload.
>
> The unique values, of that fields, are about the 50 (category name), and we 
> could create a second table to codify, with numerical integer values, the 50 
> recurring names.
>
> If the index is on an integer rather than on a character field, will performance 
> be better and the workload reduced?
>
> Are there any comparisons?

Have you considered using a multi-column index here? If there's a more
selective field you could index along with your rather non-selective
one, that might work better. But it's hard to tell without looking at
your database usage etc.
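
For example (table and column names here are made up for illustration),
you could either encode the ~50 category names via a lookup table or pair
the low-selectivity column with a more selective one:

CREATE TABLE category (id int PRIMARY KEY, name text UNIQUE NOT NULL);
ALTER TABLE items ADD COLUMN category_id int REFERENCES category(id);

CREATE INDEX items_cat_created_idx ON items (category_name, created_at);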




Re: [GENERAL] PG and undo logging

2015-05-23 Thread Scott Marlowe
On Sat, May 23, 2015 at 2:34 PM, Ravi Krishna  wrote:
> Is it true that PG does not log undo information, only redo. If true,
> then how does it bring a database back to consistent state during
> crash recovery. Just curious.

PostgreSQL is NOT Oracle, and it doesn't work like Oracle. This is normal.

Ever run an insert of 1M rows and roll it back in PostgreSQL, then
compare that to Oracle? Time the rollback in both. That should give
you an idea of how differently the two DBs operate.

A rollback in postgres is essentially immediate because there's
nothing to undo, so to speak: none of the changes are visible to
anyone else until the transaction commits.

Oracle spends its time and effort creating an "undo" log because it
writes changes back to the data files as it goes, whether or not
you've committed your transaction.

PostgreSQL doesn't need undo. The whole transaction is recorded in the
write-ahead log (the pg_xlog directory in pg lingo).

When you roll back a pg transaction it literally requires almost no
work. Mark the transaction as aborted etc and get on with life.
Transaction logs get cleaned up as usual in the background and we go
on our way.

This means that Oracle uses space for rollback, while postgres uses
space for "roll forward" (aka the transaction logs) so to speak.

-- 
To understand recursion, one must first understand recursion.




Re: [GENERAL] How to skip duplicate records while copying from CSV to table in Postgresql using "COPY"

2015-05-24 Thread Scott Marlowe
On Sun, May 24, 2015 at 4:26 AM, Arup Rakshit
 wrote:
> Hi,
>
> I am copying the data from a CSV file to a table using the "COPY" command. But 
> one thing I got stuck on is how to skip duplicate records while copying 
> from CSV to tables. Looking at the documentation, it seems Postgresql 
> doesn't have any built-in tool to handle this with the "copy" command. By 
> Googling I got the idea below to use a temp table.
>
> http://stackoverflow.com/questions/13947327/to-ignore-duplicate-keys-during-copy-from-in-postgresql
>
> I am also thinking what if I let the records get inserted, and then delete 
> the duplicate records from table as this post suggested - 
> http://www.postgresql.org/message-id/37013500.dff0a...@manhattanproject.com.
>
> Both of the solution looks like doing double work. But I am not sure which is 
> the best solution here. Can anybody suggest which approach should I adopt ? 
> Or if any better ideas you guys have on this task, please share.

Have you looked at pgloader?
http://pgloader.io/index.html
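
If you'd rather stay in plain SQL, the usual pattern (sketched here with
made-up table/column names) is to COPY into a staging table and then
insert only the rows that aren't already present:

CREATE TEMP TABLE staging (LIKE target INCLUDING DEFAULTS);
COPY staging FROM '/path/to/file.csv' WITH (FORMAT csv, HEADER true);
-- (or \copy from psql if the file lives on the client)

INSERT INTO target
SELECT DISTINCT ON (id) *
FROM staging s
WHERE NOT EXISTS (SELECT 1 FROM target t WHERE t.id = s.id);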




Re: [GENERAL] How to craft a query that uses memory?

2015-06-19 Thread Scott Marlowe
Note that after cranking up work_mem you then need a query that can
use it. The quickest way to use LOTS of memory is to do a lot of
unconstrained joins.

select * from some_table a join some_table b on (true) join some_table c on (true)
join some_table d on (true);  -- the result set size is a*b*c*d rows
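
Another easy way to exercise work_mem (a sketch; the row count is
arbitrary) is a big sort, which stays in memory or spills to disk
depending on the setting:

SET work_mem = '64MB';
EXPLAIN ANALYZE
SELECT g FROM generate_series(1, 5000000) AS g ORDER BY g DESC;
-- look for "Sort Method: quicksort  Memory: ..." vs "external merge  Disk: ..."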

On Fri, Jun 19, 2015 at 7:21 AM,
 wrote:
> Albe Laurenz wrote on Friday, June 19, 2015 12:56 PM:
>> Holger Friedrich wrote:
>> > So how do I craft a query that actually does use lots of memory?
>
>> You increase the parameter "work_mem".
>
>> You can do that globally in postgresql.conf or with SET for one session or 
>> with SET LOCAL for one transaction.
>
> Thank you to both Albe Laurenz and Andreas Kretschmer, who both gave the same 
> advice.  It was spot-on:  my query actually seems to be good for a 
> performance study, setting work_mem to various values causes the query 
> planner to either sort in memory or on disk, and of course this has some 
> impact on performance.
>
> (Sorry for erroneously asking for suggesting a "better" query, rather than 
> for tuning my configuration, which turned out to be the sensible thing to 
> do...)
>
> Best regards,
> Holger Friedrich
>



-- 
To understand recursion, one must first understand recursion.




Re: [GENERAL] Ubuntu 14.04 LTS install problem

2015-07-01 Thread Scott Marlowe
On Wed, Jul 1, 2015 at 3:05 AM, Urs Berner  wrote:
> Am 01.07.2015 um 09:08 schrieb Charles Clavadetscher:
>>
>> Hello Urs
>>
>> I could install PostgreSQL 9.4 on Ubuntu 14.04 without adding repositories
>> (actually it was an upgrade and I had to move the data manually from 9.3 to
>> 9.4). You may want to try it that way?
>>
>> Bye
>> Charles
>
>
> the Ubuntu-Software Center offers postgresql-9.3, but when trying to install
> I get an error:
>
> dependencies missing, "postgresql:" but not more hints
>
> Because it is a totally new computer I have not yet had an older instance of
> postgresql.
>
> When I try via terminal
> apt-get install postgresql-9.3 there are the same messages about
> postgresql-common (>= 142~) ... defect packages ...

This isn't really a PostgreSQL problem; this is a problem with an
Ubuntu machine with a broken packaging system. I'd google "ubuntu
defect packages" and see what you find there. There's usually some
single-line magical invocation that fixes this kind of stuff.
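
For what it's worth, the generic first-aid steps for a broken apt state
usually look something like this:

sudo apt-get update
sudo apt-get -f install        # try to fix broken/missing dependencies
sudo dpkg --configure -a       # finish any half-configured packages
sudo apt-get install postgresql-9.3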




Re: [GENERAL] How to get total count of queries hitting DB per day or per hour?

2015-07-09 Thread Scott Marlowe
On Thu, Jul 9, 2015 at 12:04 PM, Sheena, Prabhjot
 wrote:
> Guys
>
>   Is there a way to get total count of queries hitting
> Postgresql DB(verison 9.3) per day or per hour ? I cannot turn on
> log_statement=all coz that’s too much logging for our log file to handle. Is
> there another way to do that.

If you just need totals, turn on just log_duration and then you can
grep the logs etc for that.
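
A rough sketch of what that looks like (the log path depends on your
packaging and settings):

# postgresql.conf, then reload:
log_duration = on

# one "duration:" line is logged per completed statement, so e.g.:
grep -c "duration:" /var/log/postgresql/postgresql-9.3-main.log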




Re: [GENERAL] Delete rule does not prevent truncate

2015-07-23 Thread Scott Marlowe
On Thu, Jul 23, 2015 at 12:09 PM, Adrian Klaver
 wrote:
> On 07/23/2015 04:57 AM, Tim Smith wrote:
>>
>> Andrew,
>>
>>  From the manual:
>>
>> It is important to realize that a rule is really a command
>> transformation mechanism, or command macro. The transformation happens
>> before the execution of the command starts. If you actually want an
>> operation that fires independently for each physical row, you probably
>> want to use a trigger, not a rule
>>
>>
>> Thus, I should not have to use a trigger for TRUNCATE because the "each
>> row" concept does not apply. Plus it makes perfect sense to want to
>> transform the truncate command into a no-op.
>>
>
> Just in case it has not been made obvious yet, rules are silently
> deprecated. They still exist because views depend on them, but it is
> generally considered best practices to not use them outside that realm. So
> if you want the rule behavior to change for TRUNCATE(if that is even
> possible) you are fighting an uphill battle. You may pursue that fight of
> course, but I would think you will get a quicker return on your time if you
> just forget about using a RULE and stick to a TRIGGER instead.
>


Also, the OP needs to know that COPY commands are ignored by rules as well.
I agree, stick to triggers, they're faster and less error prone.
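
For the TRUNCATE case specifically, a statement-level trigger does what
the rule can't; a minimal sketch (the table name is just an example):

CREATE OR REPLACE FUNCTION forbid_truncate() RETURNS trigger AS $$
BEGIN
    RAISE EXCEPTION 'TRUNCATE is not allowed on %', TG_TABLE_NAME;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER no_truncate
    BEFORE TRUNCATE ON my_table
    FOR EACH STATEMENT EXECUTE PROCEDURE forbid_truncate();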




Re: [GENERAL] Delete rule does not prevent truncate

2015-07-24 Thread Scott Marlowe
On Fri, Jul 24, 2015 at 3:27 AM, Tim Smith  wrote:
> On 23 July 2015 at 19:25, Scott Marlowe  wrote:
>> stick to triggers, they're faster
>
>
> Erm, not according to your beloved manual !!!
>
> 38.7. Rules Versus Triggers
>
> " a rule issuing one extra command is likely to be faster than a trigger"
> "The summary is, rules will only be significantly slower than triggers
> if their actions result in large and badly qualified joins, a
> situation where the planner fails."

So, that's the sum total of what you took away from my post?
Nevermind, I'll leave you alone.




Re: [GENERAL] Lots of stuck queries after upgrade to 9.4

2015-07-30 Thread Scott Marlowe
You might want to try pgreplay: http://laurenz.github.io/pgreplay/

On Thu, Jul 30, 2015 at 7:23 AM, Spiros Ioannou  wrote:

> I'm very sorry but we don't have a synthetic load generator for our
> testing setup, only production and that is on SLA. I would be happy to test
> the next release though.
>
> Spiros Ioannou, IT Manager, inAccess (www.inaccess.com)
> M: +30 6973-903808  T: +30 210-6802-358
>
> On 29 July 2015 at 13:42, Heikki Linnakangas  wrote:
>
>> On 07/28/2015 11:36 PM, Heikki Linnakangas wrote:
>>
>>> A-ha, I succeeded to reproduce this now on my laptop, with pgbench! It
>>> seems to be important to have a very large number of connections:
>>>
>>> pgbench -n -c400 -j4 -T600 -P5
>>>
>>> That got stuck after a few minutes. I'm using commit_delay=100.
>>>
>>> Now that I have something to work with, I'll investigate this more
>>> tomorrow.
>>>
>>
>> Ok, it seems that this is caused by the same issue that I found with my
>> synthetic test case, after all. It is possible to get a lockup because of
>> it.
>>
>> For the archives, here's a hopefully easier-to-understand explanation of
>> how the lockup happens. It involves three backends. A and C are insertion
>> WAL records, while B is flushing the WAL with commit_delay. The byte
>> positions 2000, 2100, 2200, and 2300 are offsets within a WAL page. 2000
>> points to the beginning of the page, while the others are later positions
>> on the same page. WaitToFinish() is an abbreviation for
>> WaitXLogInsertionsToFinish(). "Update pos X" means a call to
>> WALInsertLockUpdateInsertingAt(X). "Reserve A-B" means a call to
>> ReserveXLogInsertLocation, which returned StartPos A and EndPos B.
>>
>> Backend A   Backend B   Backend C
>> -   -   -
>> Acquire InsertLock 2
>> Reserve 2100-2200
>> Calls WaitToFinish()
>>   reservedUpto is 2200
>>   sees that Lock 1 is
>>   free
>> Acquire InsertLock 1
>> Reserve 2200-2300
>> GetXLogBuffer(2200)
>>  page not in cache
>>  Update pos 2000
>>  AdvanceXLInsertBuffer()
>>   run until about to
>>   acquire WALWriteLock
>> GetXLogBuffer(2100)
>>  page not in cache
>>  Update pos 2000
>>  AdvanceXLInsertBuffer()
>>   Acquire WALWriteLock
>>   write out old page
>>   initialize new page
>>   Release WALWriteLock
>> finishes insertion
>> release InsertLock 2
>> WaitToFinish() continues
>>   sees that lock 2 is
>>   free. Returns 2200.
>>
>> Acquire WALWriteLock
>> Call WaitToFinish(2200)
>>   blocks on Lock 1,
>>   whose initializedUpto
>>   is 2000.
>>
>> At this point, there is a deadlock between B and C. B is waiting for C to
>> release the lock or update its insertingAt value past 2200, while C is
>> waiting for WALInsertLock, held by B.
>>
>> To fix that, let's fix GetXLogBuffer() to always advertise the exact
>> position, not the beginning of the page (except when inserting the first
>> record on the page, just after the page header, see comments).
>>
>> This fixes the problem for me. I've been running pgbench for about 30
>> minutes without lockups now, while without the patch it locked up within a
>> couple of minutes. Spiros, can you easily test this patch in your
>> environment? Would be nice to get a confirmation that this fixes the
>> problem for you too.
>>
>> - Heikki
>>
>>
>


-- 
To understand recursion, one must first understand recursion.


Re: [GENERAL] PostgreSQL customer list

2015-08-19 Thread Scott Marlowe
On Tue, Aug 18, 2015 at 9:33 PM, Sridhar N Bamandlapally
 wrote:
> Hi
>
> I need some top 10 customers list in financial or healthcare domain (
> irrespective of open-source or proprietary tool )
>
> We need to showcase to our customer for building analytical database
>
> Please do share, it will be really helpful

Please note that since PostgreSQL is completely free, and there are no
registration requirements, such reporting is completely voluntary, and
many places that use it do not advertise that they do. So it's pretty
hard to say "here are our top 10 users" when we don't really know who
the biggest users are. For all we know, the Fortune 50 could all be
heavily using it and we'd never know unless someone there spoke up and
told us.




Re: [GENERAL] Need Database Backup

2015-08-29 Thread Scott Marlowe
On Fri, Aug 28, 2015 at 11:27 PM, Murali N Rao
 wrote:

> Sir,
>
> We were using newgenlib software which is having postgresql as backend.
>
> Suddenly Our Systems OS got corrupted and unable to start and we have taken 
> installed folder as backup from the drive but we are unable to restore and 
> kindly help in this.
>
> and also we have also taken backup 30 days in .sql format but even i am 
> unable to upload.
>
> So, i kindly request u to help in this as the database contains more than 
> 2000 Students data
>

OK, let's see what we can do to help you. What OS are you running? How
did you move the database directory to the new machine? What errors
are you getting when you try to start the db? What errors are you getting
when trying to restore from your old .sql backup? What version of
pgsql are you running?

Just kinda imagine one of us is looking over your shoulder and tell us
everything. Details that may seem small and unimportant can be very
important in helping to troubleshoot this kind of thing.




Re: [GENERAL] Need Database Backup

2015-08-30 Thread Scott Marlowe
OK we're gonna need to keep this on the list, as other folks may be a much
better choice to help you than me (I'm not a Windows guy.)

OK so we need more info to help. I don't know how to fix the NTLDR problem,
etc., because it's Windows. However, the folks who might be able to help
need more info.

Specifically, what are you doing to try and load the .sql backup file?
Are you using a second machine for this?
Have you made a backup of the files in d:\Program Files\Postgresql and put
it in a safe place? It's important to NOT mess around with the only copy in
case things go wrong with it.
What exactly HAVE you done to try and move the PostgreSQL directory
elsewhere? What EXACT errors are you seeing?

The more information you provide the more we can help. Right now you're
asking for help, but you're not helping yourself, because you're not
showing us what's going wrong. More details gets you more answers.

On Sun, Aug 30, 2015 at 3:39 AM, Murali N Rao 
wrote:

> Sir,
>
> Suddenly Our system got restarted and we got a msg NTLDR Missing. I tried
> to repair but not able.
>
> OS: windows 2003 Server
>
> We installed the PGSQL in D:\program files\Postgresql
>
> I have whole folder having capacity of 1.53 GB
>
> and i have recently taken backup on 01st of August with Extension (.sql)
> even i cant upload that.
>
> Please i request u to do the need ful
>
> Regards
>
> Murali N
> System Admin,
> CIT, Gubbi
>
>
> From: Scott Marlowe 
> Sent: Sun, 30 Aug 2015 09:34:42
> To: muralih...@rediffmail.com
> Cc: "pgsql-general@postgresql.org" 
> Subject: Re: [GENERAL] Need Database Backup
> On Fri, Aug 28, 2015 at 11:27 PM, Murali N Rao
>  wrote:
>
> > Sir,
> >
> > We were using newgenlib software which is having postgresql as backend.
> >
> > Suddenly Our Systems OS got corrupted and unable to start and we have
> taken installed folder as backup from the drive but we are unable to
> restore and kindly help in this.
> >
> > and also we have also taken backup 30 days in .sql format but even i am
> unable to upload.
> >
> > So, i kindly request u to help in this as the database contains more
> than 2000 Students data
> >
>
> OK, let's see what we can do to help you. What OS are you running? How
> did you move the database directory to the new machine? What errors
> are you getting when you try to start the db? What errors are you getting
> when trying to restore from your old .sql backup? What version of
> pgsql are you running?
>
> Just kinda imagine one of us is looking over your shoulder and tell us
> everything. Details that may seem small and unimportant can be very
> important in helping to troubleshoot this kind of thing.
>
>
>



-- 
To understand recursion, one must first understand recursion.


Re: [GENERAL] WAL Shipping and streaming replication

2015-09-28 Thread Scott Marlowe
On Mon, Sep 28, 2015 at 8:48 AM, CS DBA  wrote:
> All;
>
> We have a 3 node replication setup:
>
> Master (node1) --> Cascading Replication Node (node2)  --> Downstream
> Standby node (node3)
>
> We will be deploying WAL archiving from the master for PITR backups and
> we'll use the staged WAL files in the recovery.conf files in case the
> standbys need to revert to log shipping.
>
> Question:  whats the best way to ensure consistency of WAL archiving in the
> case of changes  (failover, etc)? can we setup the cascade node to archive
> wals only if it's the master? is this a case where we should deploy repmgr?

Look up WAL-E. It works really well. We tried using OmniPITR and
it's buggy and doesn't seem to get fixed very quickly (if at all).
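
For context, WAL-E hooks in through archive_command / restore_command;
a typical setup looks roughly like this (the envdir path is a placeholder
holding the storage credentials; check the WAL-E docs for your backend):

# postgresql.conf
wal_level = archive
archive_mode = on
archive_command = 'envdir /etc/wal-e.d/env wal-e wal-push %p'

# recovery.conf on a standby restoring from the archive
restore_command = 'envdir /etc/wal-e.d/env wal-e wal-fetch %f %p'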




Re: [GENERAL] WAL Shipping and streaming replication

2015-09-28 Thread Scott Marlowe
On Mon, Sep 28, 2015 at 9:12 AM, Keith Fiske  wrote:
>
>
> On Mon, Sep 28, 2015 at 10:54 AM, Scott Marlowe 
> wrote:
>>
>> On Mon, Sep 28, 2015 at 8:48 AM, CS DBA 
>> wrote:
>> > All;
>> >
>> > We have a 3 node replication setup:
>> >
>> > Master (node1) --> Cascading Replication Node (node2)  --> Downstream
>> > Standby node (node3)
>> >
>> > We will be deploying WAL archiving from the master for PITR backups and
>> > we'll use the staged WAL files in the recovery.conf files in case the
>> > standbys need to revert to log shipping.
>> >
>> > Question:  whats the best way to ensure consistency of WAL archiving in
>> > the
>> > case of changes  (failover, etc)? can we setup the cascade node to
>> > archive
>> > wals only if it's the master? is this a case where we should deploy
>> > repmgr?
>>
>> Look up WAL-E. It's works really well. We tried using OmniPITR and
>> it's buggy and doesn't seem to get fixed very quickly (if at all).
>>
>>
>
>
>
> If you've encountered bugs with OmniPITR, please feel free to open an issue
> on Github. If you look at the issue and commit history you can see that we
> do indeed fix reported issues or respond to help people with problems they
> are having.
>
> https://github.com/omniti-labs/omnipitr

The issue was reported as omnipitr-cleanup is SLOOOW, so we run
purgewal by hand, because the cleanup is so slow it can't keep up. But
running it by hand is not supported.

We fixed the problem though: we wrote our own script and are now
moving to WAL-E for all future stuff.




Re: [GENERAL] RAID and SSD configuration question

2015-10-20 Thread Scott Marlowe
On Tue, Oct 20, 2015 at 7:30 AM, Merlin Moncure  wrote:
> On Tue, Oct 20, 2015 at 3:14 AM, Birta Levente  wrote:
>> Hi
>>
>> I have a supermicro SYS-1028R-MCTR, LSI3108 integrated with SuperCap module
>> (BTR-TFM8G-LSICVM02)
>> - 2x300GB 10k spin drive, as raid 1 (OS)
>> - 2x300GB 15k spin drive, as raid 1 (for xlog)
>> - 2x200GB Intel DC S3710 SSD (for DB), as raid 1
>>
>> So how is better for the SSDs: mdraid or controller's raid?
>
> I personally always prefer mdraid if given a choice, especially when
> you have a dedicated boot drive.  It's better in DR scenarios and for
> hardware migrations.  Personally I find dedicated RAID controllers to
> be baroque.  Flash SSDs (at least the good ones) are basically big
> RAID 0s with their own dedicated cache, supercap, and controller
> optimized to the underlying storage peculiarities.
>
>> What's the difference between Write Back and Always Write Back with supercap
>> module?
>
> No clue.  With spinning drives simple performance tests would make the
> caching behavior obvious but with SSD that's not always the case.  I'm
> guessing(!) 'Always Write Back' allows the controller to buffer writes
> beyond what the devices do.

We're running LSI MegaRAIDs at work with 10-SSD RAID-5 arrays, and we
can get ~5k to 7k tps on a -s 1 pgbench with the write cache on.

When we turn the write cache off, we get 15k to 20k tps. This is on a
120GB pgbench db that fits in memory, so it's all writes.

Final answer: test it for yourself; you won't know which is faster until you do.
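
If you want to run the same kind of comparison yourself, a rough sketch
(scale, client count and duration are just examples):

createdb bench
pgbench -i -s 1000 bench          # initialize a test database
pgbench -c 32 -j 8 -T 300 bench   # 32 clients, 8 threads, 5 minutes
# repeat with the controller's write cache toggled and compare the tps numbers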




Re: [GENERAL] RAID and SSD configuration question

2015-10-20 Thread Scott Marlowe
> On Tue, Oct 20, 2015 at 9:33 AM, Scott Marlowe  
> wrote:
>> We're running LSI MegaRAIDs at work with 10 SSD RAID-5 arrays, and we
>> can get ~5k to 7k tps on a -s 1 pgbench with the write cache on.
>>
>> When we turn the write cache off, we get 15k to 20k tps. This is on a
>> 120GB pgbench db that fits in memory, so it's all writes.
>
> This is my findings exactly.  I'll double down on my statement;
> caching raid controllers are essentially obsolete technology.  They
> are designed to solve a problem that simply doesn't exist any more
> because of SSDs.  Unless your database is very, very, busy it's pretty
> hard to saturate a single low-mid tier SSD with zero engineering
> effort.  It's time to let go:  spinning drives are obsolete in the
> database world, at least in any scenario where you're measuring IOPS.

Here's what's REALLY messed up: the older the firmware on the
MegaRAID, the faster it ran with caching on. We had 3-to-4-year-old
firmware and were getting 7 to 8k tps. As we upgraded the firmware it got
all the way down to 3k tps, then the very latest got it back up to 4k
or so. No matter what version of the firmware, turning off caching got
us to 15 to 18k easily. So it appears the more aggressive and complex
caching algorithms just made things worse and worse.




Re: [GENERAL] can postgres run well on NFS mounted partitions?

2015-11-10 Thread Scott Marlowe
On Tue, Nov 10, 2015 at 4:26 PM, anj patnaik  wrote:
> Thanks for the feedback. I have setup a second Linux VM (running RHEL 5.11)
> and Postgres 9.4. I ran some insertions today from a client running on
> Windows. The client does a loop of 30 updates.
>
> I am seeing about 10-20% increase in latency in the case where DB is on NFS
> (over TCP) compared to directly on disk.
>
> The other machine I am using to compare is running RHEL 6.5 and Postgres
> 9.4.
>
> Are there any specific tests that are recommended to test that postgres over
> NFS works well?
>
> I am planning on doing a few large data inserts and fetches.
>
> With the little testing, the DB over NFS appears fine.

You need to do a power-failure test. While running something like
pgbench for a few minutes, run a CHECKPOINT command and then pull the
plug on the NFS server and / or the pg server. Bring it back up. Is
your db corrupted? If it is, there's something that needs fixing.
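
A rough outline of such a test (the database name is a placeholder):

pgbench -i -s 100 bench
pgbench -c 16 -j 4 -T 600 bench &        # keep a write load running
psql -d bench -c "CHECKPOINT;"
# cut power to the NFS server and/or the db server, power back on,
# restart postgres and check the data is intact, e.g.:
psql -d bench -c "SELECT count(*) FROM pgbench_accounts;"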




Re: [GENERAL] can postgres run well on NFS mounted partitions?

2015-11-12 Thread Scott Marlowe
On Thu, Nov 12, 2015 at 11:19 AM, anj patnaik  wrote:
> The Linux VM where postgres is running over NFS is in a different location
> than where I am. Both the NFS mounted storage and VM are on the same network
> connected via 1GB ethernet switch.
>
> The physical server for the Linux VM has UPS.
>
> Is there any specific test I can run to do power failure?
>
> Can I reboot my VM to test this or that wouldn't be good enough?
>
> Also, why does a checkpoint need to run? I used the graphical installer to
> install postgres so I assume it would start automatically when the server
> starts.
>
> I was also thinking of blackhole testing. If I do a blackhole to the NFS
> server would I be able to test this accurately?
>
> Folks in the other teams believe NFS should work fine for us so I need to
> check it out.
>
> Your ideas are  highly appreciated!

The point of the checkpoint is to make sure as much as possible is
being written to the data directory when you "pull the plug". But
without being able to pull the power plugs on the NAS or db server you
can't really test for reliability in case of power loss. So you can't
know that it'll survive one. Just disconnecting its network connection
etc. means it can still write out cached data even if it isn't properly
syncing it, so you won't catch the problem.




Re: [GENERAL] can postgres run well on NFS mounted partitions?

2015-11-12 Thread Scott Marlowe
On Thu, Nov 12, 2015 at 11:49 AM, Scott Marlowe  wrote:
> On Thu, Nov 12, 2015 at 11:19 AM, anj patnaik  wrote:
>> The Linux VM where postgres is running over NFS is in a different location
>> than where I am. Both the NFS mounted storage and VM are on the same network
>> connected via 1GB ethernet switch.
>>
>> The physical server for the Linux VM has UPS.
>>
>> Is there any specific test I can run to do power failure?
>>
>> Can I reboot my VM to test this or that wouldn't be good enough?
>>
>> Also, why does a checkpoint need to run? I used the graphical installer to
>> install postgres so I assume it would start automatically when the server
>> starts.
>>
>> I was also thinking of blackhole testing. If I do a blackhole to the NFS
>> server would I be able to test this accurately?
>>
>> Folks in the other teams believe NFS should work fine for us so I need to
>> check it out.
>>
>> Your ideas are  highly appreciated!
>
> The point of the checkpoint is to make sure as much as possible is
> being written to the data directory when you "pull the plug". But
> without being able to pull the power plugs on the NAS or db server you
> can't really test for reliability in case of power loss. So you can't
> know that it'll survive one. Just disconnecting its network connection
> etc means it can still write out cached data if it isn't properly
> syncing it.

Also note that a UPS doesn't preclude the machine losing its power
supplies, or the ever-popular faulty power switch / reset button,
both of which I have experienced on production machines. A UPS does not
mean never having a power failure.




Re: [GENERAL] Silent data loss in its pure form

2016-05-30 Thread Scott Marlowe
On Mon, May 30, 2016 at 10:57 AM, Alex Ignatov  wrote:
> Following this bug reports from redhat
> https://bugzilla.redhat.com/show_bug.cgi?id=845233
>
> it rising some dangerous issue:
>
> If for any reason your data file is zeroed after some power loss (it was the
> best-known issue on XFS in the past), then when you do
> select count(*) from your_table you get zero if your table fit in one
> 1GB (default) file, or some other number != the count(*) from your_table before
> the power loss.
> No errors, nothing suspicious in logs. No any checksum errors. Nothing.
>
> Silent data loss is its pure form.
>
> And thanks to all gods that you notice it before backup recycling which
> contains good data.
> Keep in mind it while checking you "backups" in any forms (pg_dump or the
> more dangerous and short-spoken PITR file backup)
>
> You data is always in danger with "zeroed data file is normal file"
> paradigm.

That bug shows as having been fixed in 2012. Are there any modern,
supported distros that would still have it? It sounds really bad btw.




Re: [GENERAL] OT hardware recommend

2016-06-17 Thread Scott Marlowe
On Fri, Jun 17, 2016 at 2:36 PM, Andy Colson  wrote:

> Hi all.
>
> I have access to quite a few laptop HD's (10 to 15 of them at least), and
> thought that might make a neat test box that might have some good IO speed.
>
> Needs to be cheap though, so linux with software raid, rack mount
> preferred but not required.
>
> Anyone have any experience with anything like that?  $2K might be
> possible, painful, but possible.
>
> Suggestions?


Sell them all and buy a couple of 800G SSDs? :)


Re: [GENERAL] Vacuum full: alternatives?

2016-06-20 Thread Scott Marlowe
On Mon, Jun 20, 2016 at 3:18 AM, Job  wrote:
>
> Hello,
>
> we have a table with heavy traffic of pg_bulkload and deletes of records.
> The size goes, in only one day, from for example 1GB to 4GB and then back to 1GB.
>
> We have serious problems with size, and the only way to regain free space is 
> issuing a vacuum full.
> But the operation is very slow, sometimes 2/4 hours, and the table is not 
> available for services as it is locked.
>
> We do not delete everything at once (in that case a truncate would resolve 
> the problem).
>
> The autovacuum is not able (same for a normal vacuum) to free the space.
>
> Are there some suggestions or another way to manage this?

First off, from your comments I'm not sure you really get PostgreSQL's
way of freeing space and reusing it via autovacuum. Basically, the
autovacuum process marks space as free, and the backends write new data
(inserts or updates) into that free space. You eventually reach an
equilibrium of sorts when vacuum is freeing up space as quickly as it's
being consumed, or faster. The problem occurs when vacuum can't keep up
with your delete / write and update rate combined. If this is happening
you need to:

A: Make sure your IO subsystem is fast enough to handle BOTH your
update rate AND the vacuuming needed to keep up. You're better off
with a machine that can do 15,000 transactions per second running a
load of 1,000 than trying to handle it with a machine that can do
1,500 tps etc. Sizing the hardware is a whole other conversation.

AND

B: Make your autovacuum aggressive enough to NOT fall behind.

It's important to remember that autovacuum was built and designed in a
time when most databases lived on spinning media. It's designed to not
overload spinning discs with too much random IO. A super fast RAID-10
array from that time period could do 200 to 1,000 transactions per
second and that only with a top notch RAID controller etc. Regular
spinning discs have a maximum random write ops per second that measure
in the 100 per second range.

My laptop, with no pgsql optimizations, can do 850 tps on its SSD. A
server with 10 SSDs in RAID-5 can do 15,000 tps. If you have a fast
IO subsystem and wish to utilize it with pgsql, you're going to have to
examine whether or not autovacuum with default settings is fast enough
to keep up. Remember, autovacuum, by design, is slow and tries not to
get in the way. It's fast enough for most mundane uses, but can't keep
up with a fast machine running hard. The default settings for
autovacuum to look at here are first these two:

autovacuum_vacuum_cost_delay = 20ms
autovacuum_vacuum_cost_limit = 200

They govern how hard autovac works. By default autovac doesn't work
hard. Making it work too hard for a given machine can cause system
performance issues. I.e. it "gets in the way".

Lowering cost_delay is usually enough. As you approach 1ms autovac
starts to use a lot more bandwidth. I find that even on pretty fast
machines that are routinely doing 1,000 writes per second or more, 3ms
is fast enough to keep up with a cost limit of 200. 5ms is a good
compromise without getting too aggressive.

In contrast to autovacuum, REGULAR vacuum, by default, runs at full
throttle. It hits your db server hard, performance-wise. It has  zero
cost delay, so it works very hard. If you run it midday on a hard
working server you will almost certainly see the performance drop. The
difference between regular vacuum with a delay time of 0 and autovac
with a delay of 20ms is huge.

These settings become important if you have a LOT of tables or dbs.
Otherwise they're probably fine.

autovacuum_max_workers =3 # Adjust this last, unless you have
thousands of tables or dbs.
autovacuum_naptime = 1 min # How long to wait before checking the next
db. Default is usually fine unless you have a lot of dbs.

These settings tell autovacuum when to kick in. Keeping these low
enough to keep autovac busy is a good idea too:

autovacuum_vacuum_scale_factor
autovacuum_vacuum_threshold
autovacuum_analyze_scale_factor
autovacuum_analyze_threshold

I tend to go for threshold, which is an absolute number of rows
changed before autovac kicks off. Scale factor can be dangerous
because what seems small at the beginning gets big fast. If it's 0.1
then that's 10%. 10% of 1,000 is 100 rows, but 10% of 1,000,000 is
100,000, which is a LOT of rows to ignore until you have more than
that needing vacuuming. Setting the threshold to something like 100 or 1,000
will keep your db from accumulating hundreds of thousands of dead tuples
in a big table.

Either way you need to make sure your autovacuum is aggressive enough
to keep up with your db's throughput.
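
To make that concrete, here's a sketch of a more aggressive setup; the
values are illustrative starting points, not recommendations:

# postgresql.conf (reload required)
autovacuum_vacuum_cost_delay = 5ms
autovacuum_vacuum_scale_factor = 0.01
autovacuum_vacuum_threshold = 1000

You can also set these per table for just the heavily churned one, e.g.:

ALTER TABLE busy_table SET (autovacuum_vacuum_scale_factor = 0.01,
                            autovacuum_vacuum_threshold = 1000);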

Checking for bloat. You can see what parts of your db are getting too
big.  First, go here:
https://www.keithf4.com/checking-for-postgresql-bloat/

The script there will let you check all your tables AND indexes for
bloat. This will let you know if you've got a simple space problem or
a vacuuming problem.

Assuming you DO have

Re: [GENERAL] Stored procedure version control

2016-06-29 Thread Scott Marlowe
On Wed, Jun 29, 2016 at 12:00 PM, Adrian Klaver
 wrote:
> On 06/29/2016 09:37 AM, Mark Morgan Lloyd wrote:
>>
>> Elsewhere, somebody was asking how people implemented version control
>> for stored procedures on (MS) SQL Server.
>>
>> The consensus was that this is probably best managed by using scripts or
>> command files to generate stored procedures etc., but does anybody have
>> any comment on that from the POV of PostgreSQL?
>
>
> There is no mechanism internal to Postgres that will version control the
> procedures, so the answer will be the same as above. In other words some
> external mechanism to version control. A more complete answer will depend on
> the workflow you are currently using.

I like this: https://github.com/depesz/Versioning. Very simple and easy to use.




Re: [GENERAL] Replication with non-read-only standby.

2016-06-30 Thread Scott Marlowe
On Thu, Jun 30, 2016 at 7:15 AM, Nick Babadzhanian  wrote:
> Setup:
> 2 PostgreSQL servers are geographically spread. The first one is used for an 
> application that gathers data. It is connected to the second database that is 
> used to process the said data. Connection is not very stable nor is it fast, 
> so using Bidirectional replication is not an option. It is OK if data is 
> shipped in batches rather than streamed.
>
> Question:
> Is there a way to make the standby server non-read-only, so that it can keep 
> getting updates (mostly inserts) from the 'master', but users are able to 
> edit the data stored on 'slave'? Is there some alternative solution to this?

I'd probably solve this with slony.




Re: [GENERAL] psql connection option: statement_timeout

2016-07-03 Thread Scott Marlowe
On Sun, Jul 3, 2016 at 5:15 PM, Melvin Davidson 
wrote:

>
>
> On Sun, Jul 3, 2016 at 6:54 PM, Craig Boyd  wrote:
>
>> Hello All,
>>
>> I am something of a newbie and I am trying to understand how to pass
>> connection options using the psql client.  My understanding is that it is
>> possible to do this as part of the psql connection event.
>> I am on Mint and my PostgreSQL Server version = 9.3.13.
>>
>> I am trying to connect to an instance on a different machine (also
>> 9.3.13).
>> The following works:
>> psql -U username -h 192.x.x.x 
>>
>> But when I try to set the statement like this it fails:
>> psql -U username -h 192.x.x.x statement_timeout=1000
>>
>> I get the following "invalid connection option"  I am less concerned with
>> actually setting this parameter than I am learning how to pass or set
>> connection options when I log in.  If it is a case where this particular
>> option cannot be set as part of the connection string that is fine.  But
>> that leads me to ask what options can I set as part of the connection
>> string?
>> I have looked here:
>> https://www.postgresql.org/docs/9.3/static/app-psql.html
>> and here:
>> https://www.postgresql.org/docs/9.3/static/runtime-config-client.html
>>
>> I suspect I am close, but I can't seem to figure out where I am going
>> awry.
>> Any thoughts?
>>
>> Thanks in advance.
>>
> I suspect what you want is connect_timeout.
> ie: psql -U username -h 192.x.x.x connect_timeout=1000
> see examples at https://www.postgresql.org/docs/9.3/static/app-psql.html
>
> However, if you truly want to set statement_timeout, that cannot be set at
> the command line. You must execute that AFTER you connect.
> ie: # SET statement_timeout = 1000;
>
> You can also place multiple commands inside a file and then execute that
> after you connect.
> eg: # \i your_filename.sql
>
>
Also you can set such things as statement_timeout in postgresql.conf,
or set them per database or per user / role.

alter user reporting set statement_timeout=60 is handy for users that
should never take a long time to connect.

Note that connect_timeout is a connection-time setting, for how long to
keep attempting to make a connection, not for how long to hold it while idle.
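
A quick sketch of the role-level approach (role name and value are just
examples):

ALTER ROLE reporting SET statement_timeout = '60s';

-- verify from a new session logged in as that role:
SHOW statement_timeout;

-- or just for the current session:
SET statement_timeout = '60s';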


Re: [GENERAL] psql connection option: statement_timeout

2016-07-03 Thread Scott Marlowe
correction:

alter user reporting set statement_timeout=60 is handy for users that
should never take a long time to connect.

should read

alter user reporting set statement_timeout=60 is handy for users that
should never take a long time to run a statement.




Re: [GENERAL] MediaWiki + PostgreSQL is not ready for production?

2016-07-20 Thread Scott Marlowe
On Mon, Jul 18, 2016 at 10:14 PM, Tatsuo Ishii  wrote:
> I found following comment for using PostgreSQL with MediaWiki:
>
> https://www.mediawiki.org/wiki/Compatibility#Database
>
> "Anything other than MySQL or MariaDB is not recommended for
> production use at this point."
>
> This is a sad and disappointed statement for us. Should we help
> MediaWiki community to enhance this?

A few years back I worked at a company that put MediaWiki into our
school content management system with PostgreSQL. We had zero issues
with PostgreSQL support; it mostly just worked. It took us about 4 weeks
to get it working, tested, and deployed.

The cool bit was that by creating a few triggers and views, we made
MediaWiki think it was just sitting on top of the default database
when in fact it was sitting on top of our custom db. Each teacher /
classroom had its own wiki, and we had literally tens of thousands of
independent wikis running, and they were plenty fast.

-- 
To understand recursion, one must first understand recursion.




Re: [GENERAL] Uber migrated from Postgres to MySQL

2016-07-28 Thread Scott Marlowe
On Wed, Jul 27, 2016 at 9:51 AM, Geoff Winkless  wrote:
> On 27 July 2016 at 15:22, Scott Mead  wrote:
>>
>>  "The bug we ran into only affected certain releases of Postgres 9.2 and
>> has been fixed for a long time now. However, we still find it worrisome that
>> this class of bug can happen at all. A new version of Postgres could be
>> released at any time that has a bug of this nature, and because of the way
>> replication works, this issue has the potential to spread into all of the
>> databases in a replication hierarchy."
>>
>>
>> ISTM that they needed a tire swing and were using a dump truck.  Hopefully
>> they vectored somewhere in the middle and got themselves a nice sandbox.
>
>
> At least his bug got fixed. The last 2 bugs I reported to MySQL resulted in
> an initial refusal to accept any problem existed, followed by (once that
> particular strategy had run out of steam) the developer simply ignoring the
> bug until it was closed automatically by their bug system. As far as I'm
> aware those bugs still exist in the most recent version.

The best / worst MySQL bug was one introduced and fixed twice. Someone put
in a shortcut that sped up ORDER BY by quite a bit. It also meant
that ORDER BY ... DESC would actually get ORDER BY ... ASC output. It was
inserted into the code due to poor oversight / code review practices,
then fixed about 9 months later, then introduced again, and again it
took about a year to fix.

The fact that it was introduced into a General Release mid-stream with
no testing or real reviews speaks volumes about MySQL and its
developers. The fact that it took months to years to fix each time
does as well.

As someone who has gotten more than one bug fix from pgsql in less
than 48 hours, I feel sorry for anyone who finds a bug in a MySQL
version they are running in production.




Re: [GENERAL] Uber migrated from Postgres to MySQL

2016-07-28 Thread Scott Marlowe
On Thu, Jul 28, 2016 at 10:32 AM, Alex Ignatov  wrote:
>
>  Oh, so in contrast to "Oracle world"  "Postgres world" DBA in their right
> to do major upgrade without complete and tested backup?
> Ok,  I understand you. In Postgres world there always sky is blue and sun is
> shining.

Of course we have backups. But we also have slony. So we CAN go back
and forth between latest and previous without a restore.




Re: [GENERAL] Uber migrated from Postgres to MySQL

2016-07-28 Thread Scott Marlowe
On Thu, Jul 28, 2016 at 11:23 AM, Alex Ignatov  wrote:
>
> On 28.07.2016 19:43, Scott Marlowe wrote:
>>
>> On Thu, Jul 28, 2016 at 10:32 AM, Alex Ignatov 
>> wrote:
>>>
>>>   Oh, so in contrast to "Oracle world"  "Postgres world" DBA in their
>>> right
>>> to do major upgrade without complete and tested backup?
>>> Ok,  I understand you. In Postgres world there always sky is blue and sun
>>> is
>>> shining.
>>
>> Of course we have backups. But we also have slony. So we CAN go back
>> and forth between latest and previous without a restore.
>>
>>
>
> And? Oracle and MySql doesnt have it but can downgrade right out the box.
> Quick and easy.

So you can swap between Oracle 11 and 12 back and forth in a live
environment with no downtime? Please pull the other leg.




Re: [GENERAL] ON CONFLICT does not support deferrable unique constraints

2016-08-24 Thread Scott Marlowe
On Wed, Aug 24, 2016 at 6:05 AM, Andreas Joseph Krogh
 wrote:
>
> Hope some -hackers read this...
>
> Are there any plans to lift this restriction?

I'm trying to figure out a method for making this work in my head.
These two things seem kind of opposed to each other.




Re: [GENERAL] What limits Postgres performance when the whole database lives in cache?

2016-09-02 Thread Scott Marlowe
On Fri, Sep 2, 2016 at 4:49 AM, dandl  wrote:
> Re this talk given by Michael Stonebraker:
>
> http://slideshot.epfl.ch/play/suri_stonebraker
>
>
>
> He makes the claim that in a modern ‘big iron’ RDBMS such as Oracle, DB2, MS
> SQL Server, Postgres, given enough memory that the entire database lives in
> cache, the server will spend 96% of its memory cycles on unproductive
> overhead. This includes buffer management, locking, latching (thread/CPU
> conflicts) and recovery (including log file reads and writes).
>
>
>
> [Enough memory in this case assumes that for just about any business, 1TB is
> enough. The intent of his argument is that a server designed correctly for
> it would run 25x faster.]
>
>
>
> I wondered if there are any figures or measurements on Postgres performance
> in this ‘enough memory’ environment to support or contest this point of
> view?

What limits postgresql when everything fits in memory? The fact that
it's designed to survive a power outage and not lose all your data.

Stonebraker's new stuff is cool, but it is NOT designed to survive
total power failure.

Two totally different design concepts. It's apples and oranges to compare them.




Re: [GENERAL] 2.5TB Migration from SATA to SSD disks - PostgreSQL 9.2

2016-09-02 Thread Scott Marlowe
On Thu, Sep 1, 2016 at 8:48 PM, Patrick B  wrote:
> Hi guys,
>
> I'll be performing a migration on my production master database server,
> which is running PostgreSQL 9.2 atm, from SATA disks to SSD disks.
> I've got some questions about it, and it would be nice if u guys could share
> your experiences/thoughts:
>
> SCENARIO:
>
> I currently have one MASTER and two Streaming Replication Slaves servers...
>
>> master01
>> slave01 (Streaming replication + wal_files)
>> slave02 (Streaming replication + wal_files)
>
>
> ...Postgres is mounted on: /var/lib/pgsql/... The SSD disks will be
> installed only on my Master server, because my main problem is Writes and
> not reads.
>
> The new SSD volume will be mounted on /var/lib/pgsql2/
>
>
> The slave02 server will lose the streaming replication connection to the
> master; once slave01 becomes the new master, will a new timeline be established?
> Will slave02 be able to connect to the slave01 server for streaming
> replication?
>
>
>
>
> MIGRATION OPTIONS:
>
> Migration Option 1: I know this option will work
>
> Mount the new volume /var/lib/pgsql2/ on the master01 server
> Turn slave01 into a master server
> once I can confirm everything is working fine, I can go to step 4
> Stop postgres on the master01, start copying the DB using pg_basebackup from
> slave02 to master01 (Will have to edit postgres to use /var/lib/pgsql2/
> instead /var/lib/pgsql - Is that possible? Or I'd have to create a symbolic
> link?)
> Start postgres on master01 server and check if all goes well as streaming
> replication server (Will test it for days)
> Turn master01 into a master server and I'll have to re-copy the DB into
> slave01 to make it a streaming replication server again
>
>
> Migration Option 2: I don't know if this is possible - IS THIS POSSIBLE
>
> Mount the new volume /var/lib/pgsql2/ on the master01 server
> Stop postgres on the server (I won't stop postgres on the slave so the users
> will be able to use the server as read-only)
> Copy the data from /var/lib/pgsql/ to /var/lib/pgsql2/
> Configure postgres to start using the new volume(/var/lib/pgsql2/)
>
>
> What do you guys think? Is option possible? if so it would be much easier :)
> Thanks!

Why not just subscribe a new cluster on the master (on the new volume),
then sub the slaves to that, then switch over to the new cluster on the master?

-- 
To understand recursion, one must first understand recursion.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL Database performance

2016-09-06 Thread Scott Marlowe
On Fri, Sep 2, 2016 at 9:38 PM, Pradeep  wrote:
> Dear Team,
>
>
>
> Could you please help me, after changing the below parameters in PostgreSQL
> configuration file it was not reflecting in OS level and also Database
> performance is degrading.
>
>
>
> Example: I am using Windows 2008 R2 server .For PostgreSQL I have allocated
> 24GB RAM out of 32GB.

Actually effective_cache_size allocates nothing. It tells the pgsql
server roughly how much memory the machine it is running on has
available for OS level caching. On a 32G machine with 1G or so of
shared_buffers that number is about right.

> However after changing the below parameters, In task bar it is showing 2.7GB
> Utilization even though my utilization is more.

2.7G is ok. Postgresql expects the OS to help out with caching so it
doesn't need to grab all the memory in the machine etc. In fact that
would be counterproductive in most situations.

> So kindly suggest us, whether it will impact or not in Open source
> PostgreSQL database
>
> max_connections = 100
> shared_buffers = 512MB
> effective_cache_size = 24GB
> work_mem = 110100kB

This is WAY too high for work_mem. Work_mem is how much memory a
single sort can grab at once. Each query may run > 1 sort, and you
could have 100 queries running at once.

This setting is 110GB. That's about 109.9GB too high for safety. When
things go wrong with this too big, they go very wrong, sending the
machine into a swap storm from which it may not return.

> maintenance_work_mem = 2GB
>
> checkpoint_segments = 64
>
> checkpoint_completion_target = 0.9

Too high of a checkpoint completion target may cause buffers to get
written out more often than needed, but it varies based on load etc.

> wal_buffers = 16MB
>
> default_statistics_target = 100

It's far more likely that you've just got poorly written queries. I'd
make a post with explain analyze output etc. Here's a good resource
for reporting slow queries:

https://wiki.postgresql.org/wiki/Slow_Query_Questions

-- 
To understand recursion, one must first understand recursion.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL Database performance

2016-09-06 Thread Scott Marlowe
On Tue, Sep 6, 2016 at 1:18 PM, Steve Atkins  wrote:
>
>> On Sep 6, 2016, at 12:08 PM, Scott Marlowe  wrote:
>>
>> On Fri, Sep 2, 2016 at 9:38 PM, Pradeep  wrote:
>>>
>>> max_connections = 100
>>> shared_buffers = 512MB
>>> effective_cache_size = 24GB
>>> work_mem = 110100kB
>>
>> This is WAY too high for work_mem. Work_mem is how much memory a
>> single sort can grab at once. Each query may run > 1 sort, and you
>> could have 100 queries running at once.
>>
>> This setting is 110GB. That's about 109.9GB too high for safety. When
>> things go wrong with this too big, they go very wrong, sending the
>> machine into a swap storm from which it may not return.
>
> It's an oddly spelled 110MB, which doesn't seem unreasonable.

Oh yeah, still kinda biggish, but not as big as I had thought.
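
For a rough back of the envelope check using the settings quoted above:
110100kB is about 107MB per sort, so with max_connections = 100 and,
say, two sorts per query, the worst case is on the order of

  100 connections * 2 sorts * ~107MB  =  ~21GB

which is why per-connection memory settings deserve some caution even
when they're "only" in the hundred-megabyte range.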


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] 2.5TB Migration from SATA to SSD disks - PostgreSQL 9.2

2016-09-08 Thread Scott Marlowe
On Wed, Sep 7, 2016 at 5:00 PM, Jim Nasby  wrote:
> On 9/2/16 11:44 AM, David Gibbons wrote:
>>
>> rsync -va /var/lib/pgsql/ /var/lib/pgsql2/
>> service postgres stop
>> rsync -va /var/lib/pgsql/ /var/lib/pgsql2/
>>
>> The second rsync will only copy the deltas from the first, it still has
>> to go in and determine what needs to be copied/what changed but the bulk
>> of it can be prepared/migrated before the actual downtime window.
>
>
> That is NOT safe. The problem is it allows rsync to use mtime alone to
> decide that a file is in sync, and that will fail if Postgres writes to a
> file in the same second that the first rsync reads from it (assuming
> Postgres writes after rsync reads). You need to add the --checksum flag to
> rsync (which means it will still have to read everything that's in
> /var/lib/pgsql).
> --

I'm still wondering why my advice to just subscribe a new cluster on
the master machine was ignored by the OP. PostgreSQL already has a
pretty reliable method for doing what the OP wants using
pg_basebackup. Using rsync etc. is like reinventing the wheel imho.

-- 
To understand recursion, one must first understand recursion.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Postgres UPGRADE from 9.2 to 9.4

2016-09-08 Thread Scott Marlowe
On Tue, Sep 6, 2016 at 5:25 PM, John R Pierce  wrote:
> On 9/6/2016 4:20 PM, Melvin Davidson wrote:
>>
>> If you use slony to replicate, you CAN have 9.2 on the master and 9.4 on
>> the slave.
>
>
> does rackspace support slony?  how about amazon dms ?
>
> slony requires configuring replication on each table.  if the database has a
> large complex schema this could take considerable setup effort.
>

Not really. As of slony 2.2 you can use a regular expression to add
tables or sequences.

i.e.:

SET ADD TABLE (
SET ID=1,
TABLES='public\\.*'
);

repeat for sequences.  Two commands. I don't consider that considerable effort.
-- 
To understand recursion, one must first understand recursion.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] 2.5TB Migration from SATA to SSD disks - PostgreSQL 9.2

2016-09-12 Thread Scott Marlowe
On Sun, Sep 11, 2016 at 3:26 AM, Patrick B  wrote:
>
>
> 2016-09-11 14:09 GMT+12:00 Jim Nasby :
>>
>> On 9/8/16 3:29 PM, David Gibbons wrote:
>>>
>>>
>>> Isn't this heading in the wrong direction?   We need to be more
>>> precise than 0 (since 0 is computed off of rounded/truncated time
>>> stamps), not less precise than 0.
>>>
>>> Cheers,
>>>
>>> Jeff
>>>
>>>
>>>
>>> Hmm, You may be right, reading it 4 more times for comprehension it
>>> looks like it should be set to -1 not 1.
>>
>>
>> Not according to my man page:
>>
>>--modify-window
>>   When comparing two timestamps, rsync treats the timestamps
>> as being equal if they differ by no more than the modify-window value.  This
>> is normally 0 (for an exact match), but you
>>   may find it useful to set this to a larger value in some
>> situations.  In particular, when transferring to or from an MS Windows FAT
>> filesystem  (which  represents  times  with  a
>>   2-second resolution), --modify-window=1 is useful (allowing
>> times to differ by up to 1 second).
>>
>>
>> --
>> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
>> Experts in Analytics, Data Architecture and PostgreSQL
>> Data in Trouble? Get it in Treble! http://BlueTreble.com
>> 855-TREBLE2 (855-873-2532)   mobile: 512-569-9461
>>
>>
>> --
>> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-general
>
>
>
>
>
> So... what do u guys recommend? which options should I use?
>
> Patrick

Why not subscribe a new cluster on the same box with pg_basebackup?


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Multiple multithreaded insert

2016-10-14 Thread Scott Marlowe
On Fri, Oct 14, 2016 at 7:12 AM, Арсен Арутюнян  wrote:
> Hi, everyone!
>
> I have a table:
>
> create table testpr(id serial,priority integer,unique(priority) DEFERRABLE,
> primary key(id));
>

This:

> and a trigger which, when added to this table, automatically sets priority
> as the maximum value +1

Leads to THIS:

> The result (priority):
>
> Thread 1: (1) (2) (3) (4) (5) (6) (7)
>
> Thread 2: (8) (9) (10) (11) (12) (13) (14)

If you have to have monotonically increasing priorities with no gaps,
that's the price you pay, unless you can pre-allocate them or
something.

Basically max(id)+1 is a db anti-pattern.
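
If you can live with the occasional gap, a minimal sketch of the
alternative is to let a sequence hand out priorities instead of a
max()+1 trigger, so concurrent inserts no longer serialize on each
other:

create sequence testpr_priority_seq;

create table testpr (
  id       serial primary key,
  priority integer default nextval('testpr_priority_seq') unique deferrable
);

Gaps show up when a transaction rolls back, but there's no need to lock
the whole table just to compute the next priority.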


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Strange? BETWEEN behaviour.

2016-10-20 Thread Scott Marlowe
On Thu, Oct 20, 2016 at 6:27 AM, Bjørn T Johansen  wrote:
> On Thu, 20 Oct 2016 14:04:51 +0200
> vinny  wrote:
>
>> On 2016-10-20 13:51, Bjørn T Johansen wrote:
>> > I have the following SQL:
>> >
>> > SELECT * from table WHERE date BETWEEN to_timestamp('20.10.2016
>> > 00:00:00','DD.MM. HH24:MI:SS') AND to_timestamp('20.10.2016
>> > 23:59:59','DD.MM.
>> > HH24:MI:SS')
>> >
>> > date is of type timestamp.
>> >
>> > I was expecting to get all the records that had datepart = 20.10.2016
>> > but I am not getting that..
>> >
>> > What am I missing?
>> >
>> >
>> > Regards,
>> >
>> > BTJ
>> >
>>
>> What are you getting?
>>
>>
>
> The sql returns 5 of the expected 72 rows...

Could you be running into timezone issues?


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to hint 2 coulms IS NOT DISTINCT FROM each other

2016-10-28 Thread Scott Marlowe
On Fri, Oct 28, 2016 at 10:29 AM, Kim Rose Carlsen  wrote:
> Hi
>
> I was wondering if there is a way to hint that two columns in two different
> tables IS NOT DISTINCT FROM each other. So that the optimizer may assume if
> table_a.key = 'test' THEN table_b.key = 'test' .
>
> The equals operator already does this but it does not handle NULLS very well
> (or not at all). And it seems IS NOT DISTINCT FROM is not indexable and
> doesn't establish the same inference rules as equals.

The whole idea behind Postgres' query planner is that you don't have
to use any hints. Late model versions of postgres handle nulls fine,
but nulls are never "equal" to anything else. I.e. "where xxx is null"
works with indexes, while "where x = y" will never match a row where
either side is null, since null <> null.

Suggestion for getting help: put a large-ish, aka production sized,
amount of data into your db, run your queries with explain analyze,
feed them to https://explain.depesz.com/ and post the links here along
with the slow queries. A lot of times the fix is non-obvious if you're
coming from another db with a different set of troubleshooting skills
for slow queries.
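
Something along these lines, with 42 just standing in for a real id
(BUFFERS needs 9.0 or later):

explain (analyze, buffers)
select * from view_circuit_with_status where circuit_id = 42;

explain.depesz.com will then highlight which nodes eat the time.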


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to hint 2 coulms IS NOT DISTINCT FROM each other

2016-10-29 Thread Scott Marlowe
On Sat, Oct 29, 2016 at 6:55 AM, Kim Rose Carlsen  wrote:
>> try this :-D
>
>> create or replace function indf(anyelement, anyelement) returns boolean
>> as
>> $$
>>   select $1 = $2 or ($1 is null and $2 is null);
>> $$ language sql;
>>
>> CREATE VIEW view_circuit_with_status AS (
>>SELECT r.*,
>>   s.circuit_status,
>>   s.customer_id AS s_customer_id,
>>   p.line_speed,
>>   p.customer_id AS p_customer_id
>>  FROM view_circuit r
>>  JOIN view_circuit_product_main s
>>ON r.circuit_id = s.circuit_id
>>   AND indf(r.customer_id, s.customer_id)
>>  JOIN view_circuit_product p
>>ON r.circuit_id = p.circuit_id
>>   AND indf(r.customer_id, s.customer_id)
>>
>> merlin
>
> This doesn't do much good. This doesn't tell the planner that the 3
> customer_ids are actually of same value, and it therefore can't filter them
> as it sees fit.

You do know you can index on a function or expression, and the planner
then keeps stats on it when you run analyze, right?
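
A minimal sketch with made-up table / column names -- note the extra
set of parens around the expression, and that only immutable
expressions are indexable:

create index some_table_cust_idx on some_table ((coalesce(customer_id, -1)));
analyze some_table;

After the analyze the planner has stats on the coalesce() expression
itself, so estimates for predicates written against it get much better.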


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Turning slave into a master - PostgreSQL 9.2

2016-11-01 Thread Scott Marlowe
On Mon, Oct 31, 2016 at 8:01 PM, Patrick B  wrote:
> If I change recovery.conf:
>
> recovery_target_time = '2016-10-30 02:24:40'
>
>
> I get error:
>
> FATAL:  requested recovery stop point is before consistent recovery point

You can try using pg_basebackup to get the replica set up. In 9.2 you
gotta make your own recovery.conf, but you already know how to do
that. Way easier than trying to rsync by hand etc.
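
Something along these lines -- host, user and target directory are just
examples, and -X stream works on 9.2:

pg_basebackup -h master01 -p 5432 -U replicator \
  -D /var/lib/pgsql/9.2/data -X stream -P

Then drop your hand-written recovery.conf into the new data directory
before starting the server.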


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Statistics on index usage

2016-11-01 Thread Scott Marlowe
On Tue, Nov 1, 2016 at 8:43 AM, François Battail
 wrote:
> Dear List,
>
> does pgsql maintains statistics on index usage? I mean just a counter
> for each index in the database, incremented each time time it is used.
> It would be useful to help cleaning almost unused index and to avoid
> poisoning the global cache.
>
> I've found nothing so far but may be I've been missing something.

Yup it does keep track of index usage. To see all the various stuff
postgres keeps track of etc, try typing

select * from pg_ then hit tab. The two common ones I look at are:

pg_statio_all_indexes
pg_stat_user_indexes
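
For example, to spot indexes that have hardly been used since the stats
were last reset:

select schemaname, relname, indexrelname, idx_scan
from pg_stat_user_indexes
order by idx_scan;

idx_scan is the counter you're after; pg_statio_all_indexes adds
block-level read / hit numbers on top of that.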


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Turning slave into a master - PostgreSQL 9.2

2016-11-01 Thread Scott Marlowe
On Tue, Nov 1, 2016 at 1:43 PM, Patrick B  wrote:
>
>
> 2016-11-02 2:55 GMT+13:00 Scott Marlowe :
>>
>> On Mon, Oct 31, 2016 at 8:01 PM, Patrick B 
>> wrote:
>> > If I change recovery.conf:
>> >
>> > recovery_target_time = '2016-10-30 02:24:40'
>> >
>> >
>> > I get error:
>> >
>> > FATAL:  requested recovery stop point is before consistent recovery
>> > point
>>
>> You can try using pg_basebackup to get the replica setup. In 9.2 you
>> gotta make your own recovery.conf, but you already know how to do
>> that. Way easier than trying to rsync by hand etc.
>
>
>
> I did a pg_basebackup!

Huh, after a pg_basebackup all you should need is a recovery.conf in
place and a trigger file.
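
For 9.2 the recovery.conf only needs a few lines; host name and paths
here are just examples:

standby_mode = 'on'
primary_conninfo = 'host=master01 port=5432 user=replicator'
trigger_file = '/var/lib/pgsql/9.2/data/failover.trigger'

Touch the trigger file (or use pg_ctl promote) when you want the
standby to take over as master.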


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Hardware recommendations?

2016-11-02 Thread Scott Marlowe
On Wed, Nov 2, 2016 at 11:40 AM, Joshua D. Drake  wrote:
> On 11/02/2016 10:03 AM, Steve Atkins wrote:
>>
>> I'm looking for generic advice on hardware to use for "mid-sized"
>> postgresql servers, $5k or a bit more.
>>
>> There are several good documents from the 9.0 era, but hardware has moved
>> on since then, particularly with changes in SSD pricing.
>>
>> Has anyone seen a more recent discussion of what someone might want for
>> PostreSQL in 2017?
>
>
> The rules haven't changed much, more cores (even if a bit slower) is better
> than less, as much ram as the budget will allow and:
>
> SSD
>
> But make sure you get datacenter/enterprise SSDs. Consider that even a slow
> datacenter/enterprise SSD can do 500MB/s random write and read just as fast
> if not faster. That means for most installations, a RAID1 is more than
> enough.

Just to add that many setups utilizing SSDs are as fast or faster
using kernel level RAID as they are with a hardware RAID controller,
esp if the RAID controller has caching enabled. We went from 3k-5k tps
to 15k-18k tps by turning off caching on modern LSI MegaRAID
controllers running RAID5.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Hardware recommendations?

2016-11-02 Thread Scott Marlowe
On Wed, Nov 2, 2016 at 4:19 PM, John R Pierce  wrote:
> On 11/2/2016 3:01 PM, Steve Crawford wrote:
>>
>> After much cogitation I eventually went RAID-less. Why? The only option
>> for hardware RAID was SAS SSDs and given that they are not built on
>> electro-mechanical spinning-rust technology it seemed like the RAID card was
>> just another point of solid-state failure. I combined that with the fact
>> that the RAID card limited me to the relatively slow SAS data-transfer rates
>> that are blown away by what you get with something like an Intel NVME SSD
>> plugged into the PCI bus. Raiding those could be done in software plus $$$
>> for the NVME SSDs but I already have data-redundancy through a combination
>> of regular backups and streaming replication to identically equipped
>> machines which rarely lag the master by more than a second.
>
>
> just track the write wear life remaining on those NVMe cards, and maintain a
> realistic estimate of lifetime remaining in months, so you can budget for
> replacements.   the complication with PCI NVMe is how to manage a
> replacement when the card is nearing EOL.   The best solution is probably
> failing over to a replication slave database, then replacing the worn out
> card on the original server, and bringing it up from scratch as a new slave,
> this can be done with minimal service interruptions.   Note your slaves will
> be getting nearly as many writes as the masters so likely will need
> replacing in the same time frame.

Yeah the last thing you want is to start having all your ssds fail at
once due to write cycle end of life etc. Where I used to work we had
pretty hard working machines with something like 500 to 1000 writes/s
and after a year were at ~90% writes left. ymmv depending on the ssd
etc.

A common trick is to overprovision if possible. Need 100G of storage
for a fast transactional db? Use 10% of a bunch of 800GB drives to
make an array and you now have a BUNCH of spare write cycles per
device for extra long life.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Postgresql errors on Windows with F-Secure AntiVirus

2016-11-09 Thread Scott Marlowe
On Wed, Nov 9, 2016 at 11:29 AM, Moreno Andreo  wrote:
> Hi again,
> our support team is reporting cases where postgres connections are
> suddenly dropped, but reconnecting again soon everyting is OK.
> Environment is PostgreSQL 9.1 on Windows (various versions)
> Asfer some research, I found on postgresql log many of these messages
> could not reserve shared memory region (addr=02E8) for child 094C:
> 487
> Searching the archives I found a bug fixed with 9.3.x, since I had 9.1, I
> migrated to 9.5.6 (standard for new installations). After restarting, I got
> the exact same error and response behaviour.
> After another search, I stopped F-Secure Antivirus services, and everything
> is now going smoothly. So, with the help of a mate that knows how this
> antivirus work, we put exceptions to all PostgreSQL directories.
> After restarting, the error was back. Nothing in the AV log. Tomorrow we'll
> try with adding exclusions to port 5433
>
> Someone else facing this problem?
>
> Thanks
> Moreno.-


Shared memory doesn't come from a file on the hard drive.  You need an
exception for whatever postgres uses for shared memory on Windows. I
don't know much about pgsql on Windows, but that should get you
started.


Post back if you find anything that makes it work.

-- 
To understand recursion, one must first understand recursion.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Postgresql errors on Windows with F-Secure AntiVirus

2016-11-09 Thread Scott Marlowe
On Wed, Nov 9, 2016 at 11:56 AM, Scott Marlowe  wrote:
> On Wed, Nov 9, 2016 at 11:29 AM, Moreno Andreo  
> wrote:
>> Hi again,
>> our support team is reporting cases where postgres connections are
>> suddenly dropped, but reconnecting again soon everyting is OK.
>> Environment is PostgreSQL 9.1 on Windows (various versions)
>> Asfer some research, I found on postgresql log many of these messages
>> could not reserve shared memory region (addr=02E8) for child 094C:
>> 487
>> Searching the archives I found a bug fixed with 9.3.x, since I had 9.1, I
>> migrated to 9.5.6 (standard for new installations). After restarting, I got
>> the exact same error and response behaviour.
>> After another search, I stopped F-Secure Antivirus services, and everything
>> is now going smoothly. So, with the help of a mate that knows how this
>> antivirus work, we put exceptions to all PostgreSQL directories.
>> After restarting, the error was back. Nothing in the AV log. Tomorrow we'll
>> try with adding exclusions to port 5433
>>
>> Someone else facing this problem?
>>
>> Thanks
>> Moreno.-
>
>
> Shared memory doesn't come from a file on teh hard drive.  You need an
> exception for whatever postgres uses for shared memory in windows. I
> don't know much about pgsql on windows, but that should get you
> started.
>
>
> Post back if you find anything that makes it work.
>

Oh, and if you can, report it to the antivirus vendor so they can fix it
permanently on their end. pgsql is not a virus.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] pg_dumpall: could not connect to database "template1": FATAL:

2016-11-12 Thread Scott Marlowe
On Sat, Nov 12, 2016 at 2:31 PM, Adrian Klaver
 wrote:
> On 11/12/2016 01:20 PM, aws backup wrote:
>>
>> Hi,
>>
>> I try to make pg_dumpall backups from a PostgreSQL 9.5 server which is
>> part of the DaVinci Resolve 12.5.3 App on a Mac OS X 10.11.6 system.
>>
>> Unfortunately I get following failure message:
>>
>> pg_dumpall: could not connect to database "template1": FATAL: password
>> authentication failed for user "postgres"
>>
>> Maybe you can help me to solve this problem.
>
>
> Two choices:
>
> 1) Determine what the password is for the postgres user and provide it when
> you connect.
>
> 2) If you have access to the pg_hba.conf file create a access line that uses
> trust as the auth method for user postgres connect that way.
>
> https://www.postgresql.org/docs/9.5/static/auth-methods.html#AUTH-TRUST

OR you could use the -l switch and specify another db.

pg_dumpall -l mydbnamehere


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Disabling inheritance with query.

2016-12-21 Thread Scott Marlowe
On Wed, Dec 21, 2016 at 3:36 PM, Edmundo Robles  wrote:
> Hi!
>
> i need  disable  inheritance  from many tables in a query like
>
> "delete from pg_inherits where inhparent=20473"  instead alter table ...
>
> but  is safe?   which is the risk for  database if  i  delete it?

You could change the source query to use the only keyword:

delete from only parenttable where ...

OR you could write a rule or trigger that rewrote the query to have
the only keyword in it under certain circumstances.
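
And if the goal really is to permanently detach the children, the
supported route is ALTER TABLE rather than editing pg_inherits by hand
(table names here are made up):

alter table child_table no inherit parent_table;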


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Unique index problem

2015-12-20 Thread Scott Marlowe
On Sun, Dec 20, 2015 at 8:50 AM, Sterpu Victor  wrote:
> Hello
>
> I created a unique index that doesn't seem to work when one column is NULL.
> Index is created like this: CREATE UNIQUE INDEX lab_tests_groups_siui_uni ON
> lab_tests_groups_siui(id_lab_tests_siui, id_lab_tests_groups, valid_from,
> id_lab_sample_types);
> Now I can run this insert twice and I will have 2 records in the database
> that seem to violate this index:
> INSERT INTO lab_tests_groups_siui(id_lab_tests_siui, id_lab_tests_groups,
> valid_from) VALUES(463, 9183, '2014-06-01');
>
> When I create the index like this "CREATE UNIQUE INDEX
> lab_tests_groups_siui_uni ON lab_tests_groups_siui(id_lab_tests_siui,
> id_lab_tests_groups, valid_from);" index works fine.
>
> I tested this on postgres 9.1.4 and 9.1.9.

This is normal operation, as one NULL is unique from other NULLS, as
far as the db is concerned. If you want it to work some other way, you
need to use a value other than null, or make an index that's something
like un


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Unique index problem

2015-12-20 Thread Scott Marlowe
On Sun, Dec 20, 2015 at 9:00 AM, Scott Marlowe  wrote:
> On Sun, Dec 20, 2015 at 8:50 AM, Sterpu Victor  wrote:
>> Hello
>>
>> I created a unique index that doesn't seem to work when one column is NULL.
>> Index is created like this: CREATE UNIQUE INDEX lab_tests_groups_siui_uni ON
>> lab_tests_groups_siui(id_lab_tests_siui, id_lab_tests_groups, valid_from,
>> id_lab_sample_types);
>> Now I can run this insert twice and I will have 2 records in the database
>> that seem to violate this index:
>> INSERT INTO lab_tests_groups_siui(id_lab_tests_siui, id_lab_tests_groups,
>> valid_from) VALUES(463, 9183, '2014-06-01');
>>
>> When I create the index like this "CREATE UNIQUE INDEX
>> lab_tests_groups_siui_uni ON lab_tests_groups_siui(id_lab_tests_siui,
>> id_lab_tests_groups, valid_from);" index works fine.
>>
>> I tested this on postgres 9.1.4 and 9.1.9.
>
> This is normal operation, as one NULL is unique from other NULLS, as
> far as the db is concerned. If you want it to work some other way, you
> need to use a value other than null, or make an index that's something
> like un

dangit, stupid gmail sent early.  anyway.

you'd have to make an index like unique index on (x,y,z) where field
is [not] null or something like that.

Basically NULL <> NULL <> a particular value.
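
In your case that would look something like this, one partial index per
null-ness of id_lab_sample_types:

create unique index lab_tests_groups_siui_uni_notnull
  on lab_tests_groups_siui (id_lab_tests_siui, id_lab_tests_groups,
                            valid_from, id_lab_sample_types)
  where id_lab_sample_types is not null;

create unique index lab_tests_groups_siui_uni_null
  on lab_tests_groups_siui (id_lab_tests_siui, id_lab_tests_groups,
                            valid_from)
  where id_lab_sample_types is null;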


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Unique index problem

2015-12-21 Thread Scott Marlowe
On Sun, Dec 20, 2015 at 11:39 PM, Sterpu Victor  wrote:
> Thank you.
>
> I used the syntax with 2 indexes, it works for me.
> But why does NULL != NULL?

Because NULL literally means "an unknown, possibly unknowable value."

You need to stop thinking of NULL as A value. It is not.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Code of Conduct: Is it time?

2016-01-05 Thread Scott Marlowe
On Tue, Jan 5, 2016 at 2:09 PM, Joshua D. Drake  wrote:
> On 01/05/2016 11:08 AM, Roland van Laar wrote:
>>
>>
>>
>> On January 5, 2016 5:47:16 PM GMT+01:00, "Joshua D. Drake"
>>  wrote:
>>>
>>> Hello,
>>>
>>> I had a hard time writing this email. I think Code of Conducts are
>>> non-essential, a waste of respectful people's time and frankly if you
>>> are going to be a jerk, our community will call you out on it.
>>> Unfortunately a lot of people don't agree with that. I have over the
>>> course of the last year seen more and more potential users very
>>> explicitly say, "I will not contribute to a project or attend a
>>> conference that does not have a CoC".
>>
>>
>> Do they give a rational for that?
>
>
> I don't think I am a good person to rationalize their reasoning because I
> don't like the idea of a CoC. That said, I think a lot of boils down to
> perception, responsibility, accountability and the fact that a lot of people
> are flat out jerks. I am not talking the ball busting type of jerk but
> honest, just not nice people or people who vastly lack the ability to
> integrate with larger society. Those people tend to need guidelines for
> their jerkiness because they will say, "I didn't know I couldn't do/say
> XYZ". Whether that is true or not, I have no idea.

CoC:
1: Use our code how you want
2: Don't sue us
3: Don't be a jerk

done.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Postgresql 9.3 not coming up after restart in centos

2016-01-06 Thread Scott Marlowe
On Wed, Jan 6, 2016 at 12:36 PM, balajishanmu...@live.in
 wrote:
> By application I mean centOS.
>
> We are starting and stopping Postgres using systemd service. We have a
> service file called postgresql9.3.service which is used to start or stop
> Postgres.
>
> Excerpts of postgresql9.3.service
>
> ExecStartPre=/usr/pgsql-9.3/bin/postgresql93-check-db-dir ${PGDATA}
> ExecStart=/usr/pgsql-9.3/bin/pg_ctl start -D ${PGDATA} -s -w -t 300
> ExecStop=/usr/pgsql-9.3/bin/pg_ctl stop -D ${PGDATA} -s -m fast
> ExecReload=/usr/pgsql-9.3/bin/pg_ctl reload -D ${PGDATA} -s
>
>

So how are you restarting centos? Orderly shutdown, pulling the power plugs etc?

I'm wondering if you've got untrustworthy data storage underneath it
(i.e. storage that lies about fsync) and maybe centos or the method of
shutdown isn't allowing the drives to flush the data that they've
already said they flushed but actually haven't.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Postgresql 9.3 not coming up after restart in centos

2016-01-07 Thread Scott Marlowe
On Wed, Jan 6, 2016 at 2:08 PM, balajishanmu...@live.in
 wrote:
> Most of the time I will be restarting centOS by issuing reboot command. Which
> will do the orderly shutdown of all the service and sometimes just pull the
> plug.
>
> But the issue appears to be random. Is there a way that before Postgres
> starts we can check whether data is flushed, if not flush it manually or any
> other better way to avoid this issue.

As Adrian mentioned, by the time you go for a startup of pgsql, the
damage is already done during the previous shut down.

The real issue here is that a properly operating server should be able
to have the power plug pulled, and on boot up postgres should be able
to come back up. When pgsql can't come back up, it's usually due to an
unreliable storage subsystem. So what are you using for storage?


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Postgresql 9.3 not coming up after restart in centos

2016-01-07 Thread Scott Marlowe
On Thu, Jan 7, 2016 at 1:41 PM, balajishanmu...@live.in
 wrote:
> Hi,
>
> For storage I am using a 2.5inch SATA 3 SSD hard disk. It is about 60 GB. I
> am yet to get the log. I will post the Postgres log once I have it.
>
> Thanks!

Yeah a lot of cheaper consumer grade SSDs don't fsync safely. There
are drives that do, and they're usually bit more expensive. We use the
Intel DC S3500 series at work and they do pass the "pull the power
cables" test for us.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Cannot create role, no default superuser role exists

2016-03-10 Thread Scott Marlowe
For future reference you can start the postmaster in single user mode
and create a new postgres account or grant the current one superuser
access if this happens again. Just google "Postgresql single user
mode".


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Regarding connection drops for every few minutes

2016-03-11 Thread Scott Marlowe
On Fri, Mar 11, 2016 at 2:22 AM, Durgamahesh Manne
 wrote:
> hi
>
> the following info generating in pg_log
>
> LOG:  could not receive data from client: Connection reset by peer
> LOG:  could not receive data from client: Connection reset by peer
> LOG:  could not receive data from client: Connection reset by peer
> LOG:  could not receive data from client: Connection reset by peer
>
>
> Can you please provide info to stop connection drops for every few minutes
>
>
> do i need to change the value of tcp related parameter or else
>
> ssl_renegotiation parameter
>  please let me know sir

The latest versions of PostgreSQL turn off ssl renegotiation by
default, so it might be easier to just update to the latest release.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Regarding connection drops for every few minutes

2016-03-11 Thread Scott Marlowe
On Fri, Mar 11, 2016 at 3:22 AM, Durgamahesh Manne
 wrote:
> Hi Sir
>
> As per above discussion.GOOD response from PostgreSQL
>
> i am very happy to work on PostgreSQL.Super fast response only from postgres
> team regarding i asked any question related to postgres
>
> regards
> mahesh
>
> On Fri, Mar 11, 2016 at 3:28 PM, John R Pierce  wrote:
>>
>> On 3/11/2016 1:50 AM, Durgamahesh Manne wrote:
>>
>>  Thanks for quick response .as per above conversation. for which parameter
>> i can comment to resolve the  issue & please specify the value of parameter
>> sir
>>
>> LOG:  could not receive data from client: Connection reset by peer
>>
>>
>> your client is disconnecting without closing, thats all that error says,
>> it doesn't say why this is happening, and without knowing why, there's no
>> way to suggest a fix.
>>
>> I suppose you could try setting tcp_keepalives_idle, if your connections
>> are staying idle for a long time and your OS doesn't default to using
>> tcp_keepalive, this could help.
>>
>> I have no idea what a suitable value is, you didn't specify an operating
>> system, a postgres version, what API your client is using, or if this is a
>> localhost vs a LAN vs an internet connection, or really much of anything
>> else..
>>
>>
>>
>> btw, please reply to the list,  not to me privately, thank you.

You could also be getting bitten by a network timeout here. If a
connection sits idle for a while a firewall could be dropping the tcp
connection. You can often work around this with lower tcp_keepalive
timeout values.
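
The knobs for that live in postgresql.conf; the values below are only a
starting point:

tcp_keepalives_idle = 60       # seconds idle before the first keepalive
tcp_keepalives_interval = 10   # seconds between keepalives after that
tcp_keepalives_count = 5       # lost keepalives before the connection is dead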


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL advocacy

2016-03-21 Thread Scott Marlowe
On Mon, Mar 21, 2016 at 7:44 AM, Mark Morgan Lloyd
 wrote:
> If anybody puts together a "just the facts" document after Oracle's attack
> on PostgreSQL in Russia, please make sure it's drawn to the attention of
> this mailing list for the benefit of those who aren't in -advocacy.
>
> I was discussing this sort of thing elsewhere in the context of MS's
> apparent challenge to Oracle and IBM, and the dominant feeling appeared to
> be that actual use of things like Oracle RAC was vanishingly uncommon. Which
> surprised me, and which I'm treating with caution since the fact that
> facilities aren't used (in a certain population of developers etc.) can in
> no way be interpreted as meaning that the technology is not unavailable or
> unreliable.


I've submitted three different bug reports and had a patch within 48
hours each time. The responsiveness of this list, and of the folks who
code PostgreSQL, is far above any level of support I've ever gotten
from Oracle.

I once asked Oracle to please package the newest connection libs into
an RPM for RHEL5 and their response was "do it yourself."

Yeah, I know which database has REAL, USEFUL support for a DBA and it
isn't Oracle.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL advocacy

2016-03-22 Thread Scott Marlowe
On Tue, Mar 22, 2016 at 9:15 AM, Thomas Kellerer  wrote:
> Bruce Momjian schrieb am 22.03.2016 um 16:07:
>>
>> However, I do think database upgrades are easier with Oracle RAC
>
> I think you can do a rolling upgrade with a standby, but I'm not entirely 
> sure.

I find Slony good for upgrading versions with minimal downtime,
including major version changes.  Its very nature allows you to
migrate pieces and parts for testing etc., in ways that any kind of
byte streaming just can't do.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] I/O - Increase RAM

2016-04-13 Thread Scott Marlowe
On Wed, Apr 13, 2016 at 2:43 PM, drum.lu...@gmail.com 
wrote:

>
> Hi all,
>
> At the moment I'm having 100% I/O during the day. My server has SATA HDs,
> and it can't be changed now.
> So, to solve the problem (or at least try) I was thinking about double the
> RAM, and by doing that, increasing the cache.
>
> [image: Inline images 1]
>
> The server has 128GB today:
>
> shared_buffers = 51605MB (I'll change it to 32GB)
>> effective_cache_size = 96760MB
>
>
>
> Question:
>
> I know that might not be the best option, but by increasing the RAM and
> the CACHE would help, right?
>
>
We're gonna need better stats. iostat, iotop, vmstat etc will all break
down your io between reads and writes, random vs sequential etc.

If you're at 100% IO Util, and iostat says you're writing is taking up 20
or 30% of the time, then no, adding cache probably won't help.

Start looking into adding SSDs. They are literally 20 to 1000 times faster
at a lot of io stuff than spinning drives. And they're relatively cheap for
what they do.

Note that a software RAID-5 array of SSDs can stomp a hardware controller
running RAID-10 with spinning disks easily, and RAID-5 is pretty much as
slow as RAID gets.

Here's a few minutes of "iostat -xd 10 /dev/sdb" on one of my big servers
at work. These machines have a RAID-5 of 10x750GB SSDs under LSI MegaRAIDs
with caching turned off. (much faster that way). The array created thus is
6.5TB and it's 83% full. Note that archiving and pg_xlog are on separate
volumes as well.

Device:  rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb        0.00   236.30 1769.10 5907.30 20366.80 69360.80    23.38    36.38    4.74    0.34    6.06   0.09  71.00

So we're seeing 1769 reads/s, 5907 writes/s and we're reading ~20MB/s and
writing ~70MB/s. In the past this kind of performance from spinning disks
required massive caching and cabinets full of hard drives. When first
testing these boxes we got literally a fraction of this performance with 20
spinning disks in RAID-10, and they had 512GB of RAM. Management at first
wanted to throw more memory at it, these machines go to 1TB RAM, but we
tested with 1TB RAM and the difference was literally a few % points going
from 512GB to 1TB RAM.

If your iostat output looks anything like mine, with lots of wkB/s and w/s
then adding memory isn't going to do much.


Re: [GENERAL] Scaling Database for heavy load

2016-05-11 Thread Scott Marlowe
On Wed, May 11, 2016 at 4:09 AM, Digit Penguin  wrote:
> Hello,
>
>
> we use PostgreSql 9.x in conjunction with BIND/DNS for some Companies with
> about 1.000 queries per second.
> Now we have to scale the system up to 100.000 queries per second (about).
>
> Bind/DNS is very light and i think can not give us bottleneck.
> The question is how to dimension the backend database.
>
> The queries are select (only few insert or update), but the 100.000 queries
> per second are only select.
>
> How can i calculate/dimensionate?
> We think to put mor ethan one Bind Server (with backend database) behinbd a
> router with balancing capabilities.
>
> The problem is to know which requirements and limits does a Postgresql 9.x
> installation - 64 bit - can have.
> Furthermore, we tried Rubyrep (it is quite old!); can you suggest me other
> replication modules that can work also if connction link, from Database
> Server, went down?

Definitely looks like multiple read slaves is the answer. How many
depends on a few things.

How big is your data set? How many clients need to have an open
connection at a time? How many updates / inserts per second are we
talking about when you say "a few"? One per second? Ten, a hundred, a
thousand?

How often and for how long will your connection link be going down?
Slony is quite robust. Postgresql's built in streaming replication
works well enough if you use something like WAL-E or OmniPITR to
archive xlogs and make them available in case of loss of connection.

-- 
To understand recursion, one must first understand recursion.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Corruption of files in PostgreSQL

2007-06-04 Thread Scott Marlowe

Paolo Bizzarri wrote:

On 6/2/07, Tom Lane <[EMAIL PROTECTED]> wrote:

"Paolo Bizzarri" <[EMAIL PROTECTED]> writes:
> On 6/2/07, Tom Lane <[EMAIL PROTECTED]> wrote:
>> Please provide a reproducible test case ...

> as explained above, the problem seems quite random. So I need to
> understand what we have to check.

In this context "reproducible" means that the failure happens
eventually.  I don't care if the test program only fails once in
thousands of tries --- I just want a complete self-contained example
that produces a failure.


As said above, our application is rather complex and involves several
different pieces of software, including Zope, OpenOffice both as
server and client, and PostgreSQL. We are absolutely NOT sure that the
problem is inside PostgreSQL.

What we are trying to understand is, first and foremost, if there are
known cases under which PostgreSQL can truncate a file.


I would suspect either your hardware (RAID controller, hard drive, cache 
etc) or your OS (kernel bug, file system bug, etc)


For instance:

http://lwn.net/Articles/215868/

documents a bug in the 2.6 linux kernel that can result in corrupted 
files if there are a lot of processes accessing it at once.



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] Foreign keys and indexes

2007-06-05 Thread Scott Marlowe

Marc Compte wrote:

Thanks to everyone for the prompt reply :)

Good thing about answers is when they raise up new questiosn, so you 
can keep on learning all the time.


This one answer, for instance, brings me another question. Does having 
a composite primary mean the system will create an individual index on 
each of the fields? or is the index created on the composition only?


For instance, in the implementation of a N:M relationship, declaring 
the primary as (foreign1, foreign2) will create two indexes? or just one? 


Just one (and please don't top post.  :) )
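
To make that concrete with made-up names: the composite primary key
gets exactly one index, and you only add a second one yourself if you
also look rows up by the second column alone:

create table item_tag (
  item_id integer not null,
  tag_id  integer not null,
  primary key (item_id, tag_id)   -- one composite index on (item_id, tag_id)
);

create index item_tag_tag_id_idx on item_tag (tag_id);  -- only if needed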

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [GENERAL] Corruption of files in PostgreSQL

2007-06-05 Thread Scott Marlowe

Greg Smith wrote:

On Tue, 5 Jun 2007, Paolo Bizzarri wrote:


On 6/4/07, Scott Marlowe <[EMAIL PROTECTED]> wrote:

http://lwn.net/Articles/215868/
documents a bug in the 2.6 linux kernel that can result in corrupted
files if there are a lot of processes accessing it at once.


in fact, we were using a 2.6.12 kernel. Can this be a problem?


That particular problem appears to be specific to newer kernels so I 
wouldn't think it's related to your issue.


That is not entirely correct.  The problem was present all the way back 
to the 2.5 kernels, before the 2.6 kernels were released.  However, 
there was an update to the 2.6.18/19 kernels that made this problem much 
more likely to bite.  There were reports of data loss for many people 
running on older 2.6 kernels that mysteriously went away after updating 
to post 2.6.19 kernels (or in the case of redhat, the updated 2.6.9-44 
or so kernels, which backported the fix.)


So, it IS possible that it's the kernel, but not likely.  I'm still 
betting on a bad RAID controller or something like that.  But updating 
the kernel probably wouldn't be a bad idea.


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [GENERAL] Running v8.1 amd v8.2 at the same time for a transition

2007-06-05 Thread Scott Marlowe

Vincenzo Romano wrote:

Hi all.
I need to run both 8.1 and 8.2 at the same time in order to check
everything in 8.2 *before* shutting 8.1 down.
I need to run both as I only have one machine available.
I'm using a debian derivateive (Kubuntu) that provides a nice pg_wrapper
mechanism to direct connections for tools to either version you want.
Infact I see both versions running, one on port 5432 and one on port 5433.
The point is thay I have no clue on ow to choose which instance attach to.
I've already posted this question to the KUbuntu team with no answer in 14+
hours.


You choose one or the other by changing the port.  If you're not sure 
which is running on which port, you can try connecting.


something along the lines of:

psql -p 5433 -U postgres template1
select version();

should tell you.
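
On Debian-ish systems the postgresql-common wrapper also ships
pg_lsclusters, which (if I remember right) lists each cluster with its
version, port and data directory in one shot:

pg_lsclusters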


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [GENERAL] should the postgres user have a password?

2007-06-07 Thread Scott Marlowe

Anton Melser wrote:

Hi,
I have never given postgres a password before, but one of my clients
put one on his postgres user - I suppose so he could use the postgres
user from phppgadmin (not my choice !).
But now I can't see how to get my backup scripts to work... can I put
the password for tools like pg_dumpall (or psql, or whatever) on the
CL. There is no option in the docs...
Would it be better to remove the password (if so, any pointers, I
couldn't find that either!) and make postgres only able to connect via
pipes?
Thanks again,
Anton
ps. I did google but obviously couldn't find the right combo of 
keywords...


Also, you can create a user for backups, give it superuser powers, and 
make an entry in your pg_hba.conf file so that user can connect from a 
specific machine / IP without a password.
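
The pg_hba.conf line would look something like this (user name and
address are examples; put it above any more general md5/password lines,
since the first matching line wins):

host    all    backupuser    192.168.0.10/32    trust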


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [GENERAL] Limitations on 7.0.3?

2007-06-08 Thread Scott Marlowe

ARTEAGA Jose wrote:

Also worth mentioning is that I just this week found out about a very,
very important parameter "shared buffers". Ever since the original
person setup our PG (individual no longer with us) this DB had been
running without any major glitches, albeit slow. All this time the
shared buffers were running at default of "64" (8192 block size). Once I
have got this back up and running I have since set this to 1600 shared
buffers (~125MB). I've since noticed a dramatic performance improvement,
I hope that I've striked gold. But cannot claim victory yet it's only
been up for 2 days. 
If you think increasing shared buffers in 7.0 improves things, you 
should see what upgrading to 8.2.4 will do.


Seriously, you'll freak at the increase in speed.

7.0 = handcar: http://www.handcar.net/
8.2 = TGV:  http://en.wikipedia.org/wiki/TGV

(p.s. please upgrade before 7.0 eats your data...)

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org/


Re: [GENERAL] pg_xlog - files are guaranteed to be sequentialy named?

2007-06-13 Thread Scott Marlowe

Alvaro Herrera wrote:

Johannes Konert wrote:
  

Hi pgsql-list-members,
I currently write a small script that deletes outdated xlog-files from 
my backup-location.
Because I do not want to rely on creation-date, I found it usable to use 
the result of

ls | sort -g -r
Thus the newest WAL xlog-file is on top and I can delete all not needed 
files at the bottom of the list.



Warning, this is NOT SAFE to do.  You should NEVER delete "outdated"
xlog files, unless you appreciate RANDOM CORRUPTION of your data.
  
I think he's talking about deleting pg_xlog files that are being used 
for PITR from the backup machine after they've been applied.


But I'm not sure whether that's really what he meant.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] DeadLocks...

2007-06-14 Thread Scott Marlowe

[EMAIL PROTECTED] wrote:

I found a problem with my application which only occurs under high loads
(isn't that always the case?).

snippets of perl...

insert into tokens (token)
select values.token
from (values TOKEN_LIST_STRING) as values(token)
left outer join tokens t using (token)
where t.token_idx is null

$sql =~ s/TOKEN_LIST_STRING/$string/
where $string is of the form (('one'),('two'))

This works 99% of the time.

But everyone once in a long while it seems that I hit simultaneaous
execute() statements that deadlock on the insertion.

Right now I know of no other way to handle this than to eval{ } the
execution and if it fails, sleep random milliseconds and retry... "wash
rinse repeat" for some number of times.

Is there any better way of doing this or some other means to manage
DEADLOCK?
Is this a deadlock that postgresql detects and causes one thread to roll 
back and you can recover from, or are you talking about a deadlock that 
isn't detected by postgresql and locks a thread?


What error messages are you seeing?

Generally speaking, if your operations have a potential for a deadlock, 
the best you can do is to do what you're doing now, detect failure and 
retry x times, then give up if it won't go through.


Or, redesign the way you're doing things.

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] COPY Command and a non superuser user?

2007-06-14 Thread Scott Marlowe

Warren wrote:

How do I get a non superuser user to be able to run the COPY command?


You can COPY to / from STDIN.

Non-superusers cannot run COPY to / from a file, since a server-side 
file COPY runs with the access authority of the postgres OS user and 
could be used to do "bad things" TM
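
From psql the easy way is \copy, which reads the file with the client's
permissions and feeds it to the server over STDIN -- table and path are
made up:

\copy mytable from '/home/me/data.txt'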


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [GENERAL] allocate chunk of sequence

2007-06-15 Thread Scott Marlowe

Gary Fu wrote:

hello,

I try to allocate a chunk of ids from a sequence with the following 
proc.  However, if I don't use the 'lock lock_table', the proc may not 
work when it runs at the same time by different psql sessions.  Is 
there a better way without using the 'lock lock_table' ?



aNewFileId = setval('aa_seq', aFileId + nNumberOfFiles - 1);

This is NOT the best way to use sequences.

Much better would be to use a loop to allocate them one after the 
other, and put them into a record type or something.


Do they HAVE to be contiguous?

If they're always the same size, then set the increment value of the 
sequence on creation to reflect that.


i.e.:

create sequence abc increment by 20

then just select nextval, and you have that value plus the 19 after it 
all to yourself.  Lots of ways to handle this, but setval is generally 
the worst way to handle anything in a highly parallel env.
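
i.e. something like this, where 41 is just whatever the sequence
happens to hand back:

create sequence abc increment by 20;
select nextval('abc');   -- returns, say, 41: ids 41 through 60 are yours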


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [GENERAL] allocate chunk of sequence

2007-06-20 Thread Scott Marlowe

PFC wrote:


The chunk to be allocated is not the same size, so to set the 
increment value will not help.


I'm sometimes not that subtle, so I'd just use a BIGINT sequence. 
Think about the largest chunk you'll ever get (probably less than 2^30 
rows, yes ?), set this sequence increment to this very high value 
(this will still let you have 2^32 chunks at least), and make each 
chunk be (big_sequence_value + N).
And that's only if they need to be contiguous.  If they don't, then just 
grab however many you need one after the other.


I'm not sure how the nextval function to handle this internally, if 
it has to read and update the sequence object. Does it use some kind 
of lock ? Otherwise the problem mentioned here should happen to 
nextval function also.


Yes it takes a lock, but the lock is held for a very short time 
(just increment the value and release the lock).
And that lock will be WAY shorter and less bothersome than the lock you'll 
need if you do select setval('seqname', (select max(id)+1 from sometable), 
false); in the middle of your UDF.


In InnoDB, the AUTO_INCREMENT lock is held for the entire duration 
of the SQL statement, including any triggers that might fire. This 
means if you have ON INSERT triggers which take some time to process, 
INSERT concurrency is horrendous. Not so with Postgres sequences.
Note that that hasn't been true for some time now, according to Heikki 
Tuuri (sp?).  He told me they changed the way that worked about halfway 
through the 5.0 dev cycle so that they use a kind of internal sequence 
much like postgresql.  Except in postgresql the sequence is exposed.


Again, needing contiguous ids is gonna cause a problem.  It is going to 
serialize all inserts to your database, and slow it down in any kind of 
large parallel environment.


If you can get away without them then do so.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [GENERAL] persistent db connections in PHP

2007-06-20 Thread Scott Marlowe

[EMAIL PROTECTED] wrote:

This seems to be a problem with PHP, or at least my set up.

I'm writing pages in basically the same way. Each page has an include
at the top that gets you a database session. The function, either
pg_connect() or mysql_connect(), is supposed to either create a new
connection, or return your existing one.

So after I have a connection, I can navigate to other pages, reload or
post to the current one, trigger the x_connect(), and get the session
I created earlier.

In my Mysql site, if I create temporary tables, I still have access to
them after I have traversed a mysql_connect. So it looks like PHP is
giving me the connection I had when I created the temp tables.

However, with this new Postgres site, I don't have access to my temp
tables after I've traversed another pg_connect. So PHP is either
creating a new connection, or giving me another session, not the one
which I created my tables in.


MySQL reuses old connections within the same script.  PostgreSQL's php 
extension does not; it starts a new connection each time.


MySQL's behaviour is surprising, PostgreSQL's behaviour is what you'd 
expect.  Which is typical of both dbs.


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] standard LOB support

2007-06-20 Thread Scott Marlowe

EBIHARA, Yuichiro wrote:

Hi,

I'm developing a software that supports several RDBMSs including PostgreSQL.

The software needs an ability to handle large objects and now it uses 'bytea' 
datatype for binary
data and 'text' for text data. 
But for portability, I'd rather use BLOB and CLOB defined by the SQL standards indeed.


Is there any plan to support BLOB and CLOB in future releases?
  
Looking at the spec, and postgresql's implementation, I can't see much 
reason you couldn't just use a domain to declare that a bytea is a blob 
and varchar is a clob.


Unless there's some specific thing you need that I'm not seeing.
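
i.e. something as simple as:

create domain blob as bytea;
create domain clob as varchar;

and have the portable layer talk in terms of blob / clob while postgres
stores bytea / varchar underneath.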



Re: [GENERAL] VACUUM ANALYZE extremely slow

2007-06-20 Thread Scott Marlowe

Sergei Shelukhin wrote:

This is my first (and, by the love of the God, last) project w/pgsql
and everything but the simplest selects is so slow I want to cry.
This is especially bad with vacuum analyze - it takes several hours
for a database of mere 15 Gb on a fast double-core server w/2Gb of RAM
and virtually no workload at the moment. Maintenance work mem is set
to 512 Mb.

  
I noticed you didn't mention your disk subsystem.  PostgreSQL tends to 
use a fair bit of disk I/O when running vacuum and / or analyze.  If you 
check with top / iostat while vacuum analyze is running, I'm betting 
you'll see a lot of waiting on I/O going on.


You do know those two commands (vacuum and analyze) aren't married 
anymore, right?  You can run analyze all by itself if you want?


And I hope you're not running vacuum full analyze all the time, cause 
there's usually no need for that.


Look up pg_autovacuum.  Saves a lot of hair pulling.

Is there any way to speed up ANALYZE?
Analyze is usually REALLY fast.  Even on my rather pokey workstation, 
with a single SATA hard drive and other things to do, I can run analyze 
on a 31 Gig database in

 Without it all the queries run
so slow that I want to cry after a couple of hours of operation and
with it system has to go down for hours per day and that is
unacceptable.
  
You should only need to run analyze every so often.  You should only 
need vacuum after lots of updates / deletes.  You should not need to 
take the system down to vacuum, as vacuum doesn't block.  Vacuum full 
does block, but if you need that you either aren't vacuuming often 
enough or you don't have the autovacuum daemon configured.
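
For reference, the commands can be run separately (the table name here is
just an example):

ANALYZE mytable;          -- update planner statistics only; cheap and fast
VACUUM mytable;           -- reclaim dead rows; does not block readers or writers
VACUUM ANALYZE mytable;   -- both in one pass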

The same database running on mysql on basically the same server used
to run optimize table on every table every half an hour without any
problem. I am actually pondering scrapping half the work on the
conversion and stuff and going back to mysql, but I wonder if there's
some way to improve it.
  
And when you ran optimize on those tables, were they not locked for 
regular users the whole time?


There may be a way to improve it.  Tell us, what OS are you running, 
what are your non-default postgresql.conf settings, what ACTUAL commands 
are you running here?  Vacuum, vacuum analyze, vacuum full analyze?  Are 
you inserting / deleting / updating tons of rows between vacuums and / 
or analyzes?







Re: [GENERAL] insane index scan times

2007-06-20 Thread Scott Marlowe

Sergei Shelukhin wrote:

This is just an example isolating the problem.
Actual queries contain more tables and more joins and return
reasonable amount of data.
Performance of big indices however is appalling, with planner always
reverting to seqscan with default settings.

I tried to pre-filter the data as much as possible in preceding joins
to put less strain on the offending join (less rows to join by?) but
it doesn't help.

I wonder what exactly makes the index perform 100+ times slower than
seqscan - I mean even if it's performed on the HD, which it should not
be given the index size, index and table are on the same HD and the index
is smaller and also sorted, isn't it?

set enable_seqscan = on;
  
explain select * from entries inner join stuff on entries.id =

stuff.id;

 ->  Seq Scan on stuff  (cost=0.00..193344.00 rows=12550400 width=12)


set enable_seqscan = off;

explain select * from entries inner join stuff on entries.id =
stuff.id;

 ->  Index Scan using blah on stuff  (cost=0.00..25406232.30
rows=12550400 width=12)

I don't think you really understand postgresql very well.

There's no evidence that the index scan is 100 times slower.  The 
planner is guesstimating that it will take much longer for the index to 
do the same work.


Do some research on postgresql's MVCC model and "visibility".  The 
indexes don't carry visibility information, so every index access has to 
hit the table anyway; if the query planner figures you're going to hit 
10% or so of the table, it's often cheaper to just seq scan it and go.


Run your queries with "explain analyze" and see which is faster.  If the 
seq scan is faster, then pgsql made the right choice.  What you can do 
to speed it up depends on your system.


Post the output of explain analyze select ... here and let us look at it.

More importantly, post your REAL queries with explain analyze output 
(not just explain) and let's see what we see.
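
For example, using the join from earlier in the thread:

set enable_seqscan = on;
explain analyze select * from entries inner join stuff on entries.id = stuff.id;

set enable_seqscan = off;
explain analyze select * from entries inner join stuff on entries.id = stuff.id;

Compare the actual times in the two outputs, not just the estimated costs.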






Re: [GENERAL] [PERFORM] [ADMIN] Postgres VS Oracle

2007-06-20 Thread Scott Marlowe

Andrew Kelly wrote:

On Mon, 2007-06-18 at 13:02 -0400, Jonah H. Harris wrote:
  

On 6/18/07, Andreas Kostyrka <[EMAIL PROTECTED]> wrote:


As a cynic, I might ask, what Oracle is fearing?
  

As a realist, I might ask, how many times do we have to answer this
type of anti-commercial-database flamewar-starting question?




As a nudist, I think I have to answer, "About every 9 weeks, it would
seem".


As a surrealist, I'd have to say purple.



Re: [GENERAL] Excell

2007-06-21 Thread Scott Marlowe

David Gardner wrote:
Agreed, ODBC is the way to go; depending on what you are doing, Access 
may be helpful as an intermediate step.


Joshua D. Drake wrote:

Bob Pawley wrote:


Hi All
 
Is there a fast and easy method of transferring information between 
MS Excel and PostgreSQL??


odbc?



Another option is to use your favorite scripting language and send an 
Excel content-type header, then the data in tab-delimited format.  Or 
even in Excel XML format.




Re: [GENERAL] persistent db connections in PHP

2007-06-21 Thread Scott Marlowe

Raymond O'Donnell wrote:


[EMAIL PROTECTED] wrote:

However, with this new Postgres site, I don't have access to my temp
tables after I've traversed another pg_connect. So PHP is either
creating a new connection, or giving me another session, not the one
which I created my tables in.


You wouldn't expect to be given back the same connection (and hence 
the same temp tables) from a pool of connections - they're returned 
randomly.


Scott Marlowe wrote:
MySQL reuses old connections within the same script.  PostgreSQL's 
php extension does not, it starts a new connection each time.


Isn't pg_pconnect supposed to recycle a pooled connection?


I wasn't speaking of pg_pconnect.  I was speaking of mysql_connect and 
pg_connect.





Re: [GENERAL] Excell

2007-06-21 Thread Scott Marlowe

Csaba Nagy wrote:

On Thu, 2007-06-21 at 16:45, Scott Marlowe wrote:
  
Another option is to use your favorite scripting language and throw an 
excel header then the data in tab delimited format.  Or even in excel 
xml format.



Why would you need any scripting language?  COPY supports CSV output
pretty well; it can even put a header at the top. 


Because I'm delivering reports to dozens of people who have windows, no 
psql client, and just want to go to a web page, click a button, and get 
their report (or was that a banana?)


I guess I could give them an account on the reporting server and a copy 
of pgadmin or something, but most of them are happier with a web page 
and a set of buttons.
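
(For reference, the COPY form Csaba mentions looks roughly like this; the
query and file path are made up, it needs 8.2 or later for the subselect
form, and the path is on the server side, so it needs superuser rights.
From psql you'd use \copy instead:)

COPY (SELECT * FROM report_data) TO '/tmp/report.csv' WITH CSV HEADER;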




Re: [GENERAL] Throwing exceptions

2007-06-21 Thread Scott Marlowe

Germán Hüttemann Arza wrote:

Hi,

I need a way to throw a message in a function when an exception occurs, but I 
don't want to write the same message again and again in every place I need to 
throw it. So, is there a way to handle this situation in a more general 
manner?


Why not create a table of error messages / numbers, then you can just 
pull the message from the table?
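
A minimal sketch of that idea (all names here are made up for illustration):

CREATE TABLE error_messages (
    code    text PRIMARY KEY,
    message text NOT NULL
);

INSERT INTO error_messages VALUES ('NO_SUCH_USER', 'No user with that id exists');

CREATE OR REPLACE FUNCTION raise_app_error(p_code text) RETURNS void AS $$
DECLARE
    msg text;
BEGIN
    SELECT message INTO msg FROM error_messages WHERE code = p_code;
    IF msg IS NULL THEN
        msg := 'Unknown error code: ' || p_code;
    END IF;
    RAISE EXCEPTION '%', msg;
END;
$$ LANGUAGE plpgsql;

-- then in any other plpgsql function:  PERFORM raise_app_error('NO_SUCH_USER');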




Re: [GENERAL] Establishing a primary key

2007-06-22 Thread Scott Marlowe

Bob Pawley wrote:
I have numerous entries in a column of table 1, some of which are 
duplicated.
 
I need to transfer this information to table 2 so that I have a column 
that can be used as a primary key.
 
Any help is appreciated.


So, I take it you're wanting to have this so that table 1 stays as it 
is, and table 2 gets the entries from table 1 made unique, and becomes 
the parent of table 1?


If that's the case, you want something like this:

create table table2 as select distinct idcolumn from table1;
alter table table2 add primary key (idcolumn);
alter table table1 add foreign key (idcolumn) references table2 (idcolumn);


I think that's about right.



Re: [GENERAL] Crash in PostgreSQL-8.2.4 while executing query

2007-07-07 Thread Scott Marlowe

On 7/7/07, rupesh bajaj <[EMAIL PROTECTED]> wrote:

Hi,
When I run the following query Postmaster crashes. For your reference I have
attached information regarding the query (query plan, table , index, server
log).
My observation is that when Postmaster starts taking lot of memory because
of which system runs out of memory and OS kills the process. System specs
are as: SUSE 10.1, 2GB RAM , P- IV.


Formatting your queries makes it much easier for people reading them to
try and help you.

This query looks like an unconstrained join.  Judging by the query
plan that explain returns, I'd say that is the case.

What, exactly, are you trying to do here?  I'd think that a union of
two queries with those where exists clauses as where clauses and a
proper join on the semtable might work better.



Re: [GENERAL] [PERFORM] Parrallel query execution for UNION ALL Queries

2007-07-18 Thread Scott Marlowe

On 7/18/07, Benjamin Arai <[EMAIL PROTECTED]> wrote:

Hi,

If I have a query such as:

SELECT * FROM (SELECT * FROM A) UNION ALL (SELECT * FROM B) WHERE
blah='food';

Assuming tables A and B both have the same attributes and the data
between the tables is not partitioned in any special way, does Postgresql
execute WHERE blah='food' on both tables simultaneously or what?  If not, is
there a way to execute the query on both in parallel and then aggregate the
results?

To give some context, I have a very large amount of new data being loaded
each week.  Currently I am partitioning the data into a new table every
month, which is working great from an indexing standpoint.  But I want to
parallelize searches if possible to reduce the performance loss of having
multiple tables.


Most of the time, the real issue would be the I/O throughput for such
queries, not the CPU capability.

If you have only one disk for your data storage, you're likely to get
WORSE performance if you have two queries running at once, since the
heads would have to keep seeking back and forth from one data set to
the other.

EnterpriseDB, a commercially enhanced version of PostgreSQL, can do
query parallelization, but it comes at a cost, and that cost is making
sure you have enough spindles / I/O bandwidth that you won't actually
be slowing your system down.



Re: [GENERAL] 8.2.4 signal 11 with large transaction

2007-07-20 Thread Scott Marlowe

On 7/20/07, Bill Moran <[EMAIL PROTECTED]> wrote:

In response to Tom Lane <[EMAIL PROTECTED]>:

> Bill Moran <[EMAIL PROTECTED]> writes:
> > I'm now full of mystery and wonder.  It would appear as if the
> > underlying problem has something to do with PHP, but why should this
> > cause a backend process to crash?
>
> I'd bet on PHP submitting the query via extended query protocol
> (PQexecParams or equivalent) instead of plain ol PQexec which is what
> psql uses.

Doesn't appear that way.  The PHP source is somewhat cryptic, but I
don't seem much ambiguity here:

pgsql_result = PQexec(pgsql, Z_STRVAL_PP(query));

There're no conditional blocks around that, so it's the only possible
choice when pg_query() gets called in a PHP script.  PHP exposes a
seperate pg_query_params() that wraps PQexecParams().

> I don't speak PHP or have it installed here, so this example
> is hard for me to investigate.  Can someone make a reproducer that uses
> PQexecParams?

Is there any way that this (or something similar) could still apply?


I just ran your script, and only changed the connect string to reflect
my environment.

It ran smoothly against my workstation's 8.1.8 pgsql install and
against my reporting server's 8.2.4 installation, inserting 30001 rows
in each.

I'm not familiar with the host=/tmp bit in the connect string, is that
an explicit declaration of using unix local sockets and the directory
to find it?  Does it work if you go to tcp/ip sockets and use a
hostname etc... in your pg_connect?



Re: [GENERAL] Need help with bash script and postgresql

2007-07-23 Thread Scott Marlowe

On 7/23/07, Chuck Payne <[EMAIL PROTECTED]> wrote:


Hey,

I have spent the last several days looking for a website or how-to that
would show me how to call postgresql in a bash script. I know that in mysql I
can do something like this

for i in `cat myfile.txt` ; do mysql -uxxx -p -Asse mydatabase  "insert
into mytable  (aaa,bbb) values ("xxx", "yyy");"

I have tried to do that with pgsql and it's not working. I have looked at the
two books I have and they are very limited.

Can somebody tell me if postgres has a flag like -Asse? Or show me a simple
script they have done that is kinda like the above.


Sometimes it's handy to process multiple lines at a time and process
each piece of data.  Here's a bit of a script, simplified, that we use
to monitor our application where I work.

# $newquery holds the SQL to run (set earlier in the real script)
echo $newquery | psql -h pg -U report -Atp5432 -F" " productiondb > /tmp/$$stat.tmp
t=/tmp/$$stat.tmp
# If there's no response (empty file), clean up and exit
if [[ ! -s $t ]]; then
    rm $t
    exit
fi
while read line
do
    arr=($line)        # split the line on spaces into an array
    tc=${arr[0]}
    frate=${arr[1]}
    fails=${arr[2]}
    if [[ $frate -gt 25 && $tc -gt 5 ]]; then
        :  # do something here
    fi
    if [[ $fails -gt 10 && $tc -gt 5 ]]; then
        :  # do something here
    fi
done < $t
rm $t

This script reads one line at a time from the $$stat.tmp file, splits
each line on spaces, and assigns the pieces to variables you can
perform tests on.



Re: [GENERAL] Linux distro

2007-08-01 Thread Scott Marlowe
On 8/1/07, Madison Kelly <[EMAIL PROTECTED]> wrote:
> Joseph S wrote:
> > I just moved one of my desktops and my laptop from Fedora 6 to Ubuntu
> > 7.04 because Fedora lacked hardware support that Ubuntu had, and my Fedora
> > machines had all sorts of problems like sound dropping out and machines
> > locking up.  (Also the Fedora installers are terrible).
> >
> > My small gripes about Ubuntu are:
> >  1) rpm, for all its faults, is still better than using apt
>
> Heh, see, this is what I meant by "you won't get the same answer twice".
> :) Personally, one of the big selling features of Debian (and Ubuntu)
> was how much better /I/ found 'apt-get'/'aptitude'/'synaptic' over
> 'up2date'/'yum'.
>
> You may want to download all the popularly recommended distributions and
> play around with them to see which suits your fancy.
>
> The major distributions I would suggest (in no particular order) you
> play with:
> - RHEL (if you can afford it)
> - CentOS
> - Debian

Seconded.

I would tend to choose a distro based on who I know that I trust to
help me out.  If you've got a good friend who is an RHCE, it might be
a good idea to go with RHEL/Centos.  And so on.

Let me add.  If you're going to be using this server in production,
it's just as important to stress / load test it before sending it out
to do the job to make sure it can, in fact, do the job.

memtest86 is a must, as is running some kind of heavy load test for a
few days or weeks if you can afford the time.

Get a good reliable RAID card, pref with battery backed cache.

And the point of this little side line is that whatever you choose for
hardware may well constrain what distro to use, as you'll need to make
sure the drivers that come with the distros works well with your
hardware.



Re: [GENERAL] why is the LIMIT clause slowing down this SELECT?

2007-08-01 Thread Scott Marlowe
On 8/1/07, Mason Hale <[EMAIL PROTECTED]> wrote:
> On a 8.1.9 version database that has been recently vacuumed and
> analyzed, I'm seeing some dramatic performance degradation if a limit
> clause is included in the query. This seems counter-intuitive to me.
>
> Here's the query and explain plan WITH the LIMIT clause:
>
> SELECT *
>FROM topic_feed
>  WHERE topic_id = 106947234
>   ORDER BY score DESC
>  LIMIT 25
>
> Limit  (cost=0.00..651.69 rows=25 width=29) (actual
> time=72644.652..72655.029 rows=25 loops=1)
>   ->  Index Scan Backward using topic_feed_score_index on topic_feed
> (cost=0.00..21219.08 rows=814 width=29) (actual
> time=72644.644..72654.855 rows=25 loops=1)
> Filter: (topic_id = 106947234)
> Total runtime: 72655.733 ms
>
> ==
>
> and now WITHOUT the LIMIT clause:
>
> SELECT *
>FROM topic_feed
>  WHERE topic_id = 106947234
>   ORDER BY score DESC
>
> Sort  (cost=1683.75..1685.78 rows=814 width=29) (actual
> time=900.553..902.267 rows=492 loops=1)
>   Sort Key: score
>   ->  Bitmap Heap Scan on topic_feed  (cost=7.85..1644.40 rows=814
> width=29) (actual time=307.900..897.993 rows=492 loops=1)
> Recheck Cond: (topic_id = 106947234)
> ->  Bitmap Index Scan on
> index_topic_feed_on_topic_id_and_feed_id  (cost=0.00..7.85 rows=814
> width=0) (actual time=213.205..213.205 rows=2460 loops=1)
>   Index Cond: (topic_id = 106947234)
> Total runtime: 904.049 ms

Something seems wrong here.  The cost of the second plan adds up to
1685, the cost of the first plan adds up to 651.69 with an
intermediate step that adds up to 21219.08.  ??? I thought the outer
parts of the plan always contained the inner parts?  This doesn't make
sense.

If the inner cost really is 21219 then the planner should have
switched to the cheaper plan with a limit.  If it thinks it's going to
be 651 total, then how come it's ignoring the cost of 21219?



Re: [GENERAL] Linux distro

2007-08-01 Thread Scott Marlowe
On 8/2/07, Andrej Ricnik-Bay <[EMAIL PROTECTED]> wrote:
> On 8/2/07, Alvaro Herrera <[EMAIL PROTECTED]> wrote:
> > >   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
> > >   4735 root  18   0 52524 7204 4304 S  0.0  0.4   0:00.01 httpd
> > >  4820 root  15   0  141m 6648 3140 S  0.0  0.4   0:00.64 X
> > I think most of the virtual memory used by X is actually the map of the
> > graphics card's memory AFAIK, so it's not as significant as you think.
> That machine has an on-board chipset (i845) and has only 8MB
> shared memory allotted to the card 

You don't seem familiar with the meaning of VIRT in the memory
allocation listing there.

VIRT includes the sizes of all the libraries that the process has
opened, whether they've been loaded or not.  I.e. apache shows 52 Meg
there, but only has 7.2 Meg resident.  If it manages to do something
that needs the dynamic libs they'll get loaded into real memory and
take up real space.  Until then, it's only using 7.2 meg or so.

The same is true of X here.  It has 141M of total memory taken between
resident, shared and all the libs it's linked to, but it's only
actually using 6.6 meg of physical memory.  If those ever do get used,
then they could take up real physical memory.  But on a server, it's
quite likely that they never will.  And if they do, and then sit idle for
some length of time, the OS will swap them out to make space for the
OS to do something else in.  If the programs resident in the 6.6 meg
of physical memory don't see much use, they too will be swapped out to
make space for caching etc as well.

I can't imagine that 6.6 meg makes a big difference on most servers
nowadays.  I/O bandwidth, network bandwidth, memory bandwidth, number
of CPUs, all are probably more important than a 6.6 meg chunk of
memory.



Re: [GENERAL] What do people like to monitor (or in other words, what might be nice in pgsnmpd)?

2007-08-02 Thread Scott Marlowe
I'd like to know what the age of the oldest running transaction is,
i.e. look out for old "idle in transaction" sessions that are
holding up vacuuming.
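
Something along these lines gets at that first one, assuming a release new
enough to have xact_start in pg_stat_activity (column names vary a bit
between versions, so treat this as a sketch):

SELECT procpid,
       now() - xact_start AS xact_age,
       current_query
  FROM pg_stat_activity
 WHERE xact_start IS NOT NULL
 ORDER BY xact_start
 LIMIT 5;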

Info on the shared buffers like % used, % that hasn't been updated or
seen in x minutes / hours / days.

% used on various tablespaces

connection stats: how many clients connected, by what accounts, %
failed auths, stale connections harvested by tcp_keepalive timeouts...



Re: [GENERAL] how to detect the backup database every day

2007-08-07 Thread Scott Marlowe
On 8/7/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> Hello all
>
> I use a crontab job to back up and restore my database every midnight
> (using pg_dump and restore...)

Backup from one machine, restore to another, right?

> My cron job ran well for a while. However, today I discovered that my
> database had not been restored for one month.

> I also fixed the problem, as there was corruption in the script.
>
> My question is how I can detect whether the backup or restore process is
> corrupted. I do not want my backup database to be one month old  -:(

You can detect whether the backup failed from a bash script like so
(the dump target path is just a placeholder):

#!/bin/bash
# pg_dump's exit status tells you whether the dump succeeded
if pg_dump dbname > /path/to/backup.sql; then
    echo "good"
else
    echo "bad"
fi

Same thing for pg_restore or psql

> If you have any experience about this please help...

Some.  There are a lot of angles you can approach this from.  You can
have a simple cronjob that runs every day that checks the size / age
of the latest backup and sends an alarm if it's smaller than the last
one, or isn't there, etc...  You can use the find command to look for
files that are less than x seconds / minutes / hours / days old.


