Re: [PERFORM] : Performance Improvement Strategy

2011-10-05 Thread Venkat Balaji
Hello,

I was attempting to calculate the actual occupied space by a Table.

Below is what i did -

I summed up the avg_width of each column of a table from pg_stats, which
gives me the average size of a row (277 bytes).

select* sum(avg_width) as average_row_size from pg_stats *where
tablename='tablename'

 average_row_size
---
   277

(1 row)

Calculated the actual occupied space by rows in the table as below -

*Took the average_row_size * number_of_rows from pg_class*

select 277*reltuples/1024 as occupied_space from pg_class where
relname='tablename'; == 552 KB

 occupied_space
-
 552.6474609375

Calculated the actual Table size (600 kb)

select pg_size_pretty(pg_relation_size('tablename'));


pg_size_pretty

 600 KB

(1 row)

Calculated the free space with in the table (by scanning the pages - as
suggested by Shaun Thomas) -- 14 KB

SELECT pg_size_pretty(free_space) AS mb_free FROM pgstattuple('tablename');

 mb_free
-
 14 KB

(1 row)

600 KB is the size of the table (taken through pg_size_pretty)
14 KB is the free space (taken through contrib modules)
600+14 = 586 KB -- is the occupied space by normal calculation through
contrib modules. This is based on number of pages allocated to the table.
552 KB is the actual occupied size by the rows (taken by calculating avg row
size ). This is based on number of rows with in the pages.
586-552 = 34 KB -- is still free some where with in the occupied pages (
calculated through pg_stats and pg_class )
34 KB is still free within the pages ( each 8K ) which is basically taken as
occupied space.

This is similar concept which i successfully applied in an other RDBMS
Technology to calculate space usage metrics on production.
This is all calculated after considering Vacuum and Analyze jobs are
executed.

Please comment !

Sorry if this is too confusing and too long.

Thanks
VB

On Wed, Sep 21, 2011 at 6:33 PM, Shaun Thomas  wrote:

> On 09/20/2011 11:22 AM, Venkat Balaji wrote:
>
>  Please help me understand how to calculate free space in Tables and
>> Indexes even after vacuuming and analyzing is performed.
>>
>
> Besides the query Mark gave you using freespacemap, there's also the
> pgstattuple contrib module. You'd use it like this:
>
> SELECT pg_size_pretty(free_space) AS mb_free
>  FROM pgstattuple('some_table');
>
> Query must be run as a super-user, and I wouldn't recommend running it on
> huge tables, since it scans the actual data files to get its information.
> There's a lot of other useful information in that function, such as the
> number of dead rows.
>
>
>  What i understand is that, even if we perform VACUUM ANALYZE
>> regularly, the free space generated is not filled up.
>>
>
> VACUUM does not actually generate free space. It locates and marks reusable
> tuples. Any future updates or inserts on that table will be put in those
> newly reclaimed spots, instead of being bolted onto the end of the table.
>
>
>  I see lot of free spaces or free pages in Tables and Indexes. But, I
>> need to give an exact calculation on how much space will be reclaimed
>> after VACUUM FULL and RE-INDEXING.
>>
>
> Why? If your database is so desperate for space, VACUUM and REINDEX won't
> really help you. A properly maintained database will still have a certain
> amount of "bloat" equal to the number of rows that change between
> maintenance intervals. One way or another, that space is going to be used by
> *something*.
>
> It sounds more like you need to tweak your autovacuum settings to be more
> aggressive if you're seeing significant enough turnover that your tables are
> bloating significantly. One of our tables, for instance, gets vacuumed more
> than once per hour because it experiences 1,000% turnover daily.
>
> --
> Shaun Thomas
> OptionsHouse | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604
> 312-676-8870
> stho...@peak6.com
>
> __**
>
> See 
> http://www.peak6.com/email-**disclaimer/for
>  terms and conditions related to this email
>


Re: [PERFORM] : Performance Improvement Strategy

2011-10-05 Thread Raghavendra
On Wed, Oct 5, 2011 at 2:38 PM, Venkat Balaji wrote:

> Hello,
>
> I was attempting to calculate the actual occupied space by a Table.
>
> Below is what i did -
>
> I summed up the avg_width of each column of a table from pg_stats, which
> gives me the average size of a row (277 bytes).
>
> select* sum(avg_width) as average_row_size from pg_stats *where
> tablename='tablename'
>
>  average_row_size
> ---
>277
>
> (1 row)
>
> Calculated the actual occupied space by rows in the table as below -
>
> *Took the average_row_size * number_of_rows from pg_class*
>
> select 277*reltuples/1024 as occupied_space from pg_class where
> relname='tablename'; == 552 KB
>
>  occupied_space
> -
>  552.6474609375
>
> Calculated the actual Table size (600 kb)
>
> select pg_size_pretty(pg_relation_size('tablename'));
>
>
> pg_size_pretty
> 
>  600 KB
>
> (1 row)
>
> Calculated the free space with in the table (by scanning the pages - as
> suggested by Shaun Thomas) -- 14 KB
>
> SELECT pg_size_pretty(free_space) AS mb_free FROM pgstattuple('tablename');
>
>  mb_free
> -
>  14 KB
>
> (1 row)
>
> 600 KB is the size of the table (taken through pg_size_pretty)
> 14 KB is the free space (taken through contrib modules)
> 600+14 = 586 KB -- is the occupied space by normal calculation through
> contrib modules. This is based on number of pages allocated to the table.
>

Its typo 600 - 14 = 586 KB

552 KB is the actual occupied size by the rows (taken by calculating avg row
> size ). This is based on number of rows with in the pages.
> 586-552 = 34 KB -- is still free some where with in the occupied pages (
> calculated through pg_stats and pg_class )
> 34 KB is still free within the pages ( each 8K ) which is basically taken
> as occupied space.
>
>
One more point to add to this good discussion, each row header will occupy
24 bytes + 4 bytes pointer on page to tuple.

---
Regards,
Raghavendra
EnterpriseDB Corporation
Blog: http://raghavt.blogspot.com/


Re: [PERFORM] : Performance Improvement Strategy

2011-10-05 Thread Guillaume Cottenceau
Venkat Balaji  writes:

> Hello,
>
> I was attempting to calculate the actual occupied space by a Table.

SELECT relname, reltuples, pg_size_pretty(relpages*8*1024) as size FROM 
pg_class, pg_namespace WHERE pg_namespace.oid = pg_class.relnamespace AND 
relkind = 'r' AND nspname = 'public' ORDER BY relpages DESC;

relkind = 'i' for indexes.

-- 
Guillaume Cottenceau

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] : Performance Improvement Strategy

2011-10-05 Thread Kevin Grittner
Raghavendra  wrote:
> Venkat Balaji wrote:
 
>> [attempt to calculate file space from row layout and number of
>> rows]
 
> One more point to add to this good discussion, each row header
> will occupy 24 bytes + 4 bytes pointer on page to tuple.
 
Not to mention:
 
http://www.postgresql.org/docs/9.1/interactive/storage-toast.html
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] : Performance Improvement Strategy

2011-10-05 Thread Scott Marlowe
On Wed, Sep 21, 2011 at 11:57 AM, Greg Smith  wrote:
> On 09/21/2011 12:13 PM, Venkat Balaji wrote:
>>
>> I as a DBA, suggested to perform VACUUM FULL and RE-INDEXING + ANALYZE to
>> ensure that IO performance and Indexing performance would be good
>
>
> Read http://wiki.postgresql.org/wiki/VACUUM_FULL before you run VACUUM FULL.
>  You probably don't want to do that.  A multi-gigabyte table can easily be
> unavailable for several hours if you execute VACUUM FULL against it.
>  CLUSTER is almost always faster.

It used to be that cluster on a very randomly ordered table was much
slower than doing something like select * into newtable from oldtable
order by col1, col2;  Is that still the case in 9.0/9.1?

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] : Performance Improvement Strategy

2011-10-05 Thread Tom Lane
Scott Marlowe  writes:
> It used to be that cluster on a very randomly ordered table was much
> slower than doing something like select * into newtable from oldtable
> order by col1, col2;  Is that still the case in 9.0/9.1?

Fixed in 9.1, per release notes:

* Allow CLUSTER to sort the table rather than scanning the index when 
it seems likely to be cheaper (Leonardo Francalanci)

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Intel 710 Endurance Test Results

2011-10-05 Thread David Boreham


I ran a test using Intel's timed workload wear indication feature on a 
100G 710 series SSD.


The test works like this : you reset the wear indication counters, then 
start running
some workload (in my case pgbench at scale 100 for 4 hours). During the 
test run
a wear indication attribute can be read via smartctl. This contains the 
drive's estimate
for the total drive wear lifetime used during the test run time. 
Provided we trust
the drive's estimate, this in turn provides an answer to the interesting 
question
"how long before my drive wears out in production?". That is, provided 
the test

workload can be reasonably accurately related to production workload.

Executive summary : the drive says that we used 0.025% of its wear life
during this 4h test. The test performed 47 million transactions on a
roughly 17G database.

We think that this pgbench workload performs transactions at roughly
10X the rate we expect from our application in production under heavy load.
So the drive says you could run this pgbench test for about two years
before wearing out the flash devices. Or 20 years under our expected 
production

workload.

Smart attributes read just after test completion:

[root@server1 9.1]# smartctl -A /dev/sda
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-2.6.32-71.29.1.el6.x86_64] 
(local build)

Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 5
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME  FLAG VALUE WORST THRESH TYPE  
UPDATED  WHEN_FAILED RAW_VALUE
  3 Spin_Up_Time0x0020   100   100   000Old_age   
Offline  -   0
  4 Start_Stop_Count0x0030   100   100   000Old_age   
Offline  -   0
  5 Reallocated_Sector_Ct   0x0032   100   100   000Old_age   
Always   -   0
  9 Power_On_Hours  0x0032   100   100   000Old_age   
Always   -   90
 12 Power_Cycle_Count   0x0032   100   100   000Old_age   
Always   -   8
170 Reserve_Block_Count 0x0033   100   100   010Pre-fail  
Always   -   0
171 Program_Fail_Count  0x0032   100   100   000Old_age   
Always   -   0
172 Erase_Fail_Count0x0032   100   100   000Old_age   
Always   -   0
174 Unknown_Attribute   0x0032   100   100   000Old_age   
Always   -   4
183 SATA_Downshift_Count0x0030   100   100   000Old_age   
Offline  -   0
184 End-to-End_Error0x0032   100   100   090Old_age   
Always   -   0
187 Reported_Uncorrect  0x0032   100   100   000Old_age   
Always   -   0
190 Airflow_Temperature_Cel 0x0032   073   070   000Old_age   
Always   -   27 (Min/Max 17/30)
192 Unsafe_Shutdown_Count   0x0032   100   100   000Old_age   
Always   -   4
194 Temperature_Celsius 0x0032   100   100   000Old_age   
Always   -   32
199 UDMA_CRC_Error_Count0x0030   100   100   000Old_age   
Offline  -   0
225 Host_Writes_32MiB   0x0032   100   100   000Old_age   
Always   -   51362
226 Workld_Media_Wear_Indic 0x0032   100   100   000Old_age   
Always   -   26
227 Workld_Host_Reads_Perc  0x0032   100   100   000Old_age   
Always   -   1
228 Workload_Minutes0x0032   100   100   000Old_age   
Always   -   242
232 Available_Reservd_Space 0x0033   100   100   010Pre-fail  
Always   -   0
233 Media_Wearout_Indicator 0x0032   100   100   000Old_age   
Always   -   0
241 Host_Writes_32MiB   0x0032   100   100   000Old_age   
Always   -   51362
242 Host_Reads_32MiB0x0032   100   100   000Old_age   
Always   -   822


pgbench output from the test run:

bash-4.1$ /usr/pgsql-9.1/bin/pgbench -T 14400 -j 8 -c 64
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 1000
query mode: simple
number of clients: 64
number of threads: 8
duration: 14400 s
number of transactions actually processed: 47205109
tps = 3278.127690 (including connections establishing)
tps = 3278.135396 (excluding connections establishing)



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Allow sorts to use more available memory

2011-10-05 Thread Bruce Momjian
Stephen Frost wrote:
-- Start of PGP signed section.
> * Robert Schnabel (schnab...@missouri.edu) wrote:
> > And getting back to the to-do list entry and reading the related
> > posts, it appears that even if you could set work_mem that high it
> > would only use 2GB anyway.  I guess that was the second part of my
> > question.  Is that true?
> 
> Errr, and to get back to the to-do (which I've been considering doing
> something about...), it's to allow the *actual* memory usage for things
> like sorts to use more than 2GB, but as others have pointed out, you can
> do that by putting pgsql_tmp on a memory filesystem and letting the
> sorts spill to the memory-based FS.

It would be nice if the tempfs would allow us to control total temp
memory usage, except it causes a failure rather than splilling to real
disk.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] pg9 replication over WAN ?

2011-10-05 Thread Ben Ciceron
Hello,

Has any performance or evaluation done for pg9.x streaming replication
over WAN ?
How adequate is the protocol to push WALs over long distance ?
Any best practice tuning wal_* for WAN ?

Cheers,
Ben-

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance