[PERFORM] shared_buffers vs Linux file cache

2015-01-15 Thread Huan Ruan
Hi All

I thought 'shared_buffers' sets how much memory that is dedicated to
PostgreSQL to use for caching data, therefore not available to other
applications.

However, as shown in the following screenshots, The server (CentOS 6.6
64bit) has 64GB of RAM, and 'shared_buffer' is set to 32GB, but the
free+buffer+cache is 60GB.

Shouldn't the maximum value for free+buffer+cache be 32GB ( 64 - 32)?
Is 'shared_buffers' pre allocated to Postgres, and Postgres only?

Thanks
Huan


[image: Inline images 2]

[image: Inline images 1]


Re: [PERFORM] shared_buffers vs Linux file cache

2015-01-15 Thread Glyn Astill
> From: Huan Ruan 
>To: pgsql-performance@postgresql.org 
>Sent: Thursday, 15 January 2015, 11:30
>Subject: [PERFORM] shared_buffers vs Linux file cache
> 
>
>
>Hi All
>
>
>I thought 'shared_buffers' sets how much memory that is dedicated to 
>PostgreSQL to use for caching data, therefore not available to other 
>applications.
>
>
>However, as shown in the following screenshots, The server (CentOS 6.6 64bit) 
>has 64GB of RAM, and 'shared_buffer' is set to 32GB, but the free+buffer+cache 
>is 60GB. 
>
>
>Shouldn't the maximum value for free+buffer+cache be 32GB ( 64 - 32)?
>Is 'shared_buffers' pre allocated to Postgres, and Postgres only?

>

I've not looked at the images, but I think you're getting PostgreSQL 
shared_buffers and the OS buffercache mixed up; they are not the same.

PostgreSQL shared_buffers is specific to postgres, whereas the OS buffercache 
will just use free memory to cache data pages from disk, and this is what 
you're seeing.

Some reading for you: 
http://www.tldp.org/LDP/sag/html/buffer-cache.html

Glyn


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] shared_buffers vs Linux file cache

2015-01-15 Thread Jeff Janes
On Thu, Jan 15, 2015 at 3:30 AM, Huan Ruan  wrote:

> Hi All
>
> I thought 'shared_buffers' sets how much memory that is dedicated to
> PostgreSQL to use for caching data, therefore not available to other
> applications.
>
> However, as shown in the following screenshots, The server (CentOS 6.6
> 64bit) has 64GB of RAM, and 'shared_buffer' is set to 32GB, but the
> free+buffer+cache is 60GB.
>
> Shouldn't the maximum value for free+buffer+cache be 32GB ( 64 - 32)?
> Is 'shared_buffers' pre allocated to Postgres, and Postgres only?
>

While PostgreSQL has reserves the right to use 32GB, as long as PostgreSQL
has not actually dirtied that RAM yet, then the kernel is free to keep
using it to cache files.


Cheers,

Jeff


[PERFORM] Strange choice of general index over partial index

2015-01-15 Thread Josh Berkus
This is an obfuscation and mock up, but:

table files (
id serial pk,
filename text not null,
state varchar(20) not null
... 18 more columns
)

index file_state on (state)
(35GB in size)
index file_in_flight_state (state) where state in (
'waiting','assigning', 'processing' )
(600MB in size)
... 10 more indexes

More important facts:
* state = 'done' 95% of the time.  thereform the partial index
represents only 5% of the table
* all indexes and the table are very bloated
* server has 128GB RAM
* Version 9.2.

Given this setup, I would expect the planner to *always* choose
file_in_flight_state over file_state for this query:

SELECT id, filename FROM files WHERE state = 'waiting';

... and yet it keeps selecting file_state based on extremely small
changes to the stats.   This is important because the same query, using
file_state, is 20X to 50X slower, because that index frequently gets
pushed out of memory.

What am I missing?  Or is this potentially a planner bug for costing?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] shared_buffers vs Linux file cache

2015-01-15 Thread Tom Lane
Jeff Janes  writes:
> On Thu, Jan 15, 2015 at 3:30 AM, Huan Ruan  wrote:
>> I thought 'shared_buffers' sets how much memory that is dedicated to
>> PostgreSQL to use for caching data, therefore not available to other
>> applications.

> While PostgreSQL has reserves the right to use 32GB, as long as PostgreSQL
> has not actually dirtied that RAM yet, then the kernel is free to keep
> using it to cache files.

Another thing to keep in mind is that, even if Postgres *has* used the
RAM, the kernel might decide to swap parts of it out if it's not being
used heavily.  This is pretty disastrous from a performance standpoint,
so it's advisable to not make shared_buffers very much larger than what
your application will keep "hot".

Ideally we'd lock the shared buffer arena into RAM to prevent that,
but such facilities are often unavailable or restricted to root.

regards, tom lane


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Strange choice of general index over partial index

2015-01-15 Thread Jeff Janes
On Thu, Jan 15, 2015 at 2:30 PM, Josh Berkus  wrote:

> This is an obfuscation and mock up, but:
>
> table files (
> id serial pk,
> filename text not null,
> state varchar(20) not null
> ... 18 more columns
> )
>
> index file_state on (state)
> (35GB in size)
> index file_in_flight_state (state) where state in (
> 'waiting','assigning', 'processing' )
> (600MB in size)
> ... 10 more indexes
>
> More important facts:
> * state = 'done' 95% of the time.  thereform the partial index
> represents only 5% of the table
> * all indexes and the table are very bloated
> * server has 128GB RAM
> * Version 9.2.
>
> Given this setup, I would expect the planner to *always* choose
> file_in_flight_state over file_state for this query:
>
> SELECT id, filename FROM files WHERE state = 'waiting';
>
> ... and yet it keeps selecting file_state based on extremely small
> changes to the stats.   This is important because the same query, using
> file_state, is 20X to 50X slower, because that index frequently gets
> pushed out of memory.
>
> What am I missing?  Or is this potentially a planner bug for costing?
>


I wonder if this could be related to 3e9960e9d935e7e7c12e78441, which first
appeared in 9.2.3.

But I don't know why the small index *should* be better.  If this query is
frequent, it should have no problem keeping just those leaf pages that
contain the 'waiting' rows out of the full index in memory, without having
to keep the 'done' leaf pages around.  And if it is not frequent, then it
would have just as much problem keeping the smaller index in memory as it
would a small portion of the large index.

Of course if it randomly switches back and forth, now you have to keep
twice as much data in memory, the relevant parts of both indexes.

What is the point of having the full index at all, in this case?

Cheers,

Jeff


Re: [PERFORM] Strange choice of general index over partial index

2015-01-15 Thread Mark Kirkwood

On 16/01/15 11:30, Josh Berkus wrote:

This is an obfuscation and mock up, but:

table files (
id serial pk,
filename text not null,
state varchar(20) not null
... 18 more columns
)

index file_state on (state)
(35GB in size)
index file_in_flight_state (state) where state in (
'waiting','assigning', 'processing' )
(600MB in size)
... 10 more indexes

More important facts:
* state = 'done' 95% of the time.  thereform the partial index
represents only 5% of the table
* all indexes and the table are very bloated
* server has 128GB RAM
* Version 9.2.

Given this setup, I would expect the planner to *always* choose
file_in_flight_state over file_state for this query:

SELECT id, filename FROM files WHERE state = 'waiting';

... and yet it keeps selecting file_state based on extremely small
changes to the stats.   This is important because the same query, using
file_state, is 20X to 50X slower, because that index frequently gets
pushed out of memory.

What am I missing?  Or is this potentially a planner bug for costing?



Are you seeing a bitmapscan access plan? If so see if disabling it gets 
you a plan on the files_in_flight index. I'm seeing this scenario with a 
fake/generated dataset a bit like yours in 9.2 (9.5 uses the 
files_in_flight w/o any coercing).


regards

Mark


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Autocompletion with full text search

2015-01-15 Thread Ivan Schneider
Hi,

We implemented an autocompletion feature (case and accent insensitive)
using PostgreSQL full text search.
The query fetches patient ids matching the full text query that belong to a
given patient base (rows contain a pg_array with patient_base_ids).
Our table grew over time (6.2 million rows now) and the query got slower.
We are wondering if we have hit the limit or if there is still room for
performance improvement with better indexing or data partitioning for
instance.
Here is a link to the "explain (analyze, buffers)" results from such a
query run on one of our servers : http://explain.depesz.com/s/a5Q9
Running analyze on the table doesn't change the results and the table is
autovacuumed (last one was 2015-01-08 22:18).

You will find below additional information to bring context to my question.
Thank you in advance for your help.

Here is the schema of the table :

CREATE TABLE patients (
id integer NOT NULL,
first_name character varying(255),
last_name character varying(255),
regular_doctor_name character varying(255),
regular_doctor_city character varying(255),
email character varying(255),
phone_number character varying(255),
secondary_phone_number character varying(255),
gender boolean,
birthdate date,
zipcode character varying(255),
city character varying(255),
created_at timestamp without time zone,
updated_at timestamp without time zone,
imported_at timestamp without time zone,
import_error text,
import_identifier character varying(255),
address character varying(255),
deleted_at timestamp without time zone,
account_id integer,
main boolean DEFAULT false NOT NULL,
insurance_type character varying(255),
patient_base_ids_cache integer[] DEFAULT '{}'::integer[],
crucial_info character varying(255),
referrer character varying(255),
occupation character varying(255),
custom_fields_values hstore DEFAULT ''::hstore NOT NULL,
bounced_at timestamp without time zone,
merged_at timestamp without time zone,
maiden_name character varying(255)
);

Here is the dictionary definition we used for full text search :

CREATE TEXT SEARCH CONFIGURATION custom_name_search (
PARSER = pg_catalog."default" );

ALTER TEXT SEARCH CONFIGURATION custom_name_search
ADD MAPPING FOR asciiword WITH simple;

ALTER TEXT SEARCH CONFIGURATION custom_name_search
ADD MAPPING FOR word WITH unaccent, simple;

ALTER TEXT SEARCH CONFIGURATION custom_name_search
ADD MAPPING FOR numword WITH simple;

ALTER TEXT SEARCH CONFIGURATION custom_name_search
ADD MAPPING FOR email WITH simple;

ALTER TEXT SEARCH CONFIGURATION custom_name_search
ADD MAPPING FOR url WITH simple;

ALTER TEXT SEARCH CONFIGURATION custom_name_search
ADD MAPPING FOR host WITH simple;

ALTER TEXT SEARCH CONFIGURATION custom_name_search
ADD MAPPING FOR sfloat WITH simple;

ALTER TEXT SEARCH CONFIGURATION custom_name_search
ADD MAPPING FOR version WITH simple;

ALTER TEXT SEARCH CONFIGURATION custom_name_search
ADD MAPPING FOR hword_numpart WITH simple;

ALTER TEXT SEARCH CONFIGURATION custom_name_search
ADD MAPPING FOR hword_part WITH unaccent, simple;

ALTER TEXT SEARCH CONFIGURATION custom_name_search
ADD MAPPING FOR hword_asciipart WITH simple;

ALTER TEXT SEARCH CONFIGURATION custom_name_search
ADD MAPPING FOR numhword WITH simple;

ALTER TEXT SEARCH CONFIGURATION custom_name_search
ADD MAPPING FOR asciihword WITH simple;

ALTER TEXT SEARCH CONFIGURATION custom_name_search
ADD MAPPING FOR hword WITH unaccent, simple;

ALTER TEXT SEARCH CONFIGURATION custom_name_search
ADD MAPPING FOR url_path WITH simple;

ALTER TEXT SEARCH CONFIGURATION custom_name_search
ADD MAPPING FOR file WITH simple;

ALTER TEXT SEARCH CONFIGURATION custom_name_search
ADD MAPPING FOR "float" WITH simple;

ALTER TEXT SEARCH CONFIGURATION custom_name_search
ADD MAPPING FOR "int" WITH simple;

ALTER TEXT SEARCH CONFIGURATION custom_name_search
ADD MAPPING FOR uint WITH simple;

Here are the indexes on the patients table :

CREATE INDEX index_patients_on_account_id ON patients USING btree
(account_id);
CREATE INDEX index_patients_on_import_identifier ON patients USING btree
(import_identifier);
CREATE INDEX index_patients_on_patient_base_ids_cache ON patients USING gin
(patient_base_ids_cache);
CREATE INDEX index_patients_on_phone_number ON patients USING btree
(phone_number);
CREATE INDEX patients_clean_secondary_phone_number_index ON patients USING
btree (replace((secondary_phone_number)::text, ' '::text, ''::text));
CREATE INDEX tsvector_on_patients ON patients USING gin
(to_tsvector('custom_name_search'::regconfig, (((COALESCE(last_name,
''::character varying))::text || ' '::text) || (COALESCE(first_name,
''::character varying))::text)));
CREATE INDEX tsvector_on_patients_and_patient_base_ids_cache ON patients
USING gin (to_tsvector('custom_name_search'::regconfig,
(((COALESCE(last_name, ''::char

Re: [PERFORM] Strange choice of general index over partial index

2015-01-15 Thread Tom Lane
Josh Berkus  writes:
> index file_state on (state)
>   (35GB in size)
> index file_in_flight_state (state) where state in (
> 'waiting','assigning', 'processing' )
>   (600MB in size)
> ... 10 more indexes

> More important facts:
> * state = 'done' 95% of the time.  thereform the partial index
> represents only 5% of the table
> * all indexes and the table are very bloated
> * server has 128GB RAM
> * Version 9.2.

9.2.what?  And how much of the table is 'waiting' state?

> What am I missing?  Or is this potentially a planner bug for costing?

The only real difference between the two cases is index descent costs:
the number of heap pages visited will be the same whichever index is
used, and the number of index leaf pages visited is probably about the
same too.  9.3 is the first release that makes any real attempt to
model index descent costs realistically.  Before that there were some
dubious fudge factors, which we're unlikely to change in long-stable
branches no matter how badly the results might suck in specific instances.

Having said that, though, I'd have thought that the old fudge factors
would strongly prefer the smaller index given such a large difference in
index size.  Have you neglected to mention some nondefault planner cost
settings?

regards, tom lane


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Strange choice of general index over partial index

2015-01-15 Thread Mark Kirkwood

On 16/01/15 13:37, Mark Kirkwood wrote:

On 16/01/15 11:30, Josh Berkus wrote:

This is an obfuscation and mock up, but:

table files (
id serial pk,
filename text not null,
state varchar(20) not null
... 18 more columns
)

index file_state on (state)
(35GB in size)
index file_in_flight_state (state) where state in (
'waiting','assigning', 'processing' )
(600MB in size)
... 10 more indexes

More important facts:
* state = 'done' 95% of the time.  thereform the partial index
represents only 5% of the table
* all indexes and the table are very bloated
* server has 128GB RAM
* Version 9.2.

Given this setup, I would expect the planner to *always* choose
file_in_flight_state over file_state for this query:

SELECT id, filename FROM files WHERE state = 'waiting';

... and yet it keeps selecting file_state based on extremely small
changes to the stats.   This is important because the same query, using
file_state, is 20X to 50X slower, because that index frequently gets
pushed out of memory.

What am I missing?  Or is this potentially a planner bug for costing?



Are you seeing a bitmapscan access plan? If so see if disabling it gets
you a plan on the files_in_flight index. I'm seeing this scenario with a
fake/generated dataset a bit like yours in 9.2 (9.5 uses the
files_in_flight w/o any coercing).



FWIW: For me 9.2 and 9.3 (default config) generate plans like:
state=# EXPLAIN ANALYZE
SELECT id, filename
FROM files
WHERE state = 'processing';
QUERY PLAN 


--
 Bitmap Heap Scan on files  (cost=3102.02..89228.68 rows=164333 
width=15) (actual time=26.629..803.507 rows=166696 loops=1)

   Recheck Cond: ((state)::text = 'processing'::text)
   Rows Removed by Index Recheck: 7714304
   ->  Bitmap Index Scan on file_state  (cost=0.00..3060.93 rows=164333 
width=0) (actual time=25.682..25.682 rows=166696 loops=1)

 Index Cond: ((state)::text = 'processing'::text)
 Total runtime: 808.662 ms
(6 rows)


whereas 9.4 and 9.5 get:

  QUERY 
PLAN


---
 Index Scan using file_in_flight on files  (cost=0.42..62857.39 
rows=158330 width=15) (actual time=0.055..202.732 rows=166696 loops=1)

   Index Cond: ((state)::text = 'processing'::text)
 Planning time: 24.203 ms
 Execution time: 208.926 ms
(4 rows)


This is with each version loading exactly the same dataset (generated by 
the attached scripty). Obviously this is a vast simplification of what 
Josh is looking at - but it is (hopefully) interesting that these later 
versions are doing so much better...


Cheers

Mark



gendata.pl
Description: Perl program

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Strange choice of general index over partial index

2015-01-15 Thread Tom Lane
Mark Kirkwood  writes:
> This is with each version loading exactly the same dataset (generated by 
> the attached scripty). Obviously this is a vast simplification of what 
> Josh is looking at - but it is (hopefully) interesting that these later 
> versions are doing so much better...

Actually, what I see when using this dataset is that both the estimated
cost and the actual runtime of the query are within a percent or so of
being the same when using either index.  (Try forcing it to use the
non-preferred index by dropping the preferred one, and you'll see what
I mean.)  The absolute magnitude of the cost estimate varies across
versions, but not the fact that we're getting about the same estimate
for both indexes.

I suspect the same may be true for Josh's real-world database, meaning
that the index choice is depending on phase-of-the-moon factors like
which index has the lower OID, which is doubtless contributing to
his frustration :-(

I think that the real key to this problem lies in the index bloat pattern,
which might be quite a bit different between the two indexes.  This might
mean traversing many more index leaf pages in one case than the other,
which would account for the difference in real runtimes that he's seeing
and I'm not.  I don't recall at the moment whether 9.2's cost estimation
rules would do a good job of accounting for such effects.  (And even if
it's trying, it'd be working from an average-case estimate, which might
not have much to do with reality for this specific query.)

regards, tom lane


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Strange choice of general index over partial index

2015-01-15 Thread Mark Kirkwood

On 16/01/15 15:32, Mark Kirkwood wrote:

On 16/01/15 13:37, Mark Kirkwood wrote:

On 16/01/15 11:30, Josh Berkus wrote:

This is an obfuscation and mock up, but:

table files (
id serial pk,
filename text not null,
state varchar(20) not null
... 18 more columns
)

index file_state on (state)
(35GB in size)
index file_in_flight_state (state) where state in (
'waiting','assigning', 'processing' )
(600MB in size)
... 10 more indexes

More important facts:
* state = 'done' 95% of the time.  thereform the partial index
represents only 5% of the table
* all indexes and the table are very bloated
* server has 128GB RAM
* Version 9.2.

Given this setup, I would expect the planner to *always* choose
file_in_flight_state over file_state for this query:

SELECT id, filename FROM files WHERE state = 'waiting';

... and yet it keeps selecting file_state based on extremely small
changes to the stats.   This is important because the same query, using
file_state, is 20X to 50X slower, because that index frequently gets
pushed out of memory.

What am I missing?  Or is this potentially a planner bug for costing?



Are you seeing a bitmapscan access plan? If so see if disabling it gets
you a plan on the files_in_flight index. I'm seeing this scenario with a
fake/generated dataset a bit like yours in 9.2 (9.5 uses the
files_in_flight w/o any coercing).



FWIW: For me 9.2 and 9.3 (default config) generate plans like:
state=# EXPLAIN ANALYZE
SELECT id, filename
FROM files
WHERE state = 'processing';
 QUERY PLAN
--

  Bitmap Heap Scan on files  (cost=3102.02..89228.68 rows=164333
width=15) (actual time=26.629..803.507 rows=166696 loops=1)
Recheck Cond: ((state)::text = 'processing'::text)
Rows Removed by Index Recheck: 7714304
->  Bitmap Index Scan on file_state  (cost=0.00..3060.93 rows=164333
width=0) (actual time=25.682..25.682 rows=166696 loops=1)
  Index Cond: ((state)::text = 'processing'::text)
  Total runtime: 808.662 ms
(6 rows)


whereas 9.4 and 9.5 get:

   QUERY PLAN

---

  Index Scan using file_in_flight on files  (cost=0.42..62857.39
rows=158330 width=15) (actual time=0.055..202.732 rows=166696 loops=1)
Index Cond: ((state)::text = 'processing'::text)
  Planning time: 24.203 ms
  Execution time: 208.926 ms
(4 rows)


This is with each version loading exactly the same dataset (generated by
the attached scripty). Obviously this is a vast simplification of what
Josh is looking at - but it is (hopefully) interesting that these later
versions are doing so much better...



A bit more poking about shows that the major factor (which this fake 
dataset anyway) is the default for effective_cache_size (changes from 
128MB to 4GB in 9.4). Increasing this makes 9.2 start using the 
files_in_flight index in a plain index scan too.


Josh - might be worth experimenting with this parameter.

regards

Mark



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Strange choice of general index over partial index

2015-01-15 Thread Mark Kirkwood

On 16/01/15 16:06, Mark Kirkwood wrote:


A bit more poking about shows that the major factor (which this fake
dataset anyway) is the default for effective_cache_size (changes from
128MB to 4GB in 9.4). Increasing this makes 9.2 start using the
files_in_flight index in a plain index scan too.



Arrg - misread the planner outputin 9.2 what changes is a plan that 
uses an index scan on the *file_state* index (not 
files_in_flight)...which appears much faster than the bitmap scan on 
file_state. Apologies for the confusion.


I'm thinking that I'm seeing the effect Tom has just mentioned.

regards

Mark




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Strange choice of general index over partial index

2015-01-15 Thread Josh Berkus
On 01/16/2015 04:17 PM, Mark Kirkwood wrote:
> On 16/01/15 16:06, Mark Kirkwood wrote:
> 
>> A bit more poking about shows that the major factor (which this fake
>> dataset anyway) is the default for effective_cache_size (changes from
>> 128MB to 4GB in 9.4). Increasing this makes 9.2 start using the
>> files_in_flight index in a plain index scan too.
>>
> 
> Arrg - misread the planner outputin 9.2 what changes is a plan that
> uses an index scan on the *file_state* index (not
> files_in_flight)...which appears much faster than the bitmap scan on
> file_state. Apologies for the confusion.
> 
> I'm thinking that I'm seeing the effect Tom has just mentioned.

It's not using a bitmapscan in either case; it's a straight indexscan.


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] shared_buffers vs Linux file cache

2015-01-15 Thread Huan Ruan
Thanks very much, Glyn, Jeff, and Tom. That was very clearly explained.

A related case, see the following top dump. The Postgres process is using
87g residential memory, which I thought was the physical memory consumed by
a process that can't be shared with others. While, the free+cached is about
155gb. But, (87 + 155) is bigger than the total available 198g RAM. Does
this mean some of the residential memory used by Postgres is actually
shareable to others?


>> Mem:  198311880k total, 183836408k used, 14475472k free, 8388k buffers
>> Swap:  4194300k total,   314284k used,  3880016k free, 141105408k cached
>>
>>   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+ COMMAND
>> 15338 postgres  20   0 97.9g  87g  87g S  0.3 46.4  21:47.44
>> postgres: checkpointer process
>> 27473 postgres  20   0 98.1g  29g  29g S  0.0 15.8   2:14.93
>> postgres:  idle
>>  4710 postgres  20   0 98.1g  24g  23g S  0.0 12.7   1:17.41
>> postgres:  idle

>> 26587 postgres 20 0 98.0g 15g 15g S 0.0 8.0 1:21.24


>
>


Re: [PERFORM] Strange choice of general index over partial index

2015-01-15 Thread Mark Kirkwood

On 16/01/15 16:28, Josh Berkus wrote:

On 01/16/2015 04:17 PM, Mark Kirkwood wrote:

On 16/01/15 16:06, Mark Kirkwood wrote:


A bit more poking about shows that the major factor (which this fake
dataset anyway) is the default for effective_cache_size (changes from
128MB to 4GB in 9.4). Increasing this makes 9.2 start using the
files_in_flight index in a plain index scan too.



Arrg - misread the planner outputin 9.2 what changes is a plan that
uses an index scan on the *file_state* index (not
files_in_flight)...which appears much faster than the bitmap scan on
file_state. Apologies for the confusion.

I'm thinking that I'm seeing the effect Tom has just mentioned.


It's not using a bitmapscan in either case; it's a straight indexscan.




Right, I suspect that bloating is possibly the significant factor then - 
can you REINDEX?


Cheers

Mark


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Strange choice of general index over partial index

2015-01-15 Thread Josh Berkus

> Right, I suspect that bloating is possibly the significant factor then -
> can you REINDEX?

Believe me, it's on the agenda.  Of course, this is on a server with 90%
saturated IO, so doing a repack is going to take some finessing.

BTW, effective_cache_size is set to 100GB.  So I suspect that it's the
other issue with Tom mentioned, which is that 9.2 really doesn't take
physical index size into account.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] shared_buffers vs Linux file cache

2015-01-15 Thread Roman Konoval
Huan,

Residential memory is part of the process memory that is now swapped and is
in RAM. This includes also memory shared with other processes so sum of RES
for all processes may be greater that total physical memory.

I recommend this article
http://www.depesz.com/2012/06/09/how-much-ram-is-postgresql-using/ to get
better understanding how linux manages memory.

Regards,
Roman

On Fri, Jan 16, 2015 at 5:32 AM, Huan Ruan  wrote:

> Thanks very much, Glyn, Jeff, and Tom. That was very clearly explained.
>
> A related case, see the following top dump. The Postgres process is using
> 87g residential memory, which I thought was the physical memory consumed by
> a process that can't be shared with others. While, the free+cached is about
> 155gb. But, (87 + 155) is bigger than the total available 198g RAM. Does
> this mean some of the residential memory used by Postgres is actually
> shareable to others?
>
>
> >> Mem:  198311880k total, 183836408k used, 14475472k free, 8388k buffers
> >> Swap:  4194300k total,   314284k used,  3880016k free, 141105408k cached
> >>
> >>   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+ COMMAND
> >> 15338 postgres  20   0 97.9g  87g  87g S  0.3 46.4  21:47.44
> >> postgres: checkpointer process
> >> 27473 postgres  20   0 98.1g  29g  29g S  0.0 15.8   2:14.93
> >> postgres:  idle
> >>  4710 postgres  20   0 98.1g  24g  23g S  0.0 12.7   1:17.41
> >> postgres:  idle
>
> >> 26587 postgres 20 0 98.0g 15g 15g S 0.0 8.0 1:21.24
>
>
>>
>>
>
>
>