Re: Regarding brin_index on required column of the table

2018-09-24 Thread Durgamahesh Manne
Thank you all very much for this information

On Sat, Sep 22, 2018 at 12:38 AM Alban Hertroys  wrote:

>
>
> > On 21 Sep 2018, at 17:49, Durgamahesh Manne 
> wrote:
> >
> > 
>
> Considering how hard you try to get rid of duplicates, I'm quite convinced
> that you're at least short a few join conditions. Getting rid of duplicates
> early has the added benefit of having to aggregate fewer rows, which should
> drastically improve the performance of this query.
>
> In addition, your approach of using both distinct and group by raises a
> red flag that you're fighting symptoms (most likely: duplicate results)
> without understanding their causes. I can't think of a single situation
> where both are justified in the same (sub-)query.
>
> Alban Hertroys
> --
> If you can't see the forest for the trees,
> cut the trees and you'll find there is no forest.
>
>


Mysteriously varying index scan costs

2018-09-24 Thread Laurenz Albe
Last week I encountered the following at a customer site on PostgreSQL 9.6,
and I cannot explain it.

The first run gave me this:

 Index Scan using device_outbound_messages_status on device_outbound_messages  (cost=0.43..20.46 rows=97 width=128) (actual time=34.021..35.545 rows=133 loops=1)
   Index Cond: ((status)::text = ANY ('{WAITING_FOR_TX,WAITING_FOR_IMMEDIATE_TX}'::text[]))
   Buffers: shared hit=74917 dirtied=707

Subsequent runs of the same query gave me:

 Index Scan using device_outbound_messages_status on device_outbound_messages  (cost=0.43..20.46 rows=97 width=128) (actual time=2.129..3.907 rows=133 loops=1)
   Index Cond: ((status)::text = ANY ('{WAITING_FOR_TX,WAITING_FOR_IMMEDIATE_TX}'::text[]))
   Buffers: shared hit=1185

There were no concurrent changes to the data (test environment).
This was part of a bigger statement.

I understand that some pages can be dirtied because the table fetches
cause hint bits to be set.

But how can it be that the first run has to touch 74917 blocks,
while the second run only needs to touch 1185?

Thanks for any ideas!

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com




Re: Mysteriously varying index scan costs

2018-09-24 Thread Pavan Deolasee
On Mon, Sep 24, 2018 at 2:20 PM, Laurenz Albe 
wrote:

>
>
> But how can it be that the first run has to touch 74917 blocks,
> while the second run only needs to touch 1185?
>
>
The first index scan may have killed lots of index tuples.

Thanks,
Pavan

-- 
 Pavan Deolasee   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


pgAdmin v4: The application server could not be contacted.

2018-09-24 Thread Richard Nielsen

Hi,

I'm new to PostgreSQL and to this list! I have installed version 10 
64-bit (on Windows 7), and I also used Stackbuilder to install the ODBC 
drivers. The SQL shell allows me to log on to the server as postgres, but 
when starting pgAdmin I get: "The application server could not be 
contacted." This is a fatal error, of course. Any hints, tips or "oh man, 
you should have..." comments will be greatly appreciated.


/Richard




RE: PostgreSQl, PHP and IIS

2018-09-24 Thread Mark Williams
Hi thanks,

Sadly PHP and IIS isn't the issue. It’s when you bring Postgres into the fold 
that I am getting a problem. But thanks.

Mark

__

-Original Message-
From: Adrian Klaver  
Sent: 23 September 2018 15:36
To: Mark Williams ; pgsql-general@lists.postgresql.org; 
dan...@manitou-mail.org
Subject: Re: PostgreSQl, PHP and IIS

On 9/23/18 3:54 AM, Mark Williams wrote:
> Thanks for the suggestions.
> 
> I don't think there are any dependencies in the lib folder, but I added it 
> anyway and it made no difference.
> 
> I tried "Process Explorer" (having previously tried "Dependency Walker"). PE 
> revealed one or two more dependencies I was previously unaware of. I added 
> these to the root PHP folder, but that got me no further. I have even added 
> Postgres.EXE to the PHP root folder.
> 
> In desperation I tried adding all the Postgres files to the PHP Ext folder. 
> Unsurprisingly, that did not yield results either.
> 
> Does anyone know of a step-by-step guide for installing PHP and Postgres on 
> Windows Server, so I can double-check whether there is anything I have 
> missed? I can only assume I have made some simple mistake, as it surely 
> can't be this difficult to set up.

Maybe this?:

https://docs.microsoft.com/en-us/iis/application-frameworks/install-and-configure-php-on-iis/install-and-configure-php

> 
> Cheers,
> 
> Mark
> 
> __
> 
> -Original Message-
> From: Adrian Klaver 
> Sent: 22 September 2018 16:45
> To: Mark Williams ; pgsql-general@lists.postgresql.org
> Subject: Re: PostgreSQl, PHP and IIS
> 
> On 9/22/18 3:06 AM, Mark Williams wrote:
>> "Fatal error: Call to undefined function pg_connect".
>>
>> Obviously, that is to be expected if pgsql module is not loading.
> 
> Went back over your previous post and saw:
> 
> "I have copied the version of libpq.dll from the lib folder of the postgresql 
> installation to the root folder of the PHP installation.
> 
> I have added the bin folder of the postgresql installation to the Windows 
> system search path.
> "
> 
> I would say add the lib/ of the Postgres install to the search path.
> 
>>



-- 
Adrian Klaver
adrian.kla...@aklaver.com




Re: Mysteriously varying index scan costs

2018-09-24 Thread Laurenz Albe
Pavan Deolasee wrote:
> On Mon, Sep 24, 2018 at 2:20 PM, Laurenz Albe  
> wrote:
> > 
> > But how can it be that the first run has to touch 74917 blocks,
> > while the second run only needs to touch 1185?
> > 
> 
> The first index scan may have killed lots of index tuples.

So the first index scan visited lots of table tuples, found them
dead, and set the LP_DEAD flag on the corresponding index items
so that subsequent index scans could ignore them, right?

Thanks, that must be the correct explanation.  It correlates nicely
with the dirtied pages, too.
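
Just for the archives: a rough way to reproduce the effect in isolation, on a
scratch table with made-up names (nothing from the customer system), would be
something like this, run before autovacuum gets a chance to clean up:

CREATE TABLE scratch (id integer, status text);
INSERT INTO scratch
SELECT i, CASE WHEN i % 10000 = 0 THEN 'WAITING_FOR_TX' ELSE 'DONE' END
FROM generate_series(1, 1000000) AS i;
CREATE INDEX ON scratch (status);
ANALYZE scratch;

-- leave dead tuples scattered across the heap
DELETE FROM scratch WHERE status = 'WAITING_FOR_TX';

-- first run: the index scan visits the dead heap tuples and marks the
-- corresponding index entries as killed (LP_DEAD), dirtying pages
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM scratch WHERE status = 'WAITING_FOR_TX';

-- second run: the killed index entries are skipped, so "Buffers: shared hit"
-- drops sharply
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM scratch WHERE status = 'WAITING_FOR_TX';

If the planner picks a bitmap scan instead of a plain index scan, the index
entries are not killed, so the small row estimate matters here.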

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com




Re: pgAdmin v4: The application server could not be contacted.

2018-09-24 Thread Adrian Klaver

On 9/24/18 2:24 AM, Richard Nielsen wrote:

Hi,

I'm new to PostgreSQL and to this list! I have installed version 10 
64-bit (on Windows 7), and I also used Stackbuilder to install the ODBC 
drivers. The SQL shell allows me to log on to the server as postgres, but 
when starting pgAdmin I get: "The application server could not be 
contacted." This is a fatal error, of course. Any hints, tips or "oh man, 
you should have..." comments will be greatly appreciated.


From the mention of Stackbuilder I am going to assume you have 
installed using the EDB installer. pgAdmin4 runs as a client/server app 
where the server is an HTTP server that runs the application you then 
connect to with a browser. From your error message it looks like the 
server is either not running, or it is running and the client (browser) 
cannot find it. I am not sure how this is set up on Windows. My suggestion 
would be to ask here:


https://postgresrocks.enterprisedb.com/t5/EDB-Postgres/bd-p/EDBPostgres



/Richard






--
Adrian Klaver
adrian.kla...@aklaver.com



Re: PostgreSQl, PHP and IIS

2018-09-24 Thread rob stone
Hi

On Mon, 2018-09-24 at 10:46 +0100, Mark Williams wrote:
> Hi thanks,
> 
> Sadly PHP and IIS isn't the issue. It’s when you bring Postgres into
> the fold that I am getting a problem. But thanks.
> 
> Mark
> 
> 
I'm not a Windows user and know little about setting up Postgres with
IIS, but you might find something useful in the users' comments at this
link:

http://www.php.net/manual/en/pgsql.setup.php


Cheers,
Rob






Re: Help to understand Actual Rows vs Plan Rows from the query planner output

2018-09-24 Thread Arup Rakshit
Hello Thomas,

Thanks for the link. I read the documentation you linked; part of it I 
understood, and the rest went over my head. I probably need to read it multiple 
times to understand what is going on. I am learning how indexing works in a 
DBMS; mostly I have understood B-tree so far. I am an application developer, and 
as such I think I need to know which columns should be indexed and what kind of 
index to apply in each case. Most of the time when I see a slow query, people 
ask me to run EXPLAIN to see the plan, and the EXPLAIN output shows a lot of 
data about the query. So my question is: which parts of the plan output should I 
read to figure out the reason for the slowness, and what can be done to improve 
it? What are the basic things I should know about it? I think I don't need to 
know the internal math for this, am I right? The query in this post is not a 
slow query; it is something I ran to see how index merging happens. I am asking 
generally. Can you give me some direction on this, so that I can build up a 
foundation on this subject.


Also, what are the best resources to learn about GiST and GIN indexes,
something that teaches them from the ground up?


Thanks,

Arup Rakshit
a...@zeit.io



> On 24-Sep-2018, at 4:11 AM, Tomas Vondra  wrote:
> 
> 
> 
> On 09/23/2018 10:21 PM, Arup Rakshit wrote:
>> Hello I have some questions related to the query plan output about the
>> planned and actual rows. In the following example:
>> 
>> # explain (analyze true, costs true, format yaml) select * from users
>> where lower(city) = 'melanyfort' and lower(state) = 'ohio';
>>                                QUERY PLAN
>> ------------------------------------------------------------------------
>>  - Plan:
>>      Node Type: "Bitmap Heap Scan"
>>      Parallel Aware: false
>>      Relation Name: "users"
>>      Alias: "users"
>>      Startup Cost: 10.78
>>      Total Cost: 14.80
>>      Plan Rows: 1
>>      Plan Width: 73
>>      Actual Startup Time: 0.155
>>      Actual Total Time: 0.155
>>      Actual Rows: 0
>>      Actual Loops: 1
>>      Recheck Cond: "((lower((city)::text) = 'melanyfort'::text) AND (lower((state)::text) = 'ohio'::text))"
>>      Rows Removed by Index Recheck: 0
>>      Exact Heap Blocks: 0
>>      Lossy Heap Blocks: 0
>>      Plans:
>>        - Node Type: "BitmapAnd"
>>          Parent Relationship: "Outer"
>>          Parallel Aware: false
>>          Startup Cost: 10.78
>>          Total Cost: 10.78
>>          Plan Rows: 1
>>          Plan Width: 0
>>          Actual Startup Time: 0.153
>>          Actual Total Time: 0.153
>>          Actual Rows: 0

Weird procedure question

2018-09-24 Thread digimer

Hi all,

  I've got an interesting use case that I am stuck on. It's a bit of a 
complicated environment, but I'll try to keep it simple.


  In short: I have a history schema with tables that match the 
public schema, plus one 'history_id' column that holds a simple sequential 
bigserial value. Then I have a procedure and trigger that, on UPDATE or 
INSERT, copy the data to history. An example use case is that I can 
UPDATE a sensor value in the public table and it also INSERTs the data 
into history. So public shows just the most recent values, but I can see 
changes over time in the history schema.


  I have built my system to support writing to one or more DBs. I keep 
a list of connected DBs and send INSERT/UPDATE calls to a method that 
then runs the UPDATE/INSERT against all connected databases, as a form 
of redundancy. This all works fine.


  The problem I've hit is that the 'history_id' differs between the 
various databases. So I want to switch this to 'history_uuid' and use 
UUIDs instead of bigserial.


  Now the question:

  Can I tell a produce to use a specific UUID?

  The idea is to generate a UUID for 'history_uuid' so that I have 
consistency across databases. Of course, if an UPDATE changes 
multiple rows, then I'll need to predefine multiple UUIDs. This is where 
things start to get really complicated, I think... Maybe I could pass an 
array of UUIDs? I don't care about finding out which UUID was used for which 
record, just that the same UUID is used for the same record when the 
procedure is (re)run on other DBs.


  The databases are not clustered, on purpose. I've been trying to 
handle all the HA stuff in my application for various reasons.


If it helps, here is an example pair of tables, plus the procedure and the 
trigger I currently use:



CREATE TABLE host_variable (
    host_variable_uuid         uuid                        not null    primary key,
    host_variable_host_uuid    uuid                        not null,
    host_variable_name         text                        not null,
    host_variable_value        text                        not null,
    modified_date              timestamp with time zone    not null
);
ALTER TABLE host_variable OWNER TO admin;

CREATE TABLE history.host_variable (
    history_id                 bigserial,
    host_variable_uuid         uuid,
    host_variable_host_uuid    uuid,
    host_variable_name         text,
    host_variable_value        text,
    modified_date              timestamp with time zone    not null
);
ALTER TABLE history.host_variable OWNER TO admin;

CREATE FUNCTION history_host_variable() RETURNS trigger
AS $$
DECLARE
    history_host_variable RECORD;
BEGIN
    SELECT INTO history_host_variable * FROM host_variable
     WHERE host_variable_uuid = new.host_variable_uuid;

    INSERT INTO history.host_variable
        (host_variable_uuid,
         host_variable_host_uuid,
         host_variable_name,
         host_variable_value,
         modified_date)
    VALUES
        (history_host_variable.host_variable_uuid,
         history_host_variable.host_variable_host_uuid,
         history_host_variable.host_variable_name,
         history_host_variable.host_variable_value,
         history_host_variable.modified_date);
    RETURN NULL;
END;
$$
LANGUAGE plpgsql;
ALTER FUNCTION history_host_variable() OWNER TO admin;

CREATE TRIGGER trigger_host_variable
    AFTER INSERT OR UPDATE ON host_variable
    FOR EACH ROW EXECUTE PROCEDURE history_host_variable();


  I know this might sound odd, but I didn't want to complicate things 
with how my system works. However, if it would help solve the problem, 
I'm happy to dig into more detail.


  Thanks!




Re: Weird procedure question

2018-09-24 Thread digimer

On 2018-09-25 1:22 a.m., digimer wrote:

  Can I tell a produce to use a specific UUID?


s/produce/procedure/




Re: Weird procedure question

2018-09-24 Thread James Keener
Do you need a single field for the pk or can you just make it the
(original_table_pk, modified_time)? Alternatively, you could generate a
uuid v3 from the (original_table_pk, modified_time) using something like
uuid_generate_v3(uuid_nil(), original_table_pk || ":" || modified_time)?
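
A sketch of how that could be wired into the trigger from the earlier post
(this assumes the uuid-ossp extension, reuses the host_variable names, and
reads the values straight from NEW rather than re-selecting the row; it is an
illustration only, not tested code):

CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

ALTER TABLE history.host_variable ADD COLUMN history_uuid uuid;

CREATE OR REPLACE FUNCTION history_host_variable() RETURNS trigger
AS $$
BEGIN
    INSERT INTO history.host_variable
        (history_uuid,
         host_variable_uuid,
         host_variable_host_uuid,
         host_variable_name,
         host_variable_value,
         modified_date)
    VALUES
        -- derive the key from values that are identical on every database;
        -- epoch avoids any difference in timestamp text formatting
        (uuid_generate_v3(uuid_nil(),
                          new.host_variable_uuid::text || ':' ||
                          extract(epoch from new.modified_date)::text),
         new.host_variable_uuid,
         new.host_variable_host_uuid,
         new.host_variable_name,
         new.host_variable_value,
         new.modified_date);
    RETURN NULL;
END;
$$
LANGUAGE plpgsql;

As long as the application supplies the same modified_date on every database,
re-running the trigger elsewhere produces the same history_uuid.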


Re: Weird procedure question

2018-09-24 Thread digimer

On 2018-09-25 1:33 a.m., James Keener wrote:
Do you need a single field for the pk or can you just make it the 
(original_table_pk, modified_time)? Alternatively, you could generate 
a uuid v3 from the (original_table_pk, modified_time) using something 
like uuid_generate_v3(uuid_nil(), original_table_pk || ":" || 
modified_time)?


I need to preset the modified_time, I can't use now() or else the value 
would differ between databases. Also, unless I am missing something, 
uuid_generate_v3() would generate a different UUID per trigger of the 
procedure, so I'd end up with different history_uuids on each database 
that I ran the query against.


If I am missing something (and entirely possible I am), please hit me 
with a clue stick. :)


digimer




Re: Weird procedure question

2018-09-24 Thread James Keener
v3 UUIDs are basically MD5 hashes (v5 is sha1?). So for the same input
you'll always get the same hash.
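
For example (illustrative only, made-up values; needs the uuid-ossp
extension), both of these return the identical UUID on any server:

SELECT uuid_generate_v3(uuid_nil(), 'some-row-uuid:2018-09-25 01:33:00');
SELECT uuid_generate_v3(uuid_nil(), 'some-row-uuid:2018-09-25 01:33:00');  -- same result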

I had assumed the modified time would be the same; if that's not, then I'm
not sure and my gut tells me this becomes A Really Hard Problem™.

Jim

On Tue, Sep 25, 2018 at 1:38 AM digimer  wrote:

> On 2018-09-25 1:33 a.m., James Keener wrote:
> > Do you need a single field for the pk or can you just make it the
> > (original_table_pk, modified_time)? Alternatively, you could generate
> > a uuid v3 from the (original_table_pk, modified_time) using something
> > like uuid_generate_v3(uuid_nil(), original_table_pk || ":" ||
> > modified_time)?
>
> I need to preset the modified_time, I can't use now() or else the value
> would differ between databases. Also, unless I am missing something,
> uuid_generate_v3() would generate a different UUID per trigger of the
> procedure, so I'd end up with different history_uuids on each database
> that I ran the query against.
>
> If I am missing something (and entirely possible I am), please hit me
> with a clue stick. :)
>
> digimer
>
>


Re: Weird procedure question

2018-09-24 Thread James Keener
Also, the modified time doesn't need to be the current time. If it starts as
"null" and is set on the first update and on all subsequent updates, then the
pre-update modified time could be used to help key the history pk.

Jim

On Tue, Sep 25, 2018 at 1:45 AM James Keener  wrote:

> v3 UUIDs are basically MD5 hashes (v5 is sha1?). So for the same input
> you'll always get the same hash.
>
> I had assumed the modified time would be the same; if that's not, then I'm
> not sure and my gut tells me this becomes A Really Hard Problem™.
>
> Jim
>
> On Tue, Sep 25, 2018 at 1:38 AM digimer  wrote:
>
>> On 2018-09-25 1:33 a.m., James Keener wrote:
>> > Do you need a single field for the pk or can you just make it the
>> > (original_table_pk, modified_time)? Alternatively, you could generate
>> > a uuid v3 from the (original_table_pk, modified_time) using something
>> > like uuid_generate_v3(uuid_nil(), original_table_pk || ":" ||
>> > modified_time)?
>>
>> I need to preset the modified_time, I can't use now() or else the value
>> would differ between databases. Also, unless I am missing something,
>> uuid_generate_v3() would generate a different UUID per trigger of the
>> procedure, so I'd end up with different history_uuids on each database
>> that I ran the query against.
>>
>> If I am missing something (and entirely possible I am), please hit me
>> with a clue stick. :)
>>
>> digimer
>>
>>


Re: Weird procedure question

2018-09-24 Thread digimer
Oh, this is a very interesting approach! I didn't realize any UUIDs 
could be created in a predictable way. Thank you, this might be what I need.


digimer

On 2018-09-25 1:47 a.m., James Keener wrote:
Also, the modified time doesn't need to be the current time. If it starts 
as "null" and is set on the first update and on all subsequent updates, 
then the pre-update modified time could be used to help key the history pk.


Jim

On Tue, Sep 25, 2018 at 1:45 AM James Keener  wrote:


v3 UUIDs are basically MD5 hashes (v5 is sha1?). So for the same
input you'll always get the same hash.

I had assumed the modified time would be the same; if that's not,
then I'm not sure and my gut tells me this becomes A Really Hard
Problem™.

Jim

On Tue, Sep 25, 2018 at 1:38 AM digimer  wrote:

On 2018-09-25 1:33 a.m., James Keener wrote:
> Do you need a single field for the pk or can you just make it the
> (original_table_pk, modified_time)? Alternatively, you could generate
> a uuid v3 from the (original_table_pk, modified_time) using something
> like uuid_generate_v3(uuid_nil(), original_table_pk || ":" ||
> modified_time)?

I need to preset the modified_time, I can't use now() or else the value
would differ between databases. Also, unless I am missing something,
uuid_generate_v3() would generate a different UUID per trigger of the
procedure, so I'd end up with different history_uuids on each database
that I ran the query against.

If I am missing something (and entirely possible I am), please hit me
with a clue stick. :)

digimer



Re: Help to understand Actual Rows vs Plan Rows from the query planner output

2018-09-24 Thread Laurenz Albe
Arup Rakshit wrote:
> Thanks for the link. I read the documentation you linked; part of it I
> understood, and the rest went over my head. I probably need to read it
> multiple times to understand what is going on. I am learning how indexing
> works in a DBMS; mostly I have understood B-tree so far. I am an application
> developer, and as such I think I need to know which columns should be
> indexed and what kind of index to apply in each case.

That is true; a truth that many developers unfortunately ignore.

> Most of the time when I see a slow query, people ask me to run EXPLAIN to
> see the plan, and the EXPLAIN output shows a lot of data about the query.
> So my question is: which parts of the plan output should I read to figure
> out the reason for the slowness, and what can be done to improve it? What
> are the basic things I should know about it? I think I don't need to know
> the internal math for this, am I right? The query in this post is not a
> slow query; it is something I ran to see how index merging happens. I am
> asking generally. Can you give me some direction on this, so that I can
> build up a foundation on this subject.

This is a broad topic, and I can only give you some hints.
In order to understand EXPLAIN output and to improve your query, you need
to know some of how the database is implemented.

You have to understand index scans, index only scans and bitmap index scans.
You have to understand nested loop, hash and merge joins.
You have to understand table statistics, dead tuples and table bloat.

The first things to look for in EXPLAIN (ANALYZE, BUFFERS) output are the
nodes where the time is spent, and the places where the estimated number of
rows diverges significantly from the actual number of rows.
The former are the spots where there is room for improvement, and the latter
is often the root cause of a bad plan choice.
Also, watch out for the nodes that touch a lot of blocks.
They can cause intermittent slow-down if the blocks are not in cache.
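
As a concrete starting point, with the query from your own post (purely
illustrative), you would run:

EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM users
WHERE lower(city) = 'melanyfort'
  AND lower(state) = 'ohio';

and then compare each node's estimated "rows" with its actual rows, see where
the actual time accumulates, and watch the "Buffers" lines for nodes that
touch many blocks.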

> Also, what are the best resources to learn about GiST and GIN indexes,
> something that teaches them from the ground up?

The documentation, and for what goes beyond that, the source.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com