When to store data that could be derived
Hi all

As I understand it, a general rule of thumb is that you should never create a physical column if the data could be derived from existing columns. A possible reason for breaking this rule is for performance reasons.

I have a situation where I am considering breaking the rule, but I am not experienced enough in SQL to know if my reason is valid. I would appreciate it if someone could glance at my 'before' and 'after' scenarios and see if, from a 'gut-feel' point of view, I should proceed.

I have a VIEW constructed as follows -

CREATE VIEW view_name AS
    [select statement 1]
    UNION ALL
    [select statement 2]
    etc.

This is one of the select statements. I will give the 'after' scenario first -

SELECT 'arec' AS tran_type, a.row_id AS tran_row_id, a.tran_number AS tran_number,
    a.cust_row_id AS cust_row_id,
    CASE WHEN a.tran_type = 'ar_rec' THEN y.tran_date WHEN a.tran_type = 'cb_rec' THEN w.tran_date END AS tran_date,
    CASE WHEN a.tran_type = 'ar_rec' THEN y.text WHEN a.tran_type = 'cb_rec' THEN w.text END AS text,
    0 - a.arec_cust AS amount_cust,
    0 - a.arec_local AS amount_local
FROM ar_rec_subtran a
LEFT JOIN ar_tran_rec_det z ON z.row_id = a.tran_det_row_id
LEFT JOIN ar_tran_rec y ON y.row_id = z.tran_row_id
LEFT JOIN cb_tran_rec_det x ON x.row_id = a.tran_det_row_id
LEFT JOIN cb_tran_rec w ON w.row_id = x.tran_row_id
WHERE CASE WHEN a.tran_type = 'ar_rec' THEN y.posted WHEN a.tran_type = 'cb_rec' THEN w.posted END = '1'

The two columns a.arec_cust and a.arec_local *can* be derived from other columns, and in fact that is how it is working at the moment, so here is the 'before' scenario -

SELECT 'arec' AS tran_type, a.row_id AS tran_row_id, a.tran_number AS tran_number,
    a.cust_row_id AS cust_row_id,
    CASE WHEN a.tran_type = 'ar_rec' THEN y.tran_date WHEN a.tran_type = 'cb_rec' THEN w.tran_date END AS tran_date,
    CASE WHEN a.tran_type = 'ar_rec' THEN y.text WHEN a.tran_type = 'cb_rec' THEN w.text END AS text,
    ROUND(0 - (ROUND(a.arec_amount / CASE WHEN a.tran_type = 'ar_rec' THEN y.tran_exch_rate WHEN a.tran_type = 'cb_rec' THEN w.tran_exch_rate END * a.cust_exch_rate, u.scale)), 2) AS amount_cust,
    ROUND(0 - (ROUND(a.arec_amount / CASE WHEN a.tran_type = 'ar_rec' THEN y.tran_exch_rate WHEN a.tran_type = 'cb_rec' THEN w.tran_exch_rate END, s.scale)), 2) AS amount_local
FROM ar_rec_subtran a
LEFT JOIN ar_tran_rec_det z ON z.row_id = a.tran_det_row_id
LEFT JOIN ar_tran_rec y ON y.row_id = z.tran_row_id
LEFT JOIN cb_tran_rec_det x ON x.row_id = a.tran_det_row_id
LEFT JOIN cb_tran_rec w ON w.row_id = x.tran_row_id
LEFT JOIN ar_customers v ON v.row_id = a.cust_row_id
LEFT JOIN adm_currencies u ON u.row_id = v.currency_id
LEFT JOIN adm_params t ON t.row_id = 1
LEFT JOIN adm_currencies s ON s.row_id = t.local_curr_id
WHERE CASE WHEN a.tran_type = 'ar_rec' THEN y.posted WHEN a.tran_type = 'cb_rec' THEN w.posted END = '1'

As you can see, complexity has increased and there are four additional JOINs. I am expecting the VIEW to be used extensively for query purposes, and my gut-feel says that the second one is likely to lead to performance problems in a system with a lot of data and a lot of users.

I am not looking for an answer - I know that I should create dummy data and run some timing tests. I was just wondering if someone more experienced would wince when they look at the second SELECT, or if they would shrug and think that it looks fine.

Any input will be appreciated.

Frank Millman
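If the 'after' route is taken, the usual concern is keeping the stored columns consistent with their source values. Below is a minimal sketch of one way to do that with a BEFORE trigger. The column names arec_cust, arec_local, arec_amount and cust_exch_rate come from the query above, but the tran_exch_rate column on the same row is an assumption (in the real schema the rate lives on the joined transaction tables and would have to be looked up), so treat this as an illustration of the pattern rather than a drop-in definition.

-- Sketch only: keep the stored derived columns in step with their sources.
CREATE OR REPLACE FUNCTION ar_rec_subtran_set_amounts() RETURNS trigger AS $$
BEGIN
    -- NEW.tran_exch_rate is a hypothetical column; in the real schema the
    -- rate would be fetched from ar_tran_rec / cb_tran_rec instead.
    NEW.arec_cust  := round(NEW.arec_amount / NEW.tran_exch_rate * NEW.cust_exch_rate, 2);
    NEW.arec_local := round(NEW.arec_amount / NEW.tran_exch_rate, 2);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER ar_rec_subtran_amounts
    BEFORE INSERT OR UPDATE ON ar_rec_subtran
    FOR EACH ROW EXECUTE FUNCTION ar_rec_subtran_set_amounts();
    -- EXECUTE FUNCTION needs PostgreSQL 11+; use EXECUTE PROCEDURE on older versions

The point of doing it in a trigger (or equivalently in the application layer, as Frank later describes) is that the derived values are written once, at posting time, and the reporting view no longer needs the extra joins.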
Re: When to store data that could be derived
On 2019-03-24 9:25 AM, Ron wrote: On 3/24/19 1:42 AM, Frank wrote: Hi all As I understand it, a general rule of thumb is that you should never create a physical column if the data could be derived from existing columns. A possible reason for breaking this rule is for performance reasons. I have a situation where I am considering breaking the rule, but I am not experienced enough in SQL to know if my reason is valid. I would appreciate it if someone could glance at my 'before' and 'after' scenarios and see if, from a 'gut-feel' point of view, I should proceed. [snip] Sure the second query joins a lot of tables, but is pretty straightforward. What REALLY worries me is whether or not the query optimiser would look at the WHERE CASE, run away screaming and then make it use sequential scans. Thus, even query #1 would be slow. I had not realised that. I hope someone else chimes in on this. Is this a historical data set that's never updated, or current data that's constantly added to? It is the latter - current data constantly added to. Frank
Re: When to store data that could be derived
On 2019-03-24 11:11 AM, Tony Shelver wrote:
Not the answer you are looking for, but... I'd suggest trying to create a non-trivial set of dummy data to test your assumptions before deciding on a route. It's saved my (professional) life a few times over the years when dealing with untested designs and new (to us) technology.

Some years ago we were implementing an identity management system for a large US bank, with SQL Server as the store, with a planned integration to the id / access / permissions of some 300+ systems, targeting 30k plus users. Requirements kept changing as we added new systems to the initial mix, which the Id management package couldn't handle out of the box, so we had to implement a custom design. We had to choose between 2 database designs, one being fully 'normalized' (considering that everything was an 'object') and one where we made some assumptions and fixed some table structures in the interest of performance. Eventually we spent a few days adding non-trivial amounts of test data to the proposed designs and it quickly became very apparent that option 1 was unworkable once we got beyond 10 systems or so.

Good advice - much appreciated.

Frank
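For anyone wanting to follow that advice, a minimal sketch of bulk-generating dummy rows with generate_series(). The table and column names here are illustrative only, not Frank's actual schema, and the distributions are deliberately crude:

-- Generate ~22000 invoices spread over 12 customers and ~6 months
INSERT INTO ar_tran_inv (cust_row_id, tran_date, amount, posted)
SELECT (g % 12) + 1,                          -- 12 customers
       date '2015-03-01' + (g % 180),         -- ~6 months of dates
       round((random() * 1000)::numeric, 2),  -- arbitrary amounts
       true
FROM generate_series(1, 22000) AS g;

ANALYZE ar_tran_inv;   -- refresh planner statistics after the bulk load

Running ANALYZE (or letting autovacuum catch up) after loading matters: timing tests against freshly loaded tables with stale statistics can give misleading plans.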
Re: When to store data that could be derived
On 2019-03-24 2:41 PM, Peter J. Holzer wrote: On 2019-03-24 10:05:02 +0200, Frank wrote: Many thanks to Peter et al for their valuable insights. I have learned a lot. > So the important part here is not whether data is added, but whether > data is changed. Sure, new transactions are added all the time. But is > it expected that the data used to derive amount_cust and amount_local > (e.g. the exchange rate) is changed retroactively, and if it is should > the computed amount change? (I'm a bit worried about the join with the > customers table here - what happens when a customer moves their > headquarters to a country with a different currency?) I think I have got both of those covered. I store the exchange rates in physical columns on the transaction, so the compute expressions will always return the same values. I have separate tables for 'organisations' (O) and for 'customers' (C). C has a foreign key reference to O, and most static data such as addresses and contact details are stored on O. So if a customer moved, I would create a new C record with the new currency, and flag the old C record as inactive. They can happily co-exist, so receipts can be processed against the old C record until it is paid up. I have started doing some volume tests, and at this stage, for the kind of volumes I am concerned about, it looks as if performance is a non-issue. I generated about 22000 invoices and 22000 receipts, over 12 customers and 6 months. Invoices and receipts are stored in separate tables, and a VIEW presents them as a single table. Using the VIEW, I selected all transactions for a given customer for a given month. It returned 620 rows and (on my slow desktop computer) it took 20ms. I can live with that. I will generate some higher volumes overnight, and see if it makes a big difference. If you do not hear from me, you can consider it 'problem solved' :-) Again, thanks to all. Frank
Re: When to store data that could be derived
On 2019-03-25 4:06 PM, Ron wrote: On 3/25/19 8:15 AM, Frank wrote: It would be interesting to see what the query planner tries to do with this: WHERE CASE WHEN a.tran_type = 'ar_rec' THEN y.posted WHEN a.tran_type = 'cb_rec' THEN w.posted END = '1' I have attached the schema showing the full VIEW definition, and the result of the following EXPLAIN - EXPLAIN SELECT * FROM ccc.ar_trans WHERE cust_row_id = 4 AND tran_date BETWEEN '2015-06-01' AND '2015-06-30'. Because I have used 'WHERE tran_date' in the query, and tran_date is also derived from a CASE expression, I imagine that will also add some complication. I am running PostgreSQL 11.1 on Fedora 29. Frank CREATE OR REPLACE VIEW ccc.ar_trans AS SELECT 'ar_inv'::text AS tran_type, a.row_id AS tran_row_id, a.tran_number, a.cust_row_id, a.tran_date, a.text, a.inv_net_cust + a.inv_tax_cust AS amount_cust, a.inv_net_local + a.inv_tax_local AS amount_local FROM ccc.ar_tran_inv a WHERE a.posted = true UNION ALL SELECT 'ar_crn'::text AS tran_type, a.row_id AS tran_row_id, a.tran_number, a.cust_row_id, a.tran_date, a.text, 0::numeric - (a.crn_net_cust + a.crn_tax_cust) AS amount_cust, 0::numeric - (a.crn_net_local + a.crn_tax_local) AS amount_local FROM ccc.ar_tran_crn a WHERE a.posted = true UNION ALL SELECT 'arec'::text AS tran_type, a.row_id AS tran_row_id, a.tran_number, a.cust_row_id, CASE WHEN a.tran_type::text = 'ar_rec'::text THEN y.tran_date WHEN a.tran_type::text = 'cb_rec'::text THEN w.tran_date ELSE NULL::date END AS tran_date, CASE WHEN a.tran_type::text = 'ar_rec'::text THEN y.text WHEN a.tran_type::text = 'cb_rec'::text THEN w.text ELSE NULL::character varying END AS text, 0::numeric - a.arec_cust AS amount_cust, round(0::numeric - round(a.arec_amount / CASE WHEN a.tran_type::text = 'ar_rec'::text THEN y.tran_exch_rate WHEN a.tran_type::text = 'cb_rec'::text THEN w.tran_exch_rate ELSE NULL::numeric END, u.scale), 2) AS amount_local FROM ccc.ar_rec_subtran a LEFT JOIN ccc.ar_tran_rec_det z ON z.row_id = a.tran_det_row_id LEFT JOIN ccc.ar_tran_rec y ON y.row_id = z.tran_row_id LEFT JOIN ccc.cb_tran_rec_det x ON x.row_id = a.tran_det_row_id LEFT JOIN ccc.cb_tran_rec w ON w.row_id = x.tran_row_id LEFT JOIN ccc.adm_params v ON v.row_id = 1 LEFT JOIN ccc.adm_currencies u ON u.row_id = v.local_curr_id WHERE CASE WHEN a.tran_type::text = 'ar_rec'::text THEN y.posted WHEN a.tran_type::text = 'cb_rec'::text THEN w.posted ELSE NULL::boolean END = true UNION ALL SELECT 'ar_disc'::text AS tran_type, a.row_id AS tran_row_id, a.tran_number, a.cust_row_id, a.tran_date, a.text, 0::numeric - (a.disc_net_cust + a.disc_tax_cust) AS amount_cust, 0::numeric - (a.disc_net_local + a.disc_tax_local) AS amount_local FROM ccc.ar_tran_disc a WHERE a.posted = true EXPLAIN SELECT * FROM ccc.ar_trans WHERE cust_row_id = 4 AND tran_date BETWEEN '2015-06-01' AND '2015-06-30'. QUERY PLAN --- Append (cost=0.00..2545.77 rows=404 width=132) -> Seq Scan on ar_tran_inv a (cost=0.00..780.70 rows=300 width=125) Filter: (posted AND (tran_date >= '2015-06-01'::date) AND (tran_date <= '2015-06-30'::date) AND (cust_row_id = 4)) -> Seq Scan on ar_tran_crn a_1 (cost=0.00..14.91 rows=1 width=172) Filter: (posted AND (tran_date >= '2015-06-01'::date) AND (tran_date &l
Re: When to store data that could be derived
On 2019-03-25 5:11 PM, Frank wrote: On 2019-03-25 4:06 PM, Ron wrote: On 3/25/19 8:15 AM, Frank wrote: It would be interesting to see what the query planner tries to do with this: WHERE CASE WHEN a.tran_type = 'ar_rec' THEN y.posted WHEN a.tran_type = 'cb_rec' THEN w.posted END = '1' I have attached the schema showing the full VIEW definition, and the result of the following EXPLAIN - EXPLAIN SELECT * FROM ccc.ar_trans WHERE cust_row_id = 4 AND tran_date BETWEEN '2015-06-01' AND '2015-06-30'. Because I have used 'WHERE tran_date' in the query, and tran_date is also derived from a CASE expression, I imagine that will also add some complication. I am running PostgreSQL 11.1 on Fedora 29. Frank On reflection, I have not been consistent with my use of indexes, and I think that will affect the query plan. There are at least two issues - 1. The first table defined in the VIEW is ccc.ar_tran_inv. It has the following index - "arinv_cust_date" btree (cust_row_id NULLS FIRST, tran_date NULLS FIRST) WHERE deleted_id = 0 I have not used "WHERE deleted_id = 0" when constructing the VIEW, but I have used "WHERE posted = '1'". I don't think the index can be used with this setup. 2. The complicated table in the VIEW is ccc.ar_rec_subtran. Various columns such as tran_date and posted are retrieved via CASE expressions from two underlying tables. Those tables have certain indexes defined, but I cannot see how they can be utilised from my current setup. I think I should spend some time tidying this up before you try to make sense of the query plan. Any tips on how to improve it will be appreciated. Frank
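On point 1, the general rule is that a partial index can only be chosen when the query's (or view's) WHERE clause implies the index predicate. A sketch of one way to line them up, with an assumed index name and the column list described above; the alternative is to add deleted_id = 0 to the view's WHERE clause so the existing arinv_cust_date index qualifies:

-- Partial index whose predicate matches what the view branch filters on
CREATE INDEX arinv_cust_date_posted
    ON ccc.ar_tran_inv (cust_row_id, tran_date)
    WHERE posted;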
Re: When to store data that could be derived
On 2019-03-25 5:44 PM, Frank wrote: On reflection, I have not been consistent with my use of indexes, and I think that will affect the query plan. There are at least two issues - 1. The first table defined in the VIEW is ccc.ar_tran_inv. It has the following index - "arinv_cust_date" btree (cust_row_id NULLS FIRST, tran_date NULLS FIRST) WHERE deleted_id = 0 I have not used "WHERE deleted_id = 0" when constructing the VIEW, but I have used "WHERE posted = '1'". I don't think the index can be used with this setup. 2. The complicated table in the VIEW is ccc.ar_rec_subtran. Various columns such as tran_date and posted are retrieved via CASE expressions from two underlying tables. Those tables have certain indexes defined, but I cannot see how they can be utilised from my current setup. I think I should spend some time tidying this up before you try to make sense of the query plan. Any tips on how to improve it will be appreciated. I have spent a lot of time testing various permutations and trying to understand them using EXPLAIN. My original concern was the overhead of calculating derived data. I now realise that it is more important to get the indexes right, as that has a much bigger impact on performance. The VIEW that I have been using for testing is actually masking the problem. The view combines four tables, three of which are straightforward, easy to index, and fast. The fourth one is complex, difficult to index, and comparatively slow. So I forgot about the VIEW and concentrated on the complex table. I now understand the caveats I received earlier in this thread. It seems impossible to make use of the indexes on the JOINed tables in the following query. I did create an index on cust_row_id in the main table, and it made a difference with a simple query, but by the time I added the JOINs, the improvement was hardly noticeable. This is the query that I used - EXPLAIN SELECT COUNT(*) FROM ccc.ar_rec_subtran a LEFT JOIN ccc.ar_tran_rec_det z ON z.row_id = a.tran_det_row_id LEFT JOIN ccc.ar_tran_rec y ON y.row_id = z.tran_row_id LEFT JOIN ccc.cb_tran_rec_det x ON x.row_id = a.tran_det_row_id LEFT JOIN ccc.cb_tran_rec w ON w.row_id = x.tran_row_id WHERE a.cust_row_id = 4 AND CASE WHEN a.tran_type = 'ar_rec' THEN y.tran_date WHEN a.tran_type = 'cb_rec' THEN w.tran_date END BETWEEN '2015-05-01' AND '2015-05-31' AND CASE WHEN a.tran_type = 'ar_rec' THEN y.posted WHEN a.tran_type = 'cb_rec' THEN w.posted END = true AND a.deleted_id = 0; Attached is the EXPLAIN for this one. I also ran the same query with the following WHERE clause - WHERE a.cust_row_id = 4 AND ((a.tran_type = 'ar_rec' AND y.tran_date BETWEEN '2015-05-01' AND '2015-05-31') OR (a.tran_type = 'cb_rec' AND w.tran_date BETWEEN '2015-05-01' AND '2015-05-31')) AND ((a.tran_type = 'ar_rec' AND y.posted = true) OR (a.tran_type = 'cb_rec' AND w.posted = true)) AND a.deleted_id = 0; The timings were virtually identical, so I have not attached that EXPLAIN. Frank QUERY PLAN - Aggregate (cost=1514.75..1514.76 rows=1 width=8) -> Hash Left Join (cost=895.41..1514.49 rows=102 width=0) Hash Cond: (a.tran_det_row_id = x.row_id) Filter: (CASE WHEN ((a.tran_type)::text = 'ar_rec'::text) THEN y.posted WHEN ((a.tran_type)::text = 'cb_rec'::text) THEN w.posted ELSE NULL::boolean END AND (CASE WHEN ((a.tran_type)::text = 'ar_rec'::text) THEN y.tran_date WHEN ((a.tran_type)::text = 'cb_rec'::text) THEN w.tran_date ELSE NULL::date END >= '2015-05-01'::date) AND (CASE WHEN ((a.tran_type)::text = 'ar_rec'::t
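One rewrite that is often suggested for this shape of query is to split the CASE-driven branch into one sub-select per tran_type, so that each branch has plain join and date conditions the planner can match against indexes on y.tran_date / w.tran_date. A sketch, using the table and column names from the query above and assuming row_id is unique in the *_det tables (otherwise the results could differ from the LEFT JOIN version):

SELECT COUNT(*)
FROM (
    SELECT a.row_id
    FROM ccc.ar_rec_subtran a
    JOIN ccc.ar_tran_rec_det z ON z.row_id = a.tran_det_row_id
    JOIN ccc.ar_tran_rec y ON y.row_id = z.tran_row_id
    WHERE a.tran_type = 'ar_rec'
      AND a.cust_row_id = 4
      AND a.deleted_id = 0
      AND y.posted
      AND y.tran_date BETWEEN '2015-05-01' AND '2015-05-31'
    UNION ALL
    SELECT a.row_id
    FROM ccc.ar_rec_subtran a
    JOIN ccc.cb_tran_rec_det x ON x.row_id = a.tran_det_row_id
    JOIN ccc.cb_tran_rec w ON w.row_id = x.tran_row_id
    WHERE a.tran_type = 'cb_rec'
      AND a.cust_row_id = 4
      AND a.deleted_id = 0
      AND w.posted
      AND w.tran_date BETWEEN '2015-05-01' AND '2015-05-31'
) AS t;

Because rows with any other tran_type are excluded by the CASE in the original query anyway, the two-branch form returns the same rows while giving the planner simple, indexable predicates in each branch.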
Slow SELECT
Hi all I have a SELECT that runs over 5 times slower on PostgreSQL compared with Sql Server and sqlite3. I am trying to understand why. I have a table that looks like this (simplified) - CREATE TABLE my_table ( row_id SERIAL PRIMARY KEY, deleted_id INT DEFAULT 0, fld_1 INT REFERENCES table_1(row_id), fld_2 INT REFERENCES table_2(row_id), fld_3 INT REFERENCES table_3(row_id), fld_4 INT REFERENCES table_4(row_id), tran_date DATE, tran_total DEC(21,2) ); CREATE UNIQUE INDEX my_table_ndx ON my_table (fld_1, fld_2, fld_3, fld_4, tran_date) WHERE deleted_id = 0; The table sizes are - my_table : 167 rows table_1 : 21 rows table_2 : 11 rows table_3 : 3 rows table_4 : 16 rows Therefore for each tran_date in my_table there are potentially 21x11x3x16 = 11088 rows. Most will be null. I want to select the row_id for the last tran_date for each of those potential groups. This is my select - SELECT ( SELECT a.row_id FROM my_table a WHERE a.fld_1 = b.row_id AND a.fld_2 = c.row_id AND a.fld_3 = d.row_id AND a.fld_4 = e.row_id AND a.deleted_id = 0 ORDER BY a.tran_date DESC LIMIT 1 ) FROM table_1 b, table_2 c, table_3 d, table_4 e Out of 11088 rows selected, 103 are not null. On identical data, this takes 0.06 sec on SQL Server, 0.04 sec on sqlite3, and 0.31 sec on PostgreSQL. I have looked at the EXPLAIN, but I don't really know what to look for. I can supply it if that would help. Thanks for any advice. Frank Millman
Re: Slow SELECT
On 2020-05-26 9:32 AM, Olivier Gautherot wrote: Hi Frank, On Tue, May 26, 2020 at 9:23 AM Frank Millman <mailto:fr...@chagford.com>> wrote: Hi all I have a SELECT that runs over 5 times slower on PostgreSQL compared with Sql Server and sqlite3. I am trying to understand why. I have a table that looks like this (simplified) - CREATE TABLE my_table ( row_id SERIAL PRIMARY KEY, deleted_id INT DEFAULT 0, fld_1 INT REFERENCES table_1(row_id), fld_2 INT REFERENCES table_2(row_id), fld_3 INT REFERENCES table_3(row_id), fld_4 INT REFERENCES table_4(row_id), tran_date DATE, tran_total DEC(21,2) ); CREATE UNIQUE INDEX my_table_ndx ON my_table (fld_1, fld_2, fld_3, fld_4, tran_date) WHERE deleted_id = 0; The table sizes are - my_table : 167 rows table_1 : 21 rows table_2 : 11 rows table_3 : 3 rows table_4 : 16 rows Therefore for each tran_date in my_table there are potentially 21x11x3x16 = 11088 rows. Most will be null. I want to select the row_id for the last tran_date for each of those potential groups. This is my select - SELECT ( SELECT a.row_id FROM my_table a WHERE a.fld_1 = b.row_id AND a.fld_2 = c.row_id AND a.fld_3 = d.row_id AND a.fld_4 = e.row_id AND a.deleted_id = 0 ORDER BY a.tran_date DESC LIMIT 1 ) FROM table_1 b, table_2 c, table_3 d, table_4 e Out of 11088 rows selected, 103 are not null. On identical data, this takes 0.06 sec on SQL Server, 0.04 sec on sqlite3, and 0.31 sec on PostgreSQL. SQL Server does a good job at caching data in memory. PostgreSQL does too on consecutive calls to the same table. What execution time do you get if you issue the query a second time? My first guess would be to add an index on my_table.tran_date and check in EXPLAIN that you don't have a SEQUENTIAL SCAN on that table. I have looked at the EXPLAIN, but I don't really know what to look for. I can supply it if that would help. Thanks for any advice. Thanks Olivier. Unfortunately that did not help. I was already running the query twice and only timing the second one. I added the index on tran_date. The timing is the same, and EXPLAIN shows that it is using a SEQUENTIAL SCAN. Here is the EXPLAIN - Nested Loop (cost=0.00..64155.70 rows=11088 width=4) -> Nested Loop (cost=0.00..10.36 rows=528 width=12) -> Nested Loop (cost=0.00..2.56 rows=33 width=8) -> Seq Scan on table_2 c (cost=0.00..1.11 rows=11 width=4) -> Materialize (cost=0.00..1.04 rows=3 width=4) -> Seq Scan on table_3 d (cost=0.00..1.03 rows=3 width=4) -> Materialize (cost=0.00..1.24 rows=16 width=4) -> Seq Scan on table_4 e (cost=0.00..1.16 rows=16 width=4) -> Materialize (cost=0.00..1.31 rows=21 width=4) -> Seq Scan on table_1 b (cost=0.00..1.21 rows=21 width=4) SubPlan 1 -> Limit (cost=5.77..5.77 rows=1 width=8) -> Sort (cost=5.77..5.77 rows=1 width=8) Sort Key: a.tran_date DESC -> Seq Scan on my_table a (cost=0.00..5.76 rows=1 width=8) Filter: ((fld_1 = b.row_id) AND (fld_2 = c.row_id) AND (fld_3 = d.row_id) AND (fld_4 = e.row_id) AND (deleted_id = 0)) Frank
Re: Slow SELECT
On 2020-05-26 11:10 AM, Charles Clavadetscher wrote: Hello On 2020-05-26 10:38, Frank Millman wrote: On 2020-05-26 9:32 AM, Olivier Gautherot wrote: Hi Frank, On Tue, May 26, 2020 at 9:23 AM Frank Millman <mailto:fr...@chagford.com>> wrote: Hi all I have a SELECT that runs over 5 times slower on PostgreSQL compared with Sql Server and sqlite3. I am trying to understand why. I have a table that looks like this (simplified) - CREATE TABLE my_table ( row_id SERIAL PRIMARY KEY, deleted_id INT DEFAULT 0, fld_1 INT REFERENCES table_1(row_id), fld_2 INT REFERENCES table_2(row_id), fld_3 INT REFERENCES table_3(row_id), fld_4 INT REFERENCES table_4(row_id), tran_date DATE, tran_total DEC(21,2) ); CREATE UNIQUE INDEX my_table_ndx ON my_table (fld_1, fld_2, fld_3, fld_4, tran_date) WHERE deleted_id = 0; The table sizes are - my_table : 167 rows table_1 : 21 rows table_2 : 11 rows table_3 : 3 rows table_4 : 16 rows Therefore for each tran_date in my_table there are potentially 21x11x3x16 = 11088 rows. Most will be null. I want to select the row_id for the last tran_date for each of those potential groups. This is my select - SELECT ( SELECT a.row_id FROM my_table a WHERE a.fld_1 = b.row_id AND a.fld_2 = c.row_id AND a.fld_3 = d.row_id AND a.fld_4 = e.row_id AND a.deleted_id = 0 ORDER BY a.tran_date DESC LIMIT 1 ) FROM table_1 b, table_2 c, table_3 d, table_4 e Out of 11088 rows selected, 103 are not null. On identical data, this takes 0.06 sec on SQL Server, 0.04 sec on sqlite3, and 0.31 sec on PostgreSQL. SQL Server does a good job at caching data in memory. PostgreSQL does too on consecutive calls to the same table. What execution time do you get if you issue the query a second time? My first guess would be to add an index on my_table.tran_date and check in EXPLAIN that you don't have a SEQUENTIAL SCAN on that table. I have looked at the EXPLAIN, but I don't really know what to look for. I can supply it if that would help. Thanks for any advice. Thanks Olivier. Unfortunately that did not help. I was already running the query twice and only timing the second one. I added the index on tran_date. The timing is the same, and EXPLAIN shows that it is using a SEQUENTIAL SCAN. Here is the EXPLAIN - Nested Loop (cost=0.00..64155.70 rows=11088 width=4) -> Nested Loop (cost=0.00..10.36 rows=528 width=12) -> Nested Loop (cost=0.00..2.56 rows=33 width=8) -> Seq Scan on table_2 c (cost=0.00..1.11 rows=11 width=4) -> Materialize (cost=0.00..1.04 rows=3 width=4) -> Seq Scan on table_3 d (cost=0.00..1.03 rows=3 width=4) -> Materialize (cost=0.00..1.24 rows=16 width=4) -> Seq Scan on table_4 e (cost=0.00..1.16 rows=16 width=4) -> Materialize (cost=0.00..1.31 rows=21 width=4) -> Seq Scan on table_1 b (cost=0.00..1.21 rows=21 width=4) SubPlan 1 -> Limit (cost=5.77..5.77 rows=1 width=8) -> Sort (cost=5.77..5.77 rows=1 width=8) Sort Key: a.tran_date DESC -> Seq Scan on my_table a (cost=0.00..5.76 rows=1 width=8) Filter: ((fld_1 = b.row_id) AND (fld_2 = c.row_id) AND (fld_3 = d.row_id) AND (fld_4 = e.row_id) AND (deleted_id = 0)) Frank If I see it correct, the query runs sequential scans on all tables, i.e. table_1 to table_4. Do you have an index on the referenced keys (row_id) in table_1 to table_4? It happens often that referenced keys are not indexed, leading to poor execution plans. The referenced keys are all defined as SERIAL PRIMARY KEY in their own tables, so I presume that that are all indexed automatically. 
On the other hand, there are not many rows in those tables, so the planner may decide not to use the index in that case. Frank
Re: Slow SELECT
On 2020-05-26 12:02 PM, Christian Ramseyer wrote: Hi On 26.05.20 09:22, Frank Millman wrote: I have looked at the EXPLAIN, but I don't really know what to look for. I can supply it if that would help. My favorite approach to tuning Postgres queries is: 1. Run EXPLAIN ANALYZE 2. Copy/Paste the output into the fantastic https://explain.depesz.com/ This will turn the somewhat hard-to-understand explain output into a nice colored structure. If it's not obvious from the orange-reddish boxes where the slowness comes from, please post the link here and somebody will certainly have some advice. Thanks, Christian. I will definitely look into that. Frank
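For reference, the output that site expects is produced with something like the following. Plain EXPLAIN only shows the planner's estimates; ANALYZE actually executes the query and reports real row counts and timings, and BUFFERS adds I/O counters:

EXPLAIN (ANALYZE, BUFFERS)
SELECT (
    SELECT a.row_id
    FROM my_table a
    WHERE a.fld_1 = b.row_id AND a.fld_2 = c.row_id
      AND a.fld_3 = d.row_id AND a.fld_4 = e.row_id
      AND a.deleted_id = 0
    ORDER BY a.tran_date DESC
    LIMIT 1
)
FROM table_1 b, table_2 c, table_3 d, table_4 e;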
Re: Slow SELECT
On 2020-05-26 11:27 AM, Charles Clavadetscher wrote: On 2020-05-26 11:10, Charles Clavadetscher wrote: Hello On 2020-05-26 10:38, Frank Millman wrote: On 2020-05-26 9:32 AM, Olivier Gautherot wrote: Hi Frank, On Tue, May 26, 2020 at 9:23 AM Frank Millman <mailto:fr...@chagford.com>> wrote: Hi all I have a SELECT that runs over 5 times slower on PostgreSQL compared with Sql Server and sqlite3. I am trying to understand why. I have a table that looks like this (simplified) - CREATE TABLE my_table ( row_id SERIAL PRIMARY KEY, deleted_id INT DEFAULT 0, fld_1 INT REFERENCES table_1(row_id), fld_2 INT REFERENCES table_2(row_id), fld_3 INT REFERENCES table_3(row_id), fld_4 INT REFERENCES table_4(row_id), tran_date DATE, tran_total DEC(21,2) ); CREATE UNIQUE INDEX my_table_ndx ON my_table (fld_1, fld_2, fld_3, fld_4, tran_date) WHERE deleted_id = 0; The table sizes are - my_table : 167 rows table_1 : 21 rows table_2 : 11 rows table_3 : 3 rows table_4 : 16 rows Therefore for each tran_date in my_table there are potentially 21x11x3x16 = 11088 rows. Most will be null. I want to select the row_id for the last tran_date for each of those potential groups. This is my select - SELECT ( SELECT a.row_id FROM my_table a WHERE a.fld_1 = b.row_id AND a.fld_2 = c.row_id AND a.fld_3 = d.row_id AND a.fld_4 = e.row_id AND a.deleted_id = 0 ORDER BY a.tran_date DESC LIMIT 1 ) FROM table_1 b, table_2 c, table_3 d, table_4 e Out of 11088 rows selected, 103 are not null. On identical data, this takes 0.06 sec on SQL Server, 0.04 sec on sqlite3, and 0.31 sec on PostgreSQL. SQL Server does a good job at caching data in memory. PostgreSQL does too on consecutive calls to the same table. What execution time do you get if you issue the query a second time? My first guess would be to add an index on my_table.tran_date and check in EXPLAIN that you don't have a SEQUENTIAL SCAN on that table. I have looked at the EXPLAIN, but I don't really know what to look for. I can supply it if that would help. Thanks for any advice. Thanks Olivier. Unfortunately that did not help. I was already running the query twice and only timing the second one. I added the index on tran_date. The timing is the same, and EXPLAIN shows that it is using a SEQUENTIAL SCAN. Here is the EXPLAIN - Nested Loop (cost=0.00..64155.70 rows=11088 width=4) -> Nested Loop (cost=0.00..10.36 rows=528 width=12) -> Nested Loop (cost=0.00..2.56 rows=33 width=8) -> Seq Scan on table_2 c (cost=0.00..1.11 rows=11 width=4) -> Materialize (cost=0.00..1.04 rows=3 width=4) -> Seq Scan on table_3 d (cost=0.00..1.03 rows=3 width=4) -> Materialize (cost=0.00..1.24 rows=16 width=4) -> Seq Scan on table_4 e (cost=0.00..1.16 rows=16 width=4) -> Materialize (cost=0.00..1.31 rows=21 width=4) -> Seq Scan on table_1 b (cost=0.00..1.21 rows=21 width=4) SubPlan 1 -> Limit (cost=5.77..5.77 rows=1 width=8) -> Sort (cost=5.77..5.77 rows=1 width=8) Sort Key: a.tran_date DESC -> Seq Scan on my_table a (cost=0.00..5.76 rows=1 width=8) Filter: ((fld_1 = b.row_id) AND (fld_2 = c.row_id) AND (fld_3 = d.row_id) AND (fld_4 = e.row_id) AND (deleted_id = 0)) Frank If I see it correct, the query runs sequential scans on all tables, i.e. table_1 to table_4. Do you have an index on the referenced keys (row_id) in table_1 to table_4? It happens often that referenced keys are not indexed, leading to poor execution plans. Bye Charles I noticed later that you have very small tables. This will probably lead to a sequential scan althought there is an index in place. 
I am not sure if it makes a difference, but what about using explicit joins?

SELECT a.row_id
FROM my_table a
JOIN table_1 b ON (b.row_id = a.fld_1)
JOIN table_2 c ON (c.row_id = a.fld_2)
JOIN table_3 d ON (d.row_id = a.fld_3)
JOIN table_4 e ON (e.row_id = a.fld_4)
WHERE a.deleted_id = 0
ORDER BY a.tran_date DESC
LIMIT 1;

Thanks, Charles. I tried that, but unfortunately it produces a different result. I need to test for every possible combination of fld1-4, and get the highest date for each one. Using joins only tests existing combinations, and gets the highest date for all of them combined. See my reply to David Rowley. I do not fully understand his solution yet, but it seems to be what I am looking for.

Thanks again

Frank
Re: Slow SELECT
On 2020-05-26 12:04 PM, David Rowley wrote: On Tue, 26 May 2020 at 19:23, Frank Millman wrote: The table sizes are - my_table : 167 rows table_1 : 21 rows table_2 : 11 rows table_3 : 3 rows table_4 : 16 rows Therefore for each tran_date in my_table there are potentially 21x11x3x16 = 11088 rows. Most will be null. I want to select the row_id for the last tran_date for each of those potential groups. This is my select - SELECT ( SELECT a.row_id FROM my_table a WHERE a.fld_1 = b.row_id AND a.fld_2 = c.row_id AND a.fld_3 = d.row_id AND a.fld_4 = e.row_id AND a.deleted_id = 0 ORDER BY a.tran_date DESC LIMIT 1 ) FROM table_1 b, table_2 c, table_3 d, table_4 e Out of 11088 rows selected, 103 are not null. Perhaps SQL Server is doing something to rewrite the subquery in the target list to a LEFT JOIN. PostgreSQL currently does not do that. Since "my_table" is small, you'd likely be much better doing a manual rewrite of the query to join a subquery containing the required details from "my_table". It looks like you want the row_id from the latest tran_date for each fld_N column. So something like: SELECT a.row_id FROM table_1 b CROSS JOIN table_2 c CROSS JOIN table_3 d CROSS JOIN table_4 e LEFT OUTER JOIN ( SELECT fld_1,fld_2,fld_3,fld_4,row_id,tran_date, ROW_NUMBER() OVER (PARTITION BY fld_1,fld_2,fld_3,fld_4 ORDER BY tran_date DESC) row_num FROM my_table WHERE deleted_id = 0 ) a ON a.fld_1 = b.row_id AND a.fld_2 = c.row_id AND a.fld_3 = d.row_id AND a.fld_4 = e.row_id AND a.row_num = 1; Should do it. You could also perhaps do something with DISTINCT ON instead of using ROW_NUMBER(). That might be a bit more efficient, but it's unlikely to matter too much since there are only 167 rows in that table. Thank you David. I tried that and it produced the correct result in 53ms, which is what I am looking for. It will take me some time to understand it fully, so I have some homework to do! Much appreciated. Frank
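For completeness, a sketch of the DISTINCT ON variant David mentions (PostgreSQL-specific, so it would not carry over to SQL Server or sqlite3). DISTINCT ON keeps the first row of each fld_1..fld_4 group according to the ORDER BY, which with tran_date DESC is the latest transaction, so no row_num filter is needed:

SELECT a.row_id
FROM table_1 b
CROSS JOIN table_2 c
CROSS JOIN table_3 d
CROSS JOIN table_4 e
LEFT OUTER JOIN (
    SELECT DISTINCT ON (fld_1, fld_2, fld_3, fld_4)
           fld_1, fld_2, fld_3, fld_4, row_id
    FROM my_table
    WHERE deleted_id = 0
    ORDER BY fld_1, fld_2, fld_3, fld_4, tran_date DESC
) a ON a.fld_1 = b.row_id AND a.fld_2 = c.row_id
   AND a.fld_3 = d.row_id AND a.fld_4 = e.row_id;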
Re: Slow SELECT
On 2020-05-26 1:45 PM, David Rowley wrote:
On Tue, 26 May 2020 at 22:31, Frank Millman wrote:
Thank you David. I tried that and it produced the correct result in 53ms, which is what I am looking for. It will take me some time to understand it fully, so I have some homework to do!

The main problem with your previous query was that the subquery was being executed 11088 times and could only ever find anything 167 times. The remaining number of times nothing would be found. I just changed the subquery which would be executed once per output row and altered it so it became a subquery that's joined and only executed once.

The ROW_NUMBER() is a windowing function, which is explained in [1]. I used this to get the row_id of the record with the latest tran_date, just like you were doing with the ORDER BY tran_date DESC LIMIT 1, but the subquery with the windowing function gets them all at once, rather than doing it in a way that requires it to be executed once for each row in the top-level query. In this case, the functionality that the LIMIT 1 does in your query is achieved with "AND a.row_num = 1;" in my version. This is pretty fast to execute once due to there only being 167 rows.

It's also important to know that there may be cases where the method I proposed is slower. For example, if my_table was very large and contained rows that were not in table_1 to table_4. Since the subquery in my version calculates everything then it could be wasteful to do that for values that would never be used. For you, you have foreign keys that ensure my_table does not contain records that are not in the other tables, but you could still see this issue if you were to add some restrictive WHERE clause to the outer query. Perhaps this won't be a problem for you, but it's likely good to know.

[1] https://www.postgresql.org/docs/current/tutorial-window.html

Thanks very much for the explanation. I will go through it carefully.

For the record, your query works without modification in both Sql Server and sqlite3. It is also much faster in all three cases - all around 0.005 sec instead of 0.05 sec.

Frank
Re: Slow SELECT
On 2020-05-26 12:04 PM, David Rowley wrote: Since "my_table" is small, you'd likely be much better doing a manual rewrite of the query to join a subquery containing the required details from "my_table". It looks like you want the row_id from the latest tran_date for each fld_N column. So something like: SELECT a.row_id FROM table_1 b CROSS JOIN table_2 c CROSS JOIN table_3 d CROSS JOIN table_4 e LEFT OUTER JOIN ( SELECT fld_1,fld_2,fld_3,fld_4,row_id,tran_date, ROW_NUMBER() OVER (PARTITION BY fld_1,fld_2,fld_3,fld_4 ORDER BY tran_date DESC) row_num FROM my_table WHERE deleted_id = 0 ) a ON a.fld_1 = b.row_id AND a.fld_2 = c.row_id AND a.fld_3 = d.row_id AND a.fld_4 = e.row_id AND a.row_num = 1; Should do it. You could also perhaps do something with DISTINCT ON instead of using ROW_NUMBER(). That might be a bit more efficient, but it's unlikely to matter too much since there are only 167 rows in that table. I have studied the above SELECT, and I now more or less understand it. I would not have come up with that unaided, so many thanks. I tried DISTINCT ON, and it was very efficient, but unfortunately that is not supported by SQL Server or sqlite3. Then I came up with this alternative, which works on all three platforms and seems a bit faster than the above - SELECT a.row_id FROM ( SELECT row_id, ROW_NUMBER() OVER (PARTITION BY fld_1, fld_2, fld_3, fld_4 ORDER BY tran_date DESC) row_num FROM my_table WHERE deleted_id = 0 ) as a WHERE a.row_num = 1 Do you see any problem with this? Thanks Frank
Re: Why password authentication failed for user "postgres"?
On 2022-06-04 00:32, BeginnerC wrote:
Hello everyone, I am a newbie to postgres. When I use psql to connect to postgres, an error message is printed. The commands look like this:

psql -U postgres
Password for user postgres: postgres
postgreSQL: password authentication failed for user "postgres"

How to solve this problem? Thanks in advance!

* Which OS are you working on?
* Can you show us your pg_hba.conf?

If you use -U with psql, the connection must not try to use method "peer" (which means "use the system user with this name", but also means "You must be logged in with the system user corresponding to the postgresql user"), but some kind of authorization, like md5, and use it with an IP address. This might be difficult for the user postgres, who often has no password set in the database. If you really need to connect with user postgres and some password, you can set one using the method described by Reid Thompson.

Regards,
Frank
Re: first order by then partition by x < a fixed value.
On Thu, Jul 14, 2022 at 01:23:55PM +0530, jian he wrote:
> This question (https://stackoverflow.com/q/72975669/15603477) is fun.
> DB fiddle link:
> https://dbfiddle.uk/?rdbms=postgres_14&fiddle=36d685ad463831877ae70361be2cfa3b
>
> account
>  size  id  name
>   100   1  John
>   200   2  Mary
>   300   3  Jane
>   400   4  Anne
>   100   5  Mike
>   600   6  Joanne
>
> Then expected output:
>  account group  size  id  name
>              1   100   1  John
>              1   200   2  Mary
>              1   300   3  Jane
>              2   400   4  Anne
>              2   100   5  Mike
>              3   600   6  Joanne
>
> Idea is fixed order by id then cumulative sum. If <= 600 then grouped
> together using the same row_number.
>
> But I imagine this kind of question has been solved many times.
> The currently posted solution uses a recursive query, which is really hard for me.
> Is there any simple or more intuitive way to solve this kind of problem?
> I can write some customized aggregate function that stops aggregating at 600.
>
> I can get the following result. So what's the next step?
>
> +------+----+--------+---------------+-----------------+
> | size | id | name   | capped_at_600 | large_or_eq_600 |
> +------+----+--------+---------------+-----------------+
> |  100 |  1 | John   |           100 | f               |
> |  200 |  2 | Mary   |           300 | f               |
> |  300 |  3 | Jane   |           600 | t               |
> |  400 |  4 | Anne   |           400 | f               |
> |  100 |  5 | Mike   |           500 | f               |
> |  600 |  6 | Joanne |          1100 | t               |
> +------+----+--------+---------------+-----------------+
>
> --
> I recommend David Deutsch's <>
>
> Jian

My proposal: https://dbfiddle.uk/?rdbms=postgres_14&fiddle=7daef32ae39b2ec7c38a83cf9e19d4ae

select id, name, size
     , sum(size) over (order by id) as size_total
     , ((sum(size) over (order by id) - 1) / 600) + 1 as size_group
from account
order by id, name;

Best regards
Frank
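As a self-contained check of that formula against the sample data: the running totals are 100, 300, 600, 1000, 1100 and 1700, so integer division gives groups 1, 1, 1, 2, 2, 3, matching the expected output above. The VALUES list below simply stands in for the account table:

SELECT id, name, size,
       sum(size) OVER (ORDER BY id)                     AS size_total,
       ((sum(size) OVER (ORDER BY id) - 1) / 600) + 1   AS size_group
FROM (VALUES (1, 'John',   100),
             (2, 'Mary',   200),
             (3, 'Jane',   300),
             (4, 'Anne',   400),
             (5, 'Mike',   100),
             (6, 'Joanne', 600)) AS account(id, name, size)
ORDER BY id;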
Re: a database can be created but not dropped
On Mon, Aug 01, 2022 at 11:22:33AM +0200, Matthias Apitz wrote:
> testdb=# DROP DATABASE IF EXISTS SRP-27097 WITH FORCE;
> ERROR: syntax error at or near "-"
> ZEILE 1: DROP DATABASE IF EXISTS SRP-27097 WITH FORCE;
>
> Why is this?

Please try:

DROP DATABASE IF EXISTS "SRP-27097" WITH (FORCE);

Best regards
Frank
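The underlying rule: an unquoted identifier may not contain a hyphen and is folded to lower case, while a double-quoted identifier is taken exactly as written, so a name like SRP-27097 must always be quoted. For example (sketch, reusing the database name from the thread; the FORCE option requires PostgreSQL 13 or later):

CREATE DATABASE "SRP-27097";
DROP DATABASE IF EXISTS "SRP-27097" WITH (FORCE);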
Question about locking
Hi all Apologies if this is an FAQ, but I could not find the answer I was looking for. I want to run two SELECTs, one immediately after the other. I want to ensure that no other process updates the database in between the two. What is the best way to achieve this? Thanks Frank Millman
Re: Question about locking
On 2022-08-07 7:54 AM, Ron wrote: On 8/7/22 00:30, Frank Millman wrote: Hi all Apologies if this is an FAQ, but I could not find the answer I was looking for. I want to run two SELECTs, one immediately after the other. I want to ensure that no other process updates the database in between the two. What is the best way to achieve this? You probably want to put the queries in a Repeatable Read transaction, or even a READ ONLY serialized transaction. https://www.postgresql.org/docs/12/transaction-iso.html#XACT-REPEATABLE-READ https://www.postgresql.org/docs/12/transaction-iso.html#XACT-SERIALIZABLE https://www.postgresql.org/docs/12/sql-set-transaction.html Thanks, Ron. That is exactly what I am looking for. Frank
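A minimal sketch of what that looks like in practice. The two queries are placeholders (some_table is hypothetical); the point is that in REPEATABLE READ both statements see the same snapshot, taken at the first query of the transaction, regardless of concurrent writes:

BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ READ ONLY;
SELECT count(*)    FROM some_table;   -- first query
SELECT sum(amount) FROM some_table;   -- second query, same snapshot as the first
COMMIT;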
Re: pg_restore 12 "permission denied for schema" errors
Hi Ron,

On 20.10.22 at 02:58, Ron wrote:
RDS Postgresql 12.11. Backed up a database from one RDS instance, and now am trying to restore it to a new instance. (If you're wondering why, it's to restore in an instance with less disk space.) Here are the commands:

export PGHOST=${RDSENV}..us-east-1.rds.amazonaws.com
cd /migrate/TASK001793786/2022-10-19b
DB=sides
pg_restore -v --create --clean --jobs=`nproc` -Fd --dbname=template1 $DB

What database do you want to restore your data into? As far as I know your pg_restore command would import the data into template1, right? pg_restore's manpage says:

[…]
> -d dbname
> --dbname=dbname
>     Connect to database dbname and restore directly into the database. The dbname can be a connection string. If so, connection string parameters will override any conflicting command line options.
> […]

And I'm wondering if that's what you want. Possibly your errors could come from there…

Cheers,
Frank.
Re: pg_restore 12 "permission denied for schema" errors
Hi Tom,

On 20.10.22 at 07:12, Tom Lane wrote:
Frank Gard writes:
On 20.10.22 at 02:58, Ron wrote:
pg_restore -v --create --clean --jobs=`nproc` -Fd --dbname=template1 $DB
What database do you want to restore your data into? As far as I know your pg_restore command would import the data into template1, right?
No --- my ears went up about that too, but he's using --create. So template1 is just the springboard for the initial connection in which CREATE DATABASE will be issued.

You're right, of course. I overlooked that detail. I have to learn to read more carefully before answering. Sorry.

Frank.
Calling function from VFP changes character field to Memo
Hi, I have a simple table Patients with one field FirstName of type character(30). If I SELECT FirstName From public.patients; I get back the expected character(30) field. If however I have a function defined like this CREATE OR REPLACE FUNCTION public.testfunction( ) RETURNS TABLE ( Firstname character(30) ) LANGUAGE 'plpgsql' AS $BODY$ BEGIN RETURN QUERY SELECT p.cFirstName FROM patients p; END; $BODY$; And I call: SELECT * FROM public.testFunction(); Then FirstName returns as a Memo field (similar to a Text field). Any idea what I need to do to get it to return the character(30) type? -- Frank. Frank Cazabon
Re: Calling function from VFP changes character field to Memo
On 15/11/2022 2:48 pm, Adrian Klaver wrote:
On 11/15/22 10:43 AM, Frank Cazabon wrote:
Please reply to list also. Ccing list.
Sorry about that, first time using this list and just assumed I was replying to the list and the list would then notify you.
SELECT * FROM public.testFunction();
SELECT firstname FROM public.testFunction();
This has the same result.
How about?:
SELECT firstname::varchar(30) FROM public.testFunction();
That does work, thanks, but it may cause me some logic problems in the rest of the code. I'll try the type suggestion from Tom and see if that works and then decide which is my best approach.
Thanks
Then FirstName returns as a Memo field (similar to a Text field). Any idea what I need to do to get it to return the character(30) type?
Re: Calling function from VFP changes character field to Memo
On 15/11/2022 2:44 pm, Tom Lane wrote: Frank Cazabon writes: If however I have a function defined like this CREATE OR REPLACE FUNCTION public.testfunction( ) RETURNS TABLE ( Firstname character(30) ) LANGUAGE 'plpgsql' AS $BODY$ BEGIN RETURN QUERY SELECT p.cFirstName FROM patients p; END; $BODY$; And I call: SELECT * FROM public.testFunction(); Then FirstName returns as a Memo field (similar to a Text field). This is mostly about whatever software stack you're using on the client side --- Memo is certainly not something Postgres knows about. Any idea what I need to do to get it to return the character(30) type? There's no chance of getting back the "30" part with this structure, because function signatures do not carry length restrictions. What I expect is happening is that you get firstname as an unspecified-length "character" type, and something on the client side is deciding to cope with that by calling it "Memo" instead. You could perhaps work around that by defining a named composite type: create type testfunction_result as (firstname character(30), ...); create function testfunction() returns setof testfunction_result as ... regards, tom lane Thanks, so I could define the function like this - removed the (30): CREATE OR REPLACE FUNCTION public.testfunction( ) RETURNS TABLE ( Firstname character ) LANGUAGE 'plpgsql' I'll try the type definition and see if that helps.
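Spelled out, Tom's suggestion looks something like the sketch below. The column name selected inside the function is assumed to match the patients table (the thread shows both FirstName and cFirstName); with the length declared on the composite type, the character(30) typmod is carried through to the client:

CREATE TYPE testfunction_result AS (firstname character(30));

CREATE OR REPLACE FUNCTION public.testfunction()
    RETURNS SETOF testfunction_result
    LANGUAGE plpgsql
AS $BODY$
BEGIN
    RETURN QUERY SELECT p.firstname FROM patients p;
END;
$BODY$;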
Re: Calling function from VFP changes character field to Memo
I don't think that's necessary, I'm 100% certain that it's VFP not able to interpret the size of what is coming back to it so it just gives it the biggest type it can. Thanks 15 Nov 2022 14:59:59 Ron : > On 11/15/22 12:54, Frank Cazabon wrote: >> >> On 15/11/2022 2:44 pm, Tom Lane wrote: >>> Frank Cazabon writes: >>>> If however I have a function defined like this >>>> CREATE OR REPLACE FUNCTION public.testfunction( >>>> ) >>>> RETURNS TABLE >>>> ( >>>> Firstname character(30) >>>> ) >>>> LANGUAGE 'plpgsql' >>>> AS $BODY$ >>>> BEGIN >>>> RETURN QUERY SELECT p.cFirstName FROM patients p; >>>> END; >>>> $BODY$; >>>> And I call: >>>> SELECT * FROM public.testFunction(); >>>> Then FirstName returns as a Memo field (similar to a Text field). >>> This is mostly about whatever software stack you're using on the >>> client side --- Memo is certainly not something Postgres knows about. >>> >>>> Any idea what I need to do to get it to return the character(30) type? >>> There's no chance of getting back the "30" part with this structure, >>> because function signatures do not carry length restrictions. >>> What I expect is happening is that you get firstname as an >>> unspecified-length "character" type, and something on the client >>> side is deciding to cope with that by calling it "Memo" instead. >>> >>> You could perhaps work around that by defining a named composite >>> type: >>> >>> create type testfunction_result as (firstname character(30), ...); >>> >>> create function testfunction() returns setof testfunction_result as ... >>> >>> regards, tom lane >> Thanks, so I could define the function like this - removed the (30): >> >> CREATE OR REPLACE FUNCTION public.testfunction( >> ) >> RETURNS TABLE >> ( >> Firstname character >> ) >> LANGUAGE 'plpgsql' >> >> I'll try the type definition and see if that helps. > > I think you should try the original function in psql. That will delineate > Postgresql from framework. > > -- > Angular momentum makes the world go 'round.
Re: Calling function from VFP changes character field to Memo
On 15/11/2022 2:58 pm, Adrian Klaver wrote: On 11/15/22 10:54 AM, Frank Cazabon wrote: On 15/11/2022 2:44 pm, Tom Lane wrote: Frank Cazabon writes: If however I have a function defined like this CREATE OR REPLACE FUNCTION public.testfunction( ) RETURNS TABLE ( Firstname character(30) ) LANGUAGE 'plpgsql' AS $BODY$ BEGIN RETURN QUERY SELECT p.cFirstName FROM patients p; END; $BODY$; And I call: SELECT * FROM public.testFunction(); Then FirstName returns as a Memo field (similar to a Text field). This is mostly about whatever software stack you're using on the client side --- Memo is certainly not something Postgres knows about. Any idea what I need to do to get it to return the character(30) type? There's no chance of getting back the "30" part with this structure, because function signatures do not carry length restrictions. What I expect is happening is that you get firstname as an unspecified-length "character" type, and something on the client side is deciding to cope with that by calling it "Memo" instead. You could perhaps work around that by defining a named composite type: create type testfunction_result as (firstname character(30), ...); create function testfunction() returns setof testfunction_result as ... regards, tom lane Thanks, so I could define the function like this - removed the (30): CREATE OR REPLACE FUNCTION public.testfunction( ) RETURNS TABLE ( Firstname character ) LANGUAGE 'plpgsql' No you don't want to do that: select 'test'::char; bpchar t vs select 'test'::varchar; varchar - test Besides you missed the important part, after creating the type testfunction_result: create function testfunction() returns setof testfunction_result as ... Sorry about the confusion, I have got it working using the type definition. Thanks for the help I'll try the type definition and see if that helps.
PostgreSQL listens on localhost?
I have read that PostgreSQL listens on localhost. Is it possible to have it listen on an IP address? Thank you. FDG
Re: PostgreSQL listens on localhost?
Hi Adrian, I have just started with PostgreSQL after years away from Micro$oft SQL Server. Thank you very much for the quick reference as I have also been trying to get through the documentation (all 3063 pages of it!). FDG On Tue, Aug 1, 2023 at 11:42 AM Adrian Klaver wrote: > On 8/1/23 09:12, Frank Gunseor wrote: > > I have read that PostgreSQL listens on localhost is it possible to have > > it listen on an IP address? > > Read: > > > https://www.postgresql.org/docs/current/runtime-config-connection.html#RUNTIME-CONFIG-CONNECTION-SETTINGS > > > > > Thank you. > > > > FDG > > -- > Adrian Klaver > adrian.kla...@aklaver.com > >
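For reference, the relevant settings look like this. The addresses below are examples only, and changing listen_addresses requires a server restart to take effect:

# postgresql.conf
listen_addresses = 'localhost, 192.168.1.50'   # or '*' to listen on all interfaces
port = 5432

# pg_hba.conf - an entry is still needed to allow clients from that network
# TYPE   DATABASE   USER   ADDRESS           METHOD
host     all        all    192.168.1.0/24    scram-sha-256

listen_addresses controls which interfaces the server accepts TCP connections on; pg_hba.conf then decides which clients and authentication methods are allowed once a connection arrives.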
pgBadger: Cannot find any log entries from systemd-journald
Hello,

I think I might just have made a small configuration mistake, but somehow I cannot figure it out by myself. I want to let pgbadger fetch data directly from journald -- so I ran the following, with an unexpected outcome:

$ pgbadger --journalctl "journalctl -u postgresql.service"
LOG: Ok, generating html report...s: 0, events: 0

Having a look into the journal, there is a lot of

---cut
Mar 05 09:06:23 myhost postgres[441820]: 2024-03-05 08:06:23 GMT [441820]: user=[unknown],db=[unknown],app=[unknown],client=xxx LOCATION: BackendInitialize, postmaster.c:4373
Mar 05 09:06:23 myhost postgres[441820]: 2024-03-05 08:06:23 GMT [441820]: user=[unknown],db=[unknown],app=[unknown],client=xxx LOG: 0: connection received: host=xxx port=34574
Mar 05 09:06:10 myhost postgres[441805]: 2024-03-05 08:06:10 GMT [441805]: user=postgres,db=xxx,app=psql,client=[local] LOCATION: log_disconnections, postgres.c:4722
Mar 05 09:06:10 myhost postgres[441805]: 2024-03-05 08:06:10 GMT [441805]: user=postgres,db=xxx,app=psql,client=[local] LOG: 0: disconnection: session time: 0:00:00.016 user=postgres database=xxx host=[local]
Mar 05 09:06:10 myhost postgres[441805]: 2024-03-05 08:06:10 GMT [441805]: user=postgres,db=xxx,app=[unknown],client=[local] LOCATION: PerformAuthentication, postinit.c:292
Mar 05 09:06:10 myhost postgres[441805]: 2024-03-05 08:06:10 GMT [441805]: user=postgres,db=xxx,app=[unknown],client=[local] LOG: 0: connection authorized: user=postgres database=xxx application_name=psql
---cut---

My postgresql.conf looks like:

---cut
log_autovacuum_min_duration = 0
log_checkpoints = yes
log_connections = yes
log_destination = 'stderr'
log_disconnections = yes
log_error_verbosity = 'verbose'
log_line_prefix = '%t [%p]: user=%u,db=%d,app=%a,client=%h '
log_lock_waits = yes
log_min_duration_statement = 100
log_temp_files = 0
---cut

Any idea what I missed or did wrong?

Cheers,
Frank
Re: pgBadger: Cannot find any log entries from systemd-journald
Hello,

Sorry for the late response.

On 06.03.24 at 16:40, Greg Sabino Mullane wrote:
On Tue, Mar 5, 2024 at 3:14 AM Frank Lanitz <fr...@frank.uvena.de> wrote:
$ pgbadger --journalctl "journalctl -u postgresql.service"
You could try adding --verbose to see if it gives more clues.

No, unfortunately not. Though it adds some more nice information in general, none of it actually helped me.

Having a look into the journal there is a lot of
None of the snippets from journald you showed were actually things pgbadger cares about, FWIW.

It was just a random pick to show there is a bunch of logging -- I also can see checkpoints and vacuum etc. I just didn't want to paste 1GB of log ;)

You can get an idea of what is actually parsed by running "journalctl -u postgresql --output=short-iso"

Looks good to me.

log_error_verbosity = 'verbose'
This is probably not needed, and there is a finite chance that the extra context is confusing pgbadger.

I tried to change it, but in my case no difference.

Cheers,
Frank
SELECT is faster on SQL Server
Filter: ((tran_date < '2018-03-01'::date) AND (deleted_id = 0) AND (ledger_row_id = 1)) (24 rows) Maybe SQL Server has a way of optimising this, and there is nothing more I can do. I can live with that. But I just thought I would ask the question. Thanks for any advice. Frank Millman
Re: SELECT is faster on SQL Server
On 2021-03-19 10:29 AM, Thomas Kellerer wrote: Frank Millman schrieb am 19.03.2021 um 09:19: This may be a non-issue, and I don't want to waste your time. But perhaps someone can have a look to see if there is anything obvious I have missed. I am writing a cross-platform accounting app, and I test using Sql Server on Windows 10 and PostgreSql on Fedora 31. Performance is usually very similar, with a slight edge to PostgreSql. Now I have a SELECT which runs over twice as fast on Sql Server compared to PostgreSql. Can you change the SELECT statement? Very often "distinct on ()" is faster in Postgres compared to the equivalent solution using window functions Something along the lines (for the first derived table): SELECT ... FROM ( SELECT a.source_code_id, SUM(a.tran_tot) AS cl_tot FROM ( SELECT distinct on (location_row_id, function_row_id, source_code_id) source_code_id, tran_tot FROM prop.ar_totals WHERE deleted_id = 0 AND tran_date <= '2018-03-31' AND ledger_row_id = 1 ORDER BY location_row_id, function_row_id, source_code_id, tran_date DESC ) AS a GROUP BY a.source_code_id ) as cl_bal ... Thanks, Thomas I tried that, and it ran about 10% faster. Every little helps, but SQL Server appears to have some secret sauce! Frank
Re: SELECT is faster on SQL Server
On 2021-03-19 11:04 AM, Thomas Kellerer wrote: Frank Millman schrieb am 19.03.2021 um 09:52: I am writing a cross-platform accounting app, and I test using Sql Server on Windows 10 and PostgreSql on Fedora 31. Performance is usually very similar, with a slight edge to PostgreSql. Now I have a SELECT which runs over twice as fast on Sql Server compared to PostgreSql. Can you change the SELECT statement? Very often "distinct on ()" is faster in Postgres compared to the equivalent solution using window functions Thanks, Thomas I tried that, and it ran about 10% faster. Every little helps, but SQL Server appears to have some secret sauce! The two derived tables (cl_bal, op_bal) seem to be doing exactly the same thing - at least I can't spot a difference. If that is correct, you can move them into a common table expression - maybe detecting that is SQL Server's secret sauce. with totals as ( SELECT a.source_code_id, SUM(a.tran_tot) AS total FROM ( SELECT distinct on (location_row_id, function_row_id, source_code_id) source_code_id, tran_tot FROM prop.ar_totals WHERE deleted_id = 0 AND tran_date <= '2018-03-31' AND ledger_row_id = 1 ORDER BY location_row_id, function_row_id, source_code_id, tran_date DESC ) AS a GROUP BY a.source_code_id ) select '2018-03-01' AS op_date, '2018-03-31' AS cl_date, cl_bal.source_code_id, op_bal.total as op_tot, cl_bal.total.cl_tot FROM totals as cl_bal LEFT JOIN totals as op_bal ON op_bal.source_code_id = cl_bal.source_code_id; There is a difference. cl_bal selects WHERE tran_date <= '2018-03-31'. op_bal selects WHERE tran_date < '2018-03-01'. The second one could be written as WHERE tran_date <= '2018-02-28', but I don't think that would make any difference. Frank
Re: SELECT is faster on SQL Server
On 2021-03-19 10:56 AM, Pavel Stehule wrote:

On Fri, 19 Mar 2021 at 09:53, Frank Millman <mailto:fr...@chagford.com> wrote:

On 2021-03-19 10:29 AM, Thomas Kellerer wrote:
> Frank Millman wrote on 19.03.2021 at 09:19:
>> This may be a non-issue, and I don't want to waste your time. But perhaps someone can have a look to see if there is anything obvious I have missed.
>>
>> I am writing a cross-platform accounting app, and I test using Sql
>> Server on Windows 10 and PostgreSql on Fedora 31. Performance is
>> usually very similar, with a slight edge to PostgreSql. Now I have a
>> SELECT which runs over twice as fast on Sql Server compared to
>> PostgreSql.
>>
> Can you change the SELECT statement?
>
> Very often "distinct on ()" is faster in Postgres compared to the equivalent solution using window functions
>
> Something along the lines (for the first derived table):
>
> SELECT ...
> FROM (
>     SELECT a.source_code_id, SUM(a.tran_tot) AS cl_tot
>     FROM (
>         SELECT distinct on (location_row_id, function_row_id, source_code_id) source_code_id, tran_tot
>         FROM prop.ar_totals
>         WHERE deleted_id = 0
>           AND tran_date <= '2018-03-31'
>           AND ledger_row_id = 1
>         ORDER BY location_row_id, function_row_id, source_code_id, tran_date DESC
>     ) AS a
>     GROUP BY a.source_code_id
> ) as cl_bal
> ...

Thanks, Thomas

I tried that, and it ran about 10% faster. Every little helps, but SQL Server appears to have some secret sauce!

can you send a result of EXPLAIN ANALYZE?

QUERY PLAN
Nested Loop Left Join  (cost=5.66..5.74 rows=1 width=132) (actual time=0.213..0.248 rows=5 loops=1)
  Join Filter: (a_1.source_code_id = a.source_code_id)
  Rows Removed by Join Filter: 4
  ->  GroupAggregate  (cost=3.65..3.67 rows=1 width=36) (actual time=0.144..0.157 rows=5 loops=1)
        Group Key: a.source_code_id
        ->  Sort  (cost=3.65..3.65 rows=1 width=10) (actual time=0.131..0.135 rows=29 loops=1)
              Sort Key: a.source_code_id
              Sort Method: quicksort  Memory: 26kB
              ->  Subquery Scan on a  (cost=2.36..3.64 rows=1 width=10) (actual time=0.063..0.116 rows=29 loops=1)
                    Filter: (a.row_num = 1)
                    Rows Removed by Filter: 3
                    ->  WindowAgg  (cost=2.36..3.24 rows=32 width=34) (actual time=0.062..0.107 rows=32 loops=1)
                          ->  Sort  (cost=2.36..2.44 rows=32 width=26) (actual time=0.054..0.059 rows=32 loops=1)
                                Sort Key: ar_totals.location_row_id, ar_totals.function_row_id, ar_totals.source_code_id, ar_totals.tran_date DESC
                                Sort Method: quicksort  Memory: 27kB
                                ->  Seq Scan on ar_totals  (cost=0.00..1.56 rows=32 width=26) (actual time=0.014..0.028 rows=32 loops=1)
                                      Filter: ((tran_date <= '2018-03-31'::date) AND (deleted_id = 0) AND (ledger_row_id = 1))
  ->  GroupAggregate  (cost=2.01..2.03 rows=1 width=36) (actual time=0.017..0.017 rows=1 loops=5)
        Group Key: a_1.source_code_id
        ->  Sort  (cost=2.01..2.02 rows=1 width=10) (actual time=0.012..0.013 rows=8 loops=5)
              Sort Key: a_1.source_code_id
              Sort Method: quicksort  Memory: 25kB
              ->  Subquery Scan on a_1  (cost=1.68..2.00 rows=1 width=10) (actual time=0.032..0.047 rows=8 loops=1)
                    Filter: (a_1.row_num = 1)
                    ->  WindowAgg  (cost=1.68..1.90 rows=8 width=34) (actual time=0.031..0.043 rows=8 loops=1)
                          ->  Sort  (cost=1.68..1.70 rows=8 width=26) (actual time=0.023..0.024 rows=8 loops=1)
                                Sort Key: ar_totals_1.location_row_id, ar_totals_1.function_row_id, ar_totals_1.source_code_id, ar_totals_1.tran_date DESC
                                Sort Method: quicksort  Memory: 25kB
                                ->  Seq Scan on ar_totals ar_totals_1  (cost=0.00..1.56 rows=8 width=26) (actual time=0.006..0.013 rows=8 loops=1)
                                      Filter: ((tran_date < '2018-03-01'::date) AND (deleted_id = 0) AND (ledger_row_id = 1))
                                      Rows Removed by Filter: 24
Planning Time: 0.479 ms
Execution Time: 0.344 ms
(33 rows)
Re: SELECT is faster on SQL Server
On 2021-03-19 12:00 PM, Pavel Stehule wrote:

In this query the slowest operation is query planning. You are testing on almost empty tables, which has no practical value. You should test queries on tables with a size similar to the production size.

Sorry about that. I hope this one is better. Same query, different data set.

QUERY PLAN
Merge Left Join  (cost=1401.00..1401.12 rows=1 width=132) (actual time=3.595..3.611 rows=5 loops=1)
  Merge Cond: (a.source_code_id = a_1.source_code_id)
  ->  GroupAggregate  (cost=673.16..673.18 rows=1 width=36) (actual time=1.101..1.108 rows=5 loops=1)
        Group Key: a.source_code_id
        ->  Sort  (cost=673.16..673.16 rows=1 width=12) (actual time=1.092..1.093 rows=5 loops=1)
              Sort Key: a.source_code_id
              Sort Method: quicksort  Memory: 25kB
              ->  Subquery Scan on a  (cost=670.67..673.15 rows=1 width=12) (actual time=1.008..1.086 rows=5 loops=1)
                    Filter: (a.row_num = 1)
                    Rows Removed by Filter: 59
                    ->  WindowAgg  (cost=670.67..672.37 rows=62 width=36) (actual time=1.006..1.076 rows=64 loops=1)
                          ->  Sort  (cost=670.67..670.82 rows=62 width=28) (actual time=0.996..1.004 rows=64 loops=1)
                                Sort Key: ar_totals.location_row_id, ar_totals.function_row_id, ar_totals.source_code_id, ar_totals.tran_date DESC
                                Sort Method: quicksort  Memory: 30kB
                                ->  Seq Scan on ar_totals  (cost=0.00..668.82 rows=62 width=28) (actual time=0.012..0.933 rows=64 loops=1)
                                      Filter: ((tran_date <= '2015-04-30'::date) AND (deleted_id = 0) AND (ledger_row_id = 1))
                                      Rows Removed by Filter: 840
  ->  GroupAggregate  (cost=727.85..727.89 rows=2 width=36) (actual time=2.490..2.495 rows=5 loops=1)
        Group Key: a_1.source_code_id
        ->  Sort  (cost=727.85..727.85 rows=3 width=12) (actual time=2.485..2.485 rows=5 loops=1)
              Sort Key: a_1.source_code_id
              Sort Method: quicksort  Memory: 25kB
              ->  Subquery Scan on a_1  (cost=700.70..727.82 rows=3 width=12) (actual time=1.684..2.479 rows=5 loops=1)
                    Filter: (a_1.row_num = 1)
                    Rows Removed by Filter: 674
                    ->  WindowAgg  (cost=700.70..719.35 rows=678 width=36) (actual time=1.682..2.397 rows=679 loops=1)
                          ->  Sort  (cost=700.70..702.40 rows=678 width=28) (actual time=1.676..1.758 rows=679 loops=1)
                                Sort Key: ar_totals_1.location_row_id, ar_totals_1.function_row_id, ar_totals_1.source_code_id, ar_totals_1.tran_date DESC
                                Sort Method: quicksort  Memory: 78kB
                                ->  Seq Scan on ar_totals ar_totals_1  (cost=0.00..668.82 rows=678 width=28) (actual time=0.007..0.836 rows=679 loops=1)
                                      Filter: ((tran_date < '2015-09-01'::date) AND (deleted_id = 0) AND (ledger_row_id = 1))
                                      Rows Removed by Filter: 225
Planning Time: 0.496 ms
Execution Time: 3.695 ms
(34 rows)
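(For anyone who wants to reproduce a data set of roughly this shape, a generate_series sketch like the one below can help. The column list is an assumption based only on the columns visible in these plans, and the value distributions are arbitrary.)

INSERT INTO prop.ar_totals
       (ledger_row_id, location_row_id, function_row_id,
        source_code_id, tran_date, tran_tot, deleted_id)
SELECT 1,
       (g % 10) + 1,                              -- spread rows over a few locations
       (g % 5) + 1,                               -- ... and a few functions
       (g % 20) + 1,                              -- ... and a few source codes
       DATE '2015-01-01' + (g % 365),             -- one year of transaction dates
       round((random() * 1000)::numeric, 2),
       0
FROM generate_series(1, 100000) AS g;

ANALYZE prop.ar_totals;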
Re: SELECT is faster on SQL Server
On 2021-03-19 12:58 PM, Frank Millman wrote:

[snip - EXPLAIN ANALYZE output quoted in full from the previous message]

@Pavel & depesz

Thanks for the replies. I am now focusing on the index.

I tried dropping the index 'ar_tots_cover', and then adding back the index columns one at a time. Adding 'tran_date desc' made a small difference. Adding 'tran_day' and 'tran_tot' made a big difference. This changed the index into a 'covering' index, and this is reflected in the new EXPLAIN ANALYSE (see below).

Execution of my main query has improved from 50ms to 33ms. Sql Server takes 25ms, but this is much better than it was.

However, the bizarre thing is that I have simply restored the index to what it was in the first place. If you look at the table definition in my original message you can see that all the columns were included in the index. But the query did not use it as a covering index. Now the EXPLAIN ANALYSE clearly shows 'Index Only Scan using ar_tots_cover'. I have no idea what changed.
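For reference, a covering index of the kind being described would look roughly like this. This is a hypothetical definition - the real ar_tots_cover DDL is in the original post, which is not quoted in this thread - and INCLUDE needs PostgreSQL 11 or later; on older versions the extra columns would simply be appended as trailing key columns.

CREATE INDEX ar_tots_cover
    ON prop.ar_totals (ledger_row_id, location_row_id, function_row_id,
                       source_code_id, tran_date DESC)
    INCLUDE (tran_day, tran_tot);   -- payload columns, so the scan never touches the heap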
Here is the new EXPLAIN ANALYSE -

QUERY PLAN
Merge Left Join  (cost=161.39..161.51 rows=1 width=132) (actual time=1.566..1.581 rows=5 loops=1)
  Merge Cond: (a.source_code_id = a_1.source_code_id)
  ->  GroupAggregate  (cost=50.27..50.29 rows=1 width=36) (actual time=0.226..0.232 rows=5 loops=1)
        Group Key: a.source_code_id
        ->  Sort  (cost=50.27..50.28 rows=1 width=12) (actual time=0.217..0.218 rows=5 loops=1)
              Sort Key: a.source_code_id
              Sort Method: quicksort  Memory: 25kB
              ->  Subquery Scan on a  (cost=47.78..50.26 rows=1 width=
Re: SELECT is faster on SQL Server
On 2021-03-19 4:38 PM, Tom Lane wrote:

Frank Millman writes:
However, the bizarre thing is that I have simply restored the index to what it was in the first place. If you look at the table definition in my original message you can see that all the columns were included in the index. But the query did not use it as a covering index. Now the EXPLAIN ANALYSE clearly shows 'Index Only Scan using ar_tots_cover'. I have no idea what changed.

VACUUM, maybe? Even if there's a covering index, the planner is not likely to prefer an index-only scan unless it thinks that most of the table's pages are known all-visible. If they're not, most of the rows will require heap probes anyway to check row visibility, meaning that the "index-only" scan's performance degrades to about that of a regular indexscan.

In this example, since you're fetching such a large fraction of the table (which the planner is accurately estimating), there's not a lot of daylight between the estimated costs of seqscan and index-only scan to begin with. I'm not surprised that it'd prefer the former if the table isn't recently vacuumed.

It is possible. I know that I *did* vacuum. But I also ran a program to generate a few hundred additional rows, and I cannot remember if I ran the vacuum before or after that.

Frank
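A quick way to check the point about all-visible pages on this table (a sketch only; relpages and relallvisible are standard pg_class columns, and EXPLAIN ANALYZE reports "Heap Fetches" under an Index Only Scan node):

VACUUM (ANALYZE) prop.ar_totals;

SELECT relpages, relallvisible            -- index-only scans look attractive when
FROM   pg_class                           -- relallvisible is close to relpages
WHERE  oid = 'prop.ar_totals'::regclass;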
Re: SELECT is faster on SQL Server
On 2021-03-19 5:32 PM, Jehan-Guillaume de Rorthais wrote:

On Fri, 19 Mar 2021 14:28:27 +0200 Frank Millman wrote:
[...]
Execution of my main query has improved from 50ms to 33ms. Sql Server takes 25ms, but this is much better than it was.
[...]
Here is the new EXPLAIN ANALYSE -
QUERY PLAN
Merge Left Join (...) (actual time=1.566..1.581 rows=5 loops=1)

1.581ms to output the very last row of this plan. This is in contradiction with the 33ms you are referencing above. What do I miss here? Maybe your 33ms comes from yet another set of data?

Could you share an explain analyze actually showing this 33ms total execution time?

Sorry, I should have explained.

The query I showed selects data for a single month. The 'real' query repeats this 12 times, each with different dates, and combines the results using UNION ALL. This was the timing mentioned above.

BTW, I know that I can improve this by setting up the dates in a CTE and using JOIN LATERAL. I am avoiding this as it is not supported by SQL Server or sqlite3, and I am trying to stick to one code base for all databases. But I will look into it further.

Frank
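For the record, a rough sketch of the LATERAL variant mentioned above, using the table and column names that appear earlier in the thread; the month list, the aliases and the restriction to cl_tot are placeholders, not the real twelve-month query:

SELECT m.op_date, m.cl_date, t.source_code_id, t.cl_tot
FROM (
    VALUES (DATE '2015-04-01', DATE '2015-04-30'),
           (DATE '2015-05-01', DATE '2015-05-31')
           -- ... one row per month
) AS m (op_date, cl_date)
CROSS JOIN LATERAL (
    SELECT a.source_code_id, SUM(a.tran_tot) AS cl_tot
    FROM (
        SELECT DISTINCT ON (location_row_id, function_row_id, source_code_id)
               source_code_id, tran_tot
        FROM prop.ar_totals
        WHERE deleted_id = 0
          AND tran_date <= m.cl_date          -- the lateral reference replaces the hard-coded date
          AND ledger_row_id = 1
        ORDER BY location_row_id, function_row_id, source_code_id, tran_date DESC
    ) AS a
    GROUP BY a.source_code_id
) AS t;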
Re: SELECT is faster on SQL Server
On 2021-03-19 7:11 PM, Thomas Kellerer wrote:

Frank Millman wrote on 19.03.2021 at 10:16:
cl_bal selects WHERE tran_date <= '2018-03-31'. op_bal selects WHERE tran_date < '2018-03-01'. The second one could be written as WHERE tran_date <= '2018-02-28', but I don't think that would make any difference.

I knew I overlooked something ;)

But as one is a true subset of the other, I think you can merge that into a single SELECT statement:

select '2018-03-01' AS op_date, '2018-03-31' AS cl_date,
       a.source_code_id,
       sum(a.tran_tot) AS cl_tot,
       sum(a.tran_tot) filter (where tran_date < '2018-03-01') AS op_tot
FROM (
    SELECT distinct on (location_row_id, function_row_id, source_code_id)
           source_code_id, tran_tot, tran_date
    FROM prop.ar_totals
    WHERE deleted_id = 0
      AND tran_date <= '2018-03-31'
      AND ledger_row_id = 1
    ORDER BY location_row_id, function_row_id, source_code_id, tran_date DESC
) AS a
GROUP BY a.source_code_id

Thanks very much Thomas - I did not know about FILTER.

But it does not quite work. If the SELECT does find a row where the max tran_date is <= '2018-03-31', it correctly includes it in 'cl_tot'. But the filter returns nothing for 'op_tot', because there is no corresponding row where tran_date < '2018-03-01'.

But I have learned something new, so thanks for that.

Frank
Empty tablespace directory after restore with psql -d dbname -f dump.sql
We cancelled a web application upgrade part way through when it was realized Postgres needed a version upgrade first. Using a dump file extracted from a full pg_dumpall dump, the DB was restored with a line like:

psql -d dbname -f dump.sql

The file dump.sql was extracted from the full dump using the pg_extract.sh script provided on Github by someone (not sure which fork). Others may want to take note: this tool does not include the declaration of the TABLESPACE and its location, so the tablespace files end up in the general data area. \db+ reports that the tablespace is defined, but it's just empty.

If you have a CREATE TABLESPACE with LOCATION in your full dump, you should add it back into the extracted result for a single DB.
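In other words, something along these lines needs to run before the extracted script; the tablespace name, owner and path below are placeholders only:

CREATE TABLESPACE app_ts OWNER app_owner LOCATION '/pgdata/tablespaces/app_ts';
-- then restore the single-database dump as before:
--   psql -d dbname -f dump.sql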
Re: case insensitive collation of Greek's sigma
On 26.11.21 at 08:37, Jakub Jedelsky wrote:

Hello,

during our tests of Postgres with ICU we found an issue with ILIKE of upper and lowercase sigma (Σ). The letter has two lowercase variants, σ and ς (at the end of a word). I'm working with the en_US and en-US-x-icu collations, and the results are a bit unexpected - they are inverted:

postgres=# SELECT
postgres-# 'ΣΣ' ILIKE 'σσ' COLLATE "en_US",
postgres-# 'ΣΣ' ILIKE 'σς' COLLATE "en_US"
postgres-# ;
 ?column? | ?column?
----------+----------
 t        | f
(1 row)

postgres=# SELECT
postgres-# 'ΣΣ' ILIKE 'σσ' COLLATE "en-US-x-icu",
postgres-# 'ΣΣ' ILIKE 'σς' COLLATE "en-US-x-icu";
 ?column? | ?column?
----------+----------
 f        | t
(1 row)

I ran those commands on the latest (14.1) official docker image. Is it possible to unify the behaviour? And which one is correct from the community's point of view?

If I may start: I think both results are wrong, as both should return true. If I understand it correctly, in the background a lower() function is run to compare the strings, which is not enough for cases like this (unless the left side is treated as a standalone word).

Thanks,
- jj

Have you seen the subtle intricacies in this example?

=> SELECT 'ΣΣ Μ' ILIKE 'σσ Μ' COLLATE "en_US" AS c0,
          'ΣΣ Μ' ILIKE 'σς Μ' COLLATE "en_US" AS c1,
          'ΣΣ Μ' ~* 'σσ Μ' COLLATE "el-GR-x-icu" AS c2,
          'ΣΣ Μ' ~* 'σς Μ' COLLATE "el-GR-x-icu" AS c3,
          'ΣΣ Μ' ILIKE 'σσ Μ' COLLATE "el-GR-x-icu" AS c4,
          'ΣΣ Μ' ILIKE 'σς Μ' COLLATE "el-GR-x-icu" AS c5,
          'ΣΣ Μ' ~* 'σσ Μ' COLLATE "en-US-x-icu" AS c6,
          'ΣΣ Μ' ~* 'σς Μ' COLLATE "en-US-x-icu" AS c7,
          'ΣΣ Μ' ILIKE 'σσ Μ' COLLATE "en-US-x-icu" AS c8,
          'ΣΣ Μ' ILIKE 'σς Μ' COLLATE "en-US-x-icu" AS c9;

 c0 | c1 | c2 | c3 | c4 | c5 | c6 | c7 | c8 | c9
----+----+----+----+----+----+----+----+----+----
 t  | f  | t  | t  | f  | t  | t  | t  | f  | t
(1 row)

Obviously, the ILIKE operator is really strict regarding the correct letter at the end of the word. The regular expression operator works as you expected.

Happy computing...

Frank
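One way to see what each collation's case folding actually produces, which is where the difference seems to come from (a sketch; the exact output depends on the libc and ICU versions installed):

SELECT lower('ΣΣ' COLLATE "en_US")       AS libc_lower,
       lower('ΣΣ' COLLATE "en-US-x-icu") AS icu_lower,
       upper('σς' COLLATE "en-US-x-icu") AS icu_upper;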
From: rammohan ganapavarapu
To: gp...@free.fr, pgsql-ad...@postgresql.org
Date: 11/20/2017 01:25 PM
Subject: Re: [ADMIN] Can master and slave on different PG versions?

Gilles,

Thank you. If we set wal_level to "logical" in a version <= 9.6, what does it mean and what does it do? Do we still call it logical replication? Also, how can I tell if my replication is logical or physical? Please excuse me if I am asking stupid questions :) I am new to Postgres.

Ram

On Mon, Nov 20, 2017 at 2:20 AM, wrote:

Hello,

Yes, before Postgres version 10, you must use an extension such as pg_logical to build logical replication between master and slave. For pg_logical, see the following links to help you start:

https://www.2ndquadrant.com/en/resources/pglogical/pglogical-docs/
http://bonesmoses.org/2016/10/14/pg-phriday-perfectly-logical/
https://www.depesz.com/2016/11/08/major-version-upgrading-with-minimal-downtime/

Regards
Gilles

- Original Message -
From: "rammohan ganapavarapu"
To: gp...@free.fr, pgsql-ad...@postgresql.org
Sent: Friday 17 November 2017 16:43:57
Subject: Re: [ADMIN] Can master and slave on different PG versions?

Hi,

I was using pg_basebackup to rebuild the slave, so pglogical is not built into postgres and I have to install it?

Thanks

On Nov 17, 2017 12:42 AM, < gp...@free.fr > wrote:

Hello,

How did you create the slave? With pg_basebackup or via rsync? In any case, the slave MUST have been built with a backup/rsync made AFTER the master was restarted when you changed wal_level to logical.

Note that what you did so far, setting wal_level=logical, is just a prerequisite to using logical replication. After that, you need a tool which will extract the changes in the WALs from the master and will replay them on the slave. Such a tool is pg_logical: https://www.2ndquadrant.com/en/resources/pglogical/

Regards

- Original Message -
From: "rammohan ganapavarapu" < rammohanga...@gmail.com >
To: gp...@free.fr, pgsql-ad...@postgresql.org
Sent: Thursday 16 November 2017 17:58:42
Subject: Re: [ADMIN] Can master and slave on different PG versions?

From Slave (9.6):
Database cluster state: shut down in recovery
wal_level setting: logical

From master (9.4):
Database cluster state: in production
Current wal_level setting: logical

Quite frankly I couldn't find a proper document on how to set up logical replication in 9.4; I think I am using pg_logical. All I did was set wal_level to logical and add primary_slot_name in recovery.conf. Do I need to install the pglogical extension? I couldn't find one for Amazon Linux.

Thanks,
Ram

On Thu, Nov 16, 2017 at 6:12 AM, < gp...@free.fr > wrote:

Hello again,

Can you execute the pg_controldata command on both Master and Slave and check the following properties:

Database cluster state
wal_level setting

Also, if I understand correctly, you want to establish logical replication between your 9.4 master and your 9.6 slave, but you don't explain how? Using pg_logical?

Regards

- Original Message -
From: "rammohan ganapavarapu" < rammohanga...@gmail.com >
To: "David G. Johnston" < david.g.johns...@gmail.com >
Cc: pgsql-ad...@postgresql.org
Sent: Wednesday 15 November 2017 22:23:57
Subject: Re: [ADMIN] Can master and slave on different PG versions?

Anyone tried this before?

On Wed, Nov 8, 2017 at 8:17 AM, rammohan ganapavarapu < rammohanga...@gmail.com > wrote:

David,

So initially I had physical replication configured, so to upgrade the slave to 9.6 I converted master/slave to logical replication and then tried to upgrade the slave to 9.6.
I am getting this error when I try to start the slave. Any idea?

2017-11-08 05:08:49 UTC [20063]: [1-1] user= db= host= LOG: shutting down
2017-11-08 05:08:49 UTC [20063]: [2-1] user= db= host= LOG: database system is shut down
2017-11-08 05:09:45 UTC [20322]: [1-1] user= db= host= LOG: database system was shut down at 2017-11-08 05:08:55 UTC
2017-11-08 05:09:45 UTC [20322]: [2-1] user= db= host= LOG: entering standby mode
2017-11-08 05:09:45 UTC [20322]: [3-1] user= db= host= WARNING: WAL was generated with wal_level=minimal, data may be missing
2017-11-08 05:09:45 UTC [20322]: [4-1] user= db= host= HINT: This happens if you temporarily set wal_level=minimal without taking a new base backup.
2017-11-08 05:09:45 UTC [20322]: [5-1] user= db= host= FATAL: hot standby is not possible because wal_level was not set to "replica" or higher on the master server
2017-11-08 05:09:45 UTC [20322]: [6-1] user= db= host= HINT: Either set wal_level to "replica" on the master, or turn off hot_standby here.
2017-11-08 05:09:45 UTC [20312]: [3-1] user= db= host= LOG: startup process (PID 20322) exited with exit code 1
2017-11-08 05:09:45 UTC [20312]: [4-1] user= db= host= LOG: aborting startup due to startup process failure
2017-11-08 05:09:45 UTC [20312]: [5-1] user= db= host= LOG: da
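For what it's worth, the HINTs in that log can be acted on with something like the following on the master (a sketch only; wal_level cannot be changed with a reload, and a fresh base backup for the standby is needed afterwards, exactly as the HINT and the earlier replies say):

SHOW wal_level;

ALTER SYSTEM SET wal_level = 'replica';  -- 'hot_standby' on 9.4/9.5; 'logical' if logical decoding is needed
-- restart the postmaster, then take a new base backup for the standby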
Re: Postgresql database terminates abruptly with too many open files error
Hello,

Have you checked with something like lsof which file descriptors are actually open?

Cheers, frank

On 14.01.25 at 13:58, Sri Mrudula Attili wrote:

Hello Team,

We have a postgresql VDB (virtual database - Delphix) that keeps terminating due to "too many open files".

Below are a few alerts that we could see in the postgresql.log:

< 2025-01-14 11:37:20.724 GMT >LOG: out of file descriptors: Too many open files in system; release and retry
< 2025-01-14 11:37:20.724 GMT >FATAL: epoll_create1 failed: Too many open files in system

The number of allowed open files at OS level is 65000. Even then we see all of these being used up, causing the database to terminate. Is there a way we could avoid this?

Thanks,
Sri
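A couple of quick checks from inside the database that may help narrow it down (a sketch; the OS-level limit itself still has to be raised outside PostgreSQL, and each backend can open up to max_files_per_process files, so the product of these settings roughly bounds PostgreSQL's own usage):

SHOW max_files_per_process;
SHOW max_connections;
SELECT count(*) AS current_backends FROM pg_stat_activity;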
Re: duplicate key value violates unique constraint "chinese_price_infos_pkey"
You can fix the problem with this query:

SELECT setval('chinese_price_infos_id_seq', sq.val)
FROM (
    SELECT MAX(id) AS val FROM chinese_price_infos
) sq;

But you have to search in your application, because at some point the app is inserting the id column itself instead of leaving this task to the DB. If you are using some entity framework and the app fills the ID field of a new entity at some point in the workflow, then when you save the entity the framework automatically saves that ID in the DB without checking whether the ID already exists.

Cheers

On Mon, 2019-05-06 at 16:40 +0530, Arup Rakshit wrote:
> Hi,
>
> Thanks for your reply. It is automatic, my app doesn't create the ID, it delegates
> it to the DB. I am using a Ruby on Rails app, where we use Postgresql.
>
> docking_dev=# \d chinese_price_infos;
>                                    Table "public.chinese_price_infos"
>    Column    |            Type             | Collation | Nullable |                     Default
> -------------+-----------------------------+-----------+----------+-------------------------------------------------
>  id          | integer                     |           | not null | nextval('chinese_price_infos_id_seq'::regclass)
>  created_at  | timestamp without time zone |           |          |
>  updated_at  | timestamp without time zone |           |          |
>  item_code   | character varying(255)      |           |          |
>  description | character varying(255)      |           |          |
>  unit        | character varying(255)      |           |          |
>  price_cents | integer                     |           |          |
>  uuid        | uuid                        |           |          | uuid_generate_v4()
>  company_id  | uuid                        |           |          |
> Indexes:
>     "chinese_price_infos_pkey" PRIMARY KEY, btree (id)
>     "index_chinese_price_infos_on_company_id" btree (company_id)
>
> Thanks,
>
> Arup Rakshit
> a...@zeit.io
>
> > On 06-May-2019, at 4:38 PM, Ray O'Donnell wrote:
> >
> > On 06/05/2019 12:05, Arup Rakshit wrote:
> > > Every time I try to insert I get the error:
> > > docking_dev=# INSERT INTO "chinese_price_infos" ("item_code", "price_cents", "unit", "description", "company_id", "created_at", "updated_at") VALUES ('01GS10001', 6000, 'Lift', 'Shore Crane Rental', '9ae3f8b8-8f3f-491c-918a-efd8f5100a5e', '2019-05-06 10:49:03.894725', '2019-05-06 10:49:03.894725');
> > > ERROR: duplicate key value violates unique constraint "chinese_price_infos_pkey"
> > > DETAIL: Key (id)=(71165) already exists.
> > > docking_dev=# INSERT INTO "chinese_price_infos" ("item_code", "price_cents", "unit", "description", "company_id", "created_at", "updated_at") VALUES ('01GS10001', 6000, 'Lift', 'Shore Crane Rental', '9ae3f8b8-8f3f-491c-918a-efd8f5100a5e', '2019-05-06 10:49:03.894725', '2019-05-06 10:49:03.894725');
> > > ERROR: duplicate key value violates unique constraint "chinese_price_infos_pkey"
> > > DETAIL: Key (id)=(71166) already exists.
> > >
> > > Then I found:
> > > docking_dev=# SELECT MAX(id) FROM chinese_price_infos;
> > >   max
> > > --------
> > >  128520
> > > (1 row)
> > >
> > > docking_dev=# SELECT nextval('chinese_price_infos_id_seq');
> > >  nextval
> > > ---------
> > >    71164
> > > (1 row)
> > >
> > > Not sure how it is out of sync. How can I fix this permanently. I ran
> > > vacuum analyze verbose; still same error.
> >
> > You can fix it by using setval() to set the sequence manually to something
> > higher than the highest current id value in the table. However, it sounds as
> > if something in the application code may be causing problems. For example,
> > is something generating id values without reference to the sequence?
> >
> > Ray.
> >
> > --
> > Raymond O'Donnell // Galway // Ireland
> > r...@rodonnell.ie
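To illustrate the point about leaving id assignment to the DB (a sketch using the table from this thread; the timestamps are placeholders): the id column is simply omitted, so the DEFAULT nextval('chinese_price_infos_id_seq') fires and the sequence stays in step.

INSERT INTO chinese_price_infos
       (item_code, price_cents, unit, description, company_id, created_at, updated_at)
VALUES ('01GS10001', 6000, 'Lift', 'Shore Crane Rental',
        '9ae3f8b8-8f3f-491c-918a-efd8f5100a5e', now(), now())
RETURNING id;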
Re: PITR based recovery in a primary/standby cluster setup
Hello. I'm not sure which replication issues you have, and I have never used WAL-E before, but I had similar issues with PostgreSQL 10 and Barman.

Try starting the primary server first; when it finishes recovery it should start as primary. If not, go to the postgresql data directory, rename recovery.conf to recovery.done, and start the server as primary.

Then start the standby server. When the recovery target is reached, the standby server should not leave recovery status: it should keep receiving WAL through the archive_command, and should not rename its recovery.conf file.

Regards

On Tue, 2019-05-21 at 14:27 +0530, Abhijit Gharami wrote:
> Hi,
>
> We have a primary and standby PostgreSQL cluster setup and also we have
> PITR enabled on it. To improve the recovery time we are thinking of
> recovering the database to both primary and standby at the same time.
>
> These are the steps we are following:
> 1. Restore the base backup to both the primary and the standby server
> 2. Replay the WAL files on both primary and standby and once the
> recovery target is reached stop the servers
> 3. Start one of the servers as primary and the other one as standby
>
> We have followed the above steps, but when we are trying to start the
> servers as primary and standby we are having replication issues.
>
> Could you please suggest what should be done here so that we can
> recover the database on both the primary as well as the standby server?
>
> We are using PostgreSQL version 9.6.12 and for PITR we are using
> WAL-E.
>
> Regards,
> Abhijit
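For reference, on 9.6 with WAL-E the recovery.conf used during step 2 would look roughly like this on both nodes (a sketch only - the envdir path and the target time are placeholders, and the node that is to become the standby additionally keeps standby_mode = 'on' plus a primary_conninfo when it is restarted in step 3):

restore_command = 'envdir /etc/wal-e.d/env wal-e wal-fetch "%f" "%p"'
recovery_target_time = '2019-05-20 12:00:00+00'
recovery_target_action = 'pause'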
Re: Data entry / data editing tools (more end-user focus).
If you have the database modelled, the quickest thing I can think of is the Python framework Django. Configure the connection to the DB and reverse-engineer the models with Django; this creates the entity classes. Then activate the admin site and register a form for each entity (a few lines), and Django does the magic and builds a GUI for the DB. If the entities are related, it also gives you the option to insert the required related entities first. As a plus, you get the access-control module for free.

Regards

On Thu, 2019-05-23 at 11:52 +0200, Tony Shelver wrote: > I looked at quite a few options. Some constraints on my side that > our direction is open source, with Linux development and servers. > Radzen is .NET: I could just as well use MS Access to cobble > together a front end. > > CUBA and OpenXava are Java based and seem to require writing Java for > logic: I last used nearly 20 years ago and 'fast' development and > Java IMHO is an oxymoron. > > Aurelia looks a bit promising, but I am not sure if it gains anything > over the current crop of JS libraries and frameworks, such as Vue, > React et al, and then libraries / frameworks built on top of those > such as Nuxt / Vue, Quasar / Vue or Vuetify / Vue, which seem to > have far more activity on Github. > > I managed to set up a quick and dirty front end using LibreOffice > Base over a weekend, next iteration i will probably move over to a > Vue framework, probably using Quasar.. > > On Sat, 18 May 2019 at 00:26, Stefan Keller > wrote: > > Dear all > > > > > > > > What about following „Rapid App Development Tools"? > > > > * OpenXava (Java): > > https://www.openxava.org/ate/visual-studio-lightswitch > > > > * Radzen (.NET): > > https://www.radzen.com/visual-studio-lightswitch-alternative/ > > > > * Other: https://aurelia.io/ (JS) or CUBA > > https://www.cuba-platform.com/ (Java) > > > > > > > > :Stefan > > > > > > > > Am Do., 28. März 2019 um 15:39 Uhr schrieb Adrian Klaver > > > > : > > > > > > > > > > On 3/27/19 11:49 PM, Tony Shelver wrote: > > > > > > > > > > Please reply to list also, more eyes on the the problem. > > > > > Ccing list > > > > > > > > > > My take on below is since you are feeding a Website why not use > > Web > > > > > technologies for your data entry. My language of choice is > > Python. I > > > > > have done something similar to this(on small scale) using the > > Django > > > > > framework. For something lighter weight there is Flask. Then your > > client > > > > > becomes a browser and you do not have to distribute forms around. > > You > > > > > could integrate with the existing Web apps you are using e.g. > > SnipCart. > > > > > > > > > > > > > > > > Actually I found a possibility. LibreOffice Base on top of PG > > lets me > > > > > > paste photos into a Postgresql bytea field no problem. MS > > Access should > > > > > > work well also, but I am not going to buy it, and running > > Ubuntu most of > > > > > > the time. > > > > > > Possibly will distribute the Base forms to select users to > > enter data. > > > > > > We are a startup company, so this is an affordable temporary > > fix, until > > > > > > the product I have been looking at matures, or we can roll our > > own. > > > > > > > > > > > > We are building a company website, including an eStore, and > > have a few > > > > > > hundred products to load and maintain. Our product data > > currently isn't > > > > > > suitable for a sales catalog. 
> > > > > > (Brands, categories, products, pricing and deal info, specials, > > images, > > > > > > product comparisons and so on). > > > > > > > > > > > > Right now I input / maintain this via CSV files maintained > > through a > > > > > > spreadsheet (LibreOffice Calc) which our site generator > > (Jekyll) uses > > > > > > to build out the static HTML product [pages automatically. > > > > > > This is really quick to enter basic data, but I have to > > manually > > > > > > maintain image uploads, image names and so on manually in the > > > > > > spreadsheet and through individual file uploads. We have at > > least one, > > > > > > preferably 3 and up to 6 photos per product to maintain. Call > > it a 1000 > > > > > > images right now, and that will only go up. > > > > > > Invalid text / characters in product descriptions and so on can > > break > > > > > > the CSV as well. > > > > > > > > > > > > There are headless CMS solutions out on the market targeting > > this same > > > > > > area, but for various reasons the suitable ones are still > > maturing and > > > > > > shaking out in the marketplace, so I am not in a hurry to make > > a choice. > > > > > > > > > > > > So the goal is to replace CSV with JSON file input. This will > > also make > > > > > > my life easier for more complex structures such as multiple > > categories > > > > > > and specs per product. > > > > > > I also want to migrate text that can change from the HTML pages > > into the > > >
Re: One way replication in PostgreSQL
You could use an FDW to replicate what you need to an external server - from the provider/primary/master to the subscriber/secondary/slave. Using triggers on the master tables that you want to replicate, you can execute the insert/update/delete actions on the secondary tables through the FDW. With this approach you only need a connection from the provider to the subscriber.

Regards

On Mon, 2019-06-03 at 18:00 +0200, PALAYRET Jacques wrote:
> Hello,
>
> If, for security reasons, I can't create a connection or a flow from
> subscriber/secondary/slave towards provider/primary/master, which
> replication systems can I use ?
>
> If possible, I would prefer partial replication (only some tables) to
> full base replication (all instances).
>
> Do trigger-based replication systems (like Slony or Londiste or
> others) need a connection or flow from subscriber to the provider ?
>
> Thanks in advance
> - Météo-France -
> PALAYRET JACQUES
> DCSC/MBD
> jacques.palay...@meteo.fr
> Fixe : +33 561078319
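A minimal sketch of that trigger-over-FDW approach with postgres_fdw; every object name, host and credential below is invented for illustration, and the column list is kept deliberately tiny:

-- on the provider (P), the only side that opens an outbound connection
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

CREATE SERVER subscriber_srv
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 's.example.com', dbname 'appdb', port '5432');

CREATE USER MAPPING FOR CURRENT_USER
    SERVER subscriber_srv
    OPTIONS (user 'repl_writer', password 'secret');

-- foreign table pointing at the subscriber's copy
CREATE FOREIGN TABLE accounts_remote (
    id      integer,
    balance numeric
) SERVER subscriber_srv OPTIONS (schema_name 'public', table_name 'accounts');

-- push every change on the local table through the FDW
CREATE OR REPLACE FUNCTION push_accounts() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        INSERT INTO accounts_remote (id, balance) VALUES (NEW.id, NEW.balance);
    ELSIF TG_OP = 'UPDATE' THEN
        UPDATE accounts_remote SET balance = NEW.balance WHERE id = NEW.id;
    ELSIF TG_OP = 'DELETE' THEN
        DELETE FROM accounts_remote WHERE id = OLD.id;
    END IF;
    RETURN NULL;  -- AFTER trigger, return value is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER accounts_push
    AFTER INSERT OR UPDATE OR DELETE ON accounts
    FOR EACH ROW EXECUTE PROCEDURE push_accounts();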
Re: One way replication in PostgreSQL
The FDW is a PostgreSQL extension for connecting to another server from inside a PostgreSQL server. With this solution you only need connections from P to S and no third (external) server; just use triggers to push the INSERT/UPDATE/DELETE information you want to replicate from P to S through the Foreign Data Wrapper.

If you need integrity of the replicated information, then I suggest using a control table to store the actions to replicate, so that when a push fails you can keep retrying until the action succeeds.

Regards

On Tue, 2019-06-04 at 09:02 +0200, PALAYRET Jacques wrote:
> Hello,
>
> Thanks a lot for the suggested solutions.
>
> So, I can use WAL-shipping replication from Primary to the Secondary
> server, but it's only for full replication.
>
> Let's call " P " the provider/primary/master and " S " the
> subscriber/secondary/slave one.
> For partial replication (not all the tables), the solutions should
> use a third (intermediate / middle) server which could have both ways
> flow with the server P but only one way flow towards the server S.
> For example, a logical replication (pglogical or logical Postgresql
> replication) between server P and the middle server and then a WAL-
> shipping replication between middle server and server S.
> Is that right ?
>
> About the " FDW " solution in " an external server " (a middle one),
> is it possible to integrate the FDW in the P server to avoid the "
> external server " ?
>
> => What about the trigger-based replication systems like Slony or
> Londiste ; is it really necessary to have a connection or flow from
> the server S towards the server P ?
>
> Regards
>
> From: "PALAYRET Jacques"
> To: pgsql-general@lists.postgresql.org
> Sent: Monday 3 June 2019 18:00:51
> Subject: One way replication in PostgreSQL
>
> Hello,
>
> If, for security reasons, I can't create a connection or a flow from
> subscriber/secondary/slave towards provider/primary/master, which
> replication systems can I use ?
>
> If possible, I would prefer partial replication (only some tables) to
> full base replication (all instances).
>
> Do trigger-based replication systems (like Slony or Londiste or
> others) need a connection or flow from subscriber to the provider ?
>
> Thanks in advance
> - Météo-France -
> PALAYRET JACQUES
> DCSC/MBD
> jacques.palay...@meteo.fr
> Fixe : +33 561078319
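A sketch of the control-table idea (again with invented names): the trigger only queues the change locally, and a separate job replays unapplied rows over the FDW, so a subscriber outage does not fail the local transaction.

CREATE TABLE replication_queue (
    id         bigserial   PRIMARY KEY,
    table_name text        NOT NULL,
    op         text        NOT NULL,           -- 'INSERT' / 'UPDATE' / 'DELETE'
    payload    jsonb       NOT NULL,           -- e.g. to_jsonb(NEW) or to_jsonb(OLD)
    queued_at  timestamptz NOT NULL DEFAULT now(),
    applied_at timestamptz                     -- set once the row has been pushed to S
);

-- In the trigger function, instead of writing to the foreign table directly:
--   INSERT INTO replication_queue (table_name, op, payload)
--   VALUES (TG_TABLE_NAME, TG_OP, to_jsonb(NEW));
-- A cron job or worker then replays rows WHERE applied_at IS NULL over the FDW,
-- setting applied_at on success and leaving them in place to retry on failure.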
Re: Forks of pgadmin3?
You could try OmniDB. It is a web app, but it has a version that feels just like a desktop application. It is supported by 2ndQuadrant. This is the official website: https://omnidb.org/en/

Greetings

On Fri, 22 Mar 2019 at 4:56, Christian Henz () wrote:
> I know I'm late to the party, but we're only now migrating from
> Postgres 9.x, realizing that pgadmin3 does not support Postgres 11.
>
> I have checked out pgadmin4, but I don't like it at all. My colleagues
> feel the same way, and some web searching suggests that we are not
> alone.
>
> So I wonder if there are any active forks of pgadmin3?
>
> I found some on Github with some significant changes that I assume
> were done by people working for VK, the Russian social network. These
> appear to be personal hacks though (monosyllabic commit messages, build
> scripts added with hard coded local paths etc.).
>
> There are also the Debian packages that have patches adding Postgres
> 10 support among other things. Not sure if there would be interest
> there in continuing to support newer Postgres versions.
>
> Are there other, more organized efforts to continue pgadmin3?
>
> Are there technical reasons why such a continuation would not make
> sense?
>
> Cheers,
> Christian
>
> --
> Christian Henz
> Software Developer, software & vision Sarrazin GmbH & Co. KG