Re: [PATCH] Covering SPGiST index

2020-08-23 Thread Andrey M. Borodin
Hi!

> On 17 Aug 2020, at 21:04, Pavel Borisov  wrote:
> 
> Postgres Professional: http://postgrespro.com
> 

I'm looking into the patch. I have a few notes:

1. I see that in src/backend/access/spgist/README you describe the SP-GiST tuple as 
a sequence of {Value, ItemPtr to heap, Included attributes}. Is this different from 
a regular IndexTuple, where the tid is within the tuple header?

2. Instead of cluttering tuple->nextOffset with bit flags, we could just change 
the tuple header for leaf tuples of covering indexes, i.e. interpret tuples of 
indexes with included attributes differently, iff that makes the code cleaner. There 
are so many changes involving SGLT_SET_OFFSET/SGLT_GET_OFFSET that it seems 
worthwhile to put some effort into researching other ways to represent the two 
bits for the null mask and varatts.

3. The comment "* SPGiST dead tuple: declaration for examining non-live tuples" 
no longer precedes the relevant code, because struct SpGistDeadTupleData was not 
moved.

Thanks!

Best regards, Andrey Borodin.



Re: Yet another fast GiST build (typo)

2020-08-23 Thread Andrey M. Borodin


> On 14 Aug 2020, at 14:21, Pavel Borisov  wrote:
> 
> I see this feature quite useful in concept and decided to test it.

Thanks for reviewing and benchmarking, Pavel!

I agree with your suggestions. PFA patch with relevant changes.

Thanks a lot.

Best regards, Andrey Borodin.



v11-0001-Add-sort-support-for-point-gist_point_sortsuppor.patch
Description: Binary data


v11-0002-Implement-GiST-build-using-sort-support.patch
Description: Binary data


Re: Row estimates for empty tables

2020-08-23 Thread Tom Lane
Pavel Stehule  writes:
> I am sending a patch that is years used in GoodData.

I'm quite unexcited about that.  I'd be the first to agree that the
ten-pages estimate is a hack, but it's not an improvement to ask users
to think of a better value ... especially not as a one-size-fits-all-relations
GUC setting.

I did have an idea that I think is better than my previous one:
rather than lying about the value of relpages, let's represent the
case where we don't know the tuple density by setting reltuples = -1
initially.  This leads to a patch that's a good bit more invasive than
the quick-hack solution, but I think it's a lot cleaner on the whole.

A possible objection is that this changes the FDW API slightly, as
GetForeignRelSize callbacks now need to deal with rel->tuples possibly
being -1.  We could avoid an API break if we made plancat.c clamp
that value to zero; but then FDWs still couldn't tell the difference
between the "empty" and "never analyzed" cases, and I think this is
just as much of an issue for them as for the core code.

I'll add this to the upcoming CF.

regards, tom lane

diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
index fbcf7ca9c9..072a6dc1c1 100644
--- a/contrib/file_fdw/file_fdw.c
+++ b/contrib/file_fdw/file_fdw.c
@@ -996,7 +996,7 @@ estimate_size(PlannerInfo *root, RelOptInfo *baserel,
 	/*
 	 * Estimate the number of tuples in the file.
 	 */
-	if (baserel->pages > 0)
+	if (baserel->tuples >= 0 && baserel->pages > 0)
 	{
 		/*
 		 * We have # of pages and # of tuples from pg_class (that is, from a
diff --git a/contrib/pgstattuple/pgstatapprox.c b/contrib/pgstattuple/pgstatapprox.c
index 3a99333d44..23306e11a7 100644
--- a/contrib/pgstattuple/pgstatapprox.c
+++ b/contrib/pgstattuple/pgstatapprox.c
@@ -195,6 +195,9 @@ statapprox_heap(Relation rel, output_type *stat)
 	stat->tuple_count = vac_estimate_reltuples(rel, nblocks, scanned,
 			   stat->tuple_count);
 
+	/* It's not clear if we could get -1 here, but be safe. */
+	stat->tuple_count = Max(stat->tuple_count, 0);
+
 	/*
 	 * Calculate percentages if the relation has one or more pages.
 	 */
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 9fc53cad68..a31abce7c9 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -692,15 +692,14 @@ postgresGetForeignRelSize(PlannerInfo *root,
 	else
 	{
 		/*
-		 * If the foreign table has never been ANALYZEd, it will have relpages
-		 * and reltuples equal to zero, which most likely has nothing to do
-		 * with reality.  We can't do a whole lot about that if we're not
+		 * If the foreign table has never been ANALYZEd, it will have
+		 * reltuples < 0, meaning "unknown".  We can't do much if we're not
 		 * allowed to consult the remote server, but we can use a hack similar
 		 * to plancat.c's treatment of empty relations: use a minimum size
 		 * estimate of 10 pages, and divide by the column-datatype-based width
 		 * estimate to get the corresponding number of tuples.
 		 */
-		if (baserel->pages == 0 && baserel->tuples == 0)
+		if (baserel->tuples < 0)
 		{
 			baserel->pages = 10;
 			baserel->tuples =
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 1232b24e74..7861379d97 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -1977,6 +1977,10 @@
the planner.  It is updated by VACUUM,
ANALYZE, and a few DDL commands such as
CREATE INDEX.
+   If the table has never yet been vacuumed or
+   analyzed, reltuples
+   contains -1 indicating that the row count is
+   unknown.
   
  
 
diff --git a/doc/src/sgml/fdwhandler.sgml b/doc/src/sgml/fdwhandler.sgml
index 74793035d7..72fa127212 100644
--- a/doc/src/sgml/fdwhandler.sgml
+++ b/doc/src/sgml/fdwhandler.sgml
@@ -130,7 +130,8 @@ GetForeignRelSize(PlannerInfo *root,
  (The initial value is
  from pg_class.reltuples
  which represents the total row count seen by the
- last ANALYZE.)
+ last ANALYZE; it will be -1 if
+ no ANALYZE has been done on this foreign table.)
 
 
 
diff --git a/src/backend/access/gin/ginvacuum.c b/src/backend/access/gin/ginvacuum.c
index 9cd6638df6..0935a6d9e5 100644
--- a/src/backend/access/gin/ginvacuum.c
+++ b/src/backend/access/gin/ginvacuum.c
@@ -727,7 +727,7 @@ ginvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
 	 * entries.  This is bogus if the index is partial, but it's real hard to
 	 * tell how many distinct heap entries are referenced by a GIN index.
 	 */
-	stats->num_index_tuples = info->num_heap_tuples;
+	stats->num_index_tuples = Max(info->num_heap_tuples, 0);
 	stats->estimated_count = info->estimated_count;
 
 	/*
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 44e2224dd5..52dd0c59f0 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/

Re: proposal - function string_to_table

2020-08-23 Thread Peter Smith
I have re-checked the string_to_table_20200821.patch.

Below is one remaining problem.



COMMENT (help text)

+Splits the string at occurrences
+of delimiter and forms the remaining data
+into a table with one text type column.
+If delimiter is NULL,
+each character in the string will become a
+separate element in the array.

It seems there is a cut/paste error from the string_to_array help text:

"separate element in the array" should say "separate row of the table".



>>> Maybe a different choice of function name would be more consistent
>>> with what is already there?
>>> e.g.  split_to_table, string_split_to_table, etc.
>>
>> I don't agree. This function is a twin (with almost identical behaviour) of 
>> the "string_to_array" function, so I think the name is correct.

OK



Kind Regards,
Peter Smith.
Fujitsu Australia




Continuing instability in insert-conflict-specconflict test

2020-08-23 Thread Tom Lane
We've seen repeated failures in the tests added by commit 43e084197:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=anole&dt=2020-08-23%2005%3A46%3A17
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=tern&dt=2020-08-04%2001%3A05%3A30
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dory&dt=2020-03-14%2019%3A35%3A31
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=caiman&dt=2020-04-01%2004%3A10%3A51
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=komodoensis&dt=2020-03-10%2003%3A14%3A13
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=flaviventris&dt=2020-03-10%2011%3A01%3A49
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=flaviventris&dt=2020-03-09%2010%3A59%3A43
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=flaviventris&dt=2020-03-09%2015%3A52%3A50
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=petalura&dt=2020-03-09%2005%3A20%3A07
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=serinus&dt=2020-03-09%2003%3A00%3A15
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=serinus&dt=2020-03-09%2015%3A52%3A53

I dug into this a bit today, and found that I can reproduce the failure
reliably by adding a short delay in the right place, as attached.

However, after studying the test awhile I have to admit that I do not
understand why all these failures look the same, because it seems to
me that this test is a house of cards.  It *repeatedly* expects that
issuing a command to session X will result in session Y reporting
some notice before X's command terminates, even though X's command will
never meet the conditions for isolationtester to think it's blocked.
AFAICS that is nothing but wishful thinking.  Why is it that only one of
those places has failed so far?

regards, tom lane

diff --git a/src/test/isolation/specs/insert-conflict-specconflict.spec b/src/test/isolation/specs/insert-conflict-specconflict.spec
index 6028397491..92459047b6 100644
--- a/src/test/isolation/specs/insert-conflict-specconflict.spec
+++ b/src/test/isolation/specs/insert-conflict-specconflict.spec
@@ -28,6 +28,7 @@ setup
 RAISE NOTICE 'blurt_and_lock_4() called for % in session %', $1, current_setting('spec.session')::int;
 RAISE NOTICE 'acquiring advisory lock on 4';
 PERFORM pg_advisory_xact_lock(current_setting('spec.session')::int, 4);
+PERFORM pg_sleep(0.1);
 RETURN $1;
 END;$$;
 


list of extended statistics on psql

2020-08-23 Thread Tatsuro Yamada

Hi!

I created a POC patch that allows showing a list of extended statistics with
a "\dz" command in psql. I believe this feature helps DBAs and users who
would like to know all extended statistics easily. :-D

I don't have a strong opinion about assigning "\dz". I would prefer "\dx" or
"\de*" over "\dz", but those were already assigned; therefore I used "\dz"
instead.

Please find the attached patch.
Any comments are welcome!

For Example:
===
CREATE TABLE t1 (a INT, b INT);
CREATE STATISTICS stts1 (dependencies) ON a, b FROM t1;
CREATE STATISTICS stts2 (dependencies, ndistinct) ON a, b FROM t1;
CREATE STATISTICS stts3 (dependencies, ndistinct, mcv) ON a, b FROM t1;
ANALYZE t1;

CREATE TABLE t2 (a INT, b INT, c INT);
CREATE STATISTICS stts4 ON b, c FROM t2;
ANALYZE t2;

postgres=# \dz
                   List of extended statistics
 Schema | Table | Name  | Columns | Ndistinct | Dependencies | MCV
--------+-------+-------+---------+-----------+--------------+-----
 public | t1    | stts1 | a, b    | f         | t            | f
 public | t1    | stts2 | a, b    | t         | t            | f
 public | t1    | stts3 | a, b    | t         | t            | t
 public | t2    | stts4 | b, c    | t         | t            | t
(4 rows)

postgres=# \?
...
  \dy [PATTERN]  list event triggers
  \dz [PATTERN]  list extended statistics
  \l[+]   [PATTERN]  list databases
...
===

For now, I haven't written documentation or a regression test for this.
I'll add them later.

Thanks,
Tatsuro Yamada


diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 9902a4a..dc36c98 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -932,6 +932,9 @@ exec_command_d(PsqlScanState scan_state, bool active_branch, const char *cmd)
 			case 'y':			/* Event Triggers */
 				success = listEventTriggers(pattern, show_verbose);
 				break;
+			case 'z':			/* Extended Statistics */
+				success = listExtendedStats(pattern);
+				break;
 			default:
 				status = PSQL_CMD_UNKNOWN;
 		}
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index d81f157..8128b1c 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -4368,6 +4368,67 @@ listEventTriggers(const char *pattern, bool verbose)
 }
 
 /*
+ * \dz
+ *
+ * Briefly describes extended statistics.
+ */
+bool
+listExtendedStats(const char *pattern)
+{
+	PQExpBufferData buf;
+	PGresult   *res;
+	printQueryOpt myopt = pset.popt;
+
+	if (pset.sversion < 1)
+	{
+		char		sverbuf[32];
+
+		pg_log_error("The server (version %s) does not support extended statistics.",
+					 formatPGVersionNumber(pset.sversion, false,
+										   sverbuf, sizeof(sverbuf)));
+		return true;
+	}
+
+	initPQExpBuffer(&buf);
+	printfPQExpBuffer(&buf,
+					  "SELECT "
+					  "stxnamespace::pg_catalog.regnamespace AS \"%s\", "
+					  "stxrelid::pg_catalog.regclass AS \"%s\", "
+					  "stxname AS \"%s\", "
+					  "(SELECT pg_catalog.string_agg(pg_catalog.quote_ident(attname),', ') "
+					  "FROM pg_catalog.unnest(stxkeys) s(attnum) "
+					  "JOIN pg_catalog.pg_attribute a ON (stxrelid = a.attrelid AND "
+					  "a.attnum = s.attnum AND NOT attisdropped)) AS \"%s\", "
+					  "'d' = any(stxkind) AS \"%s\", "
+					  "'f' = any(stxkind) AS \"%s\", "
+					  "'m' = any(stxkind) AS \"%s\"  "
+					  "FROM pg_catalog.pg_statistic_ext stat ",
+					  gettext_noop("Schema"),
+					  gettext_noop("Table"),
+					  gettext_noop("Name"),
+					  gettext_noop("Columns"),
+					  gettext_noop("Ndistinct"),
+					  gettext_noop("Dependencies"),
+					  gettext_noop("MCV"));
+
+	appendPQExpBufferStr(&buf, "ORDER BY 1, 2, 3, 4;");
+
+	res = PSQLexec(buf.data);
+	termPQExpBuffer(&buf);
+	if (!res)
+		return false;
+
+	myopt.nullPrint = NULL;
+	myopt.title = _("List of extended statistics");

Re: Creating a function for exposing memory usage of backend process

2020-08-23 Thread torikoshia

On 2020-08-22 21:18, Michael Paquier wrote:

Thanks for reviewing!


On Fri, Aug 21, 2020 at 11:27:06PM +0900, torikoshia wrote:

OK. Added a regression test on sysviews.sql.
(0001-Added-a-regression-test-for-pg_backend_memory_contex.patch)

Fujii-san gave us an example, but I added a simpler one, considering
the simplicity of the other tests there.


What you have sent in 0001 looks fine to me.  A small test is much
better than nothing.


Added a patch for relocating the codes to mcxtfuncs.c.
(patches/0001-Rellocated-the-codes-for-pg_backend_memory_contexts-.patch)


The same code is moved around line-by-line.

Of course, this restriction makes pg_backend_memory_contexts hard to use
when the user of the target session is not granted pg_monitor, because the
scope of this view is session local.

In this case, I imagine additional operations, something like temporarily
granting pg_monitor to that user.


Hmm.  I am not completely sure either that pg_monitor is the best fit
here, because this view provides information about a bunch of internal
structures.  Something that could easily be done though is to revoke
the access from public, and then users could just set up GRANT
permissions post-initdb, with pg_monitor as one possible choice.  This
is the safest path by default, and this stuff is of a caliber similar
to pg_shmem_allocations in terms of internal contents.


I think this is a better way than what I did in
0001-Rellocated-the-codes-for-pg_backend_memory_contexts-.patch.

Attached a patch.



It seems to me that you are missing one "REVOKE ALL on
pg_backend_memory_contexts FROM PUBLIC" in patch 0003.

By the way, if that was just for me, I would remove used_bytes, which
is just a computation from the total and free numbers.  I'll defer
that point to Fujii-san.
--
Michael



On 2020/08/20 2:59, Kasahara Tatsuhito wrote:
I totally agree that it's not *enough*, but in contrast to you I think
it's a good step. Subsequently we should add a way to get any backend's
memory usage.
It's not too hard to imagine how to serialize it in a way that can be
easily deserialized by another backend. I am imagining something like
sending a procsignal that triggers (probably at CFR() time) a backend to
write its own memory usage into pg_memusage/ or something roughly
like that.


Sounds good. Maybe we can also provide the SQL-callable function
or view to read pg_memusage/, to make the analysis easier.

+1



I'm thinking about starting a new thread to discuss exposing other
backend's memory context.


Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION

From dc4fade9111dc3f91e992c4d5af393dd5ed03270 Mon Sep 17 00:00:00 2001
From: Atsushi Torikoshi 
Date: Mon, 24 Jul 2020 11:14:32 +0900
Subject: [PATCH] Previously pg_backend_memory_contexts didn't have any
 restriction and anyone could access it. However, this view contains some
 internal information about the memory contexts. This policy could cause
 security issues. This patch revokes all on pg_backend_memory_contexts from
 public, so that only superusers can access it.

---
 doc/src/sgml/catalogs.sgml           | 4 ++++
 src/backend/catalog/system_views.sql | 3 +++
 2 files changed, 7 insertions(+)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 1232b24e74..9fe260ecff 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -9697,6 +9697,10 @@

   
 
+  
+   By default, the pg_backend_memory_contexts view can be
+   read only by superusers.
+  
  
 
  
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index ba5a23ac25..a2d61302f9 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -557,6 +557,9 @@ REVOKE EXECUTE ON FUNCTION pg_get_shmem_allocations() FROM PUBLIC;
 CREATE VIEW pg_backend_memory_contexts AS
 SELECT * FROM pg_get_backend_memory_contexts();
 
+REVOKE ALL ON pg_backend_memory_contexts FROM PUBLIC;
+REVOKE EXECUTE ON FUNCTION pg_get_backend_memory_contexts() FROM PUBLIC;
+
 -- Statistics views
 
 CREATE VIEW pg_stat_all_tables AS
-- 
2.18.1



Re: list of extended statistics on psql

2020-08-23 Thread Pavel Stehule
On Mon, 24 Aug 2020 at 5:23, Tatsuro Yamada <
tatsuro.yamada...@nttcom.co.jp> wrote:

> Hi!
>
> I created a POC patch that allows showing a list of extended statistics by
> "\dz" command on psql. I believe this feature helps DBA and users who
> would like to know all extended statistics easily. :-D
>
> I have not a strong opinion to assign "\dz". I prefer "\dx" or "\de*"
> than "\dz" but they were already assigned. Therefore I used "\dz"
> instead of them.
>
> Please find the attached patch.
> Any comments are welcome!
>
> For Example:
> ===
> CREATE TABLE t1 (a INT, b INT);
> CREATE STATISTICS stts1 (dependencies) ON a, b FROM t1;
> CREATE STATISTICS stts2 (dependencies, ndistinct) ON a, b FROM t1;
> CREATE STATISTICS stts3 (dependencies, ndistinct, mcv) ON a, b FROM t1;
> ANALYZE t1;
>
> CREATE TABLE t2 (a INT, b INT, c INT);
> CREATE STATISTICS stts4 ON b, c FROM t2;
> ANALYZE t2;
>
> postgres=# \dz
>                    List of extended statistics
>  Schema | Table | Name  | Columns | Ndistinct | Dependencies | MCV
> --------+-------+-------+---------+-----------+--------------+-----
>  public | t1    | stts1 | a, b    | f         | t            | f
>  public | t1    | stts2 | a, b    | t         | t            | f
>  public | t1    | stts3 | a, b    | t         | t            | t
>  public | t2    | stts4 | b, c    | t         | t            | t
> (4 rows)
>
> postgres=# \?
> ...
>\dy [PATTERN]  list event triggers
>\dz [PATTERN]  list extended statistics
>\l[+]   [PATTERN]  list databases
> ...
> ===
>
> For now, I haven't written a document and regression test for that.
> I'll create it later.
>

+1 good idea

Pavel


> Thanks,
> Tatsuro Yamada
>
>
>


Re: Creating a function for exposing memory usage of backend process

2020-08-23 Thread Fujii Masao




On 2020/08/24 13:01, torikoshia wrote:

On 2020-08-22 21:18, Michael Paquier wrote:

Thanks for reviewing!


On Fri, Aug 21, 2020 at 11:27:06PM +0900, torikoshia wrote:

OK. Added a regression test on sysviews.sql.
(0001-Added-a-regression-test-for-pg_backend_memory_contex.patch)

Fujii-san gave us an example, but I added more simple one considering
the simplicity of other tests on that.


What you have sent in 0001 looks fine to me.  A small test is much
better than nothing.


+1

But as I proposed upthread, what about a bit complicated test as follows,
e.g., to confirm that the internal logic for level works expectedly?

 SELECT name, ident, parent, level, total_bytes >= free_bytes FROM pg_backend_memory_contexts WHERE level = 0;





Added a patch for relocating the codes to mcxtfuncs.c.
(patches/0001-Rellocated-the-codes-for-pg_backend_memory_contexts-.patch)


Thanks for the patch! Looks good to me.
Barring any objection, I will commit this patch at first.




The same code is moved around line-by-line.


Of course, this restriction makes pg_backend_memory_contexts hard to use
when the user of the target session is not granted pg_monitor because the
scope of this view is session local.

In this case, I imagine additional operations something like temporarily
granting pg_monitor to that user.


Hmm.  I am not completely sure either that pg_monitor is the best fit
here, because this view provides information about a bunch of internal
structures.  Something that could easily be done though is to revoke
the access from public, and then users could just set up GRANT
permissions post-initdb, with pg_monitor as one possible choice.  This
is the safest path by default, and this stuff is of a caliber similar
to pg_shmem_allocations in terms of internal contents.


I think this is a better way than what I did in
0001-Rellocated-the-codes-for-pg_backend_memory_contexts-.patch.


You mean 0001-Restrict-the-access-to-pg_backend_memory_contexts-to.patch?



Attached a patch.


Thanks for updating the patch! This also looks good to me.



It seems to me that you are missing one "REVOKE ALL on
pg_backend_memory_contexts FROM PUBLIC" in patch 0003.

By the way, if that was just for me, I would remove used_bytes, which
is just a computation from the total and free numbers.  I'll defer
that point to Fujii-san.


Yeah, I was just thinking that displaying also used_bytes was useful,
but this might be inconsistent with the other views' ways.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: display offset along with block number in vacuum errors

2020-08-23 Thread Amit Kapila
On Fri, Aug 21, 2020 at 12:31 PM Masahiko Sawada
 wrote:
>
> On Thu, 20 Aug 2020 at 21:12, Amit Kapila  wrote:
> >
> >
> > Attached are both the patches. The first one is to improve existing
> > error context information, so I think we should back-patch to 13. The
> > second one is to add additional vacuum error context information, so
> > that is for only HEAD. Does that make sense? Also, let me know if you
> > have any more comments.
>
> Yes, makes sense to me.
>
> I don't have comments on both patches. They look good to me.
>

Thanks, I have pushed the first patch. I'll push the second one
in a day or two unless someone has comments on it.

-- 
With Regards,
Amit Kapila.




Re: Creating a function for exposing memory usage of backend process

2020-08-23 Thread Fujii Masao




On 2020/08/24 13:13, Fujii Masao wrote:



On 2020/08/24 13:01, torikoshia wrote:

On 2020-08-22 21:18, Michael Paquier wrote:

Thanks for reviewing!


On Fri, Aug 21, 2020 at 11:27:06PM +0900, torikoshia wrote:

OK. Added a regression test on sysviews.sql.
(0001-Added-a-regression-test-for-pg_backend_memory_contex.patch)

Fujii-san gave us an example, but I added more simple one considering
the simplicity of other tests on that.


What you have sent in 0001 looks fine to me.  A small test is much
better than nothing.


+1

But as I proposed upthread, what about a bit complicated test as follows,
e.g., to confirm that the internal logic for level works expectedly?

  SELECT name, ident, parent, level, total_bytes >= free_bytes FROM pg_backend_memory_contexts WHERE level = 0;





Added a patch for relocating the codes to mcxtfuncs.c.
(patches/0001-Rellocated-the-codes-for-pg_backend_memory_contexts-.patch)


Thanks for the patch! Looks good to me.
Barring any objection, I will commit this patch at first.


As far as I know, utils/adt is the directory that basically contains the files
for particular types or operators. So ISTM that mcxtfuncs.c doesn't
fit in this directory. Isn't it better to put it in utils/mmgr?


The copyright line in new file mcxtfuncs.c should be changed as follows
because it contains new code?

- * Portions Copyright (c) 1996-2020, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
+ * Portions Copyright (c) 2020, PostgreSQL Global Development Group

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: list of extended statistics on psql

2020-08-23 Thread Julien Rouhaud
On Mon, Aug 24, 2020 at 6:13 AM Pavel Stehule  wrote:
>
> On Mon, 24 Aug 2020 at 5:23, Tatsuro Yamada 
>  wrote:
>>
>> Hi!
>>
>> I created a POC patch that allows showing a list of extended statistics by
>> "\dz" command on psql. I believe this feature helps DBA and users who
>> would like to know all extended statistics easily. :-D
>>
>> I have not a strong opinion to assign "\dz". I prefer "\dx" or "\de*"
>> than "\dz" but they were already assigned. Therefore I used "\dz"
>> instead of them.
>>
>> Please find the attached patch.
>> Any comments are welcome!
>>
>> For Example:
>> ===
>> CREATE TABLE t1 (a INT, b INT);
>> CREATE STATISTICS stts1 (dependencies) ON a, b FROM t1;
>> CREATE STATISTICS stts2 (dependencies, ndistinct) ON a, b FROM t1;
>> CREATE STATISTICS stts3 (dependencies, ndistinct, mcv) ON a, b FROM t1;
>> ANALYZE t1;
>>
>> CREATE TABLE t2 (a INT, b INT, c INT);
>> CREATE STATISTICS stts4 ON b, c FROM t2;
>> ANALYZE t2;
>>
>> postgres=# \dz
>>                    List of extended statistics
>>  Schema | Table | Name  | Columns | Ndistinct | Dependencies | MCV
>> --------+-------+-------+---------+-----------+--------------+-----
>>  public | t1    | stts1 | a, b    | f         | t            | f
>>  public | t1    | stts2 | a, b    | t         | t            | f
>>  public | t1    | stts3 | a, b    | t         | t            | t
>>  public | t2    | stts4 | b, c    | t         | t            | t
>> (4 rows)
>>
>> postgres=# \?
>> ...
>>\dy [PATTERN]  list event triggers
>>\dz [PATTERN]  list extended statistics
>>\l[+]   [PATTERN]  list databases
>> ...
>> ===
>>
>> For now, I haven't written a document and regression test for that.
>> I'll create it later.
>
>
> +1 good idea

+1 that's a good idea.  Please add it to the next commitfest!

You have a typo:

+    if (pset.sversion < 1)
+    {
+        char        sverbuf[32];
+
+        pg_log_error("The server (version %s) does not support extended statistics.",
+                     formatPGVersionNumber(pset.sversion, false,
+                                           sverbuf, sizeof(sverbuf)));
+        return true;
+    }

The version test is missing a 0; the feature otherwise looks OK.

How about using \dX rather than \dz?




Re: Re: [HACKERS] Custom compression methods

2020-08-23 Thread Dilip Kumar
On Thu, Aug 13, 2020 at 5:18 PM Dilip Kumar  wrote:
>

There was some question which Robert has asked in this mail, please
find my answer inline.  Also, I have a few questions regarding further
splitting up this patch.

> On Fri, Jun 19, 2020 at 10:33 PM Robert Haas  wrote:
> >

> >
> > - One thing I really don't like about the patch is that it consumes a
> > bit from infomask2 for a new flag HEAP_HASCUSTOMCOMPRESSED. infomask
> > bits are at a premium, and there's been no real progress in the decade
> > plus that I've been hanging around here in clawing back any bit-space.
> > I think we really need to avoid burning our remaining bits for
> > anything other than a really critical need, and I don't think I
> > understand what the need is in this case.

IIUC, the main reason for using this flag is to decide whether we need
any detoasting for a tuple. For example, if we are rewriting the table
because the compression method changed, then if the
HEAP_HASCUSTOMCOMPRESSED bit is not set in the tuple header and the tuple
length tup->t_len is not greater than TOAST_TUPLE_THRESHOLD, we don't need
to call the heap_toast_insert_or_update function for this tuple. Whereas if
this flag is set, we need to, because we might need to uncompress the data
and re-compress it using a different compression method. The same is
the case with INSERT INTO ... SELECT * FROM.

 I might be missing
> > something, but I'd really strongly suggest looking for a way to get
> > rid of this. It also invents the concept of a TupleDesc flag, and the
> > flag invented is TD_ATTR_CUSTOM_COMPRESSED; I'm not sure I see why we
> > need that, either.

This is also used in a similar way as the above, but for the target
table, i.e. if the target table has a custom-compressed attribute
then maybe we cannot directly insert the tuple, because it might contain
data compressed using the default compression method.

> > - It seems like this kind of approach has a sort of built-in
> > circularity problem. It means that every place that might need to
> > detoast a datum needs to be able to access the pg_am catalog. I wonder
> > if that's actually true. For instance, consider logical decoding. I
> > guess that can do catalog lookups in general, but can it do them from
> > the places where detoasting is happening? Moreover, can it do them
> > with the right snapshot? Suppose we rewrite a table to change the
> > compression method, then drop the old compression method, then try to
> > decode a transaction that modified that table before those operations
> > were performed. As an even more extreme example, suppose we need to
> > open pg_am, and to do that we have to build a relcache entry for it,
> > and suppose the relevant pg_class entry had a relacl or reloptions
> > field that happened to be custom-compressed. Or equally suppose that
> > any of the various other tables we use when building a relcache entry
> > had the same kind of problem, especially those that have TOAST tables.
> > We could just disallow the use of non-default compressors in the
> > system catalogs, but the benefits mentioned in
> > http://postgr.es/m/5541614a.5030...@2ndquadrant.com seem too large to
> > ignore.
> >
> > - I think it would be awfully appealing if we could find some way of
> > dividing this great big patch into some somewhat smaller patches. For
> > example:
> >
> > Patch #1. Add syntax allowing a compression method to be specified,
> > but the only possible choice is pglz, and the PRESERVE stuff isn't
> > supported, and changing the value associated with an existing column
> > isn't supported, but we can add tab-completion support and stuff.
> >
> > Patch #2. Add a second built-in method, like gzip or lz4.

I have already extracted these 2 patches from the main patch set.
But in these patches I am still storing the am_oid in the toast
header. I am not sure whether we can get rid of that, at least for these 2
patches. Wherever we try to uncompress the tuple we would need the tuple
descriptor to get the am_oid, and I think that is not possible in all
cases. Am I missing something here?

> > Patch #3. Add support for changing the compression method associated
> > with a column, forcing a table rewrite.
> >
> > Patch #4. Add support for PRESERVE, so that you can change the
> > compression method associated with a column without forcing a table
> > rewrite, by including the old method in the PRESERVE list, or with a
> > rewrite, by not including it in the PRESERVE list.

Does it make sense to have Patch #3 and Patch #4 without having
Patch #5? I mean, why do we need to support rewrite or PRESERVE unless
we have custom compression methods? Since the built-in
compression methods cannot be dropped, why do we need PRESERVE?

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: ReplicationSlotsComputeRequiredXmin seems pretty questionable

2020-08-23 Thread Amit Kapila
On Sat, Aug 22, 2020 at 8:33 PM Tom Lane  wrote:
>
> Amit Kapila  writes:
> > On Thu, Aug 20, 2020 at 10:28 PM Tom Lane  wrote:
> >> 2. On the other hand, the code is *releasing* the
> >> ReplicationSlotControlLock before it calls
> >> ProcArraySetReplicationSlotXmin, and that seems like a flat out
> >> concurrency bug.
>
> > It is not clear to me how those values can go backward.
>
> After releasing ReplicationSlotControlLock, that code is holding no
> lock at all (in the already_locked=false case I'm concerned about).
> Thus the scenario to consider is:
>
> 1. Walsender A runs ReplicationSlotsComputeRequiredXmin, computes
> some perfectly-valid xmins, releases ReplicationSlotControlLock,
> and then gets swapped out to Timbuktu.
>
> 2. Time passes and the "true values" of those xmins advance thanks
> to other walsender activity.
>
> 3. Walsender B runs ReplicationSlotsComputeRequiredXmin, computes
> some perfectly-valid xmins, and successfully stores them in the
> procarray.
>
> 4. Walsender A returns from never-never land, and stores its now
> quite stale results in the procarray, causing the globally visible
> xmins to go backwards from where they were after step 3.
>
> I see no mechanism in the code that prevents this scenario.
> On reflection I'm not even very sure that the code change
> I'm suggesting would prevent it.  It'd prevent walsenders
> from entering or exiting until we've updated the procarray,
> but there's nothing to stop the furthest-back walsender from
> advancing its values.
>

I think we can prevent that if we allow
ProcArraySetReplicationSlotXmin to update the shared values only when
the new xmin values follow the shared values. I am not very sure that is
safe, but I am not able to think of a problem with it. The other way
could be to always acquire ProcArrayLock before calling
ReplicationSlotsComputeRequiredXmin or before acquiring
ReplicationSlotControlLock, but that seems too restrictive.

-- 
With Regards,
Amit Kapila.




Re: [PG13] Planning (time + buffers) data structure in explain plan (format text)

2020-08-23 Thread hubert depesz lubaczewski
On Fri, Aug 21, 2020 at 07:54:13AM +0200, Pierre Giraud wrote:
> It looks good to me too. Thanks a lot!
> Let's not forget to notify Hubert (depesz) once integrated.

Thanks a lot, and sorry for not responding earlier - vacation.

Best regards,

depesz