Re: Multivariate MCV list vs. statistics target

2019-06-30 Thread David Rowley
On Fri, 21 Jun 2019 at 19:09, Dean Rasheed  wrote:
> Hmm, maybe. I can certainly understand your dislike of using text[].
> I'm not sure that we can confidently say that multivariate statistics
> won't ever need additional tuning knobs, but I have no idea at the
> moment what they might be, and nothing else has come up so far in all
> the time spent considering MCV lists and histograms, so maybe this is
> OK.

I agree with having the stxstattarget column. Even if something did
come up in the future, we could consider merging the stxstattarget
column with a new text[] column at that time.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: make \d pg_toast.foo show its indices ; and, \d toast show its main table ; and \d relkind=I show its partitions (and tablespace)

2019-06-30 Thread Fabien COELHO



There are 3 independent patches associated with one thread and one CF entry.


*** About toast table v3:

Patch applies cleanly, compiles, works for me.

ISTM that the query should be unambiguous: use pg_catalog.pg_class instead 
of pg_class, add an alias (eg c), and use c.FIELD to access an attribute. In 
its current form "pg_class" could resolve to another table depending on 
the search path.
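
For illustration, the kind of schema-qualified lookup meant here could look
roughly like this (a sketch only: the helper name, column list and call site
are assumptions, not the patch's actual code):

/*
 * Hypothetical sketch: find the owning table of a TOAST relation with a
 * schema-qualified, aliased query, so that "pg_class" cannot resolve to a
 * user table earlier in the search_path.
 */
static void
build_toast_owner_query(PQExpBuffer buf, const char *toast_oid)
{
	printfPQExpBuffer(buf,
					  "SELECT n.nspname, c.relname\n"
					  "FROM pg_catalog.pg_class c\n"
					  "  JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace\n"
					  "WHERE c.reltoastrelid = '%s'::pg_catalog.oid",
					  toast_oid);
}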


The C style is broken: on "if () {", the brace must be on the next line. On 
"1 != PQntuples(result)", I would exchange the operands.


PQclear must be called on the main path.

If the table name contains a ", the result looks awkward:

For table: "public.foo"bla"

I'm wondering whether some escaping should be done. Well, it is not done 
for other similar entries, so probably this is bad but okay:-)


There are no tests:-(


*** About toast index v3

Patch applies cleanly, compiles, works for me.

There are no tests:-(

*** About the next one, v4

Patch applies cleanly, compiles. Not sure how to test it.

"switch (*PQgetvalue(result, i, 2))": I understand that relkind is a must 
admit I do not like this style much, an intermediate variable would 
improve readability. Also, a simple if instead of a swich might be more 
appropriate, and be closer to the previous implementation.
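
For instance, something like the following (just a sketch; the column number
and the relkind case shown are illustrative):

		/*
		 * Read relkind into a named variable instead of switching on the
		 * raw PQgetvalue() result.
		 */
		char		relkind = *PQgetvalue(result, i, 2);

		if (relkind == RELKIND_PARTITIONED_INDEX)
		{
			/* ... show the index partitions ... */
		}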


There are no tests:-(

--
Fabien.




Re: proposal - patch: psql - sort_by_size

2019-06-30 Thread Pavel Stehule
Hi

On Sat, 29 Jun 2019 at 10:19, Pavel Stehule wrote:

>
>
> On Sat, 29 Jun 2019 at 9:32, Fabien COELHO wrote:
>
>>
>> Hello Pavel,
>>
>> > \set SORT_BY_SIZE on
>> > \dt -- sorted by schema, name (size is not calculated and is not
>> visible)
>> > \dt+ -- sorted by size
>>
>> Patch applies cleanly, compiles, runs. "make check" ok. doc build ok.
>>
>> There are no tests. Some infrastructure should be in place so that such
>> features can be tested, eg some psql-specific TAP tests. ISTM that there
>> was a patch submitted for that, but I cannot find it :-( Maybe it is
>> combined with some other patch in the CF.
>>
>
> It is not possible - the size of relations is not stable (it can differ
> on some platforms), and because showing the size is the basis of this patch,
> I cannot write tests. Maybe only the set/unset of the variable could be tested.
>
>
>>
>> I agree that the simpler the better for such a feature.
>>
>> ISTM that the fact that the option is ignored on \dt is a little bit
>> annoying. It means that \dt and \dt+ would not show their results in the
>> same order. I understand that the point is to avoid the cost of computing
>> the sizes, but if the user asked for it, should it be done anyway?
>>
>
> It was one objection against some previous patches. At this moment I don't
> see anything wrong with a different order between \dt and \dt+. When the
> "size" column is displayed, the ordering of the report will be clear.
>
> I am not strongly against this - implementing SORT_BY_SIZE support for
> non-verbose mode is only a few lines more. But for now (and it is just my
> opinion and feeling, nothing more), I think sorting reports by invisible
> columns can be messy. But if somebody has a strongly different opinion on
> this point, I am able to accept it. Both variants can make some sense, and
> have some benefits - each variant is consistent with some rule (but they
> cannot both hold at once).
>
>
>> I'm wondering whether it would make sense to have a slightly more generic
>> interface allowing for more values, eg:
>>
>>   \set DESCRIPTION_SORT "name"
>>   \set DESCRIPTION_SORT "size"
>>
>> Well, possibly this is a bad idea, so it is really a question.
>>
>
> We were at this point already :). If you introduce this, then you have to
> support combinations like schema_name, name_schema, size, schema_size, ...
>
> My goal is to implement the most common missing alternative in psql -
> I would not do a too generic implementation - it needs a more complex
> design (and UI), and I don't think people would use it. SORT_BY_SIZE (on/off)
> looks simple, and because (if this is not changed) it has no impact on
> non-verbose mode, it can stay active permanently (and if not, it is not hard
> mental work to set it).
>
> I think a more generic solution needs an interactive UI. I am now working on
> vertical cursor support for pspg https://github.com/okbob/pspg. The next step
> will be sorting by the column under the vertical cursor. So, I hope, that can
> be good enough for simply sorting by any column of a report (but to be user
> friendly, it needs an interactive UI). Because pspg is not installed
> everywhere, I would like to push some simple solution (I prefer simplicity
> over genericity) into psql.
>
>
>
>>
>> +   Setting this variable to on causes so results of
>> +   \d* commands will be sorted by size, when size
>> +   is displayed.
>>
>> Maybe the simpler: "Setting this variable on sorts \d* outputs by size,
>> when size is displayed."
>>
>
I used this text in today's patch.

Regards

Pavel


>
>> ISTM that the documentation is more generic than reality. Does it work
>> with \db+? It seems to work with \dm+.
>>
>> On equality, ISTM it should sort by name as a secondary criterion.
>>
>> I tested a few cases, although not partitioned tables.
>>
>
> Thank you - I now support relations (tables, indexes, ...), databases, and
> tablespaces. The size column is displayed for the data types report, but I am
> not sure about any benefit in this case.
>
> Regards
>
> Pavel
>
>
>> --
>> Fabien.
>>
>>
>>
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index c6c20de243..b04e93c1f2 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -3959,6 +3959,17 @@ bar
 
   
 
+  
+SORT_BY_SIZE
+
+
+Setting this variable to on, sorts
+\d*+, \db+, \l+
+and \dP*+ outputs by size (when size is displayed).
+
+
+  
+
   
SQLSTATE

diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 97167d2c4b..be149391a1 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -223,6 +223,7 @@ describeTablespaces(const char *pattern, bool verbose)
 	PQExpBufferData buf;
 	PGresult   *res;
 	printQueryOpt myopt = pset.popt;
+	const char *sizefunc = NULL;
 
 	if (pset.sversion < 8)
 	{
@@ -265,9 +266,12 @@ describeTablespaces(const char *pattern, bool verbose)
 		  gettext_noop("Options"));
 
 	if (verbose && pset.sversi

Re: check_recovery_target_lsn() does a PG_CATCH without a throw

2019-06-30 Thread Peter Eisentraut
This has been committed.

On 2019-06-24 06:06, Michael Paquier wrote:
> I have been looking at this patch set.  Patch 0001 looks good to me.
> You are removing all traces of a set of timestamp keywords not
> supported anymore, and no objections from my side for this cleanup.
> 
> +#define MAXPG_LSNCOMPONENT 8
> +
>  static bool
>  check_recovery_target_lsn(char **newval, void **extra, GucSource source)
> Let's avoid the duplication for the declarations.  I would suggest to
> move the definitions of MAXPG_LSNLEN and MAXPG_LSNCOMPONENT to
> pg_lsn.h.  Funny part, I was actually in need of this definition a
> couple of days ago for a LSN string in a frontend tool.  I would
> suggest renames at the same time:
> - PG_LSN_LEN
> - PG_LSN_COMPONENT

I ended up rewriting this by extracting the parsing code into
pg_lsn_in_internal() and having both pg_lsn_in() and
check_recovery_target_lsn() call it.  This mirrors similar
arrangements in float.c.
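
The shape of that arrangement is roughly the following (a simplified sketch,
not the committed code; the real parser is stricter than a bare sscanf):

/* shared parser: reports failure through an out parameter instead of throwing */
XLogRecPtr
pg_lsn_in_internal(const char *str, bool *have_error)
{
	uint32		hi,
				lo;

	*have_error = false;
	if (sscanf(str, "%X/%X", &hi, &lo) != 2)
	{
		*have_error = true;
		return InvalidXLogRecPtr;
	}
	return ((uint64) hi << 32) | lo;
}

/* SQL input function: turns a parse failure into a hard error */
Datum
pg_lsn_in(PG_FUNCTION_ARGS)
{
	char	   *str = PG_GETARG_CSTRING(0);
	bool		have_error = false;
	XLogRecPtr	result = pg_lsn_in_internal(str, &have_error);

	if (have_error)
		ereport(ERROR,
				(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
				 errmsg("invalid input syntax for type %s: \"%s\"",
						"pg_lsn", str)));
	PG_RETURN_LSN(result);
}

The GUC check hook calls the same shared function and simply returns false
when *have_error is set, so no PG_CATCH without a rethrow is needed anymore.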

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Add parallelism and glibc dependent only options to reindexdb

2019-06-30 Thread Julien Rouhaud
Hi,

With glibc 2.28 coming, all users will have to reindex almost
every index after a glibc upgrade to guarantee the absence of
corruption.  Unfortunately, reindexdb is not ideal for that, as it
processes everything over a single connection and isn't able to
discard indexes that don't depend on a glibc collation.

PFA a patchset to add parallelism to reindexdb (reusing the
infrastructure in vacuumdb with some additions) and an option to
discard indexes that don't depend on glibc (without any specific
collation filtering or glibc version detection), with updated
regression tests.  Note that this should be applied on top of the
existing reindexdb cleanup & refactoring patch
(https://commitfest.postgresql.org/23/2115/).
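
For the filtering option, the idea boils down to a catalog query along these
lines (a hedged sketch written the way reindexdb embeds its queries; the
patch's actual query, helper name and filtering rules may differ):

/*
 * Hypothetical sketch: list indexes with at least one key column whose
 * collation is neither "C" nor "POSIX", i.e. indexes whose sort order may
 * change across glibc versions.
 */
static PGresult *
get_collation_dependent_indexes(PGconn *conn)
{
	const char *query =
		"SELECT DISTINCT i.indexrelid::pg_catalog.regclass "
		"FROM pg_catalog.pg_index i "
		"  JOIN pg_catalog.pg_attribute a ON a.attrelid = i.indexrelid "
		"  JOIN pg_catalog.pg_collation c ON c.oid = a.attcollation "
		"WHERE c.collname NOT IN ('C', 'POSIX')";

	return PQexec(conn, query);
}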

This was sponsored by VMware, and has been discussed internally with
Kevin and Michael, in Cc.


0003-Add-parallel-processing-to-reindexdb.patch
Description: Binary data


0001-Export-vacuumdb-s-parallel-infrastructure.patch
Description: Binary data


0004-Add-a-glibc-dependent-option.patch
Description: Binary data


0002-Add-a-SimplePtrList-implementation.patch
Description: Binary data


Re: check_recovery_target_lsn() does a PG_CATCH without a throw

2019-06-30 Thread Michael Paquier
On Sun, Jun 30, 2019 at 11:06:58AM +0200, Peter Eisentraut wrote:
> I ended up rewriting this by extracting the parsing code into
> pg_lsn_in_internal() and having both pg_lsn_in() and
> check_recovery_target_lsn() calling it.  This mirrors similar
> arrangements in float.c

The refactoring looks good to me (including what you have just fixed
with PG_RETURN_LSN).  Thanks for considering it.
--
Michael




Fix typos and inconsistencies for HEAD

2019-06-30 Thread Alexander Lakhin
Hello hackers,

Please consider fixing the next bunch of typos and inconsistencies in
the tree:
4.1. AccesShareLock -> AccessShareLock
4.2. AdjustTimestampForTypemod -> AdjustTimestampForTypmod
4.3. AExprConst -> AexprConst
4.4. AlterExtensionOwner_oid - remove (orphaned after 994c36e0)
4.5. AlterTableDropColumn -> ATExecDropColumn (renamed in 077db40f)
4.6. ApplySortComparatorFull -> ApplySortAbbrevFullComparator
4.7. arracontjoinsel -> arraycontjoinsel
4.8. ArrayNItems -> ArrayGetNItems
4.9. ArrayRef -> SubscriptingRef (renamed by 558d77f2)
4.10. AtPrepare_Inval - remove (orphaned after efc16ea52)

4.11. AttributeOffsetGetAttributeNumber -> AttrOffsetGetAttrNumber
4.12. AttInMetaData -> AttInMetadata
4.13. AuthenticationMD5 -> AuthenticationMD5Password (for the sake of
consistency with the docs)
4.14. AUTH_REQ_GSSAPI -> AUTH_REQ_GSS
4.15. autogened -> autogenerated
4.16. BarrierWait -> BarrierArriveAndWait()
4.17. bgprocno -> bgwprocno
4.18. BGW_NVER_RESTART -> BGW_NEVER_RESTART
4.19. BloomInitBuffer -> BloomInitPage
4.20. br_deconstruct_tuple -> brin_deconstruct_tuple

4.21. brin_tuples.c -> brin_tuple.c
4.22. bt_parallel_done -> _bt_parallel_done
4.23. bt_parallel_release -> _bt_parallel_release
4.24. btree_insert_redo -> btree_xlog_insert
4.25. bucket_has_garbage -> split_cleanup
4.26. byta -> bytea
4.27. CachePlan -> CachedPlan
4.28. CheckBufferLeaks -> CheckForBufferLeaks
4.29. check_for_free_segments -> check_for_freed_segments
4.30. chunkbit -> schunkbit

4.31. cking -> remove (the comment is old and irrelevant since PG95-1_01)
4.32. ClearPgTM -> ClearPgTm
4.33. close_ -> closept_
4.34. CloseTransient File -> CloseTransientFile
4.35. colorTrigramsGroups -> colorTrigramGroups
4.36. combinedproj -> remove (orphaned after 69c3936a)
4.37. contigous_pages -> contiguous_pages
4.38. cookies_len -> cookies_size
4.39. cost_tableexprscan -> remove (not used since introduction in fcec6caa)
4.40. create_custom_plan -> create_customscan_plan

4.41. CreateInitialDecodingContext -> CreateInitDecodingContext
4.42. CreateSlot -> CreateSlotOnDisk
4.43. create_tablexprscan_path -> remove (not used since introduction in
fcec6caa)
4.44. crypt_des -> px_crypt_des
4.45. ctrigOid -> trigOid
4.46. curCollations -> colCollations
4.47. cur_mem & prev_mem -> cur_em & prev_em
4.48. customer_id_indexdex -> customer_id_index
4.49. custom_scan -> cscan

I've split the proposed patch to make checking the fixes simpler.

Best regards,
Alexander
diff --git a/contrib/bloom/blutils.c b/contrib/bloom/blutils.c
index ee3bd56274..cc1670934f 100644
--- a/contrib/bloom/blutils.c
+++ b/contrib/bloom/blutils.c
@@ -341,7 +341,7 @@ BloomPageAddItem(BloomState *state, Page page, BloomTuple *tuple)
 /*
  * Allocate a new page (either by recycling, or by extending the index file)
  * The returned buffer is already pinned and exclusive-locked
- * Caller is responsible for initializing the page by calling BloomInitBuffer
+ * Caller is responsible for initializing the page by calling BloomInitPage
  */
 Buffer
 BloomNewBuffer(Relation index)
diff --git a/src/backend/access/brin/brin_tuple.c b/src/backend/access/brin/brin_tuple.c
index 5abb472ee4..7e9e73d2cf 100644
--- a/src/backend/access/brin/brin_tuple.c
+++ b/src/backend/access/brin/brin_tuple.c
@@ -207,7 +207,7 @@ brin_form_tuple(BrinDesc *brdesc, BlockNumber blkno, BrinMemTuple *tuple,
 		/*
 		 * Note that we reverse the sense of null bits in this module: we
 		 * store a 1 for a null attribute rather than a 0.  So we must reverse
-		 * the sense of the att_isnull test in br_deconstruct_tuple as well.
+		 * the sense of the att_isnull test in brin_deconstruct_tuple as well.
 		 */
 		bitP = ((bits8 *) ((char *) rettuple + SizeOfBrinTuple)) - 1;
 		bitmask = HIGHBIT;
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index f5db5a8c4a..b66b517aca 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -525,7 +525,7 @@ ResetBackgroundWorkerCrashTimes(void)
 		if (rw->rw_worker.bgw_restart_time == BGW_NEVER_RESTART)
 		{
 			/*
-			 * Workers marked BGW_NVER_RESTART shouldn't get relaunched after
+			 * Workers marked BGW_NEVER_RESTART shouldn't get relaunched after
 			 * the crash, so forget about them.  (If we wait until after the
 			 * crash to forget about them, and they are parallel workers,
 			 * parallel_terminate_count will get incremented after we've
diff --git a/src/backend/storage/buffer/freelist.c b/src/backend/storage/buffer/freelist.c
index 06659ab265..c8d4e6f9e4 100644
--- a/src/backend/storage/buffer/freelist.c
+++ b/src/backend/storage/buffer/freelist.c
@@ -220,7 +220,7 @@ StrategyGetBuffer(BufferAccessStrategy strategy, uint32 *buf_state)
 	 * If asked, we need to waken the bgwriter. Since we don't want to rely on
 	 * a spinlock for this we force a read from shared memory once, and then
 	 * set the latch based on that value. We need to go through that length
-	 * because otherwise bgprocno might be reset while/

Re: [PATCH] Implement uuid_version()

2019-06-30 Thread Fabien COELHO



Hello Peter,


Yeah, I think implementing a v4 generator in core would be trivial and
address almost everyone's requirements.


Here is a proposed patch for this.  I did a fair bit of looking around
in other systems for a naming pattern but didn't find anything
consistent.  So I ended up just taking the function name and code from
pgcrypto.

As you can see, the code is trivial and has no external dependencies.  I
think this would significantly upgrade the usability of the uuid type.
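
For reference, a version-4 generator of the kind being proposed amounts to
this (a sketch based on the pgcrypto approach mentioned above; the final
function name and error wording are assumptions):

/*
 * Generate a version-4 (random) UUID: 122 random bits, with the version
 * and variant fields set as RFC 4122 requires.
 */
Datum
gen_random_uuid(PG_FUNCTION_ARGS)
{
	pg_uuid_t  *uuid = palloc(UUID_LEN);

	if (!pg_strong_random(uuid, UUID_LEN))
		ereport(ERROR,
				(errcode(ERRCODE_INTERNAL_ERROR),
				 errmsg("could not generate random values")));

	/* set version field, top four bits are 0100 (version 4) */
	uuid->data[6] = (uuid->data[6] & 0x0f) | 0x40;

	/* set variant field, top two bits are 10 */
	uuid->data[8] = (uuid->data[8] & 0x3f) | 0x80;

	PG_RETURN_UUID_P(uuid);
}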


Patch applies cleanly.

However, it does not compile; it fails on: "Duplicate OIDs detected: 3429".

Someone inserted a new entry since the patch was produced.

I'm wondering whether pg_random_uuid() should be taken out of pgcrypto if 
it is available in core?


--
Fabien.




Re: Where is SSPI auth username determined for TAP tests?

2019-06-30 Thread Tom Lane
Michael Paquier  writes:
> On Sat, Jun 29, 2019 at 04:36:51PM -0400, Tom Lane wrote:
>> I think this is likely a consequence of ca129e58c0 having modified
>> 010_dump_connstr.pl to use "regress_postgres" not "postgres" as the
>> bootstrap superuser name in the source cluster.  I suppose I overlooked
>> some dependency on the user name that only affects SSPI ... but what?

> Didn't you get trapped with something similar to what has been fixed
> in d9f543e?  If you want pg_hba.conf to be correctly set up for SSPI
> on Windows, you should pass "auth_extra => ['--create-role',
> 'regress_postgres']" to the init() method initializing the node.

After further study, I think the root issue here is that pg_regress.c's
config_sspi_auth() has no provision for non-default bootstrap superuser
names --- it makes a mapping entry for (what should be) the default
superuser name whether the cluster is using that or not.  I now see that
010_dump_connstr.pl is hacking around that by doing

my $envar_node = get_new_node('destination_envar');
$envar_node->init(extra =>
  [ '-U', $dst_bootstrap_super, '--locale=C', '--encoding=LATIN1' ]);
$envar_node->run_log(
[
$ENV{PG_REGRESS},  '--config-auth',
$envar_node->data_dir, '--create-role',
"$dst_bootstrap_super,$restore_super"
]);

that is, it's explicitly listing the non-default bootstrap superuser
among the roles to be "created".  This is all pretty weird and
undocumented ...

We could apply the same hack on the source node, but I wonder if it
wouldn't be better to make this less of a kluge.  I'm tempted to
propose that "pg_regress --config-auth --user XXX" should understand
XXX as the bootstrap superuser name, and then we could clean up
010_dump_connstr.pl by using that.
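
In code terms the idea is roughly this (names and the pg_ident.conf line
format are illustrative only, not pg_regress.c's actual code): pass the
requested bootstrap superuser down to the SSPI setup instead of assuming the
default name.

static void
config_sspi_auth(const char *pgdata, const char *superuser_name,
				 const char *accountname, const char *domainname)
{
	char		fname[MAXPGPATH];
	FILE	   *ident;

	snprintf(fname, sizeof(fname), "%s/pg_ident.conf", pgdata);
	if ((ident = fopen(fname, "a")) == NULL)
		exit(2);

	/* map the OS login to the (possibly non-default) bootstrap superuser */
	fprintf(ident, "regress  \"%s@%s\"  \"%s\"\n",
			accountname, domainname, superuser_name);

	fclose(ident);
}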

regards, tom lane




Re: [PATCH] Implement uuid_version()

2019-06-30 Thread Tomas Vondra

On Fri, Jun 28, 2019 at 03:24:03PM -0700, Peter Geoghegan wrote:

On Mon, Apr 8, 2019 at 11:04 PM Peter Eisentraut
 wrote:

Yeah, I think implementing a v4 generator in core would be trivial and
address almost everyone's requirements.


FWIW, I think that we could do better with nbtree page splits given
sequential UUIDs of one form or another [1]. We could teach
nbtsplitloc.c to pack leaf pages full of UUIDs in the event of the
user using sequential UUIDs. With a circular UUID prefix, I think
you'll run into an issue similar to the issue that was addressed by
the "split after new tuple" optimization -- most leaf pages end up 50%
full. I've not verified this, but I can't see why it would be any
different to other multimodal key space with sequential insertions
that are grouped.


I think the state with pages being only 50% full is only temporary,
because thanks to the prefix being circular we'll get back to the page
eventually and add more tuples to it.

It's not quite why I made the prefix circular (in my extension) - that was
to allow reuse of space after deleting rows. But I think it should help
with this too.



Detecting this in UUIDs may or may not require
opclass infrastructure. Either way, I'm not likely to work on it until
there is a clear target, such as a core or contrib sequential UUID
generator. Though I am looking at various ways to improve nbtsplitloc.c
for Postgres 13 -- I suspect that additional wins are possible.



I'm not against improving this, although I don't have a very clear idea
how it should work in the end. But UUIDs are used pretty commonly so it's
a worthwhile optimization area.


Any sequential UUID scheme will already have far fewer problems with
indexing today, since random UUIDs are *dreadful*, but I can imagine
doing quite a lot better still. Application developers love UUIDs. We
should try to meet them where they are.



I agree.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: Dead encoding conversion functions

2019-06-30 Thread Tom Lane
Peter Eisentraut  writes:
> On 2019-05-29 21:03, Tom Lane wrote:
>> If we do delete them as useless, it might also be advisable to change
>> CreateConversionCommand() to refuse creation of conversions to/from
>> SQL_ASCII, to prevent future confusion.

> It seems nonsensical by definition to allow that.

Here's a completed patch for that.  Obviously this is a bit late
for v12, but if there aren't objections I'll push this soon after
v13 opens.

regards, tom lane

diff --git a/doc/src/sgml/charset.sgml b/doc/src/sgml/charset.sgml
index a2a46c6..29fe33a 100644
--- a/doc/src/sgml/charset.sgml
+++ b/doc/src/sgml/charset.sgml
@@ -1896,7 +1896,11 @@ RESET client_encoding;
 
  If the client character set is defined as SQL_ASCII,
  encoding conversion is disabled, regardless of the server's character
- set.  Just as for the server, use of SQL_ASCII is unwise
+ set.  (However, if the server's character set is
+ not SQL_ASCII, the server will still check that
+ incoming data is valid for that encoding; so the net effect is as
+ though the client character set were the same as the server's.)
+ Just as for the server, use of SQL_ASCII is unwise
  unless you are working with all-ASCII data.
 

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 3a8581d..7b6e530 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -2497,18 +2497,6 @@
 
  
   
-   ascii_to_mic
-   SQL_ASCII
-   MULE_INTERNAL
-  
-
-  
-   ascii_to_utf8
-   SQL_ASCII
-   UTF8
-  
-
-  
big5_to_euc_tw
BIG5
EUC_TW
@@ -2779,12 +2767,6 @@
   
 
   
-   mic_to_ascii
-   MULE_INTERNAL
-   SQL_ASCII
-  
-
-  
mic_to_big5
MULE_INTERNAL
BIG5
@@ -2905,12 +2887,6 @@
   
 
   
-   utf8_to_ascii
-   UTF8
-   SQL_ASCII
-  
-
-  
utf8_to_big5
UTF8
BIG5
diff --git a/doc/src/sgml/ref/create_conversion.sgml b/doc/src/sgml/ref/create_conversion.sgml
index 4ddbcfa..67b4bd2 100644
--- a/doc/src/sgml/ref/create_conversion.sgml
+++ b/doc/src/sgml/ref/create_conversion.sgml
@@ -28,12 +28,15 @@ CREATE [ DEFAULT ] CONVERSION name
 
   
CREATE CONVERSION defines a new conversion between
-   character set encodings.  Also, conversions that
-   are marked DEFAULT can be used for automatic encoding
-   conversion between
-   client and server. For this purpose, two conversions, from encoding A to
-   B and from encoding B to A, must be defined.
- 
+   two character set encodings.
+  
+
+  
+   Conversions that are marked DEFAULT can be used for
+   automatic encoding conversion between client and server.  To support that
+   usage, two conversions, from encoding A to B and
+   from encoding B to A, must be defined.
+  
 
   
To be able to create a conversion, you must have EXECUTE privilege
@@ -123,6 +126,13 @@ conv_proc(
   Notes
 
   
+   Neither the source nor the destination encoding can
+   be SQL_ASCII, as the server's behavior for cases
+   involving the SQL_ASCII encoding is
+   hard-wired.
+  
+
+  
Use DROP CONVERSION to remove user-defined conversions.
   
 
diff --git a/src/backend/commands/conversioncmds.c b/src/backend/commands/conversioncmds.c
index 5afe867..cb856e8 100644
--- a/src/backend/commands/conversioncmds.c
+++ b/src/backend/commands/conversioncmds.c
@@ -73,6 +73,18 @@ CreateConversionCommand(CreateConversionStmt *stmt)
 		to_encoding_name)));
 
 	/*
+	 * We consider conversions to or from SQL_ASCII to be meaningless.  (If
+	 * you wish to change this, note that pg_do_encoding_conversion() and its
+	 * sister functions have hard-wired fast paths for any conversion in which
+	 * the source or target encoding is SQL_ASCII, so that an encoding
+	 * conversion function declared for such a case will never be used.)
+	 */
+	if (from_encoding == PG_SQL_ASCII || to_encoding == PG_SQL_ASCII)
+		ereport(ERROR,
+(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+ errmsg("encoding conversion to or from \"SQL_ASCII\" is not supported")));
+
+	/*
 	 * Check the existence of the conversion function. Function name could be
 	 * a qualified name.
 	 */
diff --git a/src/backend/utils/mb/conv.c b/src/backend/utils/mb/conv.c
index f2b51ac..3ecc92b 100644
--- a/src/backend/utils/mb/conv.c
+++ b/src/backend/utils/mb/conv.c
@@ -133,51 +133,6 @@ mic2latin(const unsigned char *mic, unsigned char *p, int len,
 
 
 /*
- * ASCII ---> MIC
- *
- * While ordinarily SQL_ASCII encoding is forgiving of high-bit-set
- * characters, here we must take a hard line because we don't know
- * the appropriate MIC equivalent.
- */
-void
-pg_ascii2mic(const unsigned char *l, unsigned char *p, int len)
-{
-	int			c1;
-
-	while (len > 0)
-	{
-		c1 = *l;
-		if (c1 == 0 || IS_HIGHBIT_SET(c1))
-			report_invalid_encoding(PG_SQL_ASCII, (const char *) l, len);
-		*p++ = c1;
-		l++;
-		len--;
-	}
-	*p = '\0';
-}
-
-/*
- 

Re: Fix up grouping sets reorder

2019-06-30 Thread Andrew Gierth
> "Richard" == Richard Guo  writes:

 Richard> Hi all,

 Richard> During the reorder of grouping sets into correct prefix order,
 Richard> if only one aggregation pass is needed, we follow the order of
 Richard> the ORDER BY clause to the extent possible, to minimize the
 Richard> chance that we add unnecessary sorts. This is implemented in
 Richard> preprocess_grouping_sets --> reorder_grouping_sets.

 Richard> However, current codes fail to do that.

You're correct, thanks for the report.

Your fix works, but I prefer to refactor the conditional logic slightly
instead, removing the outer if{}. So I didn't use your exact patch in
the fix I just committed.

-- 
Andrew (irc:RhodiumToad)




RE: [bug fix] Produce a crash dump before main() on Windows

2019-06-30 Thread Tsunakawa, Takayuki
From: Amit Kapila [mailto:amit.kapil...@gmail.com]
> Tsunakawa/Haribabu - By reading this thread briefly, it seems we need
> some more inputs from other developers on whether to fix this or not,
> so ideally the status of this patch should be 'Needs Review'.  Why it
> is in 'Waiting on Author' state?  If something is required from
> Author, then do we expect to see the updated patch in the next few
> days?

Thank you for paying attention to this.  I think the patch is good, but someone 
else may have a different solution.  So I marked it as needs review.


Regards
Takayuki Tsunakawa




RE: Commitfest 2019-07, the first of five* for PostgreSQL 13

2019-06-30 Thread Tsunakawa, Takayuki
From: Stephen Frost [mailto:sfr...@snowman.net]
> sh, don't look now, but there might be a "Resend email" button in the
> archives now that you can click to have an email sent to you...
> 
> Note that you have to be logged in, and the email will go to the email address
> that you're logging into the community auth system with.
> 
> (thank you Magnus)

Thank you so much, Magnus.  This is very convenient.  I'm forced to use Outlook 
at work, which doesn't allow replying to a downloaded email.  Your help 
eliminates the need to save all emails.


Regards
Takayuki Tsunakawa





Re: mcvstats serialization code is still shy of a load

2019-06-30 Thread Tom Lane
Tomas Vondra  writes:
> Attached is a slightly improved version of the serialization patch.

I reviewed this patch, and tested it on hppa and ppc.  I found one
serious bug: in the deserialization varlena case, you need

-   dataptr += MAXALIGN(len);
+   dataptr += MAXALIGN(len + VARHDRSZ);

(approx. line 1100 in mcv.c).  Without this, the output data is corrupt,
plus the Assert a few lines further down about dataptr having been
advanced by the correct amount fires.  (On one machine I tested on,
that happened during the core regression tests.  The other machine
got through regression, but trying to do "select * from pg_stats_ext;"
afterwards exhibited the crash.  I didn't investigate closely, but
I suspect the difference has to do with different MAXALIGN values,
4 and 8 respectively.)

The attached patch (a delta atop your v2) corrects that plus some
cosmetic issues.

If we're going to push this, it would be considerably less complicated
to do so before v12 gets branched --- not long after that, there will be
catversion differences to cope with.  I'm planning to make the branch
tomorrow (Monday), probably ~1500 UTC.  Just sayin'.

regards, tom lane

diff --git a/src/backend/statistics/mcv.c b/src/backend/statistics/mcv.c
index 256728a..cfe7e54 100644
--- a/src/backend/statistics/mcv.c
+++ b/src/backend/statistics/mcv.c
@@ -40,13 +40,14 @@
  * stored in a separate array (deduplicated, to minimize the size), and
  * so the serialized items only store uint16 indexes into that array.
  *
- * Each serialized item store (in this order):
+ * Each serialized item stores (in this order):
  *
  * - indexes to values	  (ndim * sizeof(uint16))
  * - null flags			  (ndim * sizeof(bool))
  * - frequency			  (sizeof(double))
  * - base_frequency		  (sizeof(double))
  *
+ * There is no alignment padding within an MCV item.
  * So in total each MCV item requires this many bytes:
  *
  *	 ndim * (sizeof(uint16) + sizeof(bool)) + 2 * sizeof(double)
@@ -61,7 +62,7 @@
 	(VARHDRSZ + sizeof(uint32) * 3 + sizeof(AttrNumber))
 
 /*
- * Size of the serialized MCV list, exluding the space needed for
+ * Size of the serialized MCV list, excluding the space needed for
  * deduplicated per-dimension values. The macro is meant to be used
  * when it's not yet safe to access the serialized info about amount
  * of data for each column.
@@ -619,6 +620,7 @@ statext_mcv_serialize(MCVList *mcvlist, VacAttrStats **stats)
 		else if (info[dim].typlen == -1)	/* varlena */
 		{
 			info[dim].nbytes = 0;
+			info[dim].nbytes_aligned = 0;
 			for (i = 0; i < info[dim].nvalues; i++)
 			{
 Size		len;
@@ -646,6 +648,7 @@ statext_mcv_serialize(MCVList *mcvlist, VacAttrStats **stats)
 		else if (info[dim].typlen == -2)	/* cstring */
 		{
 			info[dim].nbytes = 0;
+			info[dim].nbytes_aligned = 0;
 			for (i = 0; i < info[dim].nvalues; i++)
 			{
 Size		len;
@@ -743,7 +746,7 @@ statext_mcv_serialize(MCVList *mcvlist, VacAttrStats **stats)
  * assumes little endian behavior.  store_att_byval does
  * almost what we need, but it requires properly aligned
  * buffer - the output buffer does not guarantee that. So we
- * simply use a static Datum variable (which guarantees proper
+ * simply use a local Datum variable (which guarantees proper
  * alignment), and then copy the value from it.
  */
 store_att_byval(&tmp, value, info[dim].typlen);
@@ -759,14 +762,14 @@ statext_mcv_serialize(MCVList *mcvlist, VacAttrStats **stats)
 			}
 			else if (info[dim].typlen == -1)	/* varlena */
 			{
-uint32		len = VARSIZE_ANY_EXHDR(value);
+uint32		len = VARSIZE_ANY_EXHDR(DatumGetPointer(value));
 
 /* copy the length */
 memcpy(ptr, &len, sizeof(uint32));
 ptr += sizeof(uint32);
 
 /* data from the varlena value (without the header) */
-memcpy(ptr, VARDATA(DatumGetPointer(value)), len);
+memcpy(ptr, VARDATA_ANY(DatumGetPointer(value)), len);
 ptr += len;
 			}
 			else if (info[dim].typlen == -2)	/* cstring */
@@ -1100,7 +1103,7 @@ statext_mcv_deserialize(bytea *data)
 	map[dim][i] = PointerGetDatum(dataptr);
 
 	/* skip to place of the next deserialized value */
-	dataptr += MAXALIGN(len);
+	dataptr += MAXALIGN(len + VARHDRSZ);
 }
 			}
 			else if (info[dim].typlen == -2)
diff --git a/src/include/statistics/extended_stats_internal.h b/src/include/statistics/extended_stats_internal.h
index 6778746..8fc5419 100644
--- a/src/include/statistics/extended_stats_internal.h
+++ b/src/include/statistics/extended_stats_internal.h
@@ -36,7 +36,7 @@ typedef struct DimensionInfo
 {
 	int			nvalues;		/* number of deduplicated values */
 	int			nbytes;			/* number of bytes (serialized) */
-	int			nbytes_aligned;	/* deserialized data with alignment */
+	int			nbytes_aligned;	/* size of deserialized data with alignment */
 	int			typlen;			/* pg_type.typlen */
 	bool		typbyval;		/* pg_type.typby

Re: Fix typos and inconsistencies for HEAD

2019-06-30 Thread Michael Paquier
On Sun, Jun 30, 2019 at 04:06:47PM +0300, Alexander Lakhin wrote:
> 4.33. close_ -> closept_

This one is incorrect as it refers to the various close_* routines
below.

> 4.36. combinedproj -> remove (orphaned after 69c3936a)

This looks intentional?

> I've split proposed patch to make the fixes checking simpler.

Agreed with the rest, and applied.  Thanks!
--
Michael




RE: Zedstore - compressed in-core columnar storage

2019-06-30 Thread Tsunakawa, Takayuki
From: Ashwin Agrawal [mailto:aagra...@pivotal.io]
> The objective is to gather feedback on design and approach to the same.
> The implementation has core basic pieces working but not close to complete.

Thank you for proposing a very interesting topic.  Are you thinking of 
including this in PostgreSQL 13 if possible?


> * All Indexes supported
...
> work. Btree indexes can be created. Btree and bitmap index scans work.

Does Zedstore allow creating indexes of existing types on the table (btree, 
GIN, BRIN, etc.) and performing index scans (point query, range query, etc.)?


> * Hybrid row-column store, where some columns are stored together, and
>   others separately. Provide flexibility of granularity on how to
>   divide the columns. Columns accessed together can be stored
>   together.
...
> This way of laying out the data also easily allows for hybrid row-column
> store, where some columns are stored together, and others have a dedicated
> B-tree. Need to have user facing syntax to allow specifying how to group
> the columns.
...
> Zedstore Table can be
> created using command:
> 
> CREATE TABLE  (column listing) USING zedstore;

Are you aiming to enable Zedstore to be used for HTAP, i.e. the same table can 
be accessed simultaneously for both OLTP and analytics, with minimal 
performance impact on OLTP?  (I got that impression from the word "hybrid".)
If yes, is the assumption that only a limited number of columns are to be 
stored in columnar format (for efficient scanning), and many other columns are 
to be stored in row format for efficient tuple access?
Are those row-formatted columns stored in the same file as the column-formatted 
columns, or in a separate file?

Regarding the column grouping, can I imagine something like HBase and Cassandra?
How could the current CREATE TABLE syntax support column grouping?  (I guess 
CREATE TABLE needs a syntax for the columnar store, and Zedstore needs to be 
incorporated in core, not as an extension...)


> A column store uses the same structure but we have *multiple* B-trees, one
> for each column, all indexed by TID. The B-trees for all columns are stored
> in the same physical file.

Did you think that it's not a good idea to have a different file for each group 
of columns?  Is that because we can't expect physical adjacency of data blocks 
on disk even if we separate a column into a separate file?

I thought a separate file for each group of columns would be easier and less 
error-prone to implement and debug.  Adding and dropping the column group would 
also be very easy and fast.


Regards
Takayuki Tsunakawa




Re: Minimal logical decoding on standbys

2019-06-30 Thread Amit Khandekar
On Tue, 25 Jun 2019 at 19:14, Robert Haas  wrote:
>
> On Fri, Jun 21, 2019 at 11:50 AM Amit Khandekar  
> wrote:
> > > This definitely needs to be expanded, and follow the message style
> > > guideline.
> >
> > This message , with the v8 patch, looks like this :
> > ereport(LOG,
> > (errmsg("Dropping conflicting slot %s", NameStr(slotname)),
> > errdetail("%s", reason)));
> > where reason is a char string.
>
> That does not follow the message style guideline.
>
> https://www.postgresql.org/docs/12/error-style-guide.html
>
> From the grammar and punctuation section:
>
> "Primary error messages: Do not capitalize the first letter. Do not
> end a message with a period. Do not even think about ending a message
> with an exclamation point.
>
> Detail and hint messages: Use complete sentences, and end each with a
> period. Capitalize the first word of sentences. Put two spaces after
> the period if another sentence follows (for English text; might be
> inappropriate in other languages)."

Thanks. In the updated patch, I changed the message style. Now it looks
like this:

primary message: dropped conflicting slot slot_name
error detail: Slot conflicted with xid horizon which was being
increased to 9012 (slot xmin: 1234, slot catalog_xmin: 5678).
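
In code form, the call is roughly as follows (a sketch only: the variable
names are carried over from the earlier v8 snippet and the values shown in
the message above are just examples):

ereport(LOG,
		(errmsg("dropped conflicting slot %s", NameStr(slotname)),
		 errdetail("Slot conflicted with xid horizon which was being increased "
				   "to %u (slot xmin: %u, slot catalog_xmin: %u).",
				   xid_horizon, slot_xmin, slot_catalog_xmin)));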



Also, in the updated patch (v11), I have added some scenarios that
verify that the slot is dropped when either the master's wal_level is
insufficient or the slot is conflicting. I have also organized the test
file a bit.


-- 
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company


logical-decoding-on-standby_v11.patch
Description: Binary data