Re: Optimize LISTEN/NOTIFY

2025-10-10 Thread Joel Jacobson
On Fri, Oct 10, 2025, at 20:46, Joel Jacobson wrote:
> On Wed, Oct 8, 2025, at 20:46, Tom Lane wrote:
>> "Joel Jacobson"  writes:
>>> On Tue, Oct 7, 2025, at 22:15, Tom Lane wrote:
 5. ChannelHashAddListener: "already registered" case is not reached,
 which surprises me a bit, and neither is the "grow the array" stanza.
>>
>>> I've added a test for the "grow the array" stanza.
>>
>>> The "already registered" case seems impossible to reach, since the
>>> caller, Exec_ListenCommit, returns early if IsListeningOn.
>>
>> Maybe we should remove the check for "already registered" then,
>> or reduce it to an Assert?  Seems pointless to check twice.
>>
>> Or thinking a little bigger: why are we maintaining the set of
>> channels-listened-to both as a list and a hash?  Could we remove
>> the list form?
>
> Yes, it was indeed possible to remove the list form.
>
> Some functions got a bit more complex, but I think it's worth it since a
> single source of truth seems like an important design goal.
>
> This also made LISTEN faster when a backend is listening on plenty of
> channels, since we can now lookup the channel in the hash, instead of
> having to go through the list as before. The additional linear scan of
> the listenersArray didn't add any noticeable extra cost even with
> thousands of listening backends for the channel.
>
> I also tried to keep listenersArray sorted and binary-search it, but
> even with thousands of listening backends, I couldn't measure any
> overall latency difference of LISTEN, so I kept the linear scan to keep
> it simple.
>
> In Exec_ListenCommit, I've now inlined code that is similar to
> IsListeningOn. I didn't want to use IsListeningOn since it felt wasteful
> having to do dshash_find, when we instead can just use
> dshash_find_or_insert, to handle both cases.
>
> I also added a static int numChannelsListeningOn variable, to avoid the
> possibly expensive operation of going through the entire hash, to be
> able to check `numChannelsListeningOn == 0` instead of the now removed
> `listenChannels == NIL`. It's of course critical to keep
> numChannelsListeningOn in sync, but I think it should be safe? Would of
> course be better to avoid this variable. Maybe the extra cycles that
> would cost would be worth it?

In addition to previously suggested optimization, there is another major
one that seems doable, that would mean a great improvement for workload
having large traffic differences between channels, i.e. some low traffic
and some high traffic.

I'm not entirely sure this approach is correct though, I've might
misunderstood the guarantees of the heavyweight lock. My assumption is
based on that there can only be one backend that is currently running
the code in PreCommit_Notify after having aquired the heavyweight lock.
If this is not true, then it doesn't work. What made me worried is the
exclusive lock we also take inside the same function, I don't see the
point of it since we're already holding the heavyweight lock, but maybe
this is just to "allows deadlocks to be detected" like the comment says?

---

Patches:

* 0001-optimize_listen_notify-v14.patch:
Just adds additional test coverage of async.c

* 0002-optimize_listen_notify-v14.patch:
Adds the shared channel hash.
Unchanged since 0002-optimize_listen_notify-v13.patch.

* 0003-optimize_listen_notify-v14.patch:

Optimize LISTEN/NOTIFY by advancing idle backends directly

Building on the previous channel-specific listener tracking
optimization, this patch further reduces context switching by detecting
idle listening backends that don't listen to any of the channels being
notified and advancing their queue positions directly without waking
them up.

When a backend commits notifications, it now saves both the queue head
position before and after writing. In SignalBackends(), backends that
are at the old queue head and weren't marked for wakeup (meaning they
don't listen to any of the notified channels) are advanced directly to
the new queue head. This eliminates unnecessary wakeups for these
backends, which would otherwise wake up, scan through all the
notifications, skip each one, and advance to the same position anyway.

The implementation carefully handles the race condition where other
backends may write notifications after the heavyweight lock is released
but before SignalBackends() is called. By saving queueHeadAfterWrite
immediately after writing (before releasing the lock), we ensure
backends are only advanced over the exact notifications we wrote, not
notifications from other concurrent backends.

---

Benchmark:

% ./pgbench_patched --listen-notify-benchmark --notify-round-trips=1 
--notify-idle-step=10
pgbench_patched: starting LISTEN/NOTIFY round-trip benchmark
pgbench_patched: round-trips per iteration: 1
pgbench_patched: idle listeners added per iteration: 10

master:

idle_listeners  round_trips_per_sec max_latency_usec
 0  33592.9 2278
10

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

2025-10-10 Thread Tatsuo Ishii
>> Also the error is certainly user-facing,
>> so using elog() was quite inappropriate.  It should be ereport with an
>> errcode of (probably) ERRCODE_FEATURE_NOT_SUPPORTED.  Rolling your
>> own implementation of get_func_name() wasn't great either.
> 
> I overlooked the elog() call and "own implementation of
> get_func_name()". Will fix.

Attached is a trivial patch to fix that. I am going to push it if
there's no objection.

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp


v1-0001-Use-ereport-rather-than-elog-in-WinCheckAndInitia.patch
Description: Binary data


Re: speedup COPY TO for partitioned table.

2025-10-10 Thread Álvaro Herrera
On 2025-Oct-10, jian he wrote:

> On Fri, Oct 10, 2025 at 6:03 AM Masahiko Sawada  wrote:

> > Yes, but I think it's more consistent given that we use the
> > parentheses in all other places in copyto.c.
> 
> If you look at tablecmds.c, like ATExecSetNotNull, there are
> parentheses and no parentheses cases.
> Overall, I think less parentheses improves readability and makes the
> code more future-proof.

I strive to remove those extra parentheses when I edit some part of the
code (though I may forget at times), but leave them alone from other
places that I'm not editing.  I don't add them in new code.  Most
likely, this is why ATExecSetNotNull has some cases with them and other
cases without.

Given sufficient time, the Postgres of Theseus would eventually have
zero of those extra parens.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Niemand ist mehr Sklave, als der sich für frei hält, ohne es zu sein."
   Nadie está tan esclavizado como el que se cree libre no siéndolo
   (Johann Wolfgang von Goethe)




Re: [PATCH TEST] Fix logical replication setup in subscription test `t/009_matviews.pl`

2025-10-10 Thread Ashutosh Bapat
On Fri, Oct 10, 2025 at 1:20 PM Hayato Kuroda (Fujitsu)
 wrote:
>
> Dear Michael,
>
> > On Thu, Oct 09, 2025 at 03:37:25PM +0300, Грем Снорт wrote:
> > > I've found a simple problem in one of subscription tests
> > > (`src/test/subscription/t/009_matviews.pl`).
> >
> > (Added a couple of folks in CC.)
>
> Thanks for adding.
>
> > The backend change coupled with this test comes from bc1adc651b8e,
> > first introduced in v11.  At the top of REL_11_STABLE, which is the
> > first branch where the test has been introduced, if I update
> > pgoutput.c and remove the is_publishable_relation() call in
> > pgoutput_change() to undo the fix, then the test is able to hang as it
> > is designed.
> >
> > Now, if I do the same thing on HEAD, removing the check, then the test
> > passes!  Something else is going on here: the test is not checking
> > what it has been written for.  Applying your patch does not change
> > this state.
>
> I found that b4da732 [1] changes the behavior. It modifies how we create the
> materialized view.
>
> Since b4da732, CREATE MATERIALIZED VIEW ... AS ... behaves similarly with 
> REFRESH
> MATERIALIZED VIEW command.
> Refresh command initially creates a transient table, insert tuples then 
> swapping
> the relfilenumber between the mateview and transient one.
> Thus, from the WAL's perspective, INSERT happens to the temporary one, so it 
> won't
> be replicated to the subscriber. See comments atop RefreshMatViewByOid().
>
> So...I think it is not caused by changes for logical replication.

Thanks for the investigation. Makes sense.

I agree with Michael that the test could be written better, rather
than testing for timeout. AFAIU, the test is testing that the changes
to materialized view are not sent downstream; are properly filtered
out. I think the test should create MV and make sure that it doesn't
get populated through the logical replication but when refreshed
explicitly on the subscriber. While checking for that we should
perform usual wait for LSN. However, after your explanation above, it
seems that MVs should be considered similar to unlogged tables or
temporary tables after that change. So we could just merge this test
into subscription/100_bugs.pl.

-- 
Best Wishes,
Ashutosh Bapat




Re: [PATCH] jit: fix build with LLVM-21

2025-10-10 Thread Holger Hoffstätte

On 2025-09-12 16:22, Holger Hoffstätte wrote:

On 2025-09-12 14:47, Holger Hoffstätte wrote:


I temporarily removed the __arch64__ guard and can reproduce the above error,
using either gcc-15 or clang-21. It seems this llvm monkey patch backport thing
is affected by the following commit in llvm-21:
https://github.com/llvm/llvm-project/commit/cd585864c0bbbd74ed2a2b1ccc191eed4d1c8f90

Trying to figure out how to adapt the code..


Try the attached patch on your homebrew setup.

This compiles and passes "make check", though I do not think it runs any jit 
tests.
So intead I dropped this into my 17.6, disabled the __arch64__ guard to make 
sure
I get the backport class, rebuilt and ran the jit on/off example from the docs.
This showed the expected performance difference with jit=on vs. off.


Any news on the aarch64 front? The patch I sent in
https://www.postgresql.org/message-id/44718204-61a5-91b1-5574-154ff7a019f6%40applied-asynchrony.com

seemed to work for Anthonin Bonnefoy:
https://www.postgresql.org/message-id/CAO6_XqqbrO3O48f%2BXM0fp6P8ha61sZqYPuri%2B%2BP0%2BKMVM0xcdw%40mail.gmail.com

Just trying to make sure this doesn't fall into the couch crack.

thanks,
Holger




Re: [PATCH] Remove make_temptable_name_n()

2025-10-10 Thread Nathan Bossart
On Fri, Oct 10, 2025 at 06:22:46PM +0300, Aleksander Alekseev wrote:
> The proposed patch removes make_temptable_name_n() in matview.c. This
> function is used only once and can be replaced with psprintf().

This dates back to commit cc1965a, and I see some discussion about this
function in the corresponding thread [0].  One thing I don't like about
the patch is that we lose the comment about relying on the name to never be
double-quoted.  IMHO that's worth retaining in some form.

[0] 
https://postgr.es/m/CAP7Qgm%3Djb3xkzQXfGtX9STx8fzd8EDDQ-oJ8ekcyeOud%2ByLCoA%40mail.gmail.com

-- 
nathan




Re: Preserve index stats during ALTER TABLE ... TYPE ...

2025-10-10 Thread Bertrand Drouvot
Hi,

On Fri, Oct 10, 2025 at 07:37:59AM -0500, Sami Imseih wrote:
> Hi,
> 
> Thanks for raising this issue and for the patch!

Thanks for looking at it!

> > As you can see, the index stats (linked to the column that has been 
> > altered) are
> > not preserved. I think that they should be preserved (like a REINDEX does).
> 
> I agree.
> 
> > - We can not use pgstat_copy_relation_stats() because the old index is 
> > dropped
> > before the new one is created, so the patch adds a new PgStat_StatTabEntry
> > pointer in AlteredTableInfo.
> 
> I wonder if it will be good to have a pgstat_save_relation_stats() routine 
> that
> gets called in all code paths that will need to restore the stats. This way
> pgstat_copy_relation_stats can also be used. This will be cleaner than code
> paths that need this having to deal with pgstat_fetch_stat_tabentry?

pgstat_copy_relation_stats() needs 2 Relation, I'm not sure how a new
pgstat_save_relation_stats() could help using pgstat_copy_relation_stats()
here.

> The current patch does not work for partitioned tables because
> the "oldId" is that of the parent index which has no stats. So we
> are just copying zeros to the new entry.

Doh, of course. I've spend some time on it and now have something working. 

The idea is to:

- store a List of savedIndexStats. The savedIndexStats struct would get the
PgStat_StatTabEntry + all the information needed to be able to use 
CompareIndexInfo() when restoring the stats (so that we can restore each 
PgStat_StatTabEntry
in the right index).

- Iterate on all the indexes and populate this new list in AlteredTableInfo in
ATPostAlterTypeParse().

- Iterate on all the indexes and use the list above and CompareIndexInfo() to
restore the stats in ATExecAddIndex().

Will polish and share next week.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: [PATCH TEST] Fix logical replication setup in subscription test `t/009_matviews.pl`

2025-10-10 Thread Amit Kapila
On Thu, Oct 9, 2025 at 6:07 PM Грем Снорт  wrote:
>
> Hello!
> I've found a simple problem in one of subscription tests 
> (`src/test/subscription/t/009_matviews.pl`).
> When running this test, it completes successfully (All tests successful. 
> Result: PASS) but there is an error in subscriber's logs:
> ```
> logical replication apply worker[1865731] ERROR:  logical replication target 
> relation "public.test1" does not exist
> ```
>

This is a harmless ERROR as in the next try, the apply for insert will
be successful. But for neatness, we can use your change.

-- 
With Regards,
Amit Kapila.




Re: speedup COPY TO for partitioned table.

2025-10-10 Thread jian he
On Fri, Oct 10, 2025 at 6:03 AM Masahiko Sawada  wrote:
>
> > > ---
> > > +   relation_name = get_rel_name(childreloid);
> > > +   ereport(ERROR,
> > > +   errcode(ERRCODE_WRONG_OBJECT_TYPE),
> > > +   errmsg("cannot copy from foreign table
> > > \"%s\"", relation_name),
> > > +   errdetail("Partition \"%s\" is a foreign
> > > table in the partitioned table \"%s\"",
> > > + relation_name,
> > > RelationGetRelationName(rel)),
> > > +   errhint("Try the COPY (SELECT ...) TO 
> > > variant."));
> > >
> > > I think we don't need "the" in the error message.
> > >
> > > It's conventional to put all err*() macros in parentheses (i.e.,
> > > "(errcode(), ...)", it's technically omittable though.
> > >
> >
> > https://www.postgresql.org/docs/current/error-message-reporting.html
> > QUOTE:
> > >
> > The extra parentheses were required before PostgreSQL version 12, but
> > are now optional.
> > Here is a more complex example:
> > .
> > >
> >
> > related commit:
> > https://git.postgresql.org/cgit/postgresql.git/commit/?id=e3a87b4991cc2d00b7a3082abb54c5f12baedfd1
> > Less parenthesis is generally more readable, I think.
>
> Yes, but I think it's more consistent given that we use the
> parentheses in all other places in copyto.c.
>

If you look at tablecmds.c, like ATExecSetNotNull, there are
parentheses and no parentheses cases.
Overall, I think less parentheses improves readability and makes the
code more future-proof.


> >
> > > How about doing "slot = execute_attr_map_slot(map, slot, root_slot);"
> > > instead? (i.e., no need to have 'copyslot')
> > >
> >
> > I tried but it seems not possible.
> > table_scan_getnextslot function require certain type of "slot", if we do
> > "slot = execute_attr_map_slot(map, slot, root_slot);"
> > then pointer "slot" type becomes virtual slot, then
> > it will fail on second time call table_scan_getnextslot
>
> Right. Let's keep as it is.
>

> I've attached a patch for cosmetic changes including comment updates,
> indent fixes by pgindent, and renaming variable names. Some fixes are
> just my taste, so please check the changes.
>
thanks!
I have applied most of it. expect points I mentioned in this email.

> Also I have a few comments on new tests:
>
> +-- Tests for COPY TO with partitioned tables.
> +CREATE TABLE pp (id int,val int) PARTITION BY RANGE (id);
> +CREATE TABLE pp_1 (val int, id int) PARTITION BY RANGE (id);
> +CREATE TABLE pp_2 (val int, id int) PARTITION BY RANGE (id);
> +ALTER TABLE pp ATTACH PARTITION pp_1 FOR VALUES FROM (1) TO (5);
> +ALTER TABLE pp ATTACH PARTITION pp_2 FOR VALUES FROM (5) TO (10);
> +
> +CREATE TABLE pp_15 PARTITION OF pp_1 FOR VALUES FROM (1) TO (5);
> +CREATE TABLE pp_510 PARTITION OF pp_2 FOR VALUES FROM (5) TO (10);
> +
> +INSERT INTO pp SELECT g, 10 + g FROM generate_series(1,6) g;
> +
>
> I think it's better to have both cases: partitions' rowtype match the
> root's rowtype and partition's rowtype doesn't match the root's
> rowtype.
>
sure.

> ---
> +-- Test COPY TO with a foreign table or when the foreign table is a partition
> +COPY async_p3 TO stdout; --error
> +ERROR:  cannot copy from foreign table "async_p3"
> +HINT:  Try the COPY (SELECT ...) TO variant.
>
> async_p3 is a foreign table so it seems not related to this patch.
>
I replied in
https://postgr.es/m/CACJufxGkkMtRUJEbLczRnWp7x2YWqu4r1gEJEv9Po1UPxS6kGQ%40mail.gmail.com
I kind of doubt anyone would submit a patch just to rewrite a coverage test for
the sake of coverage itself.  While we're here, adding nearby coverage tests
should be fine?

i just found out I ignored the case when partitioned tables have RLS.
when exporting a partitioned table,
find_all_inheritors will sort the returned partition by oid.
in DoCopy, we can do the same:
make a SortBy node for SelectStmt->sortClause also mark the
RangeVar->inh as true.
OR
ereport(ERRCODE_FEATURE_NOT_SUPPORTED...) for partitioned tables with RLS.

please see the change I made in DoCopy.
From 63cd35bf535cbc06b09d07171cecf57ed21e89cc Mon Sep 17 00:00:00 2001
From: jian he 
Date: Fri, 10 Oct 2025 14:36:15 +0800
Subject: [PATCH v17 1/1] support COPY partitioned_table TO
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This is for implementation of ``COPY (partitioned_table) TO``.  it will be
faster than ``COPY (select * from partitioned_table) TO``.
If the destination table is a partitioned table, COPY table TO copies the same
rows as SELECT * FROM table.

reivewed by: vignesh C 
reivewed by: David Rowley 
reivewed by: Melih Mutlu 
reivewed by: Kirill Reshke 
reivewed by: Atsushi Torikoshi 
reivewed by: Álvaro Herrera 
reivewed by: Masahiko Sawada 
reivewed by: Chao Li 

discussion: https://postgr.es/m/CACJufxEZt+G19Ors3bQUq-42-61__C=y5k2wk=sHEFRusu7=i...@mail.gmail.com
commitfest entry: https://commitfest.postgr

Re: [PATCH] ecpg: check return value of replace_variables()

2025-10-10 Thread Daniel Gustafsson
> On 7 Oct 2025, at 16:04, Aleksander Alekseev  wrote:

> Currently in src/interfaces/ecpg/ecpglib/prepare.c we don't check the
> return value of replace_variables(). The proposed patch fixes this.
> The rest of the code looks OK to me.

Seems reasonable on a first glance, I'll pick it up.

--
Daniel Gustafsson





Re: URLs in rbtree.c are broken

2025-10-10 Thread Chao Li
On Fri, Oct 10, 2025 at 3:34 PM Daniel Gustafsson  wrote:

> > On 10 Oct 2025, at 09:00, Chao Li  wrote:
>
> > For the first one, by search for “Thomas Niemann’s Sorting and Searching
> Algorithms: a Cookbook”, I got
> https://www.epaperpress.com/sortsearch/rbt.html, but I am not sure it
> provides the exactly same content as the old URL.
>
> The entire algolist.manual.ru site is not available in DNS, but according
> to
> archive.org it was responsinding quite recently.  Perhaps we should give
> that a
> little time before we do anything?
>
> Looks like the last accessible day was late of Jan this year. Sure, we can
wait for its back.


> > For the second one, I cannot find a replacement.
>
> This link has been dead for over a decade by the looks of it, and noone has
> complained.  However, since this refers to license terms it would be best
> replace with an archive.org link here:
>
>
> https://web.archive.org/web/20131202103513/http://www.cs.auckland.ac.nz/software/AlgAnim/niemann/s_man.htm
>
> Make sense. Attached is a small patch for the URL replacement.

Best regards,
Chao Li (Evan)
-
HighGo Software Co., Ltd.
https://www.highgo.com/


v1-0001-Fixed-a-broken-URL-in-rbtree.c.patch
Description: Binary data


Re: RFC: extensible planner state

2025-10-10 Thread Robert Haas
On Fri, Oct 10, 2025 at 11:58 AM Tom Lane  wrote:
> Should the CF entry be closed out now?
>
> https://commitfest.postgresql.org/patch/5994/

Yes, done now.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: new environment variable INITDB_LOCALE_PROVIDER

2025-10-10 Thread Jeff Davis
On Sat, 2025-10-11 at 08:30 +0800, Chao Li wrote:
> * If we make that fail, I don’t think that would break existing
> scripts. Because the default provider is libc and you are introducing
> a new environment variable to set locale provider, thus a plain
> initdb will not use builtin provider. Maybe provider can come from
> PG_TEST_INITDB_EXTRA_OPTS, I'm ok for test environment to only only
> issue warnings.

I would like it to be possible to change the initdb default in the
future to "builtin". See:

https://www.postgresql.org/message-id/[email protected]

in that case, initdb should be able to succeed without other options.

> * I am thinking loudly. Builtin provider is more performant but with
> certain limitations. Some production users may want to try builtin
> provider for better performance but not being aware of the
> limitation. Their environment contains the actual LC_CTYPE/LC_COLLATE
> they want to use, and they set the new environment variable with
> “builtin” for provider. In this case, failing “initdb” would make the
> user clearly realize the limitation of builtin provider. Otherwise,
> if the user also ignores the warning messages, then the database
> would be created with unexpected ctype, which would lead to loss
> (time, data, etc.)

What limitation and/or loss are you concerned about?

Unless I'm mistaken, LC_CTYPE has very little practical effect when the
provider is builtin and the encoding is UTF-8.

The main effect that I'm aware of is that system errors from the OS
rely on LC_CTYPE for translation. Ordinary Postgres messages don't need
LC_CTYPE, so most of NLS still works even with LC_CTYPE=C; it's just
strerror() that depends on LC_CTYPE for the encoding.

LC_CTYPE also affects full text search parsing, but I'm fixing that as
part of another patch to use the database locale instead.

I think contrib/fuzzystrmatch may be affected.

Callers of pg_strcasecmp() could be affected, but it's mostly used to
compare with ascii anyway.

If you are aware of other areas, please let me know.

Regards,
Jeff Davis





Re: create table like including storage parameter

2025-10-10 Thread jian he
On Wed, Oct 1, 2025 at 2:42 PM jian he  wrote:
>
> “including parameters” would add another keyword "PARAMETERS''.
> currently "INCLUDING STORAGE PARAMETER" no need to add a new keyword.
> If other people favor “including parameters”, then we can do it that
> way in the future.
>
> attached is just a simple rebase of v1.

hi.
https://api.cirrus-ci.com/v1/artifact/task/5202996067303424/testrun/build/testrun/regress/regress/regression.diffs

+ERROR:  relation "t" already exists
I made the same mistake, regress test using a simple name like "t" will conflict
with other regress tests.

please check the attached, mainly regress tests changes.
From 58e527e08e0673683295465e07e049d41b29d79e Mon Sep 17 00:00:00 2001
From: jian he 
Date: Sat, 11 Oct 2025 11:03:51 +0800
Subject: [PATCH v3 1/1] create table like incluing storage parameter

demo:
create table t(a text) with (fillfactor = 100, toast.vacuum_truncate=true);
create table t2(like t including storage parameter);

  Table "public.t2"
 Column | Type | Collation | Nullable | Default | Storage  | Compression | Stats target | Description
+--+---+--+-+--+-+--+-
 a  | text |   |  | | extended | |  |
Access method: heap
Options: fillfactor=100, toast.vacuum_truncate=true

discussion: https://postgr.es/m/CACJufxHr=nKEsS66M7rTHgB+mXdQ948=_ej1jztac7pt-ej...@mail.gmail.com
commitfest entry: https://commitfest.postgresql.org/patch/6088
---
 doc/src/sgml/ref/create_table.sgml| 19 +-
 src/backend/access/common/reloptions.c| 49 +++
 src/backend/parser/gram.y |  1 +
 src/backend/parser/parse_utilcmd.c| 63 +++
 src/include/access/reloptions.h   |  1 +
 src/include/nodes/parsenodes.h|  1 +
 .../regress/expected/create_table_like.out| 39 
 src/test/regress/sql/create_table_like.sql| 28 +
 8 files changed, 200 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index dc000e913c1..a389ce597a9 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -88,7 +88,7 @@ class="parameter">referential_action ] [ ON UPDATE and like_option is:
 
-{ INCLUDING | EXCLUDING } { COMMENTS | COMPRESSION | CONSTRAINTS | DEFAULTS | GENERATED | IDENTITY | INDEXES | STATISTICS | STORAGE | ALL }
+{ INCLUDING | EXCLUDING } { COMMENTS | COMPRESSION | CONSTRAINTS | DEFAULTS | GENERATED | IDENTITY | INDEXES | STATISTICS | STORAGE | STORAGE PARAMETER | ALL }
 
 and partition_bound_spec is:
 
@@ -776,6 +776,23 @@ WITH ( MODULUS numeric_literal, REM
 

 
+   
+INCLUDING STORAGE PARAMETER
+
+ 
+  STORAGE PARAMETER settings for the copied table will be copied.
+  For table storage parameters, see  below for more information.
+ 
+
+
+  Do not confuse this with INCLUDING STORAGE.
+  INCLUDING STORAGE is for copying table column storage property,
+  this is for copying table storage parameters.
+
+
+
+   
+

 INCLUDING ALL
 
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 35150bf237b..6032cd5813e 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -1383,6 +1383,55 @@ untransformRelOptions(Datum options)
 	return result;
 }
 
+/*
+ * Convert the reloptions from text-array format into a List of DefElem.  This
+ * is the reverse operation of transformRelOptions(). If any option includes a
+ * namespace qualification, create the DefElem in that namespace.
+ * If nameSpace is NULL, this behave the same as untransformRelOptions.
+ */
+List *
+untransformRelOptionsExtended(Datum options, char* nameSpace)
+{
+	List	   *result = NIL;
+	ArrayType  *array;
+	Datum	   *optiondatums;
+	int			noptions;
+	int			i;
+
+	/* Nothing to do if no options */
+	if (DatumGetPointer(options) == NULL)
+		return result;
+
+	array = DatumGetArrayTypeP(options);
+
+	deconstruct_array_builtin(array, TEXTOID, &optiondatums, NULL, &noptions);
+
+	for (i = 0; i < noptions; i++)
+	{
+		char	   *s;
+		char	   *p;
+		Node	   *val = NULL;
+
+		s = TextDatumGetCString(optiondatums[i]);
+		p = strchr(s, '=');
+		if (p)
+		{
+			*p++ = '\0';
+			val = (Node *) makeString(p);
+		}
+
+		if (nameSpace == NULL)
+			result = lappend(result, makeDefElem(s, val, -1));
+		else
+			result = lappend(result, makeDefElemExtended(nameSpace, s, val,
+		 DEFELEM_UNSPEC,
+		 -1));
+	}
+
+	return result;
+}
+
+
 /*
  * Extract and parse reloptions from a pg_class tuple.
  *
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 21caf2d43bf..ef8498ddb23 100644
--- a/src/backend/parser/gram