回复: 回复: [PATCH] BUG FIX: Core dump could happen when VACUUM FULL in standalone mode

2020-11-30 Thread Yulin PEI
I think you are right after reading code in compute_parallel_vacuum_workers() :)


发件人: Tom Lane 
发送时间: 2020年12月1日 2:54
收件人: Yulin PEI 
抄送: Masahiko Sawada ; pgsql-hackers@lists.postgresql.org 

主题: Re: 回复: [PATCH] BUG FIX: Core dump could happen when VACUUM FULL in 
standalone mode

Yulin PEI  writes:
> Yes, I agree because (IsNormalProcessingMode() ) means that current process 
> is not in bootstrap mode and postmaster process will not build index.
> So my new modified patch is attached.

This is a good catch, but the proposed fix still seems pretty random
and unlike how it's done elsewhere.  It seems to me that since
index_build() is relying on plan_create_index_workers() to assess
parallel safety, that's where to check IsUnderPostmaster.  Moreover,
the existing code in compute_parallel_vacuum_workers (which gets
this right) associates the IsUnderPostmaster check with the initial
check on max_parallel_maintenance_workers.  So I think that the
right fix is to adopt the compute_parallel_vacuum_workers coding
in plan_create_index_workers, and thereby create a model for future
uses of max_parallel_maintenance_workers to follow.

regards, tom lane



[PATCH] BUG FIX: Core dump could happen when VACUUM FULL in standalone mode

2020-11-30 Thread Yulin PEI
Hi hackers,

I found that when running command vaccum full in stand-alone mode there will be 
a core dump.
The core stack looks like this:

backend> vacuum full
TRAP: FailedAssertion("IsUnderPostmaster", File: "dsm.c", Line: 439)
./postgres(ExceptionalCondition+0xac)[0x55c664c36913]
./postgres(dsm_create+0x3c)[0x55c664a79fee]
./postgres(GetSessionDsmHandle+0xdc)[0x55c6645f8296]
./postgres(InitializeParallelDSM+0xf9)[0x55c6646c59ca]
./postgres(+0x16bdef)[0x55c664692def]
./postgres(+0x16951e)[0x55c66469051e]
./postgres(btbuild+0xcc)[0x55c6646903ec]
./postgres(index_build+0x322)[0x55c664719749]
./postgres(reindex_index+0x2ee)[0x55c66471a765]
./postgres(reindex_relation+0x1e5)[0x55c66471acca]
./postgres(finish_heap_swap+0x118)[0x55c6647c8db1]
./postgres(+0x2a0848)[0x55c6647c7848]
./postgres(cluster_rel+0x34e)[0x55c6647c727f]
./postgres(+0x3414cf)[0x55c6648684cf]
./postgres(vacuum+0x4f3)[0x55c664866591]
./postgres(ExecVacuum+0x736)[0x55c66486609b]
./postgres(standard_ProcessUtility+0x840)[0x55c664ab74fc]
./postgres(ProcessUtility+0x131)[0x55c664ab6cb5]
./postgres(+0x58ea69)[0x55c664ab5a69]
./postgres(+0x58ec95)[0x55c664ab5c95]
./postgres(PortalRun+0x307)[0x55c664ab5184]
./postgres(+0x587ef6)[0x55c664aaeef6]
./postgres(PostgresMain+0x819)[0x55c664ab3271]
./postgres(main+0x2e1)[0x55c6648f9df5]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7f65376192e1]
./postgres(_start+0x2a)[0x55c6645df86a]
Aborted (core dumped)


I think that when there is a btree index in a table, vacuum full tries to 
rebuild the btree index in a parallel way.

This will launch several workers and each of then will try to apply a dynamic 
shared memory segment, which is not allowed in stand-alone mode.

I think it is better not to use btree index build in parallel in stand-alone 
mode. My patch is attached below.





Best Regards!
Yulin PEI












standalone_vacuum_full.patch
Description: standalone_vacuum_full.patch


回复: [PATCH] BUG FIX: Core dump could happen when VACUUM FULL in standalone mode

2020-11-30 Thread Yulin PEI
Yes, I agree because (IsNormalProcessingMode() ) means that current process is 
not in bootstrap mode and postmaster process will not build index.
So my new modified patch is attached.



发件人: Masahiko Sawada 
发送时间: 2020年11月30日 17:27
收件人: Yulin PEI 
抄送: pgsql-hackers@lists.postgresql.org 
主题: Re: [PATCH] BUG FIX: Core dump could happen when VACUUM FULL in standalone 
mode



On Mon, Nov 30, 2020 at 5:45 PM Yulin PEI 
mailto:ype...@connect.ust.hk>> wrote:
Hi hackers,

I found that when running command vaccum full in stand-alone mode there will be 
a core dump.
The core stack looks like this:

backend> vacuum full
TRAP: FailedAssertion("IsUnderPostmaster", File: "dsm.c", Line: 439)
./postgres(ExceptionalCondition+0xac)[0x55c664c36913]
./postgres(dsm_create+0x3c)[0x55c664a79fee]
./postgres(GetSessionDsmHandle+0xdc)[0x55c6645f8296]
./postgres(InitializeParallelDSM+0xf9)[0x55c6646c59ca]
./postgres(+0x16bdef)[0x55c664692def]
./postgres(+0x16951e)[0x55c66469051e]
./postgres(btbuild+0xcc)[0x55c6646903ec]
./postgres(index_build+0x322)[0x55c664719749]
./postgres(reindex_index+0x2ee)[0x55c66471a765]
./postgres(reindex_relation+0x1e5)[0x55c66471acca]
./postgres(finish_heap_swap+0x118)[0x55c6647c8db1]
./postgres(+0x2a0848)[0x55c6647c7848]
./postgres(cluster_rel+0x34e)[0x55c6647c727f]
./postgres(+0x3414cf)[0x55c6648684cf]
./postgres(vacuum+0x4f3)[0x55c664866591]
./postgres(ExecVacuum+0x736)[0x55c66486609b]
./postgres(standard_ProcessUtility+0x840)[0x55c664ab74fc]
./postgres(ProcessUtility+0x131)[0x55c664ab6cb5]
./postgres(+0x58ea69)[0x55c664ab5a69]
./postgres(+0x58ec95)[0x55c664ab5c95]
./postgres(PortalRun+0x307)[0x55c664ab5184]
./postgres(+0x587ef6)[0x55c664aaeef6]
./postgres(PostgresMain+0x819)[0x55c664ab3271]
./postgres(main+0x2e1)[0x55c6648f9df5]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7f65376192e1]
./postgres(_start+0x2a)[0x55c6645df86a]
Aborted (core dumped)


I think that when there is a btree index in a table, vacuum full tries to 
rebuild the btree index in a parallel way.

This will launch several workers and each of then will try to apply a dynamic 
shared memory segment, which is not allowed in stand-alone mode.

I think it is better not to use btree index build in parallel in stand-alone 
mode. My patch is attached below.



Good catch. This is a bug in parallel index(btree) creation. I could reproduce 
this assertion failure with HEAD even by using CREATE INDEX command.

- indexRelation->rd_rel->relam == BTREE_AM_OID)
+ indexRelation->rd_rel->relam == BTREE_AM_OID && IsPostmasterEnvironment && 
!IsBackendStandAlone())

+#define IsBackendStandAlone() (!IsBootstrapProcessingMode() && 
!IsPostmasterEnvironment)
 /*
  * Auxiliary-process type identifiers.  These used to be in bootstrap.h
  * but it seems saner to have them here, with the ProcessingMode stuff.

I think we can use IsUnderPostmaster instead. If it's false we should not 
enable parallel index creation.

Regards,

--
Masahiko Sawada
EnterpriseDB:  https://www.enterprisedb.com/


standalone_vacuum_full.patch
Description: standalone_vacuum_full.patch


Core dump happens when execute sql CREATE VIEW v1(c1) AS (SELECT ('4' COLLATE "C")::INT FROM generate_series(1, 10));

2021-04-12 Thread Yulin PEI
HI hackers,
I found it could cause a crash when executing sql statement: `CREATE VIEW 
v1(c1) AS (SELECT ('4' COLLATE "C")::INT FROM generate_series(1, 10)); ` in 
postgres 13.2 release.

The crash happens at view.c:89 and I did some analysis:

```

ColumnDef  *def = makeColumnDef(tle->resname,
exprType((Node *) tle->expr),
exprTypmod((Node *) tle->expr),
exprCollation((Node *) tle->expr));



/*
 * It's possible that the column is of a collatable type but the
 * collation could not be resolved, so double-check.
 */

// Here is the analysis:

//example :  ('4' COLLATE "C")::INT

//exprCollation((Node *) tle->expr) is the oid of collate "COLLATE 'C'" so 
def->collOid is valid
//exprType((Node *) tle->expr)) is 23 which is the oid of type int4.
//We know that int4 is not collatable by calling type_is_collatable()

if (type_is_collatable(exprType((Node *) tle->expr)))
{
   if (!OidIsValid(def->collOid))
  ereport(ERROR,
(errcode(ERRCODE_INDETERMINATE_COLLATION),
 errmsg("could not determine which collation to use for view column 
\"%s\"",
  def->colname),
 errhint("Use the COLLATE clause to set the collation 
explicitly.")));
}
else

   // So we are here! int is not collatable and def->collOid is valid.
   Assert(!OidIsValid(def->collOid));

```

I am not sure whether to fix this bug in function DefineVirtualRelation or to 
fix this bug in parse tree and analyze procedure, so maybe we can discuss.




Best Regard!
Yulin PEI




回复: Core dump happens when execute sql CREATE VIEW v1(c1) AS (SELECT ('4' COLLATE "C")::INT FROM generate_series(1, 10));

2021-04-13 Thread Yulin PEI
After reading the code and the patch, I think the patch is good. If the 
type is non-collatable, we do not add a CollateExpr node as a 'parent' node to 
the coerced node.


发件人: Tom Lane 
发送时间: 2021年4月13日 0:59
收件人: Yulin PEI 
抄送: pgsql-hackers@lists.postgresql.org 
主题: Re: Core dump happens when execute sql CREATE VIEW v1(c1) AS (SELECT ('4' 
COLLATE "C")::INT FROM generate_series(1, 10));

Yulin PEI  writes:
> I found it could cause a crash when executing sql statement: `CREATE VIEW 
> v1(c1) AS (SELECT ('4' COLLATE "C")::INT FROM generate_series(1, 10)); ` in 
> postgres 13.2 release.

Nice catch.  I don't think the code in DefineVirtualRelation is wrong:
exprCollation shouldn't report any collation for an expression of a
non-collatable type.  Rather the problem is with an old kluge in
coerce_type(), which will push a type coercion underneath a CollateExpr
... without any mind for the possibility that the coercion result isn't
collatable.  So the right fix is more or less the attached.

regards, tom lane



回复: Core dump happens when execute sql CREATE VIEW v1(c1) AS (SELECT ('4' COLLATE "C")::INT FROM generate_series(1, 10));

2021-04-13 Thread Yulin PEI
I think it is better to add this test case to regress.

发件人: Tom Lane 
发送时间: 2021年4月13日 0:59
收件人: Yulin PEI 
抄送: pgsql-hackers@lists.postgresql.org 
主题: Re: Core dump happens when execute sql CREATE VIEW v1(c1) AS (SELECT ('4' 
COLLATE "C")::INT FROM generate_series(1, 10));

Yulin PEI  writes:
> I found it could cause a crash when executing sql statement: `CREATE VIEW 
> v1(c1) AS (SELECT ('4' COLLATE "C")::INT FROM generate_series(1, 10)); ` in 
> postgres 13.2 release.

Nice catch.  I don't think the code in DefineVirtualRelation is wrong:
exprCollation shouldn't report any collation for an expression of a
non-collatable type.  Rather the problem is with an old kluge in
coerce_type(), which will push a type coercion underneath a CollateExpr
... without any mind for the possibility that the coercion result isn't
collatable.  So the right fix is more or less the attached.

regards, tom lane



0001-add-regress.patch
Description: 0001-add-regress.patch


回复: Core dump happens when execute sql CREATE VIEW v1(c1) AS (SELECT ('4' COLLATE "C")::INT FROM generate_series(1, 10));

2021-04-18 Thread Yulin PEI
After several tests, I found that this patch do not fix the bug well.

I think we should use the same logic to treat parent CollateExpr and child 
CollateExpr. In your patch, if the parent node is CollateExpr and the target 
type is non-collatable,  we coerce CollateExpr->arg. If the child node is 
CollateExpr and the target type is non-collatable, we just skip.
Some types can be casted to another type even if  type_is_collatable 
returns false. Like bytea to int (It depends on the content of the string). If 
we simply skip,  bytea will never be casted to int even if the content is all 
digits.

So the attachment is my patch and it works well as far as I tested.




发件人: Tom Lane 
发送时间: 2021年4月13日 0:59
收件人: Yulin PEI 
抄送: pgsql-hackers@lists.postgresql.org 
主题: Re: Core dump happens when execute sql CREATE VIEW v1(c1) AS (SELECT ('4' 
COLLATE "C")::INT FROM generate_series(1, 10));

Yulin PEI  writes:
> I found it could cause a crash when executing sql statement: `CREATE VIEW 
> v1(c1) AS (SELECT ('4' COLLATE "C")::INT FROM generate_series(1, 10)); ` in 
> postgres 13.2 release.

Nice catch.  I don't think the code in DefineVirtualRelation is wrong:
exprCollation shouldn't report any collation for an expression of a
non-collatable type.  Rather the problem is with an old kluge in
coerce_type(), which will push a type coercion underneath a CollateExpr
... without any mind for the possibility that the coercion result isn't
collatable.  So the right fix is more or less the attached.

regards, tom lane



fix-collation-coercion.patch
Description: fix-collation-coercion.patch


回覆: 回复: Core dump happens when execute sql CREATE VIEW v1(c1) AS (SELECT ('4' COLLATE "C")::INT FROM generate_series(1, 10));

2021-04-19 Thread Yulin PEI
Consider the SQL statement 'SELECT (('1' COLLATE "C") ||(B'1'));' . 
Intuitively, the result will be '11' and the result is '11' in pg 13.2 release 
as well.

The function stack is make_fn_arguments -> coerce_type, which means that the 
param "Node *node" of function coerce_type could be a CollateExpr Node.
Let's look at your patch:

```
// node is ('1' COLLATE "C")
// targetType is varbit and it is non-collatable
if (IsA(node, CollateExpr) && type_is_collatable(targetTypeId))
{

// we will not reach here.

CollateExpr *coll = (CollateExpr *) node;
CollateExpr *newcoll = makeNode(CollateExpr);



// An error will be generated. "failed to find conversion function"

}

```

So I suggest:

```
// node is ('1' COLLATE "C")

if (IsA(node, CollateExpr))
   {
  CollateExpr *coll = (CollateExpr *) node;
  CollateExpr *newcoll = makeNode(CollateExpr);


  //targetType is varbit and it is non-collatable

  if (!type_is_collatable(targetTypeId)) {

 // try to convert '1'(string) to varbit

 // We do not make a new CollateExpr here, but don't forget to coerce 
coll->arg.

 return coerce_type(pstate, (Node *) coll->arg,
inputTypeId, targetTypeId, targetTypeMod,
        ccontext, cformat, location);
  }
 ...
   }

```







寄件者: Tom Lane 
寄件日期: 2021年4月19日 1:46
收件者: Yulin PEI 
副本: pgsql-hackers@lists.postgresql.org 
主旨: Re: 回复: Core dump happens when execute sql CREATE VIEW v1(c1) AS (SELECT 
('4' COLLATE "C")::INT FROM generate_series(1, 10));

Yulin PEI  writes:
> After several tests, I found that this patch do not fix the bug well.

What do you think is wrong with it?

> So the attachment is my patch and it works well as far as I tested.

This seems equivalent to the already-committed patch [1] except that
it wastes a makeNode call in the coerce-to-uncollatable-type case.

regards, tom lane

[1] 
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=c402b02b9fb53aee2a26876de90a8f95f9a9be92