[Bug fix] ECPG: ECPG preprocessor outputs "unsupported feature will be passed to server" even if the command is supported

2019-10-02 Thread higuchi.dais...@fujitsu.com
Hi, 

I found wrong message output when ECPG preprocessed some SQL commands . 
For example, ECPG application has following command, ECPG outputs "unsupported 
feature will be passed to server". 
---
EXEC SQL CREATE SCHEMA IF NOT EXISTS hollywood;
---
I attached sample code to reproduce this problem. 

[Investigation]
I think parse.pl has some problem. The following filters do not seem to work 
properly: 
 src/interfaces/ecpg/preproc/parse.pl
if ($feature_not_supported == 1)
{

# we found an unsupported feature, but we have to
# filter out ExecuteStmt: CREATE OptTemp TABLE ...
# because the warning there is only valid in some 
situations
if ($flds->[0] ne 'create' || $flds->[2] ne 'table')
{
add_to_buffer('rules',
'mmerror(PARSE_ERROR, ET_WARNING, 
"unsupported feature will be passed to server");'   
 );
}
$feature_not_supported = 0;
}

[Solution]
I have two solutions for this. 
1) This problem occurs because filter does not work properly. 
   So, by setting the filter conditions properly, wrong warning should not be 
output. 
   However, we have to modify the conditions when the syntax in gram.y is 
changed. 

2) This problem occurs when a syntax is not supported under certain conditions 
like following: 
   So, when "if sentence" is found inside rule, the warning should not be 
output. 

CreateSchemaStmt:
 | CREATE SCHEMA IF_P NOT EXISTS OptSchemaName AUTHORIZATION RoleSpec 
OptSchemaEltList
 {
 CreateSchemaStmt *n = makeNode(CreateSchemaStmt);

 if ($9 != NIL)
 ereport(ERROR,
 
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
  errmsg("CREATE SCHEMA IF NOT 
EXISTS cannot include schema elements"),
  parser_errposition(@9)));

I attached a draft patch of the solution 2). 

Regards, 
Daisuke, Higuchi



ECPG_unsupported_messages_fix_v1.patch
Description: ECPG_unsupported_messages_fix_v1.patch


test.pgc
Description: test.pgc


[Bug fix]There is the case archive_timeout parameter is ignored after recovery works.

2020-06-28 Thread higuchi.dais...@fujitsu.com
Hi,

I found the bug about archive_timeout parameter.
There is the case archive_timeout parameter is ignored after recovery works.

[Problem]
When the value of archive_timeout is smaller than that of checkpoint_timeout 
and recovery works, archive_timeout is ignored in the first WAL archiving.
Once WAL is archived, the archive_timeout seems to be valid after that.

I attached the simple script for reproducing this problem on version 12. 
I also confirmed that PostgreSQL10, 11 and 12. I think other supported versions 
have this problem. 

[Investigation]
In the CheckpointerMain(), calculate the time (cur_timeout) to wait on 
WaitLatch.

-
now = (pg_time_t) time(NULL);
elapsed_secs = now - last_checkpoint_time;
if (elapsed_secs >= CheckPointTimeout)
continue;   /* no sleep for us ... */
cur_timeout = CheckPointTimeout - elapsed_secs;
if (XLogArchiveTimeout > 0 && !RecoveryInProgress())
{
elapsed_secs = now - last_xlog_switch_time;
if (elapsed_secs >= XLogArchiveTimeout)
continue;   /* no sleep for us ... */
cur_timeout = Min(cur_timeout, XLogArchiveTimeout - elapsed_secs);
}

(void) WaitLatch(MyLatch,
 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
 cur_timeout * 1000L /* convert to ms */ ,
 WAIT_EVENT_CHECKPOINTER_MAIN);
-

Currently, cur_timeout is set according to only checkpoint_timeout when it is 
during recovery.
Even during recovery, the cur_timeout should be calculated including 
archive_timeout as well as checkpoint_timeout, I think.
I attached the patch to solve this problem.

Regards,
Daisuke, Higuchi


archive_timeout_test.sh
Description: archive_timeout_test.sh


archive_timeout.patch
Description: archive_timeout.patch


RE: [Bug fix]There is the case archive_timeout parameter is ignored after recovery works.

2020-06-29 Thread higuchi.dais...@fujitsu.com
Thank you for comments.

>Unfortunately the diff command in your test script doesn't show me
>anything, but I can understand what you are thinking is a problem,
>maybe.

I'm sorry but I might have confused you... I explain how to use my test script.
I use diff command to check if the archiver has started. diff command does not 
output nothing to stdout.
So, please see the time displayed by the two date command by output of my test 
script.
I think you can confirm that the difference between the results of date 
commands is not the archive_timeout setting of 10 seconds.
If my test script runs for a few minutes, it means that my problem is 
reproduced.

>immediately independently from checkpointer. The parameter, as
>described in documentation, forces the server to switch to a new WAL
>segment file periodically so that it can be archived, that is, it
>works only on primary.

I confirm that this problem is occurred in non-replication environment.
The problem occurs when database try to archive WAL during or after archive 
recovery.
So your patch may be good to solve another problem, but unfortunately it didn't 
fix my problem.

Regards,
Daisuke, Higuchi





RE: [Bug fix]There is the case archive_timeout parameter is ignored after recovery works.

2020-06-29 Thread higuchi.dais...@fujitsu.com
Fujii-san, thank you for comments.

>The cause of this problem is that the checkpointer's sleep time is calculated
>from both checkpoint_timeout and archive_timeout during normal running,
>but calculated only from checkpoint_timeout during recovery. So Daisuke-san's
>patch tries to change that so that it's calculated from both of them even
>during recovery. No?

Yes, it's exactly so.

>last_xlog_switch_time is not updated during recovery. So "elapsed_secs" can be
>large and cur_timeout can be negative. Isn't this problematic?

Yes... My patch was missing this.
How about using the original archive_timeout value for calculating cur_timeout 
during recovery?

if (XLogArchiveTimeout > 0 && !RecoveryInProgress())
{
elapsed_secs = now - last_xlog_switch_time;
if (elapsed_secs >= XLogArchiveTimeout)
continue;   /* no sleep for us ... 
*/
cur_timeout = Min(cur_timeout, XLogArchiveTimeout - 
elapsed_secs);
}
+   else if (XLogArchiveTimeout > 0)
+   cur_timeout = Min(cur_timeout, XLogArchiveTimeout);

During recovery, accurate cur_timeout is not calculated because elapsed_secs is 
not used.
However, after recovery is complete, WAL archiving will start by the next 
archive_timeout is reached.
I felt it is enough to solve this problem.

>As another approach, what about waking the checkpointer up at the end of
>recovery like we already do for walsenders?

If the above solution is not good, I will consider this approach.

Regards,
Daisuke, Higuchi




RE: [Bug fix]There is the case archive_timeout parameter is ignored after recovery works.

2020-06-29 Thread higuchi.dais...@fujitsu.com
>> We don't want change checkpoint interval during recovery, that means
>> we cannot cnosider archive_timeout at the fist checkpointer after
>> recovery ends. So I think that the suggestion from Fujii-san is the
>> direction.
>+1
>If this idea has some problems, we can revisit Daisuke-san's idea.

Thanks for your comments. 
Ok, I will work on the fix to wake the checkpointer up at the end of recovery.

Regards,
Daisuke, Higuchi