Re: [HACKERS] Rectifying wrong Date outputs

Bruce Momjian Mon, 05 Sep 2011 16:33:09 -0700

Piyush Newe wrote:
> Hi,
> 
> I was randomly testing some date related stuff on PG & observed that the
> outputs were wrong.
> 
> e.g.
> postgres=# SELECT TO_DATE('01-jan-2010',  'DD-MON-YY');
>   to_date
> ------------
>  3910-01-01  <--------- Look at this
> (1 row)
> 
> postgres=# SELECT TO_DATE('01-jan-2010',  'DD-MON-YYYY');
>   to_date
> ------------
>  2010-01-01
> (1 row)


I have done some work on this problem, and have developed the attached
patch.  It genarates the output in the final column of this table:

                                            Oracle          PostgreSQL      
With PG Patch
 1  TO_DATE('01-jan-1',  'DD-MON-Y')        01-JAN-2011     01-JAN-2001     
01-JAN-2001+
 2  TO_DATE('01-jan-1',  'DD-MON-YY')       01-JAN-2001     01-JAN-2001     
01-JAN-2001
 3  TO_DATE('01-jan-1',  'DD-MON-YYY')      01-JAN-2001     01-JAN-2001     
01-JAN-2001
 4  TO_DATE('01-jan-1',  'DD-MON-YYYY')     01-JAN-0001     01-JAN-0001     
01-JAN-0001
 5  TO_DATE('01-jan-10',  'DD-MON-Y')       Error           01-JAN-2010     
01-JAN-2010
 6  TO_DATE('01-jan-10',  'DD-MON-YY')      01-JAN-2010     01-JAN-2010     
01-JAN-2010
 7  TO_DATE('01-jan-10',  'DD-MON-YYY')     01-JAN-2010     01-JAN-2010     
01-JAN-2010
 8  TO_DATE('01-jan-10',  'DD-MON-YYYY')    01-JAN-0010     01-JAN-0010     
01-JAN-0010
 9  TO_DATE('01-jan-067',  'DD-MON-Y')      Error           01-JAN-2067     
01-JAN-2067
10  TO_DATE('01-jan-111',  'DD-MON-YY')     01-JAN-0111     01-JAN-2011     
01-JAN-2111*+
11  TO_DATE('01-jan-678',  'DD-MON-YYY')    01-JAN-2678     01-JAN-1678     
01-JAN-1678+
12  TO_DATE('01-jan-001',  'DD-MON-YYYY')   01-JAN-0001     01-JAN-0001     
01-JAN-0001
13  TO_DATE('01-jan-2010',  'DD-MON-Y')     Error           01-JAN-4010     
01-JAN-2010*
14  TO_DATE('01-jan-2010',  'DD-MON-YY')    01-JAN-2010     01-JAN-3910     
01-JAN-2010*
15  TO_DATE('01-jan-2010',  'DD-MON-YYY')   Error           01-JAN-3010     
01-JAN-2010*
16  TO_DATE('01-jan-2010',  'DD-MON-YYYY')  01-JAN-2010     01-JAN-2010     
01-JAN-2010

I marked with '*' every case where the patch doesn't match current PG,
and used a '+' to mark every case where it doesn't match Oracle.

I know Tom was worried that because the year field took more digits than
specified, it would prevent numeric columns from being pulled apart, but
our code has this check:

    if (S_FM(node->suffix) || is_next_separator(node))
    {
        /*
         * This node is in Fill Mode, or the next node is known to be a
         * non-digit value, so we just slurp as many characters as we can get.
         */
        errno = 0;
        result = strtol(init, src, 10);
    }

The reason these tests are accepting an unlimited number of digits is
because it is at the end of the string.  If you place a digit field
right after it, it will not use more characters than specified:

        test=> select to_date('9876', 'YYY');
          to_date
        ------------
         9876-01-01
        (1 row)
        
        test=> select to_date('9876', 'YYYMM');
          to_date
        ------------
         1987-06-01
        (1 row)

Yes, not documented, but I assume the coder was trying to be helpful.

-- 
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +

diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
new file mode 100644
index 726a1f4..f4677af
*** a/src/backend/utils/adt/formatting.c
--- b/src/backend/utils/adt/formatting.c
*************** static void dump_node(FormatNode *node, 
*** 964,969 ****
--- 964,970 ----
  
  static char *get_th(char *num, int type);
  static char *str_numth(char *dest, char *num, int type);
+ static int add_era_to_partial_year(int year);
  static int	strspace_len(char *str);
  static int	strdigits_len(char *str);
  static void from_char_set_mode(TmFromChar *tmfc, const FromCharDateMode mode);
*************** is_next_separator(FormatNode *n)
*** 1968,1973 ****
--- 1969,1995 ----
  	return TRUE;				/* some non-digit input (separator) */
  }
  
+ 
+ static int
+ add_era_to_partial_year(int year)
+ {
+ 	/* Force 0-69 into the 2000's */
+ 	if (year < 70)
+ 		return year + 2000;
+ 	/* Force 70-99 into the 1900's */
+ 	else if (year >= 70 && year < 100)
+ 		return year + 1900;
+ 	/* Force 100-499 into the 2000's */
+ 	else if (year >= 100 && year < 500)
+ 		return year + 2000;
+ 	/* Force 500-999 into the 1000's */
+ 	else if (year >= 500 && year < 1000)
+ 		return year + 1000;
+ 	else
+ 		return year;
+ }
+ 
+ 
  static int
  strspace_len(char *str)
  {
*************** DCH_from_char(FormatNode *node, char *in
*** 2931,2972 ****
  			case DCH_YYY:
  			case DCH_IYY:
  				from_char_parse_int(&out->year, &s, n);
  				out->yysz = 3;
- 
- 				/*
- 				 * 3-digit year: '100' ... '999' = 1100 ... 1999 '000' ...
- 				 * '099' = 2000 ... 2099
- 				 */
- 				if (out->year >= 100)
- 					out->year += 1000;
- 				else
- 					out->year += 2000;
  				s += SKIP_THth(n->suffix);
  				break;
  			case DCH_YY:
  			case DCH_IY:
  				from_char_parse_int(&out->year, &s, n);
  				out->yysz = 2;
- 
- 				/*
- 				 * 2-digit year: '00' ... '69'	= 2000 ... 2069 '70' ... '99'
- 				 * = 1970 ... 1999
- 				 */
- 				if (out->year < 70)
- 					out->year += 2000;
- 				else
- 					out->year += 1900;
  				s += SKIP_THth(n->suffix);
  				break;
  			case DCH_Y:
  			case DCH_I:
  				from_char_parse_int(&out->year, &s, n);
  				out->yysz = 1;
- 
- 				/*
- 				 * 1-digit year: always +2000
- 				 */
- 				out->year += 2000;
  				s += SKIP_THth(n->suffix);
  				break;
  			case DCH_RM:
--- 2953,2974 ----
  			case DCH_YYY:
  			case DCH_IYY:
  				from_char_parse_int(&out->year, &s, n);
+ 				out->year = add_era_to_partial_year(out->year);
  				out->yysz = 3;
  				s += SKIP_THth(n->suffix);
  				break;
  			case DCH_YY:
  			case DCH_IY:
  				from_char_parse_int(&out->year, &s, n);
+ 				out->year = add_era_to_partial_year(out->year);
  				out->yysz = 2;
  				s += SKIP_THth(n->suffix);
  				break;
  			case DCH_Y:
  			case DCH_I:
  				from_char_parse_int(&out->year, &s, n);
+ 				out->year = add_era_to_partial_year(out->year);
  				out->yysz = 1;
  				s += SKIP_THth(n->suffix);
  				break;
  			case DCH_RM:

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Rectifying wrong Date outputs

Reply via email to