Hi all, I also feel that adding more config doesn't make sense in this as we are tightening the date and timestamp format. We should decide upon a single solution even if it break the compatibility. Below the comparison of HIVE 1.2, HIVE 3.2, MYSQL, PostgreSQL, Oracle
Query Hive 1.2 Hive 3.2 Mysql PostgreSQL ORACLE select cast('2020-20-20' as date); NULL 2021-08-20 NULL date/time field value out of range: "2020-20-20" not a valid month select cast(null as date); NULL NULL NULL NULL NULL select cast('2020-02-31' as date); 2020-03-02 2020-03-02 NULL date/time field value out of range: "2020-02-31" date format picture ends before converting entire input string select cast('2020/02/20' as date); NULL NULL 2020-02-20 2020-02-20 literal does not match format string select cast('0000-00-00' as date); NULL 0002-11-30 NULL date/time field value out of range: "0000-00-00" literal does not match format string >From the comparison it is quite clear that date and timestamp formatting was much tighter in older versions of HIVE. For most of the wrong date input *NULL *was the standard response instead of Exception. Also when I went through the code I found that. While doing the Vector implementation of some of the date related UDF like datediff etc. MySql was taken as the gold standard <https://issues.apache.org/jira/browse/HIVE-15338?focusedCommentId=15727553&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-15727553>. So it make more sense that we should comply with MySql as we already refer MySql as gold standard and returning NULL as result for wrong dates in cast is also documented <https://cwiki.apache.org/confluence/display/hive/languagemanual+types#LanguageManualTypes-CastingDates> *So I propose to make NULL as the standard response for all parsing errors.* Thanks Ashish Sharma On Tue, Jul 13, 2021 at 9:52 PM Stamatis Zampetakis <zabe...@gmail.com> wrote: > Hi all, > > Thanks for pushing this forward Ashish! > > Actually I am not in favor of creating a flag for this. Either we decide > consciously to break backward compatibility in the hope that we are > improving the expected results or we keep the current behavior. > Adding another flag means that we maintain and support two variants that > makes the problem of test coverage brought by David even worse. > > I second David's idea to run some tests over some well adopted DBMS (MySQL, > Oracle, MSSQL, Postgres) to see what they return. > I think Ashish already did some tests over MySQL and MSSQL but personally I > would like to see some more (dates + engines) in order to express > a preference. > We shouldn't forget that since Hive is implemented in Java, having > functions that are inline with the Java APIs is not such a bad idea. > The last comment is slightly supportive of the current behavior. > > I am including user@ list in the discussion since we should definitely > consider the feedback of people that are using Hive for real. > > Best, > Stamatis > > On Tue, Jul 13, 2021 at 4:31 PM David <dam6...@gmail.com> wrote: > > > Hello, > > > > Is anyone able to try out a few different vendor RDBMS to see how they > > handle invalid dates, or provide links to documentation, both for invalid > > formatting and things like mm-dd-yyy 12-40-2021? > > > > Thanks. > > > > On Tue, Jul 13, 2021 at 5:14 AM Sankar Hariappan > > <sankar.hariap...@microsoft.com.invalid> wrote: > > > >> I'm supporting this change to return "NULL" for invalid date/timestamp. > >> In the interest of backward compatibility, can we make all these changes > >> under a flag which can be enabled by default? > >> > >> > >> Thanks, > >> Sankar > >> -----Original Message----- > >> From: David <dam6...@gmail.com> > >> Sent: 10 July 2021 07:35 > >> To: dev <d...@hive.apache.org> > >> Cc: sank...@apache.org; Stamatis Zampetakis <zabe...@gmail.com> > >> Subject: [EXTERNAL] Re: Move Date and Timestamp parsing from > >> ResolverStyle.LENIENT to ResolverStyle.STRICT > >> > >> Hello, > >> > >> I too would be in favor of this. It drastically cuts down on the test > >> matrix for Hive if we can clamp down on timestamp formats. With that > being > >> said, I've tried this and it's a big effort. I put it down without > getting > >> consensus or buy-in or engagement on the effort. Please check out my > work > >> here: > >> > >> > >> > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fissues.apache.org%2Fjira%2Fplugins%2Fservlet%2Fmobile%23issue%2FHIVE-24814&data=04%7C01%7CSankar.Hariappan%40microsoft.com%7Cd47432b9d7654d66a46908d943472338%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637614795446338436%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=mvYgaG7liJOwUZmMvgwlo%2B1HvcUsrnzXA3Ltfz5yEYE%3D&reserved=0 > >> > >> > >> On Fri, Jul 9, 2021, 9:49 PM Ashish Sharma < > ashishkumarsharm...@gmail.com > >> > > >> wrote: > >> > >> > Hi, > >> > > >> > When casting incorrect date or timestamp literals to DATE or TIMESTAMP > >> > data type hive returns wrong values > >> > > >> > hive> select cast('2020-20-20' as date); > >> > > >> > OK > >> > > >> > 2021-08-20 > >> > > >> > Time taken: 4.436 seconds, Fetched: 1 row(s) > >> > > >> > > >> > I have created a solution draft. Please review the draft and provide > >> > your valuable feedback on the same. > >> > > >> > > >> > > >> > > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs > >> > .google.com > %2Fdocument%2Fd%2F1YTTPlNq3qyzlKfYVkSl3EFhVQ6-wa9WFRdkdIeCo > >> > c1Y%2Fedit%3Fusp%3Dsharing&data=04%7C01%7CSankar.Hariappan%40micro > >> > soft.com > %7Cd47432b9d7654d66a46908d943472338%7C72f988bf86f141af91ab2d7c > >> > d011db47%7C1%7C0%7C637614795446338436%7CUnknown%7CTWFpbGZsb3d8eyJWIjoi > >> > MC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000& > >> > sdata=iihK9wJC%2B1uPktHSE9BpXADvbal1UT7vZ3rwigkgkIY%3D&reserved=0 > >> > > >> > > >> > > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fissu > >> > es.apache.org > %2Fjira%2Fbrowse%2FHIVE-25306&data=04%7C01%7CSankar.H > >> > ariappan%40microsoft.com > %7Cd47432b9d7654d66a46908d943472338%7C72f988bf > >> > 86f141af91ab2d7cd011db47%7C1%7C0%7C637614795446338436%7CUnknown%7CTWFp > >> > bGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn > >> > 0%3D%7C3000&sdata=nW%2Bw%2B0AYn%2BbvOqRLrXghfH0FG%2B1cQW%2BzdrpT%2 > >> > B9R%2B6rA%3D&reserved=0 > >> > > >> > > >> > Thank you > >> > > >> > Ashish Sharma > >> > > >> > > >