Hi Ayush,

thanks for your thoughts!

My answers:

- On the official Hive page about types this is not documented at all:
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=82706456#LanguageManualTypes-date

"Dates:  DATE values describe a particular year/month/day, in the form
*YYYY-­MM-­DD.
For example, DATE '2013-­01-­01'.* Date types do not have a time of day
component. The range of values supported for the Date type is 0000-­01-­01
to 9999-­12-­31, dependent on support by the primitive Java Date type."

- Since the beginning of Hive it worked like returning null, this was
introduced one year ago as a side effect of this: HIVE-27586 "Parse dates
from strings ignoring trailing (potentialy) invalid chars", so the goal of
this ticket was to be able to handle string to date parsing like :
"2024-09-18 GARBAGE" and parse it to 2024-09-18 as date.
*So it is not a long standing behaviour.*

I made a comparison of other DB (Postre, MySQL, ORACLE) and none of them
handle it like Hive now. Please check it here:
https://issues.apache.org/jira/browse/HIVE-28483
I agree with you to align with those. This is why I asked this question.

Thanks and regards,

Zoltan Ratkai


On Wed, Sep 18, 2024 at 8:58 AM Ayush Saxena <ayush...@gmail.com> wrote:

> ++ Adding the dev mailing list, as this might be relevant to them as well.
>
> I have a few additional questions:
>
> * Has this behavior been present since the inception of Hive, or did
> it start occurring more recently? If this has been the behavior for
> over a decade, it might be best to leave it unchanged. However, if
> this is a recent development, we should aim to restore the original
> behavior. In my opinion, any change that alters long-standing behavior
> could be considered incompatible unless it was introduced to prevent
> data loss or address a security issue.
> * How do other engines (e.g., Impala, Spark) handle similar scenarios,
> and how do other databases like MySQL and PostgreSQL behave in this
> regard? In the past, we've typically aimed to align with these
> systems, not necessarily exactly, but closely enough—especially in
> cases where there's ambiguity or conflict.
>
>
> -Ayush
>
>
> On Wed, 18 Sept 2024 at 12:00, Zoltán Rátkai <zrat...@cloudera.com> wrote:
> >
> > Dear Hive User!
> >
> > In Hive the standard date format is YYYY-MM-DD, like “2024-09-17”.
> > So when casting a string which format is different, like DD-MM-YYYY e.g.
> “17-09-2024" to date, as a Hive user what would you expect as a result?
> >
> >
> > 1.  “0017-09-20”  => Year of 17 September 20
> > or
> > 2. null
> >
> > It is possible to give a date pattern CAST(<string-value> AS DATE FORMAT
> <date-pattern>)and cast with it like "DD-MM-YYYY".
> > Earlier it gave back null, but currently Hive gives back the first,
> which I think unexpected as a user and customers complain about that. There
> is a debate if it is a bug or feature, so I need your help!
> >
> > I would like to ask you to reply with a vote on 1 or 2 what you want as
> a user to happen!
> >
> > I really appreciate your input!
> >
> > Thank you,
> >
> > Zoltan Ratkai
>

Reply via email to