>> * ToDate(chararray) accepts ISO-8601 'T' timestamps, but not
>> JDBC space ' ' timestamps ... thereby making it incompatible
>> with Hive, Impala & JDBC data sources

> ToDate uses org.joda.time.format.ISODateTimeFormat to parse
> date strings. I am open to a change as long as it does not
> break backward compatibility. Any suggestions?

We can construct and use a different joda-time DateTimeFormatter that
accepts either the ISO-8601 'T' separator or the JDBC ' ' (space) separator
between the date and the time.

(see my separate message regarding continuing/future use of joda-time vs
JSR-310 time)
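
To make the idea concrete, here is a minimal sketch of a parser that accepts
both separators. It is written against java.time (JSR-310) rather than
joda-time only so that it runs with the JDK alone; it is not the patch
itself. The class and method names are made up for illustration, and it
ignores time zone offsets, which the real ToDate would still have to handle.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DateTimeParseException;

// Hypothetical helper, not part of Pig: tries the ISO-8601 'T' form first,
// then falls back to the JDBC ' ' (space) form.
final class FlexibleTimestampParser {

    // ISO-8601 local form, e.g. 2014-03-05T10:15:30
    static final DateTimeFormatter ISO_T = DateTimeFormatter.ISO_LOCAL_DATE_TIME;

    // JDBC-style form, e.g. 2014-03-05 10:15:30 (fractional seconds optional)
    static final DateTimeFormatter JDBC_SPACE = new DateTimeFormatterBuilder()
            .append(DateTimeFormatter.ISO_LOCAL_DATE)
            .appendLiteral(' ')
            .append(DateTimeFormatter.ISO_LOCAL_TIME)
            .toFormatter();

    // Parses either form; still throws DateTimeParseException if neither matches.
    static LocalDateTime parse(String text) {
        try {
            return LocalDateTime.parse(text, ISO_T);
        } catch (DateTimeParseException notIso) {
            return LocalDateTime.parse(text, JDBC_SPACE);
        }
    }
}
```

With this sketch, "2014-03-05T10:15:30" and "2014-03-05 10:15:30" parse to
the same value. Because the 'T' form is tried first, input that ToDate
accepts today would keep taking the same path.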

I will file a JIRA and submit a patch.

As a newbie I may need some assistance with the mechanics.


>> * casting: (datetime)timestampString fails, even though
>> datetime is listed as a primitive data type.

> This can be fixed, can you open a Jira?

Yes, I will open a JIRA.


>> * ToDate(chararray) throws an exception (rather than
>> returning null) when given a malformed timestamp
>> ... is this the desired behavior?

> Returning null is probably better, but we cannot break
> backward compatibility. We can create a new UDF to return
> null.

Two thoughts:

1. Throwing an exception terminates the entire Pig job. I cannot imagine
anyone regarding that as 'desirable behavior', and halting a job does not
seem to be in the spirit of Pig. I therefore think there is a strong
argument that this is a bug fix and that the 'cannot break backward
compatibility' constraint can be relaxed here.

2. Since the datatype is called 'datetime', the built-in UDF should not be
called 'ToDate' anyway. Therefore, we should create a new UDF with the
desired behavior and call it 'ToDateTime' so that the function name is
consistent with the datatype name.
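
A rough sketch of the null-returning behavior I have in mind follows. It is
a plain static method using java.time so that it runs with the JDK alone. An
actual 'ToDateTime' UDF would presumably extend Pig's EvalFunc and return a
joda-time DateTime instead. The class and method names are placeholders.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Hypothetical stand-in for the proposed 'ToDateTime' behavior: a malformed
// or missing timestamp yields null instead of an exception that kills the job.
final class LenientToDateTime {

    static LocalDateTime toDateTimeOrNull(String text) {
        if (text == null) {
            return null; // null in, null out
        }
        try {
            return LocalDateTime.parse(text, DateTimeFormatter.ISO_LOCAL_DATE_TIME);
        } catch (DateTimeParseException malformed) {
            return null; // swallow the parse failure; the caller sees a null field
        }
    }
}
```

A script could then filter on 'is not null' to drop the bad records, while
the existing ToDate keeps its current exception-throwing behavior for anyone
who depends on it.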


Michael
