>> * ToDate(chararray) accepts ISO-8601 'T' timestamps, but not
>>   JDBC space ' ' timestamps ... thereby make it incompatible
>>   with hive, impala & JDBC data sources

> ToDate use org.joda.time.format.ISODateTimeFormat to parse
> date string. I am open to change as long as it does not
> break backward compatibility. Any suggestion?

We can construct and use a different joda-time DateTimeFormatter
which will support either ISO-8601 'T' format or JDBC ' ' space
format for timestamps. (See my separate message regarding
continuing/future use of joda-time vs JSR-310 time.)

I will file a JIRA and will submit a patch. As a newbie I may
need some assistance with the mechanics.

>> * casting: (datetime)timestampString fails, even though
>>   datetime is listed as a primitive data type.

> This can be fixed, can you open a Jira?

Yes, I will open a JIRA.

>> * ToDate(chararray) throws an exception (rather than
>>   returning null) when given a mal-formed timestamp
>>   ... is this the desired behavior?

> Probably return a null is better, but we cannot break
> backward compatibility. We can create a new UDF to return
> null.

Two thoughts:

1. Throwing an exception terminates one's pig job. I cannot
   imagine this being interpreted as 'desirable behavior'.
   Halting a job doesn't seem to be in the spirit of pig.
   Therefore, I think there is a strong argument that this is a
   bug fix and that the 'cannot break backward compatibility'
   constraint can be relaxed.

2. Since the datatype is called 'datetime', the built-in UDF
   should not be called 'ToDate' anyway. Therefore, we should
   create a new UDF with the desired behavior and call it
   'ToDateTime' so that the function name is consistent with
   the datatype name.

Michael
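
P.S. To make the two proposals above concrete (one formatter that accepts either the 'T' or the ' ' separator, and null rather than an exception on bad input), here is a minimal sketch. Note that it uses the JDK's JSR-310 classes (java.time) instead of joda-time, only so that it runs without extra jars. The class name LenientTimestamp and the method parseOrNull are hypothetical names for illustration, not anything in Pig today. I believe joda-time's builder offers a comparable way to combine parsers, but the actual patch would need to confirm that.

```java
import java.time.DateTimeException;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.ResolverStyle;

// Hypothetical helper, not part of Pig: accepts "yyyy-MM-ddTHH:mm:ss" or "yyyy-MM-dd HH:mm:ss".
final class LenientTimestamp {

    // JDBC-style layout: ISO date, a single space, ISO time.
    private static final DateTimeFormatter JDBC_SPACE = new DateTimeFormatterBuilder()
            .append(DateTimeFormatter.ISO_LOCAL_DATE)
            .appendLiteral(' ')
            .append(DateTimeFormatter.ISO_LOCAL_TIME)
            .toFormatter();

    // Try the ISO 'T' layout first, then the JDBC space layout.
    // STRICT resolution rejects impossible dates such as Feb 30 instead of adjusting them.
    private static final DateTimeFormatter T_OR_SPACE = new DateTimeFormatterBuilder()
            .appendOptional(DateTimeFormatter.ISO_LOCAL_DATE_TIME)
            .appendOptional(JDBC_SPACE)
            .toFormatter()
            .withResolverStyle(ResolverStyle.STRICT);

    // Returns null (rather than throwing) for null or malformed input.
    static LocalDateTime parseOrNull(String text) {
        if (text == null) {
            return null;
        }
        try {
            return LocalDateTime.parse(text, T_OR_SPACE);
        } catch (DateTimeException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseOrNull("2014-01-02T03:04:05"));
        System.out.println(parseOrNull("2014-01-02 03:04:05"));
        System.out.println(parseOrNull("not a timestamp"));
    }
}
```

Whether returning null should live in ToDate itself or in a new 'ToDateTime' UDF is the open question from point 2 above; the sketch only shows that the parsing side is small.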
