Hi,

For now, there are only two IntervalUnit types in Arrow:

- YearMonth - months stored as int32
- DayTime - days as int32 and time in milliseconds as int32 (64 bits in total)
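
To make the two layouts concrete, here is a minimal Rust sketch. The struct
names and constructors are hypothetical and only illustrate the fields listed
above; they are not the types defined in the Arrow crate.

```rust
// Illustrative stand-ins for the two interval units; hypothetical names,
// not the types defined in the Arrow crate.

/// YearMonth: a single 32-bit count of months.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct YearMonthSketch {
    months: i32,
}

/// DayTime: a 32-bit day count plus 32-bit milliseconds, 64 bits in total.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct DayTimeSketch {
    days: i32,
    millis: i32,
}

impl YearMonthSketch {
    /// Years are folded into months, since 1 year is always 12 months.
    fn new(years: i32, months: i32) -> Self {
        YearMonthSketch { months: years * 12 + months }
    }
}

impl DayTimeSketch {
    /// Hours, minutes and seconds are folded into milliseconds. Days stay
    /// separate because 1 day is not always 24 hours.
    fn new(days: i32, hours: i32, minutes: i32, seconds: i32) -> Self {
        let millis = ((hours * 60 + minutes) * 60 + seconds) * 1000;
        DayTimeSketch { days, millis }
    }
}
```

Neither struct has room for the other's fields, which is why an interval with
both a month part and a day/time part does not fit in either unit.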

Since DF uses Arrow, it’s not possible to store “complex” intervals such as
1 MONTH 1 DAY 1 HOUR.
I think the best way to understand the problem is to read a comment from the
DF codebase:
https://github.com/apache/arrow/blob/bca7d2fe84ccd8fc1129cb4d85448eb0779c52c3/rust/datafusion/src/sql/planner.rs#L1148

        // Interval is tricky thing
        // 1 day is not 24 hours because timezones, 1 year != 365/364! 30 days != 1 month
        // The true way to store and calculate intervals is to store it as it defined
        // Due the fact that Arrow supports only two types YearMonth (month) and DayTime (day, time)
        // It's not possible to store complex intervals
        // It's possible to do select (NOW() + INTERVAL '1 year') + INTERVAL '1 day'; as workaround
        if result_month != 0 && (result_days != 0 || result_millis != 0) {
            return Err(DataFusionError::NotImplemented(format!(
                "DF does not support intervals that have both a Year/Month part as well as Days/Hours/Mins/Seconds: {:?}. Hint: try breaking the interval into two parts, one with Year/Month and the other with Days/Hours/Mins/Seconds - e.g. (NOW() + INTERVAL '1 year') + INTERVAL '1 day'",
                value
            )));
        }



I prepared a PR, https://github.com/apache/arrow/pull/9516/files, that
introduces a new IntervalUnit type called Complex, which stores both YearMonth
and DayTime to support complex intervals.
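
As a rough illustration of the idea (not the actual layout used in the PR),
a value that keeps all three components side by side could look like the
sketch below. The struct name, field names, and helper functions are
assumptions made for this example.

```rust
// Hypothetical sketch of an interval that keeps the YearMonth and DayTime
// parts together. This is NOT the layout from the PR, only an illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ComplexIntervalSketch {
    months: i32, // YearMonth part
    days: i32,   // DayTime part: days
    millis: i32, // DayTime part: milliseconds
}

impl ComplexIntervalSketch {
    /// Adds two intervals field by field, without converting between
    /// months, days and milliseconds. Returns None on i32 overflow.
    fn checked_add(self, other: Self) -> Option<Self> {
        Some(ComplexIntervalSketch {
            months: self.months.checked_add(other.months)?,
            days: self.days.checked_add(other.days)?,
            millis: self.millis.checked_add(other.millis)?,
        })
    }
}

/// INTERVAL '1 MONTH 1 DAY 1 HOUR' stored as one value.
fn one_month_one_day_one_hour() -> ComplexIntervalSketch {
    ComplexIntervalSketch {
        months: 1,
        days: 1,
        millis: 60 * 60 * 1000,
    }
}
```

Keeping the components separate means no lossy conversion such as
30 days = 1 month is needed until the interval is applied to a concrete
timestamp.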
I didn’t find any page or documentation on how to do an RFC in Arrow, so can
anyone point me to it, or will a PR with an email be enough?

Thanks.
