[ 
https://issues.apache.org/jira/browse/ARROW-15798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500701#comment-17500701
 ] 

Dragoș Moldovan-Grünfeld commented on ARROW-15798:
--------------------------------------------------

[~jorisvandenbossche] & [~rokm] I am also running into some trouble with the 
conversion from double/float to date. 

In R we could do something like this:

{code:r}
# convert a double to date 
as.Date(34.56, origin = "1970-01-01") 
#> [1] "1970-02-04"
{code}

and then add a fraction of a day to it, which spills over into the following day:
{code:r}
as.Date(34.56, origin = "1970-01-01") + 0.45
#> [1] "1970-02-05"
{code}

This works because, under the hood, the fractional numeric representation is preserved:
{code:r}
as.numeric(as.Date(34.56, origin = "1970-01-01"))
#> [1] 34.56
{code} 
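
To make the contrast concrete, here is a minimal base R sketch (no arrow calls) of what changes when the fractional part is dropped up front, which is roughly what an integer-backed day count would store. The variable names are purely illustrative, and using {{floor()}} as the truncation is my assumption about how such a conversion would behave, not a statement about what Arrow's cast does.

{code:r}
# Illustrative only: compare a double-backed Date with one truncated to whole days
x <- 34.56

d_frac <- as.Date(x, origin = "1970-01-01")         # keeps 34.56 under the hood
d_int  <- as.Date(floor(x), origin = "1970-01-01")  # assumption: truncate to whole days

# Both print as the same calendar day ...
same_day <- format(d_frac) == format(d_int)

# ... but adding 0.45 days only crosses midnight for the fractional version
shifted_frac <- format(d_frac + 0.45)  # "1970-02-05"
shifted_int  <- format(d_int + 0.45)   # "1970-02-04"
{code}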

{{date64()}} might be the solution, but on the roundtrip to R a 
{{date64()}} value is converted to a date-time rather than a date:
{code:r}
library(arrow, warn.conflicts = FALSE)
#> See arrow_info() for available features

a <- Array$create(34.56)
(a*86400*1000)$cast(int64())$cast(date64())
#> Array
#> <date64[ms]>
#> [
#>   1970-02-04
#> ]
{code}

In a dplyr pipeline:
{code:r}
library(dplyr, warn.conflicts = FALSE)
library(arrow, warn.conflicts = FALSE)
#> See arrow_info() for available features

df <- tibble::tibble(
  a = 34.56
)

df %>% 
  mutate(b = as.Date(a, origin = "1970-01-01"))
#> # A tibble: 1 × 2
#>       a b         
#>   <dbl> <date>    
#> 1  34.6 1970-02-04

df %>% 
  arrow_table() %>% 
  mutate(b = as.Date(a, origin = "1970-01-01")) %>% 
  collect()
#> # A tibble: 1 × 2
#>       a b         
#>   <dbl> <dttm>    
#> 1  34.6 1970-02-04 14:26:24
{code}
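
The time-of-day in the last result is just the fractional part of the double: 0.56 of a day is 13 hours, 26 minutes and 24 seconds. Below is a minimal base R sketch of that arithmetic (no arrow calls; the variable names are illustrative, and the {{round()}} is only there to guard against floating-point noise). It formats in UTC, so the hour differs from the output above, which I assume was printed in a local time zone.

{code:r}
days <- 34.56
secs <- round(days * 86400)   # seconds since 1970-01-01

# Interpret the seconds as a date-time, which is what the roundtrip hands back
dttm <- as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")
dttm_txt <- format(dttm, "%Y-%m-%d %H:%M:%S", tz = "UTC")
# "1970-02-04 13:26:24"

# Dropping the time-of-day on the R side recovers the date
date_txt <- format(as.Date(dttm, tz = "UTC"))
# "1970-02-04"
{code}

An explicit {{as.Date()}} after {{collect()}} along these lines might serve as a stopgap, at the cost of losing the fractional part.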

> [R][C++] Discussion: Plans for date casting from int to support an origin 
> option?
> ---------------------------------------------------------------------------------
>
>                 Key: ARROW-15798
>                 URL: https://issues.apache.org/jira/browse/ARROW-15798
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++, R
>            Reporter: Dragoș Moldovan-Grünfeld
>            Priority: Major
>
> 2 questions:
> * plans to support an origin option for int -> date32 casting?
> * plans to support double -> date32 casting? 
> =======================
> Currently the casting from integer to date works, but assumes epoch 
> (1970-01-01) as the origin. 
> {code:r}
> > a <- Array$create(32L)
> > a$cast(date32())
> Array
> <date32[day]>
> [
>   1970-02-02
> ]
> {code}
> Would it make sense to have an {{origin}} option that would allow the user to 
> fine tune the casting? For example, in R the {{base::as.Date()}} function has 
> such an argument
> {code:r}
> > as.Date(32, origin = "1970-01-02")
> [1] "1970-02-03"
> {code}
> We have a potential workaround in R (once we support date & duration 
> arithmetic), but I was wondering if there might be more general interest in 
> this. 
> A secondary aspect (as my R example shows): R supports casting to date not only 
> from integers, but also from doubles. Would there be interest in that? If need be, 
> I can split this into several tickets.  
> Are there any plans in either of these 2 directions?



