Re: [R] Help with read.csv.sql()

H Mon, 20 Jul 2020 14:56:02 -0700

On 07/18/2020 11:42 PM, Rasmus Liland wrote:
> On 2020-07-18 18:09 +0100, Rui Barradas wrote:
> | �s 17:59 de 18/07/2020, H escreveu:
> | | On Fri, Jul 17, 2020 at 6:28 PM H <age...@meddatainc.com> wrote:
> | | | 
> | | | The problem I am having is that 
> | | | the csv files have header rows 
> | | | with column names that are 
> | | | slightly different from the column 
> | | | names I have assigned in the 
> | | | dataframe and it seems that when I 
> | | | read the csv data into the 
> | | | dataframe, the column names from 
> | | | the csv file replace the column 
> | | | names I chose when creating the 
> | | | dataframe.
> | | | 
> | | | A secondary issue is that the csv 
> | | | files have a column with a date in 
> | | | mm/dd/yyyy format that I would 
> | | | like to make into a Date type 
> | | | column in my dataframe. Again, I 
> | | | have been unable to find a way - 
> | | | if at all possible - to force a 
> | | | conversion into a Date format when 
> | | | importing into the dataframe. The 
> | | | best I have so far is to import is 
> | | | a character column and then use 
> | | | as.Date() to later force the 
> | | | conversion of the dataframe 
> | | | column.
> | | 
> | | The documentation for read.csv.sql() 
> | | suggests that colClasses() and/or 
> | | field.types() should work but I may 
> | | well have misunderstood the 
> | | documentation, hence my question in 
> | | this group.
> | 
> | As for colClasses, those are R class 
> | names.
>
> Ok Mister H, I might have hit the nail 
> on the head this time with this badass 
> example for your usecase:
>
>       # Make a csv with %d/%m/%Y dates in it ...
>       Lines <- "STM05-1 2005/02/28 17:35 Good -35.562 177.158
>       STM05-1 2005/02/28 19:44 Good -35.487 177.129
>       STM05-1 2005/02/28 23:01 Unknown -35.399 177.064
>       STM05-1 2005/03/01 07:28 Unknown -34.978 177.268
>       STM05-1 2005/03/01 18:06 Poor -34.799 177.027
>       STM05-1 2005/03/01 18:47 Poor -34.85 177.059
>       STM05-2 2005/02/28 12:49 Good -35.928 177.328
>       STM05-2 2005/02/28 21:23 Poor -35.926 177.314
>       "
>       DF <- read.table(textConnection(Lines), as.is = TRUE,
>         col.names = c("Id", "Date", "Time", "Quality", "Lat", "Long"))
>       DF$Date <- format(as.Date(DF$Date, "%Y/%m/%d"), "%d/%m/%Y")
>       write.csv(DF, file="df.csv", row.names=FALSE)
>       
>       colClasses <-
>         c("character",
>           "Date",
>           "character",
>           "character",
>           "numeric",
>           "numeric")
>       sql <- paste0(
>         "select ",
>           "date(",  # [2]
>             "substr(Date, 8, 4) || '-' || ",  # [1]
>             "substr(Date, 5, 2) || '-' || ",
>             "substr(Date, 2, 2)), Long, Lat, Quality ",
>         "from ff where Quality like '%oo%' and Long>177.129")
>       ff <- file(description="df.csv", open="r")
>       dat <- sqldf::read.csv.sql(
>         sql=sql, colClasses=colClasses)
>       close(ff)
>       
>       str(dat)
>       
>       as.Date(dat[,1])
>       dat[,3]
>
> Both sqlite and Postgres has a function 
> substr you can call on strings like 
> this.[5]  I have a hunch this has always 
> been possible in sql from way back ... 
>
> The warning from sqldf about unused 
> connections, might suggest file 
> descriptor handling to be a bit crusty 
> ... [3] 
>
> The thing is, defining the second column 
> as of type Date in colClasses happens to 
> work, but it's still character when you 
> check with str(dat) ... perhaps it has 
> something to do with this info from [4]: 
>
>       as_tibble_row() converts a vector to 
>       a tibble with one row. The input 
>       must be a bare vector, e.g. vectors 
>       of dates are not supported yet. If 
>       the input is a list, all elements 
>       must have length one.
>
> [1] 
> https://stackoverflow.com/questions/15563656/convert-string-to-date-in-sqlite
> [2] https://www.sqlite.org/lang_datefunc.html
> [3] https://groups.google.com/forum/#!topic/sqldf/mcQ_K_E--q8
> [4] https://tibble.tidyverse.org/reference/as_tibble.html
> [5] https://www.sqlite.org/lang_corefunc.html#substr, 
> https://www.postgresql.org/docs/9.1/functions-string.html,
> http://www.h2database.com/html/functions.html#substring 
>
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


Thank you for your extensive example. However, I have decided to simply convert 
column types as necessary and rename columns as desired after importing the 
data since that seems the simplest solution.


        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Help with read.csv.sql()

Reply via email to