On 2020-07-18 18:09 +0100, Rui Barradas wrote:
| Às 17:59 de 18/07/2020, H escreveu:
| | On Fri, Jul 17, 2020 at 6:28 PM H <age...@meddatainc.com> wrote:
| | | 
| | | The problem I am having is that 
| | | the csv files have header rows 
| | | with column names that are 
| | | slightly different from the column 
| | | names I have assigned in the 
| | | dataframe and it seems that when I 
| | | read the csv data into the 
| | | dataframe, the column names from 
| | | the csv file replace the column 
| | | names I chose when creating the 
| | | dataframe.
| | | 
| | | A secondary issue is that the csv 
| | | files have a column with a date in 
| | | mm/dd/yyyy format that I would 
| | | like to make into a Date type 
| | | column in my dataframe. Again, I 
| | | have been unable to find a way - 
| | | if at all possible - to force a 
| | | conversion into a Date format when 
| | | importing into the dataframe. The 
| | | best I have so far is to import is 
| | | a character column and then use 
| | | as.Date() to later force the 
| | | conversion of the dataframe 
| | | column.
| | 
| | The documentation for read.csv.sql() 
| | suggests that colClasses() and/or 
| | field.types() should work but I may 
| | well have misunderstood the 
| | documentation, hence my question in 
| | this group.
| 
| As for colClasses, those are R class 
| names.

Ok Mister H, I might have hit the nail 
on the head this time with this badass 
example for your usecase:

        # Make a csv with %d/%m/%Y dates in it ...
        Lines <- "STM05-1 2005/02/28 17:35 Good -35.562 177.158
        STM05-1 2005/02/28 19:44 Good -35.487 177.129
        STM05-1 2005/02/28 23:01 Unknown -35.399 177.064
        STM05-1 2005/03/01 07:28 Unknown -34.978 177.268
        STM05-1 2005/03/01 18:06 Poor -34.799 177.027
        STM05-1 2005/03/01 18:47 Poor -34.85 177.059
        STM05-2 2005/02/28 12:49 Good -35.928 177.328
        STM05-2 2005/02/28 21:23 Poor -35.926 177.314
        "
        DF <- read.table(textConnection(Lines), as.is = TRUE,
          col.names = c("Id", "Date", "Time", "Quality", "Lat", "Long"))
        DF$Date <- format(as.Date(DF$Date, "%Y/%m/%d"), "%d/%m/%Y")
        write.csv(DF, file="df.csv", row.names=FALSE)
        
        colClasses <-
          c("character",
            "Date",
            "character",
            "character",
            "numeric",
            "numeric")
        sql <- paste0(
          "select ",
            "date(",  # [2]
              "substr(Date, 8, 4) || '-' || ",  # [1]
              "substr(Date, 5, 2) || '-' || ",
              "substr(Date, 2, 2)), Long, Lat, Quality ",
          "from ff where Quality like '%oo%' and Long>177.129")
        ff <- file(description="df.csv", open="r")
        dat <- sqldf::read.csv.sql(
          sql=sql, colClasses=colClasses)
        close(ff)
        
        str(dat)
        
        as.Date(dat[,1])
        dat[,3]

Both sqlite and Postgres has a function 
substr you can call on strings like 
this.[5]  I have a hunch this has always 
been possible in sql from way back ... 

The warning from sqldf about unused 
connections, might suggest file 
descriptor handling to be a bit crusty 
... [3] 

The thing is, defining the second column 
as of type Date in colClasses happens to 
work, but it's still character when you 
check with str(dat) ... perhaps it has 
something to do with this info from [4]: 

        as_tibble_row() converts a vector to 
        a tibble with one row. The input 
        must be a bare vector, e.g. vectors 
        of dates are not supported yet. If 
        the input is a list, all elements 
        must have length one.

[1] 
https://stackoverflow.com/questions/15563656/convert-string-to-date-in-sqlite
[2] https://www.sqlite.org/lang_datefunc.html
[3] https://groups.google.com/forum/#!topic/sqldf/mcQ_K_E--q8
[4] https://tibble.tidyverse.org/reference/as_tibble.html
[5] https://www.sqlite.org/lang_corefunc.html#substr, 
https://www.postgresql.org/docs/9.1/functions-string.html,
http://www.h2database.com/html/functions.html#substring 

Attachment: signature.asc
Description: PGP signature

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to