Re: [R] Help with read.csv.sql()

H Sun, 19 Jul 2020 17:08:46 -0700

On 07/18/2020 01:38 PM, William Michels wrote:
> Do either of the postings/threads below help?
>
> https://r.789695.n4.nabble.com/read-csv-sql-to-select-from-a-large-csv-file-td4650565.html#a4651534
> https://r.789695.n4.nabble.com/using-sqldf-s-read-csv-sql-to-read-a-file-with-quot-NA-quot-for-missing-td4642327.html
>
> Otherwise you can try reading through the FAQ on Github:
>
> https://github.com/ggrothendieck/sqldf
>
> HTH, Bill.
>
> W. Michels, Ph.D.
>
>
>
> On Sat, Jul 18, 2020 at 9:59 AM H <age...@meddatainc.com> wrote:
>> On 07/18/2020 11:54 AM, Rui Barradas wrote:
>>> Hello,
>>>
>>> I don't believe that what you are asking for is possible but like Bert 
>>> suggested, you can do it after reading in the data.
>>> You could write a convenience function to read the data, then change what 
>>> you need to change.
>>> Then the function would return this final object.
>>>
>>> Rui Barradas
>>>
>>> At 16:43 on 18/07/2020, H wrote:
>>>
>>>> On 07/17/2020 09:49 PM, Bert Gunter wrote:
>>>>> Is there some reason that you can't make the changes to the data frame 
>>>>> (column names, as.Date(), ...) *after* you have read all your data in?
>>>>>
>>>>> Do all your csv files use the same names and date formats?
>>>>>
>>>>>
>>>>> Bert Gunter
>>>>>
>>>>> "The trouble with having an open mind is that people keep coming along 
>>>>> and sticking things into it."
>>>>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>>>>
>>>>>
>>>>> On Fri, Jul 17, 2020 at 6:28 PM H <age...@meddatainc.com> wrote:
>>>>>
>>>>>      I have created a dataframe with columns that are characters, 
>>>>> integers and numeric and with column names assigned by me. I am using 
>>>>> read.csv.sql() to read portions of a number of large csv files into this 
>>>>> dataframe, each csv file having a header row with column names.
>>>>>
>>>>>      The problem I am having is that the csv files have header rows with 
>>>>> column names that are slightly different from the column names I have 
>>>>> assigned in the dataframe and it seems that when I read the csv data into 
>>>>> the dataframe, the column names from the csv file replace the column 
>>>>> names I chose when creating the dataframe.
>>>>>
>>>>>      I have been unable to figure out whether it is possible to assign 
>>>>> column names of my choosing in the read.csv.sql() function. I have tried 
>>>>> several variations but none seem to work: colClasses = c(....) did not 
>>>>> work, and I could not get field.types = c(...) to work either.
>>>>>
>>>>>      It seems that the above should be feasible, so perhaps I am missing 
>>>>> something. Does anyone know?
>>>>>
>>>>>      A secondary issue is that the csv files have a column with a date in 
>>>>> mm/dd/yyyy format that I would like to make into a Date type column in my 
>>>>> dataframe. Again, I have been unable to find a way - if at all possible - 
>>>>> to force a conversion into a Date format when importing into the 
>>>>> dataframe. The best I have so far is to import it as a character column 
>>>>> and then use as.Date() to convert the dataframe column afterwards.
>>>>>
>>>>>      Is it possible to do this when importing using read.csv.sql()?
>>>>>
>>>>>      ______________________________________________
>>>>>      R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>>      https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>      PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>>>      and provide commented, minimal, self-contained, reproducible code.
>>>>>
>>>> Yes, the files use the same column names and date format (at least as far 
>>>> as I know now.) I agree I could do it as you suggest above but from a 
>>>> purist perspective I would rather do it when importing the data using 
>>>> read.csv.sql(), particularly if column names and/or date format might 
>>>> change, or be different between different files. I am indeed selecting 
>>>> rows from a large number of csv files so this is entirely plausible.
>>>>
>>>> Has anyone been able to name columns in the read.csv.sql() call and/or 
>>>> force date format conversion in the call itself? The first refers to 
>>>> naming columns differently from what a header in the csv file may have.
>>>>
>>>>
>> The documentation for read.csv.sql() suggests that the colClasses and/or 
>> field.types arguments should work, but I may well have misunderstood the 
>> documentation, hence my question in this group.


I had read the sqldf() documentation but was left with the impression that what 
I want to do is not easily doable.
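
For reference, the read-first-then-fix approach suggested earlier in the thread could be sketched roughly as below. This is only a minimal sketch under stated assumptions: the helper name standardize_columns(), the column names and the sample data are all made up for illustration, and a small inline csv read with base R read.csv() stands in for a file read with read.csv.sql().

```r
# Hypothetical helper (not part of sqldf): rename the columns of a data frame
# and convert one mm/dd/yyyy character column to Date after the data is read.
standardize_columns <- function(df, new_names, date_col = NULL,
                                date_format = "%m/%d/%Y") {
  # One new name is required for every column that was read.
  stopifnot(length(new_names) == ncol(df))
  names(df) <- new_names
  if (!is.null(date_col)) {
    df[[date_col]] <- as.Date(df[[date_col]], format = date_format)
  }
  df
}

# Stand-in for one csv file; the header names are invented for illustration.
raw <- read.csv(
  text = "Patient ID,Visit Date,Score\n1,07/17/2020,3.5\n2,07/18/2020,4.0",
  stringsAsFactors = FALSE, check.names = FALSE
)

clean <- standardize_columns(raw, c("id", "visit_date", "score"),
                             date_col = "visit_date")
str(clean)

# With sqldf installed, the same helper could be applied to the real reader
# (file name is an assumption):
# raw <- sqldf::read.csv.sql("big.csv", sql = "select * from file")
```

Because the renaming is positional, this only works if every file has its columns in the same order, so a file whose layout differs would need its own vector of names.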
