On 07/18/2020 01:38 PM, William Michels wrote: > Do either of the postings/threads below help? > > https://r.789695.n4.nabble.com/read-csv-sql-to-select-from-a-large-csv-file-td4650565.html#a4651534 > https://r.789695.n4.nabble.com/using-sqldf-s-read-csv-sql-to-read-a-file-with-quot-NA-quot-for-missing-td4642327.html > > Otherwise you can try reading through the FAQ on Github: > > https://github.com/ggrothendieck/sqldf > > HTH, Bill. > > W. Michels, Ph.D. > > > > On Sat, Jul 18, 2020 at 9:59 AM H <age...@meddatainc.com> wrote: >> On 07/18/2020 11:54 AM, Rui Barradas wrote: >>> Hello, >>> >>> I don't believe that what you are asking for is possible but like Bert >>> suggested, you can do it after reading in the data. >>> You could write a convenience function to read the data, then change what >>> you need to change. >>> Then the function would return this final object. >>> >>> Rui Barradas >>> >>> Às 16:43 de 18/07/2020, H escreveu: >>> >>>> On 07/17/2020 09:49 PM, Bert Gunter wrote: >>>>> Is there some reason that you can't make the changes to the data frame >>>>> (column names, as.date(), ...) *after* you have read all your data in? >>>>> >>>>> Do all your csv files use the same names and date formats? >>>>> >>>>> >>>>> Bert Gunter >>>>> >>>>> "The trouble with having an open mind is that people keep coming along >>>>> and sticking things into it." >>>>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) >>>>> >>>>> >>>>> On Fri, Jul 17, 2020 at 6:28 PM H <age...@meddatainc.com >>>>> <mailto:age...@meddatainc.com>> wrote: >>>>> >>>>> I have created a dataframe with columns that are characters, >>>>> integers and numeric and with column names assigned by me. I am using >>>>> read.csv.sql() to read portions of a number of large csv files into this >>>>> dataframe, each csv file having a header row with columb names. >>>>> >>>>> The problem I am having is that the csv files have header rows with >>>>> column names that are slightly different from the column names I have >>>>> assigned in the dataframe and it seems that when I read the csv data into >>>>> the dataframe, the column names from the csv file replace the column >>>>> names I chose when creating the dataframe. >>>>> >>>>> I have been unable to figure out if it is possible to assign column >>>>> names of my choosing in the read.csv.sql() function? I have tried various >>>>> variations but none seem to work. I tried colClasses = c(....) but that >>>>> did not work, I tried field.types = c(...) but could not get that to work >>>>> either. >>>>> >>>>> It seems that the above should be feasible but I am missing >>>>> something? Does anyone know? >>>>> >>>>> A secondary issue is that the csv files have a column with a date in >>>>> mm/dd/yyyy format that I would like to make into a Date type column in my >>>>> dataframe. Again, I have been unable to find a way - if at all possible - >>>>> to force a conversion into a Date format when importing into the >>>>> dataframe. The best I have so far is to import is a character column and >>>>> then use as.Date() to later force the conversion of the dataframe column. >>>>> >>>>> Is it possible to do this when importing using read.csv.sql()? >>>>> >>>>> ______________________________________________ >>>>> R-help@r-project.org <mailto:R-help@r-project.org> mailing list -- >>>>> To UNSUBSCRIBE and more, see >>>>> https://stat.ethz.ch/mailman/listinfo/r-help >>>>> PLEASE do read the posting guide >>>>> http://www.R-project.org/posting-guide.html >>>>> and provide commented, minimal, self-contained, reproducible code. >>>>> >>>> Yes, the files use the same column names and date format (at least as far >>>> as I know now.) I agree I could do it as you suggest above but from a >>>> purist perspective I would rather do it when importing the data using >>>> read.csv.sql(), particularly if column names and/or date format might >>>> change, or be different between different files. I am indeed selecting >>>> rows from a large number of csv files so this is entirely plausible. >>>> >>>> Has anyone been able to name columns in the read.csv.sql() call and/or >>>> force date format conversion in the call itself? The first refers to >>>> naming columns differently from what a header in the csv file may have. >>>> >>>> >>>> [[alternative HTML version deleted]] >>>> >>>> ______________________________________________ >>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >>>> https://stat.ethz.ch/mailman/listinfo/r-help >>>> PLEASE do read the posting guide >>>> http://www.R-project.org/posting-guide.html >>>> and provide commented, minimal, self-contained, reproducible code. >> The documentation for read.csv.sql() suggests that colClasses() and/or >> field.types() should work but I may well have misunderstood the >> documentation, hence my question in this group. >> >> ______________________________________________ >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code.
I had read the sqldf() documentation but was left with the impression that what I want to do is not easily doable. ______________________________________________ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.