On Tue, Mar 1, 2011 at 10:06 AM, chen jia <chen_1...@fisher.osu.edu> wrote: > Hi Seth, > > Thanks so much for identifying the problem and explaining everything. > I think the first solution that you suggest--make sure the schema has > well defined types--would work the best for me. But, I have one > question about how to implement it, which is more about sqlite itself. > > First, I found out that the columns that don't have the expected data > types in the table annual_data3 are created by aggregate functions in > a separate table. These columns are later combined with other columns > that do. > > I read the link that you provide, > http://www.sqlite.org/datatype3.html. One paragraph says "When > grouping values with the GROUP BY clause values with different storage > classes are considered distinct, except for INTEGER and REAL values > which are considered equal if they are numerically equal. No > affinities are applied to any values as the result of a GROUP by > clause." > > If I understand it correctly, the columns created by aggregate > functions with a GROUP by clause do not have any expected data types. > > My solution is to use CREATE TABLE clause to declare the expected > datatype and then insert the values of columns created by the > aggregate functions with the GROUP by clause. However, this solution > requires a CREATE TABLE cause every time the aggregate function and > the GROUP by clause is used. > > My question is: Is this the best way to make sure that the columns as > a result of a GROUP by clause have the expected data types? Thanks.
That might be a good question to post to the SQLite user's list :-) I don't have an answer off the top of my head. My reading of the SQLite docs would lead me to expect that a GROUP BY clause would not change/remove type if the column being grouped contains all the same declared type affinity. + seth > > Best, > Jia > > On Tue, Mar 1, 2011 at 1:16 AM, Seth Falcon <s...@userprimary.net> wrote: >> Hi Jia, >> >> On Mon, Feb 28, 2011 at 6:57 PM, chen jia <chen_1...@fisher.osu.edu> wrote: >>> The .schema of table annual_data3 is >>> sqlite> .schema annual_data3 >>> CREATE TABLE "annual_data3"( >>> PERMNO INT, >>> DATE INT, >>> CUSIP TEXT, >>> EXCHCD INT, >>> SICCD INT, >>> SHROUT INT, >>> PRC REAL, >>> RET REAL, >>> ... >>> pret_var, >>> pRET_sd, >>> nmret, >>> pya_var, >> >> [snip] >> >> Is there a reason that you've told SQLite the expected data type for >> only some of the columns? >> >>> Interestingly, I find that the problem I reported does not for columns >>> labeled real in the schema info. For example, the type of column RET >>> never changes no matter what the first observation is. >> >> Yes, that is expected and I think it is the solution to your problem: >> setup your schema so that all columns have a declared type. For some >> details on SQLite's type system see >> http://www.sqlite.org/datatype3.html. >> >> RSQLite currently maps NA values to NULL in the database. Pulling >> data out of a SELECT query, RSQLite uses the sqlite3_column_type >> SQLite API to determine the data type and map it to an R type. If >> NULL is encountered, then the schema is inspected using >> sqlite3_column_decltype to attempt to obtain a type. If that fails, >> the data is mapped to a character vector at the R level. The type >> selection is done once after the first row has been fetched. >> >> To work around this you can: >> >> - make sure your schema has well defined >> types (which will help SQLite perform its operations); >> >> - check whether the returned column has the expected type and convert >> if needed at the R level. >> >> - remove NA/NULL values from the db or decide on a different way of >> encoding them (e.g you might be able to use -1 in the db in some >> situation to indicate missing). Your R code would then need to map >> these to proper NA. >> >> Hope that helps. >> >> + seth >> >> >> >> -- >> Seth Falcon | @sfalcon | http://userprimary.net/ >> > > > > -- > 700 Fisher Hall > 2100 Neil Ave. > Columbus, Ohio 43210 > http://www.fisher.osu.edu/~chen_1002/ > -- Seth Falcon | @sfalcon | http://userprimary.net/ ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.