[R] Create previous dates from date with consideration of leap year
Dear R Community,

I wish to create the 5 preceding dates for each value of the date variable, by ID. How could I create such dates? The code should take leap years into account.

Thanks

Sample data follows:

structure(list(id = 1:12, date = structure(c(9L, 6L, 11L, 8L,
7L, 5L, 4L, 3L, 12L, 1L, 10L, 2L), .Label = c("01feb2003", "03mar2008",
"04feb2008", "07jul1991", "07jun2010", "13feb2005", "18dec1991",
"22sep2005", "27apr1993", "29jan2009", "29may2002", "31jan2005"
), class = "factor"), case = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L)), .Names = c("id", "date", "case"), class = "data.frame", row.names = c(NA,
-12L))

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
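One possible approach, as a minimal sketch: it assumes "preceding dates" means the five calendar days before each date, and it uses a three-row subset of the posted data. The names dat and prev are only illustrative. Subtracting integers from a Date is calendar-aware, so 29 February is handled without extra code.

```r
# %b matches month abbreviations in the current locale; force the C locale
# so that "mar", "feb", "jan" are recognised (an assumption about the data).
invisible(Sys.setlocale("LC_TIME", "C"))

# Three of the posted dates, kept as character strings.
dat <- data.frame(id = 1:3,
                  date = c("03mar2008", "01feb2003", "31jan2005"),
                  stringsAsFactors = FALSE)
dat$date <- as.Date(dat$date, format = "%d%b%Y")

# One row per id and lag (1 to 5 days back).
prev <- do.call(rbind, lapply(seq_len(nrow(dat)), function(i) {
  data.frame(id = dat$id[i], lag = 1:5, prev_date = dat$date[i] - 1:5)
}))

prev[prev$id == 1, ]
```

For id 1 (3 March 2008) the third preceding date is 29 February 2008, because 2008 is a leap year. With a factor column as in the post, wrapping it in as.character() before as.Date() would be needed.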
[R] rename columns with pattern
I want to rename columns 1 to 6 in the sample data set as bp_1 to bp_6. How could I do that in R?

Thanks

> dput(dff)
structure(list(one = c(1.00027378507871, 0.982313483915127, 1.1531279945243,
1.07400410677618, 1.22710472279261, 1.19762271047046, 1.10904859685147,
1.32060232717317), two = c(1.04707392197125, 1.00998288843258,
1.17598904859685, 1.09595482546201, 1.28599589322382, 1.26632675564591,
1.12986995208761, 1.30704654346338), three = c(1.06301619895049,
1.02743782797171, 1.1977093315081, 1.11466803559206, 1.2949441022131,
1.28365657768591, 1.1305452886151, 1.32089436459046), four = c(1.06994010951403,
1.03489904175222, 1.19799452429843, 1.1172022587269, 1.28742984257358,
1.27650013346977, 1.12265058179329, 1.30723134839151), five = c(1.07019712525667,
1.03722792607803, 1.19174811772758, 1.11514168377823, 1.26594387405886,
1.25720010677582, 1.11339630390144, 1.29178507871321), six = c(1.1909650924,
1.08407027150354, 1.24785877253023, 1.16373032169747, 1.31150581793292,
1.31042514031455, 1.16205338809035, 1.37122975131189), idd = 1:8), .Names = c("one",
"two", "three", "four", "five", "six", "idd"), row.names = c(NA,
-8L), class = c("tbl_df", "data.frame"))
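A minimal sketch of one way to do this in base R, using a small stand-in for dff with the same column names (the values are made up for illustration):

```r
# Stand-in data frame with the same column names as the posted dff.
dff <- data.frame(one = 1:2, two = 3:4, three = 5:6, four = 7:8,
                  five = 9:10, six = 11:12, idd = 1:2)

# Replace the first six names; the seventh (idd) is left untouched.
names(dff)[1:6] <- paste0("bp_", 1:6)

names(dff)
```

Assigning into names(dff)[1:6] changes only the selected positions, so the approach does not depend on what the old names were.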
[R] Aggregate minutes data to hourly data
I have measurements that were taken at intervals of 15 minutes or more and want to aggregate them by hour. How could I do that? Sample data is found below.

date_time          concentration
26/11/2013 15:46   529.25
26/11/2013 16:03   1596
26/11/2013 16:23   1027.111
26/11/2013 16:39   1001.9
26/11/2013 16:54   -80.25
26/11/2013 17:12   1064.125
26/11/2013 11:14   7969.7
26/11/2013 11:32   522
26/11/2013 11:58   845.111
26/11/2013 12:12   1166.875
26/11/2013 12:30   473.375
26/11/2013 12:42   466.2
26/11/2013 07:47   4358.833
26/11/2013 08:05   1257.545
26/11/2013 08:24   828.6
26/11/2013 08:45   942
26/11/2013 08:58   758.111
26/11/2013 09:13   832.333
26/11/2013 15:45   1876.909
26/11/2013 16:07   574.25
26/11/2013 16:27   1736.846
26/11/2013 16:43   1024.857
26/11/2013 16:59   858.538
26/11/2013 17:15   912.455
26/11/2013 11:18   2086.143
26/11/2013 11:39   2078.667
26/11/2013 12:03   1619.072
26/11/2013 12:16   1197.583
26/11/2013 12:35   619.308
26/11/2013 12:51   1222.571
26/11/2013 07:49   1357.929
26/11/2013 08:08   1120
26/11/2013 08:29   1381.6
26/11/2013 08:48   1493.429
26/11/2013 09:03   1113.786
26/11/2013 09:18   1217.143
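A possible base-R sketch, assuming the hourly value wanted is the mean concentration (sum or median would work the same way) and that the timestamps can be treated as UTC. It uses the first six posted rows; dat and hourly are illustrative names.

```r
# First six rows of the posted sample.
dat <- data.frame(
  date_time = c("26/11/2013 15:46", "26/11/2013 16:03", "26/11/2013 16:23",
                "26/11/2013 16:39", "26/11/2013 16:54", "26/11/2013 17:12"),
  concentration = c(529.25, 1596, 1027.111, 1001.9, -80.25, 1064.125),
  stringsAsFactors = FALSE
)

# Parse day/month/year timestamps; the time zone is an assumption.
dat$date_time <- as.POSIXct(dat$date_time, format = "%d/%m/%Y %H:%M", tz = "UTC")

# Truncate each timestamp to its hour and average within hours.
dat$hour <- format(dat$date_time, "%Y-%m-%d %H:00")
hourly <- aggregate(concentration ~ hour, data = dat, FUN = mean)

hourly
```

Grouping on a formatted string keeps the date in the key, so the same clock hour on different days is not pooled.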
[R] Replace a column value on condition
I want to replace the values of column c3 with the values from column c2 whenever id equals 2. In Stata I could use

replace c3 = c2 if id == 2

How could I do that in R?

Thanks

Sample data found below:

> dput(df4)
structure(list(c2 = c(42L, 42L, 47L, 47L, 55L, 55L, 36L, 36L,
61L, 61L), c3 = c(68L, 59L, 68L, 50L, 62L, 50L, 63L, 45L, 65L,
45L), id = c(1, 2, 1, 2, 1, 2, 1, 2, 1, 2)), datalabel = "Written by R. ", time.stamp = "23 Mar 2015 13:54", .Names = c("c2",
"c3", "id"), formats = c("%9.0g", "%9.0g", "%9.0g"), types = c(253L,
253L, 255L), val.labels = c("", "", ""), var.labels = c("", "",
""), row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9",
"10"), version = 12L, class = "data.frame")
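A minimal sketch of the base-R counterpart of the Stata command, shown on the first four posted rows. The variable sel is an illustrative name.

```r
# First four rows of the posted df4.
df4 <- data.frame(c2 = c(42L, 42L, 47L, 47L),
                  c3 = c(68L, 59L, 68L, 50L),
                  id = c(1, 2, 1, 2))

# which() drops NA comparisons, so missing ids cannot break the assignment.
sel <- which(df4$id == 2)
df4$c3[sel] <- df4$c2[sel]

df4
```

An equivalent one-liner is df4$c3 <- ifelse(df4$id == 2, df4$c2, df4$c3), although that returns NA wherever id is missing.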
[R] Subset a column with specific characters
I would like to subset a data frame based on whether a column contains specific characters. In the sample data I wish to do the following: first, keep the rows where "prog" contains "ca"; second, drop the rows where race contains "ic".

Thanks

library(foreign)
hsb2 <- read.dta('http://www.ats.ucla.edu/stat/stata/notes/hsb2.dta')
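A sketch of how this could be done with grepl(). To avoid depending on the download, it uses a small made-up data frame (hsb) whose prog and race labels are assumed to resemble those in hsb2; the same two lines should carry over to the real data.

```r
# Made-up stand-in for hsb2; the labels are assumptions about that file.
hsb <- data.frame(
  prog = c("general", "academic", "vocation", "academic", "academic"),
  race = c("white", "hispanic", "asian", "african-amer", "white"),
  stringsAsFactors = FALSE
)

# Keep rows whose prog contains "ca" and whose race does not contain "ic".
keep <- grepl("ca", hsb$prog, fixed = TRUE) &
  !grepl("ic", hsb$race, fixed = TRUE)
res <- hsb[keep, ]

res
```

fixed = TRUE treats the pattern as a literal substring rather than a regular expression; grepl() also accepts factor columns, which read.dta() may produce.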
[R] Obtain coefficients of several nlme objects
I have several lme objects like the ones shown below, and I wish to combine the coefficients and confidence intervals of the fixed effects of several models. Is there a function that could do that job?

m1 <- lme(mark1 ~ pm10 + temp + age + gender + bmi + statin + smoke + dow + season,
          data = df, random = ~ 1 | id, na.action = na.exclude, method = "ML")

m2 <- lme(mark2 ~ pm10 + temp + age + gender + bmi + statin + smoke + dow + season,
          data = df, random = ~ 1 | id, na.action = na.exclude, method = "ML")
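One way this might be done is with intervals() from nlme, collected over a named list of models. The sketch below uses the Orthodont data that ships with nlme as a stand-in for the poster's df, so the formulas are not the ones in the question; models and fixed_ci are illustrative names.

```r
library(nlme)

# Two stand-in models fitted to a dataset bundled with nlme.
m1 <- lme(distance ~ age + Sex, data = Orthodont,
          random = ~ 1 | Subject, method = "ML")
m2 <- lme(distance ~ age, data = Orthodont,
          random = ~ 1 | Subject, method = "ML")
models <- list(m1 = m1, m2 = m2)

# intervals(..., which = "fixed")$fixed is a matrix with columns
# "lower", "est." and "upper", one row per fixed-effect term.
fixed_ci <- do.call(rbind, lapply(names(models), function(nm) {
  ci <- intervals(models[[nm]], which = "fixed")$fixed
  data.frame(model = nm, term = rownames(ci),
             lower = ci[, "lower"], estimate = ci[, "est."],
             upper = ci[, "upper"], row.names = NULL)
}))

fixed_ci
```

The result is one long data frame with a model column, which is convenient for filtering a single term (for example pm10) across all models.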
[R] How to identify outliers with values five times 99th percentile
I have a data frame with some extreme values which I wish to identify and repeat an analysis without these extreme values. How could I identify several columns with values which are 5 times higher than the 99th percentile? Sample data is pasted below. > dput(df) structure(list(ad1 = c(98, 6.9, 8.1, 56, 3.9, 6.9, 6.9, 5.8, 7.2, 20.5, 9.4, 7.6, 5.3, 7.9, 62.2, 9.2, 11.9, 8.8, 23.1, 5.4, 9.4, 56, 8.6, 20.7, 21, 10.5, 5.5, 4.3, 15.8, 6.8, 10.4, 5.1), ad2 = c(14.9, 19.7, 1, 17.7, 14.9, 13.6, 18.8, 20.9, 46, 16.5, 11.7, 1, 9.2, 23.6, 19.7, 1, 11.4, 11, 23.1, 1, 1, 8.9, 11.3, 6.4, 15.2, 1, 17.3, 10.1, 13.3, 21.3, 12.3, 15.4 ), ad3 = c(0.91, 0.95, 10.7, 4.4, 0.43, 0.8, 3.1, 1.9, 2.3, 5.6, 3.9, 7.3, 0.37, 4.1, 15.1, 21.8, 3, 0.79, 1, 4.6, 0.61, 0.46, 0.87, 23.5, 3.8, 3.1, 0.33, 1.9, 3.2, 1.7, 0.53, 62.5 ), ad4 = c(225.5, 269.7, 326, 485.4, 193.2, 274.1, 553.2, 166.8, 435.9, 433.2, 187.1, 660.4, 235.4, 356.5, 378.8, 500.5, 323.5, 327.1, 289.5, 301.2, 291.7, 333.5, 351.7, 384.1, 347, 1354, 440.4, 189.2, 381, 252.7, 391.1, 255.1), ad5 = c(337.9, 355.6, 419.5, 798.5, 225, 355.9, 394.4, 340.6, 463.9, 291.9, 312.3, 491, 290.5, 231.9, 358, 386.4, 306.7, 440.6, 297.9, 339.3, 341.1, 366.2, 325.4, 357, 412.2, 370.2, 421.3, 346.3, 289.1, 257.4, 368, 322.6), ad6 = c(64.5, 130.6, 76, 167.8, 47.3, 117, 60.7, 91.9, 221.9, 91.1, 105.1, 110.8, 64.5, 184.5, 191.6, 259.4, 879.5, 142.1, 55.3, 123.1, 62.2, 75.2, 154.6, 100.7, 93.1, 136.7, 74.3, 41.8, 110.1, 109.1, 172.5, 87.7 ), ad7 = c(128L, 987L, 158L, 124L, 137L, 215L, 141L, 98L, 291L, 261L, 106L, 137L, 141L, 159L, 221L, 108L, 123L, 107L, 137L, 175L, 257L, 97L, 168L, 145L, 147L, 188L, 145L, 128L, 153L, 187L, 123L, 354L), ad8 = c(3.26, 3.98, 2.88, 2.85, 4.17, 3.16, 3.09, 4.35, 3.46, 3.81, 3.78, 3.81, 4.17, 4.27, 4.27, 2.97, 3.43, 3.48, 3.78, 3.86, 3.11, 3.12, 3.16, 4.24, 3.81, 3.11, 5.31, 3.75, 3.78, 3.55, 4.08, 3.5), ad9 = c(433L, 211L, 66L, 173L, 224L, 466L, 224L, 273L, 94L, 321L, 160L, 107L, 121L, 186L, 455L, 80L, 897L, 186L, 285L, 134L, 
107L, 355L, 261L, 249L, 332L, 107L, 273L, 107L, 160L, 535L, 160L, 121L)), .Names = c("ad1", "ad2", "ad3", "ad4", "ad5", "ad6", "ad7", "ad8", "ad9"), class = "data.frame", row.names = c(NA, -32L))
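A sketch of one reading of the question: flag every value that exceeds five times the 99th percentile of its own column, then drop the rows containing a flagged value. The data below are made up, since with only 32 positive values per column the default 99th percentile sits close to the column maximum and this rule would flag nothing in the posted sample. The names dat, flag and cleaned are illustrative.

```r
# Made-up data: column a has one extreme value, column b has none.
dat <- data.frame(a = c(rep(1:10, length.out = 999), 500),
                  b = rep(1:10, 100))

# Logical matrix: TRUE where a value exceeds 5 x the column's 99th percentile.
flag <- sapply(dat, function(x) x > 5 * quantile(x, 0.99, na.rm = TRUE))

colSums(flag, na.rm = TRUE)   # flagged values per column

# Keep only rows with no flagged value in any column.
cleaned <- dat[rowSums(flag, na.rm = TRUE) == 0, ]
```

If the threshold is meant to come from a reference distribution rather than from the column itself, the quantile() call is the only part that would need to change.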
[R] How to combine character month and year columns into one column
Dear R users,

I have data with month and year columns, both of which are character, and want to create a new column like Jan-1999 with the following code. The result is all NA for the month part. What is wrong with the code, and what is the right way to combine the two?

ddf$MonthDay <- paste(month.abb[ddf$month], ddf$Year, sep="-" )

Thanks

> dput(ddf)
structure(list(month = c("01", "02", "03", "04", "05", "06",
"07", "08", "09", "10", "11", "12"), Year = c("1999", "1999",
"1999", "1999", "1999", "1999", "1999", "1999", "1999", "1999",
"1999", "1999"), views = c(42, 49, 44, 38, 37, 35, 38, 39, 38,
39, 38, 46), MonthDay = c("NA-1999", "NA-1999", "NA-1999", "NA-1999",
"NA-1999", "NA-1999", "NA-1999", "NA-1999", "NA-1999", "NA-1999",
"NA-1999", "NA-1999")), .Names = c("month", "Year", "views",
"MonthDay"), row.names = 109:120, class = "data.frame")
Re: [R] How to combine character month and year columns into one column
Many thanks for your quick answer, which created what I wanted. May I ask a follow-up question on the same issue? I failed to convert the new column into date format with the code below. The class of MonthDay is still character.

df$MonthDay <- format(df$MonthDay, format=c("%b %Y"))

I would appreciate it if you could suggest a working solution.

Thanks

On 23 September 2014 18:03, Marc Schwartz wrote:
> Since you are trying to use ddf$month as an index into month.abb, you will
> either need to coerce ddf$month to numeric in your code, or adjust how the
> data frame is created.
>
> In the case of the former approach:
>
> > paste(month.abb[as.numeric(ddf$month)], ddf$Year, sep="-" )
>  [1] "Jan-1999" "Feb-1999" "Mar-1999" "Apr-1999" "May-1999" "Jun-1999"
>  [7] "Jul-1999" "Aug-1999" "Sep-1999" "Oct-1999" "Nov-1999" "Dec-1999"
>
> Regards,
>
> Marc Schwartz
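On the follow-up question: format() turns dates into character strings, so it cannot produce a Date. A Date also needs a day of the month. The sketch below assumes the first of each month is acceptable and builds the date from the numeric month and year columns, which avoids locale-dependent month names; the column name date is illustrative.

```r
# Three rows in the shape of the posted ddf.
ddf <- data.frame(month = c("01", "02", "12"),
                  Year = c("1999", "1999", "1999"),
                  stringsAsFactors = FALSE)

# Label column, as in the answer quoted above.
ddf$MonthDay <- paste(month.abb[as.numeric(ddf$month)], ddf$Year, sep = "-")

# A real Date column; the day (01) is an assumption.
ddf$date <- as.Date(paste(ddf$Year, ddf$month, "01", sep = "-"))

class(ddf$date)
```

Keeping both columns lets the Date be used for sorting and plotting while MonthDay serves as the display label.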
[R] Generate sequence of date based on a group ID
I want to generate a sequence of dates based on a group id (identical IDs should have the same date). The id variable contains unequal numbers of observations per group, and the length of the data set also varies. How could I create a sequence that starts on a specific date (say January 1, 2000) and continues until the end without specifying the length?

Sample data follows:

df<-structure(list(id = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L,
3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L), out1 = c(0L,
0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 0L, 0L,
0L, 1L, 0L, 0L, 0L, 1L)), .Names = c("id", "out1"), class = "data.frame", row.names = c(NA,
-23L))
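A minimal sketch, assuming each id should receive a single date and that consecutive ids (in order of first appearance) get consecutive days. The variable start is an illustrative name.

```r
# Stand-in with groups of unequal size.
df <- data.frame(id = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L))

start <- as.Date("2000-01-01")

# match() gives each row the position of its id among the unique ids,
# so no length has to be specified in advance.
df$date <- start + (match(df$id, unique(df$id)) - 1)

df
```

Because the offset is derived from the ids themselves, the code works unchanged for any number of groups or rows.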
[R] How to sum some columns based on their names
I want to sum columns based on their names. As an example, how could I sum the columns which contain 6574, 7584 and 85 in their names? In addition, how could I sum those which contain 6574, 7584 and 85 in their names and have the prefix "f"? My data contains several variables with

dput(df1)
structure(list(date = structure(c(1230768000, 1230854400, 1230940800,
1231027200, 1231113600, 123120, 1231286400, 1231372800, 1231459200,
1231545600, 1231632000), class = c("POSIXct", "POSIXt"), tzone = "UTC"),
f014card = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), f1534card = c(0,
1, 1, 0, 0, 1, 0, 0, 1, 0, 1), f3564card = c(1, 6, 1, 5,
5, 4, 4, 7, 6, 4, 6), f6574card = c(3, 6, 4, 5, 5, 2, 10,
3, 4, 2, 4), f7584card = c(13, 6, 1, 4, 10, 6, 8, 12, 10,
4, 3), f85card = c(5, 3, 1, 0, 2, 10, 7, 9, 1, 7, 3), m014card = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0), m1534card = c(0, 0, 1, 0,
0, 0, 0, 1, 1, 1, 0), m3564card = c(12, 7, 4, 7, 12, 13,
12, 7, 12, 2, 11), m6574card = c(3, 4, 8, 8, 8, 10, 7, 6,
7, 7, 5), m7584card = c(8, 10, 5, 4, 12, 7, 14, 11, 9, 1,
11), m85card = c(1, 4, 3, 0, 3, 4, 5, 5, 4, 5, 0)), .Names = c("date",
"f014card", "f1534card", "f3564card", "f6574card", "f7584card",
"f85card", "m014card", "m1534card", "m3564card", "m6574card",
"m7584card", "m85card"), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11"))
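A sketch of one approach: select the column names with a regular expression and pass the selected columns to rowSums(). It assumes a row-wise total is wanted and uses the first two rows of some of the posted columns; cols_all, cols_f and the two new column names are illustrative.

```r
# First two rows of a subset of the posted columns.
df1 <- data.frame(f3564card = c(1, 6), f6574card = c(3, 6),
                  f7584card = c(13, 6), f85card = c(5, 3),
                  m6574card = c(3, 4), m7584card = c(8, 10),
                  m85card = c(1, 4))

# Anchored patterns avoid accidental matches on other age bands.
cols_all <- grep("^[fm](6574|7584|85)card$", names(df1), value = TRUE)
cols_f   <- grep("^f(6574|7584|85)card$", names(df1), value = TRUE)

df1$all_65plus <- rowSums(df1[cols_all])
df1$f_65plus   <- rowSums(df1[cols_f])

df1[c("all_65plus", "f_65plus")]
```

Selecting by name (value = TRUE) rather than by logical index keeps the selection valid after new columns are appended.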
[R] How can I merge data with differing length?
How can I merge data frame df and "tem" shown below by filling the head of "tem" with missing values?

a<- rnorm(1825, 20)
b<- rnorm(1825, 30)
date<-seq(as.Date("2000/1/1"), by = "day", length.out = 1825)
df<-data.frame(date,a,b)
tem<- rpois(1095, lambda=21)

Thanks
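A minimal sketch, assuming the 1095 values of tem belong to the last 1095 days of df, so that the first 730 rows get NA. The variable pad is an illustrative name.

```r
set.seed(42)  # only to make the sketch reproducible

a <- rnorm(1825, 20)
b <- rnorm(1825, 30)
date <- seq(as.Date("2000/1/1"), by = "day", length.out = 1825)
df <- data.frame(date, a, b)
tem <- rpois(1095, lambda = 21)

# Number of leading rows without a tem value.
pad <- nrow(df) - length(tem)

# Prepend NAs so that tem lines up with the end of df.
df$tem <- c(rep(NA, pad), tem)

head(df, 3)
```

If tem had its own date column, merge(df, tem_df, by = "date", all.x = TRUE) would be the safer route, since it aligns on dates rather than on position.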
[R] How to update R without losing packages
A solution at the link below provides the steps for updating R without losing packages on Unix.

http://zvfak.blogspot.se/2012/06/updating-r-but-keeping-your-installed.html

How could I do that on the Windows 7 platform?

Thanks
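One platform-independent approach, sketched below, is to save the names of the installed packages before upgrading and reinstall whatever is missing afterwards. The helper names save_pkg_list and missing_pkgs and the file location are hypothetical; only installed.packages(), saveRDS(), readRDS() and install.packages() are actual R functions.

```r
# Hypothetical helper: store the names of all installed packages in a file.
save_pkg_list <- function(file) {
  pkgs <- rownames(installed.packages())
  saveRDS(pkgs, file)
  invisible(pkgs)
}

# Hypothetical helper: packages listed in the file but absent from this R.
missing_pkgs <- function(file) {
  setdiff(readRDS(file), rownames(installed.packages()))
}

# In the old R version (the path is only an example):
#   save_pkg_list("C:/Users/me/Documents/pkgs.rds")
# In the new R version:
#   install.packages(missing_pkgs("C:/Users/me/Documents/pkgs.rds"))
```

Reinstalling from the repository, rather than copying the library folder, means each package is built for the new R version.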
[R] Remove a column of a matrix with unnamed column header
I have a matrix named test which I want to convert to a data frame. When I use the command test2<-as.data.frame(test) it is executed without a problem. But when I want to browse the content I receive the error message "Error in data.frame(outcome = c("cardva", "respir", "cereb", "neoplasm", : duplicate row.names: Estimate". The problem is clearly due to duplicated row names, but I am unable to remove this column. I need help on how to remove this specific column, which has essentially no column header name.

dput of the matrix is here:

> dput(test)
structure(c("cardva", "respir", "cereb", "neoplasm", "ami", "ischem",
"heartf", "pneumo", "copd", "asthma", "dysrhy", "diabet",
"0.00259492159959046",
"0.00979775441709427", "0.00103414632535868", "0.00486468139227382",
"0.0164825543879707", "0.0116647168053943", "-0.0012137908515233",
"0.00730433232907741", "0.00355583994565985", "0.000712387285735019",
"-0.00103763671307935", "0.00981500221106926", "0.00325476724733837",
"0.0049232113728293", "0.00520118026087645", "0.00386848394426742",
"0.00688121694253705", "0.00585772614064902", "0.00564983058883797",
"0.0061328202328586", "0.0108212194251692", "0.0173804438930357",
"0.00867931407250442", "0.0106638104533486", "0.425323120845664",
"0.0466180768654915", "0.842402292743715", "0.208609687427072",
"0.0166336682608816", "0.0464833846710956", "0.8299010611324",
"0.233685747699204", "0.742469001175026", "0.967306766450795",
"0.904840885401235", "0.357394700741248"), .Dim = c(12L, 4L), .Dimnames = list(
c("Estimate", "Estimate", "Estimate", "Estimate", "Estimate",
"Estimate", "Estimate", "Estimate", "Estimate", "Estimate",
"Estimate", "Estimate"), c("outcome", "beta", "se", "pval"
)))

> test2<-as.data.frame(test)
> test2
Error in data.frame(outcome = c("cardva", "respir", "cereb", "neoplasm", :
  duplicate row.names: Estimate
Re: [R] Remove a column of a matrix with unnamed column header
Berend,

Thanks for that. Conversion of test to a data frame resulted in factor columns. Is it possible to selectively convert some of them to numeric? I have tried this code, but it has not produced the intended result.

test[, c(2:4)] <- sapply(test[, c(2:4)], as.numeric)

On 8 November 2013 11:31, Berend Hasselman wrote:
> rownames(test) <- NULL
>
> Berend
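A sketch of the usual idiom for this: calling as.numeric() directly on a factor returns the internal level codes, so the values are converted to character first. The matrix below is a two-row stand-in with rounded values, and stringsAsFactors = TRUE is set explicitly to reproduce the factor columns described above (recent R versions would otherwise give character columns). num_cols is an illustrative name.

```r
# Two-row stand-in for the posted character matrix.
test <- matrix(c("cardva", "respir", "0.0026", "0.0098", "0.0033", "0.0049"),
               nrow = 2,
               dimnames = list(c("Estimate", "Estimate"),
                               c("outcome", "beta", "se")))

rownames(test) <- NULL                        # as suggested in the reply
test2 <- as.data.frame(test, stringsAsFactors = TRUE)

# Convert only the chosen columns, going through character.
num_cols <- c("beta", "se")
test2[num_cols] <- lapply(test2[num_cols],
                          function(x) as.numeric(as.character(x)))

str(test2)
```

The conversion is applied to the data frame (test2), not to the matrix, because a matrix can hold only one type and would coerce the numbers back to character.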
[R] Loop through columns of outcomes
I have asked this question on SO, but it attracted no response, so I am cross-posting it here in the hope that someone can help.

I want to estimate the effect of pm10 and o3 on three outcomes (death, cvd and resp). What I want to do is run one model for each of the main predictors (pm10 and o3) and each outcome (death, cvd and resp). Thus I expect to obtain 6 models. The script below works for one outcome (death) and I wish to use it for more dependent variables.

library(quantmod)
library(mgcv)
library(dlnm)
df <- chicagoNMMAPS
outcomes<- c("death", "cvd", "resp")
varlist0 <- c("pm10", "o3")

m1 <- lapply(varlist0,function(v) {
    f <- sprintf("death~ s(time,bs='cr',k=200)+s(temp,bs='cr') + Lag(%s,0:6)",v)
    gam(as.formula(f),family=quasipoisson,na.action=na.omit,data=df)
})

Thanks
Re: [R] Loop through columns of outcomes
Thanks for the script, which works perfectly. I would like to do model checking and also to extract the coefficients for the linear and spline terms. For model checking I can run gam.check(m2[[1]]), which gives several diagnostic plots for assessing model fit. Thanks to mnel from SO, I could also extract the linear terms with the following script:

m2 <- unlist(m1, recursive = FALSE) ## unlist

First extract the model elements:

mod1<-m2[[1]]
mod2<-m2[[2]]
mod3<-m2[[3]]
mod4<-m2[[4]]
mod5<-m2[[5]]
mod6<-m2[[6]]

And run the following:

mlist <- list(mod1, mod2, mod3,mod4,mod5,mod6) ## Creates a list of models
names(mlist) <- list("mod1", "mod2", "mod3","mod4","mod5","mod6")

slist <- lapply(mlist, summary) ## obtain summaries

plist <- lapply(slist, `[[`, 'p.table') ## list of the coefficients linear terms

For 6 models this is relatively easy to do, but how could I shorten the process if I have a large number of models?

Thanks

On 12 November 2013 12:32, Rui Barradas wrote:
> Hello,
>
> Use nested lapply(). Like this:
>
> m1 <- lapply(varlist0,function(v) {
>     lapply(outcomes, function(o){
>         f <- sprintf("%s~ s(time,bs='cr',k=200)+s(temp,bs='cr') + Lag(%s,0:6)", o, v)
>         gam(as.formula(f),family=quasipoisson,na.action=na.omit,data=df)
> })})
>
> m1 <- unlist(m1, recursive = FALSE)
> m1
>
> Hope this helps,
>
> Rui Barradas
Re: [R] Loop through columns of outcomes
Very helpful, many thanks.

On 12 November 2013 16:09, Rui Barradas wrote:
> Hello,
>
> Once again, use lapply.
>
> mlist <- lapply(seq_along(m2), function(i) m2[[i]])
> names(mlist) <- paste0("mod", seq_along(mlist))
>
> slist <- lapply(mlist, summary)
>
> plist <- lapply(slist, `[[`, 'p.table')
>
> Hope this helps,
>
> Rui Barradas
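Building on the lapply() idea in this thread, the per-model p.table matrices could also be stacked into one data frame, which scales to any number of models. The sketch below uses simulated data and simplified formulas in place of chicagoNMMAPS and the Lag() terms, so dat, grid, fits and coefs are all illustrative names; summary()$p.table is the same component used in the thread.

```r
library(mgcv)

# Simulated stand-in data: two outcomes, two exposures, one smooth covariate.
set.seed(1)
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), z = runif(n))
dat$y1 <- rpois(n, exp(0.5 + 0.2 * dat$x1 + sin(2 * pi * dat$z)))
dat$y2 <- rpois(n, exp(0.3 + 0.1 * dat$x2))

# Every outcome/exposure combination, one model each.
grid <- expand.grid(outcome = c("y1", "y2"), exposure = c("x1", "x2"),
                    stringsAsFactors = FALSE)
fits <- lapply(seq_len(nrow(grid)), function(i) {
  f <- as.formula(sprintf("%s ~ %s + s(z, bs = 'cr')",
                          grid$outcome[i], grid$exposure[i]))
  gam(f, family = quasipoisson, data = dat)
})
names(fits) <- paste(grid$outcome, grid$exposure, sep = "_")

# Stack the parametric-term tables, labelled by model name.
coefs <- do.call(rbind, lapply(names(fits), function(nm) {
  pt <- summary(fits[[nm]])$p.table
  data.frame(model = nm, term = rownames(pt), pt,
             row.names = NULL, check.names = FALSE)
}))

coefs
```

Naming the models from the grid keeps track of which outcome and exposure each row belongs to, which generic labels such as mod1 to mod6 do not.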