[R] Count objects and print it into a new variable
Hello,

I last used R a year ago, and with the data I am working with now I realized that I need to go back to it. Unfortunately, my memory is not my friend when it comes to coding :-)

What I want to do is count the observations of a variable in a file, subject to certain conditions, and then store that number in a new variable.

My file (yukon) contains information about fires in the Yukon Territory, and it looks basically like this:

Fire_Year  Area_Hecta  InitialFir  ... (and other headers, but they are not important)

Fire_Year ranges from 1980 to 2010, Area_Hecta contains numbers, and InitialFir contains a date written like this: 24/05/1980

My end goal is two plots: one with the summed area burned per year, and the other showing the number of fires per year. I managed the first plot with the aggregate command:

    annualAB.sum <- aggregate(Area_Hecta ~ Fire_Year, data = yukon, FUN = sum)

Now I am having trouble with the second plot. I created a subset for each year and then used the length command; then I could take all those numbers and put them in a new variable...

    y1980 <- subset(yukon, Fire_Year == "1980")
    length(y1980$Area_Hecta)

but I somehow feel that there must be a better solution... maybe a loop?

Any help is greatly appreciated. Thanks so much,
Sandra

-- View this message in context: http://r.789695.n4.nabble.com/Count-objects-and-print-it-into-a-new-variable-tp3595174p3595174.html
Sent from the R help mailing list archive at Nabble.com.
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
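One possible way to get the number of fires per year without a subset per year is sketched below. The small `yukon` data frame is a made-up stand-in for the poster's file; only the column names `Fire_Year` and `Area_Hecta` come from the post.

```r
# Hypothetical stand-in for the poster's `yukon` data frame
yukon <- data.frame(
  Fire_Year  = c(1980, 1980, 1981, 1983, 1983, 1983),
  Area_Hecta = c(12.5, 3.0, 40.2, 7.1, 0.4, 15.0)
)

# Same pattern as the sum-per-year call, but counting rows instead of summing
fires_per_year <- aggregate(Area_Hecta ~ Fire_Year, data = yukon, FUN = length)
names(fires_per_year)[2] <- "n_fires"

# table() gives the same counts as a named vector of frequencies
counts <- table(yukon$Fire_Year)

print(fires_per_year)
print(counts)
```

Note that years with no fires at all do not appear in either result; they would have to be added separately if the plot should show them as zero.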
[R] Calculate a mean for several months for several years
Hello everyone,

I have a dataset with 3 columns (Year, Month, MeanTemp). I would like to calculate the average of the mean temperature over the summer months (July, August, September) for each of the 20 years. I'm sure it is somehow possible with a loop, but nothing I have tried so far has worked, and with the time I have spent looking for a solution I would have been faster doing this in Excel!

Any help is greatly appreciated!

Cheers,
Sandra

-- View this message in context: http://r.789695.n4.nabble.com/Calculate-a-mean-for-several-months-for-several-years-tp3318363p3318363.html
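A minimal sketch of one loop-free approach: keep only the summer rows, then average by year. The data frame name `lakewil` is borrowed from the follow-up post, the temperatures are invented, and it is an assumption that `Month` is stored as a number from 1 to 12.

```r
# Hypothetical data: two years of monthly mean temperatures
lakewil <- data.frame(
  Year     = rep(1980:1981, each = 12),
  Month    = rep(1:12, times = 2),
  MeanTemp = c(-20, -15, -8, 0, 8, 13, 15, 13, 7, 0, -10, -18,
               -22, -16, -9, 1, 9, 14, 16, 12, 8, 1, -11, -19)
)

# Keep July-September, then take the mean within each year
summer      <- subset(lakewil, Month %in% 7:9)
summer_mean <- aggregate(MeanTemp ~ Year, data = summer, FUN = mean)

print(summer_mean)
```

Unlike index arithmetic on row positions, this does not depend on every year having exactly 12 rows in calendar order.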
Re: [R] Calculate a mean for several months for several years
Thank you so much for your help Dennis, Pete and D Kelly!

In the meantime I tried to do a loop and came up with this:

    l <- 0
    for (i in 0:((length(lakewil$Year)/12)-1)) {
      l[i+1] <- mean(lakewil$MeanTemp[((i*12)+7):((i*12)+9)])
    }
    print(l)

This gives me the summer averages over the 3 months for each year... Was this a silly way of doing this? Your solutions look way more like "programming", but I'm having trouble understanding them.

But I guess this is exactly what I have to do now, because as a next step for my linear model I have to calculate an average of fire sizes during the same time period (1980-1999). "Unfortunately" fires did not occur in every year, so I cannot use the same command as above, where I always had 12 months in between. The dataset (named: fire) looks like this:

Year  Size
1981  50.3
1984  57.3
1984
1989
1989
1989
... etc

How do I now calculate the mean fire size for the years in which fires occurred? I can do it for each year separately, but how do I make it better?

    FSize <- mean(fire$Size[fire$Year==1981])

Thank you so much,
Sandra

-- View this message in context: http://r.789695.n4.nabble.com/Calculate-a-mean-for-several-months-for-several-years-tp3318363p3318683.html
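The per-year mean can be sketched with a grouping function instead of one call per year. In the example below only the first two sizes come from the post; the remaining values are invented because the post elides them.

```r
# Hypothetical fire data; sizes after the first two rows are made up
fire <- data.frame(
  Year = c(1981, 1984, 1984, 1989, 1989, 1989),
  Size = c(50.3, 57.3, 12.1, 4.0, 8.0, 3.0)
)

# One mean per year that actually occurs in the data
mean_size <- aggregate(Size ~ Year, data = fire, FUN = mean)

# tapply() returns the same numbers as a named vector
mean_size_vec <- tapply(fire$Size, fire$Year, mean)

print(mean_size)
```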
[R] copy values from one dataframes into another
Hello everyone,

I have the following problem. I have a data frame that looks like this:

   fire$Year  fire$Size
1  1981        1738.0
2  1984        2228.1
3  1985       38963.3
4  1986        2223.4
5  1987        3594.6
6  1988        1520.0
...

What I would like to do is copy the values from the fire$Size column into a new data frame, but with "0" for the years that are missing. The result should look like this:

year  size
1981  1738.0
1982  0
1983  0
1984  2228.1
...

First I tried to merge the two data frames

    temp <- merge(fire.sum, fire2, by.x="fire$Year", by.y="year")

but then it only gives me the years that are the same. So I thought it might be easier to just copy them?

Any help is appreciated, thank you so much.

Cheers,
Sandra

-- View this message in context: http://r.789695.n4.nabble.com/copy-values-from-one-dataframes-into-another-tp3321805p3321805.html
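A sketch of how `merge()` can keep the unmatched years: `all.x = TRUE` retains every row of the first data frame and fills the gaps with `NA`, which can then be set to 0. The column names `Year` and `Size` and the helper data frame `all_years` are assumptions made for this example.

```r
# Values taken from the post; plain column names are assumed
fire.sum <- data.frame(
  Year = c(1981, 1984, 1985, 1986, 1987, 1988),
  Size = c(1738.0, 2228.1, 38963.3, 2223.4, 3594.6, 1520.0)
)

# Every year that should appear in the result
all_years <- data.frame(Year = 1981:1988)

# Left join: keep all rows of all_years, NA where no fire was recorded
filled <- merge(all_years, fire.sum, by = "Year", all.x = TRUE)
filled$Size[is.na(filled$Size)] <- 0

print(filled)
```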
[R] reshape panel data
I have a data set with observations on 549 cities spanning an 18 year period. However, some of the cities did not report in one or more of the 18 years. I would like to implement the procedure suggested by Wooldridge, section 17.1.3 in his "Econometric Analysis of Cross Section and Panel Data", to correct for attrition. For example, the table below indicates that the 3rd and the 7th cities in the data set do not have observations for several years. The Wooldridge procedure requires the generation of a selection variable that takes the value 1 if the city reports in that year and 0 otherwise. How do I assign a zero to a city when it does not have an observation for that year?

For example, suppose I have the following data set. The observations range over three years, 1990-1992, but some cities did not report in some years. The original data looks like this:

Cicoid  year  other_variables  selection_variable
1       1990  x x x x x x x    1
1       1991  xx               1
2       1991  xx               1
3       1990  xx               1
3       1991  xx               1
3       1992  xx               1

I would like to get a data set that looks like this:

Cicoid  year  other_variables  selection_variable
1       1990  x x x x x x x    1
1       1991  xx               1
1       1992  ...              0
2       1990                   0
2       1991  xx               1
2       1992                   0
3       1990  xx               1
3       1991  xx               1
3       1992  xx               1

I can reshape the data using STATA with the following three simple commands:

    xtset Cicoid year
    tsfill, full
    replace selection_variable=0 if selection_variable==.

I declare the data as a panel series, identifying the ID and TIME index variables, and then use the time-series fill command. I have searched the help and vignettes of both the "zoo" and "plm" packages but cannot find the solution. Can anyone help?

Thanks,
Richard Saba
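In base R, the effect of `tsfill, full` can be approximated by building every city-year combination with `expand.grid()` and left-joining the observed data onto it. This is a sketch on the toy panel from the post; the covariate `x` and the column name `selection` are made up for illustration.

```r
# Toy unbalanced panel from the post; `x` is a hypothetical covariate
panel <- data.frame(
  Cicoid = c(1, 1, 2, 3, 3, 3),
  year   = c(1990, 1991, 1991, 1990, 1991, 1992),
  x      = c(10, 11, 20, 30, 31, 32)
)
panel$selection <- 1  # every observed row reported in that year

# All city-year combinations, then a left join onto the observed rows
full_grid <- expand.grid(Cicoid = unique(panel$Cicoid), year = 1990:1992)
balanced  <- merge(full_grid, panel, by = c("Cicoid", "year"), all.x = TRUE)

# Rows created by the fill have NA everywhere; flag them as not reporting
balanced$selection[is.na(balanced$selection)] <- 0
balanced <- balanced[order(balanced$Cicoid, balanced$year), ]

print(balanced)
```

The other variables stay `NA` in the filled rows, which matches the missing values Stata would leave after `tsfill`.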
[R] Reading hierarchical data
I would like to read the following hierarchical data set. There is a family record followed by one or more personal records. If col. 7 is "1" it is a family record. If it is "2" it is a personal record.

The family record is formatted as follows:
col. 1-5   family id
col. 7     "1"
col. 9     dwelling type code

The personal record is formatted as follows:
col. 1-5   personal id
col. 7     "2"
col. 8-9   age
col. 11    sex code

The first six family and accompanying personal records look like this:

06470 1 1
    1 232 0
    2 230 1
07470 1 0
    1 240 1
08470 1 0
    1 227 0
09470 1 0
    1 213 1
    2 222 0
    3 224 1
10470 1 1
    1 220 0
    2 211 1
11470 1 0
    1 217 0
    2 210 1
    3 226 1

I want to create a dataset containing
. family ID
. dwelling code
. person ID
. age
. sex code

The dataset will contain one observation per person, with the family information repeated for people in the same family. Can anyone help?

Thanks,
Richard Saba
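A minimal sketch of one way to parse such a file: read the raw lines, classify each by column 7, and carry the most recent family record forward to the personal records that follow it. The example assumes the personal id is right-justified in columns 1-5, which the flattened sample cannot confirm; with a real file, `lines` would come from `readLines()`.

```r
# First two families from the post, laid out in the stated column positions
lines <- c("06470 1 1",
           "    1 232 0",
           "    2 230 1",
           "07470 1 0",
           "    1 240 1")

rec_type  <- substr(lines, 7, 7)
is_family <- rec_type == "1"

# Running count of family records: tells each line which family it belongs to
family_index <- cumsum(is_family)

fam_id   <- substr(lines[is_family], 1, 5)
dwelling <- substr(lines[is_family], 9, 9)

persons <- !is_family
result <- data.frame(
  family_id = fam_id[family_index[persons]],
  dwelling  = dwelling[family_index[persons]],
  person_id = as.integer(substr(lines[persons], 1, 5)),
  age       = as.integer(substr(lines[persons], 8, 9)),
  sex       = substr(lines[persons], 11, 11),
  stringsAsFactors = FALSE
)

print(result)
```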
[R] problem with nested loops
Each of the data sets contains monthly observations on price indices for 7 countries. I use the fitted values from reg1 in the reg2 model. The interior loop executes without error as long as I explicitly specify the data set, i.e. data=dat70. However, the code fails to execute if I specify the model in the form of the commented line, i.e.

    reg1 <- dynlm(form1, data=Dnames[j])

I get the following error message:

    Error in merge.zoo(USA, lag(USA, k = -1), lag(USA, k = -2), lag(Canada, :
      object 'USA' not found

Apparently Dnames[j] does not evaluate to the data set itself. Does anyone have a solution to my problem?

The values in Names are:
[1] "Canada" "France" "Germany" "Italy" "Japan" "UK" "USA"
And in Dnames are:
[1] "dat70" "dat80" "dat90" "dat2000"

    library(dynlm)
    kimdat <- ts(read.csv("data.csv", header = TRUE), start=1970, frequency=12)
    dat70 <- window(kimdat, start=c(1970,1), end=c(1979,12))
    dat80 <- window(kimdat, start=c(1980,1), end=c(1989,12))
    dat90 <- window(kimdat, start=c(1990,1), end=c(1999,12))
    dat2000 <- window(kimdat, start=c(2000,1), end=c(2009,12))
    Names <- colnames(kimdat)
    Dnames <- c("dat70","dat80","dat90","dat2000")
    for (j in 1:4) {
      for (i in 7:2) {
        form1 <- as.formula(paste(Names[i],"~","lag(",Names[i],",k=-1) + lag(",Names[i],",k=-2)+ lag(",Names[1],",k=-1) +lag(",Names[1],",k=-2)"))
        form2 <- as.formula(paste(Names[1],"~fitted(reg1)"))
        # reg1 <- dynlm(form1, data=Dnames[j])
        # reg2 <- dynlm(form2, data=Dnames[j])
        reg1 <- dynlm(form1, data=dat80)
        reg2 <- dynlm(form2, data=dat80)
        print(summary(reg1))
        print(summary(reg2))
      }
    }

Thanks,
Richard Saba
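The underlying issue is that `Dnames[j]` is a character string such as "dat80", not the object named dat80. A sketch of two common ways around this is given below; it uses plain `lm()` on invented data frames rather than `dynlm()` on the poster's series, so the model itself is only a placeholder.

```r
set.seed(42)
# Hypothetical stand-ins for the decade windows
dat70 <- data.frame(Canada = rnorm(30), USA = rnorm(30))
dat80 <- data.frame(Canada = rnorm(30), USA = rnorm(30))
Dnames <- c("dat70", "dat80")

# Option 1: get() looks up the object whose name is stored in the string
fits <- list()
for (j in seq_along(Dnames)) {
  dat <- get(Dnames[j])
  fits[[Dnames[j]]] <- lm(USA ~ Canada, data = dat)
}

# Option 2: keep the data sets in a named list and iterate over the list
datasets <- list(dat70 = dat70, dat80 = dat80)
fits2 <- lapply(datasets, function(d) lm(USA ~ Canada, data = d))

print(coef(fits[["dat80"]]))
```

The second form avoids name lookups altogether, which tends to be easier to debug once the number of data sets grows.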
[R] Read variable column width data
Reading data with variable column widths.

Here are several lines of a txt data set I would like to read. The number of variables is fixed at 13. The problem is how to read the first variable when it can contain blank space, for example "Alabama (Seasonally Adjusted)", "St. Clair", etc.

Alabama (Seasonally Adjusted) 2,168,870 2,162,604 2,122,787 1,954,895 1,956,026 1,925,007 213,975 206,578 197,780 9.9% 9.6% 9.3%
Alabama (Not Seasonally Adjusted) 2,185,690 2,155,322 2,135,467 1,955,512 1,951,696 1,930,257 230,178 203,626 205,210 10.5% 9.4% 9.6%
Autauga 24,743 24,472 24,234 22,355 22,373 22,394 2,388 2,099 1,840 9.7% 8.6% 7.6%
Baldwin 86,185 84,039 83,698 78,160 76,934 76,736 8,025 7,105 6,962 9.3% 8.5% 8.3%
Barbour 9,954 9,706 9,737 8,611 8,546 8,588 1,343 1,160 1,149 13.5% 12.0% 11.8%
..
St. Clair 36,821 36,139 35,964 33,233 33,021 32,540 3,588 3,118 3,424 9.7% 8.6% 9.5%
...
Winston 9,150 8,986 9,295 7,779 7,717 7,933 1,371 1,269 1,362 15.0% 14.1% 14.7%
United States (Seasonally Adjusted) 153,421,000 153,693,000 153,684,000 139,334,000 139,779,000 139,092,000 14,087,000 13,914,000 14,593,000 9.2% 9.1% 9.5%
United States (Not Seasonally Adj.) 154,538,000 153,449,000 154,767,000 140,129,000 140,028,000 139,882,000 14,409,000 13,421,000 14,885,000 9.3% 8.7% 9.6%

Thanks,
Richard Saba

-- View this message in context: http://r.789695.n4.nabble.com/Read-variable-column-width-data-tp3744922p3744922.html
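Because the last 12 fields are always numeric, one workable sketch is to split each line on whitespace, treat the final 12 tokens as values, and paste whatever precedes them back together as the area name. The variable names below are invented; with a real file, `lines` would come from `readLines()`.

```r
# Two sample lines copied from the post
lines <- c(
  "Alabama (Seasonally Adjusted) 2,168,870 2,162,604 2,122,787 1,954,895 1,956,026 1,925,007 213,975 206,578 197,780 9.9% 9.6% 9.3%",
  "St. Clair 36,821 36,139 35,964 33,233 33,021 32,540 3,588 3,118 3,424 9.7% 8.6% 9.5%"
)

n_values <- 12  # 13 variables in total: one name plus 12 numbers
tokens <- strsplit(trimws(lines), "[[:space:]]+")

# Everything before the last 12 tokens is the (possibly multi-word) name
area <- vapply(tokens,
               function(tk) paste(head(tk, -n_values), collapse = " "),
               character(1))

# Strip thousands separators and percent signs, then convert to numeric
values <- t(vapply(tokens,
                   function(tk) as.numeric(gsub("[,%]", "", tail(tk, n_values))),
                   numeric(n_values)))

result <- data.frame(area = area, values, stringsAsFactors = FALSE)
print(result)
```

Marker lines such as ".." in the raw file would need to be dropped before this step, since they do not contain 12 numeric fields.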
[R] vars impulse responce function output
Does anyone know if the bootstrap confidence intervals generated by the irf() function (impulse response function) in the "vars" package are bias corrected?

Thanks,
Richard Saba
[R] vars impulse response function output
Sorry about the first post. This is in plain text.

Does anyone know if the bootstrap confidence intervals generated by the irf() function (impulse response function) in the "vars" package are bias corrected?

Thanks,
Richard Saba
[R] tseries(arma) vs. stats(arima)
Hello,

The "arma" function in the "tseries" package allows estimation of models with specific "ar" and "ma" lags with its "lag" argument. For example:

    y[t] = a[0] + a[1]y[t-3] + b[1]e[t-2] + e[t]

can be estimated with the following specification:

    arma(y, lag=list(ar=3,ma=2))

Is this possible with the "arima" function in the "stats" package or in other time series packages like fArima, forecast, or FinTS? They all take a "lag" argument. I would like to have the ability to estimate models like the one above while utilizing the "xreg" argument available in the other arima functions.

Thanks,
Richard Saba
[EMAIL PROTECTED]
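One technique that may serve here, though it is not something the post itself mentions, is the `fixed` argument of `stats::arima()`: fit the full ARMA(3,2) and pin the unwanted coefficients at zero. The sketch below uses simulated data, so the series and its parameters are assumptions.

```r
set.seed(123)
# Simulated series with only an AR term at lag 3 and an MA term at lag 2
y <- arima.sim(model = list(ar = c(0, 0, 0.5), ma = c(0, 0.4)), n = 300)

# Coefficient order is ar1, ar2, ar3, ma1, ma2, intercept.
# A number fixes that coefficient; NA means "estimate it".
fit <- arima(y, order = c(3, 0, 2),
             fixed = c(0, 0, NA, 0, NA, NA),
             transform.pars = FALSE,  # required when AR terms are fixed
             method = "ML")

print(coef(fit))
```

If `xreg` is supplied as well, `fixed` needs one extra entry per regressor column after the intercept, typically `NA` so that the regression coefficients are estimated.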
[R] convert weekly time series data to monthly
I have weekly time series data with year, month, day, and price variables. The input data set for the weekly series takes the following form:

Year  month  day  price
1990  8      20   119.1
1990  8      27   124.5
1990  9      3    124.2
1990  9      10   125.2
1990  9      17   126.6
1990  9      24   127.2
1990  10     1    132.1
1990  10     8    133.3
1990  10     15   133.9
1990  10     22   134.5
1990  10     29   133.9
...
2008  3      3    313.7
2008  3      10   320
2008  3      17   325.7
2008  3      24   322.4

I would like to collapse the data into monthly averages to merge with a monthly series. The input data set for the monthly series takes the following form:

M-Y       Year  Month  Change
Aug-1990  1990  8      -226.871
Sep-1990  1990  9      -896.333
Oct-1990  1990  10     111.419
Nov-1990  1990  11     -364.2
Dec-1990  1990  12     -527.645
Jan-1991  1991  1      -70.935
Feb-1991  1991  2      231.214
Mar-1991  1991  3      -239
...

The merged data set should be of class "ts". I can perform the conversions outside of R and then import, but I would rather perform all conversions within R. I have looked through the zoo and Rmetrics packages but without success. Any help will be appreciated.

Thanks,
Richard Saba
Department of Economics
Auburn University
Email: [EMAIL PROTECTED]
Phone: 334 844-2922
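A base-R sketch of the collapse-and-merge step, using the first rows of both tables from the post. It assumes every month between the first and last observation is present; a gap would require filling in the missing months before calling `ts()`.

```r
# First 11 weekly rows from the post
weekly <- data.frame(
  Year  = rep(1990, 11),
  month = c(8, 8, 9, 9, 9, 9, 10, 10, 10, 10, 10),
  day   = c(20, 27, 3, 10, 17, 24, 1, 8, 15, 22, 29),
  price = c(119.1, 124.5, 124.2, 125.2, 126.6, 127.2,
            132.1, 133.3, 133.9, 134.5, 133.9)
)

# Monthly average price, sorted chronologically
monthly <- aggregate(price ~ month + Year, data = weekly, FUN = mean)
monthly <- monthly[order(monthly$Year, monthly$month), ]

# Turn both series into monthly ts objects and align them on common dates
price_ts  <- ts(monthly$price,
                start = c(monthly$Year[1], monthly$month[1]), frequency = 12)
change_ts <- ts(c(-226.871, -896.333, 111.419),
                start = c(1990, 8), frequency = 12)
merged <- ts.intersect(price = price_ts, change = change_ts)

print(merged)
```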
[R] Bug? in summary( ) function base package
There seems to be an error in the summary() function when applied to "ts" class objects. A call to summary() on the R "ts" data set USAccDeaths reports the wrong value for Max. The value reported by the summary function is 11320. The max() function returns the correct value 11317, the July 1973 value. Coercing the data to a data.frame and calling summary returns the correct max value. A search of R-help found a post in 2007 that mentioned a problem but attributed it to rounding errors. But this is too large a difference to account for a simple rounding error. Has anyone else encountered the problem? Is there a workaround?

R version 2.6.2 Patched (2008-02-08 r44394)

> data(USAccDeaths)
> summary(USAccDeaths)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   6892    8089    8728    8789    9323   11320
> max(USAccDeaths)
[1] 11317
> USAccDeaths
       Jan   Feb   Mar   Apr   May   Jun   Jul   Aug   Sep   Oct   Nov   Dec
1973  9007  8106  8928  9137 10017 10826 11317 10744  9713  9938  9161  8927
1974  7750  6981  8038  8422  8714  9512 10120  9823  8743  9129  8710  8680
1975  8162  7306  8124  7870  9387  9556 10093  9620  8285  8466  8160  8034
1976  7717  7461  7767  7925  8623  8945 10078  9179  8037  8488  7874  8647
1977  7792  6957  7726  8106  8890  9299 10625  9302  8314  8850  8265  8796
1978  7836  6892  7791  8192  9115  9434 10484  9827  9110  9070  8633  9240
> dat1 <- as.data.frame(USAccDeaths)
> summary(dat1)
       x
 Min.   : 6892
 1st Qu.: 8089
 Median : 8728
 Mean   : 8789
 3rd Qu.: 9323
 Max.   :11317

Thanks,
R Saba
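The reported values are consistent with summary() showing its results to four significant digits, which was the default for numeric vectors in R versions of that era, rather than with a computational error. The sketch below reproduces the arithmetic; how a current R version prints the summary may differ, so no printed output is shown.

```r
x <- as.numeric(USAccDeaths)

# The true maximum and the same number rounded to 4 significant digits
true_max    <- max(x)
rounded_max <- signif(true_max, 4)

print(true_max)
print(rounded_max)

# Asking for more digits is the usual workaround
print(summary(USAccDeaths, digits = 7))
```

Here 11317 rounds to 11320 at four significant digits, which matches the value in the post exactly.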
[R] extracting year an month from ts data set
I have an ASCII data set of monthly observations starting in Jan 1946, with a header:

hstarts
57
65
95
103
103
97
94
. . .

which I read with the following code:

    tab6.1 <- ts(read.table(fname, header=TRUE), frequency=12, start=c(1946,1))

I would like to run a time series model with dummy variables for each month. If I had a variable which takes values from 1 to 12 indicating the month, I could use the factor() function to model the series:

    reg1 <- lm(hstarts ~ -1 + factor(months))

Is there a function that will extract the year and month from a ts data set?

Thanks,
Richard Saba
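For a "ts" object, `cycle()` gives the position within the year and `time()` gives the decimal date, from which the year can be taken. The sketch below uses the built-in monthly series `USAccDeaths` as a stand-in for the housing starts data.

```r
x <- USAccDeaths  # built-in monthly series, used here as a stand-in

month <- cycle(x)                 # 1..12 for a monthly series
year  <- floor(time(x) + 1e-6)    # small offset guards against 1973.9999...

# Seasonal-dummy regression: one coefficient per calendar month, no intercept
reg <- lm(x ~ -1 + factor(month))

print(coef(reg))
```

Each coefficient is simply the average of the series for that calendar month, which makes the fit easy to sanity-check.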
[R] Working with "ts" objects
I am relatively new to R and object oriented programming. I have relied on SAS for most of my data analysis. I teach an introductory undergraduate forecasting course using the Diebold text and I am considering using R in addition to SAS and Eviews in the course. I work primarily with univariate or multivariate time series data. I am having a great deal of difficulty understanding and working with "ts" objects particularly when it comes to referencing variables in plot commands or in formulas. The confusion is amplified when certain procedures (lm for example) coerce the "ts" object into a data.frame before application with the results that the output is stored in a data.frame object. For example the two sets of code below replicate examples from chapter 2 and 6 in the text. In the first set of code if I were to replace "anscombe<-read.table(fname, header=TRUE)" with "anscombe<-ts(read.table(fname, header=TRUE))" the plot() commands would generate errors. The objects "x1", "y1" ... would not be recognized. In this case I would have to reference the specific column in the anscombe data set. If I would have constructed the data set from several different data sets using the ts.intersect() function (see second code below)the problem becomes even more involved and keeping track of which columns are associated with which variables can be rather daunting. All I wanted was to plot actual vs. predicted values of "hstarts" and the residuals from the model. Given the difficulties I have encountered I know my students will have similar problems. Is there a source other than the basic R manuals that I can consult and recommend to my students that will help get a handle on working with time series objects? I found the Shumway "Time series analysis and its applications with R Examples" website very helpful but many practical questions involving manipulation of time series data still remain. Any help will be appreciated. 
Thanks,
Richard Saba
Department of Economics
Auburn University
Email: [EMAIL PROTECTED]
Phone: 334 844-2922

    anscombe <- read.table(fname, header=TRUE)
    names(anscombe) <- c("x1","y1","x2","y2","x3","y3","x4","y4")
    reg1 <- lm(y1 ~ 1 + x1, data=anscombe)
    reg2 <- lm(y2 ~ 1 + x2, data=anscombe)
    reg3 <- lm(y3 ~ 1 + x3, data=anscombe)
    reg4 <- lm(y4 ~ 1 + x4, data=anscombe)
    summary(reg1)
    summary(reg2)
    summary(reg3)
    summary(reg4)
    par(mfrow=c(2,2))
    plot(x1,y1)
    abline(reg1)
    plot(x2,y2)
    abline(reg2)
    plot(x3,y3)
    abline(reg3)
    plot(x4,y4)
    abline(reg4)

    ..

    fname <- file.choose()
    tab6.1 <- ts(read.table(fname, header=TRUE), frequency=12, start=c(1946,1))
    month <- cycle(tab6.1)
    year <- floor(time(tab6.1))
    dat1 <- ts.intersect(year, month, tab6.1)
    dat2 <- window(dat1, start=c(1946,1), end=c(1993,12))
    reg1 <- lm(tab6.1 ~ 1 + factor(month), data=dat2, na.action=NULL)
    summary(reg1)
    hstarts <- dat2[,3]
    plot1 <- ts.intersect(hstarts, reg1$fitted.values, reg1$residuals)
    plot.ts(plot1[,1])
    lines(plot1[,2], col="red")
    plot.ts(plot1[,3], ylab="Residuals")
[R] R procedure similar to STATA heckprob?
Is anyone aware of an R procedure similar to STATA's "heckprob" procedure? "Heckprob" fits maximum likelihood probit models correcting for sample selection bias.

Thanks,
Richard Saba
Department of Economics
Auburn University
Email: [EMAIL PROTECTED]
Re: [R] question about xreg of arima
Tom,

A constant term is not included in the model if any differencing is specified. The xreg= parameter is used to add other explanatory variables to the model. In your case xreg=1:length(x) adds a linear time trend (1, 2, ..., n) as a regressor. After first differencing, that trend becomes a vector of 1's, so its coefficient plays the role of the constant (drift) in the differenced model. That is why the two fits agree.

Robert Shumway and David Stoffer's website for their "Time Series Analysis and Its Applications with R Examples" text (http://www.stat.pitt.edu/stoffer/tsa2/index.html) has several very helpful documents specific to time series analysis. The R ISSUES document addresses your question.

Richard

>Hi,
>I am trying to understand exactly what xreg does in arima. The documentation for xreg says: "xreg Optionally, a vector or matrix of external regressors, which must have the same number of rows as x." What does this mean with regard to the action of xreg in arima?
>Apparently somehow xreg made the following two arima fits equivalent in R:
>arima(x, order=c(1,1,1), xreg=1:length(x))
>is the same as
>arima(diff(x), order=c(1,0,1))
>While I understand the latter fit (I think), I am puzzled with regard to the former. Does anyone know what the former is doing to arima, and why it works as it does?
>Thanks!
>-- Tom

Richard Saba
Department of Economics
Auburn University
Auburn, AL 36849 USA
[EMAIL PROTECTED]
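The equivalence in the quoted question can be checked numerically. The sketch below simulates a series with drift (all values are invented) and compares the coefficient on the time regressor in the differenced fit with the intercept of the model fitted to `diff(x)`.

```r
set.seed(1)
n <- 200
# Hypothetical integrated series: ARMA(1,1) increments around a drift of 0.5
increments <- 0.5 + as.numeric(arima.sim(model = list(ar = 0.5, ma = 0.3), n = n))
x <- cumsum(increments)

trend <- 1:n
# Differencing the regressor 1, 2, ..., n leaves a column of ones
stopifnot(all(diff(trend) == 1))

fit_trend <- arima(x, order = c(1, 1, 1), xreg = trend)
fit_diff  <- arima(diff(x), order = c(1, 0, 1))

print(coef(fit_trend))  # ar1, ma1, coefficient on the trend
print(coef(fit_diff))   # ar1, ma1, intercept
```

The last coefficient of each fit estimates the same quantity, the average change per period, so the two should agree up to optimizer tolerance.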
[R] Overriding contributed package functions
The "tsdiag" function in the TSA package overrides the "tsdiag" function in the "stats" package. There are a few annoying bugs in TSA's version of the function, so I would like to use the "stats" function but still have access to the other TSA functions. I have tried using stats::tsdiag(), but as long as the TSA package is attached the function from the "TSA" package is called. I believe the problem is the result of the TSA package not having a "namespace". The only solution I have found is to detach the TSA package (detach("package:TSA")), which results in the loss of all the TSA-specific functions. Does anyone have another solution?

The following code illustrates the problem:

    Y1 <- arima.sim(n=100, list(ar=c(.95,-0.2)))
    model1 <- arima(Y1, order=c(2,0,0))
    tsdiag(model1)
    library(TSA)
    tsdiag(model1)
    stats::tsdiag(model1)
    detach("package:TSA")
    tsdiag(model1)

R Saba
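A plausible explanation is that `stats::tsdiag` is only the S3 generic, so dispatch still picks up whichever `tsdiag.Arima` method it finds first. Under that assumption, calling the method from the stats namespace directly sidesteps dispatch. The sketch below does not load TSA, so it only shows how to obtain and call the stats method.

```r
set.seed(1)
Y1 <- arima.sim(n = 100, list(ar = c(0.95, -0.2)))
model1 <- arima(Y1, order = c(2, 0, 0))

# Fetch the unexported method straight from the stats namespace
stats_tsdiag <- getFromNamespace("tsdiag.Arima", "stats")

# Calling the method itself bypasses S3 dispatch on tsdiag()
stats_tsdiag(model1)
```

`stats:::tsdiag.Arima(model1)` is an equivalent shorthand for interactive use.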
[R] Formulae for R functions
Can someone direct me to a resource or resources that list the formulae used by R functions (e.g. predict.lm) to calculate the statistics they report? I am not a programmer, and studying the R code is extremely slow going. I have searched r-project.org and all the function help files without success. For example, I have attempted to replicate by hand the se.fit calculation that a call to the predict function returns for an lm object, and have not been able to reproduce the results.

Thanks,
Richard Saba
Department of Economics
Auburn University
Email: [EMAIL PROTECTED]
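For the se.fit case specifically, the textbook formula for the standard error of a fitted mean at a new point x0 is s * sqrt(x0' (X'X)^-1 x0), where s is the residual standard error. The sketch below checks this against `predict()` using the built-in `cars` data; the choice of data set and of new speeds is arbitrary.

```r
fit <- lm(dist ~ speed, data = cars)
new <- data.frame(speed = c(10, 20))

# Design matrix rows for the new points (intercept column plus speed)
X0 <- model.matrix(~ speed, data = new)

s       <- summary(fit)$sigma         # residual standard error
XtX_inv <- summary(fit)$cov.unscaled  # (X'X)^-1 from the fitted model

# Standard error of the fitted mean at each new point
se_by_hand <- s * sqrt(diag(X0 %*% XtX_inv %*% t(X0)))
se_predict <- predict(fit, newdata = new, se.fit = TRUE)$se.fit

print(cbind(se_by_hand, se_predict))
```

This is the standard error of the estimated mean response, not of a new observation; a prediction interval adds the residual variance under the square root.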
[R] Newey-West corrections in SUR regression models
Is anyone aware of a procedure to apply Newey-West corrections for autocorrelation to a SUR regression model? The SANDWICH package seems to be applicable only to LM or GLM models.

Thanks,
Richard Saba
Department of Economics
Auburn University
Email: [EMAIL PROTECTED]
[R] Newey-West and SUR regression models
Is anyone aware of a procedure to apply Newey-West corrections for autocorrelation to a SUR regression model? The SANDWICH package seems to be applicable only to LM or GLM models.

Thanks,
Richard Saba
Department of Economics
Auburn University
Email: [EMAIL PROTECTED]
[R] Replacing value with "1"
Hi,

I have a matrix that contains 1565 rows and 132 columns. All the observations are either "0" or "1". I want to keep all the observations the same but with one change: whenever there is a "1", the very next value in the same row should become "1". Please see below for a sample:

>df
 0  0  1  0  0
NA  0  1  1  0
 0  1  0  0 NA

What I want is:

 0  0  1  1  0
NA  0  1  1  1
 0  1  1  0 NA

I shall be thankful for the reply.

Saba
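One vectorised sketch of this: build a copy of the matrix shifted one column to the right, and set a cell to 1 wherever the shifted copy holds a 1. It is an assumption here that `NA` cells should stay `NA` and that only the original 1's trigger a change, so the effect does not cascade along the row.

```r
# Sample matrix from the post
m <- matrix(c( 0, 0, 1, 0, 0,
              NA, 0, 1, 1, 0,
               0, 1, 0, 0, NA),
            nrow = 3, byrow = TRUE)

# Value of the left-hand neighbour for every cell (first column has none)
prev <- cbind(0, m[, -ncol(m), drop = FALSE])

out <- m
# Set to 1 where the left neighbour is 1, leaving NA cells untouched
out[!is.na(prev) & prev == 1 & !is.na(m)] <- 1

print(out)
```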
[R] R-help mailing list
Hi,

I am a PhD student and I want to learn how to run a linear regression with 5 lags in R through a "for loop". Please find the details below:

1- I need guidance on coding a simple linear regression with lag 5 in R.
2- I have time series data of "Daily Returns" of 15 stocks and I want to see how each stock's return is connected to all other stocks' returns. This means I have to run regressions as follows:
a) Impact of Stock 1's return on return of Stock 2. Impact of Stock 1's return on return of Stock 3. Impact of Stock 1's return on return of Stock 4 ... till return of Stock 15.
b) Then, impact of Stock 2's return on return of Stock 1. Impact of Stock 2's return on return of Stock 3. Impact of Stock 2's return on return of Stock 4 ... till return of Stock 15. And this will continue till Stock 15, one after another.
c) As the process has to be repeated, a "for loop" is required instead of manual coding every time.

I shall be really grateful for a detailed reply. Thanks.

Regards,
Saba Sehrish
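The pairwise scheme described in a) to c) can be sketched as two nested loops, with `embed()` building the lagged regressors. The example uses four simulated series instead of 15 real ones, and regressing stock j on five lags of stock i is one reading of "impact with lag 5"; both are assumptions.

```r
set.seed(1)
# Hypothetical daily returns for 4 stocks (the post has 15)
returns <- as.data.frame(matrix(rnorm(200 * 4), ncol = 4,
                                dimnames = list(NULL, paste0("stock", 1:4))))
n_lags <- 5
results <- list()

for (i in names(returns)) {
  # embed(): column 1 is the current value, columns 2..6 are lags 1..5
  lagged <- embed(returns[[i]], n_lags + 1)
  X <- lagged[, -1, drop = FALSE]
  colnames(X) <- paste0("lag", 1:n_lags)

  for (j in setdiff(names(returns), i)) {
    # Drop the first n_lags observations so y lines up with the lag matrix
    y <- returns[[j]][(n_lags + 1):nrow(returns)]
    results[[paste(i, "->", j)]] <- lm(y ~ X)
  }
}

print(names(results))
print(summary(results[["stock1 -> stock2"]])$coefficients)
```

With 15 stocks the same loops produce 15 * 14 = 210 fitted models, all stored by name in `results`.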
[R] For loop coding
Hi,

I will be grateful if someone could please tell me how to program a regression on time series data through a "for loop".

Regards,
Saba
[R] Error in linear regression
Hi,

I am trying to apply linear regression to the attached data of two variables (DODGX, TRMCX) in R, taking into account a time lag of 5 for both of them. Each time I run this command, it gives me the following error:

    Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
      NA/NaN/Inf in 'y'
    In addition: Warning message:
    In model.response(mf, "numeric") : NAs introduced by coercion

Following is the code I am using:

    data<-read.csv(file="---",header=T)
    A<-as.matrix(data$DODGX)
    B<-as.matrix(data$TRMCX)
    nrow<-nrow(A)
    A1<-matrix(NA,nrow,1)
    A2<-matrix(NA,nrow,1)
    A3<-matrix(NA,nrow,1)
    A4<-matrix(NA,nrow,1)
    A5<-matrix(NA,nrow,1)
    A1[2:nrow,1]<-A[1:(nrow-1),1]
    A2[3:nrow,1]<-A[1:(nrow-2),1]
    A3[4:nrow,1]<-A[1:(nrow-3),1]
    A4[5:nrow,1]<-A[1:(nrow-4),1]
    A5[6:nrow,1]<-A[1:(nrow-5),1]
    nrow<-nrow(B)
    B1<-matrix(NA,nrow,1)
    B2<-matrix(NA,nrow,1)
    B3<-matrix(NA,nrow,1)
    B4<-matrix(NA,nrow,1)
    B5<-matrix(NA,nrow,1)
    B1[2:nrow,1]<-B[1:(nrow-1),1]
    B2[3:nrow,1]<-B[1:(nrow-2),1]
    B3[4:nrow,1]<-B[1:(nrow-3),1]
    B4[5:nrow,1]<-B[1:(nrow-4),1]
    B5[6:nrow,1]<-B[1:(nrow-5),1]
    reg1<-lm(A~A1+A2+A3+A4+A5+B1+B2+B3+B4+B5)
    reg2<-lm(B~B1+B2+B3+B4+B5+A1+A2+A3+A4+A5)

Kindly guide me in this regard. Thanks.

Saba
Re: [R] Error in linear regression
Hi Please find the attachment with (.txt) extension and I hope the command is visible now. library(lmtest)data<-read.csv(file="---",header=T,sep=",")A<-as.matrix(data$DODGX)B<-as.matrix(data$TRMCX) nrow<-nrow(A)A1<-matrix(NA,nrow,1)A2<-matrix(NA,nrow,1) A3<-matrix(NA,nrow,1) A4<-matrix(NA,nrow,1) A5<-matrix(NA,nrow,1) A1[2:nrow,1]<-A[1:(nrow-1),1]A2[3:nrow,1]<-A[1:(nrow-2),1] A3[4:nrow,1]<-A[1:(nrow-3),1] A4[5:nrow,1]<-A[1:(nrow-4),1] A5[6:nrow,1]<-A[1:(nrow-5),1] nrow<-nrow(B)B1<-matrix(NA,nrow,1)B2<-matrix(NA,nrow,1) B3<-matrix(NA,nrow,1) B4<-matrix(NA,nrow,1) B5<-matrix(NA,nrow,1) B1[2:nrow,1]<-B[1:(nrow-1),1]B2[3:nrow,1]<-B[1:(nrow-2),1] B3[4:nrow,1]<-B[1:(nrow-3),1] B4[5:nrow,1]<-B[1:(nrow-4),1] B5[6:nrow,1]<-B[1:(nrow-5),1] reg1<-lm(A~A1+A2+A3+A4+A5+B1+B2+B3+B4+B5)reg2<-lm(B~B1+B2+B3+B4+B5+A1+A2+A3+A4+A5) Following error is occurring: Error in lm.fit(x,y,offset = offset, singular.ok = singular.ok, ...) : NA/NaN/lnf in 'y'In addition: Warning message:In model.response(mf,"numeric") : NAs introduced by coercion RegardsSaba On Friday, 18 December 2015, 15:11, David Winsemius wrote: > On Dec 17, 2015, at 1:13 PM, Saba Sehrish via R-help > wrote: > > Hi I am trying to apply linear regression on the attached data The is no attached data; please read the posting guide. Do not post with .csv or .doc files. You can have commas as separators but an attachment must have a .txt extension. > of two variables (DODGX, TRMCX) in R by taking into account time lag=5 for > both of them. Each time I run this command, it gives me following error: > Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) 
: > NA/NaN/Inf in 'y'In addition: Warning message:In model.response(mf, > "numeric") : NAs introduced by coercion > Following is the programming I am using: > > data<-read.csv(file="---",header=T)A<-as.matrix(data$DODGX)B<-as.matrix(data$TRMCX) > nrow<-nrow(A)A1<-matrix(NA,nrow,1)A2<-matrix(NA,nrow,1)A3<-matrix(NA,nrow,1)A4<-matrix(NA,nrow,1)A5<-matrix(NA,nrow,1)A1[2:nrow,1]<-A[1:(nrow-1),1]A2[3:nrow,1]<-A[1:(nrow-2),1]A3[4:nrow,1]<-A[1:(nrow-3),1]A4[5:nrow,1]<-A[1:(nrow-4),1]A5[6:nrow,1]<-A[1:(nrow-5),1]nrow<-nrow(B)B1<-matrix(NA,nrow,1)B2<-matrix(NA,nrow,1)B3<-matrix(NA,nrow,1)B4<-matrix(NA,nrow,1)B5<-matrix(NA,nrow,1)B1[2:nrow,1]<-B[1:(nrow-1),1]B2[3:nrow,1]<-B[1:(nrow-2),1]B3[4:nrow,1]<-B[1:(nrow-3),1]B4[5:nrow,1]<-B[1:(nrow-4),1]B5[6:nrow,1]<-B[1:(nrow-5),1] > reg1<-lm(A~A1+A2+A3+A4+A5+B1+B2+B3+B4+B5)reg2<-lm(B~B1+B2+B3+B4+B5+A1+A2+A3+A4+A5) I do not see the usual "HTML deleted" message, but nonetheless your code has arrived without any line breaks. Line breaks are syntactically necessary. So please learn to post in plain text, in a format that does not mangle the ability of humans to read this code.
-- David Winsemius Alameda, CA, USA DODGX,TRMCX "739,171,876.13","-30,023,111.44" "487,266,676.01","21,283,768.23" "372,851,476.15","-40,442,678.43" "63,229,603.27","10,656,220.90" "42,006,490.16","-11,533,497.55" "190,745,334.56","-5,394,116.27" "172,710,138.57","-15,091,006.48" "231,059,302.57","23,568,469.87" "519,602,621.84","64,131,342.59" "997,358,074.79","23,623,980.29" "291,864,614.39","65,303,351.45" "80,844,732.71","69,354,076.90" "701,170,068.28","106,386,633.76" "440,463,911.27","105,165,515.47" "67,256,920.87","57,943,316.76" "64,101,070.80","50,209,212.89" "-71,028,831.03","31,292,473.88" "-197,854,142.48","32,805,225.46" "-189,290,263.33","4,638,671.93" "-520,470,164.74","962,640,792.41" "-471,115,277.27","-1,093,666,458.34" "-955,868,238.04","-102,261,874.75" "-1,098,715,608.87","-101,020,121.92" "-738,546,938.53","-69,222,216.12" "-1,085,874,989.74","-136,045,443.89" "193,157,212.12","-2,473,692.63" "-6,269,415.53","-28,891,931.00" "199,824,564.81","5,127,403.10" "302,376,261.45","6,655,585.13" "-67,851,220.11","-13,741,489.54" "-370,952,946.99","-24,219,268.21" "34,404,761.25","27,283,468.90" "-428,849,252.43","-85,765,593.88" "-924,463,014.01","-112,574,045.54" "-495,270,249.60","-2,965,265.14"
[R] Error-linear regression
Hi I am trying to apply linear regression on the attached data of two variables (DODGX, TRMCX) in R by taking time lag=5 for both of them. Each time I run this command, it gives me following error: Error in lm.fit(x,y,offset = offset, singular.ok = singular.ok, ...) : NA/NaN/lnf in 'y' In addition: Warning message: In model.response(mf,"numeric") : NAs introduced by coercion Following is the command: library(lmtest)data<-read.csv(file="---",header=T,sep=",") A<-as.matrix(data$DODGX) B<-as.matrix(data$TRMCX) nrow<-nrow(A) A1<-matrix(NA,nrow,1) A2<-matrix(NA,nrow,1) A3<-matrix(NA,nrow,1) A4<-matrix(NA,nrow,1) A5<-matrix(NA,nrow,1) A1[2:nrow,1]<-A[1:(nrow-1),1] A2[3:nrow,1]<-A[1:(nrow-2),1] A3[4:nrow,1]<-A[1:(nrow-3),1] A4[5:nrow,1]<-A[1:(nrow-4),1] A5[6:nrow,1]<-A[1:(nrow-5),1] nrow<-nrow(B) B1<-matrix(NA,nrow,1) B2<-matrix(NA,nrow,1) B3<-matrix(NA,nrow,1) B4<-matrix(NA,nrow,1) B5<-matrix(NA,nrow,1) B1[2:nrow,1]<-B[1:(nrow-1),1] B2[3:nrow,1]<-B[1:(nrow-2),1] B3[4:nrow,1]<-B[1:(nrow-3),1] B4[5:nrow,1]<-B[1:(nrow-4),1] B5[6:nrow,1]<-B[1:(nrow-5),1] reg1<-lm(A~A1+A2+A3+A4+A5+B1+B2+B3+B4+B5) reg2<-lm(B~B1+B2+B3+B4+B5+A1+A2+A3+A4+A5) Regards Saba DODGX,TRMCX "739,171,876.13","-30,023,111.44" "487,266,676.01","21,283,768.23" "372,851,476.15","-40,442,678.43" "63,229,603.27","10,656,220.90" "42,006,490.16","-11,533,497.55" "190,745,334.56","-5,394,116.27" "172,710,138.57","-15,091,006.48" "231,059,302.57","23,568,469.87" "519,602,621.84","64,131,342.59" "997,358,074.79","23,623,980.29" "291,864,614.39","65,303,351.45" "80,844,732.71","69,354,076.90" "701,170,068.28","106,386,633.76" "440,463,911.27","105,165,515.47" "67,256,920.87","57,943,316.76" "64,101,070.80","50,209,212.89" "-71,028,831.03","31,292,473.88" "-197,854,142.48","32,805,225.46" "-189,290,263.33","4,638,671.93" "-520,470,164.74","962,640,792.41" "-471,115,277.27","-1,093,666,458.34" "-955,868,238.04","-102,261,874.75" "-1,098,715,608.87","-101,020,121.92" "-738,546,938.53","-69,222,216.12" 
"-1,085,874,989.74","-136,045,443.89" "193,157,212.12","-2,473,692.63" "-6,269,415.53","-28,891,931.00" "199,824,564.81","5,127,403.10" "302,376,261.45","6,655,585.13" "-67,851,220.11","-13,741,489.54" "-370,952,946.99","-24,219,268.21" "34,404,761.25","27,283,468.90" "-428,849,252.43","-85,765,593.88" "-924,463,014.01","-112,574,045.54" "-495,270,249.60","-2,965,265.14" "-668,618,574.50","-39,930,551.16" "-10,436,100.77","90,010,638.89" "-281,751,636.53","-22,157,882.66" "-385,194,082.95","43,186,980.60" "104,681,563.10","40,450,660.38" "-15,283,793.52","60,454,998.18" "-26,567,438.37","52,683,189.80" "-98,612,309.08","25,319,905.01" "21,402,708.99","44,019,777.51" "-74,846,057.05","45,104,511.78" "-951,203,476.25","9,858,962.32" "-338,231,274.10","86,293,283.74" "-424,023,473.54","102,767,273.58" "20,027,128.13","185,851,265.95" "-815,545.80","163,237,321.24" "46,996,041.85","194,808,434.99" "134,571,135.25","122,988,858.88" "-183,703,166.02","53,086,443.78" "212,728,895.49","73,301,796.90" "-197,466,304.16","-11,713,239.02" "-393,762,814.65","11,580,149.74" "-343,324,235.59","-13,610,112.45" "-260,888,613.88","10,047,787.51" "-759,009,960.63","-151,251,490.77" "-383,721,497.02","-42,502,501" __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
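A note on the likely cause, offered as a sketch rather than a confirmed diagnosis: the posted values are quoted and contain thousands separators ("739,171,876.13"), so read.csv() would return them as text, and converting such text to numbers yields NA, which matches the "NAs introduced by coercion" warning. The example below uses the first eight rows of the posted data, strips the commas before converting, and builds the lags with embed() instead of filling matrices by hand. The helper name to_num and the choice of 2 lags (the post uses 5) are assumptions made to keep the example small.

```r
# First rows of the posted file, in the same quoted format.
csv_text <- 'DODGX,TRMCX
"739,171,876.13","-30,023,111.44"
"487,266,676.01","21,283,768.23"
"372,851,476.15","-40,442,678.43"
"63,229,603.27","10,656,220.90"
"42,006,490.16","-11,533,497.55"
"190,745,334.56","-5,394,116.27"
"172,710,138.57","-15,091,006.48"
"231,059,302.57","23,568,469.87"'

dat <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Hypothetical helper: remove the thousands separators, then convert.
# as.numeric("739,171,876.13") on its own would give NA with a warning.
to_num <- function(x) as.numeric(gsub(",", "", x, fixed = TRUE))
dat$DODGX <- to_num(dat$DODGX)
dat$TRMCX <- to_num(dat$TRMCX)

# embed() puts a series next to its lags:
# column 1 is x[t], column 2 is x[t-1], column 3 is x[t-2], ...
p  <- 2                      # 5 in the post; 2 keeps this tiny sample estimable
EA <- embed(dat$DODGX, p + 1)
EB <- embed(dat$TRMCX, p + 1)
lagged <- data.frame(A = EA[, 1], A1 = EA[, 2], A2 = EA[, 3],
                     B = EB[, 1], B1 = EB[, 2], B2 = EB[, 3])

reg1 <- lm(A ~ A1 + A2 + B1 + B2, data = lagged)
reg2 <- lm(B ~ B1 + B2 + A1 + A2, data = lagged)
```

Because embed() drops the first p observations, no rows with NA lags are passed to lm(). With the full file, p can be set to 5 and the data.frame() call extended accordingly.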
[R] error in vcovNW
Hi I am using NeweyWest standard errors to correct lm( ) output. For example: lm(A~A1+A2+A3+A4+A5+B1+B2+B3+B4+B5) vcovNW<-NeweyWest(lm(A~A1+A2+A3+A4+A5+B1+B2+B3+B4+B5)) I am using package(sandwich) for NeweyWest. Now when I run this command, it gives following error: Error in solve.default(diag(ncol(umat)) - apply(var.fit$ar, 2:3, sum)) :system is computationally singular: reciprocal condition number = 7.49468e-18 Attached herewith is data for A&B, A1,A2,A3,A4,A5,B1,B2,B3,B4,B5 are simply lag variables. Can you help me removing this error please? SabaA B 739171876.1 -30023111.44 487266676 21283768.23 372851476.2 -40442678.43 63229603.27 10656220.9 42006490.16 -11533497.55 190745334.6 -5394116.27 172710138.6 -15091006.48 231059302.6 23568469.87 519602621.8 64131342.59 997358074.8 23623980.29 291864614.4 65303351.45 80844732.71 69354076.9 701170068.3 106386633.8 440463911.3 105165515.5 67256920.87 57943316.76 64101070.8 50209212.89 -71028831.0331292473.88 -197854142.532805225.46 -189290263.34638671.93 -520470164.7962640792.4 -471115277.3-1093666458 -955868238 -102261874.8 -1098715609 -101020121.9 -738546938.5-6916.12 -1085874990 -136045443.9 193157212.1 -2473692.63 -6269415.53 -28891931 199824564.8 5127403.1 302376261.5 6655585.13 -67851220.11-13741489.54 -370952947 -24219268.21 34404761.25 27283468.9 -428849252.4-85765593.88 -924463014 -112574045.5 -495270249.6-2965265.14 -668618574.5-39930551.16 -10436100.7790010638.89 -281751636.5-22157882.66 -385194083 43186980.6 104681563.1 40450660.38 -15283793.5260454998.18 -26567438.3752683189.8 -98612309.0825319905.01 21402708.99 44019777.51 -74846057.0545104511.78 -951203476.39858962.32 -338231274.186293283.74 -424023473.5102767273.6 20027128.13 185851266 -815545.8 163237321.2 46996041.85 194808435 134571135.3 122988858.9 -183703166 53086443.78 212728895.5 73301796.9 -197466304.2-11713239.02 -393762814.711580149.74 -343324235.6-13610112.45 -260888613.910047787.51 -759009960.6-151251490.8 -383721497 -151251490.8 __ 
Re: [R] error in vcovNW
Thank you. The issue is resolved by scaling the data in millions. Saba On Saturday, 19 December 2015, 15:06, Achim Zeileis wrote: On Sat, 19 Dec 2015, Saba Sehrish via R-help wrote: > Hi I am using NeweyWest standard errors to correct lm( ) output. For example: > lm(A~A1+A2+A3+A4+A5+B1+B2+B3+B4+B5) > vcovNW<-NeweyWest(lm(A~A1+A2+A3+A4+A5+B1+B2+B3+B4+B5)) > > I am using package(sandwich) for NeweyWest. Now when I run this command, it > gives following error: > Error in solve.default(diag(ncol(umat)) - apply(var.fit$ar, 2:3, sum)) > :system is computationally singular: reciprocal condition number = 7.49468e-18 > > Attached herewith is data for A&B, A1,A2,A3,A4,A5,B1,B2,B3,B4,B5 are > simply lag variables. Can you help me removing this error please? Without trying to replicate the error, there are at least two issues: (1) You should scale your data to use more reasonable orders of magnitude, e.g., in millions. This will help avoiding numerical problems. (2) More importantly, you should not employ HAC/Newey-West standard errors in autoregressive models. If you use an autoregressive specification, you should capture all relevant autocorrelations - and then no HAC estimator is necessary. Alternatively, one may treat autocorrelation as a nuisance parameter and not model it - but instead capture it in HAC standard errors. Naturally, the former strategy will typically perform better if the autocorrelations are more substantial. > Saba [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] error in vcovNW
Hi Thanks for the reminder. Actually I want to analyse whether present value of variable A is Granger caused by lag values of B and test linear hypothesis "B1,B2,B3,B4,B5=0". Therefore, to get robust standard error NeweyWest estimates are applied. Saba On Saturday, 19 December 2015, 23:26, Achim Zeileis wrote: On Sat, 19 Dec 2015, Saba Sehrish wrote: > Thank you. The issue is resolved by scaling the data in millions. That solves the numerical problem but the second issue (inappropriateness of the Newey-West estimator for an autoregressive model) persists. > Saba > > > On Saturday, 19 December 2015, 15:06, Achim Zeileis > wrote: > > > On Sat, 19 Dec 2015, Saba Sehrish via R-help wrote: > > > Hi I am using NeweyWest standard errors to correct lm( ) output. For > example: > > lm(A~A1+A2+A3+A4+A5+B1+B2+B3+B4+B5) > > vcovNW<-NeweyWest(lm(A~A1+A2+A3+A4+A5+B1+B2+B3+B4+B5)) > > > > I am using package(sandwich) for NeweyWest. Now when I run this command, > it gives following error: > > Error in solve.default(diag(ncol(umat)) - apply(var.fit$ar, 2:3, sum)) > :system is computationally singular: reciprocal condition number = > 7.49468e-18 > > > > Attached herewith is data for A&B, A1,A2,A3,A4,A5,B1,B2,B3,B4,B5 are > > simply lag variables. Can you help me removing this error please? > > Without trying to replicate the error, there are at least two issues: > > (1) You should scale your data to use more reasonable orders of magnitude, > e.g., in millions. This will help avoiding numerical problems. > > (2) More importantly, you should not employ HAC/Newey-West standard errors > in autoregressive models. If you use an autoregressive specification, you > should capture all relevant autocorrelations - and then no HAC estimator > is necessary. Alternatively, one may treat autocorrelation as a nuisance > parameter and not model it - but instead capture it in HAC standard > errors. 
Naturally, the former strategy will typically perform better if > the autocorrelations are more substantial. > > > Saba > > > > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
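The two suggestions in the reply can be sketched as follows. This is only an illustration on simulated stand-in series (not the poster's data), and it assumes that five lags capture the relevant autocorrelation, in which case an ordinary F test of the restriction B1 = ... = B5 = 0 can be obtained by comparing nested lm() fits with anova(). The object names full and restricted are made up for the example.

```r
set.seed(1)                       # simulated stand-in data, NOT the posted series
n <- 60
A <- rnorm(n, sd = 4e8)
B <- rnorm(n, sd = 5e7)

# (1) Rescale to millions before fitting, to avoid numerical problems.
A <- A / 1e6
B <- B / 1e6

# Build A[t] with five lags of A and five lags of B.
p  <- 5
EA <- embed(A, p + 1)             # columns: A[t], A[t-1], ..., A[t-5]
EB <- embed(B, p + 1)             # columns: B[t], B[t-1], ..., B[t-5]
d  <- data.frame(EA, EB[, -1])    # drop B[t], keep only the lags of B
names(d) <- c("A", paste0("A", 1:p), paste0("B", 1:p))

# (2) Autoregressive specification with and without the lags of B.
full       <- lm(A ~ ., data = d)
restricted <- lm(A ~ A1 + A2 + A3 + A4 + A5, data = d)

# F test of the joint hypothesis B1 = B2 = B3 = B4 = B5 = 0.
anova(restricted, full)
```

If the autocorrelation is instead treated as a nuisance and left unmodelled, the reply's alternative of HAC standard errors applies, but then the own lags would not be part of the model.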
[R] Descriptive Statistics of time series data
Hi, I have four variables, and the time series for each variable consists of monthly values for the past 10 years. I want descriptive statistics (mean, median, sd, min, max) for each of these four variables separately. The data I import into R consists of several columns, where each column holds the values for one month of a particular year (e.g. March 31st, 2010). Right now R gives descriptive results for each column, whereas I need them pooled over all the years (one mean, one sd, one min, one max and one median) for each variable. Kindly guide me in this regard. Thanks. Saba
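One possible approach, sketched under an assumed layout since the post does not show the data: each variable sits in its own data frame with one column per month, and all cells are pooled into a single vector before summarising. The object var1 and the helper describe are hypothetical names, and the values are simulated.

```r
# Hypothetical layout: one data frame per variable, one column per month
# (120 columns for 10 years of monthly data). Values are simulated.
set.seed(42)
var1 <- as.data.frame(matrix(rnorm(5 * 120), nrow = 5))

# Pool every cell of the data frame into one vector, then summarise once.
describe <- function(df) {
  x <- unlist(df, use.names = FALSE)
  c(mean   = mean(x, na.rm = TRUE),
    median = median(x, na.rm = TRUE),
    sd     = sd(x, na.rm = TRUE),
    min    = min(x, na.rm = TRUE),
    max    = max(x, na.rm = TRUE))
}

describe(var1)
```

With four such data frames collected in a list, sapply(list_of_vars, describe) would return one column of statistics per variable.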
[R] unbalanced number of rows
Hi, I have a data frame in which rows specify companies (codes are assigned to companies) and columns specify months (monthly data). The data record male (M) and female (F) information for each month. Following is an example of what the data look like:

code  01  02  03  04
001   na  M   M   M
001   M   M   M   F
002   M   F   F   na
003   M   na  na  M
003   F   M   M   F
003   F   F   M   M

na = no male/female.

First, I want to combine the rows that share a code to get the total number of males and females in each month. Second, I need to calculate the fraction of females in each month, F/(M+F), for these three companies. Kindly guide me in this regard. Thanks, Saba
[R] working with unequal rows
Hi, I have a data frame in which rows specify companies (codes are assigned to companies) and columns specify months (monthly data). The data record male (M) and female (F) information for each month. Following is an example of what my data look like:

code  01  02  03  04
001   M   M   M   na
001   F   M   M   M
002   M   na  F   F
003   F   F   F   M
003   F   F   M   na
003   M   M   M   M

na = no male/female.

First, I want to combine the rows that share a code to get the total number of males and females in each month for each company. Second, I need to calculate the fraction of females in each month, F/(M+F), for each of these companies. For example, in the first month of company 001 there is one male and one female working, so the fraction of females in that month is 0.5. I need to know the code to get this fraction for my whole data set. Kindly guide me in this regard. Thanks, Saba
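A minimal base R sketch of the counting step, using the example table from the post. The column names code and m01 to m04 are assumptions (the post only shows 01 to 04), and "na" is entered as a real NA.

```r
staff <- data.frame(
  code = c("001", "001", "002", "003", "003", "003"),
  m01  = c("M", "F", "M", "F", "F", "M"),
  m02  = c("M", "M", NA,  "F", "F", "M"),
  m03  = c("M", "M", "F", "F", "M", "M"),
  m04  = c(NA,  "M", "F", "M", NA,  "M"),
  stringsAsFactors = FALSE
)
months <- c("m01", "m02", "m03", "m04")

# Turn each cell into 1/0 (NA stays NA), then add up the rows per company.
# na.rm = TRUE means an NA cell counts as neither male nor female.
n_f <- rowsum((staff[months] == "F") + 0, staff$code, na.rm = TRUE)
n_m <- rowsum((staff[months] == "M") + 0, staff$code, na.rm = TRUE)

# Fraction of females per company (rows) and month (columns).
frac <- n_f / (n_f + n_m)
frac
```

For company 001 in the first month this gives 0.5, as in the post. A company-month with no recorded staff at all (002 in month 02 here) comes out as NaN, because 0/0 is undefined.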
[R] Finding Highest value in groups
Hi, I have two columns in a data frame. The first column holds the "ID" assigned to each group of my data (rows with the same ID form one group). From the second column, I want to identify the highest value within each group and keep it together with its ID. Right now the data look like:

ID  Value
1   0.69
1   0.31
2   0.01
2   0.99
3   1.00
4   NA
4   0
4   1
5   0.5
5   0.5

I want to use R to get the result below:

ID  Value
1   0.69
2   0.99
3   1.00
4   1
5   0.5

Kindly guide me in this regard. Thanks, Saba
Re: [R] Finding Highest value in groups
Thanks a lot. It's really helpful. Regards, Saba

On Saturday, 23 April 2016, 6:50, Giorgio Garziano wrote:

Since the aggregate S3 method for class formula already has na.action = na.omit,

    ## S3 method for class 'formula'
    aggregate(formula, data, FUN, ..., subset, na.action = na.omit)

I think that to deal with NAs it is enough to call:

    aggregate(Value~ID, dta, max)

Moreover, passing na.rm = FALSE/TRUE is a "don't care":

    aggregate(Value~ID, dta, max, na.rm=FALSE)

The result is:

      ID Value
    1  1  0.69
    2  2  0.99
    3  3  1.00
    4  4  1.00
    5  5  0.50

which is the same as with na.rm=TRUE. On the contrary, in the following cases the result is different:

    aggregate(Value~ID, dta, max, na.action = na.pass)

      ID Value
    1  1  0.69
    2  2  0.99
    3  3  1.00
    4  4    NA
    5  5  0.50

    aggregate(Value~ID, dta, max, na.action = na.fail)

    Error in na.fail.default(list(Value = c(0.69, 0.31, 0.01, 0.99, 1, NA

-- Best, GG
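For anyone wanting to rerun the reply, here is a self-contained version. The data frame dta is rebuilt from the values in the original question; the tapply() line is an extra alternative that is not part of the reply.

```r
# Data from the original question.
dta <- data.frame(ID    = c(1, 1, 2, 2, 3, 4, 4, 4, 5, 5),
                  Value = c(0.69, 0.31, 0.01, 0.99, 1.00, NA, 0, 1, 0.5, 0.5))

# The formula method drops rows with NA (na.action = na.omit) before FUN runs.
res <- aggregate(Value ~ ID, data = dta, FUN = max)
res

# With na.pass the NA reaches max(), so group 4 becomes NA.
res_pass <- aggregate(Value ~ ID, data = dta, FUN = max, na.action = na.pass)

# Alternative not in the reply: tapply() with an explicit na.rm.
by_id <- tapply(dta$Value, dta$ID, max, na.rm = TRUE)
```

Note that a group consisting only of NA values would disappear from the aggregate() result entirely, since all of its rows are omitted.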
[R] multiplication by groups
Hi, I have two data frames as shown below (the second one is obtained by aggregating the rows of df1 that share an ID). They have the same number of columns, but df2 has fewer rows than df1.

df1:
ID  A   B
1   1   2
1   0   3
2   5   NA
2   1   3
3   1   4
4   NA  NA
4   0   1
4   3   0
5   2   5
5   7   NA

df2:
ID  A   B
1   1   5
2   6   3
3   1   4
4   3   1
5   9   5

Now, to obtain the weight of each value of df1, I want to divide each row of df1 by the row of df2 with the same ID. What I want is as below:

ID  A     B
1   1     0.4
1   0     0.6
2   0.83  NA
2   0.17  1
3   1     4
4   NA    NA
4   0     1
4   1     0
5   0.22  1
5   0.78  NA

Kindly guide me in this regard. Thanks, Saba
[R] Dividing rows in groups
Hi, I have two data frames as shown below (the second one is obtained by aggregating the rows of df1 that share an ID). They have the same number of columns, but df2 has fewer rows than df1.

df1:
ID  A   B
1   1   2
1   0   3
2   5   NA
2   1   3
3   1   4
4   NA  NA
4   0   1
4   3   0
5   2   5
5   7   NA

df2:
ID  A   B
1   1   5
2   6   3
3   1   4
4   3   1
5   9   5

Now, to obtain the weight of each value of df1, I want to divide each row of df1 by the row of df2 with the same ID. What I want is as below:

ID  A     B
1   1     0.4
1   0     0.6
2   0.83  NA
2   0.17  1
3   1     4
4   NA    NA
4   0     1
4   1     0
5   0.22  1
5   0.78  NA

Kindly guide me in this regard. Thanks, Saba
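A possible base R sketch: since df2 is just the per-ID sum of df1 (ignoring NA), the group totals can be computed on the fly with ave() and df2 is not needed for the division. The helper name group_total is made up for this example, and the rounding to two decimals only mirrors the post.

```r
df1 <- data.frame(ID = c(1, 1, 2, 2, 3, 4, 4, 4, 5, 5),
                  A  = c(1, 0, 5, 1, 1, NA, 0, 3, 2, 7),
                  B  = c(2, 3, NA, 3, 4, NA, 1, 0, 5, NA))

# Hypothetical helper: total of x within each group g, ignoring NA,
# repeated on every row of the group so it lines up with df1.
group_total <- function(x, g) {
  ave(x, g, FUN = function(v) sum(v, na.rm = TRUE))
}

weights <- df1
weights$A <- round(df1$A / group_total(df1$A, df1$ID), 2)
weights$B <- round(df1$B / group_total(df1$B, df1$ID), 2)
weights
```

This reproduces the table in the post except for ID 3, where the rule gives 1 for both A and B (1/1 and 4/4), while the post lists 4 for B. NA entries in df1 stay NA in the result.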
[R] Inserting a blank row to every other row
Hi, I need to insert a blank row after every row of an R data frame. I tried: df[rep(1:nrow(df),1,each=2),] But this inserts a row carrying the name of the previous row, while I want a completely blank row without any name/title. Please guide me. Regards, Saba
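One way to do this, shown as a sketch on a made-up three-row data frame: build a block of all-NA rows, stack it under the data, and reorder so that data rows and NA rows alternate. A data frame always has row names, so "blank" here means NA cells with plain sequential row names; the blanks only become empty when the result is written out.

```r
# Made-up example data.
df <- data.frame(name = c("a", "b", "c"), value = c(1, 2, 3),
                 stringsAsFactors = FALSE)

n <- nrow(df)

# Indexing with NA produces one all-NA row per index.
blank <- df[rep(NA_integer_, n), ]

# Row order 1, n+1, 2, n+2, ...: each data row followed by an NA row.
idx <- as.vector(rbind(seq_len(n), n + seq_len(n)))
out <- rbind(df, blank)[idx, ]
rownames(out) <- NULL

# On export, NA can be written as an empty field and row names left out.
write.csv(out, na = "", row.names = FALSE)
```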
[R] replacing values of rows with identical row names in two dataframes
Hi, I have two data frames (df1, df2) with the same number of columns (1566) but fewer rows in df2 (2772 in df1 and 40 in df2). The row names (dates) of df2 also occur in df1. For all rows with identical row names (dates), I want to replace the NAs of df1 with the values of df2, without affecting the values that already exist in those rows of df1. Please see below:

df1:
date      11A  11A  21B  3CC  3CC
20040101  100  150  NA   NA   140
20040115  200  NA   200  NA   NA
20040131  NA   165  180  190  190
20040205  NA   NA   NA   NA   NA
20040228  NA   NA   NA   NA   NA
20040301  150  155  170  150  160
20040315  NA   NA   180  190  200
20040331  NA   NA   NA   175  180

df2:
date      11A  11A  21B  3CC  3CC
20040131  170  NA   NA   NA   NA
20040228  140  145  165  150  155
20040331  NA   145  160  NA   NA

I want the resulting data frame to be:

df3:
date      11A  11A  21B  3CC  3CC
20040101  100  150  NA   NA   140
20040115  200  NA   200  NA   NA
20040131  170  165  180  190  190
20040205  NA   NA   NA   NA   NA
20040228  140  145  165  150  155
20040301  150  155  170  150  160
20040315  NA   NA   180  190  200
20040331  NA   145  160  175  180

If possible, I would prefer to use a "for" loop and the "which" function to achieve the result. Please guide me in this regard. Thanks, Saba
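A sketch along the lines requested, using a for loop and which(). It assumes the data are held as numeric matrices with the dates as row names and that columns correspond by position (the example has repeated column names, so matching by name would be ambiguous); column names are therefore left out here.

```r
dates1 <- c("20040101", "20040115", "20040131", "20040205",
            "20040228", "20040301", "20040315", "20040331")
df1 <- matrix(c(100, 150,  NA,  NA, 140,
                200,  NA, 200,  NA,  NA,
                 NA, 165, 180, 190, 190,
                 NA,  NA,  NA,  NA,  NA,
                 NA,  NA,  NA,  NA,  NA,
                150, 155, 170, 150, 160,
                 NA,  NA, 180, 190, 200,
                 NA,  NA,  NA, 175, 180),
              ncol = 5, byrow = TRUE, dimnames = list(dates1, NULL))
df2 <- matrix(c(170,  NA,  NA,  NA,  NA,
                140, 145, 165, 150, 155,
                 NA, 145, 160,  NA,  NA),
              ncol = 5, byrow = TRUE,
              dimnames = list(c("20040131", "20040228", "20040331"), NULL))

df3 <- df1
for (d in rownames(df2)) {
  i <- which(rownames(df3) == d)        # row of df1 with the same date
  if (length(i) == 1) {
    gaps <- which(is.na(df3[i, ]))      # only cells that are NA in df1
    df3[i, gaps] <- df2[d, gaps]        # existing values are left untouched
  }
}
df3
```

Where both df1 and df2 are NA (for example the first column on 20040331), the cell simply stays NA, as in the desired result from the post.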
[R] Install GARPFRM package
Hi, I am trying to install the GARPFRM package in R (version 3.3.0) with the following steps:

(a) install.packages("GARPFRM", repos="http://R-Forge.R-project.org")

It gives the following warning messages:

1: running command '"C:/PROGRA~1/R/R-33~1.0/bin/i386/R" CMD INSTALL -l "C:\Users\ssehrish\Documents\R\win-library\3.3" C:\Users\ssehrish\AppData\Local\Temp\RtmpU3JvBo/downloaded_packages/GARPFRM_0.1.0.tar.gz' had status 1
2: In install.packages("GARPFRM", repos = "http://R-Forge.R-project.org") : installation of package ‘GARPFRM’ had non-zero exit status

(b) library(GARPFRM)

It gives the following error:

Error in library(GARPFRM) : there is no package called ‘GARPFRM’

Please help me in this regard. Thanks, Saba
[R] Install GARPFRM package
Hi, if a package is not loading, it is a matter of concern. Therefore, I have asked for assistance or guidance in this regard. Saba
[R] Identifying Gender
Hi, I have a CSV file with the names of male and female managers. Is there some code in R to identify gender from the names? Thanks, Saba