[R] #INCLUDE
What is R's equivalent of a C-like #include for incorporating external files? I have a generated 2k-line function and need to include it at runtime without managing it as a package (it changes hourly). Any ideas? [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
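One common answer to this kind of question is source(), which parses and evaluates an R file at run time. The sketch below is only an illustration: the file name and the toy function are made up, and the writeLines() call merely stands in for whatever generates the real file.

```r
# Hypothetical path; in practice this would be the hourly regenerated file.
gen_file <- file.path(tempdir(), "generated_function.R")

# Stand-in for the generator: write a small function definition to disk.
writeLines(c(
  "generatedFunction <- function(x) {",
  "  x * 2",
  "}"
), gen_file)

# source() reads and evaluates the file, defining generatedFunction here.
source(gen_file)
generatedFunction(21)

# Passing an environment keeps the generated code out of the workspace.
gen_env <- new.env()
source(gen_file, local = gen_env)
gen_env$generatedFunction(21)
```

Re-running source() after the file changes simply redefines the function, so no package rebuild is involved.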
[R] Proper Paste for Data Member
I imported a spreadsheet into a variable sh (e.g. sh$, sh$, etc...). Doing the following fails:

tsSource <- ts(paste("sh$",NAMEVARIABLE,sep="") ... )

The paste isn't evaluating properly. What is the proper way to concatenate a data source with a member name so that the result evaluates properly? Actual code below:

doEnv <- function(SOURCEDATA,REGDATA,HOUR,ENVNAME,REPORTNAME) {
  print(SOURCEDATA)
  print(REGDATA)
  print(HOUR)
  print(ENVNAME)
  print(REPORTNAME)
  # blah blah blah ...
  #Raw Data
  channel1 <- odbcConnectExcel("Q:/metrics.xls")
  sqlTables(channel1)
  sh1 <- sqlFetch(channel1, "Actuals$")
  close(channel1)
  # Something here is borked like the Chef himself
  tsSource<-ts(paste("sh1$",ENVNAME,sep=""),start=c(2004,1),freq=52)
  print(tsSource)
  plot(tsSource,col="grey",type="n")
  # I use AUTOBOT or DECEPTICON for generic pass fail return values. Yes I am a geek...
  return("AUTOBOT")
}
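The usual way around this is to index the data frame with the column name held as a string, rather than pasting together text that looks like code. A minimal sketch, using an invented stand-in for the imported sheet (the column names and values are assumptions):

```r
# Toy stand-in for the imported spreadsheet; column names are made up.
sh1 <- data.frame(E081 = c(10, 12, 11, 15), E082 = c(5, 6, 7, 8))
ENVNAME <- "E081"

# paste("sh1$", ENVNAME, sep = "") only builds the string "sh1$E081";
# nothing ever looks the column up. [[ ]] accepts the name as a string.
tsSource <- ts(sh1[[ENVNAME]], start = c(2004, 1), frequency = 52)
print(tsSource)
```

The same [[ ]] form works inside a function where ENVNAME arrives as an argument.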
[R] Spaces in a name
I am reading regressors from an Excel file (I have no control over the file) and some of the element names have spaces, e.g. "Small Bank Aquired", but I have found that

lm(SourceData ~ . - "Small Bank Aquired", mcReg)

doesn't work (mcReg = modelCurrentRegressors). As they are toggles I have run them through factor() so they are treated properly as 0 or 1, but because I am automagically grabbing the first 2/3rds of the data, some of the regressors are either all 0s or all 1s. So I need to take them out of the model by hand for now, until I find a nice automatic method for removing regressors that only have a single level.

So primarily: how do I handle names that include spaces in this context? And as a bonus: does anyone have a nice method for yanking regressors that only have a single level in them from the lm() call? E.g. (for the following 30 elements)

0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0, 1,1,1,1,1,1,1,1,1,1

As you can see, the first 2/3rds is all 0s and the last 1/3rd is all 1s (I am doing an in-sample forecast diagnostic: building the model only on the first 2/3rds of the data, then forecasting the next 1/3rd and comparing).

Sorry if I am rambling a bit, still on cup of coffee #1...
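Both parts of the question can be sketched in base R: backticks quote a name with spaces inside a formula, and constant columns can be filtered out of the data frame before lm() ever sees them. The data below are invented, and the column AllZero is a hypothetical example of a regressor that is constant in the modelling window.

```r
# Made-up regressors: one name contains spaces, one column is constant.
mcReg <- data.frame(
  "Small Bank Aquired" = factor(c(0, 1, 0, 0, 1, 0)),
  EOM                  = factor(c(0, 1, 0, 1, 0, 1)),
  AllZero              = factor(c(0, 0, 0, 0, 0, 0)),
  check.names = FALSE
)
SourceData <- c(10, 14, 9, 15, 11, 16)

# Drop every regressor that has fewer than two distinct values.
keep  <- vapply(mcReg, function(col) length(unique(col)) > 1, logical(1))
mcReg <- mcReg[, keep, drop = FALSE]

# Backticks (not double quotes) refer to a non-syntactic name in a formula.
fit <- lm(SourceData ~ . - `Small Bank Aquired`, data = mcReg)
names(coef(fit))
```

An alternative is to run make.names() over names(mcReg) right after import, which turns the spaces into dots so that no quoting is needed.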
[R] storing lm() results and other objects in a list
To clean up some code I would like to make a list of arbitrary length to store various objects for use in a loop. Sample code:

BEGIN SAMPLE ##
# You can see the need for a loop already
linearModel1=lm(modelSource ~ .,mcReg)
linearModel2=step(linearModel1)
linearModel3=lm(modelSource ~ .-1,mcReg)
linearModel4=step(linearModel3)
#custom
linearModel5=lm(modelSource ~ . -ACF-MonthlyST1-MonthlyST2-MonthlyBLA,mcReg)

LinearModel1.res <- residuals(linearModel1)
LinearModel2.res <- residuals(linearModel2)
LinearModel3.res <- residuals(linearModel3)
LinearModel4.res <- residuals(linearModel4)
LinearModel5.res <- residuals(linearModel5)

#hmmm bolt on linearModel[x] as linearModel[x]$arma.fit?
arma1.fit <- auto.arima(LinearModel1.res)
arma2.fit <- auto.arima(LinearModel2.res)
arma3.fit <- auto.arima(LinearModel3.res)
arma4.fit <- auto.arima(LinearModel4.res)
arma5.fit <- auto.arima(LinearModel5.res,stepwise=T,trace=T)

#Ok what is left over after Regression and ARIMA that cannot
#be explained. Stupid outliers
#AO's can be added to the cReg as a normal dummy variable
# but these are AOs from the model not the original data.
# is it better to handle AOs from the original data?
#linearModel[x]arma.ao?
arma1.ao <- detectAO(arma1.fit)
arma2.ao <- detectAO(arma2.fit)
arma3.ao <- detectAO(arma3.fit)
arma4.ao <- detectAO(arma4.fit)
arma5.ao <- detectAO(arma5.fit)

#What do I do with an innovative outlier? Transfer function or what?
#auto.arima doesn't handle the IO=c(...) stuff Umm...
#transfer functions, etc. are a deficiency in the script at this point
#linearModel[x]arma.io?
arma1.io <- detectIO(arma1.fit)
arma2.io <- detectIO(arma2.fit)
arma3.io <- detectIO(arma3.fit)
arma4.io <- detectIO(arma4.fit)
arma5.io <- detectIO(arma5.fit)

#Sample on how to auto-grab regressors from DetectAO and DetectIO and
#append them to our regression array. You'd have to do this for each model
#as the residuals are where the outliers are coming from and diff models
#would have different residuals left over. IO is best left to arimax functions
#directly. I assume at this point that AO's can be added to Regression tables
#if that is the case then REM out the IO lines and pass the detectIO results
#into the arimax(x,y,z,IO=detectIO(blah))
#
# Need a better understanding of how to address the AO and IO's in this script before implementing them
# (Repeat for each model, cReg1,cReg2,etc..)
#
#cReg1=cReg
#fReg1=fReg
#for(i in arma1.io$ind){ print(i);cReg1[,paste(sep=" ","IO",i)]=1*(seq(cReg1[,2])==i)}
#for(i in arma1.ao$ind){ print(i);cReg1[,paste(sep=" ","AO",i)]=1*(seq(cReg1[,2])==i)}
#for(i in arma1.io$ind){ print(i);fReg1[,paste(sep=" ","IO",i)]=1*(seq(fReg1[,2]))}
#for(i in arma1.ao$ind){ print(i);fReg1[,paste(sep=" ","AO",i)]=1*(seq(fReg1[,2]))}

#Get the pdq,PDQs into a variable so we can re-feed it if necessary
#oh crap absorbing this into LinearModel[x] looks ugly for syntax
arma1.fit$order=c(arma1.fit$arma[1],arma1.fit$arma[2],arma1.fit$arma[6])
arma2.fit$order=c(arma2.fit$arma[1],arma2.fit$arma[2],arma2.fit$arma[6])
arma3.fit$order=c(arma3.fit$arma[1],arma3.fit$arma[2],arma3.fit$arma[6])
arma4.fit$order=c(arma4.fit$arma[1],arma4.fit$arma[2],arma4.fit$arma[6])
arma5.fit$order=c(arma5.fit$arma[1],arma5.fit$arma[2],arma5.fit$arma[6])
arma1.fit$seasonal=c(arma1.fit$arma[3],arma1.fit$arma[4],arma1.fit$arma[7])
arma2.fit$seasonal=c(arma2.fit$arma[3],arma2.fit$arma[4],arma2.fit$arma[7])
arma3.fit$seasonal=c(arma3.fit$arma[3],arma3.fit$arma[4],arma3.fit$arma[7])
arma4.fit$seasonal=c(arma4.fit$arma[3],arma4.fit$arma[4],arma4.fit$arma[7])
arma5.fit$seasonal=c(arma5.fit$arma[3],arma5.fit$arma[4],arma5.fit$arma[7])

#these Two are used for linearModel2 and linearModel4, Get only the
#regressors that survived step removal.
newcReg=cReg[match(names(linearModel2$coeff[-1]),names(cReg))]
newfReg=fReg[match(names(linearModel2$coeff[-1]),names(fReg))]
newmcReg=mcReg[match(names(linearModel2$coeff[-1]),names(mcReg))]
newmfReg=mfReg[match(names(linearModel2$coeff[-1]),names(mfReg))]

#Scenario 1 - All Regressors Left In
newFit1.b <- Arima(modelSource,order=arma1.fit$order,seasonal=list(order=arma1.fit$seasonal),xreg=mcReg,include.drift=F)
#Scenario 2 - Step Removal of Regressors
newFit2.b <- Arima(modelSource,order=arma2.fit$order,seasonal=list(order=arma2.fit$seasonal),xreg=newmcReg,include.drift=F)
#Scenario 3 - All Regressors Left In with Intercept Removed
newFit3.b <- Arima(modelSource,order=arma3.fit$order,seasonal=list(order=arma3.fit$seasonal),xreg=mcReg,include.drift=F)
#Scenario 4 - Step Removal of Regressors with Intercept Removed (I have a feeling this is identical to #2 in results)
newFit4.b <- Arima(modelSource,order=arma4.fit$order,seasonal=list(order=arma4.fit$seasonal),xreg=newmcReg,include.drift=F)
#Scenario 5 - Robust1, For giggles and grins for now
newFit5.b <- Arima(modelSource,order=arma5.fit$order,seasonal=list(order=arma5.fit$seasonal),xreg=newmcReg,include.drift=F)
#All the
[R] list of lm() results
How can I get the results of lm() into a list so I can loop through the results? e.g.

myResults[1] <- lm(...)
myResults[2] <- lm(...)
myResults[3] <- lm(...)
...
myResults[15] <- lm(...)
myResults[16] <- lm(...)

So far every attempt I've tried doesn't work, either throwing a "number of items to replace is not a multiple of replacement length" error or simply not working.
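The error quoted above is what single-bracket assignment typically produces, because an lm object is itself a list of many components. A minimal sketch of the double-bracket alternative, using the built-in mtcars data and made-up formulas purely for illustration:

```r
# Illustrative formulas; any set of models would do.
formulas <- list(mpg ~ wt, mpg ~ wt + hp, mpg ~ wt + hp - 1)

# Pre-allocate a list and fill it with [[ ]]. With [ ], R tries to spread
# the components of the lm object over one slot, hence the error.
myResults <- vector("list", length(formulas))
for (i in seq_along(formulas)) {
  myResults[[i]] <- lm(formulas[[i]], data = mtcars)
}

# Derived objects can be kept in parallel lists built with lapply().
myResiduals <- lapply(myResults, residuals)
nCoef <- sapply(myResults, function(m) length(coef(m)))
nCoef
```

Once the fits live in a list, the per-model residuals, ARIMA fits and outlier checks can each become one lapply() call instead of five copied lines.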
[R] Odd coefficent behavior
Why are my coefficients getting appended with a 1? It borks a match I do later against the original list that doesn't have the random 1 added to the end.

> linearModel[[1]]

Call:
lm(formula = modelSource ~ +UNITBUILD + UNITDB + ITBUILD + ITDB +
    UATBUILD + UATDB + HOGANCODE + RCF + ReleaseST1 + ReleaseST2 +
    ReleaseBLA + Small.Bank.Acquisitions + HLY.NewYear + HLY.MLK +
    HLY.PRES + HLY.MEMORIAL + HLY.J4 + HLY.LABOR + HLY.COLUMBUS +
    HLY.VETS + HLY.THANKS + HLY.XMAS + HLY.ELECT + HLY.PATRIOT +
    EOM, data = mcReg)

Coefficients:
             (Intercept)                UNITBUILD1                   UNITDB1
                405.8326                   -8.5675                   13.5029
                ITBUILD1                     ITDB1                 UATBUILD1
                 33.0950                   -6.1938                    0.2625
                  UATDB1                HOGANCODE1                      RCF1
                 -3.7793                   -3.4825                    5.3243
             ReleaseST11               ReleaseST21               ReleaseBLA1
                 13.6911                   -9.4573                   -3.3526
Small.Bank.Acquisitions1              HLY.NewYear1                  HLY.MLK1
                 36.6445                  -92.5360                   22.1168
               HLY.PRES1             HLY.MEMORIAL1                   HLY.J41
                  7.1886                  -13.0013                  -14.3520
              HLY.LABOR1             HLY.COLUMBUS1                 HLY.VETS1
                 -0.9740                  -16.9177                   16.2969
             HLY.THANKS1                 HLY.XMAS1                HLY.ELECT1
                -15.9056                  -65.9887                  -10.9916
            HLY.PATRIOT1                      EOM1
                -20.2531                   15.4775

Now all the variables with a 1 appended are factors, so is that normal behavior? (If so then I can adjust the match() command to pad a 1 to the master list.)
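For what it is worth, this is the standard naming rule for factor terms: each non-reference level gets its own dummy column named variable-plus-level, so a 0/1 factor EOM yields EOM1. A small sketch with invented data (the column names echo the post, the values do not):

```r
# Toy data: EOM is a 0/1 factor, Conversions is deliberately left numeric.
d <- data.frame(
  y           = c(10, 12, 9, 14, 11, 13),
  EOM         = factor(c(0, 1, 0, 1, 0, 1)),
  Conversions = c(0, 0, 1, 0, 1, 1)
)
fit <- lm(y ~ EOM + Conversions, data = d)

# Factor terms become <variable><level>; numeric terms keep their name.
coefNames <- names(coef(fit))
coefNames

# The underlying variable names are still available from the terms object.
termNames <- attr(terms(fit), "term.labels")
termNames
```

Matching against attr(terms(fit), "term.labels") rather than names(coef(fit)) avoids having to pad or strip the level suffix by hand.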
[R] Basic Question (Real Basic)
I am having a total brain fart... complete and total. This is part R, part statistics, part "My Brain is on vacation apparently."

OK, I have a time series I need to LOG and DIFF for ARIMA with regressors. Say 100 data points. Obviously when I diff the series once I get 99 data points now. So what is the appropriate way to handle that?

Part B (this is where I am having a fundamental brain fart): given that I have logged and diff'ed the original time series and I want to get a forecast, how do I apply that back to my original data? I am having a mental implosion right now (just got back from vacation). I am just not wrapping my head around forecasting in R against transformed data (for stationarity purposes). Total brain meltdown (grilled cheese sounds good...)

Anyone have a basic shell script I can look at for reference... not connecting dots today.

Idgarad -- "Who is John Galt?"
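The back-transformation asked about in Part B amounts to undoing the two steps in reverse order: cumulatively sum the forecast differences from the last observed log value, then exponentiate. A minimal sketch on simulated data; the "forecast" here is a naive stand-in (the mean change repeated), not the output of any particular model:

```r
# Simulated positive series of 100 points, for illustration only.
set.seed(1)
x <- ts(100 * exp(cumsum(rnorm(100, mean = 0.01, sd = 0.05))))

# One difference of the logged series leaves 99 points.
dlx <- diff(log(x))
length(dlx)

# Stand-in forecast of the differenced log series for h steps ahead.
h <- 5
fc_dlx <- rep(mean(dlx), h)

# Undo the differencing with cumsum() starting from the last observed
# log value, then undo the log with exp().
last_log <- log(x[length(x)])
fc_x <- exp(last_log + cumsum(fc_dlx))
fc_x
```

When the differencing is requested through the model itself, for example arima(log(x), order = c(p, 1, q)), the predictions already come back on the log scale, so only the exp() step remains and the shorter differenced series never has to be lined up with the regressors by hand.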
[R] Table to List Transformation Scenario
I have a series of tables, one for each environment, each indicating a date (row) and a sample at each hour of the day (0 to 23).

Test1 Table:

Date,Hour1,Hour2,...Hour23
1/1/10,123,123,...,123

I would like to model this as a time series, but how can I translate the table into a list such that I get:

1/1/10 00:00, 123
1/1/10 01:00, 123
1/1/10 02:00, 123
...
1/1/10 23:00, 123

Any suggestions on how to get that kind of translation done in R?

Idgarad -- "Who is John Galt?"
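One base-R way to do this kind of wide-to-long translation is to transpose the hourly block and read it off column by column. The sketch below uses a cut-down table with three hourly columns; the column names Hour0..Hour2 and the values are assumptions about the real layout.

```r
# Two days, three hourly columns only, to keep the sketch short.
wide <- data.frame(
  Date  = c("1/1/10", "1/2/10"),
  Hour0 = c(123, 223),
  Hour1 = c(124, 224),
  Hour2 = c(125, 225),
  stringsAsFactors = FALSE
)

hour_cols <- grep("^Hour", names(wide), value = TRUE)
hours     <- as.integer(sub("^Hour", "", hour_cols))

# t() puts each date's hours into one column, so as.vector() reads the
# values day by day and hour by hour, i.e. in chronological order.
long <- data.frame(
  timestamp = as.POSIXct(rep(wide$Date, each = length(hours)),
                         format = "%m/%d/%y", tz = "UTC") +
              rep(hours, times = nrow(wide)) * 3600,
  value     = as.vector(t(as.matrix(wide[hour_cols])))
)
long
```

From there, ts(long$value, frequency = 24) gives an hourly series with a daily cycle.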
[R] Problem with predict() and factors
I am working on a script that takes numeric performance indicators and runs them against a series of regressors (dummy regressors, yes\no stuff via 0 and 1, e.g. Was it Christmas this week? 0=no, 1=yes). The script is as follows (written as a function):

-- Begin Script --
doEnv <- function(HOUR,ENVNAME,REPORTNAME) {
library(RODBC)
library(forecast)
library("geneplotter")
library(forecast)
library(fUtilities)
library(TSA)
require(gplots)
library(robfilter)

SOURCEDATA <- paste("Q:/TEST/RSTATS/EPOC ",HOUR," Metrics.xls",sep="")
REGRESSORS <- "Q:/TEST/RSTATS/eventswithholidays.xls"

mypalette=c()
mypalette$background="#FF"
mypalette$chart="#FF"
mypalette$forecastRegion="#66CCFF"
mypalette$confidence="#FF9966"
mypalette$limits="#FF"
mypalette$major="#00"
mypalette$minor="#cc"
mypalette$actual="#aa"
mypalette$dp1="#9900FF"
mypalette$dp2="#00"
mypalette$dp3="#CCFF00"
mypalette$dp4="#00CCFF"
mypalette$dp5="#FF00CC"

#Raw Data
channel1 <- odbcConnectExcel(SOURCEDATA)
sqlTables(channel1)
sh1 <- sqlFetch(channel1, "Actuals$")
close(channel1)
channel2 <- odbcConnectExcel(REGRESSORS)
sqlTables(channel2)
sh2 <- sqlFetch(channel2, "data$")
close(channel2)

#Get Raw Data
tsSource<-ts(sh1[[ENVNAME]],start=c(2004,1),freq=52)
#Data is now a Time Series

#Prep Out-of-sample test ranges
modLength=length(sh1[[ENVNAME]])
modMax=round((modLength/3)*2)
modEndDate=time(tsSource)[modMax]
modStartDate=time(tsSource)[1]

#RAW SUMMARY WITH OVERLAY OF OUT OF SAMPLE RANGES
summary(tsSource)
modelSource=window(tsSource,modStartDate,end=modEndDate)
verSource=window(tsSource,time(tsSource)[modMax+1])
pdf(paste("Q:/ReleaseMgmt/Environment Mgmt/Data/Current/Metrics/Mainframe/Test Environment Projections/RSTATS/images/",ENVNAME,"-",HOUR,"-","Raw Metrics with Test Range.pdf",sep=""),width=9, height=6.5)
plot(tsSource,col="grey", main=paste("Raw Data for", REPORTNAME), xlab="Date", ylab="MiPS Used")
points(modelSource,col="red", pch=20)
points(verSource,col="blue", pch=20)
smartlegend( x="left", y= "top", inset=0, #smartlegend parameters
  legend = c("Actual Data","Data for Model Selection","Data for In Sample Verification"),
  fill=c(mypalette$actual,"red","blue"),bg = mypalette$background)
print("The Red region is where we are going to develop the model from and the blue area is where we will evaluate the model (In Sample Testing)")
#Ok our ranges are confirmed we'll get a better graph later

# This Heavy Voodoo allows us to have a dynamic number of
#dummy variables we can add\remove from the spreadsheet
forecastDistance <- 52
#Grab Existing Regressors (clipping out the data)
cReg <- sh2[1:modLength,-1]
mcReg <- sh2[1:modMax,-1]
#transform the on\offs into proper factors
for(i in names(cReg)) cReg[[i]] <- factor(cReg[[i]])
for(i in names(mcReg)) mcReg[[i]] <- factor(mcReg[[i]])
#Grab X Future Regressors equal to the forecastDistance (gotta double check if I need a +1 on the start point)
fReg <- sh2[length(tsSource):(length(tsSource)+forecastDistance),-1]
mfReg <-sh2[(modMax+1):modLength,-1]
#fix variable names
names(cReg) <- make.names(names(cReg))
names(mcReg) <- make.names(names(mcReg))
names(fReg) <- make.names(names(fReg))
names(mfReg) <- make.names(names(mfReg))
#print("#")
#print("This is the CReg Data")
#print("#")
#print(summary(cReg))
#print("##")
#print("This is the mcReg Data")
#print("##")
#print(summary(mcReg))
#names(mcReg)
for(i in names(fReg)) fReg[[i]] <- factor(fReg[[i]])
for(i in names(mfReg)) mfReg[[i]] <- factor(mfReg[[i]])
#end heavy voodoo

#
# MODEL VERIFICATION FIRST!
#
# Basic Look at the raw data
hist(modelSource)
plot(density(modelSource,na.rm=TRUE))
plot(sort(modelSource),pch=".")
for(i in names(mcReg)) {
  pairs(modelSource ~ .,mcReg[[i]], main=paste("Model - MIPS vs",i))
}

#Build the list to store our results
linearModel <- list()
residuals <- list()
arima_Fit <- list()
arima_AO <- list()
arima_IO <- list()
newcReg <- list()
newfReg <- list()
newmcReg <- list()
newmfReg <- list()
newFit <- list()
newForecast <- list()

# Following won't work until mcReg contains full variety
linearModel[[1]]=lm(modelSource ~ + UNITBUILD + UNITDB + ITBUILD + ITDB + UATBUILD + UATDB + HOGANCODE + RCF + ReleaseST1 + ReleaseST2 + ReleaseBLA + Small.Bank.Acquisitions + HLY.NewYear + HLY.MLK + HLY.PRES + HLY.MEMORIAL + HLY.J4 + HLY.LABOR + HLY.COLUMBUS + HLY.VETS + HLY.THANKS + HLY.XMAS + HLY.ELECT + HLY.PATRIOT + EOM,mcReg)
linearModel[[2]]=step(linearModel[[1]], trace=1)
linearModel[[3]]=lm(modelSource ~ + UNITBUILD + UNITDB + ITBUILD + ITDB + UATBUILD + UATDB + HOGANCODE + RCF + ReleaseST1 + ReleaseST2 + ReleaseBLA + Small.Bank.Acquisitions + HLY.NewYear + HLY.MLK + HLY.PRES + HLY.MEMORIAL + HLY.J4 + HLY.LABOR + HLY.COLUMBUS + HLY.VETS + HLY.THANKS + HLY.XMAS + HLY.ELECT + HLY.PATRIOT + EOM - 1,mcReg)
linearModel[[4]]=step(linearModel[[3]],trace=1)
if(ENVNAME=="E081") {linearModel[[5]]=lm(modelS
[R] Holiday Gift Perl Script for US Holiday Dummy Regressors
# BEGIN CODE ##
#!/usr/bin/perl
##
#
# --start, -s = The date you would like to start generating regressors
# --end, -e = When to stop generating holiday regressors
# --scope, -c = D, W for Daily or Weekly respectively (e.g. Does this week have a particular holiday)
# --file, -f = Ummm where to write the output silly!
#
# **NOTE** The EOM holiday is "End of Month" for computer systems this may be important for
# extra processing and what not.
#
# You may need to set your TZ environment variable if the script cannot
# determine your time zone from the system (e.g. SET TZ=CST )
##
use Getopt::Long;
use Date::Manip;
use Spreadsheet::WriteExcel;
use Calendar::Functions;
use Date::Holidays::USFederal;
use Set::Array;
use POSIX qw/strftime/;
use Time::Local;

my @regressors = ();
#my $holidays = Date::Holidays->new(countrycode => 'us');
$result = GetOptions ("start|s=s" => \$start,
                      "end|e=s" => \$end,
                      "scope|c=s" => \$scope,
                      "file|f=s" => \$filename);
print "Generating Holiday Dummy Variables starting $start to $end generated by $scope. Output to $filename \n";

#print all the dates based on scope as a test
$startDate=ParseDate(\$start);
if (! $startDate) { print "Error in the date";exit; }
$endDate= ParseDate($end);
print "Start Date: ",UnixDate($startDate,"%m/%d/%Y"),"\n";
print "End Date: ",$end,"\n";
print "Last Day in Month: ",UnixDate(ParseDate("last day in JAN 2004"),"%m/%d/%Y"),"\n";

#HEADER OUTPUT
print "Date,HLY-NewYear,HLY-MLK,HLY-PRES,HLY-MEMORIAL,HLY-J4,HLY-LABOR,HLY-COLUMBUS,HLY-VETS,HLY-THANKS,HLY-XMAS,HLY-ELECT,HLY-PATRIOT,EOM\n";
$baseDate=$startDate;

if ($scope eq "d"){
 while(Date_Cmp($baseDate,$endDate)<0) {
  print UnixDate($baseDate,"%m/%d/%Y"), ",";
  if(holidayCheck($baseDate) eq "New Year's Day"){print "1,"} else {print "0,"};
  if(holidayCheck($baseDate) eq "Martin Luther King, Jr. Day"){print "1,"} else {print "0,"};
  if(holidayCheck($baseDate) eq "Presidents' Day"){print "1,"} else {print "0,"};
  if(holidayCheck($baseDate) eq "Memorial Day"){print "1,"} else {print "0,"};
  if(holidayCheck($baseDate) eq "Independence Day"){print "1,"} else {print "0,"};
  if(holidayCheck($baseDate) eq "Labor Day"){print "1,"} else {print "0,"};
  if(holidayCheck($baseDate) eq "Columbus Day"){print "1,"} else {print "0,"};
  if(holidayCheck($baseDate) eq "Veterans' Day"){print "1,"} else {print "0,"};
  if(holidayCheck($baseDate) eq "Thanksgiving Day"){print "1,"} else {print "0,"};
  if(holidayCheck($baseDate) eq "Christmas Day"){print "1,"} else {print "0,"};
  if(holidayCheck($baseDate) eq "Election Day"){print "1,"} else {print "0,"};
  if(holidayCheck($baseDate) eq "U.S. Patriot Day Unofficial Observation"){print "1,"} else {print "0,"};
  if(holidayCheck($baseDate) eq "EOM"){print "1"} else {print "0"};
  print "\n";
  $baseDate=DateCalc($baseDate,"+1 day");
 }
} # END IF D

if($scope eq "w") {
 while(Date_Cmp($baseDate,DateCalc($endDate,"+7 days"))<0) {
  print UnixDate($baseDate,"%m/%d/%Y"), ",";
  if( (holidayCheck(DateCalc($baseDate,"+0 days")) eq "New Year's Day")
   || (holidayCheck(DateCalc($baseDate,"+1 days")) eq "New Year's Day")
   || (holidayCheck(DateCalc($baseDate,"+2 days")) eq "New Year's Day")
   || (holidayCheck(DateCalc($baseDate,"+3 days")) eq "New Year's Day")
   || (holidayCheck(DateCalc($baseDate,"+4 days")) eq "New Year's Day")
   || (holidayCheck(DateCalc($baseDate,"+5 days")) eq "New Year's Day")
   || (holidayCheck(DateCalc($baseDate,"+6 days")) eq "New Year's Day")
  ){print "1,"} else {print "0,"};
  if( holidayCheck(DateCalc($baseDate,"+0 days")) eq "Martin Luther King, Jr. Day"
   || holidayCheck(DateCalc($baseDate,"+1 days")) eq "Martin Luther King, Jr. Day"
   || holidayCheck(DateCalc($baseDate,"+2 days")) eq "Martin Luther King, Jr. Day"
   || holidayCheck(DateCalc($baseDate,"+3 days")) eq "Martin Luther King, Jr. Day"
   || holidayCheck(DateCalc($baseDate,"+4 days")) eq "Martin Luther King, Jr. Day"
   || holidayCheck(DateCalc($baseDate,"+5 days")) eq "Martin Luther King, Jr. Day"
   || holidayCheck(DateCalc($baseDate,"+6 days")) eq "Martin Luther King, Jr. Day"
  ){print "1,"} else {print "0,"};
  if( holidayCheck(DateCalc($baseDate,"+0 days")) eq "Presidents' Day"
   || holidayCheck(DateCalc($baseDate,"+1 days")) eq "Presidents' Day"
   || holidayCheck(DateCalc($baseDate,"+2 days")) eq "Presidents' Day"
   || holidayCheck(DateCalc($baseDate,"+3 days")) eq "Presidents' Day"
   || holidayCheck(DateCalc($baseDate,"+4 days")) eq "Presidents' Day"
   || holidayCheck(DateCalc($baseDate,"+5 days")) eq "Presidents' Day"
   || holidayCheck(DateCalc($baseDate,"+6 days")) eq "Presidents' Day"
  ){print "1,"} else {print "0,"};
  if( holidayCheck(DateCalc($baseDate,"+0 days")) eq "Memorial Day"
   || holidayCheck(DateCalc($baseDate,"+1 days")) eq "Memorial Day"
   || holidayCheck(DateCalc($baseDate,"+2 days")) eq "Memorial Day"
   || holidayCheck(DateCalc($baseDate,"+3 days")) eq "Memorial Day"
   || holidayCheck(DateCalc($baseDate,"+4 days")) eq "Memorial Day"
   || holidayCheck(DateCalc($baseDate,"+5 days")) eq "Memorial Day"
   || h
[R] Opps Correct Version of Holiday Regressor Perl Script
Here is the correct version. The old version is the redirect only version of the script.

### BEGIN SCRIPT
#!/usr/bin/perl
##
# --start, -s = The date you would like to start generating regressors
# --end, -e = When to stop generating holiday regressors
# --scope, -c = D, W for Daily or Weekly respectively (e.g. Does this week have a particular holiday)
# --file, -f = Ummm where to write the output silly!
#
# **NOTE** The EOM holiday is "End of Month" for computer systems this may be important for
# extra processing and what not.
#
# You may need to set your TZ environment variable if the script cannot
# determine your time zone from the system (e.g. SET TZ=CST )
##
use Getopt::Long;
use Date::Manip;
use Spreadsheet::WriteExcel;
use Calendar::Functions;
use Date::Holidays::USFederal;
#use Date::Holidays;
use Set::Array;
use POSIX qw/strftime/;
use Time::Local;

my @regressors = ();
#my $holidays = Date::Holidays->new(countrycode => 'us');
$result = GetOptions ("start|s=s" => \$start,
                      "end|e=s" => \$end,
                      "scope|c=s" => \$scope,
                      "file|f=s" => \$filename);
open (OUTFILE, ">>$filename");
print "Generating Holiday Dummy Variables starting $start to $end generated by $scope. Output to $filename \n";

#print all the dates based on scope as a test
$startDate=ParseDate(\$start);
if (! $startDate) { print "Error in the date";exit; }
$endDate= ParseDate($end);
print OUTFILE "Start Date: ",UnixDate($startDate,"%m/%d/%Y"),"\n";
print OUTFILE "End Date: ",$end,"\n";
# print OUTFILE "Last Day in Month: ",UnixDate(ParseDate("last day in JAN 2004"),"%m/%d/%Y"),"\n";
print OUTFILE "Date,HLY-NewYear,HLY-MLK,HLY-PRES,HLY-MEMORIAL,HLY-J4,HLY-LABOR,HLY-COLUMBUS,HLY-VETS,HLY-THANKS,HLY-XMAS,HLY-ELECT,HLY-PATRIOT,EOM\n";
$baseDate=$startDate;

if ($scope eq "d"){
 while(Date_Cmp($baseDate,$endDate)<0) {
  print OUTFILE UnixDate($baseDate,"%m/%d/%Y"), ",";
  if(holidayCheck($baseDate) eq "New Year's Day"){print OUTFILE "1,"} else {print OUTFILE "0,"};
  if(holidayCheck($baseDate) eq "Martin Luther King, Jr. Day"){print OUTFILE "1,"} else {print OUTFILE "0,"};
  if(holidayCheck($baseDate) eq "Presidents' Day"){print OUTFILE "1,"} else {print OUTFILE "0,"};
  if(holidayCheck($baseDate) eq "Memorial Day"){print OUTFILE "1,"} else {print OUTFILE "0,"};
  if(holidayCheck($baseDate) eq "Independence Day"){print OUTFILE "1,"} else {print OUTFILE "0,"};
  if(holidayCheck($baseDate) eq "Labor Day"){print OUTFILE "1,"} else {print OUTFILE "0,"};
  if(holidayCheck($baseDate) eq "Columbus Day"){print OUTFILE "1,"} else {print OUTFILE "0,"};
  if(holidayCheck($baseDate) eq "Veterans' Day"){print OUTFILE "1,"} else {print OUTFILE "0,"};
  if(holidayCheck($baseDate) eq "Thanksgiving Day"){print OUTFILE "1,"} else {print OUTFILE "0,"};
  if(holidayCheck($baseDate) eq "Christmas Day"){print OUTFILE "1,"} else {print OUTFILE "0,"};
  if(holidayCheck($baseDate) eq "Election Day"){print OUTFILE "1,"} else {print OUTFILE "0,"};
  if(holidayCheck($baseDate) eq "U.S. Patriot Day Unofficial Observation"){print OUTFILE "1,"} else {print OUTFILE "0,"};
  if(holidayCheck($baseDate) eq "EOM"){print OUTFILE "1"} else {print OUTFILE "0"};
  print OUTFILE "\n";
  $baseDate=DateCalc($baseDate,"+1 day");
 }
} # END IF D

if($scope eq "w") {
 while(Date_Cmp($baseDate,DateCalc($endDate,"+7 days"))<0) {
  print OUTFILE UnixDate($baseDate,"%m/%d/%Y"), ",";
  if( (holidayCheck(DateCalc($baseDate,"+0 days")) eq "New Year's Day")
   || (holidayCheck(DateCalc($baseDate,"+1 days")) eq "New Year's Day")
   || (holidayCheck(DateCalc($baseDate,"+2 days")) eq "New Year's Day")
   || (holidayCheck(DateCalc($baseDate,"+3 days")) eq "New Year's Day")
   || (holidayCheck(DateCalc($baseDate,"+4 days")) eq "New Year's Day")
   || (holidayCheck(DateCalc($baseDate,"+5 days")) eq "New Year's Day")
   || (holidayCheck(DateCalc($baseDate,"+6 days")) eq "New Year's Day")
  ){print OUTFILE "1,"} else {print OUTFILE "0,"};
  if( holidayCheck(DateCalc($baseDate,"+0 days")) eq "Martin Luther King, Jr. Day"
   || holidayCheck(DateCalc($baseDate,"+1 days")) eq "Martin Luther King, Jr. Day"
   || holidayCheck(DateCalc($baseDate,"+2 days")) eq "Martin Luther King, Jr. Day"
   || holidayCheck(DateCalc($baseDate,"+3 days")) eq "Martin Luther King, Jr. Day"
   || holidayCheck(DateCalc($baseDate,"+4 days")) eq "Martin Luther King, Jr. Day"
   || holidayCheck(DateCalc($baseDate,"+5 days")) eq "Martin Luther King, Jr. Day"
   || holidayCheck(DateCalc($baseDate,"+6 days")) eq "Martin Luther King, Jr. Day"
  ){print OUTFILE "1,"} else {print OUTFILE "0,"};
  if( holidayCheck(DateCalc($baseDate,"+0 days")) eq "Presidents' Day"
   || holidayCheck(DateCalc($baseDate,"+1 days")) eq "Presidents' Day"
   || holidayCheck(DateCalc($baseDate,"+2 days")) eq "Presidents' Day"
   || holidayCheck(DateCalc($baseDate,"+3 days")) eq "Presidents' Day"
   || holidayCheck(DateCalc($baseDate,"+4 days")) eq "Presidents' Day"
   || holidayCheck(DateCalc($baseDate,"+5 days")) eq "Presidents' Day"
   || holidayCheck(DateCalc($baseDate,"+6 days")) eq "Presidents' Day"
  ){
[R] lm() and factors appending
How, for the love of god, can I prevent the lm() function from padding on to my factor variables? I start out with 2 tables:

Table1
123123
124351
...
626773

Table2
Count,IS_DEAD,IS_BURNING
1231,T,F
4521,F,T
...
3321,T,T

Everything looks fine when I import the data. Then we get a

oh_crap <- lm(table1 ~ Count + IS_DEAD + IS_BURNING, table2)

Magically, when I look at my oh_crap coefficients they get turned into Count, IS_DEADTRUE, IS_BURNINGTRUE. I get that it finds them to be factors, but how in the name of all that is holy do I prevent it from doing that? Later, after a stepwise removal, I go into one of the models, grab which coefficients were kept (IS_BURNINGTRUE now) and read future regressors from the original table, which reads IS_BURNING rather than IS_BURNINGTRUE. Since there is a mix of numeric and dummy regressors I cannot selectively append TRUE to variable names, as I don't have control of what regressors get imported (one month there might be 50 and another month 2).

How can I stop lm() from padding onto a coefficient's name? I have no objection to post-processing by finding and assassinating any name with the word TRUE appended to the end, but in that case how do I change the coefficient name?

Maddening... who thought it would be a good idea to append things without asking? A simple lm(appendFactor=FALSE) would have been nice... took me 3 hours to find out what was going wrong on this.

Grabbing coffee, a carton of cigs, and heading outside to smash my head with a brick to make the hurting stop.
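Rather than renaming coefficients, one option is to ask the fitted model which original variables it used. Each column of the design matrix carries an "assign" index that points back at the term that produced it, so no string surgery on TRUE suffixes is needed. The data below are invented stand-ins for the two tables in the post.

```r
# Invented stand-in for Table2 and the response in Table1.
table2 <- data.frame(
  Count      = c(1231, 4521, 3321, 2750, 1980, 4100),
  IS_DEAD    = c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE),
  IS_BURNING = c(FALSE, TRUE, TRUE, FALSE, FALSE, TRUE)
)
table1 <- c(123123, 124351, 626773, 300100, 250200, 410300)

fit <- lm(table1 ~ Count + IS_DEAD + IS_BURNING, data = table2)
coefNames <- names(coef(fit))   # the padded names seen in the post

# Map each design-matrix column back to its term (0 is the intercept).
term_names <- attr(terms(fit), "term.labels")
assign_idx <- attr(model.matrix(fit), "assign")
kept_vars  <- unique(term_names[assign_idx[assign_idx > 0]])
kept_vars

# The kept names can index the original table directly.
future_regressors <- table2[, kept_vars, drop = FALSE]
```

The same two lines applied to the result of step() should give the names of the regressors that survived, in the spelling used by the original table, as long as the model contains main effects only.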
[R] Drop a part of an array\list\vector?
I did have a verbose description of why, but rather than make everyone's eyes bleed with useless details I ask the following :) To make a long story short: how can I make newmcReg[[i]]["PreIO308"] go away in the following list... er vector... no wait array dataframe awww crap...

summary(newmcReg[[i]])

 UNITBUILD        UNITDB           ITBUILD          ITDB
 Mode :logical    Mode :logical    Mode :logical    Mode :logical
 FALSE:249        FALSE:249        FALSE:249        FALSE:249
 TRUE :21         TRUE :21         TRUE :21         TRUE :21

 UATBUILD         UATDB            HOGANCODE        ACF
 Mode :logical    Mode :logical    Mode :logical    Mode :logical
 FALSE:250        FALSE:250        FALSE:208        FALSE:225
 TRUE :20         TRUE :20         TRUE :62         TRUE :45

 RCF              ReleaseST1       ReleaseST2       ReleaseBLA
 Mode :logical    Mode :logical    Mode :logical    Mode :logical
 FALSE:186        FALSE:167        FALSE:157        FALSE:228
 TRUE :84         TRUE :103        TRUE :113        TRUE :42

 MonthlyST1       MonthlyST2       MonthlyBLA       Small.Bank.Acquisitions
 Mode :logical    Mode :logical    Mode :logical    Min.   :0.
 FALSE:107        FALSE:105        FALSE:147        1st Qu.:0.
 TRUE :163        TRUE :165        TRUE :123        Median :0.
                                                    Mean   :0.1556
                                                    3rd Qu.:0.
                                                    Max.   :1.

 Conversions      Build.New.Environment  HLY.NewYear      HLY.MLK
 Min.   :0.0      Mode :logical          Mode :logical    Mode :logical
 1st Qu.:0.0      FALSE:262              FALSE:266        FALSE:264
 Median :0.0      TRUE :8                TRUE :4          TRUE :6
 Mean   :0.08889
 3rd Qu.:0.0
 Max.   :1.0

 HLY.PRES         HLY.MEMORIAL     HLY.J4           HLY.LABOR
 Mode :logical    Mode :logical    Mode :logical    Mode :logical
 FALSE:264        FALSE:265        FALSE:265        FALSE:265
 TRUE :6          TRUE :5          TRUE :5          TRUE :5

 HLY.COLUMBUS     HLY.VETS         HLY.THANKS       HLY.XMAS
 Mode :logical    Mode :logical    Mode :logical    Mode :logical
 FALSE:265        FALSE:265        FALSE:265        FALSE:265
 TRUE :5          TRUE :5          TRUE :5          TRUE :5

 HLY.ELECT        HLY.PATRIOT      EOM              NEWMF
 Mode :logical    Mode :logical    Mode :logical    Mode :logical
 FALSE:265        FALSE:265        FALSE:210        FALSE:263
 TRUE :5          TRUE :5          TRUE :60         TRUE :7

 PreIO47          PreIO151         PreIO164         PreIO169
 Mode :logical    Mode :logical    Mode :logical    Mode :logical
 FALSE:269        FALSE:269        FALSE:269        FALSE:269
 TRUE :1          TRUE :1          TRUE :1          TRUE :1

 PreIO197         PreIO203         PreIO209         PreIO241
 Mode :logical    Mode :logical    Mode :logical    Mode :logical
 FALSE:269        FALSE:269        FALSE:269        FALSE:269
 TRUE :1          TRUE :1          TRUE :1          TRUE :1

 PreIO261         PreIO308
 Mode :logical    Mode :logical
 FALSE:269        FALSE:270
 TRUE :1

(the PreIO are outliers identified from the output of stl(), e.g. outliers in the source data)
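Since the object is a data frame, a column can be removed by name in a couple of ways. A minimal sketch with a small invented data frame standing in for newmcReg[[i]]:

```r
# Small made-up data frame standing in for newmcReg[[i]].
regs <- data.frame(
  EOM      = c(TRUE, FALSE, FALSE, TRUE),
  PreIO261 = c(FALSE, TRUE, FALSE, FALSE),
  PreIO308 = c(FALSE, FALSE, FALSE, FALSE)
)

# Assigning NULL removes a column by name ...
dropped1 <- regs
dropped1[["PreIO308"]] <- NULL

# ... and so does selecting every name except the unwanted one.
dropped2 <- regs[, setdiff(names(regs), "PreIO308"), drop = FALSE]

names(dropped1)
```

Inside the list of data frames the first form would read newmcReg[[i]][["PreIO308"]] <- NULL.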
[R] Getting a date out of an indice in a time series
I have a weekly data set imported via:

tsSource=ts(sh1$I000,start=c(2004,1),freq=52)

I am now getting to some 'spit and polish' but I have run into something I can't wrap my head around. Given an outlier I find at, say, tsSource[54]... how can I translate index 54 into the date\week? I can figure out, obviously, that entry 52 is the last week of 2004, but since the data goes on for many years, week 251 is a tad bit tricky because some years have 53 weeks. Is there a function that I am missing that gives me a date (or even a Julian week #) from a time series object? Since I am generating graphs that should read "Week starting xxx/xxx/xxx" I was hoping there was a systematic way to retrieve the date from the time series object. Any suggestions?
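A ts object only stores a start, an end and a frequency, so time() and cycle() can give the fractional year and the within-year slot, but not a calendar date. If the real date of the first observation is known, the dates can be rebuilt alongside the series. A sketch with simulated data; the first-week date 2004-01-05 is purely an assumption for illustration:

```r
# Stand-in weekly series starting in week 1 of 2004, as in the post.
tsSource <- ts(rnorm(260), start = c(2004, 1), frequency = 52)
idx <- 54

# Fractional-year position and within-year slot of observation 54.
pos_year <- time(tsSource)[idx]
pos_week <- cycle(tsSource)[idx]

# Rebuild real dates by stepping in weeks from the (assumed) first week,
# which sidesteps the 53-week-year problem entirely.
first_week  <- as.Date("2004-01-05")   # assumption, not from the post
week_starts <- seq(first_week, by = "week", length.out = length(tsSource))
label <- format(week_starts[idx], "Week starting %m/%d/%Y")
label
```

Keeping week_starts as a plain Date vector next to the ts object makes it easy to label any index found by an outlier check.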
[R] [Solved][Code Snippets] Dropping Empty Regressors
To make a long story short, I was doing some in-sample testing in which some dynamically created regressors would end up either all true or all false based on the validation portion. In my case it was a new mainframe configuration (this is a crappy way to handle a level shift, but I do what I can). So here is the code snippet that finally let me pre-check my regressors and drop any of them that were all true or all false.

First, the automagic STL outlier grabber that caused part of the problem:

# tsSource being my time series source.
# sh2 is a table of all my regressors that have been previously pulled in;
# this has historic and future values in it also, it gets sliced later.
# EOM is the regressor holding weeks that contain an 'End of Month'.
#
# This appends the found IOs to the regressor table. Stepwise tends to
# remove them later on. I needed a programmatic way of removing useless
# regressors for model verification since I would not know their names
# if any are found.
tsSourceDiag <- stl(tsSource,s.window="per", robust=TRUE)
#
tsSourceIO <- which(tsSourceDiag $ weights < 1e-8)
#
# This is how to append run-time regressors
for(z in tsSourceIO) {
  tmpname <- paste("PreIO",z,sep="")
  #COPY EOM AS A TEMPLATE
  sh2[[tmpname]] <- sh2[["EOM"]]
  #SET IT ALL TO FALSE
  sh2[[tmpname]][] <- FALSE
  #SET THE PROPER INDEX TO TRUE
  sh2[[tmpname]][z] <- TRUE
}

So to get rid of them (those empty, useless regressors) I cooked up this:

###
# Prune Empty Regressors (all FALSE or all TRUE)
# The newmcReg you see is a copy of the sh2 from earlier.
# newmcReg = New Model Current Regressors
# sh2 later became cReg.
#
# Yes, it makes my eyes bleed. In short, we count all the TRUEs
# and all the FALSEs, and if either count equals the length
# we know the column is all TRUE or all FALSE.
#
# The trick I finally found was that you can in fact -c()
# a list (i.e. ask for everything but the following), but you
# apparently can't do that inline, so we just make a list of
# regressors that get shown the door, then after hunting
# them down we give 'em the boot. This mess is solely
# so my in-sample Arima doesn't choke on xreg=newmcReg
# when one of the newmcReg columns happens to be all TRUE or all FALSE.
#
# God, I wish I had taken more than a Trig course. Where was I?
#
# Yes, that phantom 'i' you see is because this is all in a big loop
# over 6 possible models:
# lm1 = all regressors w/ intercept
# lm2 = lm1 stepwise removal
# lm3 = all regressors w/o intercept
# lm4 = lm3 stepwise removal
# lm5 = hand tuned
# lm6 = lm5 stepwise removal
###
toPurge=c()
for(k in names(newmcReg[[i]])) {
  print(paste("check to see if",k,"is a useless regressor for model",i))
  if(sum(newmcReg[[i]][k][,1])==length(newmcReg[[i]][k][,1])) {
    print(paste("All of",k,"are TRUE"))
    getLost=which(names(newmcReg[[i]])==k)
    toPurge=c(toPurge,getLost)
    print(paste(k, "has been added to the purge list for model", i,"!"))
  }
  if(sum(newmcReg[[i]][k][,1]==FALSE)==length(newmcReg[[i]][k][,1])) {
    print(paste("All of",k,"are FALSE"))
    getLost=which(names(newmcReg[[i]])==k)
    toPurge=c(toPurge,getLost)
    print(paste(k, "has been added to the purge list for model", i,"!"))
  }
}
toPurge
# Do this only if there are any, or R will beat you senseless and
# steal all your M&Ms!
if(length(toPurge)!=0) {
  names(newmcReg[[i]])
  names(newmcReg[[i]][-c(toPurge)])
  newmcReg[[i]] <- newmcReg[[i]][-c(toPurge)]
  newmfReg[[i]] <- newmfReg[[i]][-c(toPurge)]
  names(newmcReg[[i]])
}
##
# End Regressor Pruning
##

Big thanks for the help so far. Now about those darn transfer functions... hmm, and pulse detection...
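For comparison, the same pruning can probably be written more compactly by building a logical "keep" mask instead of a list of indices to drop. This is only a sketch on a toy data frame; the helper name isVarying and the example columns are made up for illustration, and it assumes the regressors live in a data frame as in the post.

```r
# Hypothetical helper: TRUE for columns that take more than one distinct value.
isVarying <- function(df) {
  vapply(df, function(col) length(unique(col)) > 1, logical(1))
}

# Toy stand-ins for the current (in-sample) and future regressor tables.
curReg <- data.frame(
  EOM     = c(TRUE, FALSE, TRUE, FALSE),
  PreIO47 = rep(FALSE, 4),  # all FALSE inside the in-sample window
  NEWMF   = rep(TRUE, 4)    # all TRUE inside the in-sample window
)
futReg <- curReg

# Decide on the in-sample table, then apply the same mask to both tables
# so their columns stay aligned.
keep   <- isVarying(curReg)
curReg <- curReg[, keep, drop = FALSE]
futReg <- futReg[, keep, drop = FALSE]

print(names(curReg))
```

A logical mask sidesteps the empty-purge-list problem mentioned above: negative indexing with a zero-length vector selects no columns, while an all-TRUE mask simply keeps everything.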
[R] Trouble Highlighting outliers on Time Series Plot
I am having trouble plotting outliers on a time series. Given the following code:

# Find STL outliers by weight and append to sh2, using robust fitting.
# This should allow the initial outliers to be filtered.
# This section may be commented out.
tsSourceDiag <- stl(tsSource,s.window="per", robust=TRUE)
#
tsSourceIO <- which(tsSourceDiag $ weights < 1e-8)
#
# This is how to append run-time regressors
for(z in tsSourceIO) {
  tmpname <- paste("PreIO",z,sep="")
  #COPY EOM REGRESSOR AS A TEMPLATE
  sh2[[tmpname]] <- sh2[["EOM"]]
  #SET IT ALL TO FALSE
  sh2[[tmpname]][] <- FALSE
  #SET THE PROPER INDEX TO TRUE
  sh2[[tmpname]][z] <- TRUE
}

OK, so I have a time series, tsSource. I pull out the index of each low-weight point in tsSourceDiag and append a regressor to an existing list of regressors; each new regressor is all FALSE except for a single TRUE at the index of the suspected outlier.

I decided that a plot of the time series as points was in order and thought, "Hey, I should really fill the circle that is considered an outlier in red, so I can eyeball the graph to see if that is indeed an outlier needing agent Fox and Scully to investigate" (yes, my later list of outliers is in fact called "XFILES").

So I am like, BOOM! plot(tsSource) and points(tsSource[tsSourceIO]). Nada. A plot of tsSource[tsSourceIO] reveals a hint of what is wrong: tsSource is a time series with date info, while tsSource[tsSourceIO] is just a plain vector with no proper alignment with the cosmic universe... errr... I mean the time axis.

Anyone have some sweet voodoo on how to get a proper time series plot while overplotting various indices (e.g. tsSeries[c(1,22,11,61)])?
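A sketch of one way around this, assuming the only problem is the lost x coordinates: subsetting a ts with [ ] drops the time attributes, so points() needs the matching values of time(tsSource) passed explicitly as x. The series and the outlier indices below are synthetic placeholders rather than real stl() output.

```r
# Synthetic stand-in for the poster's weekly series.
set.seed(3)
tsSource <- ts(rnorm(104, mean = 100, sd = 5), start = c(2004, 1), frequency = 52)

# Placeholder outlier indices (in the post these come from the stl() weights).
tsSourceIO <- c(1, 22, 11, 61)

# Look up the x (time) and y (value) coordinates for those indices.
outlierTimes  <- time(tsSource)[tsSourceIO]
outlierValues <- as.numeric(tsSource)[tsSourceIO]

# Plot the series as points, then overplot the suspects as filled red circles.
plot(tsSource, type = "p")
points(outlierTimes, outlierValues, col = "red", pch = 19)
```

The key design point is that both vectors are indexed by the same tsSourceIO, so each red point lands exactly on top of the observation it flags.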
[R] Automagic Outlier Dummy Variables
10/06/08 0 0 0
10/13/08 0 0 0
10/20/08 0 0 0
10/27/08 0 0 0
11/03/08 1 0 0
11/10/08 0 1 0
11/17/08 0 0 1
11/24/08 0 0 0
12/01/08 0 0 0
12/08/08 0 0 0
12/15/08 0 0 0
12/22/08 0 0 0
12/29/08 0 0 0
01/05/09 0 0 0
01/12/09 0 0 0
01/19/09 0 0 0
01/26/09 0 0 0
02/02/09 1 0 0
02/09/09 0 1 0
02/16/09 0 0 1
02/23/09 0 0 0
03/02/09 0 0 0
03/09/09 0 0 0
03/16/09 0 0 0
03/23/09 0 0 0
03/30/09 0 0 0
04/06/09 0 0 0
04/13/09 0 0 0
04/20/09 0 0 0
04/27/09 0 0 0
05/04/09 1 0 0
05/11/09 0 1 0
05/18/09 0 0 1
05/25/09 0 0 0
06/01/09 0 0 0
06/08/09 0 0 0
06/15/09 0 0 0
06/22/09 0 0 0
06/29/09 0 0 0
07/06/09 0 0 0
07/13/09 0 0 0
07/20/09 0 0 0
07/27/09 0 0 0
08/03/09 1 0 0
08/10/09 0 1 0
08/17/09 0 0 1
08/24/09 0 0 0
08/31/09 0 0 0
09/07/09 0 0 0
09/14/09 0 0 0
09/21/09 0 0 0
09/28/09 0 0 0
10/05/09 0 0 0
10/12/09 0 0 0
10/19/09 0 0 0
10/26/09 0 0 0
11/02/09 0 0 0
11/09/09 0 0 0
11/16/09 0 0 0
11/23/09 0 0 0
11/30/09 0 0 0
12/07/09 0 0 0

Now I want to find outliers in the source data that are X standard deviations out and generate a dummy variable for each. So rather than having cReg[UnitBuild,UnitDB,ITBuild], I would also want (say there was an outlier on 11/27/06) to generate cReg[UnitBuild,UnitDB,ITBuild,112706], with the 11/27/06 entry of the new column set to 1.

I hope this makes sense, but I am having a heck of a time wrapping my head around generating an additional column in R, naming it after the date of the outlier, and then generating the appropriate sequence for the dummy variable.

Idgarad

P.S. Anyone know how to generate a holiday dummy series aligned to weekly samples? (E.g. week 52 has Christmas, so get a column called Christmas with every week 52 entry set to 1.)
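A minimal sketch of the date-named dummy columns described above, on synthetic data. The z-score rule, the threshold X = 3, the data values and the week dates are all assumptions for illustration; a plain mean/sd rule is also sensitive to the outliers it is hunting, so a robust measure may work better in practice.

```r
set.seed(42)
n <- 60

# Synthetic weekly data with two planted outliers (rows 12 and 48).
dates <- seq(as.Date("2006-01-02"), by = "week", length.out = n)
usage <- rnorm(n, mean = 100, sd = 5)
usage[c(12, 48)] <- c(160, 30)

# Existing regressor table, same row order as the data.
cReg <- data.frame(UnitBuild = rep(0, n), UnitDB = rep(0, n), ITBuild = rep(0, n))

# Flag observations more than X standard deviations from the mean.
X <- 3
z <- (usage - mean(usage)) / sd(usage)
outlierIdx <- which(abs(z) > X)

# One new column per outlier, named after its date (mmddyy), all 0 except
# a single 1 in the outlier's row.
for (idx in outlierIdx) {
  colName <- format(dates[idx], "%m%d%y")
  cReg[[colName]] <- as.integer(seq_len(n) == idx)
}

print(names(cReg))
```

Column names that start with a digit are legal but non-syntactic, so they need cReg[["032006"]] or backticks to access; prefixing them with a letter avoids that.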
[R] Help fCalendar holidayNYSE for Regressors
I am working with weekly time series data, as in:

tsData=ts(data,start=c(2004,1),freq=52)

I have a matching table of regression variables called cReg (loaded from an xls sheet). I would like to append to the cReg table dummy variables for all the holidays as calculated by the fCalendar package (for instance Easter, Christmas, Memorial Day, etc.).

The problem I am running into is how to turn the data I can get from the fCalendar package into something useful for time series analysis: my sampled data, tsData, is weekly, but the holiday functions give me raw dates. In short, I need to be able to answer "Does this week have Christmas in it? Does this week contain Memorial Day?", etc.

Any ideas?
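The "does this week contain holiday X" question can be sketched in base R once the holiday dates are available as a Date vector. The hard-coded Christmas dates below stand in for whatever the holiday functions return, and the first-week date (2004-01-05) and the helper name weekHasHoliday are assumptions for illustration.

```r
# Assumed start date of each weekly observation (270 consecutive weeks).
weekStart <- seq(as.Date("2004-01-05"), by = "week", length.out = 270)

# Stand-in for holiday dates from a calendar package.
xmas <- as.Date(paste0(2004:2008, "-12-25"))

# Hypothetical helper: TRUE when any holiday falls in [weekStart, weekStart + 7).
weekHasHoliday <- function(weekStart, holidays) {
  vapply(seq_along(weekStart), function(i) {
    any(holidays >= weekStart[i] & holidays < weekStart[i] + 7)
  }, logical(1))
}

HLY.XMAS <- weekHasHoliday(weekStart, xmas)

# Weeks flagged as containing Christmas.
print(weekStart[HLY.XMAS])
```

The resulting logical vector has one entry per week, so it can be added to the regressor table as a new column as long as the table rows are in the same week order.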
[R] Programatic Method for Holiday Dummy Variables
I am working on a weekly analysis and would like to look into the effects of holidays. As I only get weekly data, how can I automagically populate a series of dummy variables that are aligned with my data?

--- snip ---
channel1 <- odbcConnectExcel("D:/RSTATS/metrics.xls")
sqlTables(channel1)
sh1 <- sqlFetch(channel1, "Actuals$")
channel2 <- odbcConnectExcel("D:/RSTATS/events.xls")
sqlTables(channel2)
sh2 <- sqlFetch(channel2, "data$")
tsU000=ts(sh1$U000,start=c(2004,1),freq=52)
summary(tsU000)
#Variable
forecastDistance <- 52
#Grab Existing Regressors
cReg <- sh2[1:length(tsU000),-1]
#Grab X Future Regressors equal to the forecastDistance
#(start one row past the history so exactly forecastDistance rows are taken)
fReg <- sh2[(length(tsU000)+1):(length(tsU000)+forecastDistance),-1]
--- snip ---

Somewhere in there I need to append to cReg a series of holiday dummy variables, but here is the catch: it's weekly data. First, how can I get an array of holidays aligned on weeks in the first place? E.g.

week,xmas,newyears,halloween,cinco,...,kwanzaa
1,0,0,0,0,0,...,0
2,0,0,0,0,0,...,0
3,0,0,0,0,0,...,0
4,0,0,0,0,0,...,0
5,0,0,0,0,0,...,0
..
52,1,0,0,0,0,...,0

Then, how do I append the holiday array to the cReg array?

Help, I am confused and bewildered.
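On the second question (appending the holiday array), a sketch of one option is to cbind() the holiday columns onto the full regressor table before slicing it into current and future parts, so both slices get the same columns. The tiny tables, the sizes and the holiday positions below are made up for illustration.

```r
nHist <- 8             # weeks of history (stand-in for length(tsU000))
forecastDistance <- 4  # weeks to forecast
nAll <- nHist + forecastDistance

# Stand-in for the events sheet: first column is the week, the rest are regressors.
sh2 <- data.frame(week = 1:nAll,
                  UnitBuild = rep(c(1, 0, 0, 0), length.out = nAll))

# Week-aligned holiday dummies, one row per week in the same order as sh2.
holidays <- data.frame(xmas     = as.integer(1:nAll == 3),
                       newyears = as.integer(1:nAll == 4))

# Append the holiday columns, then slice history and future from one table.
allReg <- cbind(sh2[, -1, drop = FALSE], holidays)
cReg <- allReg[1:nHist, , drop = FALSE]
fReg <- allReg[(nHist + 1):nAll, , drop = FALSE]

print(dim(cReg))
print(dim(fReg))
```

cbind() only checks row counts, not row meaning, so the holiday table has to cover the history plus the forecast horizon in the same week order as the regressor sheet.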
[R] Proper Usage of the XREG in ARIMA
I am using the auto.arima function (from the forecast package) to do some basic forecasting of CPU usage. I have now found a calendar of various activities that partially control the computer's usage and want to factor that in. (They are effectively dummy variables indicating a particular type of activity that week.) Per the ARIMA instructions, I am to feed those in as a vector or matrix. I am getting lost in the sand, so to speak, at this point. How would I prepare that data?

I am pulling from a CSV that is roughly:

date,usage,allocation,number of engines,theoretical max,r1,r2,...r21

So far so good just working with a copy of the CSV that is just date,usage. But what should I do to dissect the configuration data and the r1 to r21 dummy variables? (Some of these explain certain spikes and level shifts; for instance, r21 indicates whether there was conversion activity during the week.) I never really could figure out in R (I have only been using it a week or so) how to pull out part of an array. Also, should I do my dissection before or after converting it into a ts object?

The short of the script is (removing plots etc.):

--
library(forecast)
baseU000 <- read.csv("testfile.csv",header=TRUE)
#--- hmm what happens in years with a 53rd week...
tsbaseU000 <- ts(baseU000,start=2004,frequency=52)
#--- add regressors
arimafit <- auto.arima(tsbaseU000[,2],approximation=TRUE,stepwise=FALSE)
forecastU000 <- forecast(arimafit,52)
plot(forecastU000)
lines(fitted(arimafit),col=3,lty="dashed")
--

What I am trying to do is build the best educated guess of what the CPU usage is going to be, for some planning. As I control part of the calendar, I need to start working towards the ability to do some "What-If", so I can provide future values for those dummy variables also.

So close, yet so far away. Any suggestions?
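A rough sketch of the data preparation, using synthetic data. To keep it runnable with base R alone, it swaps in stats::arima() with a fixed AR(1) order instead of auto.arima(); the regressor handling is the point here, and auto.arima() likewise takes the historical matrix through its xreg argument. The column names, sizes and effect sizes are made up.

```r
set.seed(7)
n <- 120  # weeks of history
h <- 12   # weeks to forecast ("what-if" horizon)

# Dummy regressors covering history plus the planned future calendar.
allReg <- cbind(r1  = rbinom(n + h, 1, 0.2),
                r21 = rbinom(n + h, 1, 0.1))

# Synthetic usage: base level, regressor effects, AR(1) noise.
noise <- as.numeric(arima.sim(list(ar = 0.5), n))
usage <- 50 + 10 * allReg[1:n, "r1"] + 25 * allReg[1:n, "r21"] + noise
y <- ts(usage, start = c(2004, 1), frequency = 52)

# Pull out parts of the matrix by row range: history for fitting,
# future rows for forecasting. drop = FALSE keeps them as matrices.
xregHist   <- allReg[1:n, , drop = FALSE]
xregFuture <- allReg[(n + 1):(n + h), , drop = FALSE]

# Fit with the historical regressors, forecast with the future ones.
fit <- arima(y, order = c(1, 0, 0), xreg = xregHist)
fc  <- predict(fit, n.ahead = h, newxreg = xregFuture)

print(round(coef(fit), 2))
```

The regressors stay a plain numeric matrix with one row per observation, and only the response is turned into a ts. The future matrix must have the same columns in the same order as the one used for fitting.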