[R] #INCLUDE

2009-07-08 Thread Idgarad
What is R's equivalent to a C-like #include for incorporating external files? I
have a 2k line function that is generated and I need to include it at runtime
without managing it as a package (as it changes hourly). Any ideas?
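
A note on the question above: the closest base R analogue to #include is source(), which parses and evaluates a file of R code at run time, so a regenerated file can simply be re-sourced whenever it changes. A minimal sketch, in which the temporary file and the function name generated_fn are stand-ins invented for illustration:

```r
# Stand-in for the hourly generated file (hypothetical content).
gen_file <- tempfile(fileext = ".R")
writeLines("generated_fn <- function(x) x * 2", gen_file)

# source() evaluates the file in the global environment by default.
source(gen_file)
result <- generated_fn(21)

# To keep generated code out of the workspace, load it into its own environment.
gen_env <- new.env()
sys.source(gen_file, envir = gen_env)
result_env <- gen_env$generated_fn(21)
```

Re-running source() after the file is regenerated replaces the old definition, so no package rebuild is involved.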

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Proper Paste for Data Member

2009-07-14 Thread Idgarad
I imported a spreadsheet into a variable sh

e.g. sh$, sh$, etc...

doing the following:

tsSource <- ts(paste("sh$",NAMEVARIABLE,sep="") ... )

fails. The paste isn't evaluating properly. What is the proper way to
concatenate a data source with a member name such that they evaluate
properly?

actual code below:
doEnv <- function(SOURCEDATA,REGDATA,HOUR,ENVNAME,REPORTNAME) {
print(SOURCEDATA)
print(REGDATA)
print(HOUR)
print(ENVNAME)
print(REPORTNAME)
# blah blah blah ...

#Raw Data
channel1 <- odbcConnectExcel("Q:/metrics.xls")
sqlTables(channel1)
sh1 <- sqlFetch(channel1, "Actuals$")
close(channel1)

# Something here is borked like the Chef himself
tsSource<-ts(paste("sh1$",ENVNAME,sep=""),start=c(2004,1),freq=52)
print(tsSource)
plot(tsSource,col="grey",type="n")
# I use AUTOBOT or DECEPTICON for generic pass fail return values.
# Yes I am a geek...
return("AUTOBOT")
}
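
A note on the code above: paste() only builds the character string "sh1$E081"; it never looks the column up, so ts() receives a one-element string. Indexing the data frame with [[ ]] and the name held in the variable does the lookup, which is also the form a later post in this archive uses. A small sketch with made-up data (the column names and values are invented for illustration):

```r
# Made-up stand-in for the imported spreadsheet.
sh1 <- data.frame(E081 = c(5, 6, 7, 8), I000 = c(1, 2, 3, 4))
ENVNAME <- "E081"

# paste() yields text, not the column it names.
as_text <- paste("sh1$", ENVNAME, sep = "")

# [[ ]] accepts a character value and returns the column itself.
tsSource <- ts(sh1[[ENVNAME]], start = c(2004, 1), frequency = 52)
```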



[R] Spaces in a name

2009-07-15 Thread Idgarad
I am reading regressors from an excel file (I have no control over the file)
and some of the element names have spaces:

i.e. "Small Bank Aquired"

but I have found that lm(SourceData ~ . - "Small Bank Aquired", mcReg)
doesn't work (mcReg = modelCurrentRegressors)

As they are toggles I have run them through factor() so they are treated
properly as 0 or 1, but because I am automatically grabbing the first 2/3rds
of the data, some of the regressors are either all 0s or all 1s. So I need
to take them out of the model by hand for now until I find a nice automatic
method for removing regressors that only have 1 factor level.

So Primarily: how do I handle names that include spaces in this context and
as a bonus: Anyone have a nice method for yanking regressors that only have
a single factor in them from the lm() function?


e.g. (for the following 30 elements)
0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,
1,1,1,1,1,1,1,1,1,1

As you can see grabbing the first 2/3rds is all 0s and the last 1/3rd is all
ones (doing in-sample forecast diagnostic building the model only on the
first 2/3rds of data, then forecasting the next 1/3rd and comparing.)

Sorry if I am rambling a bit, still on cup of coffee #1...
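
A note on both questions above. One option for the names is make.names(), which turns spaces into dots so the columns can be used in a formula directly (a single non-syntactic name can also be written in backticks inside a formula). For the bonus question, counting distinct values per column before fitting identifies the regressors that never vary. A minimal sketch with invented data; only the column name "Small Bank Aquired" comes from the post:

```r
# Invented training slice: the first column is constant, as in the post.
regs <- data.frame(
  "Small Bank Aquired" = c(0, 0, 0, 0, 0, 0),
  "Release"            = c(0, 1, 0, 1, 1, 0),
  "EOM"                = c(1, 0, 0, 1, 0, 1),
  check.names = FALSE
)
y <- c(10, 14, 9, 15, 13, 11)

# Make the names syntactic: "Small Bank Aquired" becomes "Small.Bank.Aquired".
names(regs) <- make.names(names(regs))

# Keep only the columns that take more than one distinct value.
varies <- vapply(regs, function(col) length(unique(col)) > 1, logical(1))
usable <- regs[, varies, drop = FALSE]

# Convert the surviving toggles to factors and fit.
usable[] <- lapply(usable, factor)
fit <- lm(y ~ ., data = usable)
```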



[R] storing lm() results and other objects in a list

2009-07-15 Thread Idgarad
To clean up some code I would like to make a list of arbitrary length
to store various objects for use in a loop.

sample code:


## BEGIN SAMPLE ##
# You can see the need for a loop already
linearModel1=lm(modelSource ~ .,mcReg)
linearModel2=step(linearModel1)
linearModel3=lm(modelSource ~ .-1,mcReg)
linearModel4=step(linearModel3)
#custom
linearModel5=lm(modelSource ~ . -ACF-MonthlyST1-MonthlyST2-MonthlyBLA,mcReg)

LinearModel1.res <- residuals(linearModel1)
LinearModel2.res <- residuals(linearModel2)
LinearModel3.res <- residuals(linearModel3)
LinearModel4.res <- residuals(linearModel4)
LinearModel5.res <- residuals(linearModel5)

#hmmm bolt on linearModel[x] as linearModel[x]$arma.fit?
arma1.fit <- auto.arima(LinearModel1.res)
arma2.fit <- auto.arima(LinearModel2.res)
arma3.fit <- auto.arima(LinearModel3.res)
arma4.fit <- auto.arima(LinearModel4.res)
arma5.fit <- auto.arima(LinearModel5.res,stepwise=T,trace=T)

#Ok what is left over after Regression and ARIMA that cannot
#be explained. Stupid outliers
#AO's can be added to the cReg as a normal dummy variable
# but these are AOs from the model not the original data.
# is it better to handle AOs from the original data?

#linearModel[x]arma.ao?
arma1.ao <- detectAO(arma1.fit)
arma2.ao <- detectAO(arma2.fit)
arma3.ao <- detectAO(arma3.fit)
arma4.ao <- detectAO(arma4.fit)
arma5.ao <- detectAO(arma5.fit)

#What do I do with an innovative outlier? Transfer function or what?
#auto.arima doesn't handle the IO=c(...) stuff Umm...
#transfer functions, etc. are a deficiency in the script at this point

#linearModel[x]arma.io?
arma1.io <- detectIO(arma1.fit)
arma2.io <- detectIO(arma2.fit)
arma3.io <- detectIO(arma3.fit)
arma4.io <- detectIO(arma4.fit)
arma5.io <- detectIO(arma5.fit)

#Sample on how to auto-grab regressors from DetectAO and DetectIO and
#append them to our regression array. You'd have to do this for each model
#as the residuals are where the outliers are coming from and diff models
#would have different residuals left over. IO is best left to arimax functions
#directly. I assume at this point that AO's can be added to Regression tables
#if that is the case then REM out the IO lines and pass the detectIO results

#into the arimax(x,y,z,IO=detectIO(blah))

#
# Need a better understanding of how to address the AO and IO's in
# this script before implementing them
# (Repeat for each model, cReg1,cReg2,etc..)
#
#cReg1=cReg
#fReg1=fReg
#for(i in arma1.io$ind){ print(i);cReg1[,paste(sep=" ","IO",i)]=1*(seq(cReg1[,2])==i)}
#for(i in arma1.ao$ind){ print(i);cReg1[,paste(sep=" ","AO",i)]=1*(seq(cReg1[,2])==i)}
#for(i in arma1.io$ind){ print(i);fReg1[,paste(sep=" ","IO",i)]=1*(seq(fReg1[,2]))}
#for(i in arma1.ao$ind){ print(i);fReg1[,paste(sep=" ","AO",i)]=1*(seq(fReg1[,2]))}


#Get the pdq,PDQs into a variable so we can re-feed it if necessary
#oh crap absorbing this into LinearModel[x] looks ugly for syntax
arma1.fit$order=c(arma1.fit$arma[1],arma1.fit$arma[2],arma1.fit$arma[6])
arma2.fit$order=c(arma2.fit$arma[1],arma2.fit$arma[2],arma2.fit$arma[6])
arma3.fit$order=c(arma3.fit$arma[1],arma3.fit$arma[2],arma3.fit$arma[6])
arma4.fit$order=c(arma4.fit$arma[1],arma4.fit$arma[2],arma4.fit$arma[6])
arma5.fit$order=c(arma5.fit$arma[1],arma5.fit$arma[2],arma5.fit$arma[6])

arma1.fit$seasonal=c(arma1.fit$arma[3],arma1.fit$arma[4],arma1.fit$arma[7])
arma2.fit$seasonal=c(arma2.fit$arma[3],arma2.fit$arma[4],arma2.fit$arma[7])
arma3.fit$seasonal=c(arma3.fit$arma[3],arma3.fit$arma[4],arma3.fit$arma[7])
arma4.fit$seasonal=c(arma4.fit$arma[3],arma4.fit$arma[4],arma4.fit$arma[7])
arma5.fit$seasonal=c(arma5.fit$arma[3],arma5.fit$arma[4],arma5.fit$arma[7])

#these two are used for linearModel2 and linearModel4. Get only the
#regressors that survived step removal.
newcReg=cReg[match(names(linearModel2$coeff[-1]),names(cReg))]
newfReg=fReg[match(names(linearModel2$coeff[-1]),names(fReg))]
newmcReg=mcReg[match(names(linearModel2$coeff[-1]),names(mcReg))]
newmfReg=mfReg[match(names(linearModel2$coeff[-1]),names(mfReg))]

#Scenario 1 - All Regressors Left In
newFit1.b <- 
Arima(modelSource,order=arma1.fit$order,seasonal=list(order=arma1.fit$seasonal),xreg=mcReg,include.drift=F)

#Scenario 2 - Step Removal of Regressors
newFit2.b <- 
Arima(modelSource,order=arma2.fit$order,seasonal=list(order=arma2.fit$seasonal),xreg=newmcReg,include.drift=F)

#Scenario 3 - All Regressors Left In with Intercept Removed
newFit3.b <- 
Arima(modelSource,order=arma3.fit$order,seasonal=list(order=arma3.fit$seasonal),xreg=mcReg,include.drift=F)

#Scenario 4 - Step Removal of Regressors with Intercept Removed (I
#have a feeling this is identical to #2 in results)
newFit4.b <- 
Arima(modelSource,order=arma4.fit$order,seasonal=list(order=arma4.fit$seasonal),xreg=newmcReg,include.drift=F)

#Scenario 5 - Robust1, For giggles and grins for now
newFit5.b <- 
Arima(modelSource,order=arma5.fit$order,seasonal=list(order=arma5.fit$seasonal),xreg=newmcReg,include.drift=F)

#All the 

[R] list of lm() results

2009-07-21 Thread Idgarad
How can I get the results of lm() into a list so I can loop through the results?

e.g.

myResults[1] <- lm(...)
myResults[2] <- lm(...)
myResults[3] <- lm(...)
...
myResults[15] <- lm(...)
myResults[16] <- lm(...)

so far every attempt I've tried doesn't work throwing a "number of
items to replace is not a multiple of replacement length" error or
simply not working.
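
A note on the error above: single brackets assign into a slice of the list, and an lm object is itself a list of a dozen components, which is why R complains about replacement length. Double brackets assign one whole object to one slot. A short sketch using the built-in mtcars data as a stand-in for the real models:

```r
# Start from an empty list and fill slots with [[ ]].
myResults <- list()
myResults[[1]] <- lm(mpg ~ wt, data = mtcars)
myResults[[2]] <- lm(mpg ~ wt + hp, data = mtcars)
myResults[[3]] <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# Loop over the fitted models, or summarise them all at once.
for (fit in myResults) print(coef(fit))
r_squared <- sapply(myResults, function(fit) summary(fit)$r.squared)
```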



[R] Odd coefficent behavior

2009-07-21 Thread Idgarad
Why are my coefficients getting appended with a 1? It borks a match I
do later against the original list that doesn't have the random 1
added to the end.

> linearModel[[1]]

Call:
lm(formula = modelSource ~ +UNITBUILD + UNITDB + ITBUILD + ITDB +
UATBUILD + UATDB + HOGANCODE + RCF + ReleaseST1 + ReleaseST2 +
ReleaseBLA + Small.Bank.Acquisitions + HLY.NewYear + HLY.MLK +
HLY.PRES + HLY.MEMORIAL + HLY.J4 + HLY.LABOR + HLY.COLUMBUS +
HLY.VETS + HLY.THANKS + HLY.XMAS + HLY.ELECT + HLY.PATRIOT + EOM,
data = mcReg)

Coefficients:
               (Intercept)                UNITBUILD1                   UNITDB1
                  405.8326                   -8.5675                   13.5029
                  ITBUILD1                     ITDB1                 UATBUILD1
                   33.0950                   -6.1938                    0.2625
                    UATDB1                HOGANCODE1                      RCF1
                   -3.7793                   -3.4825                    5.3243
               ReleaseST11               ReleaseST21               ReleaseBLA1
                   13.6911                   -9.4573                   -3.3526
  Small.Bank.Acquisitions1              HLY.NewYear1                  HLY.MLK1
                   36.6445                  -92.5360                   22.1168
                 HLY.PRES1             HLY.MEMORIAL1                   HLY.J41
                    7.1886                  -13.0013                  -14.3520
                HLY.LABOR1             HLY.COLUMBUS1                 HLY.VETS1
                   -0.9740                  -16.9177                   16.2969
               HLY.THANKS1                 HLY.XMAS1                HLY.ELECT1
                  -15.9056                  -65.9887                  -10.9916
              HLY.PATRIOT1                      EOM1
                  -20.2531                   15.4775

Now all the variables with a 1 appended are factors so is that normal
behavior? (if so then I can adjust the match() command to pad a 1 to
the master list.)
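
A note on the output above: this is normal. For a factor, lm() names each coefficient as the variable name followed by the level it represents, so a 0/1 factor yields names ending in 1. Rather than padding the master list, the original variable names can be read from the model's terms. A small sketch with invented data:

```r
# Invented data with two 0/1 factors.
d <- data.frame(
  y   = c(5, 7, 6, 9, 8, 10, 7, 12),
  EOM = factor(c(0, 1, 0, 1, 0, 1, 0, 1)),
  RCF = factor(c(0, 0, 1, 1, 0, 0, 1, 1))
)
fit <- lm(y ~ EOM + RCF, data = d)

# Coefficient names carry the factor level as a suffix.
coef_names <- names(coef(fit))

# The term labels are the variable names as written in the formula.
term_names <- attr(terms(fit), "term.labels")
```

Matching term_names against the master list avoids any string surgery on the coefficient names.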



[R] Basic Question (Real Basic)

2010-04-01 Thread Idgarad
I am having a total brain fart... complete and total. This is part R, Part
Statistics, Part "My Brain is on vacation apparently."

Ok I have a time series I need to LOG and DIFF for ARIMA with Regressors.
Say 100 data points.
Obviously when I diff the series once I get 99 data points now. So what is
the appropriate way to handle that now?

Part B (This is where I am having a fundamental brain fart). Given that I
have Logged and Diff'ed the original Time Series and I want to get a
forecast, how do I apply that back to my original data? I am having a mental
implosion right now (Just got back from vacation.) I am just not wrapping my
head around forecasting in R against transformed data (For stationary
purposes).

Total brain meltdown ( grilled cheese sounds good...) Anyone have a
basic shell script I can look at for reference... not connecting dots
today



Idgarad
-- 
"Who is John Galt?"
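
A note on both parts of the question above. One way to sidestep the lost observation is to log the series yourself and let the model do the differencing through its order argument; the forecasts then come back on the log scale and exp() returns them to the original units. The sketch below uses base R's arima() on a simulated series; the data, the (1,1,0) order, and the 1.96 multiplier for an approximate 95% interval are all illustrative assumptions:

```r
set.seed(1)
# Simulated positive series of 100 points (illustrative only).
y  <- ts(100 * exp(cumsum(rnorm(100, mean = 0.01, sd = 0.05))))
ly <- log(y)

# Differencing by hand drops one observation per difference.
dly <- diff(ly)

# The transformation is reversible: undo the diff, then undo the log.
rebuilt <- exp(diffinv(dly, xi = ly[1]))

# Let arima() difference internally (d = 1) and forecast the logged series.
fit <- arima(ly, order = c(1, 1, 0))
fc  <- predict(fit, n.ahead = 10)

# Back-transform the point forecasts and interval bounds to original units.
point <- exp(fc$pred)
lower <- exp(fc$pred - 1.96 * fc$se)
upper <- exp(fc$pred + 1.96 * fc$se)
```

When regressors are supplied through xreg, arima() applies the same differencing to them, so they can be passed at full length alongside the logged series.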



[R] Table to List Transformation Scenario

2010-04-21 Thread Idgarad
I have a series of tables, one for each environment indicating a date (row)
and a sample at each hour of the day (0 to 23)

Test1 Table:
Date,Hour1,Hour2,...Hour23
1/1/10,123,123,...,123

I would like to model this as a time series but how can I translate the
table into a list such that I can get:

1/1/10 00:00, 123
1/1/10 01:00, 123
1/1/10 02:00, 123
...
1/1/10 23:00, 123

Any suggestions on how to get that kind of translation done in R?

Idgarad
-- 
"Who is John Galt?"
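
A note on the question above: since each row holds one day and each column one hour, transposing the value columns and flattening them yields the readings in time order, and the timestamps can be built by repeating each date 24 times. A minimal base R sketch; the column names Hour0 to Hour23, the values, and the UTC time zone are assumptions made for illustration:

```r
hours <- 0:23

# Invented wide table: one row per date, one column per hour.
wide <- data.frame(Date = c("1/1/10", "1/2/10"), stringsAsFactors = FALSE)
for (h in hours) wide[[paste0("Hour", h)]] <- 100 + h

# Transpose so that flattening runs across the hours of each day in turn.
values <- as.vector(t(as.matrix(wide[, -1])))

# Pair every date with every hour and parse into date-times.
stamp_text <- paste(rep(wide$Date, each = length(hours)),
                    rep(hours, times = nrow(wide)))
stamp <- as.POSIXct(strptime(stamp_text, "%m/%d/%y %H", tz = "UTC"))

long <- data.frame(stamp = stamp, value = values)
```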



[R] Problem with predict() and factors

2009-12-03 Thread Idgarad
I am working on a script that takes numeric performance indicators and runs
them against a series of regressors (dummy regressors, yes\no stuff via 0
and 1, e.g. Was it Christmas this week? 0=no, 1=yes).

The script is as follows (Written as a function):


-- Begin Script --

doEnv <- function(HOUR,ENVNAME,REPORTNAME) {
library(RODBC)
library(forecast)
library("geneplotter")
library(forecast)
library(fUtilities)
library(TSA)
require(gplots)
library(robfilter)

SOURCEDATA <- paste("Q:/TEST/RSTATS/EPOC ",HOUR," Metrics.xls",sep="")
REGRESSORS <- "Q:/TEST/RSTATS/eventswithholidays.xls"

mypalette=c()
mypalette$background="#FF"
mypalette$chart="#FF"
mypalette$forecastRegion="#66CCFF"
mypalette$confidence="#FF9966"
mypalette$limits="#FF"
mypalette$major="#00"
mypalette$minor="#cc"
mypalette$actual="#aa"
mypalette$dp1="#9900FF"
mypalette$dp2="#00"
mypalette$dp3="#CCFF00"
mypalette$dp4="#00CCFF"
mypalette$dp5="#FF00CC"

#Raw Data
channel1 <- odbcConnectExcel(SOURCEDATA)
sqlTables(channel1)
sh1 <- sqlFetch(channel1, "Actuals$")
close(channel1)
channel2 <- odbcConnectExcel(REGRESSORS)
sqlTables(channel2)
sh2 <- sqlFetch(channel2, "data$")
close(channel2)

#Get Raw Data
tsSource<-ts(sh1[[ENVNAME]],start=c(2004,1),freq=52)

#Data is now a Time Series
#Prep Out-of-sample test ranges
modLength=length(sh1[[ENVNAME]])
modMax=round((modLength/3)*2)
modEndDate=time(tsSource)[modMax]
modStartDate=time(tsSource)[1]

#RAW SUMMARY WITH OVERLAY OF OUT OF SAMPLE RANGES
summary(tsSource)
modelSource=window(tsSource,modStartDate,end=modEndDate)
verSource=window(tsSource,time(tsSource)[modMax+1])
pdf(paste("Q:/ReleaseMgmt/Environment Mgmt/Data/Current/Metrics/Mainframe/Test Environment Projections/RSTATS/images/",
ENVNAME,"-",HOUR,"-","Raw Metrics with Test Range.pdf",sep=""),width=9, height=6.5)
plot(tsSource,col="grey", main=paste("Raw Data for", REPORTNAME),
xlab="Date", ylab="MiPS Used")
points(modelSource,col="red", pch=20)
points(verSource,col="blue", pch=20)
smartlegend( x="left", y= "top", inset=0,
#smartlegend parameters
 legend = c("Actual Data","Data for Model Selection","Data for In Sample Verification"),
   fill=c(mypalette$actual,"red","blue"),bg = mypalette$background)
print("The Red region is where we are going to develop the model from and the blue area is where we will evaluate the model (In Sample Testing)")

#Ok our ranges are confirmed we'll get a better graph later

# This Heavy Voodoo™ allows us to have a dynamic number of
#dummy variables we can add\remove from the spreadsheet
forecastDistance <- 52
#Grab Existing Regressors (clipping out the data)
cReg <- sh2[1:modLength,-1]
mcReg <- sh2[1:modMax,-1]
#transform the on\offs into proper factors
for(i in names(cReg)) cReg[[i]] <- factor(cReg[[i]])
for(i in names(mcReg)) mcReg[[i]] <- factor(mcReg[[i]])
#Grab X Future Regressors equal to the forecastDistance (gotta double check
#if I need a +1 on the start point)
fReg <- sh2[length(tsSource):(length(tsSource)+forecastDistance),-1]
mfReg <-sh2[(modMax+1):modLength,-1]
#fix variable names
names(cReg) <- make.names(names(cReg))
names(mcReg) <- make.names(names(mcReg))
names(fReg) <- make.names(names(fReg))
names(mfReg) <- make.names(names(mfReg))
#print("#")
#print("This is the CReg Data")
#print("#")
#print(summary(cReg))
#print("##")
#print("This is the mcReg Data")
#print("##")
#print(summary(mcReg))
#names(mcReg)
for(i in names(fReg)) fReg[[i]] <- factor(fReg[[i]])
for(i in names(mfReg)) mfReg[[i]] <- factor(mfReg[[i]])
#end heavy voodoo


#
# MODEL VERIFICATION FIRST!
#
# Basic Look at the raw data
hist(modelSource)
plot(density(modelSource,na.rm=TRUE))
plot(sort(modelSource),pch=".")
for(i in names(mcReg)) {
pairs(modelSource ~ .,mcReg[[i]], main=paste("Model - MIPS vs",i))
}
#Build the list to store our results
linearModel <- list()
residuals <- list()
arima_Fit <- list()
arima_AO <- list()
arima_IO <- list()
newcReg <- list()
newfReg <- list()
newmcReg <- list()
newmfReg <- list()
newFit <- list()
newForecast <- list()
# Following won't work until mcReg contains full variety
linearModel[[1]]=lm(modelSource ~ + UNITBUILD + UNITDB + ITBUILD + ITDB +
UATBUILD + UATDB + HOGANCODE + RCF + ReleaseST1 + ReleaseST2 + ReleaseBLA +
Small.Bank.Acquisitions + HLY.NewYear + HLY.MLK + HLY.PRES + HLY.MEMORIAL +
HLY.J4 + HLY.LABOR + HLY.COLUMBUS + HLY.VETS + HLY.THANKS + HLY.XMAS +
HLY.ELECT + HLY.PATRIOT + EOM,mcReg)
linearModel[[2]]=step(linearModel[[1]], trace=1)
linearModel[[3]]=lm(modelSource ~ + UNITBUILD + UNITDB + ITBUILD + ITDB +
UATBUILD + UATDB + HOGANCODE + RCF + ReleaseST1 + ReleaseST2 + ReleaseBLA +
Small.Bank.Acquisitions + HLY.NewYear + HLY.MLK + HLY.PRES + HLY.MEMORIAL +
HLY.J4 + HLY.LABOR + HLY.COLUMBUS + HLY.VETS + HLY.THANKS + HLY.XMAS +
HLY.ELECT + HLY.PATRIOT + EOM - 1,mcReg)
linearModel[[4]]=step(linearModel[[3]],trace=1)
if(ENVNAME=="E081") {linearModel[[5]]=lm(modelS

[R] Holiday Gift Perl Script for US Holiday Dummy Regressors

2009-12-08 Thread Idgarad
# BEGIN CODE ##



#!/usr/bin/perl
##
#
# --start, -s = The date you would like to start generating regressors
# --end, -e = When to stop generating holiday regressors
# --scope, -c = D, W for Daily or Weekly respectively (e.g. Does this week
#               have a particular holiday)
# --file, -f = Ummm where to write the output silly!
#
# **NOTE** The EOM holiday is "End of Month"; for computer systems this may
# be important for extra processing and what not.
#
# You may need to set your TZ environment variable if the script cannot
# determine your time zone from the system (e.g. SET TZ=CST )
##
use Getopt::Long;
use Date::Manip;
use Spreadsheet::WriteExcel;
use Calendar::Functions;
use Date::Holidays::USFederal;
use Set::Array;
use POSIX qw/strftime/;
use Time::Local;

my @regressors = ();
#my $holidays = Date::Holidays->new(countrycode => 'us');

$result = GetOptions ("start|s=s" => \$start,
   "end|e=s" => \$end,
   "scope|c=s" => \$scope,
   "file|f=s" => \$filename);




print "Generating Holiday Dummy Variables starting $start to $end generated by $scope. Output to $filename \n";

#print all the dates based on scope as a test




$startDate=ParseDate(\$start);
if (! $startDate) {
print "Error in the date";exit;
}
$endDate= ParseDate($end);
print "Start Date: ",UnixDate($startDate,"%m/%d/%Y"),"\n";
print "End Date: ",$end,"\n";

print "Last Day in Month: ",UnixDate(ParseDate("last day in JAN 2004"),"%m/%d/%Y"),"\n";



#HEADER OUTpUT
print
"Date,HLY-NewYear,HLY-MLK,HLY-PRES,HLY-MEMORIAL,HLY-J4,HLY-LABOR,HLY-COLUMBUS,HLY-VETS,HLY-THANKS,HLY-XMAS,HLY-ELECT,HLY-PATRIOT,EOM\n";
$baseDate=$startDate;

if ($scope eq "d"){

while(Date_Cmp($baseDate,$endDate)<0)
{
print UnixDate($baseDate,"%m/%d/%Y"), ",";
if(holidayCheck($baseDate) eq "New Year's Day"){print "1,"} else {print
"0,"};
if(holidayCheck($baseDate) eq "Martin Luther King, Jr. Day"){print "1,"}
else {print "0,"};
if(holidayCheck($baseDate) eq "Presidents' Day"){print "1,"} else {print
"0,"};
if(holidayCheck($baseDate) eq "Memorial Day"){print "1,"} else {print "0,"};
if(holidayCheck($baseDate) eq "Independence Day"){print "1,"} else {print
"0,"};
if(holidayCheck($baseDate) eq "Labor Day"){print "1,"} else {print "0,"};
if(holidayCheck($baseDate) eq "Columbus Day"){print "1,"} else {print "0,"};
if(holidayCheck($baseDate) eq "Veterans' Day"){print "1,"} else {print
"0,"};
if(holidayCheck($baseDate) eq "Thanksgiving Day"){print "1,"} else {print
"0,"};
if(holidayCheck($baseDate) eq "Christmas Day"){print "1,"} else {print
"0,"};
if(holidayCheck($baseDate) eq "Election Day"){print "1,"} else {print "0,"};
if(holidayCheck($baseDate) eq "U.S. Patriot Day Unofficial Observation"){print "1,"} else {print "0,"};
if(holidayCheck($baseDate) eq "EOM"){print "1"} else {print "0"};
print "\n";

$baseDate=DateCalc($baseDate,"+1 day");
}

} # END IF D

if($scope eq "w") {

while(Date_Cmp($baseDate,DateCalc($endDate,"+7 days"))<0)
{
print UnixDate($baseDate,"%m/%d/%Y"), ",";

if(

(holidayCheck(DateCalc($baseDate,"+0 days")) eq "New Year's Day") ||
(holidayCheck(DateCalc($baseDate,"+1 days")) eq "New Year's Day") ||
(holidayCheck(DateCalc($baseDate,"+2 days")) eq "New Year's Day") ||
(holidayCheck(DateCalc($baseDate,"+3 days")) eq "New Year's Day") ||
(holidayCheck(DateCalc($baseDate,"+4 days")) eq "New Year's Day") ||
(holidayCheck(DateCalc($baseDate,"+5 days")) eq "New Year's Day") ||
(holidayCheck(DateCalc($baseDate,"+6 days")) eq "New Year's Day")

){print "1,"} else {print "0,"};

if(
holidayCheck(DateCalc($baseDate,"+0 days")) eq "Martin Luther King, Jr. Day"
||
holidayCheck(DateCalc($baseDate,"+1 days")) eq "Martin Luther King, Jr. Day"
||
holidayCheck(DateCalc($baseDate,"+2 days")) eq "Martin Luther King, Jr. Day"
||
holidayCheck(DateCalc($baseDate,"+3 days")) eq "Martin Luther King, Jr. Day"
||
holidayCheck(DateCalc($baseDate,"+4 days")) eq "Martin Luther King, Jr. Day"
||
holidayCheck(DateCalc($baseDate,"+5 days")) eq "Martin Luther King, Jr. Day"
||
holidayCheck(DateCalc($baseDate,"+6 days")) eq "Martin Luther King, Jr. Day"

){print "1,"} else {print "0,"};

if(
holidayCheck(DateCalc($baseDate,"+0 days")) eq "Presidents' Day" ||
holidayCheck(DateCalc($baseDate,"+1 days")) eq "Presidents' Day" ||
holidayCheck(DateCalc($baseDate,"+2 days")) eq "Presidents' Day" ||
holidayCheck(DateCalc($baseDate,"+3 days")) eq "Presidents' Day" ||
holidayCheck(DateCalc($baseDate,"+4 days")) eq "Presidents' Day" ||
holidayCheck(DateCalc($baseDate,"+5 days")) eq "Presidents' Day" ||
holidayCheck(DateCalc($baseDate,"+6 days")) eq "Presidents' Day"
){print "1,"} else {print "0,"};

if(
holidayCheck(DateCalc($baseDate,"+0 days")) eq "Memorial Day" ||
holidayCheck(DateCalc($baseDate,"+1 days")) eq "Memorial Day" ||
holidayCheck(DateCalc($baseDate,"+2 days")) eq "Memorial Day" ||
holidayCheck(DateCalc($baseDate,"+3 days")) eq "Memorial Day" ||
holidayCheck(DateCalc($baseDate,"+4 days")) eq "Memorial Day" ||
holidayCheck(DateCalc($baseDate,"+5 days")) eq "Memorial Day" ||
h

[R] Opps Correct Version of Holiday Regressor Perl Script

2009-12-08 Thread Idgarad
Here is the correct version. The old version is the redirect only version of
the script.

### BEGIN SCRIPT 

#!/usr/bin/perl
##
# --start, -s = The date you would like to start generating regressors
# --end, -e = When to stop generating holiday regressors
# --scope, -c = D, W for Daily or Weekly respectively (e.g. Does this week
#               have a particular holiday)
# --file, -f = Ummm where to write the output silly!
#
# **NOTE** The EOM holiday is "End of Month"; for computer systems this may
# be important for extra processing and what not.
#
# You may need to set your TZ environment variable if the script cannot
# determine your time zone from the system (e.g. SET TZ=CST )
##


use Getopt::Long;
use Date::Manip;
use Spreadsheet::WriteExcel;
use Calendar::Functions;
use Date::Holidays::USFederal;
#use Date::Holidays;
use Set::Array;
use POSIX qw/strftime/;
use Time::Local;

my @regressors = ();
#my $holidays = Date::Holidays->new(countrycode => 'us');

$result = GetOptions ("start|s=s" => \$start,
   "end|e=s" => \$end,
   "scope|c=s" => \$scope,
   "file|f=s" => \$filename);

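# Append to the output file; stop with a message if it cannot be opened.
open (OUTFILE, ">>", $filename) or die "Cannot open $filename: $!";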


print "Generating Holiday Dummy Variables starting $start to $end generated by $scope. Output to $filename \n";

#print all the dates based on scope as a test




$startDate=ParseDate(\$start);
if (! $startDate) {
print "Error in the date";exit;
}
$endDate= ParseDate($end);
print OUTFILE "Start Date: ",UnixDate($startDate,"%m/%d/%Y"),"\n";
print OUTFILE "End Date: ",$end,"\n";

# print OUTFILE "Last Day in Month: ",UnixDate(ParseDate("last day in JAN 2004"),"%m/%d/%Y"),"\n";




print OUTFILE
"Date,HLY-NewYear,HLY-MLK,HLY-PRES,HLY-MEMORIAL,HLY-J4,HLY-LABOR,HLY-COLUMBUS,HLY-VETS,HLY-THANKS,HLY-XMAS,HLY-ELECT,HLY-PATRIOT,EOM\n";
$baseDate=$startDate;

if ($scope eq "d"){

while(Date_Cmp($baseDate,$endDate)<0)
{
print OUTFILE UnixDate($baseDate,"%m/%d/%Y"), ",";
if(holidayCheck($baseDate) eq "New Year's Day"){print OUTFILE "1,"} else
{print OUTFILE "0,"};
if(holidayCheck($baseDate) eq "Martin Luther King, Jr. Day"){print OUTFILE
"1,"} else {print OUTFILE

"0,"};
if(holidayCheck($baseDate) eq "Presidents' Day"){print OUTFILE "1,"} else
{print OUTFILE "0,"};
if(holidayCheck($baseDate) eq "Memorial Day"){print OUTFILE "1,"} else
{print OUTFILE "0,"};
if(holidayCheck($baseDate) eq "Independence Day"){print OUTFILE "1,"} else
{print OUTFILE "0,"};
if(holidayCheck($baseDate) eq "Labor Day"){print OUTFILE "1,"} else {print
OUTFILE "0,"};
if(holidayCheck($baseDate) eq "Columbus Day"){print OUTFILE "1,"} else
{print OUTFILE "0,"};
if(holidayCheck($baseDate) eq "Veterans' Day"){print OUTFILE "1,"} else
{print OUTFILE "0,"};
if(holidayCheck($baseDate) eq "Thanksgiving Day"){print OUTFILE "1,"} else
{print OUTFILE "0,"};
if(holidayCheck($baseDate) eq "Christmas Day"){print OUTFILE "1,"} else
{print OUTFILE "0,"};
if(holidayCheck($baseDate) eq "Election Day"){print OUTFILE "1,"} else
{print OUTFILE "0,"};
if(holidayCheck($baseDate) eq "U.S. Patriot Day Unofficial Observation"){print OUTFILE "1,"} else {print OUTFILE "0,"};
if(holidayCheck($baseDate) eq "EOM"){print OUTFILE "1"} else {print OUTFILE
"0"};
print OUTFILE "\n";

$baseDate=DateCalc($baseDate,"+1 day");
}

} # END IF D

if($scope eq "w") {

while(Date_Cmp($baseDate,DateCalc($endDate,"+7 days"))<0)
{
print OUTFILE UnixDate($baseDate,"%m/%d/%Y"), ",";

if(

(holidayCheck(DateCalc($baseDate,"+0 days")) eq "New Year's Day") ||
(holidayCheck(DateCalc($baseDate,"+1 days")) eq "New Year's Day") ||
(holidayCheck(DateCalc($baseDate,"+2 days")) eq "New Year's Day") ||
(holidayCheck(DateCalc($baseDate,"+3 days")) eq "New Year's Day") ||
(holidayCheck(DateCalc($baseDate,"+4 days")) eq "New Year's Day") ||
(holidayCheck(DateCalc($baseDate,"+5 days")) eq "New Year's Day") ||
(holidayCheck(DateCalc($baseDate,"+6 days")) eq "New Year's Day")

){print OUTFILE "1,"} else {print OUTFILE "0,"};

if(
holidayCheck(DateCalc($baseDate,"+0 days")) eq "Martin Luther King, Jr. Day"
||
holidayCheck(DateCalc($baseDate,"+1 days")) eq "Martin Luther King, Jr. Day"
||
holidayCheck(DateCalc($baseDate,"+2 days")) eq "Martin Luther King, Jr. Day"
||
holidayCheck(DateCalc($baseDate,"+3 days")) eq "Martin Luther King, Jr. Day"
||
holidayCheck(DateCalc($baseDate,"+4 days")) eq "Martin Luther King, Jr. Day"
||
holidayCheck(DateCalc($baseDate,"+5 days")) eq "Martin Luther King, Jr. Day"
||
holidayCheck(DateCalc($baseDate,"+6 days")) eq "Martin Luther King, Jr. Day"

){print OUTFILE "1,"} else {print OUTFILE "0,"};

if(
holidayCheck(DateCalc($baseDate,"+0 days")) eq "Presidents' Day" ||
holidayCheck(DateCalc($baseDate,"+1 days")) eq "Presidents' Day" ||
holidayCheck(DateCalc($baseDate,"+2 days")) eq "Presidents' Day" ||
holidayCheck(DateCalc($baseDate,"+3 days")) eq "Presidents' Day" ||
holidayCheck(DateCalc($baseDate,"+4 days")) eq "Presidents' Day" ||
holidayCheck(DateCalc($baseDate,"+5 days")) eq "Presidents' Day" ||
holidayCheck(DateCalc($baseDate,"+6 days")) eq "Presidents' Day"
){

[R] lm() and factors appending

2009-12-30 Thread Idgarad
How for the love of god can I prevent the lm() function from padding on to
my factor variables?

I start out with 2 tables:

Table1
123123
124351
...
626773

Table2
Count,IS_DEAD,IS_BURNING
1231,T,F
4521,F,T
...
3321,T,T

Everything looks fine when I import the data.

then we get a

oh_crap <- lm(table1 ~ Count + IS_DEAD + IS_BURNING, table2)

Magically when I look at my oh_crap coefficients they get turned into

Count, IS_DEADTRUE, IS_BURNINGTRUE

I get it that it finds them to be factors, but how in the name of all that is
holy do I prevent them from doing that crap? Later, after a stepwise
removal, I go into one of the models, grab what coefficients were kept
(IS_BURNINGTRUE now) and read future regressors from the original table,
which reads IS_BURNING rather than IS_BURNINGTRUE.

Since there is a mix of numeric and dummy regressors I cannot selectively
append TRUE to variable names as I don't have control of what regressors
get imported (one month there might be 50 and another month 2).

How can I stop lm() from padding onto a coefficient's name?

I have no objection to post processing by finding and assassinating any name
with the word TRUE appended to the end of a name but in that case then how
do I change the coeff name?

Maddening... who thought it would be a good idea to append things without
asking? A simple lm(appendFactor=FALSE) would have been nice... took me 3
hours to find out what was going wrong on this Grabbing coffee, a carton
of cigs, and heading outside to smash my head with a brick to make the
hurting stop
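
A note on the question above: lm() has no switch for this, but the suffix only appears for logical and factor columns. One workaround is to convert logical columns to 0/1 integers before fitting, so every coefficient keeps its plain column name and can be matched against the original table after stepwise selection. A minimal sketch with invented data, reusing the column names from the post:

```r
# Invented data using the column names from the post.
table2 <- data.frame(
  Count      = c(1231, 4521, 2210, 3321, 1875, 2950),
  IS_DEAD    = c(TRUE, FALSE, TRUE, TRUE, FALSE, FALSE),
  IS_BURNING = c(FALSE, TRUE, TRUE, TRUE, FALSE, TRUE)
)
table1 <- c(123123, 124351, 300500, 626773, 210400, 410250)

# Logical columns get the level appended to their coefficient names.
fit_logical <- lm(table1 ~ Count + IS_DEAD + IS_BURNING, data = table2)

# Convert only the logical columns to 0/1 integers; numeric ones are untouched.
is_lgl <- vapply(table2, is.logical, logical(1))
numeric_regs <- table2
numeric_regs[is_lgl] <- lapply(table2[is_lgl], as.integer)

# Coefficient names now equal the column names.
fit_numeric <- lm(table1 ~ ., data = numeric_regs)
```

The fitted values are the same either way, because a two-level dummy and a 0/1 numeric column describe the same contrast.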



[R] Drop a part of an array\list\vector?

2010-01-07 Thread Idgarad
I did have a verbose description of why but rather than make everyone's eyes
bleed with useless details I ask the following :)
 To make a long story short: How can I make newmcReg[[i]]["PreIO308"] go
away in the following list... er vector... no wait array dataframe
awww crap...


> summary(newmcReg[[i]])
 UNITBUILD        UNITDB           ITBUILD          ITDB
 Mode :logical    Mode :logical    Mode :logical    Mode :logical
 FALSE:249        FALSE:249        FALSE:249        FALSE:249
 TRUE :21         TRUE :21         TRUE :21         TRUE :21

 UATBUILD         UATDB            HOGANCODE        ACF
 Mode :logical    Mode :logical    Mode :logical    Mode :logical
 FALSE:250        FALSE:250        FALSE:208        FALSE:225
 TRUE :20         TRUE :20         TRUE :62         TRUE :45

 RCF              ReleaseST1       ReleaseST2       ReleaseBLA
 Mode :logical    Mode :logical    Mode :logical    Mode :logical
 FALSE:186        FALSE:167        FALSE:157        FALSE:228
 TRUE :84         TRUE :103        TRUE :113        TRUE :42

 MonthlyST1       MonthlyST2       MonthlyBLA       Small.Bank.Acquisitions
 Mode :logical    Mode :logical    Mode :logical    Min.   :0.
 FALSE:107        FALSE:105        FALSE:147        1st Qu.:0.
 TRUE :163        TRUE :165        TRUE :123        Median :0.
                                                    Mean   :0.1556
                                                    3rd Qu.:0.
                                                    Max.   :1.

 Conversions      Build.New.Environment  HLY.NewYear      HLY.MLK
 Min.   :0.0      Mode :logical          Mode :logical    Mode :logical
 1st Qu.:0.0      FALSE:262              FALSE:266        FALSE:264
 Median :0.0      TRUE :8                TRUE :4          TRUE :6
 Mean   :0.08889
 3rd Qu.:0.0
 Max.   :1.0

 HLY.PRES         HLY.MEMORIAL     HLY.J4           HLY.LABOR
 Mode :logical    Mode :logical    Mode :logical    Mode :logical
 FALSE:264        FALSE:265        FALSE:265        FALSE:265
 TRUE :6          TRUE :5          TRUE :5          TRUE :5

 HLY.COLUMBUS     HLY.VETS         HLY.THANKS       HLY.XMAS
 Mode :logical    Mode :logical    Mode :logical    Mode :logical
 FALSE:265        FALSE:265        FALSE:265        FALSE:265
 TRUE :5          TRUE :5          TRUE :5          TRUE :5

 HLY.ELECT        HLY.PATRIOT      EOM              NEWMF
 Mode :logical    Mode :logical    Mode :logical    Mode :logical
 FALSE:265        FALSE:265        FALSE:210        FALSE:263
 TRUE :5          TRUE :5          TRUE :60         TRUE :7

 PreIO47          PreIO151         PreIO164         PreIO169
 Mode :logical    Mode :logical    Mode :logical    Mode :logical
 FALSE:269        FALSE:269        FALSE:269        FALSE:269
 TRUE :1          TRUE :1          TRUE :1          TRUE :1

 PreIO197         PreIO203         PreIO209         PreIO241
 Mode :logical    Mode :logical    Mode :logical    Mode :logical
 FALSE:269        FALSE:269        FALSE:269        FALSE:269
 TRUE :1          TRUE :1          TRUE :1          TRUE :1

 PreIO261         PreIO308
 Mode :logical    Mode :logical
 FALSE:269        FALSE:270
 TRUE :1

(the PreIO are outliers identified from the output of stl() e.g. outliers in
the source data)
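
A note on the question above: since newmcReg[[i]] is a data frame, a column can be removed by assigning NULL to it, or by selecting every column except the unwanted name. A small sketch with an invented three-column frame standing in for the real one:

```r
# Invented stand-in: a list holding one small data frame.
regs <- data.frame(
  EOM      = c(TRUE, FALSE, TRUE),
  PreIO261 = c(FALSE, TRUE, FALSE),
  PreIO308 = c(FALSE, FALSE, FALSE)
)
newmcReg <- list(regs)
i <- 1

# Option 1: assigning NULL deletes the column in place.
newmcReg[[i]][["PreIO308"]] <- NULL

# Option 2: build a copy that keeps every column except the named one.
without <- regs[, names(regs) != "PreIO308", drop = FALSE]
```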



[R] Getting a date out of an indice in a time series

2010-01-11 Thread Idgarad
I have a weekly data set imported via:

tsSource=ts(sh1$I000,start=c(2004,1),freq=52)

I am now getting to some 'spit and polish' but I have run into something I
can't wrap my head around.

Given an outlier I find at, say, tsSource[54], how can I translate index
54 into the date/week? I can figure out that entry 52 is the last week of
2004, but since the data spans many years, week 251 is a tad trickier
because some years have 53 weeks. Is there a function I am missing that
gives me a date (or even a Julian week number) from a time series object?
Since I am generating graphs that should read "Week starting xxx/xxx/xxx",
I was hoping there was a systematic way to retrieve the date from the time
series object. Any suggestions?
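
For anyone landing on this thread later, here is a minimal sketch of one way to map an index back to a week. It assumes the series is a plain ts with frequency 52 and that the first observation is the week starting 2004-01-05; that start date and the stand-in data are assumptions for illustration, not taken from the post. time() and cycle() give the position on the ts's own 52-slot clock, while plain Date arithmetic from the first week avoids the 53-week-year drift entirely.

```r
# Stand-in for the real series (hypothetical data)
tsSource <- ts(seq_len(270), start = c(2004, 1), frequency = 52)
idx <- 54

# Position on the ts clock: decimal year and slot within the "year"
decimal_year <- time(tsSource)[idx]
slot_in_year <- cycle(tsSource)[idx]

# Calendar date: count whole weeks from the (assumed) first week start
first_week_start <- as.Date("2004-01-05")  # assumption: a Monday
week_start <- first_week_start + 7 * (idx - 1)

label <- paste("Week starting", format(week_start, "%m/%d/%Y"))
print(label)
```

The Date-based label is the one to put on a graph; the ts clock will slowly drift away from the real calendar because it treats every year as exactly 52 weeks.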



[R] [Solved][Code Snippets] Dropping Empty Regressors

2010-01-12 Thread Idgarad
To make a long story short, I was doing some in-sample testing in which some
dynamically created regressors would end up either all true or all false
over the validation portion. In my case the culprit was a regressor marking
a new mainframe configuration (this is a crappy way to handle a level shift
but I do what I can). So here is the code snippet that finally let me
pre-check my regressors and drop any of them that were all true or all false.

First the automagic STL outlier grabber that caused part of the problem:


# tsSource being my time series source.
# sh2 is a table of all my regressors that have been previously pulled in.
# It has historic and future values in it also; it gets sliced later.
# EOM is the regressor holding weeks that contain an 'End of Month'.
#
# This appends the found IOs to the regressor table. Stepwise tends to
# remove them later on. I needed a programmatic way of removing useless
# regressors for model verification since I would not know their names
# if any are found.

tsSourceDiag <- stl(tsSource,s.window="per", robust=TRUE)
#
tsSourceIO <- which(tsSourceDiag $ weights  < 1e-8)
#
# This is how to append run-time regressors
for(z in tsSourceIO) {
  tmpname <- paste("PreIO",z,sep="")
  #COPY EOM AS A TEMPLATE
  sh2[[tmpname]] <- sh2[["EOM"]]
  #SET IT ALL TO FALSE
  sh2[[tmpname]][] <- FALSE
  #SET THE PROPER INDEX TO TRUE
  sh2[[tmpname]][z] <- TRUE
}
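
The same append can be done without copying EOM as a template. A sketch, assuming sh2 is a data frame with one row per week; the tiny sh2 and the index vector below are hypothetical stand-ins for the real table and the stl() result:

```r
# Hypothetical stand-ins for the regressor table and the outlier indices
sh2 <- data.frame(EOM = c(FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, TRUE))
tsSourceIO <- c(3, 7)

# One logical column per outlier: TRUE only in the row of that outlier
dummies <- outer(seq_len(nrow(sh2)), tsSourceIO, "==")
colnames(dummies) <- paste0("PreIO", tsSourceIO)

# Append all dummy columns in one step
sh2 <- cbind(sh2, dummies)
print(names(sh2))
```

outer() compares every row number against every outlier index, so each column comes out all FALSE except for its own row.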


So to get rid of them (those empty useless regressors) I cooked up this:

###
#Prune Empty Regressors (All false or all true)
# the newmcReg you see is a copy of the sh2 from earlier
# newmcReg = New Model Current Regressors
# sh2 later became cReg.
#
# Yes it makes my eyes bleed. in short we count all the trues
# and all the false and if they happen to be the same number
# as the length we know they are all true or false.
#
# the trick I finally found was that you could in fact -c()
# a list (i.e. ask for everything but the following) but you
# can't apparently do that inline so we just make a list of
# regressors that get shown the door then after hunting
# them down we give em the boot. This mess is solely
# so my in-sample Arima doesn't choke on xreg=newmcReg
# in which one of the newmcReg happens to be all true or false.
#
# God I wish I had taken more than a Trig course. Where was I?
#
# Yes that phantom 'i' you see is that this is all in a big loop
# for 6 possible models
# lm1 = all regressors w/ intercept
# lm2 = lm1 stepwise removal
# lm3 = all regressors wo/ intercept
# lm4 = lm3 stepwise removal
# lm5 = Hand Tuned
# lm6 = lm5 stepwise removal
###
toPurge=c()
for(k in names(newmcReg[[i]])) {
 print (paste("check to see if",k,"is a useless regressor for model",i))
 if(sum(newmcReg[[i]][k][,1])==length(newmcReg[[i]][k][,1])) {
  print(paste("All of",k,"are TRUE"))
  getLost=which(names(newmcReg[[i]])==k)
  toPurge=c(toPurge,getLost)
  print(paste(k, "has been added to the purge list for model", i,"!"))
 }
 if(sum(newmcReg[[i]][k][,1]==FALSE)==length(newmcReg[[i]][k][,1])) {
  print(paste("All of",k,"are FALSE"))
  getLost=which(names(newmcReg[[i]])==k)
  toPurge=c(toPurge,getLost)
  print(paste(k, "has been added to the purge list for model", i,"!"))
 }
}
toPurge
# Do this only if there are any or R will beat you senseless and
# steal all your M&Ms!
if(length(toPurge)!=0) {
names(newmcReg[[i]])
names(newmcReg[[i]][-c(toPurge)])
newmcReg[[i]] <- newmcReg[[i]][-c(toPurge)]
newmfReg[[i]] <- newmfReg[[i]][-c(toPurge)]
names(newmcReg[[i]])
}
##
# End Regressor Pruning
##
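
For comparison, a shorter take on the same pruning idea. This is only a sketch on a made-up regressor table (reg and its columns are hypothetical); it treats any column with a single distinct value as useless, which covers all-TRUE and all-FALSE logicals as well as constant 0/1 numeric columns.

```r
# Hypothetical regressor table: two informative columns, two constant ones
reg <- data.frame(
  EOM      = c(TRUE, FALSE, TRUE, FALSE),
  AllTrue  = c(TRUE, TRUE, TRUE, TRUE),
  AllFalse = c(FALSE, FALSE, FALSE, FALSE),
  PreIO2   = c(FALSE, TRUE, FALSE, FALSE)
)

# A column with fewer than two distinct values carries no information
is_constant <- vapply(reg, function(col) length(unique(col)) < 2, logical(1))

# drop = FALSE keeps a data frame even if only one column survives
pruned <- reg[, !is_constant, drop = FALSE]
print(names(pruned))
```

The same is_constant mask could then be applied to the matching future-regressor table so both stay column-aligned, which is what the paired newmcReg/newmfReg assignments above are doing.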


Big thanks to the help so far. Now about those darn transfer functions...
hmm and pulse detection...



[R] Trouble Highlighting outliers on Time Series Plot

2010-01-26 Thread Idgarad
I am having trouble plotting outliers on time series.

Given the following code:

# find STL Outliers by weight and append sh2, use Robust
# this should allow the initial outliers to be filtered
# this section may be commented out.

tsSourceDiag <- stl(tsSource,s.window="per", robust=TRUE)
#
tsSourceIO <- which(tsSourceDiag $ weights  < 1e-8)
#
# This is how to append run-time regressors
for(z in tsSourceIO) {
  tmpname <- paste("PreIO",z,sep="")
  #COPY EOM REGRESSOR AS A TEMPLATE
  sh2[[tmpname]] <- sh2[["EOM"]]
  #SET IT ALL TO FALSE
  sh2[[tmpname]][] <- FALSE
  #SET THE PROPER INDEX TO TRUE
  sh2[[tmpname]][z] <- TRUE
}

OK, so I have a time series tsSource. I pull the index of each low-weight
point out of tsSourceDiag and append a column for it to an existing list of
regressors: all FALSE except for a single TRUE at the index of the suspected
outlier.

I decided that a plot of the time series as points was in order and thought,
"Hey, I should really fill the circle that is considered an outlier red so I
can eyeball the graph to see if that is indeed an outlier needing
agent Fox and Scully to investigate" (yes, my later list of outliers is in
fact called "XFILES").

So I am like, BOOM! plot(tsSource) and points(tsSource[tsSourceIO]). Nada.

A plot of tsSource[tsSourceIO] reveals a hint of what is wrong. tsSource is
a time series with date info, while tsSource[tsSourceIO] is just a plain
vector with no proper alignment with the cosmic universe... errr... I mean
time series.

Anyone have some sweet voodoo on how to get a proper time series plot while
properly overplotting various indices? (e.g. tsSeries[c(1,22,11,61)])
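
One likely fix, sketched on made-up data: give points() explicit x coordinates taken from time() of the original series, since subsetting with [ ] drops the time axis. The short series and the single outlier index below are hypothetical.

```r
# Hypothetical stand-ins for the series and the stl() outlier indices
tsSource <- ts(c(5, 6, 5, 40, 6, 5, 6, 5), start = c(2004, 1), frequency = 52)
tsSourceIO <- c(4)

# x positions come from the ts time axis, y values from the series itself
outlier_x <- time(tsSource)[tsSourceIO]
outlier_y <- tsSource[tsSourceIO]

plot(tsSource, type = "p")                           # whole series as open points
points(outlier_x, outlier_y, col = "red", pch = 19)  # outliers as filled red dots
```

Without the x argument, points() plots the values against 1, 2, 3, ..., which is far off the left edge of a plot whose x axis runs in years starting at 2004.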



[R] Automagic Outlier Dummy Variables

2008-05-27 Thread Idgarad
10/06/08   0   0   0
10/13/08   0   0   0
10/20/08   0   0   0
10/27/08   0   0   0
11/03/08   1   0   0
11/10/08   0   1   0
11/17/08   0   0   1
11/24/08   0   0   0
12/01/08   0   0   0
12/08/08   0   0   0
12/15/08   0   0   0
12/22/08   0   0   0
12/29/08   0   0   0
01/05/09   0   0   0
01/12/09   0   0   0
01/19/09   0   0   0
01/26/09   0   0   0
02/02/09   1   0   0
02/09/09   0   1   0
02/16/09   0   0   1
02/23/09   0   0   0
03/02/09   0   0   0
03/09/09   0   0   0
03/16/09   0   0   0
03/23/09   0   0   0
03/30/09   0   0   0
04/06/09   0   0   0
04/13/09   0   0   0
04/20/09   0   0   0
04/27/09   0   0   0
05/04/09   1   0   0
05/11/09   0   1   0
05/18/09   0   0   1
05/25/09   0   0   0
06/01/09   0   0   0
06/08/09   0   0   0
06/15/09   0   0   0
06/22/09   0   0   0
06/29/09   0   0   0
07/06/09   0   0   0
07/13/09   0   0   0
07/20/09   0   0   0
07/27/09   0   0   0
08/03/09   1   0   0
08/10/09   0   1   0
08/17/09   0   0   1
08/24/09   0   0   0
08/31/09   0   0   0
09/07/09   0   0   0
09/14/09   0   0   0
09/21/09   0   0   0
09/28/09   0   0   0
10/05/09   0   0   0
10/12/09   0   0   0
10/19/09   0   0   0
10/26/09   0   0   0
11/02/09   0   0   0
11/09/09   0   0   0
11/16/09   0   0   0
11/23/09   0   0   0
11/30/09   0   0   0
12/07/09   0   0   0

Now I want to find outliers in the source data that are
X standard deviations out and generate a dummy variable for each.

So rather than having just cReg[UnitBuild,UnitDB,ITBuild], if there was
an outlier on 11/27/06 I would want to
generate cReg[UnitBuild,UnitDB,ITBuild,112706], with the 11/27/06 row
of the 112706 column set to 1.

I hope this makes sense, but I am having a heck of a time wrapping my
head around generating an additional column in R, naming it after the
date of the outlier, and then generating the appropriate
sequence for the dummy variable.
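
A rough sketch of the idea described above, on invented data (the dates, usage values, and the threshold of 2 are all assumptions chosen for illustration). It flags rows more than X standard deviations from the mean and adds one 0/1 column per flagged row, named after the date in mmddyy form.

```r
# Hypothetical weekly data; none of these values come from the original post
dates <- seq(as.Date("2006-10-30"), by = "week", length.out = 10)
usage <- c(10, 11, 9, 10, 50, 10, 11, 9, 10, 10)
cReg  <- data.frame(UnitBuild = integer(10), UnitDB = integer(10),
                    ITBuild = integer(10))

threshold <- 2  # the "X" in "X standard deviations out"
z <- abs(usage - mean(usage)) / sd(usage)
outlier_rows <- which(z > threshold)

# One dummy column per outlier, named after its date (e.g. "112706")
for (i in outlier_rows) {
  col_name <- format(dates[i], "%m%d%y")
  cReg[[col_name]] <- as.integer(seq_len(nrow(cReg)) == i)
}
print(names(cReg))
```

Note that the mean and standard deviation are themselves pulled around by the outliers, so a robust measure (median and mad(), for instance) may behave better on real data.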

Idgarad

P.S. Anyone know how to generate a holiday dummy series aligned to
weekly samples? (e.g. week 52 has Christmas, so create a column called
Christmas and set every week 52 entry to 1?)



[R] Help fCalendar holidayNYSE for Regressors

2008-06-17 Thread Idgarad
I am working with weekly time series data as in:

tsData=ts(data,start=c(2004,1),freq=52)

I have a table of regression variables that matches it, called cReg (loaded
from an xls sheet).

I would like to append to the cReg table dummy variables for all the
holidays as calculated by the fCalendar package (for instance,
Easter, Christmas, Memorial Day, etc.).

The problem I am running into is how to turn the data I get
from the fCalendar package into something useful for time series
analysis: my sampled data tsData is weekly, but the holiday
functions give me raw dates.

In short I need to be able to answer "Does this week have a christmas
in it? Does this week contain Memorial day, etc."

Any ideas?
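
A sketch of the "does this week contain holiday X?" test using only base R dates. The week start dates and the two holiday dates are typed in by hand here as assumptions; in practice the holiday dates would come from whatever the calendar package returns, converted with as.Date().

```r
# Hypothetical week start dates (Mondays) for six consecutive weekly samples
week_start <- seq(as.Date("2008-12-01"), by = "week", length.out = 6)

# Raw holiday dates, hand-entered for illustration
xmas    <- as.Date("2008-12-25")
newyear <- as.Date("2009-01-01")

# TRUE when any of the given dates falls in [week start, week start + 7 days)
week_contains <- function(starts, holiday_dates) {
  vapply(seq_along(starts), function(i) {
    any(holiday_dates >= starts[i] & holiday_dates < starts[i] + 7)
  }, logical(1))
}

holiday_dummies <- data.frame(
  HLY.XMAS    = week_contains(week_start, xmas),
  HLY.NewYear = week_contains(week_start, newyear)
)
print(holiday_dummies)
```

Because the comparison is done on real dates rather than week numbers, it does not matter which week of the year a holiday lands in.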



[R] Programatic Method for Holiday Dummy Variables

2008-05-20 Thread Idgarad
I am working on a weekly analysis and would like to look into the
effects of a holiday. As I only get weekly data how can I populate,
automagically, a series of dummy variables that are aligned with my
data.

 Snip -

library(RODBC)  # provides odbcConnectExcel(), sqlTables(), sqlFetch()

channel1 <- odbcConnectExcel("D:/RSTATS/metrics.xls")
sqlTables(channel1)
sh1 <- sqlFetch(channel1, "Actuals$")
close(channel1)

channel2 <- odbcConnectExcel("D:/RSTATS/events.xls")
sqlTables(channel2)
sh2 <- sqlFetch(channel2, "data$")
close(channel2)


tsU000=ts(sh1$U000,start=c(2004,1),freq=52)
summary(tsU000)

#Variable
forecastDistance <- 52
#Grab Existing Regressors
cReg <- sh2[1:length(tsU000),-1]
#Grab X Future Regressors equal to the forecastDistance
#(start one row past the history so the last historical row is not repeated)
fReg <- sh2[(length(tsU000)+1):(length(tsU000)+forecastDistance),-1]
---snip

Somewhere in there I need to append to cReg a series of holiday dummy
variables but here is the catch, it's weekly data.

First, how can I get an array of holidays in the first place, aligned on
weeks?

e.g.

week,xmas,newyears,halloween,cinco,...,kwanzaa
1,0,0,0,0,0,...,0
2,0,0,0,0,0,...,0
3,0,0,0,0,0,...,0
4,0,0,0,0,0,...,0
5,0,0,0,0,0,...,0
..
52,1,0,0,0,0,...,0

Then how do I append the holiday array to the cReg array?
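
A sketch of both steps for the simplest case, where a holiday is tied to a fixed week number as in the table above. The series and the one-column cReg are made-up stand-ins; cycle() gives each observation's week slot (1 to 52), and cbind() does the append.

```r
# Hypothetical stand-ins: two years of weekly data and an existing regressor table
tsU000 <- ts(seq_len(104), start = c(2004, 1), frequency = 52)
cReg   <- data.frame(EOM = rep(c(FALSE, FALSE, FALSE, TRUE), 26))

# Week slot of every observation on the ts clock
week <- as.integer(cycle(tsU000))

# One 0/1 column per holiday, keyed on an assumed fixed week number
holidays <- data.frame(
  xmas     = as.integer(week == 52),
  newyears = as.integer(week == 1)
)

# Appending is a column bind; both tables must have the same number of rows
cReg <- cbind(cReg, holidays)
print(names(cReg))
```

Fixed week numbers are only an approximation: the ts clock ignores 53-week years, so over a long history a date-based check is safer than week == 52.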

Help, I am confused and bewildered.



[R] Proper Usage of the XREG in ARIMA

2008-01-17 Thread Idgarad
I am using auto.arima (from the forecast package) to do some basic
forecasting based on CPU usage. I have now found a calendar of various
activities that partially control the computer's usage and want to factor
that in (they are effectively dummy variables indicating a particular type
of activity that week). Per the ARIMA instructions I am to feed those in
as a vector or matrix. I am getting lost in the sand, so to speak, at
this point. How would I prepare that data? I am pulling from a CSV
that is roughly:

date,usage,allocation,number of engines, theoretical max,r1,r2,...r21

So far so good just working with a copy of the CSV that is just

date,usage

But what should I do to dissect the configuration data and the r1 to
r21 dummy variables? (Some of these explain certain spikes and level
shifts; for instance, r21 indicates if there was conversion activity
during the week.) I never really could figure out in R (only been
using it a week or so) how to pull out part of an array.

Also, should I do my dissection prior to or after converting it into a ts object?

the short of the script is (removing plots etc..):
--
library(forecast)  # provides auto.arima() and forecast()
baseU000 <- read.csv("testfile.csv",header=TRUE)
#--- hmm what happens in years with a 53rd week...
tsbaseU000 <- ts(baseU000,start=2004,frequency=52)
#--- add regressors
# column 2 is usage; stepwise=FALSE is assumed for the original's undefined N
arimafit <- auto.arima(tsbaseU000[,2],approximation=TRUE,stepwise=FALSE)
forecastU000 <- forecast(arimafit,h=52)

plot(forecastU000)
lines(fitted(arimafit),col=3,lty="dashed")
--


What I am trying to do is build the best educated guess of what
the CPU usage is going to be, for some planning. As I control part of
the calendar, I need to start working towards the ability to do some
"What-If" analysis, so I can provide future values for those dummy
variables also. So close, yet so far away. Any suggestions?
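
A sketch of the data preparation, using base R's arima() rather than auto.arima so it runs without extra packages (that swap, the simulated data, and the AR(1) order are all assumptions for illustration). The idea is to slice the columns out of the data frame first, turn only the usage column into a ts, and pass the dummy columns as a numeric matrix via xreg. Future "What-If" values go in through newxreg at prediction time.

```r
set.seed(1)

# Simulated stand-in for the CSV: usage plus two of the dummy columns
n <- 60
dat <- data.frame(
  date = seq(as.Date("2004-01-05"), by = "week", length.out = n),
  r1   = rep(c(0, 1, 0), 20),
  r21  = as.numeric(seq_len(n) %in% c(10, 30, 50))
)
dat$usage <- 50 + 5 * dat$r1 + 20 * dat$r21 + rnorm(n)

# Pull out part of the table: one column as a ts, the dummies as a matrix
usage_ts <- ts(dat$usage, start = c(2004, 1), frequency = 52)
xreg     <- as.matrix(dat[, c("r1", "r21")])

fit <- arima(usage_ts, order = c(1, 0, 0), xreg = xreg)

# What-if: conversion activity (r21) planned for the second future week only
future_xreg <- cbind(r1 = c(0, 0, 0, 0), r21 = c(0, 1, 0, 0))
pred <- predict(fit, n.ahead = 4, newxreg = future_xreg)
print(pred$pred)
```

auto.arima and forecast accept an xreg matrix in much the same way, but check the forecast package documentation for the exact argument names in the installed version.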
