Re: [R] off topic but need your pointers about statistics

2009-06-19 Thread Gerard M. Keogh
Agresti's first book: Categorical Data Analysis, Appendix B. If you're interested in probability and random processes, look at the notes at the end of each chapter in "Real Analysis and Probability" by R. M. Dudley.

Re: [R] books on Time series

2009-06-15 Thread Gerard M. Keogh
Antonio, You basically need the cross-correlation function in R which is very easy to use - just look up the examples in ?ccf. So, if any of the books you mention deal with it - you'll be ok. If you know about ARIMA models (or know someone who does in a maths/stats dept) and stuff like that then
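A minimal sketch of what the cross-correlation function reports, using simulated data. The series `x` and `y`, the delay of 3 steps and the noise level are illustrative assumptions, not taken from the thread.

```r
# Simulated example: y is x delayed by 3 steps plus noise (all values hypothetical)
set.seed(1)
n <- 200
z <- as.numeric(arima.sim(list(ar = 0.5), n = n + 3))
x <- z[4:(n + 3)]                  # x_t = z_(t+3)
y <- z[1:n] + rnorm(n, sd = 0.3)   # y_t = x_(t-3) + noise

# The lag k value of ccf(x, y) estimates cor(x[t+k], y[t])
cc <- ccf(x, y, lag.max = 10, plot = FALSE)
best_lag <- cc$lag[which.max(abs(cc$acf))]
print(best_lag)
```

Because `y` trails `x` by 3 steps, the largest cross-correlation is expected at a negative lag under R's sign convention; swapping the arguments flips the sign.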

Re: [R] how can I ordinal regression??

2009-06-03 Thread Gerard M. Keogh
Here's some code to implement the cheese data table using the proportional odds model given in Generalised Linear Models by McCullagh & Nelder (Ch5). You will have to adapt this to handle your case data - but I'll give you something to go on. cheese library(MASS) options(contrasts =

Re: [R] Validity of Pearson's Chi-Square for Large Tables

2009-06-03 Thread Gerard M. Keogh
Hi, didn't get your name. For large tables (5 X 5 or bigger) the distribution of the log of the cross-product ratios tends to normality. There are (nC2)**2/2 of these (200 in a 5X5 table). The chi-sq test for independence fits a main effects loglinear model to the table and this can be expressed in terms o

Re: [R] Harmonic Analysis

2009-05-27 Thread Gerard M. Keogh
My thoughts exactly. ?fft should do the job. And define the dominant term by a_n**2 + b_n**2 - the Parseval relation.
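A rough sketch of picking the dominant harmonic from fft() output. The test signal (harmonics 5 and 12, amplitudes 2 and 0.5, plus noise) is a made-up example, not data from the thread.

```r
# Hypothetical signal: a strong cosine at harmonic 5 and a weaker sine at harmonic 12
set.seed(2)
n <- 128
t <- 0:(n - 1)
x <- 2 * cos(2 * pi * 5 * t / n) + 0.5 * sin(2 * pi * 12 * t / n) + rnorm(n, sd = 0.2)

X <- fft(x)
k <- 1:(n / 2 - 1)            # harmonics below the Nyquist frequency
a <- 2 * Re(X[k + 1]) / n     # cosine coefficients a_k
b <- -2 * Im(X[k + 1]) / n    # sine coefficients b_k
power <- a^2 + b^2            # a_k**2 + b_k**2 for each harmonic
dominant <- k[which.max(power)]
print(dominant)
```

Note that R is case-sensitive, so the help page is ?fft. The scaling by 2/n is one common convention for recovering amplitudes; other texts scale differently.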

Re: [R] Linear least squares fit with errors in both x and y values.

2009-05-07 Thread Gerard M. Keogh
James, look up "errors in variables" models or "instrumental variable" models in econometrics. The statistics alternative is a "random effects" or "mixed effects" model which plugs the variation in the x's into a randomly varying parameter - these are available in R (?lmer or glmm - I think). Som

Re: [R] Multiple imputations : wicked dataset. Need advice for follow-up to a possible solution.

2009-04-28 Thread Gerard M. Keogh
Emmanuel, Friedman's (Annals of Stats 1991) MARS program implements recursive partitioning in a regression context - a version of it written by Trevor Hastie was available in R but I don't know what package it's now in - I only have base stuff available (long story). MARS, like recursive partitio

[R] arima on defined lags

2009-04-09 Thread Gerard M. Keogh
Dear all, The standard call to ARIMA in the base package such as arima(y,c(5,0,0),include.mean=FALSE) gives a full 5th order lag polynomial model with for example coeffs Coefficients: ar1 ar2 ar3 ar4 ar5 0.4715 0.067 -0.1772 0.0

Re: [R] Does R support double-exponential smoothing?

2009-04-01 Thread Gerard M. Keogh
Fit an ARIMA(0,2,2) model - it's the same thing and you'll get the MLE of the smoothing parameter for free. Use logs if you want a multiplicative model. Gerard
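A minimal sketch of the ARIMA(0,2,2) route, assuming a simulated series. The MA values used to generate `y` are arbitrary illustrative choices.

```r
# Simulate a series with a stochastic local linear trend (parameters are hypothetical)
set.seed(3)
y <- arima.sim(list(order = c(0, 2, 2), ma = c(-0.8, 0.2)), n = 200)

# Fit ARIMA(0,2,2) by maximum likelihood; the two MA terms play the role
# of the smoothing parameters
fit <- arima(y, order = c(0, 2, 2))
print(coef(fit))

# Forecasts follow a straight line, as in double-exponential smoothing
fc <- predict(fit, n.ahead = 5)
print(fc$pred)
```

For a multiplicative model, the same call could be applied to log(y), provided the series is strictly positive.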

Re: [R] Iterative Proportional Fitting, use

2009-03-23 Thread Gerard M. Keogh
Keon, why not fit a loglinear independence model, which as far as I know gives the same fit? Gerard. Here's an example from Agresti - Intro to Cat Data Analysis. Example: Alcohol, cigarette, marijuana use |--+--+| | Alcohol |
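The equivalence can be checked on a small table. This is only a sketch: the 2x2 counts below are hypothetical and are not the Agresti data, and loglin() is used here as R's iterative proportional fitting routine.

```r
# Hypothetical 2x2 table of counts (not the Agresti example)
tab <- as.table(matrix(c(30, 10, 20, 40), nrow = 2,
                       dimnames = list(A = c("yes", "no"), B = c("yes", "no"))))
df <- as.data.frame(tab)   # columns A, B, Freq

# Loglinear independence model fitted as a Poisson GLM
glm_fit <- glm(Freq ~ A + B, family = poisson, data = df)

# Same model fitted by iterative proportional fitting on the two margins
ipf_fit <- loglin(tab, margin = list(1, 2), fit = TRUE, print = FALSE)$fit

print(cbind(df, glm = fitted(glm_fit), ipf = as.vector(ipf_fit)))
```

Both columns of fitted values should agree with the usual (row total x column total) / n expected counts.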

[R] R under Citrix and access to Lotus notes

2009-03-04 Thread Gerard M. Keogh
Dear All, 1. Does anyone have experience of running R on a server inside a Citrix shell - I'd like to get R onto the server and would be grateful for any tips or direction on the matter. 2. This may seem like a silly question so forgive my ignorance. Most of the data I currently work with is h

[R] R and Citrix - Lotus notes

2009-03-04 Thread Gerard M. Keogh
Dear All, 1. Does anyone have experience of running R on a server inside a Citrix shell - I'd like to get R onto the server and would be grateful for any tips or direction on the matter. 2. This may seem like a silly question so forgive my ignorance. Most of the data I currently work with is hel

Re: [R] Inefficiency of SAS Programming

2009-03-03 Thread Gerard M. Keogh

Re: [R] Inefficiency of SAS Programming

2009-03-02 Thread Gerard M. Keogh

Re: [R] Inefficiency of SAS Programming

2009-02-27 Thread Gerard M. Keogh

Re: [R] Inefficiency of SAS Programming

2009-02-27 Thread Gerard M. Keogh
Frank, I can't see the code you mention - Web marshall at work - but I don't think you should be too quick to run down SAS - it's a powerful and flexible language but unfortunately very expensive. Your example mentions doing a vector product in the macro language - this only suggests to me that th

Re: [R] Problem about SARMA model forcasting

2009-02-03 Thread Gerard M. Keogh
Saji, This may help. Your model is (1,0,1)X(0,1,1)S, giving the polynomials:
nonseasonal (1,0,1): AR (1-ar1*B), MA (1-ma1*B)
seasonal (0,1,1)S: difference (1-B**S), MA (1-sma1*B**S)
giving:
(1-ar1*B)X(1-B**S) x_t = (1-ma1*B)X(1-sma1*B**S) a_t
multiplying out: x_t - x_(t-S) - ar1*x_(

Re: [R] optim() and ARIMA

2009-01-27 Thread Gerard M. Keogh
Surely, this sounds like a bug in the optim function. The rule of thumb with ts data is to scale so that the data have mean 0 and unit variance and then fit a) for non-seasonal data the IMA model (0,1,1); and b) for seasonal data the so-called Airline Model (0,1,1)X(0,1,1)S see for example A course

Re: [R] Proportional response and boosting

2009-01-20 Thread Gerard M. Keogh
Quick response on the binomial: If possible I would suggest you should model pi = (number/freq of type A) / (total_freq of type A) veg.glm = glm ( pi ~ x, weights = total_freq, family=binomial) The glm method is supposed to work only on the natural numbers (inc 0!) but also works for decimal da
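A small sketch of the weighted-proportion form next to the equivalent two-column count form. The variables `x`, `successes` and `total` are hypothetical; the proportion is named `prop` rather than `pi` to avoid masking R's built-in constant.

```r
# Hypothetical grouped data: successes out of total at each value of x
x <- c(1, 2, 3, 4, 5, 6)
total <- c(20, 25, 30, 22, 28, 24)
successes <- c(3, 6, 12, 11, 19, 20)
prop <- successes / total

# Proportion response with the group totals as prior weights
m_prop <- glm(prop ~ x, weights = total, family = binomial)

# Equivalent specification with (successes, failures) as a two-column response
m_cnt <- glm(cbind(successes, total - successes) ~ x, family = binomial)

print(rbind(proportion_form = coef(m_prop), count_form = coef(m_cnt)))
```

The two specifications give the same coefficient estimates; the weights argument is what tells glm how many trials sit behind each proportion.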

Re: [R] noise in time series

2009-01-15 Thread Gerard M. Keogh
Hi, here's a possibility! Your problem can be restated as "given 2 observers giving 2 measures what is their level of agreement" - the classical measure here is the Kappa (see sec 10.5 of Categorical Data Analysis by Alan Agresti, in Ed 1 - Ed 2 should also have it!) and you can also model the si

[R] loglm fitting

2009-01-14 Thread Gerard M. Keogh
Dear all, sorry to bother you all with this but I've been trying to use the loglm in MASS package (v2.8.0) and cannot get any sensible output. I'm wondering am I doing something very foolish or missing something obvious. For example, I tried the documentation help(loglm) example - here's the cod

Re: [R] inter-timeseries correlation or corrections

2009-01-14 Thread Gerard M. Keogh
Tim, Given you have so little data - I would try a) prefilter and forecast Fit a fairly simple ARIMA(0,1,1) model to A and treat 8987 as an outlier - then predict the fitted series AF with the 3 missing points as forecasts. Fit another ARIMA(0,1,1) with 7688 as an outlier - then back cast to fill

[R] deviance in polr method

2009-01-13 Thread Gerard M. Keogh
Dear all, I've replicated the cheese tasting example on p175 of GLM's by McCullagh and Nelder. This is a 4 treatment (rows) by 9 ordinal response (cols) table. Here's my simple code: cheese library(MASS) options(contrasts = c("contr.treatment", "contr.poly")) y = c

Re: [R] AR(2) coefficient interpretation

2008-12-24 Thread Gerard M. Keogh
Use lm on lagged values. Gerard
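A minimal sketch of fitting an AR(2) by regressing on lagged values, assuming a simulated series. The true coefficients 0.5 and -0.3 and the column names are illustrative choices.

```r
# Simulate an AR(2) series (coefficients are hypothetical)
set.seed(4)
x <- as.numeric(arima.sim(list(ar = c(0.5, -0.3)), n = 500))

# embed() lines up x_t with its first and second lags, one row per time point
lagged <- embed(x, 3)
d <- data.frame(x_t = lagged[, 1], x_lag1 = lagged[, 2], x_lag2 = lagged[, 3])

# Ordinary least squares on the lagged values
fit <- lm(x_t ~ x_lag1 + x_lag2, data = d)
print(coef(fit))
```

The slope on `x_lag1` estimates the first AR coefficient and the slope on `x_lag2` the second, so each can be read as the effect of that lag holding the other fixed.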

[R] queue simulation

2008-12-22 Thread Gerard M. Keogh
Hi all, I have a multiple queueing situation I'd like to simulate to get some idea of the distributions - waiting times and allocations etc. Does R have a package available for this - many years ago there used to be a language called "simscript" for discrete event simulation and I was wondering if

Re: [R] Testing predictive power of ARIMA model

2008-12-15 Thread Gerard M. Keogh
Sorry, but this gives me the shivers! Are all your time series linear? For each model you should check the residuals and their squares to see if they are uncorrelated (Box-Ljung Chi-sq). Another useful check is to test for a trend in the coefficient of variation of the residuals. If the series is
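The residual checks can be sketched as follows, assuming a simulated AR(1) series. The model order, the lag of 10 and the data are illustrative assumptions.

```r
# Simulated AR(1) series and a matching fitted model (values are hypothetical)
set.seed(5)
y <- arima.sim(list(ar = 0.6), n = 300)
fit <- arima(y, order = c(1, 0, 0))
r <- residuals(fit)

# Ljung-Box test on the residuals; fitdf removes one df for the fitted AR term
lb_resid <- Box.test(r, lag = 10, type = "Ljung-Box", fitdf = 1)

# Same test on the squared residuals, which picks up remaining nonlinear
# (e.g. ARCH-type) dependence
lb_sq <- Box.test(r^2, lag = 10, type = "Ljung-Box")

print(c(residuals = lb_resid$p.value, squares = lb_sq$p.value))
```

Small p-values in either test would suggest the model has left structure in the residuals.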

Re: [R] Validity of GLM using Gaussian family with sqrt link

2008-12-11 Thread Gerard M. Keogh
Hi all, Just on this question : can I assume any R internal defined function can be used to describe the link (e.g. = "arctan") so long as it's increasing and monotone? How might abs work for example - (except at 0)? And/or finally, can I define any old function in R called "myfun" and use link="

Re: [R] Simplex function in R

2008-12-11 Thread Gerard M. Keogh
re pseudo inverse On the point of generalised inverses - GINV is usually taken to mean the Moore-Penrose pseudo inverse - this is the least squares projection. There are others - e.g. the Drazin inverse which amounts to diagonalisation - of course this inverse may not be available in R. Gerard
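A short sketch of the Moore-Penrose inverse using ginv() from the MASS package. The rank-deficient matrix `A` and vector `b` are made-up examples.

```r
library(MASS)

# Hypothetical singular matrix: the second column is twice the first
A <- matrix(c(1, 2, 3,
              2, 4, 6,
              1, 0, 1), nrow = 3)
G <- ginv(A)   # Moore-Penrose pseudo inverse

# Two of the defining Penrose conditions
err1 <- max(abs(A %*% G %*% A - A))
err2 <- max(abs(G %*% A %*% G - G))

# G %*% b gives the minimum-norm least squares solution of A x = b
b <- c(1, 2, 4)
x_ls <- G %*% b
print(c(err1 = err1, err2 = err2))
```

The product A %*% G is the orthogonal projection onto the column space of A, which is the least squares projection mentioned above.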

[R] for loop query

2008-12-09 Thread Gerard M. Keogh
Hi all, apologies if this is obvious - but I can't see it and would appreciate some quick help! the matrix mhouse is 26x3 and I'm computing odds ratios. The simple code below "should" compute the odds vector for every pair (325) i.e. 26C2 in cols 1 and 2. On the first i=1 outer loop the inner j

Re: [R] Simulating underdispersed counts

2008-12-04 Thread Gerard M. Keogh
V interesting point Greg. But are you not just suggesting left and right truncation? It strikes me that if the data are Poisson then a mixture is likely to be better - something akin to zero-deflated. Neg binomial works for greater variance == mix a gamma and poisson, but I'm unsure what to mix t

[R] confidence interval for glm

2008-11-28 Thread Gerard M. Keogh
Hi all, simple Q: how do I extract the upper and lower CI for predicted probabilities directly for a glm - I'm sure there's a one-liner to do it but I can't find it. The predicted values I get with the predict (.. "response"). Thanks Gerard

Re: [R] How to measure the significant difference between two time series

2008-11-26 Thread Gerard M. Keogh
Q1: Quick answer a) you need to remove the seasonality - there should be a tool in the time series package to do this - though I'm not familiar enough with R to know this. b) check the resulting series to see if it is stationary - acf decays quickly i.e. within a couple of lags. c) if two series are st

Re: [R] plots of ACF

2008-11-25 Thread Gerard M. Keogh
Sara, look carefully at the acf again and increase the lag. Lags outside the envelope indicate differencing may be necessary. If the data are "seasonal" you could be seeing a cycle with period 8 - you'll see alias peaks at 16 and 24; ideally plot the periodogram which will show spikes at these fr
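A rough sketch of both checks on a simulated series with a period-8 cycle. The series length, amplitude and noise level are illustrative assumptions.

```r
# Hypothetical series: a cycle of period 8 plus noise
set.seed(6)
n <- 240
t <- 1:n
x <- sin(2 * pi * t / 8) + rnorm(n, sd = 0.5)

# ACF out to lag 40: peaks recur at lags 8, 16, 24, ...
a <- acf(x, lag.max = 40, plot = FALSE)

# Raw periodogram: the spike should sit at frequency 1/8 cycles per observation
pg <- spec.pgram(x, taper = 0, detrend = FALSE, plot = FALSE)
peak_freq <- pg$freq[which.max(pg$spec)]
print(c(peak_freq = peak_freq, period = 1 / peak_freq))
```

Dropping plot = FALSE from either call draws the corresponding plot, including the confidence envelope on the ACF.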

[R] binomial glm???

2008-11-20 Thread Gerard M. Keogh
Hi everyone, newbie query! I've installed R 2.8.0 and tried to run this simple glm - x is no of cars in a given year, y is the number voted in an election that year while n is the population 18+: votes <- data.frame(x = c(0.62,0.77,0.71,0.74,0.77,0.86,1.13,1.44), +