[R] Constraint Linear regression

2012-03-20 Thread priya fernandes
Hi there,

I am trying to use linear regression to solve the following equation -

y <- c(0.2525, 0.3448, 0.2358, 0.3696, 0.2708, 0.1667, 0.2941, 0.2333,
0.1500, 0.3077, 0.3462, 0.1667, 0.2500, 0.3214, 0.1364)
x2 <- c(0.368, 0.537, 0.379, 0.472, 0.401, 0.361, 0.644, 0.444, 0.440,
0.676, 0.679, 0.622, 0.450, 0.379, 0.620)
x1 <- 1-x2

# equation
lmFit <- lm(y ~ x1 + x2)

lmFit
Call:
lm(formula = y ~ x1 + x2)

Coefficients:
(Intercept)   x1   x2
0.30521 -0.09726   NA

I would like to *constrain the coefficients of x1 and x2 to be between 0 and 1*.
Is there a way of adding constraints to lm?

I looked through the old help files and found a solution by Emmanuel using
least squares. The method (with modification) is as follows -

 Data1 <- data.frame(y=y, x1=x1, x2=x2)

# The objective function : least squares.

e <- expression((y-(c1+c2*x1+c3*x2))^2)

foo <- deriv(e, name=c("c1","c2","c3"))

# Objective

objfun <- function(coefs, data) {
  return(sum(eval(foo, env=c(as.list(coefs), as.list(data)))))
}

# Objective's gradient

objgrad <- function(coefs, data) {
  return(apply(attr(eval(foo, env=c(as.list(coefs), as.list(data))),
                    "gradient"), 2, sum))
}

D1.unbound <- optim(par=c(c1=0.5, c2=0.5, c3=0.5),
                    fn=objfun,
                    gr=objgrad,
                    data=Data1,
                    method="L-BFGS-B",
                    lower=rep(0, 3),
                    upper=rep(1, 3))


D1.unbound


$par
 c1  c2  c3
0.004387706 0.203562156 0.300825550

$value
[1] 0.07811152

$counts
function gradient
       8        8

$convergence
[1] 0

$message
[1] "CONVERGENCE: REL_REDUCTION_OF_F <= FACTR*EPSMCH"

Any suggestion on how to fix the error  "CONVERGENCE: REL_REDUCTION_OF_F <=
FACTR*EPSMCH"?



[R] Wrong output due to what I think might be a data type issue (zoo read in problem)

2012-03-20 Thread knavero
Here's the small scale version of the R script:

http://pastebin.com/sEYKv2Vv

Here's the file that I'm reading in:

http://r.789695.n4.nabble.com/file/n4487682/weatherData.txt weatherData.txt 

I apologize for the length of the data. I tried to cut it down to 12 lines,
however, it wasn't reproducing the bad output that I wanted to show. 

The problem is that my whole data set shifts down. For example, I have this
when the raw data is read in or scanned in as a zoo object:

"> rawData
(12/01/10 00:53:00) (12/01/10 01:53:00) (12/01/10 02:53:00) (12/01/10
03:53:00) 
 41  40  39 
38 
(12/01/10 04:53:00) (12/01/10 05:53:00) (12/01/10 06:53:00) (12/01/10
07:53:00) 
 38  37  36 
39 
(12/01/10 08:53:00) (12/01/10 09:53:00) (12/01/10 10:53:00) (12/01/10
11:53:00) 
 43  47  50 "

Then when I run it through my code, which should feed out the exact same
thing (the values at least), the output is this:

"> intData
(12/01/10 00:53:00) (12/01/10 01:53:00) (12/01/10 02:53:00) (12/01/10
03:53:00) 
   11.0    10.0     9.0
8.0 
(12/01/10 04:53:00) (12/01/10 05:53:00) (12/01/10 06:53:00) (12/01/10
07:53:00) 
8.0 7.0 6.0
9.0 
(12/01/10 08:53:00) (12/01/10 09:53:00) (12/01/10 10:53:00) (12/01/10
11:53:00) 
   13.0    17.0    20.0
24.0 "

Finally, my dput(rawData) and dput(intData):

"> dput(rawData)
structure(c(11L, 10L, 9L, 8L, 8L, 7L, 6L, 9L, 13L, 17L, 20L, 
24L, 27L, 27L, 27L, 26L, 23L, 21L, 20L, 21L, 18L, 16L, 14L, 14L, 
12L, 10L, 12L, 11L, 10L, 10L, 11L, 14L, 16L, 20L, 23L, 27L, 25L, 
26L, 29L, 28L, 27L, 26L, 24L, 24L, 25L, 24L, 23L, 23L, 21L, 20L, 
18L, 19L, 18L, 18L, 16L, 18L, 21L, 24L, 25L, 27L, 27L, 29L, 29L,..." 

"> dput(intData)
structure(c(11, 10, 9, 8, 8, 7, 6, 9, 13, 17, 20, 24, 27, 27, 
27, 26, 23, 21, 20, 21, 18, 16, 14, 14, 12, 10, 12, 11, 10, 10, 
11, 14, 16, 20, 23, 27, 25, 26, 29, 28, 27, 26, 24, 24, 25, 24, 
23, 23, 21, 20, 18, 19, 18, 18, 16, 18, 21, 24, 25, 27, 27, 29, 
29, 28, 26, 25, 24, 22, 22, 22, 21, 21, 21, 20, 21, 21, 20, 21,..." 

I am not sure how to interpret this, however I have tried researching on
what the "L" following the number is, and it seems they are "list" values? 
Also, I have read ?colClasses in the R manual, and have tried colClasses.
From experience using C, there seems to be a related error message saying:

"scan() expected 'a real', got 'M'"

What is "M"? Is that matrix? Any clarification of the issue and solution is
appreciated. I apologize in advance for any noob mistake related to asking
questions correctly according to forum specifications. Thanks for any help!
I will keep messing around with colClasses... I feel like I am close to a
solution... however, I am very far from understanding the problem.



--
View this message in context: 
http://r.789695.n4.nabble.com/Wrong-output-due-to-what-I-think-might-be-a-data-type-issue-zoo-read-in-problem-tp4487682p4487682.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Reshape from long to wide

2012-03-20 Thread Tyler Rinker

Another approach, as your needs are very specific (take every other item in the
second column and the unique values of column 1), would be to index, use unique,
and put it together with data.frame (or cbind).

data.frame(family = unique(x[, 1]), kid1 = x[c(T, F), 2], kid2 = x[c(F, T), 2])

Cheers,
Tyler

> From: jorgeivanve...@gmail.com
> Date: Tue, 20 Mar 2012 00:29:08 -0400
> To: alyaba...@gmail.com
> CC: r-help@r-project.org
> Subject: Re: [R] Reshape from long to wide
> 
> Hi aly,
> 
> Try
> 
> # your data
> x <- structure(list(family = c(14L, 14L, 15L, 15L, 17L, 17L, 18L,
> 18L, 20L, 20L, 24L, 24L, 25L, 25L, 27L, 27L, 28L, 28L, 29L, 29L
> ), length = c(18L, 7L, 7L, 21L, 50L, 21L, 36L, 21L, 36L, 42L,
> 56L, 42L, 43L, 56L, 15L, 42L, 7L, 42L, 56L, 49L)), .Names = c("family",
> "length"), class = "data.frame", row.names = c(NA, -20L))
> 
> # processing
> require(plyr)
> ddply(x, .(family), function(df) c(kid1 = df$length[1], kid2 =
> df$length[2]))
>    family kid1 kid2
> 1      14   18    7
> 2      15    7   21
> 3      17   50   21
> 4      18   36   21
> 5      20   36   42
> 6      24   56   42
> 7      25   43   56
> 8      27   15   42
> 9      28    7   42
> 10     29   56   49
> 
> HTH,
> Jorge.-
> 
> 
> On Mon, Mar 19, 2012 at 7:01 PM, aly <> wrote:
> 
> > Hi,
> >
> > I'm a total beginner in R and this question is probably very simple but
> > I've
> > spent hours reading about it and can't find the answer. I'm trying to
> > reshape a data table from long to wide format. I've tried reshape() and
> > cast() but I get error messages every time and I can't figure why. In my
> > data, I have the length of two fish from each family. My data table (called
> > fish) looks like this:
> >
> > family  length
> > 14  18
> > 14  7
> > 15  7
> > 15  21
> > 17  50
> > 17  21
> > 18  36
> > 18  21
> > 20  36
> > 20  42
> > 24  56
> > 24  42
> > 25  43
> > 25  56
> > 27  15
> > 27  42
> > 28  7
> > 28  42
> > 29  56
> > 29  49
> >
> > I want it to look like this:
> >
> > family kid1 kid2
> > 14  18  7
> > 15  7   21
> > 17  50  21
> > 18  36  21
> > 20  36  42
> > 24  56  42
> > 25  43  56
> > 27  15  42
> > 28  7   42
> > 29  56  49
> >
> > I've tried:
> >
> > >cast( fish, fam~length)
> >
> > and got the error message:
> >
> > Using length as value column.  Use the value argument to cast to override
> > this choice
> > Error in `[.data.frame`(data, , variables, drop = FALSE) :
> >  undefined columns selected
> >
> > Then I rename the columns:
> >
> > >myvars<-c("fam","length")
> > >fish<-fish[myvars]
> >
> > and try the cast() again with no luck (same error)
> >
> > By using reshape() I don't get the results I want:
> >
> > >reshape(rdm1, timevar="fam", idvar=c("length"), direction="wide")
> > > head(first)
> >        length
> > 14.20      14
> > 14.19       7
> > 15.25      21
> > 17.30      50
> > 18.32      36
> > 20.36      42
> >
> > Can someone help with this? Thanks a lot!
> >
> >
> >
> >
> > --
> > View this message in context:
> > http://r.789695.n4.nabble.com/Reshape-from-long-to-wide-tp4486875p4486875.html
> > Sent from the R help mailing list archive at Nabble.com.
> >


[R] how to find order of the autoregressive process in r

2012-03-20 Thread sagarnikam123
If I have a time series, can you give me a code example for finding the order
of an auto-regressive process?
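
For reference, a sketch (not part of the original message): in base R, ar()
selects the autoregressive order by AIC, and pacf() shows where the partial
autocorrelations cut off.

set.seed(1)
x <- arima.sim(list(ar = c(0.6, -0.3)), n = 200)  # a hypothetical AR(2) series
ar(x)$order   # order chosen by AIC
pacf(x)       # significant early lags also suggest the order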

--
View this message in context: 
http://r.789695.n4.nabble.com/how-to-find-order-of-the-autoregressive-process-in-r-tp4487721p4487721.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Wrong output due to what I think might be a data type issue (zoo read in problem)

2012-03-20 Thread knavero
found a temporary fix (I'm sure it's redundant and not as elegant, but here
it is):

require(zoo)
require(chron)
setwd("/home/knavero/Desktop/")

fmt = "%m/%d/%Y %H:%M"
tail1 = function(x) tail(x, 1)
rawData = read.zoo("weatherData.txt", header = T, FUN = as.chron,
   format = fmt, sep = "\t", aggregate = tail1)
   #colClasses = c(NA, "matrix"))

rawData = zoo(cbind(temp = as.vector(rawData)), time(rawData))

oneMin = seq(start(rawData), end(rawData), by = times("01:00:00"))
intData = na.approx(rawData, xout = oneMin)

par(mfrow = c(3, 1), oma = c(0, 0, 2, 0), mar = c(2, 4, 1, 1))

plot(rawData, type = "p", ylim = c(0, 100))
grid(col = "darkgrey")

plot(intData, type = "p", ylim = c(0, 100))
grid(col = "darkgrey")

Silly coding, huh? It works though... the plots were just to double check,
btw... nothing significant obviously.


--
View this message in context: 
http://r.789695.n4.nabble.com/Wrong-output-due-to-what-I-think-might-be-a-data-type-issue-zoo-read-in-problem-tp4487682p4487739.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Automaticall adjust axis scales

2012-03-20 Thread Jim Lemon

Alaios wrote:
> Dear all,
>
> I have made a function that, given a number of list elements, plots them
> to the same window.
>
> The first element is plotted by using plot and all the rest are plotted
> in the same window by using lines.
>
> I have below a small and simple reproducible example.
>
> x1<-c(1:10)
> plot(x1)
>
> x2<-c(11:20)
> lines(x2)
>
> x3<-c(31:40)
> lines(x3)
>
> As you might notice, the two consecutive lines fail to be plotted because
> the axes were set up by the first plot.
> Would it be possible, after the last lines() call, to change the axes so
> that the minimum and the maximum of all data sets are visible?
>
> Any idea how I can do that?
>
Hi Alaios,
Try this:

ylim=range(c(x1,x2,x3))
plot(x1,ylim=ylim,type="l")
lines(x2)
lines(x3)
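
An equivalent shortcut, as a sketch (not part of the reply above): matplot()
computes a common range for all columns automatically.

matplot(cbind(x1, x2, x3), type = "l", lty = 1, ylab = "value")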

Jim



Re: [R] R (Bold font) and Latex

2012-03-20 Thread Rainer Schuermann
For a small number of elements you could use \Sexpr{},
i.e.

<<>>=
x<-c(1,0,2,4)
@
x\\
\textbf{\Sexpr{x[1]}}\\
\textbf{\Sexpr{x[2]}}\\
\textbf{\Sexpr{x[3]}}\\
\textbf{\Sexpr{x[4]}}\\
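
For longer vectors, a sketch (the chunk options are an assumption, not from
the original reply): let R emit the LaTeX itself from a results=tex chunk.

<<results=tex, echo=FALSE>>=
x <- c(1, 0, 2, 4)
cat(sprintf("\\textbf{%s}\\\\", x), sep = "\n")
@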

Rgds,
Rainer


On Monday 19 March 2012 20:03:47 Manish Gupta wrote:
> Hi,
> 
> I am using R and latex for generating report. I need R result to be in bold
> face.
> 
> For instance.
> x<-c(1,0,2,4)
> 
> I need to print its output in bold face.
> x
> *1
> 2
> 3
> 4*
> 
> I attempted to use textbf{} but can not write R output inside it. How can i
> implement it. Thanks in advance.
> 
> Regards
> 
> --
> View this message in context:
> http://r.789695.n4.nabble.com/R-Bold-font-and-Latex-tp4487535p4487535.html
> Sent from the R help mailing list archive at Nabble.com.
> 


Re: [R] SE from nleqslv

2012-03-20 Thread Berend Hasselman

On 20-03-2012, at 01:01, FU-WEN LIANG wrote:

> Dear R-users,
> 
> I use the "nleqslv" function to get parameter estimates by solving a system
> of non-linear equations. But I also need standard error for each of
> estimates. I checked the nleqslv manual but it didn't mention about SE.
> Is there any way to get the SE for each estimate?

nleqslv is for solving a nonlinear system of equations. Only that.
If you provide a system of equations for determining standard errors then 
nleqslv might be able to solve that system. 
You can use nleqslv to investigate the sensitivity of a solution wrt changes in 
parameters.

Berend



Re: [R] R (Bold font) and Latex

2012-03-20 Thread Rainer Schuermann
Or, with a little less typing:

<<>>=
x<-c(1,0,2,4)
@
x\\
\begin{bfseries}
\Sexpr{x[1]}\\
\Sexpr{x[2]}\\
\Sexpr{x[3]}\\
\Sexpr{x[4]}\\
\end{bfseries}



On Tuesday 20 March 2012 10:14:38 Rainer Schuermann wrote:
> For a small number of elements you could use \Sexpr{},
> i.e.
> 
> <<>>=
> x<-c(1,0,2,4)
> @
> x\\
> \textbf{\Sexpr{x[1]}}\\
> \textbf{\Sexpr{x[2]}}\\
> \textbf{\Sexpr{x[3]}}\\
> \textbf{\Sexpr{x[4]}}\\
> 
> Rgds,
> Rainer
> 
> On Monday 19 March 2012 20:03:47 Manish Gupta wrote:
> > Hi,
> > 
> > I am using R and latex for generating report. I need R result to be in
> > bold
> > face.
> > 
> > For instance.
> > x<-c(1,0,2,4)
> > 
> > I need to print its output in bold face.
> > x
> > *1
> > 2
> > 3
> > 4*
> > 
> > I attempted to use textbf{} but can not write R output inside it. How can
> > i
> > implement it. Thanks in advance.
> > 
> > Regards
> > 
> > --
> > View this message in context:
> > http://r.789695.n4.nabble.com/R-Bold-font-and-Latex-tp4487535p4487535.html
> > Sent from the R help mailing list archive at Nabble.com.
> > 


[R] Fitting loglinear model with glm() and loglm()

2012-03-20 Thread Christofer Bogaso
Dear all, I have a small difficulty in comprehending the loglinear model
with R. Assume we have the following data

dat <- array(c(911, 44, 538, 456, 3, 2, 43, 279), c(2, 2, 2))

Now I fit a loglinear model with this and get the fitted values:

library(MASS)
Model_1 <- loglm(~1 + 2 + 3, dat)
fitted(Model_1)

I could do this same task using glm() function as well because
loglinear model is just 1 kind of glm

### Create dummy variables manually
Dummy_Variable_Matrix <- rbind(c(1, 1, 1),
   c(0, 1, 1),
   c(1, 0, 1),
   c(0, 0, 1),

   c(1, 1, 0),
   c(0, 1, 0),
   c(1, 0, 0),
   c(0, 0, 0))

### Fit glm

model_2 <- glm(as.vector(dat) ~
   Dummy_Variable_Matrix[,1] +
   Dummy_Variable_Matrix[,2] +
   Dummy_Variable_Matrix[,3],
   poisson(link = log));
fitted(model_2)

### However

fitted(model_2) == as.vector(fitted(Model_1)) ### do not match


However, it is true that the difference is very small; still, I am
wondering whether I should just ignore that small difference, or whether I
have done something fundamentally wrong.

Thanks for your help!



Re: [R] Fitting loglinear model with glm() and loglm()

2012-03-20 Thread Achim Zeileis

On Tue, 20 Mar 2012, Christofer Bogaso wrote:


Dear all, I have small difficulty in comprehending the loglinear model
with R. Assume, we have following data

dat <- array(c(911, 44, 538, 456, 3, 2, 43, 279), c(2, 2, 2))

Now I fit a loglinear model with this and get the fitted values:

library(MASS)
Model_1 <- loglm(~1 + 2 + 3, dat)
fitted(Model_1)

I could do this same task using glm() function as well because
loglinear model is just 1 kind of glm

### Create dummy variables manually
Dummy_Variable_Matrix <- rbind(c(1, 1, 1),
   c(0, 1, 1),
   c(1, 0, 1),
   c(0, 0, 1),

   c(1, 1, 0),
   c(0, 1, 0),
   c(1, 0, 0),
   c(0, 0, 0))

### Fit glm

model_2 <- glm(as.vector(dat) ~
   Dummy_Variable_Matrix[,1] +
   Dummy_Variable_Matrix[,2] +
   Dummy_Variable_Matrix[,3],
   poisson(link = log));
fitted(model_2)

### However

fitted(model_2) == as.vector(fitted(Model_1)) ### do not match


However it is true that the difference is very small, still I am
wondering whether should I just ingore that small difference? Or I
have done something fundamentally wrong?


The fitted values are not the same (==) but equal up to some tolerance 
appropriate for floating point numbers (see all.equal).


The reason is that different numeric algorithms are employed for 
maximizing the log-likelihood. loglm() internally uses loglin() which uses 
iterative proportional fitting. glm() internally uses glm.fit() which 
performs iterative weighted least squares.


BTW: Setting up frequencies and factors for glm() modeling based on a 
table can be done more easily by coercing the "array" to a "table" and 
then to a "data.frame":


tab <- as.table(dat)
m1 <- loglm(~ 1 + 2 + 3, data = tab)

dframe <- as.data.frame(tab)
m2 <- glm(Freq ~ Var1 + Var2 + Var3, data = dframe, family = poisson)

all.equal(as.vector(fitted(m1)), as.vector(fitted(m2))) ## TRUE

Also, the LR and Pearson statistics from print(m1) can be reproduced via

sum(residuals(m2, type = "deviance")^2)
sum(residuals(m2, type = "pearson")^2)

Hope that helps,
Z


Thanks for your help!



Re: [R] Adding columns to csvs in a loop

2012-03-20 Thread Jim Holtman
for (i in fileList){
  x <- read.csv(i)
  x$QID <- ""
  x$COMMENTS <- ""
  x$"DATE CREATED" <- ""
  write.csv(x, file = i)
}
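
Note that fileList is not defined in the snippet above. A hypothetical way to
build it (an assumption, not part of the original reply); row.names = FALSE
keeps write.csv() from adding an extra row-name column:

## hypothetical: collect the csv paths from the "20120314" folder
fileList <- list.files("20120314", pattern = "\\.csv$", full.names = TRUE)
## then inside the loop:
## write.csv(x, file = i, row.names = FALSE)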

Sent from my iPad

On Mar 19, 2012, at 17:42, Edgar Alminar  wrote:

> Hello,
> I am trying to add columns to a folder of csvs (the folder is called 
> "20120314"). 
> I have csvs of different numbers of columns, but at the end of this loop, I'd 
> like to add three columns to each csv: "QID", "COMMENTS", "DATE CREATED".
> I've tried some things with cbind, I looked at using awk, but I couldn't get 
> either to work.
> 
> Does anyone have an example of a working loop that adds columns to a folder 
> of csvs?
> 
> Thanks!
> Edgar


Re: [R] Wrong output due to what I think might be a data type issue (zoo read in problem)

2012-03-20 Thread Joshua Ulrich
On Mon, Mar 19, 2012 at 11:34 PM, knavero  wrote:
> Here's the small scale version of the R script:
>
> http://pastebin.com/sEYKv2Vv
>
> Here's the file that I'm reading in:
>
> http://r.789695.n4.nabble.com/file/n4487682/weatherData.txt weatherData.txt
>
> I apologize for the length of the data. I tried to cut it down to 12 lines,
> however, it wasn't reproducing the bad output that I wanted to show.
>



>
> Finally, my dput(rawData) and dput(intData):
>
> "> dput(rawData)
> structure(c(11L, 10L, 9L, 8L, 8L, 7L, 6L, 9L, 13L, 17L, 20L,
> 24L, 27L, 27L, 27L, 26L, 23L, 21L, 20L, 21L, 18L, 16L, 14L, 14L,
> 12L, 10L, 12L, 11L, 10L, 10L, 11L, 14L, 16L, 20L, 23L, 27L, 25L,
> 26L, 29L, 28L, 27L, 26L, 24L, 24L, 25L, 24L, 23L, 23L, 21L, 20L,
> 18L, 19L, 18L, 18L, 16L, 18L, 21L, 24L, 25L, 27L, 27L, 29L, 29L,..."
>
> "> dput(intData)
> structure(c(11, 10, 9, 8, 8, 7, 6, 9, 13, 17, 20, 24, 27, 27,
> 27, 26, 23, 21, 20, 21, 18, 16, 14, 14, 12, 10, 12, 11, 10, 10,
> 11, 14, 16, 20, 23, 27, 25, 26, 29, 28, 27, 26, 24, 24, 25, 24,
> 23, 23, 21, 20, 18, 19, 18, 18, 16, 18, 21, 24, 25, 27, 27, 29,
> 29, 28, 26, 25, 24, 22, 22, 22, 21, 21, 21, 20, 21, 21, 20, 21,..."
>
> I am not sure how to interpret this, however I have tried researching on
> what the "L" following the number is, and it seems they are "list" values?

1 is a double.  1L is an integer.
> class(1)
[1] "numeric"
> class(1L)
[1] "integer"

> Also, I have read ?colClasses in the R manual, and have tried colClasses.
> >From experience using C, there seems to be a related error message saying:
>
> "scan() expected 'a real', got 'M'"
>
> What is "M"? Is that matrix? Any clarification of the issue and solution is

There is an "M" in your data at line 9954:
1/3/2012 9:53   48
1/3/2012 10:53  M
1/3/2012 11:53  51

> appreciated. I apologize in advance for any noob mistake related to asking
> questions correctly according to forum specifications. Thanks for any help!
> I will keep messing around with colClassesI feel like I am close to a
> solution..however, am very far from understanding the problem.
>

Best,
--
Joshua Ulrich  |  FOSS Trading: www.fosstrading.com

R/Finance 2012: Applied Finance with R
www.RinFinance.com



Re: [R] R crashes due to stats.dll

2012-03-20 Thread Kamil Bartoń

On 2012-03-18 01:15, Ted Stankowich wrote:

Hello!
I've been running a looped AIC analysis using several modules including ape, 
nlme, and MuMIn, and
during one particularly long analysis, R (ver 2.14.12) crashes several minutes 
into the routine with
the simple message "R for windows GUI front-end has stopped working". I'm using 
a brand new laptop
with Windows 7, i7 processor, 8GB RAM. I've tried it on both the 64 bit and 32 
bit versions of R.
Using the 64 bit version, the analysis makes it through a few iterations before 
it crashes (maybe
about 20-25 min into the test).

<...>

Does anyone have any idea what might be going wrong here?



I assume you're using MuMIn::dredge. If so, most likely the model fitting function with some 
combination of parameters causes the crash. You can use the 'trace = TRUE' argument for 'dredge' to find 
out which model it is. To see the output after the crash, either use R in a console (not RGUI) or 
divert the output to a file with 'sink'.
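
A minimal sketch of that diversion (the file name and model object are
hypothetical, not from this thread):

library(MuMIn)
sink("dredge-trace.log")                   # divert console output to a file
res <- dredge(global.model, trace = TRUE)  # global.model: your fitted model
sink()                                     # restore normal output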


kamil



[R] Question abou pROC package

2012-03-20 Thread hexiangxiang
How can I implement multiple testing with Bonferroni correction for ROC curves
in R?

--
View this message in context: 
http://r.789695.n4.nabble.com/Question-abou-pROC-package-tp4488271p4488271.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Wrong output due to what I think might be a data type issue (zoo read in problem)

2012-03-20 Thread knavero
update temporary fix:

http://pastebin.com/dzj0W89H

--
View this message in context: 
http://r.789695.n4.nabble.com/Wrong-output-due-to-what-I-think-might-be-a-data-type-issue-zoo-read-in-problem-tp4487682p4488179.html
Sent from the R help mailing list archive at Nabble.com.



[R] Predicting confidence intervals for fitted values for non-linear regression

2012-03-20 Thread Mills, Kathryn (NIH/NIMH) [F]
Hello,

I am interested in calculating the confidence intervals for fitted values for 
non-linear regressions. For instance, I have used the nlme package to generate 
my non-linear model.

summary(lme(myvariable~age+age.sq+age.cu, data=my.matrix, random=~1|Name))

I would like to use the model generated from the data in my.matrix to predict 
the confidence intervals for new ages. Someone posted a question similar to 
mine a few years ago, but the website someone sent in response to the question 
is no longer in use. I will paste their question below because I think it is 
clearer than mine.

Thank you for your help,
Kate

"I am interested to calculate confidence interval for fitted values in general 
for non-linear regressions. Lets say we have y=f(x1,x2,..xN) where f() is a 
non-linear regression. I would like to calculate a confidence interval for new 
prediction f(a1,..,aN). I am aware of techniques for calculating confidence 
intervals for coefficients in specific non-linear regressions and with them 
then to calculate confidence interval for the predicted value. ...

Any references to the literature or R packages would be very welcome."


Re: [R] Fitting loglinear model with glm() and loglm()

2012-03-20 Thread Søren Højsgaard
Dear Christofer,

loglm uses an iterative proportional scaling (IPS) algorithm for fitting a 
log-linear model to a contingency table. glm uses an iteratively reweighted 
least squares algorithm. The result from IPS is exact.

Regards
Søren




-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
behalf of Christofer Bogaso
Sent: 20 March 2012 11:04
To: r-help@r-project.org
Subject: [R] Fitting loglinear model with glm() and loglm()

Dear all, I have small difficulty in comprehending the loglinear model with R. 
Assume, we have following data

dat <- array(c(911, 44, 538, 456, 3, 2, 43, 279), c(2, 2, 2))

Now I fit a loglinear model with this and get the fitted values:

library(MASS)
Model_1 <- loglm(~1 + 2 + 3, dat)
fitted(Model_1)

I could do this same task using glm() function as well because loglinear model 
is just 1 kind of glm

### Create dummy variables manually
Dummy_Variable_Matrix <- rbind(c(1, 1, 1),
   c(0, 1, 1),
   c(1, 0, 1),
   c(0, 0, 1),

   c(1, 1, 0),
   c(0, 1, 0),
   c(1, 0, 0),
   c(0, 0, 0))

### Fit glm

model_2 <- glm(as.vector(dat) ~
   Dummy_Variable_Matrix[,1] +
   Dummy_Variable_Matrix[,2] +
   Dummy_Variable_Matrix[,3],
   poisson(link = log));
fitted(model_2)

### However

fitted(model_2) == as.vector(fitted(Model_1)) ### do not match


However it is true that the difference is very small, still I am wondering 
whether should I just ingore that small difference? Or I have done something 
fundamentally wrong?

Thanks for your help!



Re: [R] glm: getting the confidence interval for an Odds Ratio, when using predict()

2012-03-20 Thread peter dalgaard
[Oops, forgot cc. to list]

On Mar 20, 2012, at 04:40 , Dominic Comtois wrote:

> I apologize for the errors in the previous code. Here is a reworked example. 
> It works, but I suspect problems in the se calculation. I changed, from the 
> 1st prediction to the 2nd only one covariate, so that the OR's CI should be 
> equal to the exponentiated variable's coefficient and ci. And we get 
> something different:

Yep. Classical rookie mistake: Forgot to take sqrt() in the se. I then get

> se <- sqrt(contr %*% V %*% t(contr))
> 
> # display the CI
> exp(contr %*% coef(model) + qnorm(c(.025,.50,.975))*se)
[1] 0.655531 1.686115 4.336918
> 
> # the point estimate is ok, as verified with
> exp(model$coefficients[3])
 x2cat2 
1.686115 
> 
> # however I we'd expect to find upper and lower bound equal 
> # to the exponentiated x2cat coefficient CI
> exp(confint(model))[3,]
Waiting for profiling to be done...
   2.5 %97.5 % 
0.6589485 4.4331058 

which is as close as you can expect since the confint method is a bit more 
advanced than +/-2SE.

-pd


> x1 <- factor(rbinom(100,1,.5),levels=c(0,1))
> x2 <- factor(round(runif(100,1,2)),levels=c(1,2),labels=c("cat1","cat2"))
> outcome <- rbinom(100,1,.2)
> 
> model <- glm(outcome~x1+x2,family=binomial(logit))
> newd <- data.frame(x1=factor(c(0,0),levels=c(0,1)),
>   x2=factor(c("cat1","cat2"),levels=c("cat1","cat2")),
>   outcome=c(1,1))
> 
> M <- model.matrix(formula(model), data=newd)
> V <- vcov(model)
> contr <- c(-1,1) %*% M
> se <- sqrt(contr %*% V %*% t(contr))
> 
> # display the CI
> exp(contr %*% coef(model) + qnorm(c(.025,.50,.975))*se)
> 
> # the point estimate is ok, as verified with
> exp(model$coefficients[3])
> 
> # however I we'd expect to find upper and lower bound equal 
> # to the exponentiated x2cat coefficient CI
> exp(confint(model))[3,]
> 
> Many thanks,
> 
> Dominic C.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [R] Coalesce function in BBmisc, emoa, and microbenchmark packages

2012-03-20 Thread Paul Miller
Hi Brian,

This works very well. Still trying to develop some skill with R. So can't say I 
understand your function completely as yet, but will work on it. I had thought 
that your function might only work for two columns (because of the 
"function(x,y)" part), but the example below suggests it will work for any 
number of columns. 

Appreciate your showing this to me. 

Thanks,

Paul  


Demog <- data.frame(PFS = as.Date(c("2006-07-22", NA, "2007-12-16", 
"2008-01-19", "2009-05-05", "2006-04-29", "2006-06-18", NA)),
  DOD = as.Date(c("2006-07-23", "2008-07-09", 
"2007-12-16", "2008-01-19", "2009-05-05", "2006-04-29", "2006-06-18", NA)),
LKDA = as.Date(c(NA, NA, NA, NA, NA, NA, NA, "2008-03-25")))

coalesce <- function(...) {
 dots <- list(...)
 ret <- Reduce(function (x,y) ifelse(!is.na(x),x,y), dots)
 class(ret) <- class(dots[[1]])
 ret
}

Demog$Test <- with(Demog, coalesce(PFS, DOD, LKDA))



Re: [R] Wrong output due to what I think might be a data type issue (zoo read in problem)

2012-03-20 Thread Gabor Grothendieck
On Tue, Mar 20, 2012 at 1:24 AM, knavero  wrote:
> found a temporary fix (I'm sure it's redundant and not as elegant, but here
> it is):
>
> require(zoo)
> require(chron)
> setwd("/home/knavero/Desktop/")
>
> fmt = "%m/%d/%Y %H:%M"
> tail1 = function(x) tail(x, 1)
> rawData = read.zoo("weatherData.txt", header = T, FUN = as.chron,
>   format = fmt, sep = "\t", aggregate = tail1)
>   #colClasses = c(NA, "matrix"))
>
> rawData = zoo(cbind(temp = as.vector(rawData)), time(rawData))
>
> oneMin = seq(start(rawData), end(rawData), by = times("01:00:00"))
> intData = na.approx(rawData, xout = oneMin)
>
> par(mfrow = c(3, 1), oma = c(0, 0, 2, 0), mar = c(2, 4, 1, 1))
>
> plot(rawData, type = "p", ylim = c(0, 100))
> grid(col = "darkgrey")
>
> plot(intData, type = "p", ylim = c(0, 100))
> grid(col = "darkgrey")
>
> Silly coding huh? It works thoughthe plots were just to double check
> btw...nothing significant obviously
>

If you specify the column classes a better error message can be produced:

> weatherData.txt <- 
> "http://r.789695.n4.nabble.com/file/n4487682/weatherData.txt";
> rawData = read.zoo(weatherData.txt, header = T, FUN = as.chron,
+format = fmt, sep = "\t", aggregate = tail1, colClasses = c(NA, "numeric"))
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
  scan() expected 'a real', got 'M'

from which we see that there is an M in the second column.  Using a
text editor we can fix it up or we could specify that M is a comment
character (better make sure there are no M's in the header though) in
which case we will get an NA in that position:

> rawData <- read.zoo(weatherData.txt, header = T, FUN = as.chron,
+ format = fmt, sep = "\t", aggregate = tail1, comment = "M")
> rawData[9553]
(01/03/12 10:53:00)
 NA

We could use na.omit(rawData) to eliminate it.

Another approach to finding it is:

> L <- read.table(weatherData.txt, colClasses = "character", header = TRUE, sep 
> = "\t")
> ix <- is.na(as.numeric(L[[2]])); which(ix); L[ix, 2]
Warning message:
NAs introduced by coercion
[1] 9553
[1] "M"


-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [R] Coalesce function in BBmisc, emoa, and microbenchmark packages

2012-03-20 Thread R. Michael Weylandt
The key is that "Reduce" function -- it takes in a list of elements
and combines them iteratively, e.g., `+` is only defined for two
elements (at a time) but we can do something like

Reduce(`+`, list(1,2,3)) = Reduce(`+`, list(1+2,3)) = Reduce(`+`,
list(3,3)) = 3 + 3 = 6
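
Applied to the coalesce() above, the same folding fills NA slots from left to
right; a small illustration (not in the original reply):

Reduce(function(x, y) ifelse(!is.na(x), x, y),
       list(c(NA, 1, NA), c(2, 2, NA), c(3, 3, 3)))
## [1] 2 1 3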

Michael



On Tue, Mar 20, 2012 at 8:51 AM, Paul Miller  wrote:
> Hi Brian,
>
> This works very well. Still trying to develop some skill with R. So can't say 
> I understand your function completely as yet, but will work on it. I had 
> thought that your function might only work for two columns (because of the 
> "function(x,y)" part), but the example below suggests it will work for any 
> number of columns.
>
> Appreciate your showing this to me.
>
> Thanks,
>
> Paul
>
>
> Demog <- data.frame(PFS = as.Date(c("2006-07-22", NA, "2007-12-16", 
> "2008-01-19", "2009-05-05", "2006-04-29", "2006-06-18", NA)),
>                          DOD = as.Date(c("2006-07-23", "2008-07-09", 
> "2007-12-16", "2008-01-19", "2009-05-05", "2006-04-29", "2006-06-18", NA)),
>                    LKDA = as.Date(c(NA, NA, NA, NA, NA, NA, NA, 
> "2008-03-25")))
>
> coalesce <- function(...) {
>     dots <- list(...)
>     ret <- Reduce(function (x,y) ifelse(!is.na(x),x,y), dots)
>     class(ret) <- class(dots[[1]])
>     ret
> }
>
> Demog$Test <- with(Demog, coalesce(PFS, DOD, LKDA))
>


Re: [R] glm: getting the confidence interval for an Odds Ratio, when using predict()

2012-03-20 Thread Dominic Comtois
Case solved. Thanks a lot Peter!

Dominic C.


-----Original Message-----
From: peter dalgaard [mailto:pda...@gmail.com] 
Sent: 20 March 2012 07:57
To: Dominic Comtois
Cc: r-help@r-project.org help
Subject: Re: [R] glm: getting the confidence interval for an Odds Ratio, when
using predict()

[Oops, forgot cc. to list]

On Mar 20, 2012, at 04:40 , Dominic Comtois wrote:

> I apologize for the errors in the previous code. Here is a reworked
example. It works, but I suspect problems in the se calculation. I changed,
from the 1st prediction to the 2nd only one covariate, so that the OR's CI
should be equal to the exponentiated variable's coefficient and ci. And we
get something different:

Yep. Classical rookie mistake: Forgot to take sqrt() in the se. I then get

> se <- sqrt(contr %*% V %*% t(contr))
> 
> # display the CI
> exp(contr %*% coef(model) + qnorm(c(.025,.50,.975))*se)
[1] 0.655531 1.686115 4.336918
> 
> # the point estimate is ok, as verified with
> exp(model$coefficients[3])
 x2cat2
1.686115 
> 
> # however I we'd expect to find upper and lower bound equal # to the 
> exponentiated x2cat coefficient CI exp(confint(model))[3,]
Waiting for profiling to be done...
   2.5 %97.5 % 
0.6589485 4.4331058 

which is as close as you can expect since the confint method is a bit more
advanced than +/-2SE.

-pd


> x1 <- factor(rbinom(100,1,.5),levels=c(0,1))
> x2 <- 
> factor(round(runif(100,1,2)),levels=c(1,2),labels=c("cat1","cat2"))
> outcome <- rbinom(100,1,.2)
> 
> model <- glm(outcome~x1+x2,family=binomial(logit))
> newd <- data.frame(x1=factor(c(0,0),levels=c(0,1)),
>   x2=factor(c("cat1","cat2"),levels=c("cat1","cat2")),
>   outcome=c(1,1))
> 
> M <- model.matrix(formula(model), data=newd) V <- vcov(model) contr <- 
> c(-1,1) %*% M se <- sqrt(contr %*% V %*% t(contr))
> 
> # display the CI
> exp(contr %*% coef(model) + qnorm(c(.025,.50,.975))*se)
> 
> # the point estimate is ok, as verified with
> exp(model$coefficients[3])
> 
> # however I we'd expect to find upper and lower bound equal # to the 
> exponentiated x2cat coefficient CI exp(confint(model))[3,]
> 
> Many thanks,
> 
> Dominic C.

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000
Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com



[R] cv.glmnet

2012-03-20 Thread Yuanyuan Tang
Hi, all:

Does anybody know how to avoid the intercept term in cv.glmnet coefficient?
 When I say "avoid", it does not mean using coef()[-1] to omit the printout
of intercept, it means no intercept at all when doing the analysis. Thanks.



[R] Great new video on BaSTA - Bayesian Survival Trajectory Analysis

2012-03-20 Thread Graziella Iossa
Dear all,

Fernando Colchero, Owen Jones and Maren Rebke, Max Planck Institute for 
Demographic Research, present BaSTA - Bayesian Survival Trajectory Analysis. 
Fernando, Owen and Maren have put together this beautiful video exploring 
research on ageing and how to deal with incomplete data. 

To install BaSTA http://basta.r-forge.r-project.org
BaSTA users mailing list, 
http://lists.r-forge.r-project.org/mailman/listinfo/basta-users

Thanks,
Graziella

---
Dr Graziella Iossa

Journal Coordinator, Methods in Ecology and Evolution
coordina...@methodsinecologyandevolution.org
Working hours: Mon-Wedn 8-18, Thurs 8-16 GMT and Mon-Fri 8-16 GMT on alternate 
weeks

The British Ecological Society is a limited company, registered in
England No. 1522897 and a Registered Charity No. 281213. VAT
registration No 12863. Information and advice given to members or
others by or on behalf of the Society is given on the basis that no
liability attaches to the Society, its Council Members, Officers or
representatives in respect thereof.

Think before you print...





[R] Reshaping data from long to wide without a "timevar"

2012-03-20 Thread Paul Miller
Hello All,

I was wondering if it's possible to reshape data from long to wide in R without 
using a "timevar". I've pasted some sample data below along with some code. The 
data are sorted by Subject and Drug. I want to transpose the Drug variable into 
multiple columns in alphabetical order. 

My data have a variable called "RowNo" that functions almost like a "timevar" 
but not quite. In Subject 6, Erlotinib has a RowNo value of 3 whereas 
Paclitaxel has a RowNo value of 2. So if I use reshape as in the first bit of 
code below, the columns for drug don't transpose in alphabetical order. That 
is, Paclitaxel appears in Drug.2 and Erlotinib appears in Drug.3 when it should 
be the other way around.

The next two bits of code represent a couple of other things I've tried. The 
cast function almost works but unfortunately makes a separate column for each 
drug (at least the way I'm using it). The unstack function works almost 
perfectly but to my surprise creates a list instead of a dataframe (which I 
understand is a different kind of list). Thought it might take a single line of 
code to convert the former structure to the latter but this appears not to be 
the case.

So can I get what I want without adding a timevar to my data? And if I do need a 
timevar, what's the best way to add it?

Thanks,

Paul
 
connection <- textConnection("
005 1 Gemcitabine
005 2 Erlotinib
006 1 Gemcitabine
006 3 Erlotinib
006 2 Paclitaxel
009 1 Gemcitabine
009 2 Erlotinib
010 1 Gemcitabine
010 2 Erlotinib
010 3 Herceptin
")

TestData <- data.frame(scan(connection, list(Subject = 0, RowNo = 0, Drug = 
"")))
TestData$Subject <- as.integer(TestData$Subject)
TestData$RowNo <- as.integer(TestData$RowNo)
TestData$Drug <- as.character(TestData$Drug)

require(reshape)

Transpose <- reshape(TestData, direction="wide", idvar="Subject", 
timevar="RowNo", v.names="Drug")
Transpose

Transpose <- melt(TestData, id.var="Subject", measure.var="Drug")
Transpose <- cast(Transpose, Subject ~ value)
Transpose

Transpose <- unstack(TestData, Drug ~ Subject)
Transpose



Re: [R] Reshaping data from long to wide without a "timevar"

2012-03-20 Thread R. Michael Weylandt
If I understand you right,

library(reshape2)
dcast(melt(TestData, id.var = "Subject", measure.var = "Drug"), Subject ~ value)
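
If the within-subject index (the missing timevar) is wanted as well, a sketch
(not part of the original reply): sort by Subject and Drug, then number the
drugs per subject with ave() and reshape() as usual.

TestData <- TestData[order(TestData$Subject, TestData$Drug), ]
TestData$DrugNo <- ave(seq_along(TestData$Subject), TestData$Subject,
                       FUN = seq_along)       # 1, 2, ... within each subject
reshape(TestData[c("Subject", "DrugNo", "Drug")], direction = "wide",
        idvar = "Subject", timevar = "DrugNo", v.names = "Drug")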

Michael

On Tue, Mar 20, 2012 at 9:50 AM, Paul Miller  wrote:
> Hello All,
>
> I was wondering if it's possible to reshape data from long to wide in R 
> without using a "timevar". I've pasted some sample data below along with some 
> code. The data are sorted by Subject and Drug. I want to transpose the Drug 
> variable into multiple columns in alphabetical order.
>
> My data have a variable called "RowNo" that functions almost like a "timevar" 
> but not quite. In Subject 6, Erlotinib has a RowNo value of 3 whereas 
> Paclitaxel has a RowNo value of 2. So if I use reshape as in the first bit of 
> code below, the columns for drug don't transpose in alphabetical order. That 
> is, Paclitaxel appears in Drug.2 and Erlotinib appears in Drug.3 when it 
> should be the other way around.
>
> The next two bits of code represent a couple of other things I've tried. The 
> cast function almost works but unfortunately makes a separate column for each 
> drug (at least the way I'm using it). The unstack function works almost 
> perfectly but to my surprise creates a list instead of a dataframe (which I 
> understand is a different kind of list). Thought it might take a single line 
> of code to convert the former structure to the latter but this appears not to 
> be the case.
>
> So can I get what I want without adding a timevar to my data? And if do need 
> a timevar, what's the best way to add it?
>
> Thanks,
>
> Paul
>
> connection <- textConnection("
> 005 1 Gemcitabine
> 005 2 Erlotinib
> 006 1 Gemcitabine
> 006 3 Erlotinib
> 006 2 Paclitaxel
> 009 1 Gemcitabine
> 009 2 Erlotinib
> 010 1 Gemcitabine
> 010 2 Erlotinib
> 010 3 Herceptin
> ")
>
> TestData <- data.frame(scan(connection, list(Subject = 0, RowNo = 0, Drug = 
> "")))
> TestData$Subject <- as.integer(TestData$Subject)
> TestData$RowNo <- as.integer(TestData$RowNo)
> TestData$Drug <- as.character(TestData$Drug)
>
> require(reshape)
>
> Transpose <- reshape(TestData, direction="wide", idvar="Subject", 
> timevar="RowNo", v.names="Drug")
> Transpose
>
> Transpose <- melt(TestData, id.var="Subject", measure.var="Drug")
> Transpose <- cast(Transpose, Subject ~ value)
> Transpose
>
> Transpose <- unstack(TestData, Drug ~ Subject)
> Transpose
>


Re: [R] Constraint Linear regression

2012-03-20 Thread R. Michael Weylandt
Due to perfect collinearity, your regression isn't unique so you're
not going to be able to even solve the unconstrained version of this
problem.
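
A quick check of that, using the data from the original post (a sketch, not
part of this reply):

y  <- c(0.2525, 0.3448, 0.2358, 0.3696, 0.2708, 0.1667, 0.2941, 0.2333,
        0.1500, 0.3077, 0.3462, 0.1667, 0.2500, 0.3214, 0.1364)
x2 <- c(0.368, 0.537, 0.379, 0.472, 0.401, 0.361, 0.644, 0.444, 0.440,
        0.676, 0.679, 0.622, 0.450, 0.379, 0.620)
x1 <- 1 - x2
qr(cbind(1, x1, x2))$rank  # 2, not 3: intercept, x1 and x2 are linearly dependent
coef(lm(y ~ x2))           # the identifiable two-parameter form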

Michael

On Tue, Mar 20, 2012 at 12:54 AM, priya fernandes
 wrote:
> Hi there,
>
> I am trying to use linear regression to solve the following equation -
>
> y <- c(0.2525, 0.3448, 0.2358, 0.3696, 0.2708, 0.1667, 0.2941, 0.2333,
> 0.1500, 0.3077, 0.3462, 0.1667, 0.2500, 0.3214, 0.1364)
> x2 <- c(0.368, 0.537, 0.379, 0.472, 0.401, 0.361, 0.644, 0.444, 0.440,
> 0.676, 0.679, 0.622, 0.450, 0.379, 0.620)
> x1 <- 1-x2
>
> # equation
> lmFit <- lm(y ~ x1 + x2)
>
> lmFit
> Call:
> lm(formula = y ~ x1 + x2)
>
> Coefficients:
> (Intercept)           x1           x2
>    0.30521     -0.09726           NA
>
> I would like to *constraint the coefficients of x1 and x2 to be between 0,1*.
> Is there a way of adding constraints to lm?
>
> I looked through the old help files and found a solution by Emmanuel using
> least squares. The method (with modification) is as follows -
>
>  Data1<- data.frame(y=y,x1=x1, x2=x2)
>
> # The objective function : least squares.
>
> e<-expression((y-(c1+c2*x1+c3*x2))^2)
>
> foo<-deriv(e, name=c("c1","c2","c3"))
>
> # Objective
>
> objfun<-function(coefs, data) {
>
> return(sum(eval(foo,env=c(as.list(coefs), as.list(data)
>
> }
>
> # Objective's gradient
>
> objgrad<-function(coefs, data) {
>
> return(apply(attr(eval(foo,env=c(as.list(coefs), as.list(data))),
>
>  "gradient"),2,sum))
>
>  }
>
> D1.unbound<-optim(par=c(c1=0.5, c2=0.5, c3=0.5),
>
>  fn=objfun,
>
> gr=objgrad,
>
> data=Data1,
>
>  method="L-BFGS-B",
>
>  lower=rep(0, 3),
>
> upper=rep(1, 3))
>
>
> D1.unbound
>
>
> $par
>         c1          c2          c3
> 0.004387706 0.203562156 0.300825550
>
> $value
> [1] 0.07811152
>
> $counts
> function gradient
>       8        8
>
> $convergence
> [1] 0
>
> $message
> [1] "CONVERGENCE: REL_REDUCTION_OF_F <= FACTR*EPSMCH"
>
> Any suggestion on how to fix the error  "CONVERGENCE: REL_REDUCTION_OF_F <=
> FACTR*EPSMCH"?
>


Re: [R] Loading Dataset into R continual issue

2012-03-20 Thread R. Michael Weylandt
Seems to work for me:

x <- read.table("~/Downloads/gooddemocracy.txt", sep = "\t", header = TRUE)
str(x)

summary(x)[,1:10]
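
A hypothetical follow-up (not in the original reply), in case the "object not
found" came from calling summary() on a bare column name; columns live inside
the data frame:

## assumes the file really has a column named Pat2006, as in the question
summary(x$Pat2006)
with(x, summary(Pat2006))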

Michael



On Mon, Mar 19, 2012 at 5:52 PM, bobo  wrote:
> Hi, this is related to academic research I am trying to conduct. Please
> pardon my lack of socialization for this forum.
>
> For my project, I had to combine two different datasets, Democracy dataset
> from Pippa Norris and World Bank Patents dataset.
>
> My issue arrises from just loading the file into R. My colleagues proficient
> in R have been stumped as well.  Often times the file would seem to load
> fine using read.table command, however when I tried to run summary
> statistics of variables, it would say "object not found".
>
> I have tried different formats and commands. Formats .xlsx, .csv, . txt.
> Commands read.table, read.csv, read.delim. I have tried to run summary
> statistics of Pat2006, Pat2005, Pat 2004 all the way to Pat 2001.
>
> Could anyone PLEASE help me solve this issue? I cannot even begin to say how
> thankful I will be.
>
> I have uploaded the .txt file onto mediafire website for easy access. I
> posted .txt so people aren't worried about viruses or anything of the sort.
> I can also post other versions of the file, or direct to the 2 original
> datasets.
>
> Good Democracy Dataset http://www.mediafire.com/?ytg7a76s7ox05se  (141 kb)
>
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Loading-Dataset-into-R-continual-issue-tp4486619p4486619.html
> Sent from the R help mailing list archive at Nabble.com.
>


[R] Graphic legend with mathematical symbol, numeric variable and character variable

2012-03-20 Thread ECOTIÈRE David (Responsable d'activité) - CETE Est/LRPC de Strasbourg/6 Acoustique

Hi,

I'd like to make a legend with a mix of a mathematical symbol (tau), a 
numeric variable and character variables. I have tried:



types<-c("Type 1","Type 2","Type 2")
tau<-c(1,3,2)
legend(x="topright",legend=paste(types,"tau=",expression(tau)))



but it doesn't work: the 'tau' symbol is not written in its 'symbol 
style' but as 'tau'


Any (good) idea ?
Thank you in advance !
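
One approach that should work, added here as a sketch (not part of the original
question): build one plotmath expression per entry with bquote(), so tau is
typeset as a Greek letter.

types <- c("Type 1", "Type 2", "Type 2")
tau   <- c(1, 3, 2)
labs  <- mapply(function(ty, tv) bquote(.(ty) ~ tau == .(tv)),
                types, tau, SIMPLIFY = FALSE)
plot(1:10)                                    # any open plot
legend("topright", legend = as.expression(labs))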

David



Re: [R] Unexpected input in function

2012-03-20 Thread Schryver, Jack C.
Thank you for your responses. You helped me to see something right in front of 
me that I should have noticed quickly. I copied and pasted my script from MS Word 
instead of using a safer text editor. I did catch the funky double quotes that 
MS Word uses, but I think it also sometimes substitutes a dash for a minus sign.

Jack

-Original Message-
From: ted@deb [mailto:ted@deb] On Behalf Of Ted Harding
Sent: Monday, March 19, 2012 4:56 PM
To: r-help@r-project.org
Cc: Schryver, Jack C.; Schryver, Jack C.; Sarah Goslee
Subject: Re: [R] Unexpected input in function

I think the most likely explanation is that something in
the input string has had the effect of inserting an invisible
"character" between the "-" and the "a" in "b-a", and a
possible suspect is pollution by UTF8: see the discussion at

http://r.789695.n4.nabble.com/unexpected-input-in-rpart-td3168363.html

Or a character "copy&paste"d from an editor that uses a
non-ASCII encoding for its characters. See e.g.:

http://support.rstudio.org/help/discussions/problems/386-error-unexpected-input-in

and:

http://www.mail-archive.com/r-help@r-project.org/msg71798.html
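
A quick way to spot such characters in a pasted line, as a sketch (not part of
the original reply):

line <- "b\u2013a"          # an en dash pasted from a word processor, not "-"
tools::showNonASCII(line)   # prints the line and flags the non-ASCII character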


On 19-Mar-2012 Sarah Goslee wrote:
> I think you'll need to provide a reproducible example, because your
> code works for me:
> 
>> fsubt <- function(a) {
> + b <- 1:length(a)
> + b-a
> + }
>>
>>
>> fsubt(1:5)
> [1] 0 0 0 0 0
>>
>> fsubt(sample(1:10))
>  [1] -8 -6  1  1 -1  5  3  1  4  0
>>
>> fsubt(2)
> [1] -1
> 
> 
> On Mon, Mar 19, 2012 at 4:01 PM, Schryver, Jack C. 
> wrote:
>> Hi,
>>
>> Although the following statements work individually in R, they produce an
>> error if placed inside a function as below:
>>
>> fsubt <- function(a) {
>> b <- 1:length(a)
>> b-a
>> }
>>
>> The error message is:
>>
>> Error: unexpected input in:
>> "b <- 1:length(a)
>> b-"
>>
>> Any insight would be greatly appreciated.
>>
>> Thanks,
>> Jack
> 
> -- 
> Sarah Goslee
> http://www.functionaldiversity.org

-
E-Mail: (Ted Harding) 
Date: 19-Mar-2012  Time: 20:56:04
This message was sent by XFMail



[R] Unique in DataFrame

2012-03-20 Thread MSousa
Hello,

I have a small doubt: I do not think that the way I solve the problem
is the best way to do it.
The following is a small dataset


x<-data.frame(city="Barcelona",sales=253639)
x<-rbind(x,data.frame(city="Madrid",sales=223455))
x<-rbind(x,data.frame(city="Lisbon",sales=273633))
x<-rbind(x,data.frame(city="Madrid",sales=266535))
x<-rbind(x,data.frame(city="Barcelona",sales=258369))
x<-rbind(x,data.frame(city="Lisbon",sales=273633))
x<-rbind(x,data.frame(city="Barcelona",sales=22579))
x<-rbind(x,data.frame(city="Lisbon",sales=26333))
x<-rbind(x,data.frame(city="Barcelona",sales=253639))

x$num<-as.numeric(as.numeric(factor(x$city)))
View(x)

my problem and my doubts start here, I'm trying to create a list of cities
and the code that was assigned in.
x$num<-as.numeric(as.numeric(factor(x$city)))

This seems to work fine, but on the larger dataset it repeats some values and
hides others. Is this the correct way to present the unique values in a
column and view them together with the other columns?

rescity<-x[unique(x$city),c(3,1)]
rescity

   Thanks


--
View this message in context: 
http://r.789695.n4.nabble.com/Unique-in-DataFrame-tp4488943p4488943.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] rJava / RCMD javareconf fails

2012-03-20 Thread st0ut717
Hi All,

Running R CMD javareconf -e (or R CMD javareconf as root; I am root on my
machines) fails.
[root@penguins1lanalt etc]# R CMD javareconf
Java interpreter : /usr/bin/java
Java version : 1.6.0_30
Java home path   : /usr/java/jre1.6.0_30
Java compiler: not present
Java headers gen.: 
Java archive tool: 
Java library path:
$(JAVA_HOME)/lib/i386/server:$(JAVA_HOME)/lib/i386:$(JAVA_HOME)/../lib/i386:/usr/java/packages/lib/i386:/lib:/usr/lib
JNI linker flags : -L$(JAVA_HOME)/lib/i386/server -L$(JAVA_HOME)/lib/i386
-L$(JAVA_HOME)/../lib/i386 -L/usr/java/packages/lib/i386 -L/lib -L/usr/lib
-ljvm
JNI cpp flags: 

Updating Java configuration in /usr/lib/R
Done.
[user@penguins1lanalt java]$ echo $JAVA_HOME
/usr/java
[user@penguins1lanalt java]$ R CMD javareconf -e
Java interpreter : /usr/bin/java
Java version : 1.6.0_30
Java home path   : /usr/java
Java compiler: not present
Java headers gen.: 
Java archive tool: 
Java library path:
$(JAVA_HOME)/lib/i386/server:$(JAVA_HOME)/lib/i386:$(JAVA_HOME)/../lib/i386:/usr/java/packages/lib/i386:/lib:/usr/lib
JNI linker flags : -L$(JAVA_HOME)/lib/i386/server -L$(JAVA_HOME)/lib/i386
-L$(JAVA_HOME)/../lib/i386 -L/usr/java/packages/lib/i386 -L/lib -L/usr/lib
-ljvm
JNI cpp flags: 

The following Java variables have been exported:
JAVA_HOME JAVA JAVAC JAVAH JAR JAVA_LIBS JAVA_CPPFLAGS JAVA_LD_LIBRARY_PATH
Runnig: /bin/bash
[user@penguins1lanalt java]$ ls /usr/java
default  jdk1.6.0_25  jre1.6.0_30  latest

Help please?

--
View this message in context: 
http://r.789695.n4.nabble.com/rJava-RCMD-javareconf-fails-tp4488961p4488961.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] rtmvtnorm and equal upper and lower truncation boundaries in 'tmvtnorm' package

2012-03-20 Thread Jurgis SAPIJANSKAS
Dear list,

I am facing a problem with the rtmvtnorm function to sample from a
truncated multivariate normal when some of the truncation boundaries are
equal, e.g.:

>rtmvnorm(1, mean = rep(0,3), sigma = diag(3), lower=c(-1,0,-1),
upper=c(1,0,1),
 algorithm="gibbs")

Error in checkTmvArgs(mean, sigma, lower, upper) :
  lower must be smaller than or equal to upper (lower<=upper)

Of course, since it is all about numerics I could do

>rtmvnorm(1, mean = rep(0,3), sigma = diag(3), lower=c(-1,0,-1),
upper=c(1,1e-16,1),
 algorithm="gibbs")


   [,1] [,2]   [,3]
[1,] -0.62211860 -0.6435531


but it is not entirely satisfying. Is it a normal behaviour of rtmvtnorm?
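
Another workaround I can think of (a sketch, not part of tmvtnorm's API) is to
treat the degenerate coordinate as fixed at its common bound and to sample the
remaining coordinates from their conditional truncated normal:

library(tmvtnorm)
mu    <- rep(0, 3)
sigma <- diag(3)
fixed <- 2                  # coordinate with lower == upper
val   <- 0                  # its fixed value
free  <- setdiff(1:3, fixed)
# conditional mean/covariance of the free coordinates given x[fixed] = val
cmu  <- mu[free] + sigma[free, fixed] %*% solve(sigma[fixed, fixed]) %*% (val - mu[fixed])
csig <- sigma[free, free] - sigma[free, fixed] %*% solve(sigma[fixed, fixed]) %*% sigma[fixed, free]
x <- rtmvnorm(1, mean = as.vector(cmu), sigma = csig,
              lower = c(-1, -1), upper = c(1, 1), algorithm = "gibbs")
cbind(x[, 1], val, x[, 2])  # reassemble in the original coordinate order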

Thanks in advance,

jurgis

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SE from nleqslv

2012-03-20 Thread FU-WEN LIANG
On Tue, Mar 20, 2012 at 4:24 AM, Berend Hasselman  wrote:
>
>
> On 20-03-2012, at 01:01, FU-WEN LIANG wrote:
>
> > Dear R-users,
> >
> > I use the "nleqslv" function to get parameter estimates by solving a
> > system
> > of non-linear equations. But I also need standard error for each of
> > estimates. I checked the nleqslv manual but it didn't mention about SE.
> > Is there any way to get the SE for each estimate?
>
> nleqslv is for solving a nonlinear system of equations. Only that.
> If you provide a system of equations for determining standard errors then
> nleqslv might be able to solve that system.
> You can use nleqslv to investigate the sensitivity of a solution wrt
> changes in parameters.
>
> Berend
>

Thank you very much for your advice, Berend.
Would you please give me a hint about "the sensitivity of a solution
wrt changes in parameters"? What statistics can we use?
Thank you.
Fu-Wen

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Reshaping data from long to wide without a "timevar"

2012-03-20 Thread Paul Miller
Hi Michael,

Sorry, my description seems to have been less than adequate. I want my 
transposed data to look something like:

  Subject  Drug.1 Drug.2Drug.3
1   5 Gemcitabine  Erlotinib  
3   6 Gemcitabine  Erlotinib Paclitaxel
6   9 Gemcitabine  Erlotinib  
8  10 Gemcitabine  Erlotinib Herceptin

This is almost the same as what one gets with:

Transpose <- reshape(TestData, direction="wide", idvar="Subject", 
timevar="RowNo", v.names="Drug")
Transpose

The difference is that Subject 6 has "Gemcitabine, Erlotinib, Paclitaxel" 
instead of "Gemcitabine, Paclitaxel, Erlotinib". That's what I mean when I say 
I want the columns in alphabetical order.

Thanks,

Paul


--- On Tue, 3/20/12, R. Michael Weylandt  wrote:

> From: R. Michael Weylandt 
> Subject: Re: [R] Reshaping data from long to wide without a "timevar"
> To: "Paul Miller" 
> Cc: r-help@r-project.org
> Received: Tuesday, March 20, 2012, 9:01 AM
> If I understand you right,
> 
> library(reshape2)
> dcast(melt(TestData, id.var = "Subject", measure.var =
> "Drug"), Subject ~ value)
> 
> Michael
> 
> On Tue, Mar 20, 2012 at 9:50 AM, Paul Miller 
> wrote:
> > Hello All,
> >
> > I was wondering if it's possible to reshape data from
> long to wide in R without using a "timevar". I've pasted
> some sample data below along with some code. The data are
> sorted by Subject and Drug. I want to transpose the Drug
> variable into multiple columns in alphabetical order.
> >
> > My data have a variable called "RowNo" that functions
> almost like a "timevar" but not quite. In Subject 6,
> Erlotinib has a RowNo value of 3 whereas Paclitaxel has a
> RowNo value of 2. So if I use reshape as in the first bit of
> code below, the columns for drug don't transpose in
> alphabetical order. That is, Paclitaxel appears in Drug.2
> and Erlotinib appears in Drug.3 when it should be the other
> way around.
> >
> > The next two bits of code represent a couple of other
> things I've tried. The cast function almost works but
> unfortunately makes a separate column for each drug (at
> least the way I'm using it). The unstack function works
> almost perfectly but to my surprise creates a list instead
> of a dataframe (which I understand is a different kind of
> list). Thought it might take a single line of code to
> convert the former structure to the latter but this appears
> not to be the case.
> >
> > So can I get what I want without adding a timevar to my
> data? And if do need a timevar, what's the best way to add
> it?
> >
> > Thanks,
> >
> > Paul
> >
> > connection <- textConnection("
> > 005 1 Gemcitabine
> > 005 2 Erlotinib
> > 006 1 Gemcitabine
> > 006 3 Erlotinib
> > 006 2 Paclitaxel
> > 009 1 Gemcitabine
> > 009 2 Erlotinib
> > 010 1 Gemcitabine
> > 010 2 Erlotinib
> > 010 3 Herceptin
> > ")
> >
> > TestData <- data.frame(scan(connection, list(Subject
> = 0, RowNo = 0, Drug = "")))
> > TestData$Subject <- as.integer(TestData$Subject)
> > TestData$RowNo <- as.integer(TestData$RowNo)
> > TestData$Drug <- as.character(TestData$Drug)
> >
> > require(reshape)
> >
> > Transpose <- reshape(TestData, direction="wide",
> idvar="Subject", timevar="RowNo", v.names="Drug")
> > Transpose
> >
> > Transpose <- melt(TestData, id.var="Subject",
> measure.var="Drug")
> > Transpose <- cast(Transpose, Subject ~ value)
> > Transpose
> >
> > Transpose <- unstack(TestData, Drug ~ Subject)
> > Transpose
> >
> > __
> > R-help@r-project.org
> mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained,
> reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SE from nleqslv

2012-03-20 Thread Berend Hasselman

On 20-03-2012, at 15:36, FU-WEN LIANG wrote:

> On Tue, Mar 20, 2012 at 4:24 AM, Berend Hasselman  wrote:
>> 
>> 
>> On 20-03-2012, at 01:01, FU-WEN LIANG wrote:
>> 
>>> Dear R-users,
>>> 
>>> I use the "nleqslv" function to get parameter estimates by solving a
>>> system
>>> of non-linear equations. But I also need standard error for each of
>>> estimates. I checked the nleqslv manual but it didn't mention about SE.
>>> Is there any way to get the SE for each estimate?
>> 
>> nleqslv is for solving a nonlinear system of equations. Only that.
>> If you provide a system of equations for determining standard errors then
>> nleqslv might be able to solve that system.
>> You can use nleqslv to investigate the sensitivity of a solution wrt
>> changes in parameters.
>> 
>> Berend
>> 
> 
> Thank you very much for your advice, Berend.
> Would you please give me a hint about "the sensitivity of a solution
> wrt changes in parameters"? What statistics can we use?

Suppose you have a system of two equations and this system depends on two 
parameters A and B.
You have solved the system for specific values of A and B.
Then you can vary A and B and see how the solution changes.

How that could or might be translated to SE's I really wouldn't know.
A measure of the sensitivity could be the (relative change of a norm of the 
solution) / (relative change of a parameter).
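
As a very rough sketch of that idea (with a made-up two-equation system, not
your model):

library(nleqslv)
f <- function(x, A, B) c(x[1]^2 + x[2] - A,
                         x[1] + x[2]^2 - B)
base <- nleqslv(c(1, 1), f, A = 2, B = 3)$x          # solution at the given A, B
pert <- nleqslv(c(1, 1), f, A = 2 * 1.01, B = 3)$x   # perturb A by 1%
# one way to read the sensitivity wrt A:
(sqrt(sum((pert - base)^2)) / sqrt(sum(base^2))) / 0.01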

Berend

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Constraint Linear regression

2012-03-20 Thread Gabor Grothendieck
On Tue, Mar 20, 2012 at 12:54 AM, priya fernandes
 wrote:
> Hi there,
>
> I am trying to use linear regression to solve the following equation -
>
> y <- c(0.2525, 0.3448, 0.2358, 0.3696, 0.2708, 0.1667, 0.2941, 0.2333,
> 0.1500, 0.3077, 0.3462, 0.1667, 0.2500, 0.3214, 0.1364)
> x2 <- c(0.368, 0.537, 0.379, 0.472, 0.401, 0.361, 0.644, 0.444, 0.440,
> 0.676, 0.679, 0.622, 0.450, 0.379, 0.620)
> x1 <- 1-x2
>
> # equation
> lmFit <- lm(y ~ x1 + x2)
>
> lmFit
> Call:
> lm(formula = y ~ x1 + x2)
>
> Coefficients:
> (Intercept)           x1           x2
>    0.30521     -0.09726           NA
>
> I would like to *constraint the coefficients of x1 and x2 to be between 0,1*.
> Is there a way of adding constraints to lm?
>

Assuming we set the intercept to zero, the unconstrained solution does
satisfy those constraints:

lm(y ~ x1 + x2 + 0)

An approach which explicitly sets the constraints (also removing the
intercept) would be nls:

nls(y ~ a * x1 + b * x2,
lower = c(a = 0, b = 0), upper = c(a = 1, b = 1),
start = c(a = 0.5, b = 0.5),
alg = "port")
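
To check the result (a sketch, reusing the y, x1 and x2 vectors quoted above):

fit <- nls(y ~ a * x1 + b * x2,
           lower = c(a = 0, b = 0), upper = c(a = 1, b = 1),
           start = c(a = 0.5, b = 0.5), algorithm = "port")
coef(fit)                  # both estimates should lie inside [0, 1]
coef(lm(y ~ x1 + x2 + 0))  # the unconstrained no-intercept fit, for comparison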

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Plot method for ca.jo

2012-03-20 Thread Keith Weintraub
Folks,
  How would I find the code for a plot function that is in a package?

I want to understand exactly what is being plotted.

Thanks,
KW

--


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem reading mixed CSV file

2012-03-20 Thread Ashish Agarwal
The file is 20MB having 2 Million rows.
I understand that I have two different formats - 6 columns and 7 columns.
How do I read chunks to different files by using scan with modifying
skip and nlines parameters?

On Mon, Mar 19, 2012 at 3:59 PM, Petr PIKAL  wrote:
>
> I would follow Jims suggestion,
> nFields <- count.fields(fileName, sep = ',')
> count fields and read chunks to different files by using scan with
> modifying skip and nlines parameters. However if there is only few lines
> which differ it would be better to correct those few lines manually in
> some suitable editor.
>
> Elaborating omnipotent function for reading any kind of
> corrupted/nonstandard files seems to me suited only if you expect to read
> such files many times.
>
> Regards
> Petr
>
>
>>
>>
>>
>> On Sat, Mar 17, 2012 at 4:54 AM, jim holtman  wrote:
>> > Here is a solution that looks for the line with 7 elements and inserts
>> > the quotes:
>> >
>> >
>> >> fileName <- '/temp/text.txt'
>> >> input <- readLines(fileName)
>> >> # count the fields to find 7
>> >> nFields <- count.fields(fileName, sep = ',')
>> >> # now fix the data
>> >> for (i in which(nFields == 7)){
>> > +     # split on comma
>> > +     z <- strsplit(input[i], ',')[[1]]
>> > +     input[i] <- paste(z[1], z[2]
>> > +         , paste('"', z[3], ',', z[4], '"', sep = '') # put on quotes
>> > +         , z[5], z[6], z[7], sep = ','
>> > +         )
>> > + }
>> >>
>> >> # now read in the data
>> >> result <- read.table(textConnection(input), sep = ',')
>> >>
>> >>         result
>> >                         V1       V2                   V3   V4 V5 V6
>> > 1                                                         1968 21  0
>> > 2                                                  Boston 1968 13  0
>> > 3                                                  Boston 1968 18  0
>> > 4                                                 Chicago 1967 44  0
>> > 5                                              Providence 1968 17  0
>> > 6                                              Providence 1969 48  0
>> > 7                                                   Binky 1968 24  0
>> > 8                                                 Chicago 1968 23  0
>> > 9                                                   Dally 1968  7  0
>> > 10                                   Raleigh, North Carol 1968 25  0
>> > 11 Addy ABC-Dogs Stars-W8.1                    Providence 1968 38  0
>> > 12              DEF_REQPRF/                     Dartmouth 1967 31  1
>> > 13                       PL                               1967 38  1
>> > 14                       XY PopatLal                      1967  5  1
>> > 15                       XY PopatLal                      1967  6  8
>> > 16                       XY PopatLal                      1967  7  7
>> > 17                       XY PopatLal                      1967  9  1
>> > 18                       XY PopatLal                      1967 10  1
>> > 19                       XY PopatLal                      1967 13  1
>> > 20                       XY PopatLal               Boston 1967  6  1
>> > 21                       XY PopatLal               Boston 1967  7 11
>> > 22                       XY PopatLal               Boston 1967  9  2
>> > 23                       XY PopatLal               Boston 1967 10  3
>> > 24                       XY PopatLal               Boston 1967  7  2
>> >>
>> >
>> >
>> > On Fri, Mar 16, 2012 at 2:17 PM, Ashish Agarwal
>> >  wrote:
>> >> I have a file that is 5000 records and to edit that file is not easy.
>> >> Is there any way to line 10 differently to account for changes in the
>> >> third field?
>> >>
>> >> On Fri, Mar 16, 2012 at 11:35 PM, Peter Ehlers 
> wrote:
>> >>> On 2012-03-16 10:48, Ashish Agarwal wrote:
>> 
>>  Line 10 has City and State that too separated by comma. For line 10
>>  how can I read differently as compared to the other lines?
>> >>>
>> >>>
>> >>> Edit the file and put quotes around the city-state combination:
>> >>>  "Raleigh, North Carol"
>> >>>
>> >>
>> >> __
>> >> R-help@r-project.org mailing list
>> >> https://stat.ethz.ch/mailman/listinfo/r-help
>> >> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
>> >> and provide commented, minimal, self-contained, reproducible code.
>> >
>> >
>> >
>> > --
>> > Jim Holtman
>> > Data Munger Guru
>> >
>> > What is the problem that you are trying to solve?
>> > Tell me what you want to do, not how you want to do it.
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] error message

2012-03-20 Thread adamu eloji
Dear all,

 Who will bail me out. Iam using R with S-Splancs. Anytime i typed in the 
syntex, an error will appear eg setwd("C:\\TEMP)
>dat <- read.table(cheshire_fmd.cvs",header=TRUE, sep=",")
> dat.<-read.table(''chesire_fmd.cvs'',header=TRUE,sep='',)
Error: unexpected symbol in "dat.<-read.table(''chesire_fmd.cvs"
> dat$x.km <-dat$xcoord/1000
Error: object 'dat' not found
> dat$y.km <-dat$ycoord/1000
Error: object 'dat' not found
> dat[1:10,]
Error: object 'dat' not found
> Library(splancs)
I was advised to remove the '>' at begining of each line, but its like that 
symbol is a default. How do i do that.
Thanks
El-Oji Adamu
Nigeria

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plot method for ca.jo

2012-03-20 Thread Pfaff, Bernhard Dr.
?getMethod
getMethod("plot", c("ca.jo", "missing"))
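
For example (a sketch; the ca.jo class and its plot method live in the urca
package):

library(urca)
showMethods("plot")                        # lists plot methods and their signatures
getMethod("plot", c("ca.jo", "missing"))   # prints the body of the ca.jo plot method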

-Ursprüngliche Nachricht-
Von: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] Im 
Auftrag von Keith Weintraub
Gesendet: Dienstag, 20. März 2012 16:36
An: r-help@r-project.org
Betreff: [R] Plot method for ca.jo

Folks,
  How would I find the code for a plot function that is in a package?

I want to understand exactly what is being plotted.

Thanks,
KW

--


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
*
Confidentiality Note: The information contained in this ...{{dropped:10}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] error message

2012-03-20 Thread Sarah Goslee
You left a quotation mark out of your very first statement, and then
some others. Quotation marks must be balanced.

Try this instead:


setwd("C:\\TEMP")
dat <- read.table("cheshire_fmd.cvs",header=TRUE, sep=",")
dat[1:10,]

Sarah

On Tue, Mar 20, 2012 at 10:36 AM, adamu eloji  wrote:
> Dear all,
>
>  Who will bail me out. Iam using R with S-Splancs. Anytime i typed in the 
> syntex, an error will appear eg setwd("C:\\TEMP)
>>dat <- read.table(cheshire_fmd.cvs",header=TRUE, sep=",")
>> dat.<-read.table(''chesire_fmd.cvs'',header=TRUE,sep='',)
> Error: unexpected symbol in "dat.<-read.table(''chesire_fmd.cvs"
>> dat$x.km <-dat$xcoord/1000
> Error: object 'dat' not found
>> dat$y.km <-dat$ycoord/1000
> Error: object 'dat' not found
>> dat[1:10,]
> Error: object 'dat' not found
>> Library(splancs)
> I was advised to remove the '>' at begining of each line, but its like that 
> symbol is a default. How do i do that.
> Thanks
> El-Oji Adamu
> Nigeria
>

-- 
Sarah Goslee
http://www.functionaldiversity.org

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Unique in DataFrame

2012-03-20 Thread jim holtman
This may be what you want:

> x<-data.frame(city="Barcelona",sales=253639)
> x<-rbind(x,data.frame(city="Madrid",sales=223455))
> x<-rbind(x,data.frame(city="Lisbon",sales=273633))
> x<-rbind(x,data.frame(city="Madrid",sales=266535))
> x<-rbind(x,data.frame(city="Barcelona",sales=258369))
> x<-rbind(x,data.frame(city="Lisbon",sales=273633))
> x<-rbind(x,data.frame(city="Barcelona",sales=22579))
> x<-rbind(x,data.frame(city="Lisbon",sales=26333))
> x<-rbind(x,data.frame(city="Barcelona",sales=253639))
>
> x$num<-as.numeric(as.numeric(factor(x$city)))
> View(x)
> x[!duplicated(x$city),]
   city  sales num
1 Barcelona 253639   1
2Madrid 223455   2
3Lisbon 273633   3
>
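
The reason your original attempt misbehaves, as far as I can tell, is that
unique(x$city) is a factor, and a factor used as a row index is silently
converted to its underlying integer codes, so you are really selecting row
numbers rather than matching city names:

unique(x$city)              # a factor: Barcelona Madrid Lisbon
as.integer(unique(x$city))  # 1 2 3 -- the integer codes actually used as row numbers
x[unique(x$city), c(3, 1)]  # so this selects rows 1, 2, 3, which only happens
                            # to look right in this small example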


On Tue, Mar 20, 2012 at 10:16 AM, MSousa  wrote:
> Hello,
>
>    I have little doubt, and I do not think that the way I solve the problem
> is the best way to do it.
> The following is a small dataset
>
>
> x<-data.frame(city="Barcelona",sales=253639)
> x<-rbind(x,data.frame(city="Madrid",sales=223455))
> x<-rbind(x,data.frame(city="Lisbon",sales=273633))
> x<-rbind(x,data.frame(city="Madrid",sales=266535))
> x<-rbind(x,data.frame(city="Barcelona",sales=258369))
> x<-rbind(x,data.frame(city="Lisbon",sales=273633))
> x<-rbind(x,data.frame(city="Barcelona",sales=22579))
> x<-rbind(x,data.frame(city="Lisbon",sales=26333))
> x<-rbind(x,data.frame(city="Barcelona",sales=253639))
>
> x$num<-as.numeric(as.numeric(factor(x$city)))
> View(x)
>
> my problem and my doubts start here, I'm trying to create a list of cities
> and the code that was assigned in.
> x$num<-as.numeric(as.numeric(factor(x$city)))
>
> here seems to work fine, but the largest dataset repeats some values and
> hiding others, this is the correct way to present unique values in a
> column, and view the contents with other columns
>
> rescity<-x[unique(x$city),c(3,1)]
> rescity
>
>   Thanks
>
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Unique-in-DataFrame-tp4488943p4488943.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] error message

2012-03-20 Thread jim holtman
You have a syntax error:

> dat.<-read.table(''chesire_fmd.cvs'',header=TRUE,sep='',)
Error: unexpected symbol in "dat.<-read.table(''chesire_fmd.cvs"

notice the

sep='',

probably should be

sep = ','

On Tue, Mar 20, 2012 at 10:36 AM, adamu eloji  wrote:
> Dear all,
>
>  Who will bail me out. Iam using R with S-Splancs. Anytime i typed in the 
> syntex, an error will appear eg setwd("C:\\TEMP)
>>dat <- read.table(cheshire_fmd.cvs",header=TRUE, sep=",")
>> dat.<-read.table(''chesire_fmd.cvs'',header=TRUE,sep='',)
> Error: unexpected symbol in "dat.<-read.table(''chesire_fmd.cvs"
>> dat$x.km <-dat$xcoord/1000
> Error: object 'dat' not found
>> dat$y.km <-dat$ycoord/1000
> Error: object 'dat' not found
>> dat[1:10,]
> Error: object 'dat' not found
>> Library(splancs)
> I was advised to remove the '>' at begining of each line, but its like that 
> symbol is a default. How do i do that.
> Thanks
> El-Oji Adamu
> Nigeria
>
>        [[alternative HTML version deleted]]
>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem reading mixed CSV file

2012-03-20 Thread jim holtman
use 'count.fields' to determine which line have 6 and 7 fields in them.

then use 'readLines' to read in the entire file and then use the data
from count.fields to write out to separate files:

x <- count.fields(...)
input <- readLines(..)
writeLines(input[x == 6], con = '6fields.csv')
writeLines(input[x == 7], con = '7fields.csv')

On Tue, Mar 20, 2012 at 11:43 AM, Ashish Agarwal
 wrote:
> The file is 20MB having 2 Million rows.
> I understand that I have two different formats - 6 columns and 7 columns.
> How do I read chunks to different files by using scan with modifying
> skip and nlines parameters?
>
> On Mon, Mar 19, 2012 at 3:59 PM, Petr PIKAL  wrote:
>>
>> I would follow Jims suggestion,
>> nFields <- count.fields(fileName, sep = ',')
>> count fields and read chunks to different files by using scan with
>> modifying skip and nlines parameters. However if there is only few lines
>> which differ it would be better to correct those few lines manually in
>> some suitable editor.
>>
>> Elaborating omnipotent function for reading any kind of
>> corrupted/nonstandard files seems to me suited only if you expect to read
>> such files many times.
>>
>> Regards
>> Petr
>>
>>
>>>
>>>
>>>
>>> On Sat, Mar 17, 2012 at 4:54 AM, jim holtman  wrote:
>>> > Here is a solution that looks for the line with 7 elements and inserts
>>> > the quotes:
>>> >
>>> >
>>> >> fileName <- '/temp/text.txt'
>>> >> input <- readLines(fileName)
>>> >> # count the fields to find 7
>>> >> nFields <- count.fields(fileName, sep = ',')
>>> >> # now fix the data
>>> >> for (i in which(nFields == 7)){
>>> > +     # split on comma
>>> > +     z <- strsplit(input[i], ',')[[1]]
>>> > +     input[i] <- paste(z[1], z[2]
>>> > +         , paste('"', z[3], ',', z[4], '"', sep = '') # put on quotes
>>> > +         , z[5], z[6], z[7], sep = ','
>>> > +         )
>>> > + }
>>> >>
>>> >> # now read in the data
>>> >> result <- read.table(textConnection(input), sep = ',')
>>> >>
>>> >>         result
>>> >                         V1       V2                   V3   V4 V5 V6
>>> > 1                                                         1968 21  0
>>> > 2                                                  Boston 1968 13  0
>>> > 3                                                  Boston 1968 18  0
>>> > 4                                                 Chicago 1967 44  0
>>> > 5                                              Providence 1968 17  0
>>> > 6                                              Providence 1969 48  0
>>> > 7                                                   Binky 1968 24  0
>>> > 8                                                 Chicago 1968 23  0
>>> > 9                                                   Dally 1968  7  0
>>> > 10                                   Raleigh, North Carol 1968 25  0
>>> > 11 Addy ABC-Dogs Stars-W8.1                    Providence 1968 38  0
>>> > 12              DEF_REQPRF/                     Dartmouth 1967 31  1
>>> > 13                       PL                               1967 38  1
>>> > 14                       XY PopatLal                      1967  5  1
>>> > 15                       XY PopatLal                      1967  6  8
>>> > 16                       XY PopatLal                      1967  7  7
>>> > 17                       XY PopatLal                      1967  9  1
>>> > 18                       XY PopatLal                      1967 10  1
>>> > 19                       XY PopatLal                      1967 13  1
>>> > 20                       XY PopatLal               Boston 1967  6  1
>>> > 21                       XY PopatLal               Boston 1967  7 11
>>> > 22                       XY PopatLal               Boston 1967  9  2
>>> > 23                       XY PopatLal               Boston 1967 10  3
>>> > 24                       XY PopatLal               Boston 1967  7  2
>>> >>
>>> >
>>> >
>>> > On Fri, Mar 16, 2012 at 2:17 PM, Ashish Agarwal
>>> >  wrote:
>>> >> I have a file that is 5000 records and to edit that file is not easy.
>>> >> Is there any way to line 10 differently to account for changes in the
>>> >> third field?
>>> >>
>>> >> On Fri, Mar 16, 2012 at 11:35 PM, Peter Ehlers 
>> wrote:
>>> >>> On 2012-03-16 10:48, Ashish Agarwal wrote:
>>> 
>>>  Line 10 has City and State that too separated by comma. For line 10
>>>  how can I read differently as compared to the other lines?
>>> >>>
>>> >>>
>>> >>> Edit the file and put quotes around the city-state combination:
>>> >>>  "Raleigh, North Carol"
>>> >>>
>>> >>
>>> >> __
>>> >> R-help@r-project.org mailing list
>>> >> https://stat.ethz.ch/mailman/listinfo/r-help
>>> >> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>>> >> and provide commented, minimal, self-contained, reproducible code.
>>> >
>>> >
>>> >
>>> > --
>>> > Jim Holtman
>>> > Data Munger Guru
>>> >
>>> > What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

[R] Remove individual rows from a matrix based upon a list

2012-03-20 Thread Grant Gillis
Dear All,

Thanks in advance for any help.  I have a square matrix of measures of
interactions among individuals and would like to calculate values from a
function (colSums for example) with a single individual (row) excluded in
each instance.  That individual would be returned to the matrix before the
next is removed and the function recalculated.  I can do this by hand
removing rows based upon ids; however, I would like to specify individuals to be
removed from a list (lots of data).

An example matrix:

MyMatrix
        E985047 E985071 E985088 F952477 F952478 J644805 J644807 J644813
E985047    1.00    0.09    0.00    0.00    0.00    0.00    0.00    0.40
E985071    0.09    1.00    0.00    0.00    0.00    0.00    0.00    0.07
E985088    0.00    0.00    1.00    0.00    0.00    0.00    0.14    0.00
F952477    0.00    0.00    0.00    1.00    0.38    0.00    0.00    0.00
F952478    0.00    0.00    0.00    0.38    1.00    0.00    0.00    0.00
J644805    0.00    0.00    0.00    0.00    0.00    1.00    0.07    0.00
J644807    0.00    0.00    0.14    0.00    0.00    0.07    1.00    0.00
J644813    0.40    0.07    0.00    0.00    0.00    0.00    0.00    1.00
Example list of individuals to be removed

MyList
E985088
F952477
F952478

If I were to do this by hand it would look like

MyMat1 <- MyMatrix[!rownames(MyMatrix)%in% "E985088",]
colSums(MyMat1)

MyMat2 <-  MyMatrix[!rownames(MyMatrix)%in% "F952477",]
colSums(MyMat2)

MyMat3 <-  MyMatrix[!rownames(MyMatrix)%in% "F952478",]
colSums(MyMat3)

How might I replace the individual ids (in quotes) with entries from a list,
remove the row corresponding to each entry from the matrix for the calculation,
and return that row to the matrix after each calculation before the next?

I hope I've been clear!!
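
Something like the following is what I am after, I think (a sketch, where
MyList is assumed to be a character vector of the ids to drop one at a time):

MyList <- c("E985088", "F952477", "F952478")
res <- sapply(MyList, function(id)
  colSums(MyMatrix[!rownames(MyMatrix) %in% id, , drop = FALSE]))
res   # one column of column sums per excluded individual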

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] MA process in panels

2012-03-20 Thread Philipp Grueber
Dear R users,

I have an unbalanced panel with an average of I=100 individuals and a total
of T=1370 time intervals, i.e. T>>I. So far, I have been using the plm
package.

I wish to estimate a FE model like:

res<-plm(x~c+v, data=pdata_frame, effect="twoways", model="within",
na.action=na.omit)

…where c varies over i and t, and v represents an exogenous impact on x
varying over time but not over i. I discover significant time effects
comparing the above model with a plm(…,effect="individual", …)-model (using
pftest). 

MY PROBLEM:
I discover high levels of serial correlation in the errors. Including lags
of x, coefficients are significant at least up to 30 lags. If I set my
dataset to weekly observations (approx. 5 days = 1 week), the coefficients
of the lags are significant at least up to the 15th lag (I didn't test a
larger number of lags). The more lags I include, the less sections can be
included in my sample (the panel is unbalanced, i.e. data is not available
for the whole period for all individuals -- in fact, full data is available
for only few individuals).  

Checking the acf() and pacf() of x, I find that for the large majority of
individuals, x is an MA() process. That's plausible because it would explain
the high levels of autocorrelation. However, I do not know a lot about MA
models for panel data.  



Most books I have found so far only touch on MA processes in panels but do
not discuss the estimation problems and implementation in further detail.
Therefore, I have the following questions:

1) Are there any issues specific to panel models with an MA component? 

2) Is there an implementation for panel MA models in R?

3) If not, I have thought about the following solution. Does this approach
provide correct/ reliable results?

_

#Unfortunately, I was unable to create an appropriate panel dataset with an
MA process in the residuals. Maybe someone has an idea where to find such
data? Nevertheless you should be able to follow my subsequent thoughts:

# I should be able to get my (time- and sectionally) demeaned series as
follows:

res1<-plm(x~c+v,data=pdata_frame, effect="twoways", model="within",
na.action=na.omit))
dem_yt<-pmodel.response(res) 
demXt<-model.matrix(res)

# Given the demeaned series, I need to set the first observation(s) in each
cross-section to NA in order to avoid inter-sectional links in the lagged
residuals (i.e. in the MA component).
#Note: Delete the first n observations per section for a MA(n) regression.
For me, an MA(1) process should be fine (I hope):

n<-1
for ( i in unique(pdata_frame$i)){
dem_yt[na.omit(pdata_frame$i)==i][1:n]<-rep(NA,n)
demXt$c[na.omit(pdata_frame$i)==i][1:n]<-rep(NA,n)
demXt$v[na.omit(pdata_frame$i)==i][1:n]<-rep(NA,n)
} 

#I think I should now be able to use standard ARIMA methods such as 

res2<-arima(x=dem_yt,xreg=demXt,order=c(0,0,1))

#Alternatively, I tried to obtain res2 using maxLik() from the maxLik
package, but I am not sure about how to specify the log-likelihood function: 

tslag <- function(x, d=1)
{
  x <- as.vector(x)
  n <- length(x)
  c(rep(NA,d),x)[1:n]
}

log_Lik<-function(param) {
b1<-param[1]
b2<-param[2]
b3<-param[3]
sigma<-param[4]
ll<- -0.5*N*log(2*pi) - N*log(sigma) -
sum(0.5*(dem_yt-(b1*demXt[,1]+b2*demXt[,2]) +
b3*tslag(dem_yt-(b1*demXt[,1]+b2*demXt[,2]),d=1))^2/sigma^2)
ll
}

res2<-maxLik(logLik=log_Lik,start=c(coef(res1),1,1),method="nr")
___


Am I on the right track? Is there an easier way to do this? Did I miss
something important?

Any help is appreciated, thanks a lot in advance!

Best, 
Philipp


__
Philipp Grueber
EBS Universitaet fuer Wirtschaft und Recht
Wiesbaden, Germany





-

EBS Universitaet fuer Wirtschaft und Recht
FARE Department
Wiesbaden/ Germany
http://www.ebs.edu/index.php?id=finacc&L=0
--
View this message in context: 
http://r.789695.n4.nabble.com/MA-process-in-panels-tp4489528p4489528.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SE from nleqslv

2012-03-20 Thread peter dalgaard

On Mar 20, 2012, at 15:55 , Berend Hasselman wrote:

> 
> On 20-03-2012, at 15:36, FU-WEN LIANG wrote:
> 
>> On Tue, Mar 20, 2012 at 4:24 AM, Berend Hasselman  wrote:
>>> 
>>> 
>>> On 20-03-2012, at 01:01, FU-WEN LIANG wrote:
>>> 
 Dear R-users,
 
 I use the "nleqslv" function to get parameter estimates by solving a
 system
 of non-linear equations. But I also need standard error for each of
 estimates. I checked the nleqslv manual but it didn't mention about SE.
 Is there any way to get the SE for each estimate?
>>> 
>>> nleqslv is for solving a nonlinear system of equations. Only that.
>>> If you provide a system of equations for determining standard errors then
>>> nleqslv might be able to solve that system.
>>> You can use nleqslv to investigate the sensitivity of a solution wrt
>>> changes in parameters.
>>> 
>>> Berend
>>> 
>> 
>> Thank you very much for your advice, Berend.
>> Would you please give me a hint about "the sensitivity of a solution
>> wrt changes in parameters"? What statistics can we use?
> 
> Suppose you have a system of two equations and this system depends on two 
> parameters A and B.
> You have solved the system for specific values of A and B.
> Then you can vary A and B and see how the solution changes.
> 
> How that could or might be translated to SE's I really wouldn't know.
> A measure of the sensitivity could be the (relative change of a norm of the 
> solution) / (relative change of a parameter).

Well, the delta method springs to mind, but it really depends on how and where 
noise is being injected into the system. All we have been told is that the 
estimates are obtained as a solution to a nonlinear equation, and that can mean 
many things. Presumably there are some observations somewhere, with a 
distribution, etc.


> 
> Berend
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SE from nleqslv

2012-03-20 Thread Bert Gunter
> 
> Well, the delta method springs to mind, but it really depends on how and 
> where noise is being injected into the system. All we have been told is that 
> the estimates are obtained as a solution to a nonlinear equation, and that 
> can mean many things. Presumably there are some observations somewhere, with 
> a distribution, etc.

"A long time ago, in a galaxy far away..."
Cue Star Wars Theme... **

Cheers,
Bert

** Sorry ... something suddenly came over me.

>
>
>>
>> Berend
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Email: pd@cbs.dk  Priv: pda...@gmail.com
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SE from nleqslv

2012-03-20 Thread Peter Meilstrup
You haven't said how the function you're optimizing relates to your data.
In the special case that you happen to be using nleqslv to maximize a
log-likelihood function (a special case of which is least squares fitting),
you can get an approximation to the standard error using the Jacobian
matrix that nleqslv computes.

Peter

On Mon, Mar 19, 2012 at 5:01 PM, FU-WEN LIANG  wrote:

> Dear R-users,
>
> I use the "nleqslv" function to get parameter estimates by solving a system
> of non-linear equations. But I also need standard error for each of
> estimates. I checked the nleqslv manual but it didn't mention about SE.
> Is there any way to get the SE for each estimate?
>
> Thank you very much.
>
>[[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Unique in DataFrame

2012-03-20 Thread MSousa
Thanks.

--
View this message in context: 
http://r.789695.n4.nabble.com/Unique-in-DataFrame-tp4488943p4489554.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to write and analyze data with 3 dimensions

2012-03-20 Thread jorge Rogrigues
Suppose I have data organized in the following way:
(P_i, M_j, S_k)

where i, j, and k are indexes for sets.
I would like to analyze the data to get for example the following
information:
what is the average over k for
(P_i, M_j)
or what is the average over j and k for P_i.

My question is what would be the way of doing this in R.
Specifically how should I write the data in a csv file
and how do I read the data from the csv file into R and perform these basic
operations.
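
For instance, would something like the following sketch be the right idea
(file name and column names are just placeholders)?

# the csv would have one row per (i, j, k) combination, e.g.
#   P,M,S,value
#   1,1,1,0.42
#   1,1,2,0.37
#   ...
dat <- read.csv("mydata.csv")
aggregate(value ~ P + M, data = dat, FUN = mean)   # average over k for each (P_i, M_j)
aggregate(value ~ P,     data = dat, FUN = mean)   # average over j and k for each P_i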

Thank you.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Remove quotes from a string to use in a variable call

2012-03-20 Thread dadrivr
Hi,

I have a string that I want to use in a variable call.  How can I remove the
quotes and/or the string properties of the string to use it in a variable
call?

Here's an example:

library(nlme)
fm2 <- lme(distance ~ age, data = Orthodont, random = ~ 1)
summary(fm2)

I want to update the above regression to include new predictors according to
what is in a string:

predictors <- "age + Sex"
update(fm2,fixed=distance ~ age + Sex) #this works
update(fm2,fixed=distance ~ noquote(predictors)) #this doesn't work

Any help would be greatly appreciated.  Thanks!
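
One approach I have come across (a sketch) is to build an actual formula
object from the string instead of trying to unquote it -- would this be the
recommended way?

predictors <- "age + Sex"
f <- as.formula(paste("distance ~", predictors))
update(fm2, fixed = f)
# or, building the formula from the individual variable names:
update(fm2, fixed = reformulate(c("age", "Sex"), response = "distance"))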

--
View this message in context: 
http://r.789695.n4.nabble.com/Remove-quotes-from-a-string-to-use-in-a-variable-call-tp4489370p4489370.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] error message

2012-03-20 Thread David Winsemius


On Mar 20, 2012, at 10:36 AM, adamu eloji wrote:


Dear all,

 Who will bail me out.


You seem to be doing homework. This is not a mailing list for homework.


Iam using R with S-Splancs.


What is that? You cannot use S packages with R unless someone has  
ported it.


Anytime i typed in the syntex, an error will appear eg setwd("C:\ 
\TEMP)

dat <- read.table(cheshire_fmd.cvs",header=TRUE, sep=",")


Sarah has already shown you the error of the first statement.


dat.<-read.table(''chesire_fmd.cvs'',header=TRUE,sep='',)


That will produce a different error, but it seems possibly related to your
attempting to create double quotes by using two instances of single quotes
together. That will not succeed. You can match either single quotes or double
quotes, but you cannot use ''name''.




Error: unexpected symbol in "dat.<-read.table(''chesire_fmd.cvs"

dat$x.km <-dat$xcoord/1000



Error: object 'dat' not found

dat$y.km <-dat$ycoord/1000

Error: object 'dat' not found

dat[1:10,]

Error: object 'dat' not found

Library(splancs)


Capitalization is crucial. Unless some package provides a function named
"Library", that will not do anything useful.


I was advised to remove the '>' at begining of each line, but its  
like that symbol is a default. How do i do that.


We have no context for that question. But again, this is not a
"homework help line". You should be using your local resources for
instruction.




Thanks
El-Oji Adamu
Nigeria

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Remove leading and trailing white spaces

2012-03-20 Thread Daniel Malter
This just saved me a lot of time.

Thank you!
Daniel

--
View this message in context: 
http://r.789695.n4.nabble.com/Remove-leading-and-trailing-white-spaces-tp907851p4489725.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] igraph: decompose.graph: Error: protect(): protection stack overflow

2012-03-20 Thread Sam Steingold
I just got this error:
> library(igraph)
> comp <- decompose.graph(gr)
Error: protect(): protection stack overflow
Error: protect(): protection stack overflow
> 

what can I do?
the digraph is, indeed, large (300,000 vertexes), but there are very
many very small components (which I would rather not discard).

PS. the doc for decompose.graph does not say which mode is the default.
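
Two things I am considering (sketches, untested on a graph of this size):
starting R with a larger pointer protection stack, or asking only for the
component membership instead of materialising 300,000 subgraphs:

## start R with a larger pointer protection stack, e.g.
##   R --max-ppsize=500000
## or compute just the membership and component sizes:
library(igraph)
cl <- clusters(gr, mode = "weak")   # membership vector, component sizes, count
table(cl$csize)                     # size distribution of the components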

-- 
Sam Steingold (http://sds.podval.org/) on Ubuntu 11.10 (oneiric) X 11.0.11004000
http://www.childpsy.net/ http://mideasttruth.com http://camera.org
http://iris.org.il http://openvotingconsortium.org http://truepeace.org
Beauty is only a light switch away.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] igraph: decompose.graph: Error: protect(): protection stack overflow

2012-03-20 Thread Sam Steingold
> * Sam Steingold  [2012-03-20 14:20:06 -0400]:
>
> I just got this error:
>> library(igraph)
>> comp <- decompose.graph(gr)
> Error: protect(): protection stack overflow
> Error: protect(): protection stack overflow
>> 

after restarting

> system.time(comp <- decompose.graph(gr, mode="weak"))
Error: protect(): protection stack overflow
Error: protect(): protection stack overflow
Error: protect(): protection stack overflow
> system.time(comp <- decompose.graph(gr, mode="strong"))

 *** caught segfault ***
address 0xd8, cause 'memory not mapped'

Traceback:
 1: decompose.graph(sc.gr, mode = "strong")
 2: system.time(sc.comp <- decompose.graph(sc.gr, mode = "strong"))

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Selection: 3

-- 
Sam Steingold (http://sds.podval.org/) on Ubuntu 11.10 (oneiric) X 11.0.11004000
http://www.childpsy.net/ http://openvotingconsortium.org http://www.memritv.org
http://pmw.org.il http://www.PetitionOnline.com/tap12009/
What's the difference between Apathy & Ignorance? -I don't know and don't care!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Graphic legend with mathematical symbol, numeric variable and character variable

2012-03-20 Thread Patrick Breheny
There are a few different ways to do this; see the examples in ?plotmath 
under the heading "How to combine 'math' and numeric variables".
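
For example, adapted to your legend (a sketch):

types <- c("Type 1", "Type 2", "Type 2")
tau   <- c(1, 3, 2)
labs  <- vector("expression", length(types))
for (i in seq_along(types))
  labs[[i]] <- bquote(.(types[i]) * ", " * tau == .(tau[i]))
plot(1:10)
legend("topright", legend = labs)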


--
Patrick Breheny
Assistant Professor
Department of Biostatistics
Department of Statistics
University of Kentucky

On 03/20/2012 09:09 AM, "ECOTIÈRE David (Responsable d'activité) - CETE 
Est/LRPC de Strasbourg/6 Acoustique" wrote:

Hi,

I'd like to make a legend with a mix of mathematical symbol (tau),
numeric variable and character variables.I have tried :


types<-c("Type 1","Type 2","Type 2")
tau<-c(1,3,2)
legend(x="topright",legend=paste(types,"tau=",expression(tau)))



but it doesn't work: the 'tau' symbol is not written in its 'symbol
style' but as 'tau'

Any (good) idea ?
Thank you in advance !


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with RMA using limma, oligo and pdInfoBuilder packages

2012-03-20 Thread Marcin R
Dear Martin
I am just starting to learn how to analyse microarray data with Bioconductor. I
use the ragene.1.1.st.v1.v1 array from Affymetrix. I tried to find some
information on the internet, but your example was the most useful for me. I have
now done exactly what you describe and obtained a results file. I would like to
thank you for your help.
Greetings
Marcin Rucinski 
Dept of Histology and Embryology
Poznan University of Medical Science
Poland

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] cv.glmnet

2012-03-20 Thread Patrick Breheny

On 03/20/2012 09:41 AM, Yuanyuan Tang wrote:

Does anybody know how to avoid the intercept term in cv.glmnet coefficient?
  When I say "avoid", it does not mean using coef()[-1] to omit the printout
of intercept, it means no intercept at all when doing the analysis. Thanks.


I do not believe that is possible with the current implementation of 
glmnet.  The glmnet() function includes an intercept by default and 
there are no options which allow the user to change this.


--
Patrick Breheny
Assistant Professor
Department of Biostatistics
Department of Statistics
University of Kentucky

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] job opening at Merck Research Labs, NJ USA

2012-03-20 Thread Liaw, Andy
The Biometrics Research department at the Merck Research Laboratories has an 
open position to be located in Rahway, New Jersey, USA:

This position will be responsible for imaging and bio-signal biomarkers 
projects including analysis of preclinical, early clinical, and experimental 
medicine imaging and EEG data. Responsibilities include all phases of data 
analysis from processing of raw imaging and EEG data to derivation of 
endpoints. Part of the responsibilities is development and implementation of 
novel statistical methods and software for analysis of imaging and bio-signal 
data.  This position will closely collaborate with Imaging and Clinical 
Pharmacology departments; Experimental Medicine; Early and Late Stage 
Development Statistics; and Modeling and Simulation.  Publication and 
presentation of the results is highly encouraged as is collaboration with 
external experts. 

Education Minimum Requirement:  PhD in Statistics, Applied Mathematics, 
Physics, Computer Science, Engineering, or related fields
Required Experience and Skills: Education should include Statistics-related 
courses, or equivalent working experience involving data analysis and 
statistical modeling for at least 1 year. Excellent computing skills: R and/or 
SAS, MATLAB in Linux and Windows environments; working knowledge of parallel 
computing; C, C++, or Fortran programming. Dissertation or experience in at 
least one of these areas: statistical image and signal analysis; data mining 
and machine learning; mathematical modeling in medicine and biology;  general 
statistical research
Desired Experience and Skills -  education in and/or experience with EEG and 
Imaging data analysis; stochastic modeling; functional data analysis; 
familiarity with wavelet analysis and other spectral analysis methods


Please apply electronically at:
http://www.merck.com/careers/search-and-apply/search-jobs/home.html 
Click on "Experienced Opportunities", and search by Requisition ID: BIO003546 
and email CV to:
vladimir_svet...@merck.com

Notice:  This e-mail message, together with any attachme...{{dropped:11}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Lag based on Date objects with non-consecutive values

2012-03-20 Thread Sam Albers
On Mon, Mar 19, 2012 at 9:11 PM, Gabor Grothendieck
 wrote:
>
>
> On Mon, Mar 19, 2012 at 8:03 PM, Sam Albers 
> wrote:
>>
>> Hello R-ers,
>>
>> I just wanted to update this post. I've made some progress on this but
>> am still not quite where I need to be. I feel like I am close so I
>> just wanted to share my work so far.
>>
>
> Try this:
>
> Lines <- "Date      Dis1
> 1967-06-05  1.146405
> 1967-06-06  9.732887
> 1967-06-07 -9.279462
> 1967-06-08  7.856646
> 1967-06-09  5.494370
> 1967-06-15  5.070176
> 1967-06-16  3.847314
> 1967-06-17 -5.243094
> 1967-06-18  9.396560
> 1967-06-19  4.112792"
>
> # read in data
> library(zoo)
> z <- read.zoo(text = Lines, header = TRUE)
>
> # process it
> g <- seq(start(z), end(z), "day") # all days
> zg <- merge(z, zoo(, g)) # fill in missing days
> lag(zg, 0:-2)[time(z)]
>

Thanks Gabor. I was, however, hoping for a base R solution. I think I've
got it and I will post the result here just to be complete. A big
thanks to Brian Cade for an off-list suggestion.

set.seed(32)
df1<-data.frame(
   Date=seq(as.Date("1967-06-05","%Y-%m-%d"),by="day", length=5),
   Dis1=rnorm(5, 1,10)
   )
df2<-data.frame(
  Date=seq(as.Date("1967-06-15","%Y-%m-%d"),by="day", length=5),
  Dis1=rnorm(5, 1,10)
  )

df <- rbind(df1,df2)
df$Dis2 <- df$Dis1*2


lag.base <- function (lag.date, lag.by, lag.var) {
  time_dif <- as.numeric(lag.date)-c(rep(NA,lag.by), head(lag.date, -lag.by))
  lag.tmp <-c(rep(NA,lag.by), head(lag.var, -lag.by))
  lv <- ifelse(time_dif<=lag.by,lag.tmp,NA)
  return(lv)
}

df$lag <- lag.base(lag.date=df$Date, lag.var=df$Dis1, lag.by=3);df
df$lag2 <- lag.base(lag.date=df$Date, lag.var=df$Dis2, lag.by=3);df


> --
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem reading mixed CSV file

2012-03-20 Thread Ashish Agarwal
Given x<- count.fields(..) could you pls help in following:
1. how to create a data vector with data being line numbers of original
file where x==6?
2. what is the way to read only the nth line (only) of an input file into a
data vector with first three attributes to be read as string, 4th
being categorical, 5th and 6th being numeric with width 10?
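
Would something like this sketch be on the right track (the file name, n and
the column classes below are placeholders)?

x6 <- which(x == 6)     # line numbers of the records with 6 fields
n  <- 10                # say, the 10th line
rec <- read.table("myfile.csv", sep = ",", skip = n - 1, nrows = 1,
                  colClasses = c("character", "character", "character",
                                 "factor", "numeric", "numeric"))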


On Tue, Mar 20, 2012 at 9:37 PM, jim holtman  wrote:
> use 'count.fields' to determine which line have 6 and 7 fields in them.
>
> then use 'readLines' to read in the entire file and then use the data
> from count.fields to write out to separate files:
>
> x <- count.fields(...)
> input <- readLines(..)
> writeLines(input[x == 6], con = '6fields.csv')
> writeLines(input[x == 7], con = '7fields.csv')
>
> On Tue, Mar 20, 2012 at 11:43 AM, Ashish Agarwal
>  wrote:
>> The file is 20MB having 2 Million rows.
>> I understand that I have two different formats - 6 columns and 7 columns.
>> How do I read chunks to different files by using scan with modifying
>> skip and nlines parameters?
>>
>> On Mon, Mar 19, 2012 at 3:59 PM, Petr PIKAL 
wrote:
>>>
>>> I would follow Jims suggestion,
>>> nFields <- count.fields(fileName, sep = ',')
>>> count fields and read chunks to different files by using scan with
>>> modifying skip and nlines parameters. However if there is only few lines
>>> which differ it would be better to correct those few lines manually in
>>> some suitable editor.
>>>
>>> Elaborating omnipotent function for reading any kind of
>>> corrupted/nonstandard files seems to me suited only if you expect to
read
>>> such files many times.
>>>
>>> Regards
>>> Petr
>>>
>>>



 On Sat, Mar 17, 2012 at 4:54 AM, jim holtman 
wrote:
 > Here is a solution that looks for the line with 7 elements and
inserts
 > the quotes:
 >
 >
 >> fileName <- '/temp/text.txt'
 >> input <- readLines(fileName)
 >> # count the fields to find 7
 >> nFields <- count.fields(fileName, sep = ',')
 >> # now fix the data
 >> for (i in which(nFields == 7)){
 > + # split on comma
 > + z <- strsplit(input[i], ',')[[1]]
 > + input[i] <- paste(z[1], z[2]
 > + , paste('"', z[3], ',', z[4], '"', sep = '') # put on
quotes
 > + , z[5], z[6], z[7], sep = ','
 > + )
 > + }
 >>
 >> # now read in the data
 >> result <- read.table(textConnection(input), sep = ',')
 >>
 >> result
 >                         V1       V2                   V3   V4 V5 V6
 > 1                                                         1968 21  0
 > 2                                                  Boston 1968 13  0
 > 3                                                  Boston 1968 18  0
 > 4                                                 Chicago 1967 44  0
 > 5                                              Providence 1968 17  0
 > 6                                              Providence 1969 48  0
 > 7                                                   Binky 1968 24  0
 > 8                                                 Chicago 1968 23  0
 > 9                                                   Dally 1968  7  0
 > 10                                   Raleigh, North Carol 1968 25  0
 > 11 Addy ABC-Dogs Stars-W8.1                    Providence 1968 38  0
 > 12              DEF_REQPRF/                     Dartmouth 1967 31  1
 > 13                       PL                               1967 38  1
 > 14                       XY PopatLal                      1967  5  1
 > 15                       XY PopatLal                      1967  6  8
 > 16                       XY PopatLal                      1967  7  7
 > 17                       XY PopatLal                      1967  9  1
 > 18                       XY PopatLal                      1967 10  1
 > 19                       XY PopatLal                      1967 13  1
 > 20                       XY PopatLal               Boston 1967  6  1
 > 21                       XY PopatLal               Boston 1967  7 11
 > 22                       XY PopatLal               Boston 1967  9  2
 > 23                       XY PopatLal               Boston 1967 10  3
 > 24                       XY PopatLal               Boston 1967  7  2
 >>
 >
 >
 > On Fri, Mar 16, 2012 at 2:17 PM, Ashish Agarwal
 >  wrote:
 >> I have a file that is 5000 records and to edit that file is not
easy.
 >> Is there any way to line 10 differently to account for changes in
the
 >> third field?
 >>
 >> On Fri, Mar 16, 2012 at 11:35 PM, Peter Ehlers 
>>> wrote:
 >>> On 2012-03-16 10:48, Ashish Agarwal wrote:
 
  Line 10 has City and State that too separated by comma. For line
10
  how can I read differently as compared to the other lines?
 >>>
 >>>
 >>> Edit the file and put quotes around the city-state combination:

Re: [R] cv.glmnet

2012-03-20 Thread David Winsemius


On Mar 20, 2012, at 3:17 PM, Patrick Breheny wrote:


On 03/20/2012 09:41 AM, Yuanyuan Tang wrote:
Does anybody know how to avoid the intercept term in cv.glmnet  
coefficient?
 When I say "avoid", it does not mean using coef()[-1] to omit the  
printout
of intercept, it means no intercept at all when doing the analysis.  
Thanks.


I do not believe that is possible with the current implementation of  
glmnet.  The glmnet() function includes an intercept by default and  
there are no options which allow the user to change this.





David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] igraph: decompose.graph: Error: protect(): protection stack overflow

2012-03-20 Thread Duncan Murdoch

On 12-03-20 2:20 PM, Sam Steingold wrote:

I just got this error:

library(igraph)
comp<- decompose.graph(gr)

Error: protect(): protection stack overflow
Error: protect(): protection stack overflow




what can I do?
the digraph is, indeed, large (300,000 vertexes), but there are very
many very small components (which I would rather not discard).

PS. the doc for decompose.graph does not say which mode is the default.



If you don't get useful help here, you should contact the maintainer of 
the package, who may not be reading your question.  The maintainer is 
listed if you run


library(help=igraph)

Duncan Murdoch
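
As an aside (an editorial sketch, not from the thread, and assuming the graph object is called gr as above), two things that are sometimes worth trying for this error are starting R with a larger pointer-protection stack and asking igraph for component membership instead of materialising every subgraph:

## from the shell: start R with a larger protection stack
R --max-ppsize=500000

## in R: one component id per vertex, without building a subgraph per component
library(igraph)
memb <- clusters(gr)$membership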

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] anova.lm F test confusion

2012-03-20 Thread msteane
Sorry...typo 

***<-- I don't get why the MSE of model 3 is being included if we're
comparing Model 2 to Model 1

--
View this message in context: 
http://r.789695.n4.nabble.com/anova-lm-F-test-confusion-tp4490211p4490220.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Not colour but symbols

2012-03-20 Thread Komine
Hi,
Instead of putting colour in my histogram,
I want to use symbols like lines, dots, etc.
Do you know the function that does this?
Thank you in advance 


--
View this message in context: 
http://r.789695.n4.nabble.com/Not-colour-but-symbols-tp4490030p4490030.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] passing xlim to coord_map in ggplot2

2012-03-20 Thread z2.0
I'm sure this is a smack-head moment, but I haven't been able to find an
example of this on Nabble or SO, so I thought I'd ask.

This works:
michigan <- map_data('county', 'michigan')
mich_points <- data.frame(x = rnorm(n = 200, median(michigan[,1]), 0.75), y
= rnorm(n = 200, median(michigan[,2]), 0.75))
ggplot() + geom_path(aes(long, lat, group = group), data = michigan) +
geom_point(aes(x, y), data = mich_points) + coord_map('gilbert', xlim =
c(-86, -84))

This generates the following error:
*Error in unit(x, default.units) : 'x' and 'units' must have length > 0*
#Where tank_trunc is a data.frame with two columns, 'lon' and 'lat'
containing point coordinates in storage mode 'double'.
michigan_map.df <- map_data('county', 'michigan')
ggplot() + geom_point(aes(lon, lat), data = tank_trunc, na.rm = T) +
geom_path(aes(long, lat, group = group), data = michigan_map.df) +
coord_map('gilbert', xlim = c(-88, -82))

I thought at first maybe the overlay of one layer on another caused the
limiting to freak out. But the sketch code above disproves that theory --
thoughts? Some kink in ggplot2's latest implementation? Do I need another
package? (e.g., the Scales disunion in the latest release...)

Thanks, as always,

Zack

--
View this message in context: 
http://r.789695.n4.nabble.com/passing-xlim-to-coord-map-in-ggplot2-tp4490005p4490005.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Remove quotes from a string to use in a variable call

2012-03-20 Thread Rui Barradas
Hello,



dadrivr wrote
> 
> Hi,
> 
> I have a string that I want to use in a variable call.  How can I remove
> the quotes and/or the string properties of the string to use it in a
> variable call?
> 
> Here's an example:
> 
> library(nlme)
> fm2 <- lme(distance ~ age, data = Orthodont, random = ~ 1)
> summary(fm2)
> 
> I want to update the above regression to include new predictors according
> to what is in a string:
> 
> predictors <- "age + Sex"
> update(fm2,fixed=distance ~ age + Sex) #this works
> update(fm2,fixed=distance ~ noquote(predictors)) #this doesn't work
> 
> Any help would be greatly appreciated.  Thanks!
> 

Try

response <- "distance"
predictors <- "age + Sex"
fmla.text <- paste(response, predictors, sep="~")

update(fm2,fixed=as.formula(fmla.text))

Hope this helps,

Rui Barradas
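
A related base-R route (an editorial sketch, not part of the original reply) is reformulate(), which builds the formula directly from the character pieces:

response <- "distance"
predictors <- "age + Sex"
update(fm2, fixed = reformulate(predictors, response = response))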



--
View this message in context: 
http://r.789695.n4.nabble.com/Remove-quotes-from-a-string-to-use-in-a-variable-call-tp4489370p4490120.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Problem in loop

2012-03-20 Thread Fidel Maximiano Peña Ramos
Dear R users
I want to change the entries in a matrix. I created a matrix:



A = 0   1   5
    .3  0   0
    0   .5  0

A1 <- A

A2 <- A*0.90
A2
     1    2   3
1 0.00 0.90 4.5
2 0.27 0.00 0.0
3 0.00 0.45 0.0



I need to replace elements one by one in a loop.

I tried the following using package POPBIO,

total <-matrix(0, nrow=5, ncol=60)

for(i in 1:10){

A1<-A

A1[1,2] <- A2[1,2]

A1[1,3]<-A2[1,3]

A1[2,1]<-A2[2,1]

A1[3,2]<-A2[3,2]

n <-runif(3)

n <- n/sum(n)

p1<-pop.projection(A1,n,60)

total[i,] <- p1$pop.sizes}

matplot2(total, legend=NA,xlab=c(years))



but I don't see any change in the population





Thanks in advance

Fidel M.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Wrong output due to what I think might be a data type issue (zoo read in problem)

2012-03-20 Thread knavero
Ah I see. Thank you very much Gabor and Joshua. Yes that makes sense since in
C, alpha characters are represented in single quotes as to represent the
ASCII value hence 'M'. I would've never imagined the raw data would be so
lame like that though. Thanks again! 

--
View this message in context: 
http://r.789695.n4.nabble.com/Wrong-output-due-to-what-I-think-might-be-a-data-type-issue-zoo-read-in-problem-tp4487682p4490172.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] anova.lm F test confusion

2012-03-20 Thread msteane
I am using anova.lm to compare 3 linear models.  Model 1 has 1 variable,
model 2 has 2 variables and model 3 has 3 variables.  All models are fitted
to the same data set.

anova.lm(model1,model2) gives me:

  Res.Df    RSS Df Sum of Sq      F    Pr(>F)
1    135 245.38
2    134 184.36  1    61.022 44.354 6.467e-10 ***

anova.lm(model1,model2,model3) gives me:

  Res.Df    RSS Df Sum of Sq      F    Pr(>F)
1    135 245.38
2    134 184.36  1    61.022 50.182 7.355e-11 ***
3    133 161.73  1    22.628 18.609 3.105e-05 ***


Why aren't the 2nd row F values from each of the anova tables the same??? I
thought in each case the 2nd row is comparing model 2 to model 1?  

I figured out that for anova.lm(model1,model2) 
F(row2)=Sum of Sq(row2)/MSE of Model 2 

and for anova.lm(model1,model2,model3)
 F(row2)=Sum of Sq(row 2)/MSE of Model 3  <-- I don't get why the MSE of
model 3 is being included if we're comparing Model 2 to Model 2



Any help/explanations would be appreciated! 

--
View this message in context: 
http://r.789695.n4.nabble.com/anova-lm-F-test-confusion-tp4490211p4490211.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Not colour but symbols

2012-03-20 Thread Bert Gunter
Don't do this!

Google on "chartjunk" to learn why not.

-- Bert

On Tue, Mar 20, 2012 at 12:11 PM, Komine  wrote:
> Hi,
> Instead of putting colour in my histogram,
> I want to use symbols like lines, dots, etc.
> Do you know the function that does this?
> Thank you in advance
>
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Not-colour-but-symbols-tp4490030p4490030.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Remove individual rows from a matrix based upon a list

2012-03-20 Thread Rui Barradas
Hello,

>
> Thanks in advance for any help.  I have a square matrix of measures of
> interactions among individuals and would like to calculate a value from a
> function (colSums for example) with a single individual (row) excluded in
> each instance.  That individual would be returned to the matrix before the
> next is removed and the function recalculated. 
> 

Try


MyMatrix <- structure(list(V2 = c(1, 0.09, 0, 0, 0, 0, 0, 0.4), V3 = c(0.09, 
1, 0, 0, 0, 0, 0, 0.07), V4 = c(0, 0, 1, 0, 0, 0, 0.14, 0), V5 = c(0, 
0, 0, 1, 0.38, 0, 0, 0), V6 = c(0, 0, 0, 0.38, 1, 0, 0, 0), V7 = c(0, 
0, 0, 0, 0, 1, 0.07, 0), V8 = c(0, 0, 0.14, 0, 0, 0.07, 1, 0), 
V9 = c(0.4, 0.07, 0, 0, 0, 0, 0, 1)), .Names = c("V2", "V3", 
"V4", "V5", "V6", "V7", "V8", "V9"), class = "data.frame", row.names =
c("E985047", 
"E985071", "E985088", "F952477", "F952478", "J644805", "J644807", 
"J644813"))

MyList <- c("E985088", "F952477", "F952478")


inx <- which(rownames(MyMatrix) %in% MyList)

result <- lapply(inx, function(i) colSums(MyMatrix[-i, ]))

# Not needed, but makes what is what more clear 
names(result) <- paste("Without", MyList, sep=".")
result

>
> I hope I've been clear!! 
>

I believe you were, but your data is a mess.
The structures above were produced with the function 'dput'; it makes it much,
much easier to create the objects.
See

?dput

and use it!

Hope this helps,

Rui Barradas


--
View this message in context: 
http://r.789695.n4.nabble.com/Remove-individual-rows-from-a-matrix-based-upon-a-list-tp4489462p4490257.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] What is the correct syntax of "for" or "if" in Rexcel

2012-03-20 Thread Dong-Joon Lim
Hello thankful R friends,

Can I use iteration (for) or conditional (if) syntax in RExcel using RRun?
I've finished coding my program and want to run it through Excel.
I just want to run something like:

Call rinterface.RRun("for(i in 1:10){")
Call rinterface.RRun("a[i,1]<-i")
Call rinterface.RRun("}")

But it doesn't work.
Any solution or trick to use "for" or "if" syntax?


Thanks in advance as always,
Dong-Joon

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] labeling rows in heatmap.2

2012-03-20 Thread 1Rnwb
How can I put the row labels on the left-hand side of the heatmap in heatmap.2?

abnr<-structure(c(1, 0.678622097406395, 0.670294749652918,
-0.0016314464654279, 
-0.000519068106572792, 0.199581999119988, -0.0106623494189115, 
0.0840111691399559, -0.0461494399639137, 0.249279171677728, NA, 
1, 0.757114062773504, 0.0352642759270137, -0.0255518450373996, 
0.0943268190664674, -0.0536269679247722, 0.126773293034976,
0.201980408094959, 
0.350765436868705, NA, NA, 1, -0.0036285048239171, -0.0130341823193391, 
0.0687025839829192, 0.0178114338783461, 0.152626558218618,
0.275694188182626, 
0.516142788252573, NA, NA, NA, 1, 0.164352390738372, 0.0458032354120583, 
-0.105461242066774, 0.128550333248478, -0.0388185507340826,
-0.0114545823453345, 
NA, NA, NA, NA, 1, 0.0771316851136, -0.00659533531241862,
0.0901665606000509, 
-0.0220524408127054, 0.0488218042091934, NA, NA, NA, NA, NA, 
1, 0.208114979820194, 0.438398355562088, -0.0635609915410962, 
0.0769889130808, NA, NA, NA, NA, NA, NA, 1, 0.350782329458641, 
0.102284906838582, 0.00467073053941224, NA, NA, NA, NA, NA, NA, 
NA, 1, 0.170117904443778, 0.166988169283325, NA, NA, NA, NA, 
NA, NA, NA, NA, 1, 0.324711157100758, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, 1), .Dim = c(10L, 10L), .Dimnames = list(c("TNFB", 
"MCP3", "IL6", "IGFBP6", "sCD40L", "PTH", "IGFBP2", "OPG", "IL1Ra", 
"TNFA"), c("TNFB", "MCP3", "IL6", "IGFBP6", "sCD40L", "PTH", 
"IGFBP2", "OPG", "IL1Ra", "TNFA")))

library(gplots)        # provides heatmap.2
library(RColorBrewer)  # provides brewer.pal

heatmap.2(abnr, breaks=c(0,0.05,0.1,0.25,0.35), col=brewer.pal(4,"Blues"),
Rowv=FALSE, Colv=FALSE,symm=TRUE,
key=TRUE,symkey=FALSE, density.info="none", trace="none", 
cexRow=0.75,
keysize=0.8,
scale = "none", dendrogram="none",main='AbN')

I would appreciate any response on this.
Thanks
sharad


--
View this message in context: 
http://r.789695.n4.nabble.com/labeling-rows-in-heatmap-2-tp4490314p4490314.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Running BayesFst in R

2012-03-20 Thread jazjanes
Hi All,

I am trying to plot outlier loci in R from data generated by BayesFst.  The
developers provide the code which I use but I can't seem to get it to work
consistently. Sometimes I can get R to generate a plot (without confidence
intervals or loci IDs) and other times I cannot even generate P-values using
the code supplied.

I am wondering if any one has used R to plot the BayesFst results and can
assist me in working out where I go wrong? Typically, R will become
unresponsive or it will provide errors at the getpvals/getfsts stages
referring to "object m1 not found" - but other times it will work. I am
uncertain exactly what m refers to, but 1440 is the # of loci and 2 is the #
of populations. M may refer to # samples??

The code is as follows:
data.matrix(read.table("fst.out")
getpvals(m1,1440,2)
getfsts(m1,1440,2)
plot(pvals[,3],pvals[,1],xlab="transformed p-values",ylab="FST")
... the code continues

Thanks


--
View this message in context: 
http://r.789695.n4.nabble.com/Running-BayesFst-in-R-tp4490380p4490380.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Not colour but symbols

2012-03-20 Thread Thomas Lumley
However, you *can* do it in R.  For example, the function to draw
histograms filled with lines is hist().

   -thomas
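
For what it's worth, the shading Thomas refers to comes from hist()'s density and angle arguments; a minimal sketch with made-up data:

x <- rnorm(200)
hist(x, density = 10, angle = 45)   # bars filled with diagonal lines rather than colour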

On Wed, Mar 21, 2012 at 9:31 AM, Bert Gunter  wrote:
> Don't do this!
>
> Google on "chartjunk" to learn why not.
>
> -- Bert
>
> On Tue, Mar 20, 2012 at 12:11 PM, Komine  wrote:
>> Hi,
>> Instead of putting colour in my histogram,
>> I want to use symbols like lines, dots, etc.
>> Do you know the function that does this?
>> Thank you in advance
>>
>>
>> --
>> View this message in context: 
>> http://r.789695.n4.nabble.com/Not-colour-but-symbols-tp4490030p4490030.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>
>
> --
>
> Bert Gunter
> Genentech Nonclinical Biostatistics
>
> Internal Contact Info:
> Phone: 467-7374
> Website:
> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] What is the correct syntax of "for" or "if" in Rexcel

2012-03-20 Thread Jeff Newmiller
A) This is the wrong list for StatConn questions.

B) My guess would be that you need to send the entire loop in one call.
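
A guess at what sending it in one call could look like, reusing the rinterface.RRun wrapper from the question (untested sketch):

Call rinterface.RRun("for (i in 1:10) { a[i,1] <- i }")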
---
Jeff NewmillerThe .   .  Go Live...
DCN:Basics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

Dong-Joon Lim  wrote:

>Hello thankful R friends,
>
>Can I use iteration (for) or conditional (if) syntax in RExcel using
>RRun?
>I've finished coding my program and want to run it through Excel.
>I just want to run something like:
>
>Call rinterface.RRun("for(i in 1:10){")
>Call rinterface.RRun("a[i,1]<-i")
>Call rinterface.RRun("}")
>
>But it doesn't work.
>Any solution or trick to use "for" or "if" syntax?
>
>
>Thanks in advance as always,
>Dong-Joon
>
>   [[alternative HTML version deleted]]
>
>__
>R-help@r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] scientific notation in a data frame

2012-03-20 Thread Laura Rodriguez Murillo
Dear list,
I have a data frame where one of the columns are p values with scientific
notation mixed with regular numbers with decimals.
>a=data frame
>a
  P OR   N
0.50 0.7500 237
0.047 1.1030 237
0.124 0.7742 237
0.124 0.7742 237
0.0080 1.1590 237
0.50 0.7500 237
4.5e-07 1.2 237
5.6e-04 0.9 237

when I try to do
>pval=a$P/2

R gives me an error saying "In Ops.factor(pval, 2) : / not meaningful for
factors"

any help on this?

Thank you!

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] scientific notation in a data frame

2012-03-20 Thread David Winsemius


On Mar 20, 2012, at 7:52 PM, Laura Rodriguez Murillo wrote:


Dear list,
I have a data frame where one of the columns are p values with  
scientific

notation mixed with regular numbers with decimals.

a=data frame
a

 P OR   N
0.50 0.7500 237
0.047 1.1030 237
0.124 0.7742 237
0.124 0.7742 237
0.0080 1.1590 237
0.50 0.7500 237
4.5e-07 1.2 237
5.6e-04 0.9 237

when I try to do

pval=a$P/2


R gives me an error saying "In Ops.factor(pval, 2) : / not  
meaningful for

factors"


It is a complete mystery to me that people don't believe the error  
messages.  a$P is a factor. Something you did created a factor  
(probably at data entry time), and you didn't realize it. You can  
change it to numeric with:


a$P <-  as.numeric(as.character(a$P))
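
A one-line illustration (with made-up values) of why the as.character() step matters:

f <- factor(c("0.50", "4.5e-07"))
as.numeric(f)                 # returns the internal level codes (1 2), not the values
as.numeric(as.character(f))   # returns 0.5 and 4.5e-07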

--

David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] scientific notation in a data frame

2012-03-20 Thread Laura Rodriguez Murillo
Thank you!

On Tue, Mar 20, 2012 at 8:39 PM, David Winsemius wrote:

>
> On Mar 20, 2012, at 7:52 PM, Laura Rodriguez Murillo wrote:
>
>  Dear list,
>> I have a data frame where one of the columns are p values with scientific
>> notation mixed with regular numbers with decimals.
>>
>>> a=data frame
>>> a
>>>
>> P OR   N
>> 0.50 0.7500 237
>> 0.047 1.1030 237
>> 0.124 0.7742 237
>> 0.124 0.7742 237
>> 0.0080 1.1590 237
>> 0.50 0.7500 237
>> 4.5e-07 1.2 237
>> 5.6e-04 0.9 237
>>
>> when I try to do
>>
>>> pval=a$P/2
>>>
>>
>> R gives me an error saying "In Ops.factor(pval, 2) : / not meaningful for
>> factors"
>>
>
> It is a complete mystery to me that people don't believe the error
> messages.  a$P is a factor. Something you did created a factor (probably at
> data entry time), and you didn't realize it. You can change it to numeric
> with:
>
> a$P <-  as.numeric(as.character(a$P))
>
> --
>
> David Winsemius, MD
> West Hartford, CT
>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem reading mixed CSV file

2012-03-20 Thread jim holtman
On Tue, Mar 20, 2012 at 3:17 PM, Ashish Agarwal
 wrote:
> Given x <- count.fields(..), could you please help with the following:
> 1. How do I create a data vector containing the line numbers of the original file
> where x == 6?

That is what the expression:

 writeLines(input[x == 6], con = '6fields.csv')

is doing.  'x == 6' is a logical vector with TRUE in the position of
the line that has 6 fields in it, so it is only extracting the lines
with 6 fields and writing them to the output file.  You probably need
to read the section on "indexing" in the "Intro to R" manual.


> 2. What is the way to read only the nth line of an input file into a
> data vector, with the first three attributes read as strings, the 4th
> as categorical, and the 5th and 6th as numeric with width 10?

You might want to give an example of the the line looks like.  I would
use 'readLines' to read in the file and then I could index to the
'nth' line and parse it using 'strsplit' or 'regexpr' depending on its
complexity.  This would depend on the format of the line which has not
been provided.
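
A minimal sketch of what Jim describes (the file name, the line number n and the field layout are illustrative assumptions, since the actual format was never posted):

input  <- readLines("bigfile.csv")           # hypothetical file name
n      <- 42                                 # hypothetical line number
fields <- strsplit(input[n], ",")[[1]]
rec <- list(a = fields[1], b = fields[2], c = fields[3],   # first three as strings
            grp  = factor(fields[4]),                      # 4th as categorical
            num1 = as.numeric(fields[5]),                  # 5th numeric
            num2 = as.numeric(fields[6]))                  # 6th numeric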


>
>
> On Tue, Mar 20, 2012 at 9:37 PM, jim holtman  wrote:
>> use 'count.fields' to determine which line have 6 and 7 fields in them.
>>
>> then use 'readLines' to read in the entire file and then use the data
>> from count.fields to write out to separate files
>>
>> x <- count.fields(...)
>> input <- readLines(..)
>> writeLines(input[x == 6], con = '6fields.csv')
>> writeLines(input[x == 7], con = '7fields.csv')
>>
>> On Tue, Mar 20, 2012 at 11:43 AM, Ashish Agarwal
>>  wrote:
>>> The file is 20MB having 2 Million rows.
>>> I understand that I have two different formats - 6 columns and 7 columns.
>>> How do I read chunks to different files by using scan with modifying
>>> skip and nlines parameters?
>>>
>>> On Mon, Mar 19, 2012 at 3:59 PM, Petr PIKAL 
>>> wrote:

 I would follow Jims suggestion,
 nFields <- count.fields(fileName, sep = ',')
 count fields and read chunks to different files by using scan with
 modifying skip and nlines parameters. However if there is only few lines
 which differ it would be better to correct those few lines manually in
 some suitable editor.

 Elaborating omnipotent function for reading any kind of
 corrupted/nonstandard files seems to me suited only if you expect to
 read
 such files many times.

 Regards
 Petr


>
>
>
> On Sat, Mar 17, 2012 at 4:54 AM, jim holtman 
> wrote:
> > Here is a solution that looks for the line with 7 elements and
> > inserts
> > the quotes:
> >
> >
> >> fileName <- '/temp/text.txt'
> >> input <- readLines(fileName)
> >> # count the fields to find 7
> >> nFields <- count.fields(fileName, sep = ',')
> >> # now fix the data
> >> for (i in which(nFields == 7)){
> > +     # split on comma
> > +     z <- strsplit(input[i], ',')[[1]]
> > +     input[i] <- paste(z[1], z[2]
> > +         , paste('"', z[3], ',', z[4], '"', sep = '') # put on
> > quotes
> > +         , z[5], z[6], z[7], sep = ','
> > +         )
> > + }
> >>
> >> # now read in the data
> >> result <- read.table(textConnection(input), sep = ',')
> >>
> >>         result
> >                         V1       V2                   V3   V4 V5 V6
> > 1                                                         1968 21  0
> > 2                                                  Boston 1968 13  0
> > 3                                                  Boston 1968 18  0
> > 4                                                 Chicago 1967 44  0
> > 5                                              Providence 1968 17  0
> > 6                                              Providence 1969 48  0
> > 7                                                   Binky 1968 24  0
> > 8                                                 Chicago 1968 23  0
> > 9                                                   Dally 1968  7  0
> > 10                                   Raleigh, North Carol 1968 25  0
> > 11 Addy ABC-Dogs Stars-W8.1                    Providence 1968 38  0
> > 12              DEF_REQPRF/                     Dartmouth 1967 31  1
> > 13                       PL                               1967 38  1
> > 14                       XY PopatLal                      1967  5  1
> > 15                       XY PopatLal                      1967  6  8
> > 16                       XY PopatLal                      1967  7  7
> > 17                       XY PopatLal                      1967  9  1
> > 18                       XY PopatLal                      1967 10  1
> > 19                       XY PopatLal                      1967 13  1
> > 20                       XY PopatLal               Boston 1967  6  1
> > 21                       XY PopatLal               Boston 1967  7 11
> > 22                       XY 

Re: [R] anova.lm F test confusion

2012-03-20 Thread Ben Bolker
msteane  hotmail.com> writes:

> 
> I am using anova.lm to compare 3 linear models.  Model 1 has 1 variable,
> model 2 has 2 variables and model 3 has 3 variables.  All models are fitted
> to the same data set.

  (I assume these are nested models, otherwise the analysis doesn't
make sense ...)

> 
> anova.lm(model1,model2) gives me:
> 
>   Res.Df    RSS Df Sum of Sq      F    Pr(>F)
> 1    135 245.38
> 2    134 184.36  1    61.022 44.354 6.467e-10 ***
> 
> anova.lm(model1,model2,model3) gives me:
> 
>   Res.Df    RSS Df Sum of Sq      F    Pr(>F)
> 1    135 245.38
> 2    134 184.36  1    61.022 50.182 7.355e-11 ***
> 3    133 161.73  1    22.628 18.609 3.105e-05 ***
> 
> Why aren't the 2nd row F values from each of the anova tables the same??? I
> thought in each case the 2nd row is comparing model 2 to model 1?  

 From ?anova.lm:

 Normally the F statistic is most appropriate, which compares the mean
 square for a row to the residual sum of squares for the largest model
 considered.

> 
> I figured out that for anova.lm(model1,model2) 
> F(row2)=Sum of Sq(row2)/MSE of Model 2 
> 
> and for anova.lm(model1,model2,model3)
>  F(row2)=Sum of Sq(row 2)/MSE of Model 3  <-- I don't get why the MSE of
> model 3 is being included if we're comparing Model 2 to Model 2

   See above ...
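
Plugging the numbers printed above into that rule reproduces both tables (each comparison has Df = 1, so the mean square equals the Sum of Sq):

## anova(model1, model2): denominator is the residual mean square of model 2
61.022 / (184.36 / 134)   # = 44.35, the row-2 F in the first table
## anova(model1, model2, model3): denominator is the residual mean square of model 3
61.022 / (161.73 / 133)   # = 50.18, the row-2 F in the second table
22.628 / (161.73 / 133)   # = 18.61, the row-3 F in the second table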

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] passing xlim to coord_map in ggplot2

2012-03-20 Thread Ista Zahn
Hi Zack,

It would help a lot if you can give a reproducible example that
generates the error.

Best,
Ista

On Tue, Mar 20, 2012 at 3:05 PM, z2.0  wrote:
> This works:
> michigan <- map_data('county', 'michigan')
> mich_points <- data.frame(x = rnorm(n = 200, median(michigan[,1]), 0.75), y
> = rnorm(n = 200, median(michigan[,2]), 0.75))
> ggplot() + geom_path(aes(long, lat, group = group), data = michigan) +
> geom_point(aes(x, y), data = mich_points) + coord_map('gilbert', xlim =
> c(-86, -84))
>
> This generates the following error:
> *Error in unit(x, default.units) : 'x' and 'units' must have length > 0*
> #Where tank_trunc is a data.frame with two columns, 'lon' and 'lat'
> containing point coordinates in storage mode 'double'.
> michigan_map.df <- map_data('county', 'michigan')
> ggplot() + geom_point(aes(lon, lat), data = tank_trunc, na.rm = T) +
> geom_path(aes(long, lat, group = group), data = michigan_map.df) +
> coord_map('gilbert', xlim = c(-88, -82))

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] help matching observations for social network data

2012-03-20 Thread Holly Shakya
Greetings R folks, 

I am stuck on a problem that I suspect can be solved somewhat easily. 

I have social network data stored in dyads as below, where the numbers 
representing ego and alter are identifiers, so that number 1 as an ego is the 
same person as number 1 as an alter etc. 


   ego alter
1    1     2
2    1     3
3    2     1
4    2     4
5    3     1
6    3     2
7    3     4
8    3     6
9    4     1
10   5     3
11   5     6
12   6     4

What I would like to do is to create new dyads which match up the ego with the 
alter's alters as below (preferably removing dyads in which ego and alter2 are 
the same):

   ego alter2
1    1      4
2    1      2
3    1      4
4    1      6
5    2      3
6    2      1
7    3      2
8    3      1
9    3      4
10   3      1
11   3      4
12   4      2
13   4      3
14   5      1
15   5      2
16   5      4
17   5      6
18   5      4
19   6      1


Any suggestions as to how to do this would be greatly appreciated. 

Holly
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Error in fitdist- mle failed to estimate parameters

2012-03-20 Thread vinod1
Hi,

I am trying to fit certain data to a Beta distribution. I get the error saying
"Error in fitdist(discrete_random_variable_c, "beta", start = NULL, fix.arg
= NULL) :   the function mle failed to estimate the parameters,   with the
error code 100"

Below is the sorted data that I am trying to fit. Where am I going wrong?

Thanks a lot for any help.

Vinod

2.05495e-05,3.68772e-05,4.21994e-05,4.38481e-05,5.55001e-05,5.74267e-05,6.27489e-05,6.43976e-05,6.64938e-05,7.40247e-05,7.60495e-05,7.90767e-05,8.07253e-05,8.60475e-05,8.70433e-05,9.23773e-05,9.45742e-05,9.76995e-05,9.93482e-5,9.96262e-05,0.000101275,0.000103371,0.000106597,0.000108693,0.000110342,0.000110902,0.000112927,0.000116224,0.000118249,0.000119346,0.000119898,0.000120773,0.000121994,0.000122925,0.000123921,0.000131451,0.000134577,0.000136225,0.000136774,0.000138422,0.000139895,0.000140519,0.000141322,0.000142543,0.000150074,0.00015475,0.000155126,0.000156223,0.000156775,0.00015765,0.000161545,0.000163194,0.000164621,0.000165842,0.00016612,0.000166402,0.000167769,0.000173373,0.000173651,0.000174846,0.000176273,0.000177396,0.0001782,0.000179421,0.000183522,0.000183744,0.000186952,0.000187267,0.000192274,0.000193371,0.000197945,0.000202719,0.000203267,0.000207816,0.00021025,0.00021315,0.00021392,0.000218694,0.00022162,0.000230248,0.0002308,0.000231675,0.000240145,
0.000244693,0.000252224,0.000266343,0.000308765,0.000422837,0.000429537,0.000443386
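
One thing that is commonly tried for "error code 100" (an editorial sketch, not an answer from the thread) is to give fitdist() explicit starting values, for example from the method of moments, instead of start = NULL:

library(fitdistrplus)
x <- discrete_random_variable_c            # the vector of observations above
m <- mean(x); v <- var(x)
k <- m * (1 - m) / v - 1                   # method-of-moments helper
fitdist(x, "beta", start = list(shape1 = m * k, shape2 = (1 - m) * k))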

--
View this message in context: 
http://r.789695.n4.nabble.com/Error-in-fitdist-mle-failed-to-estimate-parameters-tp4490962p4490962.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] glm.fit: fitted probabilities numerically 0 or 1 occurred?

2012-03-20 Thread ufuk beyaztas
Hi all,

I am bootstrapping a logistic regression using the glm function, and I
get these errors:

glm.fit: fitted probabilities numerically 0 or 1 occurred and
glm.fit: algorithm did not converge

I have read some things about this issue in the mailing list. I can guess
what the problem might be. My data contains one or maybe two outliers. Does the
error occur because of these extreme values, or something else such as the MLE? Is
there any way to fix this problem?

Regards,

Ufuk

-
Best regards

Ufuk
--
View this message in context: 
http://r.789695.n4.nabble.com/glm-fit-fitted-probabilities-numerically-0-or-1-occurred-tp4490722p4490722.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] i dont know what function do i have to make for ("The number of lines in the body of the email")

2012-03-20 Thread david
I tried to write a function for "The number of lines in the body of the email",
and I have three objects: emailtestset1, emailtestset2, and emails.
What do I have to do?


--
View this message in context: 
http://r.789695.n4.nabble.com/i-dont-know-what-function-do-i-have-to-make-for-The-number-of-lines-in-the-body-of-the-email-tp4491165p4491165.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] help matching observations for social network data

2012-03-20 Thread William Dunlap
Your data.frame is
  d <- data.frame(
 ego = c(1, 1, 2, 2, 3, 3, 3, 3, 4, 5, 5, 6),
 alter = c(2, 3, 1, 4, 1, 2, 4, 6, 1, 3, 6, 4))
and you want to get
  e <- data.frame(
 ego = c(1, 1, 1, 1, 2, 2, 3, 3, 3, 3, 3, 4, 4, 5, 5, 5, 5, 5, 6),
 alter2 = c(4, 2, 4, 6, 3, 1, 2, 1, 4, 1, 4, 2, 3, 1, 2, 4, 6, 4, 1))
Try using merge() and removing the entries where
ego and alter of alter are the same:
  f <- function(d) {
tmp <- merge(d, d, by.x = "alter", by.y = "ego")  # note this gives 'tmp' illegal duplicate column names
tmp2 <- tmp[ tmp[,2] != tmp[,3], c(2,3)]
colnames(tmp2)[2] <- "alter2"
tmp2
  }

I think that f(d) and e contain the same set of rows, although
they are ordered differently.
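
A quick way to check that claim (an editorial sketch): paste each dyad into a single string and sort, which compares the two results while ignoring row order:

identical(sort(paste(f(d)$ego, f(d)$alter2)),
          sort(paste(e$ego,    e$alter2)))    # TRUE if both hold the same dyads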


Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
> Behalf
> Of Holly Shakya
> Sent: Tuesday, March 20, 2012 9:15 PM
> To: r-help@r-project.org
> Subject: [R] help matching observations for social network data
> 
> Greetings R folks,
> 
> I am stuck on a problem that I suspect can be solved somewhat easily.
> 
> I have social network data stored in dyads as below, where the numbers 
> representing
> ego and alter are identifiers, so that number 1 as an ego is the same person 
> as number 1
> as an alter etc.
> 
> 
>    ego alter
> 1    1     2
> 2    1     3
> 3    2     1
> 4    2     4
> 5    3     1
> 6    3     2
> 7    3     4
> 8    3     6
> 9    4     1
> 10   5     3
> 11   5     6
> 12   6     4
> 
> What I would like to do is to create new dyads which match up the ego with 
> the alter's
> alters as below (preferably removing dyads in which ego and alter2 are the 
> same):
> 
>    ego alter2
> 1    1      4
> 2    1      2
> 3    1      4
> 4    1      6
> 5    2      3
> 6    2      1
> 7    3      2
> 8    3      1
> 9    3      4
> 10   3      1
> 11   3      4
> 12   4      2
> 13   4      3
> 14   5      1
> 15   5      2
> 16   5      4
> 17   5      6
> 18   5      4
> 19   6      1
> 
> 
> Any suggestions as to how to do this would be greatly appreciated.
> 
> Holly
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Loading Dataset into R continual issue

2012-03-20 Thread bobo
Thank you. I was able to get it loaded; however, when I tried to run

mod1<-lm(Pat2006~FHouse)
I got
Error in eval(expr, envir, enclos) : object 'Pat2006' not found

What exactly is occurring here?
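
What typically causes this (an editorial note, with a hypothetical data-frame name): lm() looks for Pat2006 in the calling environment unless it is told which data frame the columns live in, so the usual fix is to pass that data frame explicitly:

mod1 <- lm(Pat2006 ~ FHouse, data = mydata)   # 'mydata' stands in for the object created when the file was read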

--
View this message in context: 
http://r.789695.n4.nabble.com/Loading-Dataset-into-R-continual-issue-tp4486619p4491424.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.