[R] error in fitdistr

2012-02-23 Thread Soheila Khodakarim
Hi dear,

I want to estimate the df (degrees of freedom) for a Chi-squared distribution:

est.chi[i,] <- c(fitdistr(as.numeric(data2[,i]), "chi-squared",
                          start = list(df = 1))$estimate)

Warning message:
In optim(x = c(7.86755, 7.50852, 7.86342, 7.70589, 7.70153, 7.58272,  :
  one-diml optimization by Nelder-Mead is unreliable:
  use "Brent" or optimize() directly


Who can help me to solve this problem?
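For what it's worth, the warning itself suggests one commonly recommended fix: pass optim()'s "Brent" method (with finite bounds) through fitdistr(). A minimal sketch on simulated data -- the original data2 is not available, so a hypothetical chi-squared sample stands in:

```r
library(MASS)  # fitdistr()

set.seed(1)
x <- rchisq(200, df = 5)  # stand-in for as.numeric(data2[, i])

# method = "Brent" is passed through to optim(); it needs finite
# lower/upper bounds and is well suited to one-parameter problems.
fit <- fitdistr(x, "chi-squared", start = list(df = 1),
                method = "Brent", lower = 0.01, upper = 100)
fit$estimate  # df estimate, close to the true value 5
```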

Best wishes,
Soheila


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Package 'fCalendar'

2012-02-23 Thread Pfaff, Bernhard Dr.
Hello Brit and Michael,

indeed, fCalendar was replaced by timeDate (as was fSeries by timeSeries). Old 
versions of both packages are in the CRAN archive. Now, with respect to QRMlib, 
the package author/maintainer (cc'ed on this email) is pretty close to 
re-submitting his package to CRAN. Having said this, it might be worth 
waiting before trying to get QRMlib running on the above-mentioned packages 
that were removed from CRAN. Scott, do you have a tentative schedule in mind 
for the re-release of your package on CRAN?

Best,
Bernhard


-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of R. Michael Weylandt
Sent: Wednesday, 22 February 2012 15:06
To: Britt Grt
Cc: r-help@r-project.org
Subject: Re: [R] Package 'fCalendar'

I believe fCalendar was replaced by timeDate which does have a namespace and 
can be acquired from CRAN.

Michael

On Wed, Feb 22, 2012 at 5:41 AM, Britt Grt  wrote:
>
> Dear,
>
> I'm a master's student in mathematics at Ghent University, writing a thesis 
> about vines and copulas.
> I'm having trouble with the package 'fCalendar', which I need for running 'QRMlib'.
> The problem is that 'fCalendar' doesn't have a namespace. I need to 
> use R 2.14.1 because I also need the package 'vines', which only works with 
> R 2.14.1.
> I'm afraid making a namespace myself is much too complicated; I have read 
> a lot about it, but I really don't know how to do it exactly.
> Is it possible to get a version of 'fCalendar' with a namespace, so adjusted for 
> R 2.14.1?
> A tar.gz file would be fine.
> I really hope you can help me,
>
> kind regards,
> Britt Grootaerd
>




Re: [R] xtable prcomp

2012-02-23 Thread Charles Roosen
Hi,

The "xtable" method for "summary.prcomp" just creates a table of the 
"importance" values, so you can get the same table for only the first few PCs 
by subscripting the importance matrix:

xtable(mySummary$importance[,1:2],digits=4)
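For instance, a small sketch using the built-in USArrests data (the xtable() call is commented out so the example needs only base R):

```r
myPCA <- prcomp(USArrests, scale. = TRUE)
mySummary <- summary(myPCA)

# The importance matrix has one column per PC and three rows:
# standard deviation, proportion of variance, cumulative proportion.
imp2 <- mySummary$importance[, 1:2]
imp2

# library(xtable)
# print(xtable(imp2, digits = 4))  # LaTeX table of the first 2 PCs only
```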

Best,
Charlie

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Riccardo Romoli
Sent: 22 February 2012 23:22
To: r-help@r-project.org
Subject: [R] xtable prcomp

Hi, I need to export to LaTex the summary of a PCA. So:

myPCA <- prcomp(myDF)
mySummary <- summary(myPCA)
#
print(xtable(mySummary))

How can I export to LaTeX not the whole summary but only the first n PCs?

Best
Riccardo




Re: [R] inserting a dataframe into the table

2012-02-23 Thread Uwe Ligges



On 23.02.2012 04:17, arunkumar wrote:

Yes, the table already exists and it contains data.
I want to insert two data frames into that table separately.



I still wonder why people write messages completely out of context.

Uwe Ligges



-
Thanks in Advance
 Arun





Re: [R] how to merge commands

2012-02-23 Thread Nino Pierantonio
Thanks Don for your suggestion. I have received the original data in 
Excel xlsx format, so I must work with that unless I want to change the file 
format of thousands of files... I am also saving my output R files in 
Excel format to keep them compatible with the original ones. Everything 
will then be stored in a proper database for further analysis after some 
basic data management.


Nino

On 23/02/2012 00:49, MacQueen, Don wrote:

Are you absolutely certain that the data must be stored in Excel?

In the long run I believe you will find it easier if the data is stored in
an external database, or some other data repository that does not require
you to read so many separate files.

Probably the best you can hope for as it is now is to put these commands
inside a loop, or nested loops, with the input and output file names
constructed from the loop indexes [see help('paste') for constructing file
names].
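A sketch of that looping pattern; the file names and the commented-out reader/writer calls are hypothetical placeholders:

```r
# Construct input/output file names from the loop index with sprintf()
# (paste() works too; see help('paste') and help('sprintf')).
for (i in 1:3) {
  infile  <- sprintf("input_%02d.xlsx", i)   # e.g. "input_01.xlsx"
  outfile <- sprintf("output_%02d.xlsx", i)
  # dat <- readxl::read_excel(infile)        # or your Excel reader of choice
  # ...process dat...
  # writexl::write_xlsx(dat, outfile)
  cat("would process", infile, "->", outfile, "\n")
}
```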

-Don






Re: [R] installing the package Rcplex

2012-02-23 Thread Uwe Ligges



On 22.02.2012 21:04, zheng wei wrote:

Based on my understanding of the manual, I unzipped the file and put the 
Rcplex folder under the directory c:/temp.
Then I used cmd under Windows to go to the directory C:\Program
Files\R\R-2.13.0\bin, where my R is installed,
and typed R CMD INSTALL
"c:/temp/Rcplex"


I'd

1. install a recent version of R

2. add R's bin directory to the PATH and go to c:\temp and say

R CMD build Rcplex

followed by

R CMD INSTALL Rcplex_version.tar.gz


3. When it fails, I'd try to find out if the ERROR message is helpful.

4. If 3 fails, I'd ask the maintainer for help - including relevant 
information, like the error message, install paths, version information etc.



Uwe Ligges




I got the error of configuration failed for package "Rcplex"

Any idea?





  From: Uwe Ligges
To: zheng wei
Cc: David Winsemius; 
"r-help@r-project.org"
Sent: Wednesday, February 22, 2012 4:04 AM
Subject: Re: [R] installing the package Rcplex



On 22.02.2012 03:33, zheng wei wrote:

Thanks.

I was just reminded by the tech support at my university that cplex is independent 
software owned by ILOG, which in turn is now owned by IBM. I succeeded in installing the 
cplex software under the directory "C:/Program 
Files/IBM/ILOG/CPLEX_Studio_Academic124/cplex".
I guess Rcplex is an R package that utilizes the cplex software. I have changed the path 
"/c/ilog/cplex111" to the above path. My question is how to finally and 
effectively install the Rcplex package?


You have been asked already to read the R Installation and
Administration manual.

Uwe Ligges





Thanks,
Wei




From: Uwe Ligges
To: zheng wei
Cc: David Winsemius; 
"r-help@r-project.org"
Sent: Tuesday, February 21, 2012 2:14 PM
Subject: Re: [R] installing the package Rcplex



On 21.02.2012 19:57, zheng wei wrote:

Thank you both for helping. I still could not figure it out.

I contacted various IT support departments at my university but did 
not get any help.

For the moment, I just want to know what the package's installation instructions 
mean. You can find these instructions at 
http://cran.r-project.org/web/packages/Rcplex/INSTALL
 
--
***WINDOWS***
Installation on Windows systems is done by using the provided
Makevars.win file in the src directory. It contains the following
lines:
PKG_CPPFLAGS=-I<cplexdir>/include
PKG_LIBS=-L<cplexdir>/lib/x86_windows_vs2008/stat_mda -lcplex111 -lm
where <cplexdir> is the cplex installation directory,
e.g. /c/ilog/cplex111. Please edit your Makevars.win file accordingly.
We have successfully tested this procedure with CPLEX 11.1 on 32-bit
Windows XP.
--


I can find the file and see the lines. But what new path should I put in, and 
what should I do next?


The path to your CPLEX installation?

Uwe Ligges






Thanks,
Wei



From: Uwe Ligges
To: David Winsemius
Cc: zheng wei; 
"r-help@r-project.org"
Sent: Monday, February 20, 2012 6:01 AM
Subject: Re: [R] installing the package Rcplex



On 20.02.2012 01:54, David Winsemius wrote:


On Feb 19, 2012, at 7:45 PM, zheng wei wrote:


I did not know this before. I installed it as you suggested. What should I do
next?


Read the Installation Manual?



And don't forget this is a source package for which no CRAN Windows
binary exists, hence it may not be that straightforward to get it done
and you will have to read the INSTALL file from the source package carefully.

Uwe Ligges




[R] Problems with Cosine Similarity using library(lsa)

2012-02-23 Thread A J

Hi everybody!

I have been trying to use library(lsa) on 64-bit R for Windows, but it was not 
possible. Every time I try to load it with library(lsa), R gives me back the 
following message:

Loading required package: Snowball
Error : .onLoad failed in loadNamespace() for 'Snowball', details:
  call: NULL
  error: .onLoad failed in loadNamespace() for 'rJava', details:
  call: stop("No CurrentVersion entry in '", key, "'! Try re-installing Java
    and make sure R and Java have matching architectures.")
  error: objeto 'key' no encontrado
Error: package 'Snowball' could not be loaded

Of course, I have installed all the necessary packages, but the only way 
library(lsa) works is on the 32-bit release of R. The problem there is that R 
won't let me load all the data from my matrix and tells me that it is not able 
to allocate such big vectors (maybe due to memory limitations of the 32-bit 
release).

The issue is that I need to calculate cosine similarities on my matrix data. 
Does anybody have any suggestion or idea about how to do it (a different 
library or a formula to get it)?
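In case it helps, column-wise cosine similarity needs nothing beyond base matrix algebra, so one alternative is to skip lsa entirely. A sketch on a small random matrix standing in for the real data:

```r
set.seed(42)
m <- matrix(runif(20), nrow = 5, ncol = 4)  # stand-in data

# Cosine similarity between all pairs of columns:
# crossprod(m) gives the column dot products, and dividing by the
# outer product of the column norms normalises them.
cosine_sim <- function(m) {
  cp <- crossprod(m)
  norms <- sqrt(diag(cp))
  cp / tcrossprod(norms)
}

s <- cosine_sim(m)
diag(s)  # all 1: each column has cosine similarity 1 with itself
```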

Thanks in advance.

Best,

AJ



Re: [R] Several densityplots in single figure

2012-02-23 Thread josh rosen
Thank you very much Ilai, I'll try and implement this right away.

On 22 February 2012 18:21, ilai  wrote:
> On Wed, Feb 22, 2012 at 8:49 AM, David Winsemius  
> wrote:
>
>>
>> After going back and constructing a proper dataset, you should be passing
>> 'groups' into the panel function and picking it up inside panel.abline.
>
> Close, but unfortunately things get more complicated when using groups
> in densityplot. A straight-up panel function wouldn't work for him out
> of the box (and you/the list would be getting more follow-ups). He'll need
> something like:
> densityplot(~value | g1, groups=g2, data= combinedlongformat,
>  panel=panel.superpose, panel.groups=function(x,...,group.number){
>  panel.abline(v=mean(x),lwd=(1:0)[group.number])  # seems like he only wants 
> x1
>  panel.densityplot(x,...)
>  })
>
> At this stage in the game
> trellis.focus('panel',1,1) ; panel.abline(v=calculatedmean1)
> trellis.focus('panel',2,1) ; panel.abline(v=calculatedmean2)
> ...
> trellis.unfocus()
>
> Would probably go down easier...
>
> Just a thought
>
> Cheers,
> Elai
>
>
> You
>> can also recover the 'panel.number()' and then use them both to recover the
>> means from the global environment when inside panel.abline. I probably would
>> have used tapply to create a table with an index (in the enclosing
>> environment) rather than creating 4 separate values to choose from. That way
>> you could use one object name but construct the 4 indices. But for that to
>> work you would have needed to follow my example about binding the 4 datasets
>> into one.
>>
>>
>> One way to find previous worked examples in the rhelp Archives is to search
>> with your favorite engine on strategies like "panel.number groups Sarkar" or
>> "panel.number groups Andrews"
>>
>> Deepayan Sarkar and Felix Andrews are the two persons from whom I have
>> learned the most regarding the fine points of lattice plots.
>>
>> And will you please learn to post in plain text?
>>
>> --
>> david
>>
>>
>>>                }
>>> )
>>>
>>>
>>>
>>>
>>>
>>>
>>> On 22 February 2012 13:48, David Winsemius  wrote:
>>>
>>> On Feb 22, 2012, at 5:28 AM, josh rosen wrote:
>>>
>>> Hi,
>>>
>>> I have created two separate overlapping density plots- see example code
>>> below.
>>> What I wish now to do is combine them into one figure where they sit side
>>> by side.
>>> Any help would be great!
>>>
>>> many thanks in advance, josh.
>>>
>>> #
>>> thedataA <- data.frame(x1=rnorm(100,1,1),x2=rnorm(100,3,1)) #create data
>>> thedataA.m<-melt(thedataA)
>>>
>>> densityplot(~value, thedataA.m, groups=variable,auto.key=list(columns=2),
>>>
>>>     panel = function(x, y, ...) {
>>>             panel.densityplot(x, ...)
>>>             panel.abline(v=0)
>>>     }
>>> )
>>>
>>>
>>> The syntax for grouping (which gives the overlaid but different colors as
>>> default output) and "separation" is fairly simple. Use the "|" operator for
>>> separated plots and the " .., groups= , .."  parameter for overlaying
>>> results. It's only going to work easily if they are all in the same dataset.
>>>
>>> Try:
>>>
>>>  bigset <- cbind( rbind(thedataA.m, thedataB.m), ABgrp=rep(c("datA",
>>> "datB"), each=200) )
>>>  densityplot(~value|ABgrp, data=bigset, groups=variable,
>>> auto.key=list(columns=2),
>>>
>>>      panel = function(x, y, ...) {
>>>              panel.densityplot(x, ...)
>>>              panel.abline(v=0)   } )
>>>
>>>
>>> And please work on whatever practice is producing duplicate postings.
>>>
>>> --
>>> David.
>>>
>>>
>>>
>>>
>>> thedataB <- data.frame(x1=rnorm(100,2,1),x2=rnorm(100,4,1)) #create data
>>>
>>> thedataB.m<-melt(thedataA)
>>>
>>> I assume that is a copy-paste-fail-to-correct error.
>>>
>>>
>>> densityplot(~value, thedataB.m, groups=variable,auto.key=list(columns=2),
>>>
>>>     panel = function(x, y, ...) {
>>>             panel.densityplot(x, ...)
>>>             panel.abline(v=0)
>>>     }
>>> )
>>> ##
>>>
>>>       [[alternative HTML version deleted]]
>>
>>
>>
>> David Winsemius, MD
>> West Hartford, CT
>>



Re: [R] Problems with Cosine Similarity using library(lsa)

2012-02-23 Thread Uwe Ligges
The error message suggests that you do not have Java installed. And 
since you said it works in 32-bit: you only have a 32-bit Java but no 
64-bit Java installed on your machine.


Uwe Ligges


On 23.02.2012 11:08, A J wrote:


Hi everybody!

I have intended to use library(lsa) on R 64-bits for Windows but it was not 
possible. Every time I try to launch library(lsa) function R give me back next 
message:

Loading required package: SnowballError : .onLoad failed in loadNamespace() for 'Snowball', 
details:  call: NULL  error: .onLoad failed in loadNamespace() for 'rJava', details:  call: 
stop("No CurrentVersion entry in '", key, "'! Try re-installing Java and make sure R 
and Java have matching architectures.")  error: objeto 'key' no encontradoError: package 
‘Snowball’ could not be loaded

Of course, I have loaded all necessary packages, but the only way to 
library(lsa) works it is on R 32-bits release. The problem here is that R don't 
leave me to load all data from my matrix and tell me that it is not able to 
load big vectors (may be due to limitations on memory of 32-bit release).

The issue is that I need to calculate cosine similarities on my matrix data. 
Has somebody any suggestion or idea about how to do it (a different library or 
a formula to get it)?

Thanks in advance.

Best,

AJ  








[R] Schoenfeld residuals for a null model coxph

2012-02-23 Thread Federico Calboli
Hi,

I have a coxph model like

coxph(Surv(start, stop, censor) ~ x + y, mydata)

I would like to calculate the Schoenfeld residuals for the null, i.e. the same 
model where the beta hat vector (in practical terms, the coeff vector spat out 
by summary()) is constrained to be all 0s -- all else stays the same.

I could calculate it by hand, but I was wondering if there is a way of doing it 
with resid() -- resid(my.coxph.mod, type = 'schoenfeld') works very well for 
the true Schoenfeld residuals, but I haven't figured out how to use it while 
constraining the beta hat vector to be all 0s.
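One approach worth trying (a sketch, not tested against your data) is to refit with the coefficients fixed at zero by giving coxph() a zero init vector and zero iterations, then call resid() as usual. Illustrated on the survival package's built-in lung data:

```r
library(survival)

# init fixes the starting betas; iter.max = 0 forbids any update,
# so the "fitted" model is exactly the null model (beta = 0).
fit0 <- coxph(Surv(time, status) ~ age + sex, data = lung,
              init = c(0, 0), control = coxph.control(iter.max = 0))
coef(fit0)  # both coefficients are 0
r0 <- resid(fit0, type = "schoenfeld")
head(r0)
```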

BW

F

--
Federico C. F. Calboli
Neuroepidemiology and Ageing Research
Imperial College, St. Mary's Campus
Norfolk Place, London W2 1PG

Tel +44 (0)20 75941602   Fax +44 (0)20 75943193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com



Re: [R] why is generating the same graph???

2012-02-23 Thread S Ellison
 

> -Original Message-
> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-project.org] On Behalf Of Vanúcia Schumacher
> Sent: 23 February 2012 00:08
> To: r-help@r-project.org
> Subject: [R] why is generating the same graph???
> 
> 
> Hi,
> Why is my script always generating the same graph when I change the 
> parameters and the name of the text file?

The usual - in fact probably the only - explanation for the same graph is that 
the same data set is plotted. 

This can happen by mistake for lots of reasons, all associated with the 
operator. Check that:
- you have no error messages on file reading. if read.table fails, the initial 
data set will not be replaced, and R will plot the data set by that name.
- your script is actually plotting the data set you are reading; it is 
surprisingly easy to get names wrong by a character and not notice
- that you are reading the right file
- that the files you are reading contain different data
- that the differences are in column 1 of the data set and not in another column
- if you're plotting inside a function, check that the data plotted has the 
name of the argument and not the name of an object somewhere else in your 
workspace. (This happens often when testing scripts; if we say f <- function(x) 
plot(y)   and then run f using, say, f(z), the function will plot y if it 
exists in the parent environment of the function. It's not unusual to have 
created a temporary y (or whatever) to test the code...)
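That last pitfall can be shown in three lines (the names here are hypothetical):

```r
y <- "global data"          # an object left over in the workspace
f <- function(x) y          # bug: the body uses y, not the argument x
f("the data you passed")    # silently returns the global y
```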

S


> library(MASS)
> dados <- read.table("inverno.txt", header=FALSE)
> vento50 <- fitdistr(dados[[1]], densfun="weibull")
> png(filename="invernoRG.png", width=800, height=600)
> hist(dados[[1]], seq(0, 18, 0.5), prob=TRUE, xlab="Velocidade (m/s)",
>      ylab="Densidade", main="50 m")
> curve(dweibull(x, shape=0.614, scale=2.435), 0, 18, add=T, col='red')
> dev.off()
> 
> Best Regards
> 
> Best Regards



Re: [R] removing particular row from matrix

2012-02-23 Thread S Ellison
 

> -Original Message-
> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-project.org] On Behalf Of uday

> Error in rowSums(a[, 2] == -999.99) : 
>   'x' must be an array of at least two dimensions
Indeed it must - but you have asked for rowSums on a one-dimensional object 
(a[,2]). You didn't need to sum the rows of that.

Try

a[ a[,2] != -999 , ]

assuming that the column is integer or that you have read the Note on finite 
representation of fractions in ?Comparison and are willing to take your 
chances, or 

a[ a[,2] > -998 , ]

if it's not and if it's safe to assume that all large negative numbers are 
'missing'.

Better still, if you read the data using read.table, use 
na.strings=c("-999", "NA"). That will mark "-999" as missing data. An 
na.omit() will then remove the offending rows (but also those that contain NAs 
for other reasons).
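A small self-contained illustration of the two subscripting approaches, on made-up data with -999.99 as the missing-value code:

```r
a <- cbind(1:5, c(0.5, -999.99, 1.2, -999.99, 3.4))

# Keep rows whose second column is not the sentinel; the ">" form
# avoids an exact floating-point comparison with -999.99.
kept <- a[ a[, 2] > -998, , drop = FALSE ]
nrow(kept)  # 3 rows survive

# Or recode the sentinel as NA and let na.omit() drop incomplete rows:
a[ a[, 2] < -998, 2 ] <- NA
na.omit(a)
```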

S



Re: [R] Problems with Cosine Similarity using library(lsa)

2012-02-23 Thread A J

Thanks Uwe, you were right. Now the library is working, but I have the next 
problem as well.

I have loaded a matrix (named matrix_v3) from a TXT file. This matrix was 
previously created using the xtabs function. Now I have used this call to 
obtain cosine similarities from my data matrix (following the instructions of 
the lsa package):

> cosine(matrix_v3, y = NULL)

Finally R returns this message:

Error en cosine(matrix_v3, y = NULL) :   argument mismatch. Either one matrix 
or two vectors needed as input.

I understand that R does not treat matrix_v3 as a real matrix. If that is the 
case, what can I do? I have tried "as.matrix" but it does not work. Sorry if 
my questions are not very good, but I am a newbie at using R.

Thanks again.


> Date: Thu, 23 Feb 2012 11:35:04 +0100
> From: lig...@statistik.tu-dortmund.de
> To: anxu...@hotmail.com
> CC: r-help@r-project.org
> Subject: Re: [R] Problems with Cosine Similarity using library(lsa)
> 
> The error message suggests that you do not have Java installed. And 
> since you said it works in 32-bit: You only have a 32-bit Java but no 
> 64-bit Java installed in your machine.
> 
> Uwe Ligges
> 
> 
> On 23.02.2012 11:08, A J wrote:
> >
> > Hi everybody!
> >
> > I have intended to use library(lsa) on R 64-bits for Windows but it was not 
> > possible. Every time I try to launch library(lsa) function R give me back 
> > next message:
> >
> > Loading required package: SnowballError : .onLoad failed in loadNamespace() 
> > for 'Snowball', details:  call: NULL  error: .onLoad failed in 
> > loadNamespace() for 'rJava', details:  call: stop("No CurrentVersion entry 
> > in '", key, "'! Try re-installing Java and make sure R and Java have 
> > matching architectures.")  error: objeto 'key' no encontradoError: package 
> > ‘Snowball’ could not be loaded
> >
> > Of course, I have loaded all necessary packages, but the only way to 
> > library(lsa) works it is on R 32-bits release. The problem here is that R 
> > don't leave me to load all data from my matrix and tell me that it is not 
> > able to load big vectors (may be due to limitations on memory of 32-bit 
> > release).
> >
> > The issue is that I need to calculate cosine similarities on my matrix 
> > data. Has somebody any suggestion or idea about how to do it (a different 
> > library or a formula to get it)?
> >
> > Thanks in advance.
> >
> > Best,
> >
> > AJ  
> > [[alternative HTML version deleted]]
> >
> >
> >
> >



Re: [R] Problems with Cosine Similarity using library(lsa)

2012-02-23 Thread Uwe Ligges



On 23.02.2012 13:07, A J wrote:


Thanks Uwe, you vere right. Now the library is working but I have next problem 
as well.

I have loaded a matrix (named matrix_v3) from a TXT file. This matrix was 
previously reached using xtabs function. Now I have used this formula to obtain 
cosine similarities from my data matrix (following instructions of lsa package 
instructions):


cosine(matrix_v3, y = NULL)


Finally R returns this message:

Error en cosine(matrix_v3, y = NULL) :   argument mismatch. Either one matrix 
or two vectors needed as input.

I understand that R don't contemplate matrix_v3 like a real matrix. If this is like this, 
how can I do? I have tried "as.matrix" but it does not work. Sorry if my 
questions are not very fine, but I am newbie in using R.



We do not know what your object matrix_v3 is, so how could we help? 
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Thanks again.



Date: Thu, 23 Feb 2012 11:35:04 +0100
From: lig...@statistik.tu-dortmund.de
To: anxu...@hotmail.com
CC: r-help@r-project.org
Subject: Re: [R] Problems with Cosine Similarity using library(lsa)

The error message suggests that you do not have Java installed. And
since you said it works in 32-bit: You only have a 32-bit Java but no
64-bit Java installed in your machine.

Uwe Ligges


On 23.02.2012 11:08, A J wrote:


Hi everybody!

I have intended to use library(lsa) on R 64-bits for Windows but it was not 
possible. Every time I try to launch library(lsa) function R give me back next 
message:

Loading required package: SnowballError : .onLoad failed in loadNamespace() for 'Snowball', 
details:  call: NULL  error: .onLoad failed in loadNamespace() for 'rJava', details:  call: 
stop("No CurrentVersion entry in '", key, "'! Try re-installing Java and make sure R 
and Java have matching architectures.")  error: objeto 'key' no encontradoError: package 
‘Snowball’ could not be loaded

Of course, I have loaded all necessary packages, but the only way to 
library(lsa) works it is on R 32-bits release. The problem here is that R don't 
leave me to load all data from my matrix and tell me that it is not able to 
load big vectors (may be due to limitations on memory of 32-bit release).

The issue is that I need to calculate cosine similarities on my matrix data. 
Has somebody any suggestion or idea about how to do it (a different library or 
a formula to get it)?

Thanks in advance.

Best,

AJ  










[R] gamlss results for EXP and LNO seem to have reversed AIC scores

2012-02-23 Thread Mikis Stasinopoulos
Dear Richard 

I think the results below are consistent

set.seed(1020)
 # created from EXP
 X1 <- rEXP(1000)
 Gexp <- gamlss(X1~1,family=EXP)
#GAMLSS-RS iteration 1: Global Deviance = 1999.762 
#GAMLSS-RS iteration 2: Global Deviance = 1999.762 
 Glno <- gamlss(X1~1,family=LOGNO)
#GAMLSS-RS iteration 1: Global Deviance = 2213.252 
#GAMLSS-RS iteration 2: Global Deviance = 2213.252 
 GAIC(Gexp, Glno)
# df  AIC
#Gexp  1 2001.762
#Glno  2 2217.252
# EXP best
 
# create from LOGNO
 X2 <- rLOGNO(1000)
 Aexp <- gamlss(X2~1,family=EXP)
#GAMLSS-RS iteration 1: Global Deviance = 2954.628 
#GAMLSS-RS iteration 2: Global Deviance = 2954.628 
 Alno <- gamlss(X2~1,family=LOGNO)
#GAMLSS-RS iteration 1: Global Deviance = 2796.725 
#GAMLSS-RS iteration 2: Global Deviance = 2796.725 
 GAIC(Aexp, Alno)
# df  AIC
#Alno  2 2800.725
#Aexp  1 2956.628
# LOGNO best

Mikis
 



Prof Mikis Stasinopoulos
d.stasinopou...@londonmet.ac.uk



Companies Act 2006 : http://www.londonmet.ac.uk/companyinfo



Re: [R] Problems with Cosine Similarity using library(lsa)

2012-02-23 Thread A J

OK, I will try to explain as best I can:

Starting point: I had an arc list (like the lists provided by social network 
software: Pajek, Ucinet...) with this form:

IDP1 IDP2 SUMVAL
1 2 0.065
1 3 0.044
3 1 0.071
3 4 0.016
3 5 0.011
4 3 0.004
4 7 0.004
4 9 0.004
...

I transformed this list (recorded in a CSV file) into "matrix" using this code:

> nbrs <- 
> read.table("C:/Users/AJ/Desktop/paper3/prueba3/pajek/mix_dc_cc_cp_jidpajek.csv",
>  header=T, quote="\"")
> mymatrix <- xtabs(SUM_NORM_DC_CC_CP ~ IDP1 + IDP2, data=nbrs)
> write.table(mymatrix, file = "C:/backup/matriz_v3.csv", row.names= FALSE, 
> col.names= FALSE)

Now my intention is to calculate cosine similarities on the data of this 
"matrix" (14179 rows x 10317 columns). To do this I have tried:

> library(lsa)
> matrix_v3 <- 
> read.table("C:/Users/AJ/Desktop/paper3/matriz_lista_10/matrix_v3.csv", sep= " 
> ", header= F)
> as.matrix <- matrix_v3
> cosine(matrix_v3, y = NULL)

And R returns me this message:

Error en cosine(matrix_v3) :   argument mismatch. Either one matrix or two 
vectors needed as input.

After that, I tried to convert "matrix_v3" using > cosine(matrix_v3, y = NULL) 
but it was not possible.

That's all. I hope this is enough. Thanks, and sorry for the inconvenience.
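If I may guess at the cause: `as.matrix <- matrix_v3` assigns matrix_v3 to a new object called as.matrix instead of calling the function, and read.table() returns a data frame, which cosine() rejects. A hedged sketch of the conversion, with small made-up data standing in for matrix_v3:

```r
df <- data.frame(a = c(1, 0, 2), b = c(0, 1, 2), c = c(1, 1, 0))

m <- as.matrix(df)   # call the function; don't assign to its name
is.matrix(m)         # TRUE: a numeric matrix is what lsa::cosine() expects
# cosine(m)          # should now be accepted as a single-matrix input
```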

AJ


> Date: Thu, 23 Feb 2012 13:15:16 +0100
> From: lig...@statistik.tu-dortmund.de
> To: anxu...@hotmail.com
> CC: r-help@r-project.org
> Subject: Re: [R] Problems with Cosine Similarity using library(lsa)
> 
> 
> 
> On 23.02.2012 13:07, A J wrote:
> >
> > Thanks Uwe, you vere right. Now the library is working but I have next 
> > problem as well.
> >
> > I have loaded a matrix (named matrix_v3) from a TXT file. This matrix was 
> > previously reached using xtabs function. Now I have used this formula to 
> > obtain cosine similarities from my data matrix (following instructions of 
> > lsa package instructions):
> >
> >> cosine(matrix_v3, y = NULL)
> >
> > Finally R returns this message:
> >
> > Error en cosine(matrix_v3, y = NULL) :   argument mismatch. Either one 
> > matrix or two vectors needed as input.
> >
> > I understand that R don't contemplate matrix_v3 like a real matrix. If this 
> > is like this, how can I do? I have tried "as.matrix" but it does not work. 
> > Sorry if my questions are not very fine, but I am newbie in using R.
> 
> 
> We do not know what you object matrix_v3 is, so how could we help? 
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
> 
> > Thanks again.
> >
> >
> >> Date: Thu, 23 Feb 2012 11:35:04 +0100
> >> From: lig...@statistik.tu-dortmund.de
> >> To: anxu...@hotmail.com
> >> CC: r-help@r-project.org
> >> Subject: Re: [R] Problems with Cosine Similarity using library(lsa)
> >>
> >> The error message suggests that you do not have Java installed. And
> >> since you said it works in 32-bit: You only have a 32-bit Java but no
> >> 64-bit Java installed in your machine.
> >>
> >> Uwe Ligges
> >>
> >>
> >> On 23.02.2012 11:08, A J wrote:
> >>>
> >>> Hi everybody!
> >>>
> >>> I have intended to use library(lsa) on R 64-bits for Windows but it was 
> >>> not possible. Every time I try to launch library(lsa) function R give me 
> >>> back next message:
> >>>
> >>> Loading required package: SnowballError : .onLoad failed in 
> >>> loadNamespace() for 'Snowball', details:  call: NULL  error: .onLoad 
> >>> failed in loadNamespace() for 'rJava', details:  call: stop("No 
> >>> CurrentVersion entry in '", key, "'! Try re-installing Java and make sure 
> >>> R and Java have matching architectures.")  error: objeto 'key' no 
> >>> encontradoError: package ‘Snowball’ could not be loaded
> >>>
> >>> Of course, I have loaded all necessary packages, but the only way to 
> >>> library(lsa) works it is on R 32-bits release. The problem here is that R 
> >>> don't leave me to load all data from my matrix and tell me that it is not 
> >>> able to load big vectors (may be due to limitations on memory of 32-bit 
> >>> release).
> >>>
> >>> The issue is that I need to calculate cosine similarities on my matrix 
> >>> data. Has somebody any suggestion or idea about how to do it (a different 
> >>> library or a formula to get it)?
> >>>
> >>> Thanks in advance.
> >>>
> >>> Best,
> >>>
> >>> AJ
> >>>   [[alternative HTML version deleted]]
> >>>
> >>>
> >>>
> >>>
> >>> __
> >>> R-help@r-project.org mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> PLEASE do read the posting guide 
> >>> http://www.R-project.org/posting-guide.html
> >>> and provide commented, minimal, self-contained, reproducible code.
> > 
  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help

[R] saving all data in r object

2012-02-23 Thread uday

I have 100 data files, which contain very large data sets of location
details (e.g. latitude, longitude, time, temp). 
Now I would like to save all the data from these 100 files in an R object, so I
can reload the data at any time. 

* Every file has different length of data 

latitude  <- NULL 
longitude  <- NULL
time<- NULL 
temp  <- NULL 

for ( i in 1:100) { 

data<- read.table(file_s[i],header=TRUE,skip=55 )
latitude [i]  <- data[,6] 
longitude[i]   <- data[,7]
time[i]<- data[,8]
temp[i ] <- data [,9]

} 
save(latitude=latitude,longitude=longitude, time=time,temp=temp,
file="data.RData")
but it does not work. 

I am new in R and I got stuck here.

Cheers 
Uday 






--
View this message in context: 
http://r.789695.n4.nabble.com/saving-all-data-in-r-object-tp4413092p4413092.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] multiple gsub

2012-02-23 Thread TwistedSkies
Hi Guys,

I am relatively new to R and was wondering if I could nest my gsub command
when identifying one object.

I have data which looks like this:  Taiwan_250km
I want it to look like this: Taiwan_250km

So essentially I just want to gsub '' and   with nothing!

So far I have got this:  PolyNam <-
unlist(strsplit(gsub("","",PolyRaw[PolyLin],fixed = TRUE)," "))

Which removes the end tag, just wondering how I can nest 2 gsubs to remove
both?!

Thanks in advance! 


  

--
View this message in context: 
http://r.789695.n4.nabble.com/multiple-gsub-tp4413481p4413481.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] error in fitdistr

2012-02-23 Thread Mark Difford
On Feb 23, 2012 at 11:19am Soheila wrote:

> Who can help me to solve this problem?
> est.chi[i,]<-c(
> fitdistr(as.numeric(data2[,i]),"chi-squared",start=list(df=1))$estimate)
> Warning message:In optim(x = c(7.86755, 7.50852, 7.86342, 7.70589,
> 7.70153, 7.58272,  :  one-diml optimization by 
> Nelder-Mead is unreliable: use "Brent" or optimize() directly

The warning message tells you to use "Brent" rather than the default
Nelder-Mead. So do that.

##
?optim
est.chi[i,]<-c( fitdistr(as.numeric(data2[,i]), densfun="chi-squared",
start=list(df=1), method="Brent")$estimate)
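[Editorial note: one caveat worth adding — optim's "Brent" method requires finite lower and upper bounds, which fitdistr passes through. A sketch on simulated data; the bounds here are arbitrary assumptions:]

```r
library(MASS)
set.seed(1)
x <- rchisq(200, df = 3)   # simulated stand-in for as.numeric(data2[, i])
fit <- fitdistr(x, densfun = "chi-squared", start = list(df = 1),
                method = "Brent", lower = 0.01, upper = 100)
fit$estimate               # should recover a value near df = 3
```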

Regards, Mark.

-
Mark Difford (Ph.D.)
Research Associate
Botany Department
Nelson Mandela Metropolitan University
Port Elizabeth, South Africa
--
View this message in context: 
http://r.789695.n4.nabble.com/error-in-fitdistr-tp4413293p4413459.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] resistanceDistance representation

2012-02-23 Thread Lisa Santambrogio
Dear all,
i'm using gdistance to model animal movement across landscape.
I have imported 11 rasters with roads, freeways, slope, use-of-land, lakes
(...) after recoding them with GRASS with a HSI value ranging from 1 to 4.
I've assigned zero to the NAs and then transformed all the rasters in
TransitionLayers (function=mean,directions=4) and later summed all of them
into a new transition:


> MY_transition
class   : TransitionLayer
dimensions  : 2181, 1648, 3594288  (nrow, ncol, ncell)
resolution  : 100.0049, 99.98945  (x, y)
extent  : 1460708, 1625516, 4947383, 5165460  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=tmerc +lat_0=0 +lon_0=9 +k=0.9996 +x_0=150
+y_0=0 +ellps=intl +towgs84=-225,-65,9,0,0,0,0 +units=m +no_defs
values  : conductance
matrix class: dsCMatrix

I applied two different geocorrections to this same transition:

> MY_correction<-geoCorrection(MY_transition,"c",F,scl=T)

> MY_Rcorrection<-geoCorrection(MY_transition,"r",F,scl=T)

My coords are 44 points where samples were taken (some further genetic
analyses are programmed), but i've also tried with only two points:

> coords<-read.csv("coords_md.csv",header=F)
> mycoords<-as.matrix(coords)
> my_spatialpoints<-SpatialPoints(mycoords)

Finally I calculated three geographical distance matrices:

> geodist <- pointDistance(my_spatialpoints,longlat=FALSE)
> summary(geodist)
       V1               V2               V3               V4               V5
 Min.   :     0   Min.   :     0   Min.   :     0   Min.   :     0   Min.   :     0
 1st Qu.:133788   1st Qu.:133788   1st Qu.:133788   1st Qu.:133788   1st Qu.:133788
 Median :133788   Median :133788   Median :133788   Median :133788   Median :135168
 Mean   :114704   Mean   :117371   Mean   :120166   Mean   :123097   Mean   :126174
 3rd Qu.:137971   3rd Qu.:137971   3rd Qu.:137971   3rd Qu.:137971   3rd Qu.:137971
 Max.   :137971   Max.   :137971   Max.   :137971   Max.   :137971   Max.   :137971
                  NA's   :     1   NA's   :     2   NA's   :     3   NA's   :     4
()

> costdist <- costDistance(MY_correction, my_spatialpoints)

> summary(costdist)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   0.00    0.00   21.92   68.53  210.90  244.70


> resdist <- resistanceDistance(MY_Rcorrection, my_spatialpoints)
> summary(resdist)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      0       0    1022    1707    4286    4830

Now, I'd like to represent this measures but I couldn't find any examples
of how to do it; plotting or imaging them doesn't return a meaningful
graph, I guess I should transform them somehow but I don't know how.
ANY help would be really appreciated, along with comments about the rest of
the work done.

Thanks in advance,
Lisa

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] mgcv: Smoothing matrix

2012-02-23 Thread Man Zhang
Dear All,

I would like to extract the smoothing matrix of the fitted GAM, \hat{y} = Sy. I 
can't seem to find the function or am I missing something?

Thanks, any help is greatly appreciated
Man Zhang

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] using shapefiles in adehabitat/ converting shapefile to spatial pixel data frame

2012-02-23 Thread Soanes, Louise
Hello
I wonder if anybody can help,
I am using the package adehabitatHR to estimate the potential distribution of a 
species using the command "domain"

In the example given in the AdehabitatHS manual a map containing elevation 
information is loaded (class= spatial pixels data frame) as well as the GPS 
points of the animals being tracked, these are then plotted on each other and 
estimation of habitat suitability performed.

I have a map containing sea depth information in the form of a shapefile which 
I want to relate to some GPS tracks I have, however I believe that the 
shapefile with my sea depth information has to be of the class spatial pixels 
data frame before I can perform many of the analysis in AdehabitatHS,

I have tried converting the shapefile to a spatial pixel data frame using the 
sp package by exporting the data from the attribute table in my depth shapefile 
then importing it into R as a csv file (see my attempts below)

>depth<-read.csv("depthmap.csv")
>pts = depth[c("x", "y")]
>y = SpatialPixels(SpatialPoints(pts))

suggested tolerance minimum: 1
Error in points2grid(points, tolerance, round) :
 dimension 1 : coordinate intervals are not constant

> depth<-read.csv("a.csv")
> coordinates(depth) <- c("x", "y")
> points2grid(depth)
suggested tolerance minimum: 1
Error in points2grid(depth) :
  dimension 1 : coordinate intervals are not constant

But these do not seem to work, does anyone have any ideas?

Many thanks
Louise Soanes


School of Environmental Sciences
University of Liverpool
Brownlow Hill
L69 3GP

http://sites.google.com/site/puffinislandseabirdresearch/


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] saving all data in r object

2012-02-23 Thread R. Michael Weylandt
It looks like it works. (I ran your code leaving out the inner
non-reproducible loop and just saving the NULL objects with your
syntax)

What is the error you are getting?

Michael
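
[Editorial note: because each file has a different length, single-element assignments like `latitude[i] <- data[,6]` keep only one value per file. A list sidesteps that — a sketch, assuming `file_s` holds the 100 file names; note also that `save()` takes object names, not `name=value` pairs:]

```r
results <- vector("list", length(file_s))
for (i in seq_along(file_s)) {
  d <- read.table(file_s[i], header = TRUE, skip = 55)
  results[[i]] <- d[, 6:9]            # latitude, longitude, time, temp
}
all_data <- do.call(rbind, results)   # stack all files into one data frame
save(all_data, file = "data.RData")   # later: load("data.RData")
```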

On Thu, Feb 23, 2012 at 3:01 AM, uday  wrote:
>
> I have 100 data files, which contains very huge data sets of location
> details ( e.g latitude, longitude, time, temp)
> Now I would like to save the all data of these 100 files in r object, so I
> can reload data any time.
>
> * Every file has different length of data
>
> latitude      <- NULL
> longitude  <- NULL
> time            <- NULL
> temp          <- NULL
>
> for ( i in 1:100) {
>
> data<- read.table(file_s[i],header=TRUE,skip=55 )
> latitude [i]      <- data[,6]
> longitude[i]   <- data[,7]
> time[i]            <- data[,8]
> temp[i ]         <- data [,9]
>
> }
> save(latitude=latitude,longitude=longitude, time=time,temp=temp,
> file="data.RData")
> but it does not work.
>
> I am new in R and I got stuck here.
>
> Cheers
> Uday
>
>
>
>
>
> I am beginner in R and I got stuck here
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/saving-all-data-in-r-object-tp4413092p4413092.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] changing time span

2012-02-23 Thread phillen
hi there!
i am desperately in need for help.

i have read in data: 
qthm=read.csv("qthm.csv",sep=";",header=TRUE)
then created time series, i.e.
m2=ts(log(qthm$m2), start=c(1959, 1), frequency=4)
and transformed these time series into zoo variables:
qthmz=merge.zoo(diff(mbase),diff(m2),diff(cpi),diff(rgdp))

Now I want to analyse these data, but for different periods. How do I create
new time series or zoo variables with a shorter time span, i.e. from 1959.00 to
1979.75 only?

I have already read through several possibilities but nothing seems to work
well for my case, what do you recommend?

kind regards, phillen

--
View this message in context: 
http://r.789695.n4.nabble.com/changing-time-span-tp4413672p4413672.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] non-finite finite-difference value

2012-02-23 Thread Martin Spindler
Dear all,

when applying the optim function the following error occured
"non-finite finite-difference value"

Therefore I would like to ask how one can try to handle such a problem and 
which strategies have proven useful. (There is only little guidance on the help 
list for this kind of problem.)

Thank you in advance for your efforts!

Best,

Martin
--

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Improving performance of split-apply problem

2012-02-23 Thread R. Michael Weylandt
It looks like what you are doing is reasonably efficient: I do think
there's a residuals element to the object returned by lm() so you
could just call that directly (which will be just a little more
efficient).

The bulk of the time is probably being taken up in the lm() call,
which has alot of overhead: you could use fastLm from the
RcppArmadillo package or lm.fit() directly to cut alot of this out.
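
[Editorial note: a sketch of the lm.fit() route — column names A–E are assumed from the original post, the row-padding step is unchanged, and taking the MSE from the residuals also avoids the `A` vs `x$A` slip in the original function:]

```r
get_MSE_fast <- function(x) {
  X <- cbind(1, as.matrix(x[, c("B", "C", "D", "E")]))  # intercept + predictors
  fit <- lm.fit(X, x$A)                                 # skips lm()'s formula overhead
  mean(fit$residuals^2)                                 # MSE straight from residuals
}
```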

Michael

On Wed, Feb 22, 2012 at 9:10 PM, Martin  wrote:
> Hello,
> I'm very new to R so my apologies if I'm making an obvious mistake.
>
> I have a data frame with ~170k rows and 14 numeric variables. The first 2
> of those variables (let's call them group1 and group2) are used to define
> groups: each unique pair of (group1,group2) is a group. There are roughly
> 50k such unique groups, with sizes varying from 1 through 40 rows each.
>
> My objective is to fit a linear regression within each group and get its
> mean square error (MSE). So the final output needs to be a collection of
> 50k MSE's.  Now, regardless of the size of the group, the regression needs
> to be run on exactly 40 observations. If the group has less than 40
> observations, then I need to add rows to get to 40, populating all
> variables with 0's for those extra rows. Here's the function I wrote to do
> this:
>
> get_MSE = function(x) {
>  rownames(x) = x$ID  #'ID' can take on any value from 1 to 40.
>  x = x[as.character(1:40), ]
>  x[is.na(x)] = 0
>  regressionResult = lm(A ~ B + C + D + E, data=x)  #A-E are some variables
> in the data frame.
>  MSE = mean((regressionResult$fitted.values - A)^2)
>  return(MSE)
> }
>
> library(plyr)
> output = ddply(dataset, list(dataset$group1, dataset$group2), get_MSE)
>
> The above code takes about 10 minutes to run, but I'd really need it to be
> much faster, if at all possible. Is there anything I can do to speed up the
> code?
>
> Thank you very much in advance.
>
> Jose
>
>        [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Calculating Pseudo R-squared from nlme

2012-02-23 Thread dadrivr
I am fitting individual growth models using nlme (multilevel models with
repeated measurements nested within the individual), and I am trying to
calculate the Pseudo R-squared for the models (an overall summary of the
total outcome variability explained).  Singer and Willett (2003) recommend
calculating Pseudo R-squared in multilevel modeling by squaring the sample
correlation between observed and predicted values (across the sample for
each person on each occasion of measurement).

My question is which set of predicted values should I use from nlme in that
calculation?  From my models in nlme, I receive two sets of fitted values. 
Reading the description of the fitted lme values
(http://stat.ethz.ch/R-manual/R-patched/library/nlme/html/fitted.lme.html),
there appear to be two sets of fitted values that correspond to levels of
grouping, where the first set of fitted values (Level 0) corresponds to the
population fitted values, moving to progressively more inner groupings as the
levels increase (e.g., I suppose Level 1 corresponds to the individual-level
fitted values in my data).

I'm not sure I understand the distinction between population fitted values
and individual-level fitted values because each individual and each
measurement occasion has an estimate for both (population and individual
fitted estimates).  Could you please explain the distinction and which one I
should be using to calculate the Pseudo R-squared as suggested by Singer and
Willett (2003)?
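
[Editorial note: once the level is chosen, the Singer and Willett calculation itself is a one-liner — a sketch assuming a fitted lme object `fit` and the observed outcome vector `y`; level = 1 (individual-level predictions) is shown, level = 0 would use the population predictions:]

```r
# squared sample correlation between observed and predicted values
pseudo_R2 <- cor(y, fitted(fit, level = 1))^2
```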

Thanks so much for your help!

--
View this message in context: 
http://r.789695.n4.nabble.com/Calculating-Pseudo-R-squared-from-nlme-tp4413825p4413825.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] changing time span

2012-02-23 Thread R. Michael Weylandt
? window

Michael

On Thu, Feb 23, 2012 at 7:09 AM, phillen  wrote:
> hi there!
> i am desperately in need for help.
>
> i have read in data:
> qthm=read.csv("qthm.csv",sep=";",header=TRUE)
>  then created time series ie
> m2=ts(log(qthm$m2), start=c(1959, 1), frequency=4)
> transformed these time series in zoo variables
> qthmz=merge.zoo(diff(mbase),diff(m2),diff(cpi),diff(rgdp))
>
> Now I want to analyse these date, but for different periods. How to I create
> new time series or zoo variables with a shorter time span ie from 1959.00 to
> 1979.75 only?
>
> I have already read through several possibilities but nothing seems to work
> well for my case, what do you recommend?
>
> kind regards, phillen
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/changing-time-span-tp4413672p4413672.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How do I save the current session?

2012-02-23 Thread David Winsemius


On Feb 23, 2012, at 2:04 AM, Petr PIKAL wrote:


Hi


[R] How do I save the current session?

savehistory() gives me the option of saving the executable lines  
only.

I'd

like to save everything.


Using save is not enough?


I'm guessing the OP wanted to do what I accomplish in my MacGUI  
sessions with cmd-A/Save As/ , i.e. get a transcript of input and  
screen output. I don't know how that might be done on a cross-platform  
basis and the OP did not say what his environment was for his sessions  
so I just kept quiet.


--

David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] multiple gsub

2012-02-23 Thread Gabor Grothendieck
On Thu, Feb 23, 2012 at 5:28 AM, TwistedSkies  wrote:
> Hi Guys,
>
> I am relatively new to R and was wondering if I could nest my gsub command
> in identifying one object
>
> I have data which looks like this:  Taiwan_250km
> I want it to look like this:                 Taiwan_250km
>
> So essentially I just want to gsub '' and   with nothing!
>
> So far I have got this:  PolyNam <-
> unlist(strsplit(gsub("","",PolyRaw[PolyLin],fixed = TRUE)," "))
>
> Which removes the end tag, just wondering how I can nest 2 gsubs to remove
> both?!

Just remove < followed by zero or more of anything except > followed by >

gsub("<[^>]*>", "", "Taiwan_250km")

or using the XML package:

library(XML)
xmlValue(xmlRoot(xmlTreeParse("Taiwan_250km", asText = TRUE)))

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How can I ran an R command whcih is present as the content of a character object

2012-02-23 Thread Aniruddha Mukherjee
I have an object called rcom which was created by the command rcom 
<-"mean(mat_rix$COL_1)". Also the data-frame mat_rix is well defined with 
numeric values in its column 1 and its name is "COL_1".

My question is how to extract (or do something with) the content of rcom 
so that it provides the mean value which I want.
=-=-=
Notice: The information contained in this e-mail
message and/or attachments to it may contain 
confidential or privileged information. If you are 
not the intended recipient, any dissemination, use, 
review, distribution, printing or copying of the 
information contained in this e-mail message 
and/or attachments to it are strictly prohibited. If 
you have received this communication in error, 
please notify us by reply e-mail or telephone and 
immediately and permanently delete the message 
and any attachments. Thank you



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] mgcv: Smoothing matrix

2012-02-23 Thread Simon Wood
There's no function for extracting this directly, as almost anything 
that you want to do with the smoother matrix can be done in a much more 
efficient way without computing it explicitly, but here's an example of 
how to compute it explicitly in the unweighted additive case...


library(mgcv)
set.seed(0) ## simulate some data...
dat <- gamSim(1,n=400,dist="normal",scale=2)
b <- gam(y~s(x0)+s(x1)+s(x2)+s(x3),data=dat)

X <- model.matrix(b)
## compute X(X'X+S)^{-1}X'...
A <- X%*%vcov(b,dispersion=1)%*%t(X)

dA <- influence(b) ## much more efficient if you only need diag(A)
range(dA-diag(A))

On 23/02/12 11:02, Man Zhang wrote:

Dear All,

I would like to extract the smoothing matrix of the fitted GAM, \hat{y} = Sy. I 
can't seem to find the function or am I missing something?

Thanks, any help is greatly appreciated
Man Zhang

[[alternative HTML version deleted]]




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Simon Wood, Mathematical Science, University of Bath BA2 7AY UK
+44 (0)1225 386603   http://people.bath.ac.uk/sw283

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How can I ran an R command whcih is present as the content of a character object

2012-02-23 Thread R. Michael Weylandt
eval(parse(text = rcom))
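
[Editorial note: a minimal illustration with the objects described in the question:]

```r
mat_rix <- data.frame(COL_1 = c(2, 4, 6))
rcom <- "mean(mat_rix$COL_1)"
eval(parse(text = rcom))  # 4
```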

Michael

On Thu, Feb 23, 2012 at 8:30 AM, Aniruddha Mukherjee
 wrote:
> I have an object called rcom which was created by the command rcom
> <-"mean(mat_rix$COL_1)". Also the data-frame mat_rix is well defined with
> numeric values in its column 1 and its name is "COL_1".
>
> My question is how to extract (or do something with) the content of rcom
> so that it provides the mean value which I want.
> =-=-=
> Notice: The information contained in this e-mail
> message and/or attachments to it may contain
> confidential or privileged information. If you are
> not the intended recipient, any dissemination, use,
> review, distribution, printing or copying of the
> information contained in this e-mail message
> and/or attachments to it are strictly prohibited. If
> you have received this communication in error,
> please notify us by reply e-mail or telephone and
> immediately and permanently delete the message
> and any attachments. Thank you
>
>
>
>        [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] changing time span

2012-02-23 Thread Gabor Grothendieck
On Thu, Feb 23, 2012 at 7:09 AM, phillen  wrote:
> hi there!
> i am desperately in need for help.
>
> i have read in data:
> qthm=read.csv("qthm.csv",sep=";",header=TRUE)
>  then created time series ie
> m2=ts(log(qthm$m2), start=c(1959, 1), frequency=4)
> transformed these time series in zoo variables
> qthmz=merge.zoo(diff(mbase),diff(m2),diff(cpi),diff(rgdp))
>
> Now I want to analyse these date, but for different periods. How to I create
> new time series or zoo variables with a shorter time span ie from 1959.00 to
> 1979.75 only?
>
> I have already read through several possibilities but nothing seems to work
> well for my case, what do you recommend?

?window.zoo
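
[Editorial note: to make that concrete, a sketch with a quarterly zoo series built from simulated data:]

```r
library(zoo)
# quarterly numeric index starting at 1959.00, as in the question
z <- zoo(rnorm(120), order.by = seq(1959, by = 0.25, length.out = 120))
z_sub <- window(z, start = 1959, end = 1979.75)  # 84 quarterly observations
```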

Also read the part about posting reproducible code in the last two
lines of every message to r-help.

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] (no subject)

2012-02-23 Thread Jonathan Williams
Dear Helpers,

I wrote a simple function to standardise variables if they contain more than 
one value. If the elements of the variable are all identical, then I want the 
function to return zero.

When I submit variables whose elements are all identical to the function, it 
returns not zero, but NaNs.

zt=function(x){if (length(table(x)>1)) y=(x-mean(x))/sd(x) else if 
(length(table(x)==1)) y=0; return(y)}

zt(c(1:10))
#[1] -1.4863011 -1.1560120 -0.8257228 -0.4954337 -0.1651446  0.1651446  
0.4954337  0.8257228  1.1560120  1.4863011

zt(rep(1,10))
#[1] NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

Would you be so kind as to point out what I am doing wrong, here? How can I 
obtain zeros from my function, instead of NaNs? (I obtain NaNs also if I set 
the function to zt=function(x){if (length(table(x)>1)) y=(x-mean(x))/sd(x) else 
if (length(table(x)==1)) y=rep(0, length(x)); return(y)} ).

Thanks, in advance, for your help,

Jonathan Williams

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] (no subject)

2012-02-23 Thread Sarah Goslee
Hi,

The parentheses are in the wrong places in the two if() statements.

Look here:

(length(table(x)>1))
^^
(length(table(x)==1))
^ ^

In both cases you're checking whether
the length of the comparison (table(x) > 1) or (table(x) == 1)
is 1, which it always is regardless of whether the
comparison itself is true or false. If you move those, it
should be fine. Although I think I'd use length(unique(x)) instead.
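
[Editorial note: putting those two fixes together, a sketch of the corrected function:]

```r
zt <- function(x) {
  if (length(unique(x)) > 1) (x - mean(x)) / sd(x) else rep(0, length(x))
}
zt(1:10)        # standardised values
zt(rep(1, 10))  # ten zeros instead of NaNs
```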

Sarah

On Thu, Feb 23, 2012 at 9:19 AM, Jonathan Williams
 wrote:
> Dear Helpers,
>
> I wrote a simple function to standardise variables if they contain more than 
> one value. If the elements of the variable are all identical, then I want the 
> function to return zero.
>
> When I submit variables whose elements are all identical to the function, it 
> returns not zero, but NaNs.
>
> zt=function(x){if (length(table(x)>1)) y=(x-mean(x))/sd(x) else if 
> (length(table(x)==1)) y=0; return(y)}
>
> zt(c(1:10))
> #[1] -1.4863011 -1.1560120 -0.8257228 -0.4954337 -0.1651446  0.1651446  
> 0.4954337  0.8257228  1.1560120  1.4863011
>
> zt(rep(1,10))
> #[1] NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
>
> Would you be so kind as to point out what I am doing wrong, here? How can I 
> obtain zeros from my function, instead of NaNs? (I obtain NaNs also if I set 
> the function to zt=function(x){if (length(table(x)>1)) y=(x-mean(x))/sd(x) 
> else if (length(table(x)==1)) y=rep(0, length(x)); return(y)} ).
>
> Thanks, in advance, for your help,
>
> Jonathan Williams
>
> __

-- 
Sarah Goslee
http://www.functionaldiversity.org

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] installing the package Rcplex

2012-02-23 Thread Uwe Ligges

I give up.

Uwe Ligges



On 23.02.2012 15:01, zheng wei wrote:

Thanks.

I do not understand item 2: what do you mean by adding R's bin directory to the PATH?
 
When I go directly to c:\temp and type the commands, it says 'R' is not 
recognized as an internal or external command.
 
When I use the previous approach of going into the bin directory and typing R CMD INSTALL 
"c:/temp/Rcplex", it tells me that configuration failed for package 'Rcplex'



  From: Uwe Ligges
To: zheng wei
Cc: David Winsemius; 
"r-help@r-project.org"
Sent: Thursday, February 23, 2012 5:07 AM
Subject: Re: [R] installing the package Rcplex



On 22.02.2012 21:04, zheng wei wrote:

Based on my understanding of the manual, I unzipped the file and put the 
Rcplex folder under the directory c:/temp.
Then I used cmd under Windows to go to the directory C:\Program
Files\R\R-2.13.0\bin, where my R is installed,
and typed R CMD INSTALL
"c:/temp/Rcplex"


I'd

1. install a recent version of R

2. add R's bin directory to the PATH and go to c:\temp and say

R CMD build Rcplex

followed by

R CMD INSTALL Rcplex_version.tar.gz


3. When it fails, I'd try to find out if the ERROR message is helpful.

4. If 3 fails, I'd ask the maintainer for help - including relevant
information, like the error message, install paths, version information etc.


Uwe Ligges




I got the error of configuration failed for package "Rcplex"

Any idea?





From: Uwe Ligges
To: zheng wei
Cc: David Winsemius; 
"r-help@r-project.org"
Sent: Wednesday, February 22, 2012 4:04 AM
Subject: Re: [R] installing the package Rcplex



On 22.02.2012 03:33, zheng wei wrote:

Thanks.

I was just reminded by the tech support at my university that cplex is independent 
software owned by ILOG, which in turn is now owned by IBM. I succeeded in installing 
the cplex software under the directory "C:/Program 
Files/IBM/ILOG/CPLEX_Studio_Academic124/cplex".
I guess Rcplex is an R package to utilize the software cplex. I have changed the path 
"/c/ilog/cplex111" to the above path. My question is how to finally and 
effectively install the package of Rcplex?


You have been asked already to read the R Installation and
Administration manual.

Uwe Ligges





Thanks,
Wei




From: Uwe Ligges
To: zheng wei
Cc: David Winsemius; 
"r-help@r-project.org"
Sent: Tuesday, February 21, 2012 2:14 PM
Subject: Re: [R] installing the package Rcplex



On 21.02.2012 19:57, zheng wei wrote:

Thank you both for helping. Still could not figure out.

I was contacting different supporting IT departments in my university but did 
not get any help.

For the moment, I just want to know what the instructions for the package mean. 
You can find these instructions on the page 
http://cran.r-project.org/web/packages/Rcplex/INSTALL
   
--
***WINDOWS***
Installation on Windows systems is done by using the provided
Makevars.win file in the src directory. It contains the following
lines:
PKG_CPPFLAGS=-I<cplex_dir>/include
PKG_LIBS=-L<cplex_dir>/lib/x86_windows_vs2008/stat_mda -lcplex111 -lm
where <cplex_dir> is the cplex installation directory,
e.g. /c/ilog/cplex111. Please edit your Makevars.win file accordingly.
We have successfully tested this procedure with CPLEX 11.1 on 32-bit
Windows XP.
--


I can find the file and see the codes. But what new path should I put, and what 
to do next?


The path to your CPLEX installation?
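For the CPLEX 12.4 path mentioned earlier in the thread, the edited Makevars.win might look roughly like this (a hypothetical sketch: the lib subdirectory and the -lcplex124 library name vary by CPLEX version and bitness, and must be checked against the actual installation):

```make
# Hypothetical Makevars.win; paths and library name are assumptions
# based on the poster's stated install directory, not tested values.
PKG_CPPFLAGS=-I"C:/Program Files/IBM/ILOG/CPLEX_Studio_Academic124/cplex/include"
PKG_LIBS=-L"C:/Program Files/IBM/ILOG/CPLEX_Studio_Academic124/cplex/lib/x86_windows_vs2008/stat_mda" -lcplex124 -lm
```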

Uwe Ligges






Thanks,
Wei



From: Uwe Ligges
To: David Winsemius
Cc: zheng wei; 
"r-help@r-project.org"
Sent: Monday, February 20, 2012 6:01 AM
Subject: Re: [R] installing the package Rcplex



On 20.02.2012 01:54, David Winsemius wrote:


On Feb 19, 2012, at 7:45 PM, zheng wei wrote:


I did not know this before. I installed it as you suggested. what to
do next?


Read the Installation Manual?



And don't forget this is a source package for which no CRAN Windows
binary exists, hence it may be not that straightforward to get it done
and you will have to read the INSTALL file from the source package carefully.

Uwe Ligges


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Good and modern Kernel Regression package in R with auto-bandwidth?

2012-02-23 Thread Liaw, Andy
In short, pick your poison...

Is there any particular reason why the tools that shipped with R itself (e.g., 
KernSmooth) are inadequate for you?

I like using the locfit package because it has many tools, including the ones 
that the author didn't think were optimal.  You may need the book to get most 
mileage out of it though.

Andy


From: Michael [mailto:comtech@gmail.com]
Sent: Thursday, February 23, 2012 12:25 AM
To: Liaw, Andy
Cc: Bert Gunter; r-help
Subject: Re: [R] Good and modern Kernel Regression package in R with 
auto-bandwidth?

I meant it's very slow when I use "cv.aic"...

On Wed, Feb 22, 2012 at 11:24 PM, Michael 
mailto:comtech@gmail.com>> wrote:
Is "np" an okay package to use?

I am worried about the "multi-start" thing... and also it's very slow...


On Wed, Feb 22, 2012 at 8:35 PM, Liaw, Andy 
mailto:andy_l...@merck.com>> wrote:
Bert's question aside (I was going to ask about laundry, but that's much harder 
than taxes...), my understanding of the situation is that "optimal" is in the 
eye of the beholder.  There were at least two schools of thought on which is 
the better way of automatically selecting bandwidth, using plug-in methods or 
CV-type.  The last I checked, the jury was still out.

Andy

> -Original Message-
> From: r-help-boun...@r-project.org
> [mailto:r-help-boun...@r-project.org] On 
> Behalf Of Bert Gunter
> Sent: Wednesday, February 22, 2012 6:03 PM
> To: Michael
> Cc: r-help
> Subject: Re: [R] Good and modern Kernel Regression package in
> R with auto-bandwidth?
>
> Would you like it to do your taxes for you too? :-)
>
> Bert
>
> Sent from my iPhone -- please excuse typos.
>
> On Feb 22, 2012, at 11:46 AM, Michael 
> mailto:comtech@gmail.com>> wrote:
>
> > Hi all,
> >
> > I am looking for a good and modern Kernel Regression
> package in R, which
> > has the following features:
> >
> > 1) It has cross-validation
> > 2) It can automatically choose the "optimal" bandwidth
> > 3) It doesn't have random effect - i.e. if I run the
> function at different
> > times on the same data-set, the results should be exactly
> the same... I am
> > trying "np", but I am seeing:
> >
> > Multistart 1 of 1 |
> > Multistart 1 of 1 |
> > ...
> >
> > It looks like in order to do the optimization, it's doing
> > multiple-random-start optimization... am I right?
> >
> >
> > Could you please give me some pointers?
> >
> > I did some google search but there are so many packages
> that do this... I
> > just wanted to find the best/modern one to use...
> >
> > Thank you!
> >
> >[[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
Notice:  This e-mail message, together with any attachme...{{dropped:27}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] (no subject)

2012-02-23 Thread Petr PIKAL
Hi

> 
> Hi,
> 
> The parentheses are in the wrong places in the two if() statements.
> 
> Look here:
> 
> (length(table(x)>1))
> ^^
> (length(table(x)==1))
> ^ ^
> 
> In both cases you're checking whether
> the length of the comparison (table(x) > 1) or (table(x) == 1)
> is 1, which it always is regardless of whether the
> comparison itself is true or false. If you move those, it
> should be fine. Although I think I'd use length(unique(x)) instead.

R's scale function is intended to do such things.

zt2 <- function(x) as.numeric(ifelse(is.nan(scale(x)), 0, scale(x)))
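A quick check of this scale()-based version (note that scale() returns a one-column matrix, hence the as.numeric()):

```r
zt2 <- function(x) as.numeric(ifelse(is.nan(scale(x)), 0, scale(x)))

zt2(rep(1, 10))  # sd is 0, so scale() yields NaN, which is replaced by 0
zt2(1:10)        # identical to (x - mean(x)) / sd(x)
```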

Regards
Petr

> 
> Sarah
> 
> On Thu, Feb 23, 2012 at 9:19 AM, Jonathan Williams
>  wrote:
> > Dear Helpers,
> >
> > I wrote a simple function to standardise variables if they contain 
more 
> than one value. If the elements of the variable are all identical, then 
I 
> want the function to return zero.
> >
> > When I submit variables whose elements are all identical to the 
> function, it returns not zero, but NaNs.
> >
> > zt=function(x){if (length(table(x)>1)) y=(x-mean(x))/sd(x) else if 
> (length(table(x)==1)) y=0; return(y)}
> >
> > zt(c(1:10))
> > #[1] -1.4863011 -1.1560120 -0.8257228 -0.4954337 -0.1651446  0.1651446 

>  0.4954337  0.8257228  1.1560120  1.4863011
> >
> > zt(rep(1,10))
> > #[1] NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
> >
> > Would you be so kind as to point out what I am doing wrong, here? How 
> can I obtain zeros from my function, instead of NaNs? (I obtain NaNs 
also 
> if I set the function to zt=function(x){if (length(table(x)>1)) 
y=(x-mean
> (x))/sd(x) else if (length(table(x)==1)) y=rep(0, length(x)); return(y)} 
).
> >
> > Thanks, in advance, for your help,
> >
> > Jonathan Williams
> >
> > __
> 
> -- 
> Sarah Goslee
> http://www.functionaldiversity.org
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Good and modern Kernel Regression package in R with auto-bandwidth?

2012-02-23 Thread Michael
Thank you Andy!

I went through the KernSmooth package but I don't see a way to use the fitted
function to do the "predict" part...


data=data.frame(z=z, x=x)

datanew=data.frame(z=z, x=x)

lmfit=lm(z~x, data=data)

lmforecast=predict(lmfit, newdata=datanew)

Am I missing anything here?

Thanks!
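KernSmooth's locpoly() returns the fit evaluated on a grid rather than a model object, so there is no predict() method. One workaround (a sketch, not part of the package's own API) is to interpolate the fitted grid at the new x values:

```r
library(KernSmooth)

set.seed(1)
x <- runif(200)
z <- sin(2 * pi * x) + rnorm(200, sd = 0.2)

bw  <- dpill(x, z)                    # plug-in bandwidth selector
fit <- locpoly(x, z, bandwidth = bw)  # returns the grid in fit$x, fit$y

# "Predict" at new points by linear interpolation of the fitted grid:
xnew <- c(0.25, 0.50, 0.75)
pred <- approx(fit$x, fit$y, xout = xnew)$y
```

This only works for new points inside the range of the fitted grid; a finer gridsize in locpoly() makes the interpolation more accurate.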
2012/2/23 Liaw, Andy 

> **
> In short, pick your poison...
>
> Is there any particular reason why the tools that shipped with R itself
> (e.g., KernSmooth) are inadequate for you?
>
> I like using the locfit package because it has many tools, including the
> ones that the author didn't think were optimal.  You may need the book to
> get most mileage out of it though.
>
> Andy
>
>  --
> *From:* Michael [mailto:comtech@gmail.com]
> *Sent:* Thursday, February 23, 2012 12:25 AM
> *To:* Liaw, Andy
> *Cc:* Bert Gunter; r-help
>
> *Subject:* Re: [R] Good and modern Kernel Regression package in R with
> auto-bandwidth?
>
>   I meant it's very slow when I use "cv.aic"...
>
> On Wed, Feb 22, 2012 at 11:24 PM, Michael  wrote:
>
>> Is "np" an okay package to use?
>>
>> I am worried about the "multi-start" thing... and also it's very slow...
>>
>>
>> On Wed, Feb 22, 2012 at 8:35 PM, Liaw, Andy  wrote:
>>
>>> Bert's question aside (I was going to ask about laundry, but that's much
>>> harder than taxes...), my understanding of the situation is that "optimal"
>>> is in the eye of the beholder.  There were at least two schools of thought
>>> on which is the better way of automatically selecting bandwidth, using
>>> plug-in methods or CV-type.  The last I checked, the jury was still out.
>>>
>>> Andy
>>>
>>> > -Original Message-
>>> > From: r-help-boun...@r-project.org
>>> > [mailto:r-help-boun...@r-project.org] On Behalf Of Bert Gunter
>>> > Sent: Wednesday, February 22, 2012 6:03 PM
>>> > To: Michael
>>> > Cc: r-help
>>> > Subject: Re: [R] Good and modern Kernel Regression package in
>>> > R with auto-bandwidth?
>>> >
>>> > Would you like it to do your taxes for you too? :-)
>>> >
>>> > Bert
>>> >
>>> > Sent from my iPhone -- please excuse typos.
>>> >
>>> > On Feb 22, 2012, at 11:46 AM, Michael  wrote:
>>> >
>>> > > Hi all,
>>> > >
>>> > > I am looking for a good and modern Kernel Regression
>>> > package in R, which
>>> > > has the following features:
>>> > >
>>> > > 1) It has cross-validation
>>> > > 2) It can automatically choose the "optimal" bandwidth
>>> > > 3) It doesn't have random effect - i.e. if I run the
>>> > function at different
>>> > > times on the same data-set, the results should be exactly
>>> > the same... I am
>>> > > trying "np", but I am seeing:
>>> > >
>>> > > Multistart 1 of 1 |
>>> > > Multistart 1 of 1 |
>>> > > ...
>>> > >
>>> > > It looks like in order to do the optimization, it's doing
>>> > > multiple-random-start optimization... am I right?
>>> > >
>>> > >
>>> > > Could you please give me some pointers?
>>> > >
>>> > > I did some google search but there are so many packages
>>> > that do this... I
>>> > > just wanted to find the best/modern one to use...
>>> > >
>>> > > Thank you!
>>> > >
>>> > >[[alternative HTML version deleted]]
>>> > >
>>> > > __
>>> > > R-help@r-project.org mailing list
>>> > > https://stat.ethz.ch/mailman/listinfo/r-help
>>> > > PLEASE do read the posting guide
>>> > http://www.R-project.org/posting-guide.html
>>> > > and provide commented, minimal, self-contained, reproducible code.
>>> >
>>> > __
>>> > R-help@r-project.org mailing list
>>> > https://stat.ethz.ch/mailman/listinfo/r-help
>>> > PLEASE do read the posting guide
>>> > http://www.R-project.org/posting-guide.html
>>> > and provide commented, minimal, self-contained, reproducible code.
>>> >
>>> Notice:  This e-mail message, together with any attachments, contains
>>> information of Merck & Co., Inc. (One Merck Drive, Whitehouse Station,
>>> New Jersey, USA 08889), and/or its affiliates Direct contact information
>>> for affiliates is available at
>>> http://www.merck.com/contact/contacts.html) that may be confidential,
>>> proprietary copyrighted and/or legally privileged. It is intended solely
>>> for the use of the individual or entity named on this message. If you are
>>> not the intended recipient, and have received this message in error,
>>> please notify us immediately by reply e-mail and then delete it from
>>> your system.
>>>
>>>
>>

[R] TRAMO/SEATS and x12 in R

2012-02-23 Thread Victor
I have a Mac OS X system. To deal with a long monthly electricity demand 
time-series I use the TRAMO/SEATS procedures with the MS-Windows-only Demetra 
programme, and X12 under R via the x12 package (awkward as far as the output is 
concerned), which runs the underlying Fortran code. 
I wonder if someone out there has attempted to translate TRAMO/SEATS and X12 
into R's native language?

Ciao from Rome
Vittorio
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] extract subset of data.frame

2012-02-23 Thread syrvn
Hello,


consider the following data.frame df and vector v

df <- data.frame(group = c("A","B","C","D"), value = c(1,2,3,4))
v <- c(2,3)

How can I return a sub data.frame containing only the rows where value
matches v?

df:

group value
B 2
C 3


Cheers

--
View this message in context: 
http://r.789695.n4.nabble.com/extract-subset-of-data-frame-tp4414251p4414251.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] extract subset of data.frame

2012-02-23 Thread Sarah Goslee
> df[df$value %in% v,]
  group value
2 B 2
3 C 3


df is a function; you're better off not using that name for your
dataframe.
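The same subset can also be written with subset(), which evaluates the condition inside the data frame (a sketch using dat instead of the function name df):

```r
dat <- data.frame(group = c("A", "B", "C", "D"), value = c(1, 2, 3, 4))
v <- c(2, 3)

# Logical row indexing with %in%:
dat[dat$value %in% v, ]

# Equivalent with subset(); 'value' is looked up inside dat:
subset(dat, value %in% v)
```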

Sarah

On Thu, Feb 23, 2012 at 10:48 AM, syrvn  wrote:
> Hello,
>
>
> consider the following data.frame df and vector v
>
> df <- data.frame(group = c("A","B","C","D"), value = c(1,2,3,4))
> v <- c(2,3)
>
> How can I return a sub data.frame which has only the rows left where value
> matches v
>
> df:
>
> group value
> B 2
> C 3
>
>
> Cheers
>
> --


-- 
Sarah Goslee
http://www.functionaldiversity.org

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] extract subset of data.frame

2012-02-23 Thread R. Michael Weylandt
df[df$value %in% v, ]

Michael

On Thu, Feb 23, 2012 at 10:48 AM, syrvn  wrote:
> Hello,
>
>
> consider the following data.frame df and vector v
>
> df <- data.frame(group = c("A","B","C","D"), value = c(1,2,3,4))
> v <- c(2,3)
>
> How can I return a sub data.frame which has only the rows left where value
> matches v
>
> df:
>
> group value
> B 2
> C 3
>
>
> Cheers
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/extract-subset-of-data-frame-tp4414251p4414251.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] solnp inversed-Hessian problem

2012-02-23 Thread Diogo Alagador

Dear all,

I am tryng to solve a nonlinear optimization probel using the solnp function.
I have different datasets. For the smaller I get full solutions, for  
the bigger I got an error message stating:



Iter: 1 fn: 101.8017 Pars:  0.21000 0.21000 0.21000 0.21000  
0.21000 0.21000 0.21000 0.21000 0.21000 0.21000 0.21000 0.21000  
0.21000 0.21000 0.21000 0.21000 0.21000 0.21000 0.21000 0.21000  
0.21000 0.21000 0.21000 0.21000 0.21000 0.21000 0.21000 0.21000  
0.21000 0.21000 0.21000 0.21000 0.21000 0.21000 0.21000 0.21000  
0.21000 0.21000 0.21000 0.21000 0.21000 0.21000 0.21000 0.21000  
0.21000 0.21000 0.21000 0.21000 0.21000 0.21000 0.21000 0.21000  
0.21000 0.21000 0.21000 0.21000 0.21000 0.21000 0.21000 0.21000  
0.21000 0.21000 0.21000 0.21000


solnp--> Solution not reliable. Problem Inverting Hessian.
Warning messages:
1: In p0 * vscale[(neq + 2):(nc + np + 1)] :
  longer object length is not a multiple of shorter object length
2: In cbind(temp, funv) :
  number of rows of result is not a multiple of vector length (arg 1)



Does anyone know what the reason may be? Note that the same 
problem runs fine for smaller datasets.
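The two warnings are R's vector-recycling warnings, so one plausible cause (an assumption, not a definitive diagnosis) is that the parameter vector and the bounds/constraint vectors passed to solnp() no longer have matching lengths for the larger dataset. The warning itself comes from elementwise arithmetic like this:

```r
# Recycling warning: lengths 5 and 3 are not multiples of each other,
# mirroring the p0 * vscale warning in the solnp() output above.
p0     <- c(0.21, 0.21, 0.21, 0.21, 0.21)
vscale <- c(1, 1, 1)
p0 * vscale
# Warning: longer object length is not a multiple of shorter object length
```

Checking that pars, LB, UB, and the inequality-constraint bounds all have consistent lengths for the big dataset would be the first thing to try.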


Thanks in advance,

Diogo André
Portugal

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] FW: NaN from function

2012-02-23 Thread Jonathan Williams

Dear Helpers,

I wrote a simple function to standardise variables if they contain more than 
one value. If the elements of the variable are all identical, then I want the 
function to return zero.

When I submit variables whose elements are all identical to the function, it 
returns not zero, but NaNs.

zt=function(x){if (length(table(x)>1)) y=(x-mean(x))/sd(x) else if 
(length(table(x)==1)) y=0; return(y)}

zt(c(1:10))
#[1] -1.4863011 -1.1560120 -0.8257228 -0.4954337 -0.1651446  0.1651446  
0.4954337  0.8257228  1.1560120  1.4863011

zt(rep(1,10))
#[1] NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

Would you be so kind as to point out what I am doing wrong, here? How can I 
obtain zeros from my function, instead of NaNs? (I obtain NaNs also if I set 
the function to zt=function(x){if (length(table(x)>1)) y=(x-mean(x))/sd(x) else 
if (length(table(x)==1)) y=rep(0, length(x)); return(y)} ).

Thanks, in advance, for your help,

Jonathan Williams

  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] DCC-GARCH model

2012-02-23 Thread vnatanel
Dear Marcin, 


This document should clarify your questions:
http://www.google.be/url?sa=t&rct=j&q=ccgarch%3A%20an%20r%20package%20for%20building%20multivariate%20garch&source=web&cd=1&ved=0CCMQFjAA&url=http%3A%2F%2Fhhs.diva-portal.org%2Fsmash%2Fget%2Fdiva2%3A320449%2FFULLTEXT02&ei=8V1GT_uDDcLq8QOWyqSwDg&usg=AFQjCNE36DZu4qWOK-5AlZXhlDaT_sZ1sg&sig2=Z-dnG2bprPpL1FxtAuUCeA


--
View this message in context: 
http://r.789695.n4.nabble.com/DCC-GARCH-model-tp3524387p4414223.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] changing time span

2012-02-23 Thread phillen
Many thanks! Exactly what I was looking for, very helpful.

Best regards,
Philipp

--
View this message in context: 
http://r.789695.n4.nabble.com/changing-time-span-tp4413672p4414007.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Case weighting

2012-02-23 Thread Hed Bar-Nissan
The need comes from the PISA data. (http://www.pisa.oecd.org)

In the data there are many cases, and each of them carries a numeric
variable that signifies its weight.
In SPSS the command would be "WEIGHT BY"

In simpler words, here is an R sample (what I get vs. what I want to get):


> data.recieved <- data.frame(
+ kindergarten_attendance = factor(c(2,1,1,1), labels = c("Yes", "No")),
+ weight=c(10, 1, 1, 1)
+ );
> data.recieved;
  kindergarten_attendance weight
1  No 10
2 Yes  1
3 Yes  1
4 Yes  1
>
>
>
> data.weighted <- data.frame(
+ kindergarten_attendance = factor(c(2,2,2,2,2,2,2,2,2,2,1,1,1), labels =
c("Yes", "No")) );
>
>
> par(mfrow=c(1,2));
> plot(data.recieved$kindergarten_attendance,main="What i get");
> plot(data.weighted$kindergarten_attendance,main="What i want to get");
>

Thanks in advance,
Hed
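For plotting, one simple way to mimic SPSS's WEIGHT BY (a sketch assuming integer weights; for proper weighted analyses the survey package is the usual tool) is to replicate each row according to its weight:

```r
dat <- data.frame(
  kindergarten_attendance = factor(c(2, 1, 1, 1), labels = c("Yes", "No")),
  weight = c(10, 1, 1, 1)
)

# Repeat row i weight[i] times, giving the expanded data set
# the original poster built by hand:
expanded <- dat[rep(seq_len(nrow(dat)), times = dat$weight), ]

plot(expanded$kindergarten_attendance, main = "Weighted")
```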

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Advice on exploration of sub-clusters in hierarchical dendrogram

2012-02-23 Thread kosmo7
Dear R user,

I am a biochemist/bioinformatician, at the moment working on protein
clusterings by conformation similarity.

I only started seriously working with R about a couple of months ago.
So far I have been able to read my way through tutorials and set up my
hierarchical clusterings. My problem is that I cannot find a way to obtain
information on the rooting of specific nodes, i.e. of specific clusters of
interest.
In other words, I am trying to obtain/read the sub-clusters of a specific
cluster in the dendrogram, by isolating a specific node and exploring
locally its lower hierarchy.

Please allow me to display some of the code I have been using for your
reference:

df=read.table('mydata.txt', head=T, row.names=1) #read file with distance
matrix
d=as.dist(df) #format table as distance matrix
z<-hclust(d,method="complete", members=NULL)
x<-as.dendrogram(z)
plot(x, xlab="mydata complete-LINKAGE", ylim=c(0,4)) #visualization of the
dendrogram
clusters<-cutree(z, h=1.6) #obtain clusters at cutoff height=1.6
ord<-cmdscale(d, k=2) #Multidimensional scaling of the data down to 2
dimensions
clusplot(ord,clusters, color=TRUE, shade=TRUE,labels=4, lines=0)
#visualization of the clusters in 2D map
var1<-var(clusters==1) #variance of cluster 1

#extract cluster memberships:
clids = as.data.frame(clusters)
names(clids) = c("id")
clids$cdr = row.names(clids)
row.names(clids) = c(1:dim(clids)[1])
clstructure = lapply(unique(clids$id), function(x){clids[clids$id ==
x,'cdr']})

clstructure[[1]] #get memberships of cluster 1



From this point, eventually, I could recreate a distance matrix with only
the members of a specific cluster and then re-apply hierarchical clustering
and start all over again.
But this would take me ages to perform individually for hundred of clusters.
So, I was hoping if anyone could point me to a direction as to how to take
advantage of the initial dendrogram and focus on specific clusters from
which to derive the sub-clusters at a new given cutoff height.

I recently found on this page
http://manuals.bioinformatics.ucr.edu/home/R_BioCondManual

the following code:
clid <- c(1,2)
ysub <- y[names(mycl[mycl%in%clid]),]
hrsub <- hclust(as.dist(1-cor(t(ysub), method="pearson")),
method="complete") # Select sub-cluster number (here: clid=c(1,2)) and
generate corresponding dendrogram.

Even with this given example I am afraid I can't work my way around.
So I guess in my case I could grab all the members of a specific cluster
using my existing code and try to reformat the distance matrix into one that
only contains the distances of those members:
cluster1members<-clstructure[[1]]

Then I need to reformat the distance matrix into a new one, say d1, which I
can feed to a new -local- hierarchical clustering:
hrsub<-hclust(d1, method="complete")

Any ideas on how I can obtain a new distance matrix with just the distances
of the members in that cluster, whose names are contained in the vector
"cluster1members"?
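One way to do exactly that is to convert the "dist" object to a square matrix, index it by the member names, and convert back (a sketch with toy data; substitute your own d and cluster1members):

```r
# Toy data: three well-separated groups of labelled points.
pts <- rbind(a1 = c(0, 0),   a2 = c(0, 1),   a3 = c(1, 0),
             b1 = c(10, 10), b2 = c(10, 11),
             c1 = c(20, 0),  c2 = c(21, 0))
d <- dist(pts)

clusters <- cutree(hclust(d, method = "complete"), k = 3)
cluster1members <- names(clusters)[clusters == clusters[["a1"]]]

# Subset the distance matrix to the chosen cluster's members:
d1 <- as.dist(as.matrix(d)[cluster1members, cluster1members])

hrsub <- hclust(d1, method = "complete")  # re-cluster just that subset
```

This avoids recomputing distances: the sub-matrix is taken straight from the original distance object, so the same recipe can be looped over all cluster ids.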

Apologies if this seems trivial, but I really can't find the correct
functions to use for this task.
Thank you very much in advance - as I am really a novice with R, small
chunks of code as example would be of great help.

Take care all - 

--
View this message in context: 
http://r.789695.n4.nabble.com/Advice-on-exploration-of-sub-clusters-in-hierarchical-dendrogram-tp4414277p4414277.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] installing the package Rcplex

2012-02-23 Thread zheng wei
Thanks.
 
I do not understand item 2: what do you mean by adding R's bin directory to the PATH?
 
When I go directly to c:\temp and type the commands, it says 'R' is not 
recognized as an internal or external command.
 
When I use the previous approach of going into the bin directory and typing R CMD 
INSTALL "c:/temp/Rcplex", it tells me that configuration failed for package 
'Rcplex'
 


 From: Uwe Ligges 

Cc: David Winsemius ; "r-help@r-project.org" 
 
Sent: Thursday, February 23, 2012 5:07 AM
Subject: Re: [R] installing the package Rcplex
  


On 22.02.2012 21:04, zheng wei wrote:
> Based on my understanding of the manual, I unzipped the file and put the 
> Rcplex folder under the directory c:/temp.
> Then I used cmd under Windows to go to the directory C:\Program
> Files\R\R-2.13.0\bin, where my R is installed
> and typed R CMD INSTALL
> "c:/temp/Rcplex"

I'd

1. install a recent version of R

2. add R's bin directory to the PATH and go to c:\temp and say

R CMD build Rcplex

followed by

R CMD INSTALL Rcplex_version.tar.gz


3. When it fails, I'd try to find out if the ERROR message is helpful.

4. If 3 fails, I'd ask the maintainer for help - including relevant 
information, like the error message, install paths, version information etc.


Uwe Ligges



> I got the error of configuration failed for package "Rcplex"
>
> Any idea?
>
>
>
>
> 
>   From: Uwe Ligges

> Cc: David Winsemius; 
> "r-help@r-project.org"
> Sent: Wednesday, February 22, 2012 4:04 AM
> Subject: Re: [R] installing the package Rcplex
>
>
>
> On 22.02.2012 03:33, zheng wei wrote:
>> Thanks.
>>
>> I was just reminded by the tech support at my university that cplex is 
>> independent software owned by ILOG, which in turn is now owned by IBM. I 
>> succeeded in installing the cplex software under the directory "C:/Program 
>> Files/IBM/ILOG/CPLEX_Studio_Academic124/cplex".
>> I guess Rcplex is an R package to utilize the software cplex. I have changed 
>> the path "/c/ilog/cplex111" to the above path. My question is how to finally 
>> and effectively install the package of Rcplex?
>
> You have been asked already to read the R Installation and
> Administration manual.
>
> Uwe Ligges
>
>
>
>>
>> Thanks,
>> Wei
>>
>>
>>
>> 
>> From: Uwe Ligges

>> Cc: David Winsemius; 
>> "r-help@r-project.org"
>> Sent: Tuesday, February 21, 2012 2:14 PM
>> Subject: Re: [R] installing the package Rcplex
>>
>>
>>
>> On 21.02.2012 19:57, zheng wei wrote:
>>> Thank you both for helping. Still could not figure out.
>>>
>>> I was contacting different supporting IT departments in my university but 
>>> did not get any help.
>>>
>>> For the moment, I just want to know what the instructions for the package 
>>> mean. You can find these instructions on the page 
>>> http://cran.r-project.org/web/packages/Rcplex/INSTALL
>>>      
>>>--
>>> ***WINDOWS***
>>> Installation on Windows systems is done by using the provided
>>> Makevars.win file in the src directory. It contains the following
>>> lines:
>>> PKG_CPPFLAGS=-I<cplex_dir>/include
>>> PKG_LIBS=-L<cplex_dir>/lib/x86_windows_vs2008/stat_mda -lcplex111 -lm
>>> where <cplex_dir> is the cplex installation directory,
>>> e.g. /c/ilog/cplex111. Please edit your Makevars.win file accordingly.
>>> We have successfully tested this procedure with CPLEX 11.1 on 32-bit
>>> Windows XP.
>>> --
>>>
>>>
>>> I can find the file and see the codes. But what new path should I put, and 
>>> what to do next?
>>
>> The path to your CPLEX installation?
>>
>> Uwe Ligges
>>
>>
>>
>>
>>>
>>> Thanks,
>>> Wei
>>>
>>>
>>> 
>>> From: Uwe Ligges
>>> To: David Winsemius

project.org>
>>> Sent: Monday, February 20, 2012 6:01 AM
>>> Subject: Re: [R] installing the package Rcplex
>>>
>>>
>>>
>>> On 20.02.2012 01:54, David Winsemius wrote:

 On Feb 19, 2012, at 7:45 PM, zheng wei wrote:

> I did not know this before. I installed it as you suggested. what to
> do next?

 Read the Installation Manual?

>>>
>>> And don't forget this is a source package for which no CRAN Windows
>>> binary exists, hence it may be not that straightforward to get it done
>>> and you will have to read the INSTALL file from the source package carefully.
>>>
>>> Uwe Ligges
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] saving all data in r object

2012-02-23 Thread uday
Thanks for the reply, Michael.

The error which I got is as follows:
Error in gzfile(file, "wb") : cannot open the connection
In addition: Warning message:
In gzfile(file, "wb") :
  cannot open compressed file 'data.RData', probable reason 'Permission
denied'
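[Editorial sketch: the 'Permission denied' message means R cannot write data.RData in the current working directory. A minimal illustration (the file name is the poster's; the use of tempdir() is an assumption about a writable location) of saving somewhere the session is guaranteed to be able to write:

```r
# Saving fails with "cannot open the connection" when the working directory
# is not writable; tempdir() is always writable for the current session.
x <- 1:10
f <- file.path(tempdir(), "data.RData")
save(x, file = f)
file.exists(f)  # TRUE
```

In a real session one would instead setwd() to, or pass a full path under, a directory with write permission.]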

--
View this message in context: 
http://r.789695.n4.nabble.com/saving-all-data-in-r-object-tp4413092p4414051.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] saving all data in r object

2012-02-23 Thread uday
Michael, the first error which I got is
"number of items to replace is not a multiple of replacement length"
Sorry, last time it did not copy the whole thing.



--
View this message in context: 
http://r.789695.n4.nabble.com/saving-all-data-in-r-object-tp4413092p4414058.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] DCC-GARCH model

2012-02-23 Thread John Kerpel
Also see the most excellent rmgarch package from Alexios Ghalanos, available
here:

https://r-forge.r-project.org/R/?group_id=339




On Thu, Feb 23, 2012 at 9:41 AM, vnatanel  wrote:

> Dear Marcin,
>
>
> This document should clarify your questions:
>
> http://www.google.be/url?sa=t&rct=j&q=ccgarch%3A%20an%20r%20package%20for%20building%20multivariate%20garch&source=web&cd=1&ved=0CCMQFjAA&url=http%3A%2F%2Fhhs.diva-portal.org%2Fsmash%2Fget%2Fdiva2%3A320449%2FFULLTEXT02&ei=8V1GT_uDDcLq8QOWyqSwDg&usg=AFQjCNE36DZu4qWOK-5AlZXhlDaT_sZ1sg&sig2=Z-dnG2bprPpL1FxtAuUCeA
>
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/DCC-GARCH-model-tp3524387p4414223.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>




Re: [R] Good and modern Kernel Regression package in R with auto-bandwidth?

2012-02-23 Thread Liaw, Andy
If that's the kind of framework you'd like to work in, use locfit, which has 
the predict() method for evaluating new data.  There are several different 
bandwidth selectors in that package for your choosing.

Kernel smoothers don't really fit the framework of "creating a model object, 
followed by predicting new data using that fitted model object" very well 
because of its local nature.  Think of k-nn classification, which has a similar
problem:  The "model" needs to be computed for every data point you want to
predict.

Andy


From: Michael [mailto:comtech@gmail.com]
Sent: Thursday, February 23, 2012 10:06 AM
To: Liaw, Andy
Cc: Bert Gunter; r-help
Subject: Re: [R] Good and modern Kernel Regression package in R with 
auto-bandwidth?

Thank you Andy!

I went through the KernSmooth package but I don't see a way to use the fitted function
to do the "predict" part...


data <- data.frame(z = z, x = x)
datanew <- data.frame(z = z, x = x)
lmfit <- lm(z ~ x, data = data)
lmforecast <- predict(lmfit, newdata = datanew)

Am I missing anything here?

Thanks!
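[Editorial sketch: for the fit-then-predict workflow Michael outlines with lm(), base R's loess() local smoother (used here as a stand-in, since the thread's locfit/KernSmooth suggestions are not base packages; the data are invented) supports predict() on new data:

```r
# Fit-then-predict with a local smoother, using base R's loess() as a
# stand-in; locfit offers the same interface plus bandwidth selectors.
set.seed(1)
x <- sort(runif(100, 0, 10))
z <- sin(x) + rnorm(100, sd = 0.2)
fit <- loess(z ~ x, span = 0.3)   # span plays the role of a bandwidth
znew <- predict(fit, newdata = data.frame(x = c(2.5, 5, 7.5)))
length(znew)  # 3 fitted values at the new x locations
```

]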
2012/2/23 Liaw, Andy mailto:andy_l...@merck.com>>
In short, pick your poison...

Is there any particular reason why the tools that shipped with R itself (e.g., 
kernSmooth) are inadequate for you?

I like using the locfit package because it has many tools, including the ones 
that the author didn't think were optimal.  You may need the book to get most 
mileage out of it though.

Andy


From: Michael [mailto:comtech@gmail.com]
Sent: Thursday, February 23, 2012 12:25 AM
To: Liaw, Andy
Cc: Bert Gunter; r-help

Subject: Re: [R] Good and modern Kernel Regression package in R with 
auto-bandwidth?

I meant it's very slow when I use "cv.aic"...

On Wed, Feb 22, 2012 at 11:24 PM, Michael 
mailto:comtech@gmail.com>> wrote:
Is "np" an okay package to use?

I am worried about the "multi-start" thing... and also it's very slow...


On Wed, Feb 22, 2012 at 8:35 PM, Liaw, Andy 
mailto:andy_l...@merck.com>> wrote:
Bert's question aside (I was going to ask about laundry, but that's much harder 
than taxes...), my understanding of the situation is that "optimal" is in the 
eye of the beholder.  There were at least two schools of thought on which is 
the better way of automatically selecting bandwidth, using plug-in methods or 
CV-type.  The last I checked, the jury was still out.

Andy

> -Original Message-
> From: r-help-boun...@r-project.org
> [mailto:r-help-boun...@r-project.org] On 
> Behalf Of Bert Gunter
> Sent: Wednesday, February 22, 2012 6:03 PM
> To: Michael
> Cc: r-help
> Subject: Re: [R] Good and modern Kernel Regression package in
> R with auto-bandwidth?
>
> Would you like it to do your your taxes for you too? :-)
>
> Bert
>
> Sent from my iPhone -- please excuse typos.
>
> On Feb 22, 2012, at 11:46 AM, Michael 
> mailto:comtech@gmail.com>> wrote:
>
> > Hi all,
> >
> > I am looking for a good and modern Kernel Regression
> package in R, which
> > has the following features:
> >
> > 1) It has cross-validation
> > 2) It can automatically choose the "optimal" bandwidth
> > 3) It doesn't have random effect - i.e. if I run the
> function at different
> > times on the same data-set, the results should be exactly
> the same... I am
> > trying "np", but I am seeing:
> >
> > Multistart 1 of 1 |
> > Multistart 1 of 1 |
> > ...
> >
> > It looks like in order to do the optimization, it's doing
> > multiple-random-start optimization... am I right?
> >
> >
> > Could you please give me some pointers?
> >
> > I did some google search but there are so many packages
> that do this... I
> > just wanted to find the best/modern one to use...
> >
> > Thank you!
> >
> >[[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

Re: [R] tapply for enormous (>2^31 row) matrices

2012-02-23 Thread Matthew Keller
Thank you all very much for your help (on both the r-help and the
bioconductor listserves).

Benilton - I couldn't get sqldf to install on the server I'm using
(error is: Error : package 'gsubfn' does not have a name space). I
think this was a problem for R 2.13, and I'm trying to get the admin's
to install a more up-to-date version. I know that I need to probably
learn a modicum of SQL given the sizes of datasets I'm using now.

I ended up using a modified version of Hervé Pagès' excellent code
(thank you!). I got a huge (40-fold) speed bump by using the
data.table package for indexing/aggregate steps, making an hours-long
job a minutes-long job. SO - data.table is hugely useful if you're
dealing with indexing/apply-family functions on huge datasets. By the
way, I'm not sure why, but read.table was a bit faster than scan for
this problem... Here is the code for others:


require(data.table)

computeAllPairSums <- function(filename, nbindiv, nrows.to.read)
{
    con <- file(filename, open = "r")
    on.exit(close(con))
    ans <- matrix(numeric(nbindiv * nbindiv), nrow = nbindiv)
    chunk <- 0L
    while (TRUE) {
        # read.table proved faster than scan here
        df0 <- read.table(con,
                          col.names = c("ID1", "ID2", "ignored", "sharing"),
                          colClasses = c("integer", "integer", "NULL", "numeric"),
                          nrows = nrows.to.read, comment.char = "")
        # check for an empty chunk before processing it
        if (nrow(df0) == 0L)
            break

        DT <- data.table(df0)
        setkey(DT, ID1, ID2)
        ss <- DT[, sum(sharing), by = "ID1,ID2"]

        chunk <- chunk + 1L
        cat("Processing chunk", chunk, "... ")

        idd <- as.matrix(subset(ss, select = 1:2))
        newvec <- as.vector(as.matrix(subset(ss, select = 3)))
        ans[idd] <- ans[idd] + newvec

        cat("OK\n")
    }
    ans
}
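[Editorial sketch: for readers without data.table, the per-chunk aggregation step above (sum `sharing` within each (ID1, ID2) pair, then write the grouped sums into a square matrix by two-column index) can be done in base R alone. The toy data here are invented:

```r
# Base-R version of one chunk's work: grouped sum, then matrix indexing
# with a two-column integer matrix, as in the data.table code above.
df0 <- data.frame(ID1     = c(1L, 1L, 2L, 1L),
                  ID2     = c(2L, 2L, 3L, 3L),
                  sharing = c(0.5, 0.25, 1, 2))
ss  <- aggregate(sharing ~ ID1 + ID2, data = df0, FUN = sum)
ans <- matrix(0, nrow = 3, ncol = 3)
ans[as.matrix(ss[, c("ID1", "ID2")])] <- ss$sharing
ans[1, 2]  # 0.75, i.e. 0.5 + 0.25
```

Base aggregate() will be much slower than data.table on the poster's scale; the point is only the shape of the computation.]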



On Wed, Feb 22, 2012 at 3:20 PM, ilai  wrote:
> On Tue, Feb 21, 2012 at 4:04 PM, Matthew Keller  
> wrote:
>
>> X <- read.big.matrix("file.loc.X",sep=" ",type="double")
>> hap.indices <- bigsplit(X,1:2) #this runs for too long to be useful on
>> these matrices
>> #I was then going to use foreach loop to sum across the splits
>> identified by bigsplit
>
> How about just using foreach earlier in the process ? e.g. split
> file.loc.X to (80) sub files and then run
> read.big.matrix/bigsplit/sum inside %dopar%
>
> If splitting X beforehand is a problem, you could also use ?scan to
> read in different chunks of the file, something like (untested
> obviously):
> # for X a matrix 800x4
> lineind<- seq(1,800,100)  # create an index vec for the lines to read
> ReducedX<- foreach(i = 1:8) %dopar%{
>  x <- 
> scan('file.loc.X',list(double(0),double(0),double(0),double(0)),skip=lineind[i],nlines=100)
> ... do your thing on x (aggregate/tapply etc.)
>  }
>
> Hope this helped
> Elai.
>
>
>
>>
>> SO - does anyone have ideas on how to deal with this problem - i.e.,
>> how to use a tapply() like function on an enormous matrix? This isn't
>> necessarily a bigtabulate question (although if I screwed up using
>> bigsplit, let me know). If another package (e.g., an SQL package) can
>> do something like this efficiently, I'd like to hear about it and your
>> experiences using it.
>>
>> Thank you in advance,
>>
>> Matt
>>
>>
>>
>> --
>> Matthew C Keller
>> Asst. Professor of Psychology
>> University of Colorado at Boulder
>> www.matthewckeller.com
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.



-- 
Matthew C Keller
Asst. Professor of Psychology
University of Colorado at Boulder
www.matthewckeller.com



Re: [R] FW: NaN from function

2012-02-23 Thread Ted Harding
On 23-Feb-2012 Jonathan Williams wrote:
> Dear Helpers,
> I wrote a simple function to standardise variables if they
> contain more than one value. If the elements of the variable
> are all identical, then I want the function to return zero.
> 
> When I submit variables whose elements are all identical to
> the function, it returns not zero, but NaNs.
> 
> zt=function(x){if (length(table(x)>1)) y=(x-mean(x))/sd(x) else if
> (length(table(x)==1)) y=0; return(y)}
> 
> zt(c(1:10))
>#[1] -1.4863011 -1.1560120 -0.8257228 -0.4954337 -0.1651446  0.1651446 
>#0.4954337  0.8257228  1.1560120  1.4863011
> 
> zt(rep(1,10))
>#[1] NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
> 
> Would you be so kind as to point out what I am doing wrong, here?
> How can I obtain zeros from my function, instead of NaNs?
> (I obtain NaNs also if I set the function to zt=function(x){
>  if (length(table(x)>1)) y=(x-mean(x))/sd(x) else if
> (length(table(x)==1)) y=rep(0, length(x)); return(y)} ).
> 
> Thanks, in advance, for your help,
> Jonathan Williams

The issue here, Jonathan, is that when you evaluate
(x-mean(x))/sd(x) for a vector x whose elements are all equal,
not only is (x-mean(x)) = 0, but also sd(x) = 0, so you are
asking the function to return the result of 0/0. Since this
is undefined, the result is NaN.

A basic solution for this special case would be

  zt=function(x){
if (sd(x) == 0) return(0*x) else return( (x-mean(x))/sd(x) )
  }

This should cover the case where length(table(x))==1 (see also below).
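Restating that fix as a self-contained, runnable sketch:

```r
# sd(x) == 0 exactly when all elements are equal, so return zeros directly
# instead of evaluating 0/0 (which yields NaN).
zt <- function(x) {
  if (sd(x) == 0) 0 * x else (x - mean(x)) / sd(x)
}
zt(rep(1, 10))  # ten zeros, not NaN
zt(1:10)        # standardised values with mean 0 and sd 1
```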

I'm not happy about your conditions

  if (length(table(x)>1))
  if (length(table(x)==1))

since they ask for "length(table(x)>1)", which doesn't seem
to represent any natural criterion. E.g.:

  length(table(1:10)>1)
  # [1] 10
  length(table(rep(1,10))>1)
  # [1] 1

  if(length(table(1:10)>1)) y <- "Yes" else y <- "No" ; y
  # [1] "Yes"
  if(length(table(rep(1,10))>1)) y <- "Yes" else y <- "No" ; y
  # [1] "Yes"

  length(table(1:10)==1)
  # [1] 10
  length(table(rep(1,10))==1)
  # [1] 1
  
  if(length(table(1:10)==1)) y <- "Yes" else y <- "No" ; y
  # [1] "Yes"
  if(length(table(rep(1,10))==1)) y <- "Yes" else y <- "No" ; y
  # [1] "Yes"

I suspect you meant to write

  if (length(table(x))>1)
and
  if (length(table(x))==1)

since this distinguishes between two or more different values
(length(table(x)) > 1) and all equal values (length(table(x)) == 1).

Ted.



-
E-Mail: (Ted Harding) 
Date: 23-Feb-2012  Time: 16:40:03
This message was sent by XFMail



[R] segfault when using data.table package in conjunction with foreach

2012-02-23 Thread Matthew Keller
Hi all,

I'm trying to use the package data.table within a foreach loop. I'm
grabbing 500M rows of data at a time from two different files and then
doing an aggregate/tapply-like operation with data.table after that. I
had planned on doing a foreach loop 39 times at once for the 39 files
I have, but obviously that won't work until I figure out why the
segfault is occurring. The sessionInfo, code, and error are pasted
below. If you have any ideas, would love to hear them. (I have no
control over the version of R - 2.13.0 - being used). Best

Matt


SESSION INFO:

> sessionInfo()
R version 2.13.0 (2011-04-13)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8 LC_MONETARY=C
 [6] LC_MESSAGES=en_US.UTF-8LC_PAPER=en_US.UTF-8   LC_NAME=C
   LC_ADDRESS=C   LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] data.table_1.7.10 doMC_1.2.2multicore_0.1-5
foreach_1.3.2 codetools_0.2-8   iterators_1.0.3



MY CODE:

computeAllPairSums <- function(filename, nbindiv, nrows.to.read)
{
    con <- file(filename, open = "r")
    on.exit(close(con))
    ans <- matrix(numeric(nbindiv * nbindiv), nrow = nbindiv)
    chunk <- 0L
    while (TRUE) {
        # read.table proved faster than scan here
        df0 <- read.table(con,
                          col.names = c("ID1", "ID2", "ignored", "sharing"),
                          colClasses = c("integer", "integer", "NULL", "numeric"),
                          nrows = nrows.to.read, comment.char = "")
        # check for an empty chunk before processing it
        if (nrow(df0) == 0L)
            break

        DT <- data.table(df0)
        setkey(DT, ID1, ID2)
        ss <- DT[, sum(sharing), by = "ID1,ID2"]

        chunk <- chunk + 1L
        cat("Processing chunk", chunk, "... ")

        idd <- as.matrix(subset(ss, select = 1:2))
        newvec <- as.vector(as.matrix(subset(ss, select = 3)))
        ans[idd] <- ans[idd] + newvec

        cat("OK\n")
    }
    ans
}



require(foreach)
require(doMC)
registerDoMC(cores=2)


num <- 8891
nr <-  5L   #500 million rows at a time


MMM  <-  foreach(IT = 1:2) %dopar% {
  require(data.table)
  if (IT==1){ x <- system.time({computeAllPairSums(
paste(GERMLINE,"bc.chr22.q.20.file",sep=''),num,nr)}) } #Run it on
regular file PID 6489, 24 gb
  if (IT==2){ z <- system.time({computeAllPairSums.gz(
paste(GERMLINE,"bc.chr22.q.20.gz",sep=''),num,nr)}) } #Run it on gz
file PID 6490, 24 gb
}


MY R OUTPUT/ERROR:

MMM  <-  foreach(IT = 1:2) %dopar% {
+   require(data.table)
+   if (IT==1){ x <- system.time({computeAllPairSums(
paste(GERMLINE,"bc.chr22.q.20.file",sep=''),num,nr)}) } #Run it on
regular file PID 6053, 5.9 gb
+   if (IT==2){ z <- system.time({computeAllPairSums.gz(
paste(GERMLINE,"bc.chr22.q.20.gz",sep=''),num,nr)}) } #Run it on gz
file PID 6054, 4 gb
+ }

Loading required package: data.table
Loading required package: data.table
data.table 1.7.10  For help type: help("data.table")
data.table 1.7.10  For help type: help("data.table")

 *** caught segfault ***
address 0x2ae93df9, cause 'memory not mapped'

Traceback:
 1: .Call("dogroups", x, xcols, o__, f__, len__, jsub, SDenv, testj,
  byretn, byval, i, as.integer(icols), i[1, ivars, with = FALSE],
if (length(ivars)) paste("i.", ivars, sep = ""), is.na(nomatch),
verbose, PACKAGE = "data.table")
 2: `[.data.table`(DT, , sum(sharing), by = "ID1,ID2")
 3: DT[, sum(sharing), by = "ID1,ID2"]
 4: computeAllPairSums(paste(GERMLINE, "bc.chr22.q.20.file", sep =
""), num, nr)
 5: system.time({computeAllPairSums(paste(GERMLINE,
"bc.chr22.q.20.file", sep = ""), num, nr)})
 6: eval(expr, envir, enclos)
 7: eval(c.expr, envir = args, enclos = envir)
 8: doTryCatch(return(expr), name, parentenv, handler)
 9: tryCatchOne(expr, names, parentenv, handlers[[1L]])
10: tryCatchList(expr, classes, parentenv, handlers)
11: tryCatch(eval(c.expr, envir = args, enclos = envir), error = function(e) e)
12: FUN(X[[1L]], ...)
13: lapply(S, FUN, ...)
14: doTryCatch(return(expr), name, parentenv, handler)
15: tryCatchOne(expr, names, parentenv, handlers[[1L]])
16: tryCatchList(expr, classes, parentenv, handlers)
17: tryCatch(expr, error = function(e) {call <- conditionCall(e)
 if (!is.null(call)) {if (identical(call[[1L]],
quote(doTryCatch))) call <- sys.call(-4L)dcall <-
deparse(call)[1L]prefix <- paste("Error in", dcall, ": ")
  LONG <- 75Lmsg <- conditionMessage(e)sm <-
strsplit(msg, "\n")[[1L]]w <- 14L + nchar(dcall, type = "w") +
nchar(sm[1L], type = "w")if (is.na(w)) w <- 14L +
nchar(dcall, type = "b") + nchar(sm[1L], type = "b")
 if (w > LONG) prefix <- paste(prefix, "\n  ", sep =
"")}else prefix <- "Error : "msg <- paste(prefix,
conditionMessage(e), "\n", sep = "")
.Internal(seterrmessage(msg[1L]))if (!silent &&
identical(getOption("show.error.messa

[R] help - history()

2012-02-23 Thread Gian Maria Niccolò Benucci
Hi Members,

Is there a possibility to delete a command line from the history?
For example, if I've typed a line of code that is wrong, can I delete it from
the history so that it is not saved in the .Rhistory file?
Thanks for helping,

Gian
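[Editorial sketch of one workaround: .Rhistory is a plain text file, so after savehistory() one can drop the offending line with ordinary file tools and loadhistory() the result. The temp file stand-in and the "bad" line below are invented:

```r
# Stand-in for ~/.Rhistory; in a live session, call savehistory(hist_file)
# first, edit the file, then loadhistory(hist_file).
hist_file <- tempfile(fileext = ".Rhistory")
writeLines(c("x <- 1", "bad_commnd(", "y <- 2"), hist_file)
lines <- readLines(hist_file)
writeLines(lines[lines != "bad_commnd("], hist_file)  # drop the wrong line
readLines(hist_file)  # "x <- 1" "y <- 2"
```

]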




Re: [R] best option for big 3D arrays?

2012-02-23 Thread steven mosher
Did you have to use a particular filename or extension?

I created a similar file but then could not read it back in.

Steve

On Mon, Feb 13, 2012 at 6:45 AM, Djordje Bajic  wrote:

> I've been investigating and can partially answer my own question. I tried the
> packages 'bigmemory' and 'ff' and for me the latter did the work I need
> pretty straightforward. I create the array in filebacked form with the
> function ff, and it seems that the usual R indexing works well. I have yet
> to see the limitations, but I hope it helps.
>
> a foo example:
>
> myArr <- ff(NA, dim=rep(904,3), filename="arr.ffd", vmode="double")
> myMat <- matrix(1:904^2, ncol=904)
> for ( i in 1:904 ) {
>myArr[,,i] <- myMat
> }
>
> Thanks all,
>
> 2012/2/11 Duncan Murdoch 
>
> > On 12-02-10 9:12 AM, Djordje Bajic wrote:
> >
> >> Hi all,
> >>
> >> I am trying to fill a 904x904x904 array, but at some point of the loop R
> >> states that the 5.5Gb sized vector is too big to allocate. I have looked
> >> at
> >> packages such as "bigmemory", but I need help to decide which is the
> best
> >> way to store such an object. It would be perfect to store it in this
> >> "cube"
> >> form (for indexing and computation purpouses). If not possible, maybe
> the
> >> best is to store the 904 matrices separately and read them individually
> >> when needed?
> >>
> >> Never dealed with such a big dataset, so any help will be appreciated
> >>
> >> (R+ESS, Debian 64bit, 4Gb RAM, 4core)
> >>
> >
> > I'd really recommend getting more RAM, so you can have the whole thing
> > loaded in memory.  16 Gb would be nice, but even 8Gb should make a
> > substantial difference.  It's going to be too big to store as an array
> > since arrays have a limit of 2^31-1 entries, but you could store it as a
> > list of matrices, e.g.
> >
> > x <- vector("list", 904)
> > for (i in 1:904)
> >  x[[i]] <- matrix(0, 904,904)
> >
> > and then refer to entry i,j,k as x[[i]][j,k].
> >
> > Duncan Murdoch
> >
> >
> >
>
>[[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
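[Editorial sketch: Duncan's list-of-matrices layout, shrunk to a runnable toy with dimension 4 instead of 904:

```r
# Toy version of the list-of-matrices idea for a "cube" too big for an array.
n <- 4
x <- vector("list", n)
for (i in 1:n)
  x[[i]] <- matrix(0, n, n)
x[[2]][3, 4] <- 7   # entry (i, j, k) is addressed as x[[i]][j, k]
x[[2]][3, 4]        # 7
```

]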




[R] data frame colnames through ddply

2012-02-23 Thread Arnaud Gaboury
Dear list,

I am trying to pass data frame colnames through ddply without success.

Here is my command:

>exportfile<-ddply(exportfile,c("Product","Price"),summarise,Nbr.Lots=sum(Filled.Qty))

exportfile is a df. I want to group by the Product and Price columns, sum the
Filled.Qty column, and rename the sum Nbr.Lots. This line works.

What I would like is to change the column names in the same line, thus avoiding
another line with
>colnames(exportfile)<-c("Contract","Price","Nbr.Lots")

Is there a possibility to change my col names in the same line?

TY
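[Editorial sketch: one base-R way to do the grouped sum and the renaming in a single expression is to wrap the call in setNames(); the same wrapping would apply to the ddply() call. The toy data frame below is invented:

```r
# aggregate() does the grouped sum; setNames() renames the result's
# columns in the same expression.
exportfile <- data.frame(Product    = c("A", "A", "B"),
                         Price      = c(1, 1, 2),
                         Filled.Qty = c(10, 5, 3))
out <- setNames(aggregate(Filled.Qty ~ Product + Price, data = exportfile, sum),
                c("Contract", "Price", "Nbr.Lots"))
names(out)  # "Contract" "Price" "Nbr.Lots"
```

]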



Arnaud Gaboury
 
A2CT2 Ltd.



Re: [R] FW: NaN from function

2012-02-23 Thread Petr Savicky
On Thu, Feb 23, 2012 at 04:40:07PM -, Ted Harding wrote:
[...]
> A basic solution for this special case would be
> 
>   zt=function(x){
> if (sd(x) == 0) return(0*x) else return( (x-mean(x))/sd(x) )
>   }
> 
> This should cover the case where length(table(x))==1 (see also below).
> 
> I'm not happy about your conditions
> 
>   if (length(table(x)>1))
>   if (length(table(x)==1))
> 
> since they ask for "length(table(x)>1)", which doesn't seem
> to represent any natural criterion. E.g.:
> 
>   length(table(1:10)>1)
>   # [1] 10
>   length(table(rep(1,10))>1)
>   # [1] 1
> 
>   if(length(table(1:10)>1)) y <- "Yes" else y <- "No" ; y
>   # [1] "Yes"
>   if(length(table(rep(1,10))>1)) y <- "Yes" else y <- "No" ; y
>   # [1] "Yes"
> 
>   length(table(1:10)==1)
>   # [1] 10
>   length(table(rep(1,10))==1)
>   # [1] 1
>   
>   if(length(table(1:10)==1)) y <- "Yes" else y <- "No" ; y
>   # [1] "Yes"
>   if(length(table(rep(1,10))==1)) y <- "Yes" else y <- "No" ; y
>   # [1] "Yes"
> 
> I suspect you meant to write
> 
>   if (length(table(x))>1)
> and
>   if (length(table(x))==1)
> 
> since this distinguishes between two or more different values
> (length(table(x)) > 1) and all equal values (length(table(x)) == 1).

Hi.

The condition length(table(x)) > 1 may also be written as
length(unique(x)) > 1. These two conditions are usually
equivalent, but not always due to the rounding to 15 digits
performed in table(). For example

  x <- 1 + (0:10)*2^-52
  length(table(x))  # [1] 1
  length(unique(x)) # [1] 11
  sd(x) # [1] 7.364386e-16
  diff(x)   # [1] 2.220446e-16 2.220446e-16 2.220446e-16 ...

Petr Savicky.



Re: [R] Calculating Pseudo R-squared from nlme

2012-02-23 Thread Bert Gunter
I saw no reply to this yet, so herewith a few comments.

1. Best recommendation: Post to r-sig-mixed-models instead.

Miscellaneous  comments.

R-squared as "an overall summary of the total outcome variability
explained" is practically useless and generally misleading. Why?
Short answer: Because nonlinear models are fundamentally different
(mathematically) from linear models. For example, the basic linear
models concept of "degrees of freedom" (df) is not  immediately
applicable and is certainly NOT simply related to the number of
parameters in the model.

Long Answer: Ask on mixed models list.

I am well aware that much software and lots of non-statistical
literature quote R-squares and "pseudo"-R-squares (at least there is a
qualification) as informative measures of fit for nonlinear models.

As I am now 65, I can be a bit impolite and expect forbearance when I
say: it's all crap. I say this in the same spirit that one would speak
of papers on perpetual motion machines or creationism: it's contrary
to the underlying reality.

And now for the disclaimer: As no one has proclaimed me expert in
chief of anything, others who actually are may point out my egregious
errors (or I certainly hope they will). So long as they have the
mathematics on their side (opinion counts for nothing in the world of
thermodynamics or evolution: entropy always increases and bacteria
eventually evolve resistance), pay attention to them and ignore me.
Age does not guarantee wisdom.

Best,
Bert



On Thu, Feb 23, 2012 at 5:18 AM, dadrivr  wrote:
> I am fitting individual growth models using nlme (multilevel models with
> repeated measurements nested within the individual), and I am trying to
> calculate the Pseudo R-squared for the models (an overall summary of the
> total outcome variability explained).  Singer and Willett (2003) recommend
> calculating Pseudo R-squared in multilevel modeling by squaring the sample
> correlation between observed and predicted values (across the sample for
> each person on each occasion of measurement).
>
> My question is which set of predicted values should I use from nlme in that
> calculation?  From my models in nlme, I receive two sets of fitted values.
> Reading the description of the fitted lme values
> (http://stat.ethz.ch/R-manual/R-patched/library/nlme/html/fitted.lme.html),
> there appear to be two sets of fitted values that correspond to levels of
> grouping, where the first set of fitted values (Level 0) correspond to the
> population fitted values and it moves to more innermore groupings as the
> levels increase (e.g., I suppose Level 1 corresponds to the individual-level
> fitted values in my data).
>
> I'm not sure I understand the distinction between population fitted values
> and individual-level fitted values because each individual and each
> measurement occasion has an estimate for both (population and individual
> fitted estimates).  Could you please explain the distinction and which one I
> should be using to calculate the Pseudo R-squared as suggested by Singer and
> Willett (2003)?
>
> Thanks so much for your help!
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Calculating-Pseudo-R-squared-from-nlme-tp4413825p4413825.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



Re: [R] Case weighting

2012-02-23 Thread David Winsemius


On Feb 23, 2012, at 10:49 AM, Hed Bar-Nissan wrote:


The need comes from the PISA data. (http://www.pisa.oecd.org)

In the data there are many cases and each of them carries a numeric
variable that signifies its weight.
In SPSS the command would be "WEIGHT BY"

In simpler words, here is an R sample (what I get vs. what I want to
get):




data.recieved <- data.frame(
+ kindergarten_attendance = factor(c(2,1,1,1), labels = c("Yes",  
"No")),

+ weight=c(10, 1, 1, 1)
+ );

data.recieved;

  kindergarten_attendance weight
1                      No     10
2                     Yes      1
3                     Yes      1
4                     Yes      1




data.weighted <- data.frame(
+ kindergarten_attendance = factor(c(2,2,2,2,2,2,2,2,2,2,1,1,1),  
labels =

c("Yes", "No")) );


You want "case repetition" not case weighting, which I would use as a  
term when working on estimation problems:


>  ( data.weighted <- unlist(sapply(1:NROW(data.recieved),  
function(x) rep(data.recieved[x,1], times=data.recieved[x,2] ))  ) )

 [1] No  No  No  No  No  No  No  No  No  No  Yes Yes Yes
Levels: Yes No




par(mfrow=c(1,2));
plot(data.recieved$kindergarten_attendance,main="What i get");
plot(data.weighted$kindergarten_attendance,main="What i want to  
get");


Seems to work with the factor vector, although I didn't replicate  
dataframe rows, but I guess you could.
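[Editorial sketch: the same expansion can be written without sapply(), since rep() vectorises over its `times` argument and indexing a data frame by a repeated row index replicates whole cases (so the data frame rows are replicated too):

```r
# Replicate each row 'weight' times via a repeated row index.
data.recieved <- data.frame(
  kindergarten_attendance = factor(c(2, 1, 1, 1), labels = c("Yes", "No")),
  weight = c(10, 1, 1, 1))
idx <- rep(seq_len(nrow(data.recieved)), times = data.recieved$weight)
data.weighted <- data.recieved[idx, ]
table(data.weighted$kindergarten_attendance)  # Yes: 3, No: 10
```

]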






tnx in advance
Hed




David Winsemius, MD
West Hartford, CT



[R] R CMD INSTALL with configure args

2012-02-23 Thread Erin Hodgess
Dear R People:

I have a question, please:

I want to install a package from R CMD INSTALL and I have a boatload
of configure args.  I want to put them into a file.  How do I point
the R CMD INSTALL to that file, please?

Thanks,
Erin


-- 
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: erinm.hodg...@gmail.com
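[Editorial sketch: R CMD INSTALL has no flag that reads configure arguments from a file (to the best of our knowledge), but it does accept `--configure-args`, so the file's contents can be spliced in with command substitution. The flag values and package tarball name below are hypothetical, and the install command is echoed rather than executed:

```shell
# Hypothetical flags, one per line in the file.
printf '%s\n' '--with-foo=/opt/foo' '--enable-bar' > configure.args
ARGS=$(tr '\n' ' ' < configure.args)
# Echoed rather than executed here; 'mypkg_1.0.tar.gz' is a placeholder.
echo "R CMD INSTALL --configure-args=\"$ARGS\" mypkg_1.0.tar.gz"
```

]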



Re: [R] Advice on exploration of sub-clusters in hierarchical dendrogram

2012-02-23 Thread ilai
See inline

On Thu, Feb 23, 2012 at 8:54 AM, kosmo7  wrote:
> Dear R user,

> In other words, I am trying to obtain/read the sub-clusters of a specific
> cluster in the dendrogram, by isolating a specific node and exploring
> locally its lower hierarchy.

To explore or "zoom in" on elements of z you had the first step right:
create x<-as.dendrogram(z) but then you didn't use x anymore (except
for the plot which could have been done on z). Maybe you wanted:

> df=read.table('mydata.txt', head=T, row.names=1) #read file with distance
> matrix
> d=as.dist(df) #format table as distance matrix
> z<-hclust(d,method="complete", members=NULL)
> x<-as.dendrogram(z)
> plot(x, xlab="mydata complete-LINKAGE", ylim=c(0,4)) #visualization of the
> dendrogram

From this point

clusters<-cut(x, h=1.6) #obtain clusters at cutoff height=1.6

# clusters is now (after cut x not cutree z) a list of two components:
upper and lower. Each is in itself a list of dendrograms: the
structure above 1.6, and the local clusters below:

plot(clusters$upper)  # the structure above 1.6
plot(clusters$lower[[1]])  # cluster 1

# To print the details of cluster 1 (this output maybe very long
depending on how many members):

str(clusters$lower[[1]])

To extract specific details from the list and automate for all or some
of the clusters ?dendrapply is your friend.
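The labels()/dendrapply route can be sketched as follows (editor's example using the built-in USArrests data and an arbitrary cut height, since the poster's mydata.txt is not available):

```r
## Cluster, cut, then pull the member names out of every lower sub-dendrogram.
z <- hclust(dist(USArrests), method = "complete")
x <- as.dendrogram(z)
clusters <- cut(x, h = 100)                   # h = 100 is arbitrary here
membership <- lapply(clusters$lower, labels)  # one character vector per cluster
str(membership)
```

Each element of membership can then be used to subset the original distance matrix if local re-clustering is still wanted.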

I'm assuming your attempts at reclustering locally later in your post
are no longer necessary, unless I'm missing something on what exactly
you are trying to do.

Hope this helps

Elai



> ord<-cmdscale(d, k=2) #Multidimensional scaling of the data down to 2
> dimensions
> clusplot(ord,clusters, color=TRUE, shade=TRUE,labels=4, lines=0)
> #visualization of the clusters in 2D map
> var1<-var(clusters==1) #variance of cluster 1
>
> #extract cluster memberships:
> clids = as.data.frame(clusters)
> names(clids) = c("id")
> clids$cdr = row.names(clids)
> row.names(clids) = c(1:dim(clids)[1])
> clstructure = lapply(unique(clids$id), function(x){clids[clids$id ==
> x,'cdr']})
>
> clstructure[[1]] #get memberships of cluster 1
>
>
>
> From this point, eventually, I could recreate a distance matrix with only
> the members of a specific cluster and then re-apply hierarchical clustering
> and start all over again.
> But this would take me ages to perform individually for hundred of clusters.
> So, I was hoping if anyone could point me to a direction as to how to take
> advantage of the initial dendrogram and focus on specific clusters from
> which to derive the sub-clusters at a new given cutoff height.
>
> I recently found in this page
> http://manuals.bioinformatics.ucr.edu/home/R_BioCondManual
> http://manuals.bioinformatics.ucr.edu/home/R_BioCondManual
>
> the following code:
> clid <- c(1,2)
> ysub <- y[names(mycl[mycl%in%clid]),]
> hrsub <- hclust(as.dist(1-cor(t(ysub), method="pearson")),
> method="complete") # Select sub-cluster number (here: clid=c(1,2)) and
> generate corresponding dendrogram.
>
> Even with this given example I am afraid I can't work my way around.
> So I guess in my case I could grab all the members of a specific cluster
> using my existing code and try to reformat the distance matrix in one that
> only contains the distances of those members:
> cluster1members<-clstructure[[1]]
>
> Then I need to reformat the distance matrix into a new one, say d1, which I
> can feed to a new -local- hierarchical clustering:
> hrsub<-hclust(d1, method="complete")
>
> Any ideas on how I can obtain a new distance matrix with just the distances
> of the members in that clusters, with names contained in vector
> "cluster1members" ?
>
> Apologies if this seems trivial, but I really can't find the correct
> functions to use for this task.
> Thank you very much in advance - as I am really a novice with R, small
> chunks of code as example would be of great help.
>
> Take care all -
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Advice-on-exploration-of-sub-clusters-in-hierarchical-dendrogram-tp4414277p4414277.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] nlme Fixed Variance Function

2012-02-23 Thread Agostino Moro
Dear R users,

I am trying to fit a gls model and weight my data points using a
VarFixed structure. I have found many examples, but I do not
understand the difference between the following models with varFixed
specified in a different way:

mod<-gls(y~x,weights=varFixed(~1/invsigma)

mod<-gls(y~x,weights=varFixed(~invsigma)

In my case I would simply like to weigh my data points by their
inverse variance.

Any help would be greatly appreciated!

Cheers,

Agostino

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Survival analysis and comparing survival curves

2012-02-23 Thread R-girl
Hei,

I have one simple question which does not seem to be that simple, as I
cannot find any solution/answer:

Is it possible to compare multiple survival curves in R with
survdiff-function when there is interaction term involved in predictor
variables (and this interaction is significant)?

Example:

survdiff(Surv(death,status)~treatment*gapsize)

R is making "problems" with it, i.e. it does not want to perform the test.
And all the examples I have found so far, may involve multiple predictor
variables but in additive format (e.g. treatment+gapsize).

If survdiff is not the way to go, are there any other solutions in order to
compare statistically the curves i.e. if they differ significantly from each
other?

I would really appreciate if someone can answer!

Cheers, Minna

--
View this message in context: 
http://r.789695.n4.nabble.com/Survival-analysis-and-comparing-survival-curves-tp4414316p4414316.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] installing the package Rcplex

2012-02-23 Thread zheng wei
Can you tell me what you mean by adding R's bin directory to the PATH?
 
Thanks



From: Uwe Ligges 

Cc: David Winsemius ; "r-help@r-project.org" 
 
Sent: Thursday, February 23, 2012 9:29 AM
Subject: Re: [R] installing the package Rcplex

I give up.

Uwe Ligges



On 23.02.2012 15:01, zheng wei wrote:
> Thanks.
>
> I do not understand item 2, what do you mean by add R's bin directory to path?
>
> While directly use to c:\temp and type the commands, it says 'R' is not 
> recognized as an internal or external command.
>
> While I use previous approach by going into the bin directory and type R CMD 
> INSTALL "c:/temp/Rcplex", it told me that configuration failed for package 
> 'Rcplex'
>
>
> 
>  From: Uwe Ligges

> Cc: David Winsemius; 
> "r-help@r-project.org"
> Sent: Thursday, February 23, 2012 5:07 AM
> Subject: Re: [R] installing the package Rcplex
>
>
>
> On 22.02.2012 21:04, zheng wei wrote:
>> Based on my understanding of the manual, I moved upziped the file and put 
>> the folder of Rcplex under the directory of c:/temp
>> Then I use cmd under windows to go to the director of C:\Program
>> Files\R\R-2.13.0\bin, where my R is installed
>> and typed R CMD INSTALL
>> "c:/temp/Rcplex"
>
> I'd
>
> 1. install a recent version of R
>
> 2. add R's bin directory to the PATH and go to c:\temp and say
>
> R CMD build Rcplex
>
> followed by
>
> R CMD INSTALL Rcplex_version.tar.gz
>
>
> 3. When it fails, I'd try to find out if the ERROR message is helpful.
>
> 4. If 3 fails, I'd ask the maintainer for help - including relevant
> information, like the error message, install paths, version information etc.
>
>
> Uwe Ligges
>
>
>
>> I got the error of configuration failed for package "Rcplex"
>>
>> Any idea?
>>
>>
>>
>>
>> 
>>    From: Uwe Ligges

>> Cc: David Winsemius; 
>> "r-help@r-project.org"
>> Sent: Wednesday, February 22, 2012 4:04 AM
>> Subject: Re: [R] installing the package Rcplex
>>
>>
>>
>> On 22.02.2012 03:33, zheng wei wrote:
>>> Thanks.
>>>
>>> I was just reminded by the tech support in my university that cplex is an 
>>> independent software owned by ILOG, which in turn is now owned by IBM. I 
>>> suceeded in installing the software cplex under the directory of 
>>> "C:/Program Files/IBM/ILOG/CPLEX_Studio_Academic124/cplex"
>>> I guess Rcplex is an R package to utilize the software cplex. I have 
>>> changed the path "/c/ilog/cplex111" to the above path. My question is how 
>>> to finally and effectively install the package of Rcplex?
>>
>> You have been asked alreeady to read the R Installation and
>> Administration manual.
>>
>> Uwe Ligges
>>
>>
>>
>>>
>>> Thanks,
>>> Wei
>>>
>>>
>>>
>>> 
>>> From: Uwe Ligges

>>> Cc: David Winsemius; 
>>> "r-help@r-project.org"
>>> Sent: Tuesday, February 21, 2012 2:14 PM
>>> Subject: Re: [R] installing the package Rcplex
>>>
>>>
>>>
>>> On 21.02.2012 19:57, zheng wei wrote:
 Thank you both for helping. Still could not figure out.

 I was contacting different supporting IT departments in my university but 
 did not get any help.

 For the moment, I just want to what does the instruction of the package 
 means. You could find this instruction on page 
 http://cran.r-project.org/web/packages/Rcplex/INSTALL
        
--
 ***WINDOWS***
 Installation on Windows systems is done by using the provided
 Makevars.win file in the src directory. It contains the following
 lines:
 PKG_CPPFLAGS=-I/include
 PKG_LIBS=-L/lib/x86_windows_vs2008/stat_mda -lcplex111 -lm
 where    is the cplex installation directory
 e.g. /c/ilog/cplex111. Please edit your Makevars.win file accordingly.
 We have successfully tested this procedure with CPLEX 11.1 on 32-bit
 Windows XP.
 --


 I can find the file and see the codes. But what new path should I put, and 
 what to do next?
>>>
>>> The path to your CPLEX installation?
>>>
>>> Uwe Ligges
>>>
>>>
>>>
>>>

 Thanks,
 Wei


 
 From: Uwe Ligges
 To: David Winsemius

 Sent: Monday, February 20, 2012 6:01 AM
 Subject: Re: [R] installing the package Rcplex



 On 20.02.2012 01:54, David Winsemius wrote:
>
> On Feb 19, 2012, at 7:45 PM, zheng wei wrote:
>
>> I did not know this before. I installed it as you suggested. what to
>> do next?
>
> Read the Installation Manual?
>

 And don't forget this is a source package for which no CRAN Windows
 binary exists, hence it may be not that straightforward to get it done
 and you wil have to read the INSTALL file from the source package 
 carefully.

[R] help with winbugs glm

2012-02-23 Thread Adan Jordan-Garza
Hi,
I am running a model with count data and one categorical predictor (simple
model for me to understand it fully), I did in R a glm like this:
glm(Recruitment~Depth, family=poisson). I get the coefficientes and
confidence intervals and all is ok. But then I want to do the same model
with Bayesian stats, here is my code:

model
{ for (i in 1:232)
{
Recruitment[i]~dpois(lambda[i])
log(lambda[i])<-a+b[Depth[i]]*Depth[i]
}
a~dnorm(0,0.01)
b[1]~dnorm(0,0.01)
b[2]~dnorm(0,0.01)
b[3]~dnorm(0,0.001)
}
list(a=0, b=c(0,0,0))

I have two problems: 1) the resulting credible intervals for the
coefficients (a, b1, b2 and b3) are HUGE and don't make any reasonable sense;
2) using OpenBugs and WinBugs I get different results.

if anyone can help me I appreciate a lot your time,

thanks

Guillermo

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Survival analysis and comparing survival curves

2012-02-23 Thread Bert Gunter
DEAR R-girl:

On Thu, Feb 23, 2012 at 8:12 AM, R-girl  wrote:
> Hei,
>
> I have a one simple question which does not seem to be that simple as I
> cannot find any solution/answer:
>
> Is it possible to compare multiple survival curves in R with
> survdiff-function when there is interaction term involved in predictor
> variables (and this interaction is significant)?

I have one simple answer. Get local statistical help. You do not
understand the meaning of an interaction.

-- Bert

>
> Example:
>
> survdiff(Surv(death,status)~treatment*gapsize)
>
> R is making "problems" with it ie.e. it does not want to perform the test.
> And all the examples I have found so far, may involve multiple predictor
> variables but in additive format (e.g. treatment+gapsize).
>
> If survdiff is not the way to go, are there any other solutions in order to
> compare statistically the curves i.e. if they differ significantly from each
> other?
>
> I would really appreciate if someone can answer!
>
> Cheers, Minna
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Survival-analysis-and-comparing-survival-curves-tp4414316p4414316.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Aggregate with Function List ?

2012-02-23 Thread Michael Karol
R Experts

 

  I wish to tabulate into one data frame statistics summarizing
concentration data.   The summary is to include mean, standard
deviation, median, min and max.  I wish to have summaries by Dose, Day
and Time.   I can do this by calling aggregate once for each of the
statistics (mean, standard deviation, median, min and max) and then
execute 4 merges to merge the 5 data frames into one.  (Example
aggregate code for mean only is shown below.)

  Can someone show me the coding to do this as one command, rather than
5 calls to aggregate and 4 merges.  In other words, in essence, I'd like
to present to "FUN =" a list of functions, so all the summary stats come
back in one data frame.  Your assistance is appreciated.  Thank you.

 

MeansByDoseDayTime <- aggregate(as.double(DF$Concentration), by =
list(DF$Dose, DF$Day, DF$Time), FUN = mean, trim = 0, na.rm = T,
weights=NULL)

 

 

Regards, 

Michael


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Using R to read Nortek Aquadopp Profiler

2012-02-23 Thread Brian Diggs

On 2/22/2012 4:53 PM, Vinny Moriarty wrote:

Hello,

I have current data from a nortek ADP, which is basically current speed and
direction data in a 3 dimensional  X Y Z format

  http://www.nortekusa.com/usa/products/current-profilers/aquadopp-profiler-1


The instrument logs data in a complex way and I was wondering if anyone has
had any experience using R to at least read the data, and perhaps smooth it
as well.  If so, are there any resources for using R for this kind of work?


A quick search using Rseek.org turns up the oce package with functions 
read.adp and read.adp.nortek. I've not ever worked with such data, but 
looking at the oce package seems like a good start and also at the 
Environmetrics task view ( 
http://cran.r-project.org/web/views/Environmetrics.html ).



Thanks,

V



--
Brian S. Diggs, PhD
Senior Research Associate, Department of Surgery
Oregon Health & Science University

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Help with Matrix code optimization

2012-02-23 Thread Matt Shotwell
The chol and solve methods for dpoMatrix (Matrix package) are much
faster than the default methods. But, the time required to coerce a
regular matrix to dpoMatrix swamps the advantage.

Hence, I have the following problem, where use of dpoMatrix is worse
than a regular matrix.

library(Matrix)

x <- diag(10)

system.time(
  for(r in seq(0.1, 0.9, length.out=1000)) {
m <- r^abs(row(x)-col(x));
chol(m); solve(m);
  })

system.time(
  for(r in seq(0.1, 0.9, length.out=1000)) {
M <- as(r^abs(row(x)-col(x)), 'dpoMatrix')
chol(M); solve(M);
  })

Any ideas?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Good and modern Kernel Regression package in R with auto-bandwidth?

2012-02-23 Thread Michael
Thanks Andy.

I am reading the "locfit" document...

but not sure how to do the CV and bandwidth selection...

Here is a quote about the function "regband": it doesn't seem to be usable.

Basically I am looking for a "locfit" that comes with an automatic
bandwidth selection so that I am essentially parameter free for the
local-regression step...

-

regband Bandwidth selectors for local regression.

Description

Function to compute local regression bandwidths for local linear
regression, implemented as a front end to locfit().

This function is included for comparative purposes only. Plug-in selectors
are based on flawed logic, make unreasonable and restrictive assumptions
and do not use the full power of the estimates available in Locfit. Any
relation between the results produced by this function and desirable
estimates are entirely coincidental.

Usage

regband(formula, what = c("CP", "GCV", "GKK", "RSW"), deg=1, ...)

2012/2/23 Liaw, Andy 

> **
> If that's the kind of framework you'd like to work in, use locfit, which
> has the predict() method for evaluating new data.  There are several
> different handwidth selectors in that package for your choosing.
>
> Kernel smoothers don't really fit the framework of "creating a model
> object, followed by predicting new data using that fitted model object"
> very well because of it's local nature.  Think of k-nn classification,
> which has similar problem:  The "model" needs to be computed for every data
> point you want to predict.
>
> Andy
>
>  --
> *From:* Michael [mailto:comtech@gmail.com]
> *Sent:* Thursday, February 23, 2012 10:06 AM
>
> *To:* Liaw, Andy
> *Cc:* Bert Gunter; r-help
> *Subject:* Re: [R] Good and modern Kernel Regression package in R with
> auto-bandwidth?
>
>   Thank you Andy!
>
> I went thru KernSmooth package but I don't see a way to use the fitted
> function to do the "predict" part...
>
>
> data=data.frame(z=z, x=x)
>
> datanew=data.frame(z=z, x=x)
>
lmfit=lm(z~x, data=data)
>
> lmforecast=predict(lmfit, newdata=datanew)
>
> Am I missing anything here?
>
> Thanks!
> 2012/2/23 Liaw, Andy 
>
>> **
>> In short, pick your poison...
>>
>> Is there any particular reason why the tools that shipped with R itself
>> (e.g., kernSmooth) are inadequate for you?
>>
>> I like using the locfit package because it has many tools, including the
>> ones that the author didn't think were optimal.  You may need the book to
>> get most mileage out of it though.
>>
>> Andy
>>
>>  --
>> *From:* Michael [mailto:comtech@gmail.com]
>> *Sent:* Thursday, February 23, 2012 12:25 AM
>> *To:* Liaw, Andy
>> *Cc:* Bert Gunter; r-help
>>
>> *Subject:* Re: [R] Good and modern Kernel Regression package in R with
>> auto-bandwidth?
>>
I meant it's very slow when I use "cv.aic"...
>>
>> On Wed, Feb 22, 2012 at 11:24 PM, Michael  wrote:
>>
>>> Is "np" an okay package to use?
>>>
>>> I am worried about the "multi-start" thing... and also it's very slow...
>>>
>>>
>>> On Wed, Feb 22, 2012 at 8:35 PM, Liaw, Andy  wrote:
>>>
 Bert's question aside (I was going to ask about laundry, but that's
 much harder than taxes...), my understanding of the situation is that
 "optimal" is in the eye of the beholder.  There were at least two schools
 of thought on which is the better way of automatically selecting bandwidth,
 using plug-in methods or CV-type.  The last I check, the jury is still out.

 Andy

 > -Original Message-
 > From: r-help-boun...@r-project.org
 > [mailto:r-help-boun...@r-project.org] On Behalf Of Bert Gunter
 > Sent: Wednesday, February 22, 2012 6:03 PM
 > To: Michael
 > Cc: r-help
 > Subject: Re: [R] Good and modern Kernel Regression package in
 > R with auto-bandwidth?
 >
 > Would you like it to do your your taxes for you too? :-)
 >
 > Bert
 >
 > Sent from my iPhone -- please excuse typos.
 >
 > On Feb 22, 2012, at 11:46 AM, Michael  wrote:
 >
 > > Hi all,
 > >
 > > I am looking for a good and modern Kernel Regression
 > package in R, which
 > > has the following features:
 > >
 > > 1) It has cross-validation
 > > 2) It can automatically choose the "optimal" bandwidth
 > > 3) It doesn't have random effect - i.e. if I run the
 > function at different
 > > times on the same data-set, the results should be exactly
 > the same... I am
 > > trying "np", but I am seeing:
 > >
 > > Multistart 1 of 1 |
 > > Multistart 1 of 1 |
 > > ...
 > >
 > > It looks like in order to do the optimization, it's doing
 > > multiple-random-start optimization... am I right?
 > >
 > >
 > > Could you please give me some pointers?
 > >
 > > I did some google search but there are so many packages
 > that do this... I
 > > just wanted to find the best/modern one to use...

Re: [R] Aggregate with Function List ?

2012-02-23 Thread David Winsemius


On Feb 23, 2012, at 1:41 PM, Michael Karol wrote:


R Experts



 I wish to tabulate into one data frame statistics summarizing
concentration data.   The summary is to include mean, standard
deviation, median, min and max.  I wish to have summaries by Dose, Day
and Time.   I can do this by calling aggregate once for each of the
statistics (mean, standard deviation, median, min and max) and then
execute 4 merges to merging the 5 data frames into one.  (Example
aggregate code for mean only is shown below.)

 Can someone show me the coding to do this as one command, rather than
5 calls to aggregate and 4 merges.  In other words, in essence, I'd  
like
to present to "FUN =" a list of functions, so all the summary stats  
come

back in one data frame.  Your assistance is appreciated.  Thank you.


Perhaps something like this?

MeansByDoseDayTime <- aggregate(as.double(DF$Concentration), by =
list(DF$Dose, DF$Day, DF$Time), FUN =
  function(x) c( mean(x, trim = 0, na.rm = TRUE),
                 sd(x, na.rm=TRUE),
                 median(x, na.rm=TRUE),
                 min(x, na.rm=TRUE),
                 max(x, na.rm=TRUE)
               )
)
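
(An editor's sketch, not part of the original reply: the aggregated column then comes back as a matrix; naming the statistics and flattening with do.call(data.frame, ...) gives ordinary columns. DF below is invented, since the original data were never posted.)

```r
## Made-up data frame with the column names from the question
DF <- data.frame(Dose = rep(c(10, 20), each = 6),
                 Day  = rep(1:2, times = 6),
                 Time = rep(1:3, times = 4),
                 Concentration = rnorm(12, mean = 10))
stats <- aggregate(as.double(DF$Concentration),
                   by = list(Dose = DF$Dose, Day = DF$Day, Time = DF$Time),
                   FUN = function(x) c(mean = mean(x, na.rm = TRUE),
                                       sd = sd(x, na.rm = TRUE),
                                       median = median(x, na.rm = TRUE),
                                       min = min(x, na.rm = TRUE),
                                       max = max(x, na.rm = TRUE)))
## split the matrix column into ordinary columns x.mean, x.sd, ...
stats <- do.call(data.frame, stats)
head(stats)
```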






David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Survival analysis and comparing survival curves

2012-02-23 Thread David Winsemius


On Feb 23, 2012, at 11:12 AM, R-girl wrote:


Hei,

I have a one simple question which does not seem to be that simple  
as I

cannot find any solution/answer:

Is it possible to compare multiple survival curves in R with
survdiff-function when there is interaction term involved in predictor
variables (and this interaction is significant)?


I've never had a problem with survival failing to construct an  
interaction term.




Example:

survdiff(Surv(death,status)~treatment*gapsize)

R is making "problems" with it ie.e. it does not want to perform the  
test.


Is 'gapsize' a numeric vector rather than a factor?
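
(Editor's sketch, not part of the original reply: since survdiff() compares the groups formed by the right-hand side, the crossed grouping can be spelled out with interaction(); the lung data from the survival package stand in for the poster's unseen variables.)

```r
library(survival)
## dichotomize age to get a second factor alongside sex
lung2 <- transform(lung, agegrp = factor(age > 65))
## one survival curve per sex-by-agegrp combination, compared by log-rank:
survdiff(Surv(time, status) ~ interaction(sex, agegrp), data = lung2)
```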

And all the examples I have found so far, may involve multiple  
predictor

variables but in additive format (e.g. treatment+gapsize).

If survdiff is not the way to go, are there any other solutions in  
order to
compare statistically the curves i.e. if they differ significantly  
from each

other?


Without a better description of the data layout (say using hte str()  
function) and the code you have used so far, the words "the curves",  
the indefinite pronoun "they" and the phrase "each other" will remain  
far too nebulous for further comment.




I would really appreciate if someone can answer!


Only if "someone" can provide the information requested in the Posting  
Guide.


PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] help with winbugs glm

2012-02-23 Thread ilai
Adan,
How many levels does Depth have? My wild guess: 3, and your bugs model
is not identifiable.

Second, I think you may have a critical error in the way you formatted
the data for the bugs model. From your code it looks like you are just
using the factor Depth and not a design matrix of dummy variables.

I may be wrong with respect to WinBugs (I use JAGS), but if Depth is
denoted as, e.g., "low","med","high", wouldn't your multiply
operation "...*Depth[i]" on line 5 fail?

More likely Depth is denoted "1","2","3" and WinBugs thinks it's
numerical. Well, in that case clearly coefficients for this model
don't make any sense (you'll only need one b for the slope). You can
use model.matrix(~Depth) to get the proper format for your data.
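
A minimal illustration of that model.matrix() point (the Depth values below are invented):

```r
## A 3-level factor expands to an intercept plus two 0/1 dummy columns;
## these columns, not the raw factor codes, are what the BUGS model needs.
Depth <- factor(c("low", "med", "high", "low", "high"),
                levels = c("low", "med", "high"))
X <- model.matrix(~ Depth)
X  # columns: (Intercept), Depthmed, Depthhigh
```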

Hope this solves it. Next time, knowing n.chains, n.iter and whether they
achieved convergence (with different starting values) can help sort
through these sorts of issues.

Cheers

Elai

On Thu, Feb 23, 2012 at 9:57 AM, Adan Jordan-Garza
 wrote:
> Hi,
> I am running a model with count data and one categorical predictor (simple
> model for me to understand it fully), I did in R a glm like this:
> glm(Recruitment~Depth, family=poisson). I get the coefficientes and
> confidence intervals and all is ok. But then I want to do the same model
> with Bayesian stats, here is my code:
>
> model
> { for (i in 1:232)
> {
> Recruitment[i]~dpois(lambda[i])
> log(lambda[i])<-a+b[Depth[i]]*Depth[i]
> }
> a~dnorm(0,0.01)
> b[1]~dnorm(0,0.01)
> b[2]~dnorm(0,0.01)
> b[3]~dnorm(0,0.001)
> }
> list(a=0, b=c(0,0,0))
>
> I have two problems: 1) the resulting credible intervals for the
> coefficients (a, b1, b2 and b3) are HUGE don t make any reasonable sense;
> 2) Using OpenBugs and Winbugs I get different results,
>
> if anyone can help me I appreciate a lot your time,
>
> thanks
>
> Guillermo
>
>        [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] I'm sure I'm missing something with formatC() or sprintf()

2012-02-23 Thread z2.0
I have a four-digit string I want to convert to five digits. Take the
following frame:

zip
2108
60321
60321
22030
91910

I need row 1 to read '02108'. This forum directed me to formatC previously
(thanks!) That usually works but, for some reason, it's not in this
instance. Neither of the syntaxes below change '2108' to '02108.' The values
in cand_receipts[,1] are of type 'character.'

cand_receipts[,1] <- formatC(cand_receipts[,1], width = 5, format = 's',
flag = '0')
cand_receipts[,1] <- sprintf("%05s", cand_receipts[,1])

 Any thoughts?

Thanks,

Zack





--
View this message in context: 
http://r.789695.n4.nabble.com/I-m-sure-I-m-missing-something-with-formatC-or-sprintf-tp4414905p4414905.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] I'm sure I'm missing something with formatC() or sprintf()

2012-02-23 Thread William Dunlap
sprintf's "%s" format descriptor ignores initial 0's in ,
in C's sprintf and in R's.  Here are 2 ways to do it:
  > z <- c("5", "45", "345", "2345", "12345")
  > sprintf("%05d", as.integer(z))
  [1] "00005" "00045" "00345" "02345" "12345"
  > gsub(" ", "0", sprintf("%5s", z))
  [1] "00005" "00045" "00345" "02345" "12345"

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com 
> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
> Behalf Of z2.0
> Sent: Thursday, February 23, 2012 11:16 AM
> To: r-help@r-project.org
> Subject: [R] I'm sure I'm missing something with formatC() or sprintf()
> 
> I have a four-digit string I want to convert to five digits. Take the
> following frame:
> 
> zip
> 2108
> 60321
> 60321
> 22030
> 91910
> 
> I need row 1 to read '02108'. This forum directed me to formatC previously
> (thanks!) That usually works but, for some reason, it's not in this
> instance. Neither of the syntaxes below change '2108' to '02108.' The values
> in cand_receipts[,1] are of type 'character.'
> 
> cand_receipts[,1] <- formatC(cand_receipts[,1], width = 5, format = 's',
> flag = '0')
> cand_receipts[,1] <- sprintf("%05s", cand_receipts[,1])
> 
>  Any thoughts?
> 
> Thanks,
> 
> Zack
> 
> 
> 
> 
> 
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/I-m-sure-I-m-missing-something-with-
> formatC-or-sprintf-tp4414905p4414905.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] I'm sure I'm missing something with formatC() or sprintf()

2012-02-23 Thread Sarah Goslee
You said that the values are already character - that's the key.

Compare:
> sprintf("%05s", "2018")
[1] " 2018"
> sprintf("%05d", 2018)
[1] "02018"

Since they are already character, though, here's another option:
x <- c("2108", "60321", "22030") # part of your data
ifelse(nchar(x) == 4, paste("0", x, sep=""), x)
[1] "02108" "60321" "22030"

You could also use:
> sprintf("%05d", as.numeric("2018"))
[1] "02018"

The help for sprintf says this, but not clearly:
‘0’ For numbers, pad to the field width with leading zeros.



Sarah


On Thu, Feb 23, 2012 at 2:16 PM, z2.0  wrote:
> I have a four-digit string I want to convert to five digits. Take the
> following frame:
>
> zip
> 2108
> 60321
> 60321
> 22030
> 91910
>
> I need row 1 to read '02108'. This forum directed me to formatC previously
> (thanks!) That usually works but, for some reason, it's not in this
> instance. Neither of the syntaxes below change '2108' to '02108.' The values
> in cand_receipts[,1] are of type 'character.'
>
> cand_receipts[,1] <- formatC(cand_receipts[,1], width = 5, format = 's',
> flag = '0')
> cand_receipts[,1] <- sprintf("%05s", cand_receipts[,1])
>
>  Any thoughts?
>
> Thanks,
>
> Zack
>
>
>
>
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/I-m-sure-I-m-missing-something-with-formatC-or-sprintf-tp4414905p4414905.html
> Sent from the R help mailing list archive at Nabble.com.
>



-- 
Sarah Goslee
http://www.stringpage.com
http://www.sarahgoslee.com
http://www.functionaldiversity.org



Re: [R] I'm sure I'm missing something with formatC() or sprintf()

2012-02-23 Thread Ted Harding
On 23-Feb-2012 z2.0 wrote:
> I have a four-digit string I want to convert to five digits. Take the
> following frame:
> 
> zip
> 2108
> 60321
> 60321
> 22030
> 91910
> 
> I need row 1 to read '02108'. This forum directed me to formatC previously
> (thanks!) That usually works but, for some reason, it's not in this
> instance. Neither of the syntaxes below change '2108' to '02108.' The values
> in cand_receipts[,1] are of type 'character.'
> 
> cand_receipts[,1] <- formatC(cand_receipts[,1], width = 5, format = 's',
> flag = '0')
> cand_receipts[,1] <- sprintf("%05s", cand_receipts[,1])
> 
>  Any thoughts?
> 
> Thanks,
> Zack

For this (and similar cases):

  formatC(2108,width=5,flag="0")
  # [1] "02108"

For longer strings:

  formatC(2108,width=6,flag="0")
  # [1] "002108"

see ?formatC for more details (the way formatC() aggregates
information about the desired format is somewhat different
from the format syntax in C's printf() and related functions).

Ted.

-
E-Mail: (Ted Harding) 
Date: 23-Feb-2012  Time: 19:58:22
This message was sent by XFMail

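Pulling the two answers above together, a minimal base-R sketch of zero-padding character ZIP codes (the values are made up, standing in for the poster's column):

```r
zips <- c("2108", "60321", "22030")  # character, as in the original post

# sprintf's "0" flag pads numbers only, so convert to numeric first
sprintf("%05d", as.numeric(zips))
# [1] "02108" "60321" "22030"

# the formatC() equivalent (also needs numeric input for zero-padding)
formatC(as.numeric(zips), width = 5, flag = "0")
# [1] "02108" "60321" "22030"
```

Note that the numeric conversion assumes the column really contains digit strings; ZIP+4 codes with a hyphen would need the paste()-based approach instead.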


Re: [R] convert zoo object to "standard" R object so I can plot and output to csv file

2012-02-23 Thread Henry
Another simple question - trying to specify xlim in a zoo plot, I am getting an
error. My plot line is:
plot(z1,
ylim=c(-100,3000),xlim=c(chron("10/30/2011","00:00:00"),chron("10/30/2011","00:20:00")),type="b",xlab="",ylab="1
Minute Fit",cex.lab=1.3)
Error in substring(paste("0", v$day, sep = ""), first = nchar(paste(v$day)))
: 
  invalid substring argument(s)

Most of the complete code pasted below.
fmt<-"%m/%d/%Y %H:%M:%S"  # describe the date/time format in the file
tail1<-function(x) tail(x,1)
z<-read.zoo("Cooling-1.txt",FUN=as.chron,format=fmt,sep="\t",header=TRUE,aggregate=tail1)

par(oma=c(6,1,4,2)) # set the outside space boundaries

par(mfrow=c(2,1)) # set the number of graphs (rows,cols) going on one page

# plot the original data
par(mar=c(1.1, 5, .9, 0.5))
plot(z, ylim=c(-100,3000),type="b",xlab="",ylab="Raw
Data",col="red",cex.lab=1.3)
grid(nx=NA,ny=NULL,lwd=.5,lty=1,col="red")

# calculate and plot the 1 minute straight line interpolation 
m1 <- times("00:01:00")
g <- seq(trunc(start(z),m1),end(z),by = m1)
z1<-na.approx(z,xout = g)

#plot the 1 minute linear interpolation fit
par(mar=c(1.1, 5, .9, 0.5))
#the following plot line generates the error
plot(z1,
ylim=c(-100,3000),xlim=c(chron("10/30/2011","00:00:00"),chron("10/30/2011","00:20:00")),type="b",xlab="",ylab="1
Minute Fit",cex.lab=1.3)
grid(nx=NA,ny=NULL,lwd=.5,lty=1,col="darkgrey")



--
View this message in context: 
http://r.789695.n4.nabble.com/convert-zoo-object-to-standard-R-object-so-I-can-plot-and-output-to-csv-file-tp4398302p4415078.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Problems with Cosine Similarity using library(lsa)

2012-02-23 Thread Dallas
The as.matrix (and as.table or as.vector or as.numeric ...) command takes
the object that you wish to convert as an argument. So the code below will
actually perform the conversion from table to matrix. 

> newmatrix<- as.matrix(matrix_v3)

A way to see what form your data are taking is to use the command
typeof(object). In this case, you can write 

>typeof(matrix_v3)

 Easy fix (hopefully)

Tad

--
View this message in context: 
http://r.789695.n4.nabble.com/Problems-with-Cosine-Similarity-using-library-lsa-tp4413433p4415114.html
Sent from the R help mailing list archive at Nabble.com.

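As a small illustration of the inspection advice above -- typeof() reports the underlying storage mode, while class() is what method dispatch cares about. The object here is a tiny made-up table standing in for the poster's matrix_v3:

```r
matrix_v3 <- table(c("a", "b", "a"))  # stand-in for the poster's data

typeof(matrix_v3)   # "integer" -- the storage mode of the counts
class(matrix_v3)    # "table"   -- what functions dispatch on

# the conversion discussed in the thread
newmatrix <- as.matrix(matrix_v3)
is.matrix(newmatrix)  # TRUE
```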


Re: [R] convert zoo object to "standard" R object so I can plot and output to csv file

2012-02-23 Thread Gabor Grothendieck
On Thu, Feb 23, 2012 at 3:06 PM, Henry  wrote:
> Another simple question - trying to specify xlim in a zoo plot and getting
> error
> my plot line is
> plot(z1,
> ylim=c(-100,3000),xlim=c(chron("10/30/2011","00:00:00"),chron("10/30/2011","00:20:00")),type="b",xlab="",ylab="1
> Minute Fit",cex.lab=1.3)
> Error in substring(paste("0", v$day, sep = ""), first = nchar(paste(v$day)))
> :
>  invalid substring argument(s)
>
> Most of the complete code pasted below.

Most???

Read the last two lines of every message to r-help.

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [R] convert zoo object to "standard" R object so I can plot and output to csv file

2012-02-23 Thread R. Michael Weylandt
If you could just construct the zoo object you want to plot and then
use dput() on it to create a plain-text representation (safe for
emailing) that'd make it easiest for us to help you. We can't do much
right now since we don't have the text file in question.

Michael

On Thu, Feb 23, 2012 at 3:21 PM, Gabor Grothendieck
 wrote:
> On Thu, Feb 23, 2012 at 3:06 PM, Henry  wrote:
>> Another simple question - trying to specify xlim in a zoo plot and getting
>> error
>> my plot line is
>> plot(z1,
>> ylim=c(-100,3000),xlim=c(chron("10/30/2011","00:00:00"),chron("10/30/2011","00:20:00")),type="b",xlab="",ylab="1
>> Minute Fit",cex.lab=1.3)
>> Error in substring(paste("0", v$day, sep = ""), first = nchar(paste(v$day)))
>> :
>>  invalid substring argument(s)
>>
>> Most of the complete code pasted below.
>
> Most???
>
> Read the last two lines of every message to r-help.
>
> --
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com
>


Re: [R] Case weighting

2012-02-23 Thread Hed Bar-Nissan
It's really weighting - it's just that my simplified example was too
simplified
Here is my real weight vector:
> sc$W_FSCHWT
  [1]  14.8579  61.9528   3.0420   2.9929   5.1239  14.7507   2.7535
2.2693   3.6658   8.6179   2.5926   2.5390   1.7354   2.9767   9.0477
2.6589   3.4040   3.0519



And still it should somehow set the case weight.
I could multiply all by 10000 and maybe use your method, but it would create
such a bloated dataframe.

Working with numeric data only, I could probably compute weighted means.

But something as simple as SPSS's WEIGHT BY would be nice.

tnx
Hed





On Thu, Feb 23, 2012 at 7:43 PM, David Winsemius wrote:

>
> On Feb 23, 2012, at 10:49 AM, Hed Bar-Nissan wrote:
>
>  The need comes from the PISA data. (http://www.pisa.oecd.org)
>>
>> In the data there are many cases and each of them carries a numeric
>> variable that signifies it's weight.
>> In SPSS the command would be "WEIGHT BY"
>>
>> In simpler words here is an R sample ( What is get  VS  what i want to
>> get )
>>
>>
>>  data.recieved <- data.frame(
>>>
>> + kindergarten_attendance = factor(c(2,1,1,1), labels = c("Yes", "No")),
>> + weight=c(10, 1, 1, 1)
>> + );
>>
>>> data.recieved;
>>>
>>  kindergarten_attendance weight
>> 1  No 10
>> 2 Yes  1
>> 3 Yes  1
>> 4 Yes  1
>>
>>>
>>>
>>>
>>> data.weighted <- data.frame(
>>>
>> + kindergarten_attendance = factor(c(2,2,2,2,2,2,2,2,2,2,1,1,1),
>> labels =
>> c("Yes", "No")) );
>>
>
> You want "case repetition" not case weighting, which I would use as a term
> when working on estimation problems:
>
> >  ( data.weighted <- unlist(sapply(1:NROW(data.recieved), function(x)
> rep(data.recieved[x,1], times=data.recieved[x,2] ))  ) )
>  [1] No  No  No  No  No  No  No  No  No  No  Yes Yes Yes
> Levels: Yes No
>
>
>
>>>
>>> par(mfrow=c(1,2));
>>> plot(data.recieved$kindergarten_attendance,main="What i get");
>>> plot(data.weighted$kindergarten_attendance,main="What i want to
>>> get");
>>>
>>
> Seems to work with the factor vector, although I didn't replicate
> dataframe rows, but I guess you could.
>
>
>>>
>> tnx in advance
>> Hed
>>
>>
>
> David Winsemius, MD
> West Hartford, CT
>
>



Re: [R] R CMD INSTALL with configure args

2012-02-23 Thread R. Michael Weylandt
Please cc the list. I'm not a debian user so any suggestions I make
will be guesses based on similarities with Mac OS X. There's also a
R-SIG-Debian list that might be appropriate.

That said, if you plan on doing this repeatedly, perhaps you can make
an alias in your bash profile (or whatever shell you're using).

Michael

On Thu, Feb 23, 2012 at 1:47 PM, Erin Hodgess  wrote:
> Sorry...Debian
>
>
> On Thu, Feb 23, 2012 at 11:51 AM, R. Michael Weylandt
>   wrote:
>> What OS?
>>
>> Michael
>>
>> On Feb 23, 2012, at 12:44 PM, Erin Hodgess  wrote:
>>
>>> Dear R People:
>>>
>>> I have a question, please:
>>>
>>> I want to install a package from R CMD INSTALL and I have a boatload
>>> of configure args.  I want to put them into a file.  How do I point
>>> the R CMD INSTALL to that file, please?
>>>
>>> Thanks,
>>> Erin
>>>
>>>
>>> --
>>> Erin Hodgess
>>> Associate Professor
>>> Department of Computer and Mathematical Sciences
>>> University of Houston - Downtown
>>> mailto: erinm.hodg...@gmail.com
>>>
>
>
>
> --
> Erin Hodgess
> Associate Professor
> Department of Computer and Mathematical Sciences
> University of Houston - Downtown
> mailto: erinm.hodg...@gmail.com

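Building on the alias suggestion above, one way to keep the configure arguments in a file is to let the shell splice them into the documented `--configure-args` option; `R CMD INSTALL` itself has no flag for reading them from a file. The file name, arguments, and package name below are made up, and the sketch assumes one argument per line with no embedded spaces:

```shell
# configure_args.txt holds one configure argument per line
printf -- '--with-foo=/usr/local\n--enable-bar\n' > configure_args.txt

# join the lines with spaces and pass them through --configure-args
ARGS=$(tr '\n' ' ' < configure_args.txt)
echo R CMD INSTALL --configure-args="$ARGS" mypackage_1.0.tar.gz
```

Dropping the `echo` runs the real install; wrapping the last two lines in a small script or shell-profile function saves retyping.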


Re: [R] Case weighting

2012-02-23 Thread David Winsemius


On Feb 23, 2012, at 3:27 PM, Hed Bar-Nissan wrote:

It's really weighting - it's just that my simplified example was too  
simplified

Here is my real weight vector:
> sc$W_FSCHWT
  [1]  14.8579  61.9528   3.0420   2.9929   5.1239  14.7507
2.7535   2.2693   3.6658   8.6179   2.5926   2.5390   1.7354
2.9767   9.0477   2.6589   3.4040   3.0519




You should always convey the necessary complexity of the problem.



And still it should somehow set the case weight.
I could multiply all by 1 and use maybe your method but it would  
create such a bloated dataframe


working with numeric only i could probably create weighted means

But something simple as WEIGHTED BY would be nice.


The survey package by Thomas Lumley provides for a wide variety of  
weighted analyses.


--
David.


tnx
Hed





On Thu, Feb 23, 2012 at 7:43 PM, David Winsemius wrote:


On Feb 23, 2012, at 10:49 AM, Hed Bar-Nissan wrote:

The need comes from the PISA data. (http://www.pisa.oecd.org)

In the data there are many cases and each of them carries a numeric
variable that signifies it's weight.
In SPSS the command would be "WEIGHT BY"

In simpler words here is an R sample ( What is get  VS  what i want  
to get )



data.recieved <- data.frame(
+ kindergarten_attendance = factor(c(2,1,1,1), labels = c("Yes",  
"No")),

+ weight=c(10, 1, 1, 1)
+ );
data.recieved;
 kindergarten_attendance weight
1  No 10
2 Yes  1
3 Yes  1
4 Yes  1



data.weighted <- data.frame(
+ kindergarten_attendance = factor(c(2,2,2,2,2,2,2,2,2,2,1,1,1),  
labels =

c("Yes", "No")) );

You want "case repetition" not case weighting, which I would use as  
a term when working on estimation problems:


>  ( data.weighted <- unlist(sapply(1:NROW(data.recieved),  
function(x) rep(data.recieved[x,1], times=data.recieved[x,2] ))  ) )

 [1] No  No  No  No  No  No  No  No  No  No  Yes Yes Yes
Levels: Yes No




par(mfrow=c(1,2));
plot(data.recieved$kindergarten_attendance,main="What i get");
plot(data.weighted$kindergarten_attendance,main="What i want to get");

Seems to work with the factor vector, although I didn't replicate  
dataframe rows, but I guess you could.




tnx in advance
Hed


David Winsemius, MD
West Hartford, CT




David Winsemius, MD
West Hartford, CT

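For the plotting example earlier in the thread, base R can apply the weights directly without replicating rows: xtabs() sums a weight variable within each factor level. A sketch using the toy data from the thread (keeping the poster's data.recieved spelling):

```r
data.recieved <- data.frame(
  kindergarten_attendance = factor(c(2, 1, 1, 1), labels = c("Yes", "No")),
  weight = c(10, 1, 1, 1)
)

# sum of weights per level -- the weighted frequency table
wtab <- xtabs(weight ~ kindergarten_attendance, data = data.recieved)
wtab
# kindergarten_attendance
# Yes  No
#   3  10

barplot(wtab, main = "What i want to get")  # weighted bar plot
```

For actual estimation with PISA's survey weights (means, variances, regressions), the survey package mentioned above is the safer route.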


Re: [R] help - history()

2012-02-23 Thread Jeff Newmiller
No. But there is nothing preventing you from editing the file once your session 
is complete.
---
Jeff NewmillerThe .   .  Go Live...
DCN:Basics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

"Gian Maria Niccolò Benucci"  wrote:

>Hi Members,
>
>Is there a way to delete a command line from the history?
>For example, if I've typed a line of code that is wrong, can I delete it
>from the history so that it is not saved in the .Rhistory file?
>Thanks for helping,
>
>Gian
>


Re: [R] Case weighting

2012-02-23 Thread Daniel Nordlund
> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> On Behalf Of Hed Bar-Nissan
> Sent: Thursday, February 23, 2012 12:27 PM
> To: David Winsemius
> Cc: r-help@r-project.org
> Subject: Re: [R] Case weighting
> 
> It's really weighting - it's just that my simplified example was too
> simplified
> Here is my real weight vector:
> > sc$W_FSCHWT
>   [1]  14.8579  61.9528   3.0420   2.9929   5.1239  14.7507   2.7535
> 2.2693   3.6658   8.6179   2.5926   2.5390   1.7354   2.9767   9.0477
> 2.6589   3.4040   3.0519
> 
> 
> 
> And still it should somehow set the case weight.
> I could multiply all by 1 and use maybe your method but it would
> create
> such a bloated dataframe
> 
> working with numeric only i could probably create weighted means
> 
> But something simple as WEIGHTED BY would be nice.
> 
> tnx
> Hed
> 
> 
> 
> 
> 
> On Thu, Feb 23, 2012 at 7:43 PM, David Winsemius
> wrote:
> 
> >
> > On Feb 23, 2012, at 10:49 AM, Hed Bar-Nissan wrote:
> >
> >  The need comes from the PISA data. (http://www.pisa.oecd.org)
> >>
> >> In the data there are many cases and each of them carries a numeric
> >> variable that signifies it's weight.
> >> In SPSS the command would be "WEIGHT BY"
> >>
> >> In simpler words here is an R sample ( What is get  VS  what i want to
> >> get )
> >>
> >>
> >>  data.recieved <- data.frame(
> >>>
> >> + kindergarten_attendance = factor(c(2,1,1,1), labels = c("Yes",
> "No")),
> >> + weight=c(10, 1, 1, 1)
> >> + );
> >>
> >>> data.recieved;
> >>>
> >>  kindergarten_attendance weight
> >> 1  No 10
> >> 2 Yes  1
> >> 3 Yes  1
> >> 4 Yes  1
> >>
> >>>
> >>>
> >>>
> >>> data.weighted <- data.frame(
> >>>
> >> + kindergarten_attendance = factor(c(2,2,2,2,2,2,2,2,2,2,1,1,1),
> >> labels =
> >> c("Yes", "No")) );
> >>
> >
> > You want "case repetition" not case weighting, which I would use as a
> term
> > when working on estimation problems:
> >
> > >  ( data.weighted <- unlist(sapply(1:NROW(data.recieved), function(x)
> > rep(data.recieved[x,1], times=data.recieved[x,2] ))  ) )
> >  [1] No  No  No  No  No  No  No  No  No  No  Yes Yes Yes
> > Levels: Yes No
> >
> >
> >
> >>>
> >>> par(mfrow=c(1,2));
> >>> plot(data.recieved$kindergarten_attendance,main="What i get");
> >>> plot(data.weighted$kindergarten_attendance,main="What i want to
> >>> get");
> >>>
> >>
> > Seems to work with the factor vector, although I didn't replicate
> > dataframe rows, but I guess you could.
> >
> >

Are these survey sampling weights?  If so, then you need to be using procedures 
that take the sampling design into account.  Otherwise, your variance estimates 
are going to be all wrong.

Dan

Daniel Nordlund
Bothell, WA USA
 



[R] cor() on sets of vectors

2012-02-23 Thread Sam Steingold
suppose I have two sets of vectors: x1,x2,...,xN and y1,y2,...,yN.
I want N correlations: cor(x1,y1), cor(x2,y2), ..., cor(xN,yN).
my sets of vectors are arranged as data frames x & y (vector=column):

 x <- data.frame(a=rnorm(10),b=rnorm(10),c=rnorm(10))
 y <- data.frame(d=rnorm(10),e=rnorm(10),f=rnorm(10))

cor(x,y) returns a _matrix_ of all pairwise correlations:

 cor(x,y)
          d          e            f
a 0.2763696 -0.3523757 -0.373518870
b 0.5892742 -0.1969161 -0.007159589
c 0.3094301  0.997 -0.094970748

which is _not_ what I want.

I want diag(cor(x,y)) but without the N^2 calculations.

thanks.

-- 
Sam Steingold (http://sds.podval.org/) on Ubuntu 11.10 (oneiric) X 11.0.11004000
http://www.childpsy.net/ http://iris.org.il http://americancensorship.org
http://dhimmi.com http://www.PetitionOnline.com/tap12009/ http://jihadwatch.org
Never argue with an idiot: he has more experience with idiotic arguments.



Re: [R] cor() on sets of vectors

2012-02-23 Thread R. Michael Weylandt
sapply(1:NCOL(x), function(n) cor(x[n], y[n])) is a quick and dirty
way, though probably not optimal.

Michael

On Thu, Feb 23, 2012 at 5:10 PM, Sam Steingold  wrote:
> suppose I have two sets of vectors: x1,x2,...,xN and y1,y2,...,yN.
> I want N correlations: cor(x1,y1), cor(x2,y2), ..., cor(xN,yN).
> my sets of vectors are arranged as data frames x & y (vector=column):
>
>  x <- data.frame(a=rnorm(10),b=rnorm(10),c=rnorm(10))
>  y <- data.frame(d=rnorm(10),e=rnorm(10),f=rnorm(10))
>
> cor(x,y) returns a _matrix_ of all pairwise correlations:
>
>  cor(x,y)
>          d          e            f
> a 0.2763696 -0.3523757 -0.373518870
> b 0.5892742 -0.1969161 -0.007159589
> c 0.3094301  0.997 -0.094970748
>
> which is _not_ what I want.
>
> I want diag(cor(x,y)) but without the N^2 calculations.
>
> thanks.
>
> --
> Sam Steingold (http://sds.podval.org/) on Ubuntu 11.10 (oneiric) X 
> 11.0.11004000
> http://www.childpsy.net/ http://iris.org.il http://americancensorship.org
> http://dhimmi.com http://www.PetitionOnline.com/tap12009/ 
> http://jihadwatch.org
> Never argue with an idiot: he has more experience with idiotic arguments.
>


Re: [R] cor() on sets of vectors

2012-02-23 Thread Bert Gunter
Use 1:n as an index.

e.g.
sapply(1:n, function(i) cor(x[,i],y[,i]))

-- Bert



On Thu, Feb 23, 2012 at 2:10 PM, Sam Steingold  wrote:
> suppose I have two sets of vectors: x1,x2,...,xN and y1,y2,...,yN.
> I want N correlations: cor(x1,y1), cor(x2,y2), ..., cor(xN,yN).
> my sets of vectors are arranged as data frames x & y (vector=column):
>
>  x <- data.frame(a=rnorm(10),b=rnorm(10),c=rnorm(10))
>  y <- data.frame(d=rnorm(10),e=rnorm(10),f=rnorm(10))
>
> cor(x,y) returns a _matrix_ of all pairwise correlations:
>
>  cor(x,y)
>          d          e            f
> a 0.2763696 -0.3523757 -0.373518870
> b 0.5892742 -0.1969161 -0.007159589
> c 0.3094301  0.997 -0.094970748
>
> which is _not_ what I want.
>
> I want diag(cor(x,y)) but without the N^2 calculations.
>
> thanks.
>
> --
> Sam Steingold (http://sds.podval.org/) on Ubuntu 11.10 (oneiric) X 
> 11.0.11004000
> http://www.childpsy.net/ http://iris.org.il http://americancensorship.org
> http://dhimmi.com http://www.PetitionOnline.com/tap12009/ 
> http://jihadwatch.org
> Never argue with an idiot: he has more experience with idiotic arguments.
>



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



Re: [R] TRAMO/SEATS and x12 in R

2012-02-23 Thread John C Frain
Have a look at the gretl econometric package. This has a simplified
interface to X12 and integrates well with R.

Best Regards

John

On Thursday, 23 February 2012, Victor wrote:

> I have a Mac OS X system. To deal with a long monthly electricity demand
> time-series I use the  procedures TRAMO/SEATS with the MS-windows only
> Demetra programme and X12 under R resorting to the awkward - as far as the
> output is concerned - x12 R package running the relating Fortran code.
> I wonder if someone out there has attempted to translate TRAMO/SEATS and
> X12  into R native language?
>
> Ciao from Rome
> Vittorio
>


-- 
John C Frain
Economics Department
Trinity College Dublin
Dublin 2
Ireland
www.tcd.ie/Economics/staff/frainj/home.html
mailto:fra...@tcd.ie
mailto:fra...@gmail.com





Re: [R] cor() on sets of vectors

2012-02-23 Thread ilai
On Thu, Feb 23, 2012 at 3:24 PM, Bert Gunter  wrote:
> Use 1:n as an index.
>
> e.g.
> sapply(1:n, function(i) cor(x[,i],y[,i]))

## sapply is a good solution (the only one I could think of too), but
not always worth it:

# for 100 x 1000
 x <- data.frame(matrix(rnorm(100000),nc=1000))
 y <- data.frame(matrix(rnorm(100000),nc=1000))
 system.time(diag(cor(x,y)))
#   user  system elapsed
#  0.592   0.008   0.623
system.time(sapply(1:1000,function(i) cor(x[,i],y[,i])))
#   user  system elapsed
#  0.384   0.000   0.412

# Great. but for 10 x 1000
x <- data.frame(matrix(rnorm(10000),nc=1000))
y <- data.frame(matrix(rnorm(10000),nc=1000))
system.time(diag(cor(x,y)))
#   user  system elapsed
#  0.256   0.008   0.279
system.time(sapply(1:1000,function(i) cor(x[,i],y[,i])))
#   user  system elapsed
#  0.376   0.000   0.388

# or 100 x 100
 system.time(diag(cor(x,y)))
#   user  system elapsed
#  0.016   0.000   0.014
 system.time(sapply(1:100,function(i) cor(x[,i],y[,i])))
#   user  system elapsed
#  0.036   0.000   0.036

# Not so great.

Bottom line, as always, it depends.

Cheers
Elai




>
> -- Bert
>
>
>
> On Thu, Feb 23, 2012 at 2:10 PM, Sam Steingold  wrote:
>> suppose I have two sets of vectors: x1,x2,...,xN and y1,y2,...,yN.
>> I want N correlations: cor(x1,y1), cor(x2,y2), ..., cor(xN,yN).
>> my sets of vectors are arranged as data frames x & y (vector=column):
>>
>>  x <- data.frame(a=rnorm(10),b=rnorm(10),c=rnorm(10))
>>  y <- data.frame(d=rnorm(10),e=rnorm(10),f=rnorm(10))
>>
>> cor(x,y) returns a _matrix_ of all pairwise correlations:
>>
>>  cor(x,y)
>>          d          e            f
>> a 0.2763696 -0.3523757 -0.373518870
>> b 0.5892742 -0.1969161 -0.007159589
>> c 0.3094301  0.997 -0.094970748
>>
>> which is _not_ what I want.
>>
>> I want diag(cor(x,y)) but without the N^2 calculations.
>>
>> thanks.
>>
>> --
>> Sam Steingold (http://sds.podval.org/) on Ubuntu 11.10 (oneiric) X 
>> 11.0.11004000
>> http://www.childpsy.net/ http://iris.org.il http://americancensorship.org
>> http://dhimmi.com http://www.PetitionOnline.com/tap12009/ 
>> http://jihadwatch.org
>> Never argue with an idiot: he has more experience with idiotic arguments.
>>
>
>
>
> --
>
> Bert Gunter
> Genentech Nonclinical Biostatistics
>
> Internal Contact Info:
> Phone: 467-7374
> Website:
> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
>
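Two more base-R variants for the columnwise correlations discussed above, shown on small random data: mapply() walks the two data frames column by column, and a fully vectorized form uses the identity cor(xi, yi) = sum(scale(xi) * scale(yi)) / (n - 1) for standardized columns:

```r
set.seed(1)
x <- data.frame(a = rnorm(10), b = rnorm(10), c = rnorm(10))
y <- data.frame(d = rnorm(10), e = rnorm(10), f = rnorm(10))

# pair the i-th column of x with the i-th column of y
m1 <- mapply(cor, x, y)

# vectorized: one pass over standardized columns, no N^2 work
m2 <- colSums(scale(x) * scale(y)) / (nrow(x) - 1)

all.equal(unname(m1), unname(m2))  # TRUE
```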


[R] perform t.test by rows and columns in data frame

2012-02-23 Thread Kara Przeczek
Dear R Help,

I have been struggling with this problem without making much headway. I am 
attempting to avoid using a loop, and would appreciate any suggestions you may 
have. I am not well versed in R and apologize in advance if I have missed 
something obvious.



I have a data set with multiple sites along a river where metal concentrations 
were measured. Three sites are located upstream of a mine and three sites are 
located downstream of the mine. I would like to compare the upstream and 
downstream metal levels using a t-test.



The data set looks something like this (but with more metals (25) and sites (6)):

TotalMetals  Mean  Site  Location
Al           6000  1     us
Sb           0.6   1     us
Ba           150   1     us
Al           6500  2     us
Sb           0.7   2     us
Ba           160   2     us
Al           5600  3     ds
Sb           0.8   3     ds
Ba           180   3     ds
Al           170   4     ds
Sb           0.8   4     ds
Ba           175   4     ds



I have tried several variations of by() and aggregate() and tapply() without 
much luck. I thought I had finally got what I wanted with:

by(mr2$Mean, mr2$TotalMetals, function (x) t.test(mr2$Mean[mr2$Location=="us"], 
mr2$Mean[mr2$Location=="ds"]))



However, the output, although grouped by metal, had identical results for each 
metal with means for "x and y" equivalent to the mean of all metals within each 
site.

mean(mr2$Mean[mr2$Location=="us"]) #gave the x mean from the output and,

mean(mr2$Mean[mr2$Location=="ds"]) #gave the same y mean from the output





I can get the answer I want by performing the t-test for each metal 
individually with:



y=mr2[mr2$TotalMetals=="Al",]

t.test(y$Mean[y$Location=="us"], y$Mean[y$Location=="ds"])



But it would be painstaking to do this for each metal. In addition the data set 
will be getting larger in the future.

It would also be nice to collect the output in a table or similar format for 
easy output, if possible.



I would greatly appreciate any help that you could provide!
Thank you,

Kara



Natural Resources and Environmental Studies, MSc

University of Northern B.C.

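A sketch of the split-then-test approach for this question: one t.test() per metal via split() and lapply(), then the key numbers collected into a table. The data below recreate the toy example from the post:

```r
# toy data in the poster's layout
mr2 <- data.frame(
  TotalMetals = rep(c("Al", "Sb", "Ba"), times = 4),
  Mean = c(6000, 0.6, 150, 6500, 0.7, 160, 5600, 0.8, 180, 170, 0.8, 175),
  Site = rep(1:4, each = 3),
  Location = rep(c("us", "ds"), each = 6)
)

# one Welch t-test per metal, upstream vs downstream
tests <- lapply(split(mr2, mr2$TotalMetals),
                function(d) t.test(Mean ~ Location, data = d))

# collect statistic and p-value into one table for easy output
out <- t(sapply(tests, function(tt)
  c(t = unname(tt$statistic), p.value = tt$p.value)))
out
```

Scaling up only needs more rows; split() handles any number of metals, and write.csv(out, "ttests.csv") exports the summary table.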


Re: [R] perform t.test by rows and columns in data frame

2012-02-23 Thread Kara Przeczek
Sorry. I forgot to note that I am using R version 2.8.0.


From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] on behalf of 
Kara Przeczek [przec...@unbc.ca]
Sent: February 23, 2012 3:13 PM
To: r-help@r-project.org
Subject: [R] perform t.test by rows and columns in data frame

Dear R Help,

I have been struggling with this problem without making much headway. I am 
attempting to avoid using a loop, and would appreciate any suggestions you may 
have. I am not well versed in R and apologize in advance if I have missed 
something obvious.



I have a data set with multiple sites along a river where metal concentrations 
were measured. Three sites are located upstream of a mine and three sites are 
located downstream of the mine. I would like to compare the upstream and 
downstream metal levels using a t-test.






Re: [R] cor() on sets of vectors

2012-02-23 Thread Bert Gunter
Elai:

Thank you. You make an excellent point. cor() is implemented at the C
level (via a .Internal call), whereas sapply() runs an interpreted loop
that has to issue the call on each iteration (with some shortcuts/tricks
to reduce overhead). So the operations count of the original poster is
completely bogus.

As you say, "it depends...". For this reason, it is generally a bad
idea to spend much time on code efficiency unless you really need to,
which these days is not often (though there are certainly arenas where
this statement is false). More important is to focus on code clarity,
flexibility, debuggability, etc.

Best,
Bert
On Thu, Feb 23, 2012 at 2:52 PM, ilai  wrote:
> On Thu, Feb 23, 2012 at 3:24 PM, Bert Gunter  wrote:
>> Use 1:n as an index.
>>
>> e.g.
>> sapply(1:n, function(i) cor(x[,i],y[,i]))
>
> ## sapply is a good solution (the only one I could think of too), but
> not always worth it:
>
> # for 100 x 1000
>  x <- data.frame(matrix(rnorm(1e5),nc=1000))
>  y <- data.frame(matrix(rnorm(1e5),nc=1000))
>  system.time(diag(cor(x,y)))
> #   user  system elapsed
> #  0.592   0.008   0.623
> system.time(sapply(1:1000,function(i) cor(x[,i],y[,i])))
> #   user  system elapsed
> #  0.384   0.000   0.412
>
> # Great. But for 10 x 1000
> x <- data.frame(matrix(rnorm(1e4),nc=1000))
> y <- data.frame(matrix(rnorm(1e4),nc=1000))
> system.time(diag(cor(x,y)))
> #   user  system elapsed
> #  0.256   0.008   0.279
> system.time(sapply(1:1000,function(i) cor(x[,i],y[,i])))
> #   user  system elapsed
> #  0.376   0.000   0.388
>
> # or 100 x 100
>  system.time(diag(cor(x,y)))
> #   user  system elapsed
> #  0.016   0.000   0.014
>  system.time(sapply(1:100,function(i) cor(x[,i],y[,i])))
> #   user  system elapsed
> #  0.036   0.000   0.036
>
> # Not so great.
>
> Bottom line, as always, it depends.
>
> Cheers
> Elai
>
>
>
>
>>
>> -- Bert
>>
>>
>>
>> On Thu, Feb 23, 2012 at 2:10 PM, Sam Steingold  wrote:
>>> suppose I have two sets of vectors: x1,x2,...,xN and y1,y2,...,yN.
>>> I want N correlations: cor(x1,y1), cor(x2,y2), ..., cor(xN,yN).
>>> my sets of vectors are arranged as data frames x & y (vector=column):
>>>
>>>  x <- data.frame(a=rnorm(10),b=rnorm(10),c=rnorm(10))
>>>  y <- data.frame(d=rnorm(10),e=rnorm(10),f=rnorm(10))
>>>
>>> cor(x,y) returns a _matrix_ of all pairwise correlations:
>>>
>>>  cor(x,y)
>>>          d          e            f
>>> a 0.2763696 -0.3523757 -0.373518870
>>> b 0.5892742 -0.1969161 -0.007159589
>>> c 0.3094301  0.997 -0.094970748
>>>
>>> which is _not_ what I want.
>>>
>>> I want diag(cor(x,y)) but without the N^2 calculations.
>>>
>>> thanks.
>>>
>>> --
>>> Sam Steingold (http://sds.podval.org/) on Ubuntu 11.10 (oneiric) X 
>>> 11.0.11004000
>>> http://www.childpsy.net/ http://iris.org.il http://americancensorship.org
>>> http://dhimmi.com http://www.PetitionOnline.com/tap12009/ 
>>> http://jihadwatch.org
>>> Never argue with an idiot: he has more experience with idiotic arguments.
>>>
>>
>>
>>
>> --
>>
>> Bert Gunter
>> Genentech Nonclinical Biostatistics
>>
>> Internal Contact Info:
>> Phone: 467-7374
>> Website:
>> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
>>



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
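Following up on the thread above: the diagonal-only correlations Sam asked for can also be computed without the N-squared cor() matrix and without an explicit sapply() loop, by standardizing the columns first. A small sketch (not from the original thread):

```r
set.seed(1)
x <- data.frame(a = rnorm(10), b = rnorm(10), c = rnorm(10))
y <- data.frame(d = rnorm(10), e = rnorm(10), f = rnorm(10))

# scale() centers each column and divides by its sample sd, so summing the
# elementwise products of matching columns and dividing by (n - 1) yields
# exactly the columnwise correlations, i.e. diag(cor(x, y)), with work
# proportional to n * N rather than n * N^2.
n <- nrow(x)
d1 <- colSums(scale(x) * scale(y)) / (n - 1)

stopifnot(isTRUE(all.equal(unname(d1), unname(diag(cor(x, y))))))
```

This stays fully vectorized at the C level, so it avoids both the wasted off-diagonal computation and the per-column interpreted calls discussed in the timings above.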