Re: [R] TukeyHSD with Type III SS

2010-12-02 Thread Dieter Menne


Ali S wrote:
> 
> Is there a way to run TukeyHSD with Type III instead of Type I SS?
> 

I am sure Bill Venables will respond, but in the meantime read:

<>

And search Google for "Venables exegesis". It's a usefully unique search term.

Dieter

-- 
View this message in context: 
http://r.789695.n4.nabble.com/TukeyHSD-with-Type-III-SS-tp3068157p3068654.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Why do I get (not estimable) in this aov data

2010-12-02 Thread Hedberg Peter
I realize the problem now, and you are right that aov() was the wrong tool.
It seems it was simply wrong to use Transect as a fixed term in the formula,
since my transects are labeled A to P, with A to D in one mainplot, E to H in
another, I to L in a third, and M to P in a fourth. In this case lme(), with
Transect as a random effect, is better. Thank you for the help.

Best regards, Petter



Peter Dalgaard wrote:

> On Nov 30, 2010, at 13:58 , Hedberg Peter wrote:
> 
> >> aov_data
> > Call:
> >   aov(formula = Species1 ~ Site + Obstacle + Treatment + 
> > as.factor(Dist_Obstacle) + 
> >as.factor(Dist_start) + Transect + Mainplot + Obsplot)
> > 
> > Terms:
> > Site  Obstacle Treatment as.factor(Dist_Obstacle) 
> > as.factor(Dist_start)  Transect   Obsplot Residuals
> > Sum of Squares   2143.984   446.274   340.042  736.073  
> >  173.707   800.270  4014.378 17238.625
> > Deg. of Freedom 2 1 14  
> >   31060   271
> > 
> > Residual standard error: 7.97566 
> > 27 out of 109 effects not estimable
> > Estimated effects may be unbalanced
> > 
> > 
> > 
> > My question is why do I get "effects not estimable" and "effects may be 
> > unbalanced". I have checked the data and it is balanced.
> 
> Unfortunately, your attachment did not contain the data, but the sum of the 
> Deg. of Freedom  above is 352, suggesting that the observation count is 353, 
> which is prime, so I find it difficult to believe that you have balanced data 
> in the sense of a complete factorial design, even for a subset of your 
> factors. A complete factorial with those DF would take more than 16 
> observations!
> 
> I suspect that aov() is simply the wrong tool for these data. lm() will do 
> it, but watch out for the aliased effects.
> 
> -- 
> Peter Dalgaard
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Email: pd@cbs.dk  Priv: pda...@gmail.com




[R] Arrange elements on a matrix according to rowSums + short 'apply' Q

2010-12-02 Thread Aaron Polhamus
Greetings,

My goal is to create a Markov transition matrix (probability of moving from
one state to another) with the 'highest traffic' portion of the matrix
occupying the top-left section. Consider the following sample:

inputData <- c(
c(5, 3, 1, 6, 7),
c(9, 7, 3, 10, 11),
c(1, 2, 3, 4, 5),
c(2, 4, 6, 8, 10),
c(9, 5, 2, 1, 1)
)

MAT <- matrix(inputData, nrow = 5, ncol = 5, byrow = TRUE)
colnames(MAT) <- c("A", "B", "C", "D", "E")
rownames(MAT) <- c("A", "B", "C", "D", "E")

rowSums(MAT)

I want to re-arrange the elements of this matrix such that the elements with
the largest row sums are placed to the top-left, in descending order. Does
this make sense? In this case the order I'm looking for would be B, D, A, E,
C. Any thoughts?

As an aside, here is the function I've written to construct the transition
matrix. Is there a more elegant way to do this that doesn't involve a double
transpose?

TMAT <- apply(t(MAT), 2, function(X) X/sum(X))
TMAT <- t(TMAT)

I tried the following:

TMAT <- apply(MAT, 1, function(X) X/sum(X))

But the custom function still gets applied over the columns of the
array, rather than the rows. For a check, try:

rowSums(TMAT)
colSums(TMAT)

Row sums here should equal 1...

Many thanks in advance,
Aaron

[[alternative HTML version deleted]]



Re: [R] venneuler() - customize a few things

2010-12-02 Thread Francesco Lescai
Hi,
I was wondering if it would be possible to print the counts of common 
items in the venneuler() diagrams.
Anybody knows a straightforward idea?
I can always count things from the data matrix, but I have vectors with 
50 thousands elements.
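For the counting itself, base R's set operations handle vectors of this size quickly; a minimal sketch (the vectors a and b and their sizes are invented for illustration, and the final comment assumes venneuler()'s named-weight interface):

```r
# Invented example: two large ID vectors standing in for the real data.
set.seed(1)
a <- sample(1e6, 5e4)   # 50,000 unique IDs
b <- sample(1e6, 5e4)

n_ab <- length(intersect(a, b))  # items common to both sets
n_a  <- length(setdiff(a, b))    # items unique to a
n_b  <- length(setdiff(b, a))    # items unique to b

# These counts are the weights an area-proportional diagram is built from,
# e.g. (assumption) venneuler(c(A = n_a, B = n_b, "A&B" = n_ab)).
```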

thanks
Francesco
-- 
--
*Francesco Lescai, PhD*
Research Associate in Genome Analysis
Faculty of Biomedical Sciences, Division of Research Strategy
c/o UCL Cancer Institute, Paul O'Gorman Building
University College London
Gower Street WC1E 6BT London, UK

visiting address: 72, Huntley Street
WC1E 6DE London

email: f.les...@ucl.ac.uk
phone: +44.(0)20.7679.6812
[ext: 46812]
--

[[alternative HTML version deleted]]



Re: [R] Arrange elements on a matrix according to rowSums + short 'apply' Q

2010-12-02 Thread Ivan Calandra

Hi,

Here is a not particularly elegant way to do your first step, but it works:
MAT2 <- cbind(MAT, rowSums(MAT))
MAT[order(MAT2[,6], decreasing=TRUE),]

For the second, I don't know!

HTH,
Ivan


On 12/2/2010 09:46, Aaron Polhamus wrote:

Greetings,

My goal is to create a Markov transition matrix (probability of moving from
one state to another) with the 'highest traffic' portion of the matrix
occupying the top-left section. Consider the following sample:

inputData<- c(
 c(5, 3, 1, 6, 7),
 c(9, 7, 3, 10, 11),
 c(1, 2, 3, 4, 5),
 c(2, 4, 6, 8, 10),
 c(9, 5, 2, 1, 1)
 )

MAT<- matrix(inputData, nrow = 5, ncol = 5, byrow = TRUE)
colnames(MAT)<- c("A", "B", "C", "D", "E")
rownames(MAT)<- c("A", "B", "C", "D", "E")

rowSums(MAT)

I wan to re-arrange the elements of this matrix such that the elements with
the largest row sums are placed to the top-left, in descending order. Does
this make sense? In this case the order I'm looking for would be B, D, A, E,
C Any thoughts?

As an aside, here is the function I've written to construct the transition
matrix. Is there a more elegant way to do this that doesn't involve a double
transpose?

TMAT<- apply(t(MAT), 2, function(X) X/sum(X))
TMAT<- t(TMAT)

I tried the following:

TMAT<- apply(MAT, 1, function(X) X/sum(X))

But my the custom function is still getting applied over the columns of the
array, rather than the rows. For a check try:

rowSums(TMAT)
colSums(TMAT)

Row sums here should equal 1...

Many thanks in advance,
Aaron

[[alternative HTML version deleted]]




--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de

**
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php



[R] using foreach (parallel processing)

2010-12-02 Thread Santosh Srinivas
Hello group,

I am experimenting with parallel processing on my quad core Win 7 32
bit machine. Using these packages for the first time.

I can see all my processor running at full performance when I use a
smaller dataset

require(snow)
require(doSNOW)
require(foreach)
# change the 4 to however many cores/physical processors you have on your machine
cl.tmp = makeCluster(rep("localhost",4), type="SOCK")
registerDoSNOW(cl.tmp)


optData.df <- head(pristine,10)

system.time(
optData.df$impliedVol <-
foreach(i=1:NROW(optData.df),.packages="RQuantLib") %dopar%
with(optData.df[i,],
tryCatch({EuropeanOptionImpliedVolatility(type,value,underlying,
strike, dividendYield, riskFreeRate, maturity,
volatility)$impliedVol},
error = function (ex){0}))
)

This works fine!

PROBLEM: However, when I do the same but with optData.df <- pristine
... which has about 3.8 million option records ... the cores do not seem
to be fully utilized (they seem to run at only 25%).

I noticed a slight delay before the processing started even when I ran
the 100k dataset ... do I need to wait longer for any allocations to be
done?

Thank you.



Re: [R] Arrange elements on a matrix according to rowSums + short 'apply' Q

2010-12-02 Thread Michael Bedward
Hi Aaron,

Following up on Ivan's suggestion, if you want the column order to
mirror the row order...

mo <- order(rowSums(MAT), decreasing=TRUE)
MAT2 <- MAT[mo, mo]

Also, you don't need all those extra c() calls when creating
inputData, just the outermost one.

Regarding your second question, your statements...

TMAT <- apply(t(MAT), 2, function(X) X/sum(X))
TMAT <- t(TMAT)

are actually just a complicated way of doing this...

TMAT <- MAT / rowSums(MAT)

You can confirm that by doing it your way and then this...

TMAT == MAT / rowSums(MAT)

...and you should see a matrix of TRUE values
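For completeness, the same row-normalisation has a few equivalent base-R spellings; sweep() and prop.table() make the margin explicit, and the last one shows why the apply() version needs the outer t() (a sketch using the matrix from the original post):

```r
# The matrix from the original post.
inputData <- c(5, 3, 1, 6, 7,
               9, 7, 3, 10, 11,
               1, 2, 3, 4, 5,
               2, 4, 6, 8, 10,
               9, 5, 2, 1, 1)
MAT <- matrix(inputData, nrow = 5, byrow = TRUE,
              dimnames = list(LETTERS[1:5], LETTERS[1:5]))

t1 <- MAT / rowSums(MAT)                        # recycling matches sums to rows
t2 <- sweep(MAT, 1, rowSums(MAT), "/")          # explicit about the margin
t3 <- prop.table(MAT, margin = 1)               # same thing in one call
t4 <- t(apply(MAT, 1, function(x) x / sum(x)))  # apply() returns rows as columns

stopifnot(all.equal(t1, t2), all.equal(t1, t3), all.equal(t1, t4))
```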

Michael


On 2 December 2010 20:43, Ivan Calandra  wrote:
> Hi,
>
> Here is a not so easy way to do your first step, but it works:
> MAT2 <- cbind(MAT, rowSums(MAT))
> MAT[order(MAT2[,6], decreasing=TRUE),]
>
> For the second, I don't know!
>
> HTH,
> Ivan
>
>
> On 12/2/2010 09:46, Aaron Polhamus wrote:
>>
>> Greetings,
>>
>> My goal is to create a Markov transition matrix (probability of moving
>> from
>> one state to another) with the 'highest traffic' portion of the matrix
>> occupying the top-left section. Consider the following sample:
>>
>> inputData<- c(
>>     c(5, 3, 1, 6, 7),
>>     c(9, 7, 3, 10, 11),
>>     c(1, 2, 3, 4, 5),
>>     c(2, 4, 6, 8, 10),
>>     c(9, 5, 2, 1, 1)
>>     )
>>
>> MAT<- matrix(inputData, nrow = 5, ncol = 5, byrow = TRUE)
>> colnames(MAT)<- c("A", "B", "C", "D", "E")
>> rownames(MAT)<- c("A", "B", "C", "D", "E")
>>
>> rowSums(MAT)
>>
>> I wan to re-arrange the elements of this matrix such that the elements
>> with
>> the largest row sums are placed to the top-left, in descending order. Does
>> this make sense? In this case the order I'm looking for would be B, D, A,
>> E,
>> C Any thoughts?
>>
>> As an aside, here is the function I've written to construct the transition
>> matrix. Is there a more elegant way to do this that doesn't involve a
>> double
>> transpose?
>>
>> TMAT<- apply(t(MAT), 2, function(X) X/sum(X))
>> TMAT<- t(TMAT)
>>
>> I tried the following:
>>
>> TMAT<- apply(MAT, 1, function(X) X/sum(X))
>>
>> But my the custom function is still getting applied over the columns of
>> the
>> array, rather than the rows. For a check try:
>>
>> rowSums(TMAT)
>> colSums(TMAT)
>>
>> Row sums here should equal 1...
>>
>> Many thanks in advance,
>> Aaron
>>
>>        [[alternative HTML version deleted]]
>>
>>
>
> --
> Ivan CALANDRA
> PhD Student
> University of Hamburg
> Biozentrum Grindel und Zoologisches Museum
> Abt. Säugetiere
> Martin-Luther-King-Platz 3
> D-20146 Hamburg, GERMANY
> +49(0)40 42838 6231
> ivan.calan...@uni-hamburg.de
>
> **
> http://www.for771.uni-bonn.de
> http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php
>
>



Re: [R] using foreach (parallel processing)

2010-12-02 Thread Rainer M Krug
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 12/02/2010 10:56 AM, Santosh Srinivas wrote:
> Hello group,

Hi

First of all: there is a special interest group for high performance
computing in R, which includes parallel computing, and which would be a more
appropriate place to ask. But I will try anyway:

> 
> I am experimenting with parallel processing on my quad core Win 7 32
> bit machine. Using these packages for the first time.
> 
> I can see all my processor running at full performance when I use a
> smaller dataset
> 
> require(snow)
> require(doSNOW)
> require(foreach)
> #change the 8 to however many cores\phys processors you have on your machine
> cl.tmp = makeCluster(rep("localhost",4), type="SOCK")
> registerDoSNOW(cl.tmp)
> 
> 
> optData.df <- head(pristine,10)
> 
> system.time(
> optData.df$impliedVol <-
> foreach(i=1:NROW(optData.df),.packages="RQuantLib") %dopar%
> with(optData.df[i,],
>   tryCatch({EuropeanOptionImpliedVolatility(type,value,underlying,
> strike, dividendYield, riskFreeRate, maturity,
> volatility)$impliedVol},
>   error = function (ex){0}))
> )
> 
> This works fine!
> 
> PROBLEM: However, when I do the same but with optData.df <- pristine
> ... which has about 3.8 million options data ... the cores do not seem
> to be fully utilized (they seem to run at 25%).
> 
> I noticed some slight delay before the processing starts running ...
> when I did with the 100k dataset ... do i need to wait longer for any
> allocations to be done?

Communication to set up the threads could definitely take some time.
So why don't you try to increase from 100,000 to 1,000,000 and see how
long it takes to initialize the threads?

You don't mention how long you waited.
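One common way to shrink that setup and communication overhead is to send each worker a block of rows rather than one row per task. A rough, self-contained sketch (the dummy slow_sqrt() stands in for the RQuantLib pricing call, and the chunk count is a tuning guess):

```r
require(snow)
require(doSNOW)
require(foreach)

cl <- makeCluster(rep("localhost", 4), type = "SOCK")
registerDoSNOW(cl)

slow_sqrt <- function(x) { Sys.sleep(0.001); sqrt(x) }  # stand-in workload

n      <- 1000
chunks <- split(seq_len(n), cut(seq_len(n), 40))  # ~10 tasks per core, not n tasks

res <- foreach(idx = chunks, .combine = c, .export = "slow_sqrt") %dopar%
  sapply(idx, slow_sqrt)

stopCluster(cl)
```

Each %dopar% task now carries 25 rows, so the fixed cost of shipping work to a worker is paid 40 times instead of 1,000 times.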

Cheers,

Rainer

> 
> Thank you.
> 


- -- 
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation
Biology, UCT), Dipl. Phys. (Germany)

Centre of Excellence for Invasion Biology
Natural Sciences Building
Office Suite 2039
Stellenbosch University
Main Campus, Merriman Avenue
Stellenbosch
South Africa

Tel:+33 - (0)9 53 10 27 44
Cell:   +27 - (0)8 39 47 90 42
Fax (SA):   +27 - (0)8 65 16 27 82
Fax (D) :   +49 - (0)3 21 21 25 22 44
Fax (FR):   +33 - (0)9 58 10 27 44
email:  rai...@krugs.de

Skype:  RMkrug
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkz3b5YACgkQoYgNqgF2egrqtQCfQKXKZln5tz5eZDTYBmbFvruH
H/4AoIOXZJ7F3lKL6QLRkaQPHZbqBDQn
=odY/
-END PGP SIGNATURE-



Re: [R] attempted merge() returns: cannot coerce type 'closure' to vector of type 'any'

2010-12-02 Thread Karl Brand

Cheers Bill.

You got me halfway, since:

> temp <- merge(x=x, y=y[,17, drop=FALSE], by="rownames", sort=FALSE)
Error in fix.by(by.x, x) : 'by' must specify valid column(s)

but, using "row.names" instead of "rownames", like:
> temp <- merge(x=x, y=y[,17, drop=FALSE], by="row.names", sort=FALSE)

works (but adds a column "Row.names").

Which seems somewhat counterintuitive to me, since I am feeding two
matrices into the merge function, which I understand have 'rownames', not
the 'row.names' that data frames have, right? Although the output of merge()
is a data frame...
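For what it's worth, merge() coerces matrix arguments to data frames before merging, which is why the data-frame key name "row.names" (and the "Row.names" output column) applies. A minimal sketch with made-up 2x2 matrices, including one way to fold the helper column back into real row names:

```r
# Two small matrices with the same row names in different orders.
x <- matrix(1:4, nrow = 2, dimnames = list(c("a", "b"), c("x1", "x2")))
y <- matrix(5:8, nrow = 2, dimnames = list(c("b", "a"), c("y1", "y2")))

m <- merge(x, y, by = "row.names", sort = FALSE)  # matrices coerced to data frames

rownames(m) <- m$Row.names  # promote the helper column back to row names...
m$Row.names <- NULL         # ...and drop it
```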


thanks again,

Karl


On 12/1/2010 6:08 PM, William Dunlap wrote:

Try
by="rownames"
instead of
by=rownames

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


-Original Message-
From: r-help-boun...@r-project.org
[mailto:r-help-boun...@r-project.org] On Behalf Of Karl Brand
Sent: Wednesday, December 01, 2010 6:35 AM
To: Dimitris Rizopoulos
Cc: r-help@r-project.org
Subject: [R] attempted merge() returns: cannot coerce type
'closure' to vector of type 'any'

Hi Dimtris and esteemed useRs,

I don't understand why i get this error message when
attempting to use
merge() -

  >  temp<- merge(x, y[,17, drop=FALSE], by=rownames, sort=FALSE)
Error in as.vector(x, mode) :
cannot coerce type 'closure' to vector of type 'any'

It should work because:

  >  all(rownames(x[order(rownames(x)),]) ==
+ rownames(y[order(rownames(y[,17, drop=FALSE])),17,
drop=FALSE]) 
[TRUNCATED]
[1] TRUE

also:

  >  class(x); class(y[,17, drop=FALSE])
[1] "matrix"
[1] "matrix"


Any idea why i cant use merge() in the normal way here? I'm forced to
add the column using:

temp.b<- cbind(x, y[match(rownames(x), rownames(y)),17])

All insights appreciated for this leaRner,

cheers,

Karl


--
Karl Brand
Department of Genetics
Erasmus MC
Dr Molewaterplein 50
3015 GE Rotterdam
P +31 (0)10 704 3211 | F +31 (0)10 704 4743 | M +31 (0)642 777 268




--
Karl Brand 
Department of Genetics
Erasmus MC
Dr Molewaterplein 50
3015 GE Rotterdam
P +31 (0)10 704 3211 | F +31 (0)10 704 4743 | M +31 (0)642 777 268



[R] rpart results - problem after oversampling

2010-12-02 Thread Mederos, Vicente (Santander)
Hi all,

I am trying to predict a target variable that takes values 0 or 1 using the 
rpart command. In my initial dataset I have few positive observations of the 
target variable; therefore I have oversampled the rare event by a multiple of 6 
(i.e. from 762 to 4572).

However, in my results, I end up with a number of positives in one of the 
terminal nodes that is not divisible by 6. As I have the same observation 
repeated 6 times, shouldn't all of them follow the same branch of the tree and 
go to the same terminal node?

Thanks for your help,

Vicente

[[alternative HTML version deleted]]



Re: [R] attempted merge() returns: cannot coerce type 'closure' to vector of type 'any'

2010-12-02 Thread David Winsemius


On Dec 2, 2010, at 5:22 AM, Karl Brand wrote:


Cheers Bill.

You got me halfway, since:

> temp <- merge(x=x, y=y[,17, drop=FALSE], by="rownames", sort=FALSE)
Error in fix.by(by.x, x) : 'by' must specify valid column(s)

but, using "row.names" instead of "rownames", like:
> temp <- merge(x=x, y=y[,17, drop=FALSE], by="row.names", sort=FALSE)

works (but adds a column "Row.names").

Which seems some what counter intuitive to me since i am feeding in  
two matrices to the merge function, which i understand have  
'rownames', not 'row.names' as data frames have, right? Although the  
output of merge() is a data frame...




This reminds me of physicists using high energy particles to  
investigate the structure of the nucleus. But you have alternatives to  
what might be called "destructive debugging through binoculars".  
Instead of throwing code at your data and asking the assembled  
audience of soothsayers to tell you what went wrong by examining the  
debris, why don't you show us what the data objects actually look like?


--
David.


thanks again,

Karl


On 12/1/2010 6:08 PM, William Dunlap wrote:

Try
   by="rownames"
instead of
   by=rownames

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


-Original Message-
From: r-help-boun...@r-project.org
[mailto:r-help-boun...@r-project.org] On Behalf Of Karl Brand
Sent: Wednesday, December 01, 2010 6:35 AM
To: Dimitris Rizopoulos
Cc: r-help@r-project.org
Subject: [R] attempted merge() returns: cannot coerce type
'closure' to vector of type 'any'

Hi Dimtris and esteemed useRs,

I don't understand why i get this error message when
attempting to use
merge() -

 >  temp<- merge(x, y[,17, drop=FALSE], by=rownames, sort=FALSE)
Error in as.vector(x, mode) :
   cannot coerce type 'closure' to vector of type 'any'

It should work because:

 >  all(rownames(x[order(rownames(x)),]) ==
+ rownames(y[order(rownames(y[,17, drop=FALSE])),17,
drop=FALSE]) 
[TRUNCATED]
[1] TRUE

also:

 >  class(x); class(y[,17, drop=FALSE])
[1] "matrix"
[1] "matrix"


Any idea why i cant use merge() in the normal way here? I'm forced  
to

add the column using:

temp.b<- cbind(x, y[match(rownames(x), rownames(y)),17])

All insights appreciated for this leaRner,

cheers,

Karl


--
Karl Brand
Department of Genetics
Erasmus MC
Dr Molewaterplein 50
3015 GE Rotterdam
P +31 (0)10 704 3211 | F +31 (0)10 704 4743 | M +31 (0)642 777 268




--
Karl Brand 
Department of Genetics
Erasmus MC
Dr Molewaterplein 50
3015 GE Rotterdam
P +31 (0)10 704 3211 | F +31 (0)10 704 4743 | M +31 (0)642 777 268



David Winsemius, MD
West Hartford, CT



Re: [R] attempted merge() returns: cannot coerce type 'closure' to vector of type 'any'

2010-12-02 Thread Karl Brand



On 12/2/2010 11:36 AM, David Winsemius wrote:


On Dec 2, 2010, at 5:22 AM, Karl Brand wrote:


Cheers Bill.

You got me halfway, since:

> temp <- merge(x=x, y=y[,17, drop=FALSE], by="rownames", sort=FALSE)
Error in fix.by(by.x, x) : 'by' must specify valid column(s)

but, using "row.names" instead of "rownames", like:
> temp <- merge(x=x, y=y[,17, drop=FALSE], by="row.names", sort=FALSE)

works (but adds a column "Row.names").

Which seems some what counter intuitive to me since i am feeding in
two matrices to the merge function, which i understand have
'rownames', not 'row.names' as data frames have, right? Although the
output of merge() is a data frame...



This reminds me of physicists using high energy particles to investigate
the structure of the nucleus. But you have alternatives to what might be
called "destructive debugging through binoculars". Instead of throwing
code at your data and asking the assembled audience of soothsayers to
tell you what went wrong by examining the debris, why don't you show us
what the data objects actually look like?



Of course, David - sorry, my laziness.

Is str() enough to put away the binoculars and maybe highlight the
source of my ignorance?


> str(x)
 num [1:140, 1:7] 4.93e-02 2.83e-02 1.07e-02 2.68e-02 3.92e-05 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:140] "1415743_at" "1415887_at" "1416332_at" 
"1416340_a_at" ...

  ..$ : chr [1:7] "R.S.adj.pvalue" "R.S.Tau" "R.S.xpiek" "R.S.xdal" ...

> str(y)
 num [1:140, 1:18] 0.573 1 0.752 0.768 0.399 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:140] "1427982_s_at" "1430520_at" "1454086_a_at" 
"1419652_s_at" ...

  ..$ : chr [1:18] "R.S.1" "R.S.2" "R.S.3" "R.S.4" ...
> source(.trPaths[5], echo=TRUE, max.deparse.length=150)

> str(temp)
'data.frame':   140 obs. of  9 variables:
 $ Row.names :Class 'AsIs'  chr [1:140] "1415743_at" "1415887_at" 
"1416332_at" "1416340_a_at" ...

 $ R.S.adj.pvalue: num  4.93e-02 2.83e-02 1.07e-02 2.68e-02 3.92e-05 ...
 $ R.S.Tau   : num  21.6 23.6 26.6 29.7 18.8 20 24.6 27.9 23.9 22.7 ...
 $ R.S.xpiek : num  6.74 17.46 15.81 15.39 14.73 ...
 $ R.S.xdal  : num  16.94 1.76 22.8 1.12 5.41 ...
 $ R.S.p1: num  0.0004 0.0001 0.0467 0.0024 0 ...
 $ R.S.p2: num  0.0039 0 0.002 0.019 0.0001 0.0035 0.0351 
0.0126 0.0028 0.0192 ...
 $ R.S.p3: num  0.0178 0.0004 0.0101 0.0312 0.0012 0.022 0.0008 
0.0151 0.0048 0.0317 ...

 $ V1: num  3.96 -0.94 1.04 1.94 -2.53 ...
>




--
Karl Brand 
Department of Genetics
Erasmus MC
Dr Molewaterplein 50
3015 GE Rotterdam
P +31 (0)10 704 3211 | F +31 (0)10 704 4743 | M +31 (0)642 777 268



[R] Hmm Topology restriction

2010-12-02 Thread Kishor Tappita
Dear List,

We are using RHmm to cluster data through an HMM. We would like to
restrict the transition matrix of the HMM, to get hierarchical connections
between clusters, but RHmm doesn't seem to support such
restrictions. Can anyone suggest a library that does?

Thanks,
Kishor



Re: [R] using foreach (parallel processing)

2010-12-02 Thread Mike Marchywka








> Date: Thu, 2 Dec 2010 11:06:14 +0100
> From: r.m.k...@gmail.com
> To: santosh.srini...@gmail.com
> CC: r-help@r-project.org
> Subject: Re: [R] using foreach (parallel processing)
>
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
>
> On 12/02/2010 10:56 AM, Santosh Srinivas wrote:
> > Hello group,
>
> Hi
> >
> > I am experimenting with parallel processing on my quad core Win 7 32
> > bit machine. Using these packages for the first time.
> >
> > I can see all my processor running at full performance when I use a
> > smaller dataset
[...]
> >
> > PROBLEM: However, when I do the same but with optData.df <- pristine
> > ... which has about 3.8 million options data ... the cores do not seem
> > to be fully utilized (they seem to run at 25%).
> >
> > I noticed some slight delay before the processing starts running ...
> > when I did with the 100k dataset ... do i need to wait longer for any
> > allocations to be done?
>
> Communication to setup the threads ould definitly take some time.
> So why don't you try to increase from 100.000 to 1.000.000 and see how
> long it takes to initialize the threads?
>
> You are not mentioning how long you wait?

Recent 'dohs releases including windohs 7 have more task manager capabilities
although afaik it is still hard to reduce the display to text for easy sharing.
One thing to look at is disk usage and page faults. Again, it is easy for IO
to take longer than processing. Usually cores end up fighting
with each other for memory eventually causing VM thrashing.

I posted a link to an IEEE blurb here, showing non-monotonic
performance results as a function of cores used (I can't remember
now if this was cores or processors, but you get the idea):

http://lists.boost.org/boost-users/2008/11/42263.php

You can generally expect performance gains if each core
is off doing its own thing, not competing with the others
for memory, disk, or other limited resources.




>
> Cheers,
>
> Rainer
>

  


Re: [R] using foreach (parallel processing)

2010-12-02 Thread Santosh Srinivas
Thanks. I guess I had to wait long enough to get the threads running,
and now they seem to be in top gear.
Hope this thing doesn't crash now!


On Thu, Dec 2, 2010 at 3:36 PM, Rainer M Krug  wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
>
> On 12/02/2010 10:56 AM, Santosh Srinivas wrote:
>> Hello group,
>
> Hi
>
> First of all: there is a special interest group for high performance
> computing in R, which includes parallel computing, wghich would be more
> appropriate to ask. But I will try anyway:
>
>>
>> I am experimenting with parallel processing on my quad core Win 7 32
>> bit machine. Using these packages for the first time.
>>
>> I can see all my processor running at full performance when I use a
>> smaller dataset
>>
>> require(snow)
>> require(doSNOW)
>> require(foreach)
>> #change the 8 to however many cores\phys processors you have on your machine
>> cl.tmp = makeCluster(rep("localhost",4), type="SOCK")
>> registerDoSNOW(cl.tmp)
>>
>>
>> optData.df <- head(pristine,10)
>>
>> system.time(
>> optData.df$impliedVol <-
>> foreach(i=1:NROW(optData.df),.packages="RQuantLib") %dopar%
>> with(optData.df[i,],
>>       tryCatch({EuropeanOptionImpliedVolatility(type,value,underlying,
>> strike, dividendYield, riskFreeRate, maturity,
>> volatility)$impliedVol},
>>               error = function (ex){0}))
>> )
>>
>> This works fine!
>>
>> PROBLEM: However, when I do the same but with optData.df <- pristine
>> ... which has about 3.8 million options data ... the cores do not seem
>> to be fully utilized (they seem to run at 25%).
>>
>> I noticed some slight delay before the processing starts running ...
>> when I did with the 100k dataset ... do i need to wait longer for any
>> allocations to be done?
>
> Communication to setup the threads ould definitly take some time.
> So why don't you try to increase from 100.000 to 1.000.000 and see how
> long it takes to initialize the threads?
>
> You are not mentioning how long you wait?
>
> Cheers,
>
> Rainer
>
>>
>> Thank you.
>>
>
>
> - --
> Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation
> Biology, UCT), Dipl. Phys. (Germany)
>
> Centre of Excellence for Invasion Biology
> Natural Sciences Building
> Office Suite 2039
> Stellenbosch University
> Main Campus, Merriman Avenue
> Stellenbosch
> South Africa
>
> Tel:        +33 - (0)9 53 10 27 44
> Cell:       +27 - (0)8 39 47 90 42
> Fax (SA):   +27 - (0)8 65 16 27 82
> Fax (D) :   +49 - (0)3 21 21 25 22 44
> Fax (FR):   +33 - (0)9 58 10 27 44
> email:      rai...@krugs.de
>
> Skype:      RMkrug

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to update a bugs-model from inside R using R2WinBUGS

2010-12-02 Thread Frederic Holzwarth

I found a very simple way:
you start WinBUGS from R with debug=TRUE, starting with only a few
iterations, say 10.
If everything goes fine, open a new, empty "odc" document in WinBUGS. Now
scan the log file and find something like this:

update(10)
coda(*,C:/DOKUME~1/Frederic/LOKALE~1/Temp/Rtmp0Ey62L/coda)
stats(*)
history(*,C:/DOKUME~1/Frederic/LOKALE~1/Temp/Rtmp0Ey62L/history.odc)
save(C:/DOKUME~1/Frederic/LOKALE~1/Temp/Rtmp0Ey62L/log.odc)
save(C:/DOKUME~1/Frederic/LOKALE~1/Temp/Rtmp0Ey62L/log.txt)

Copy all the lines (found in this order, but interrupted by output) into the
new document.
Then run the document as a script (Model -> Script). The only essential
line, however, is the "coda..." one; the update() can be filled with however
many iterations you like.
Then, when you close WinBUGS, all the output should be read back in by
R2WinBUGS into the R object you indicated.
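For reference, the debug session described above is started from R roughly
like this (a sketch only; the data, inits, parameter and model-file names
are placeholders, not from the original post):

```r
library(R2WinBUGS)

# debug = TRUE leaves WinBUGS open after the short run, so the
# update()/coda()/save() lines can be copied from the log into a
# new script document and re-run with more iterations
fit <- bugs(data = my.data, inits = my.inits,
            parameters.to.save = c("theta"),
            model.file = "model.txt",
            n.chains = 3, n.iter = 10,
            debug = TRUE)
```

When WinBUGS is finally closed, R2WinBUGS reads the CODA output into `fit`.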

Frederic
-- 
View this message in context: 
http://r.789695.n4.nabble.com/how-to-update-a-bugs-model-from-inside-R-using-R2WinBUGS-tp220p3068980.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] procrustes results affected by order of sites in input file

2010-12-02 Thread Gavin Simpson
On Wed, 2010-12-01 at 14:19 -0600, Christine Dolph wrote:
> Dear All,
> 
> I am using a Procrustes analysis to compare two NMDS ordinations for the
> same set of sites. One ordination is based on fish data, the other is based
> on invertebrate data. Ordinations were derived using metaMDS() from the
> {vegan} library as follows:
> 
> fish.mds<-metaMDS(fish.data, distance="bray", k=3, trymax=100,
> wascores=TRUE, trace=TRUE, zero="add")
> 
> invert.mds<-metaMDS(bugcal.a, distance="bray", k=3, trymax=100,
> wascores=TRUE, trace=TRUE, zero="add"),
> 
> 
> where fish.data and invert.data are site-by-abundance matrices for fish and
> invertebrates, respectively.
> 
> I have then used protest() to test the significance between the two
> ordinations:

Simple things first:

You did have a set.seed() call before the protest() call didn't you? In
fact, I'd probably put set.seed() calls before the two metaMDS() calls
as well.

The ordering should have no impact at all, but the random starts in
metaMDS() and the random permutations in protest() will vary from run to
run unless you set the random seed to some known value.
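A sketch of the seeded workflow, using the poster's object names (the extra
metaMDS/protest arguments from the original call are kept; only the
set.seed() calls are new):

```r
library(vegan)

set.seed(123)  # fixes the random starts of the first NMDS
fish.mds <- metaMDS(fish.data, distance = "bray", k = 3, trymax = 100,
                    wascores = TRUE, trace = TRUE, zero = "add")

set.seed(123)  # and of the second
invert.mds <- metaMDS(bugcal.a, distance = "bray", k = 3, trymax = 100,
                      wascores = TRUE, trace = TRUE, zero = "add")

set.seed(123)  # fixes the permutations used by the significance test
protest(fish.mds, invert.mds, scale = TRUE, symmetric = TRUE,
        permutations = 999)
```

With the seeds fixed, re-sorting the input rows should no longer change the
protest statistic beyond row order in the output.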

G

> protest.results<-protest(fish.mds, invert.mds, scale=TRUE, symmetric=TRUE,
> permutations=999)
> 
> The problem is, the results of the protest analysis seem to be affected by
> the original ordering of sites in the data input files. That is, if I
> re-sort one of the input files based on some criteria, I sometimes see a
> change in the strength and significance of the concordance results. In an
> attempt to figure out what is happening, I have re-ordered each dataset in a
> number of ways, and plotted the resulting ordinations. I have seen that
> while the ordering of sites in the input file does not change the spatial
> relationship between them (i.e., does not change the fundamental ordination
> solution), it can change how the sites are oriented with respect to the
> different NMDS axes. I was of the belief that a difference in orientation
> with respect to the NMDS axes should not affect the Procrustes comparison of
> two ordinations, as the protest function should be rotating and reflecting
> one matrix with respect to the other to find the closest match between them
> (i.e., it should be accounting for differences in how the two solutions are
> oriented in space). However, I appear to see different results depending on
> how sites are oriented with respect to each NMDS axis.
> 
> When comparing two ordination solutions with Protest, is it necessary to
> ensure that the original data input files are ordered in the same way?
> 
> Thanks in advance for any insight.
> 
> Sincerely,
> 
> 
> 

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] survival - summary and score test for ridge coxph()

2010-12-02 Thread Damjan Krstajic

It seems to me that summary for ridge coxph() prints a summary but returns NULL.
It is not a big issue, because one can calculate the statistics directly from the
coxph object. However, for some reason the score test is not calculated for
ridge coxph(), i.e. neither the score nor the rscore component is included in the
coxph object when ridge is specified. Please find the code below. I use R 2.9.2
with version 2.35-4 of the survival package. Thanks, DK.

> fit <- coxph(Surv(futime, fustat) ~ ridge(rx, age, ecog.ps, 
> theta=1),data=ovarian) 
> a<-summary(fit)
Call:
coxph(formula = Surv(futime, fustat) ~ ridge(rx, age, ecog.ps, 
theta = 1), data = ovarian)

  n= 26 
   coef   se(coef) se2Chisq DF p 
ridge(rx)  -0.780 0.5862   0.5589  1.77 1  0.1800
ridge(age)  0.123 0.0387   0.0356 10.15 1  0.0014
ridge(ecog.ps)  0.104 0.5729   0.5478  0.03 1  0.8600

   exp(coef) exp(-coef) lower .95 upper .95
ridge(rx)  0.459  2.181 0.145  1.45
ridge(age) 1.131  0.884 1.049  1.22
ridge(ecog.ps) 1.110  0.901 0.361  3.41

Iterations: 1 outer, 4 Newton-Raphson
Degrees of freedom for terms= 2.7 
Rsquare= 0.452   (max possible= 0.932 )
Likelihood ratio test= 15.6  on 2.67 df,   p=0.000941
Wald test= 12.9  on 2.67 df,   p=0.00354
> a
NULL
> names(fit)
 [1] "coefficients"  "var"   "var2"  "loglik"   
 "iter"  "linear.predictors" "residuals"
 [8] "means" "method""df""df2"  
 "penalty"   "pterms""assign2"  
[15] "history"   "coxlist2"  "printfun"  "n"
 "terms" "assign""wald.test"
[22] "y" "formula"   "call" 
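In the meantime, a Wald statistic can be computed by hand from the components
that are returned (a sketch only, using the `fit` above; whether `var` or
`var2` is the right variance for a penalized fit is a judgment call):

```r
# Wald chi-square b' V^{-1} b from the penalized fit's components
b <- fit$coefficients
V <- fit$var
wald  <- as.numeric(t(b) %*% solve(V) %*% b)

# fit$df holds per-term effective degrees of freedom for a penalized model
p.val <- pchisq(wald, df = sum(fit$df), lower.tail = FALSE)
c(chisq = wald, p = p.val)
```

This should roughly reproduce the "Wald test" line that summary() prints.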

  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] rpart results - problem after oversampling

2010-12-02 Thread Jonathan P Daily
Short answer: Not really.

Slightly longer answer: from what I remember of partitioning methods, a
given split is made either at a single observation or between two
consecutive observations. In your case, I expect rpart would have split
on that particular point... except that there are now 6 copies of it, so
the choice of where among those 6 to split is arbitrary. (Someone with
more knowledge of rpart's guts, feel free to correct me.)
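If the duplication was only meant to up-weight the rare class, one
alternative worth considering is rpart's case weights, which avoid
physically repeating rows and so avoid tied split points entirely (a sketch
with made-up data, not the poster's):

```r
library(rpart)

# Hypothetical data: a rare positive class, as in the original question
set.seed(1)
d <- data.frame(y  = factor(rep(c(0, 1), c(700, 100))),
                x1 = rnorm(800),
                x2 = rnorm(800))

# Weight each positive observation 6x instead of duplicating it 6 times
w   <- ifelse(d$y == "1", 6, 1)
fit <- rpart(y ~ x1 + x2, data = d, weights = w, method = "class")
```

The weighted fit gives the same emphasis to the rare class without creating
blocks of identical observations at candidate split points.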
--
Jonathan P. Daily
Technician - USGS Leetown Science Center
11649 Leetown Road
Kearneysville WV, 25430
(304) 724-4480
"Is the room still a room when its empty? Does the room,
 the thing itself have purpose? Or do we, what's the word... imbue it."
 - Jubal Early, Firefly

r-help-boun...@r-project.org wrote on 12/02/2010 05:34:23 AM:

> [image removed] 
> 
> [R] rpart results - problem after oversampling
> 
> Mederos, Vicente (Santander) 
> 
> to:
> 
> r-help@r-project.org
> 
> 12/02/2010 05:36 AM
> 
> Sent by:
> 
> r-help-boun...@r-project.org
> 
> Hi all,
> 
> I am trying to predict a target variable that takes values 0 or 1 
> using the rpart command. In my initial dataset I have few positive 
> observations of the target variable; therefore I have oversampled 
> the rare event by a multiple of 6 (i.e. from 762 to 4572).
> 
> However, in my results, I end up with a number of positives in one 
> of the terminal nodes that is not divisible by 6. As I have the same
> observation repeated 6 times, shouldn't all of them follow the same 
> branch of the tree and go to the same terminal node?
> 
> Thanks for your help,
> 
> Vicente
> 
>[[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] problem with package rsm: running fit.mult.impute with cph

2010-12-02 Thread David Foreman
Hi all (and especially Frank),

I'm trying to use x=T, y=T in order to run a validated stepwise cox
regression in rsm, having multiply imputed using mice.  I'm coding

model.max<-fit.mult.impute(baseform,cph,miced2,dated.sexrisk2,x=T,y=T)

baseform is
 baseform<-Surv(si.age,si=="Yes")~ peer.press + copy.press + excited +
worried + intimate.friend + am.pill.times + info.parents + info.teacher +
info.sch.nurse + info.friends + info.media + info.clinic + info.gp +
info.fpa + info.chemist + nearer.clinic + uti + thrush + herpes + love +
strat(gender) + Own.space + ordered(Chat.Mother) + ordered(Chat.Father) +
ordered(Prts.interested) + ordered(Someone.stands.up.4u) + Thought.runaway +
as.numeric(Prts.know.location)+ pubty + partner + arguments + mfqc.total +
esteem.total + events.total + bmi

This gives the error message


Error in  1:n.impute : NA/NaN argument
In addition : Warning message:
In 1:n.impute : numerical expression has 130 elements: only the first used
:>

It doesn't seem to be a singularity problem, as the simple equation

model.test<-fit.mult.impute(Surv(si.age,si=="Yes")~peer.press,cph,miced2,dated.sexrisk2,x=T,y=T)


gives an identical message, and the code runs perfectly in either case without
x=T, y=T.

PS the dataset is 827 rows x 130 variables.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Last post: problem with package rsm: running fit.mult.impute with cph -- sorry, package was rms

2010-12-02 Thread David Foreman
Sorry everybody, temporary dyslexia.  
Sent from my BlackBerry wireless smartphone

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] warning creating an as.array method in a package

2010-12-02 Thread Michael Friendly

On 11/30/2010 5:59 PM, David Winsemius wrote:


I'm guessing from this behavior on my system that it may be Matrix.

 > showMethods(as.array)
Function: as.array (package base)
x="ANY"
x="Matrix"


I'm still perplexed by this.  Running R CMD install gives this warning:
** testing if installed package can be loaded
Warning: found an S4 version of 'as.array' so it has not been imported 
correctly


Yet, in a freshly initialized R console, I get:
> methods("as.array")
[1] as.array.default
> showMethods("as.array")

Function "as.array":
 

Then, I load Matrix:

> library(Matrix)
Loading required package: lattice

Attaching package: 'Matrix'

The following object(s) are masked from 'package:base':

det

> showMethods("as.array")
Function: as.array (package base)
x="ANY"
x="Matrix"

Under ?methods the Note says

Functions can have both S3 and S4 methods, and function showMethods will 
list the S4 methods (possibly none).


Does the warning mean that I cannot (or should not) have an S3 as.array 
method in a package?
The method works in my context, but the warning will create problems 
when I submit to CRAN, so I would rather avoid it.



--
Michael Friendly Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University  Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele StreetWeb:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] warning creating an as.array method in a package

2010-12-02 Thread Michael Friendly

On 12/2/2010 8:58 AM, Michael Friendly wrote:

Under ?methods the Note says

Functions can have both S3 and S4 methods, and function showMethods will
list the S4 methods (possibly none).

Does the warning mean that I cannot (or should not) have an S3 as.array
method in a package?
The method works in my context, but the warning will create problems
when I submit to CRAN, so I would rather avoid it.



A little more testing:  Same warning when I load my package, whether or 
not Matrix has been loaded:


> library(vcdExtra)
Loading required package: vcd
Loading required package: MASS
Loading required package: grid
Loading required package: colorspace
Loading required package: gnm
Warning message:
found an S4 version of ‘as.array’ so it has not been imported correctly
> showMethods("as.array")
Function: as.array (package base)
x="ANY"
x="Matrix"

But as.array does show up as a method for my class:

> methods(class="loddsratio")
[1] as.array.loddsratio*  as.data.frame.loddsratio*
[3] as.matrix.loddsratio* coef.loddsratio*
[5] confint.loddsratio*   dim.loddsratio*
[7] dimnames.loddsratio*  print.loddsratio*
[9] vcov.loddsratio*

   Non-visible functions are asterisked
>


--
Michael Friendly Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University  Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele StreetWeb:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Пикап

2010-12-02 Thread Vovan
Пикап блог samec.org.ua один из лучший 
информативных сайтов о в Украине, который 
помогает мужчинам в разы улучшить их 
личную и сексуальную жизнь дает ответы на 
вопрос как познакомиться с девушкой? , и 
решить множество проблем, найти ответы 
которые помогут Вам стать настоящими 
мужчинами!!!
Надеюсь что после прочтения блога у Вас 
больше не будет возникать вопрос как 
познакомиться с девушкой)))

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to call R-squared values from lm's?

2010-12-02 Thread Wegan, Michael (DNRE)
I would like to call both p-values and R-squared values from lm's in a 
function.  I can get the p-values from coef(summary(name.lm))[r,c], however, I 
cannot figure out how to call the R-squared values without manually calling the 
summary and inserting them in the script - which negates the value of 
automating the process through a function.

Thanks,
Mike

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Suitable test for ordinal variable vs continuous variable trend

2010-12-02 Thread Daniel Brewer
Dear all,

For a population of about 200 I have a continuous variable and an ordinal
variable.  The question I would like to ask is whether the continuous variable
increases (or decreases) as the rank of the ordinal variable increases.  I was
thinking that a Spearman's rank correlation or a chi-squared test for trend
might be appropriate.  I don't have any experience dealing with ordinal
variables, so I am at a bit of a loss.  What is the most appropriate test?
And is it implemented in R?
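For what it is worth, the Spearman approach is built into base R and can be
sketched like this (simulated data standing in for the real variables):

```r
set.seed(42)

# Simulated stand-ins: a 4-level ordered factor and a continuous measure
ord  <- factor(sample(c("low", "mid", "high", "top"), 200, replace = TRUE),
               levels = c("low", "mid", "high", "top"), ordered = TRUE)
cont <- as.numeric(ord) + rnorm(200)  # built-in trend, for illustration only

# Spearman's rank correlation between the continuous variable and the
# ordinal ranks; with ties, R falls back to a normal approximation
cor.test(cont, as.numeric(ord), method = "spearman")
```

The sign and significance of rho then answer the monotone-trend question
directly.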

Many thanks


**

Daniel Brewer

Institute of Cancer Research
Molecular Carcinogenesis
MUCRC
15 Cotswold Road
Sutton, Surrey SM2 5NG
United Kingdom 

Tel: +44 (0) 20 8722 4109
Fax: +44 (0) 20 8722 4141

Email: daniel.bre...@icr.ac.uk 

**


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to call R-squared values from lm's?

2010-12-02 Thread Roland Rau

On 12/02/2010 03:15 PM, Wegan, Michael (DNRE) wrote:

I would like to call both p-values and R-squared values from lm's in a 
function.  I can get the p-values from coef(summary(name.lm))[r,c], however, I 
cannot figure out how to call the R-squared values without manually calling the 
summary and inserting them in the script - which negates the value of 
automating the process through a function.

Thanks,
Mike


I hope this does the trick:

set.seed(1234)
x <- 1:10
y <- (4 + 3*x)+rnorm(10)

my.lm <- lm(y~x)
summary(my.lm)$r.squared
summary(my.lm)$adj.r.squared

Enjoy,
Roland

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to call R-squared values from lm's?

2010-12-02 Thread Marc Schwartz

On Dec 2, 2010, at 8:15 AM, Wegan, Michael (DNRE) wrote:

> I would like to call both p-values and R-squared values from lm's in a 
> function.  I can get the p-values from coef(summary(name.lm))[r,c], however, 
> I cannot figure out how to call the R-squared values without manually calling 
> the summary and inserting them in the script - which negates the value of 
> automating the process through a function.
> 
> Thanks,
> Mike

str() is your friend.

From ?lm

> summary(lm.D9)

Call:
lm(formula = weight ~ group)

Residuals:
Min  1Q  Median  3Q Max 
-1.0710 -0.4938  0.0685  0.2462  1.3690 

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept)   5.0320 0.2202  22.850 9.55e-15 ***
groupTrt -0.3710 0.3114  -1.1910.249
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 0.6964 on 18 degrees of freedom
Multiple R-squared: 0.07308,Adjusted R-squared: 0.02158 
F-statistic: 1.419 on 1 and 18 DF,  p-value: 0.249 



# Get the structure of the model summary object
> str(summary(lm.D9))
List of 11
 $ call : language lm(formula = weight ~ group)
 $ terms:Classes 'terms', 'formula' length 3 weight ~ group
  .. ..- attr(*, "variables")= language list(weight, group)
  .. ..- attr(*, "factors")= int [1:2, 1] 0 1
  .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. ..$ : chr [1:2] "weight" "group"
  .. .. .. ..$ : chr "group"
  .. ..- attr(*, "term.labels")= chr "group"
  .. ..- attr(*, "order")= int 1
  .. ..- attr(*, "intercept")= int 1
  .. ..- attr(*, "response")= int 1
  .. ..- attr(*, ".Environment")= 
  .. ..- attr(*, "predvars")= language list(weight, group)
  .. ..- attr(*, "dataClasses")= Named chr [1:2] "numeric" "factor"
  .. .. ..- attr(*, "names")= chr [1:2] "weight" "group"
 $ residuals: Named num [1:20] -0.862 0.548 0.148 1.078 -0.532 ...
  ..- attr(*, "names")= chr [1:20] "1" "2" "3" "4" ...
 $ coefficients : num [1:2, 1:4] 5.032 -0.371 0.22 0.311 22.85 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:2] "(Intercept)" "groupTrt"
  .. ..$ : chr [1:4] "Estimate" "Std. Error" "t value" "Pr(>|t|)"
 $ aliased  : Named logi [1:2] FALSE FALSE
  ..- attr(*, "names")= chr [1:2] "(Intercept)" "groupTrt"
 $ sigma: num 0.696
 $ df   : int [1:3] 2 18 2
 $ r.squared: num 0.0731
 $ adj.r.squared: num 0.0216
 $ fstatistic   : Named num [1:3] 1.42 1 18
  ..- attr(*, "names")= chr [1:3] "value" "numdf" "dendf"
 $ cov.unscaled : num [1:2, 1:2] 0.1 -0.1 -0.1 0.2
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:2] "(Intercept)" "groupTrt"
  .. ..$ : chr [1:2] "(Intercept)" "groupTrt"
 - attr(*, "class")= chr "summary.lm"



> summary(lm.D9)$r.squared
[1] 0.0730776

> summary(lm.D9)$adj.r.squared
[1] 0.02158191


HTH,

Marc Schwartz

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] attempted merge() returns: cannot coerce type 'closure' to vector of type 'any'

2010-12-02 Thread David Winsemius


On Dec 2, 2010, at 5:58 AM, Karl Brand wrote:


On 12/2/2010 11:36 AM, David Winsemius wrote:


On Dec 2, 2010, at 5:22 AM, Karl Brand wrote:


Cheers Bill.







Inserting earlier debris:
I don't understand why i get this error message when attempting to  
use merge() -


> temp <- merge(x, y[,17, drop=FALSE], by=rownames, sort=FALSE)
Error in as.vector(x, mode) :
 cannot coerce type 'closure' to vector of type 'any'


Try
   by="rownames"
instead of
   by=rownames


At this point I think Bill should have suggested by="row.names" since  
that is the correct argument to merge() when asking for a rownames  
match, and you appear to have found that by experimentation since you  
didn't seem to have read the help page.




You got me halfway, since:

> temp <- merge(x=x, y=y[,17, drop=FALSE], by="rownames",  
sort=FALSE)

Error in fix.by(by.x, x) : 'by' must specify valid column(s)

but, using "row.names" instead of "rownames", like:
> temp <- merge(x=x, y=y[,17, drop=FALSE], by="row.names",  
sort=FALSE)


works (but adds a column "Row.names").


OK. It added a column what's the problem? The help(merge) pages says:
"If the matching involved row names, an extra character column called  
Row.names is added at the left, and in all cases the result has  
‘automatic’ row names."




Which seems somewhat counterintuitive to me, since I am feeding
two matrices to the merge function, which I understand have
'rownames', not 'row.names' as data frames have, right?




Although the
output of merge() is a data frame...


Right. The merge function takes either dataframes or things which can  
be coerced to dataframes as arguments and returns a dataframe. That is  
exactly what the help(merge) page states outright (in many places).  
Just because your arguments were matrices doesn't mean the returned  
object should be one. Had you wanted it to be an "augmented"  
data.matrix with rownames as before, you could have gotten that after  
merge by:


?data.matrix
temp2 <- data.matrix(temp[-1])
rownames(temp2) <- temp$Row.names

(Or by the rather nice manipulations that you described using rownames  
as indices.)


I think I understand the "problem" now  that you were expecting  
merge() to behave differently than is documented.


--
David.


This reminds me of physicists using high energy particles to investigate
the structure of the nucleus. But you have alternatives to what might be
called "destructive debugging through binoculars". Instead of throwing
code at your data and asking the assembled audience of soothsayers to
tell you what went wrong by examining the debris, why don't you show us
what the data objects actually look like?



Of course, David, sorry - my laziness.

Is str() enough to put away the binoculars and maybe highlight the
source of my ignorance?


> str(x)
num [1:140, 1:7] 4.93e-02 2.83e-02 1.07e-02 2.68e-02 3.92e-05 ...
- attr(*, "dimnames")=List of 2
 ..$ : chr [1:140] "1415743_at" "1415887_at" "1416332_at"  
"1416340_a_at" ...

 ..$ : chr [1:7] "R.S.adj.pvalue" "R.S.Tau" "R.S.xpiek" "R.S.xdal" ...

> str(y)
num [1:140, 1:18] 0.573 1 0.752 0.768 0.399 ...
- attr(*, "dimnames")=List of 2
 ..$ : chr [1:140] "1427982_s_at" "1430520_at" "1454086_a_at"  
"1419652_s_at" ...

 ..$ : chr [1:18] "R.S.1" "R.S.2" "R.S.3" "R.S.4" ...
> source(.trPaths[5], echo=TRUE, max.deparse.length=150)

> str(temp)
'data.frame':   140 obs. of  9 variables:
$ Row.names :Class 'AsIs'  chr [1:140] "1415743_at" "1415887_at"  
"1416332_at" "1416340_a_at" ...
$ R.S.adj.pvalue: num  4.93e-02 2.83e-02 1.07e-02 2.68e-02  
3.92e-05 ...
$ R.S.Tau   : num  21.6 23.6 26.6 29.7 18.8 20 24.6 27.9 23.9  
22.7 ...

$ R.S.xpiek : num  6.74 17.46 15.81 15.39 14.73 ...
$ R.S.xdal  : num  16.94 1.76 22.8 1.12 5.41 ...
$ R.S.p1: num  0.0004 0.0001 0.0467 0.0024 0 ...
$ R.S.p2: num  0.0039 0 0.002 0.019 0.0001 0.0035 0.0351  
0.0126 0.0028 0.0192 ...
$ R.S.p3: num  0.0178 0.0004 0.0101 0.0312 0.0012 0.022  
0.0008 0.0151 0.0048 0.0317 ...

$ V1: num  3.96 -0.94 1.04 1.94 -2.53 ...
>

--
Karl Brand 
Department of Genetics
Erasmus MC
Dr Molewaterplein 50
3015 GE Rotterdam
P +31 (0)10 704 3211 | F +31 (0)10 704 4743 | M +31 (0)642 777 268


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Please help......barplot2

2010-12-02 Thread David Lyon
Hi everyone 

I spent hours trying to figure this out but as a newbie I am stuck...
can someone show me the R code for the following:

If I had a tab-delimited file called "file", containing 3 rows:
9.56  8.67  9.28  8.81  7.93  9.85  8.92  10.19  8.63
6.36  7.29  6.68  7.11  8.04  6.05  7.04   5.80  7.34
3.22  3.22  3.28  4.37  5.21  3.10  3.37   2.56  5.43

How do I import the file, then plot the mean and standard error bars of the 
data from each row?


Thank you so much

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Hmm Topology restriction

2010-12-02 Thread Kishor Tappita
Dear Ingmar,

Thanks a lot.

Regards,
Kishor

On Thu, Dec 2, 2010 at 6:39 PM, Ingmar Visser  wrote:
> depmixS4 enables linear constraints on parameters, best, Ingmar
>
> On Thu, Dec 2, 2010 at 12:01 PM, Kishor Tappita 
> wrote:
>>
>> Dear List,
>>
>> We are using RHmm to cluster data through HMM. We would like to
>> restrict the transition matrix of HMM, to get hierarchical connections
>> between clusters. But, RHmm doesn't seem to support these
>> restrictions. Can any one suggest a library to do that.
>>
>> Thanks,
>> Kishor
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] make check from R2.12.0.exe installation

2010-12-02 Thread elliott harrison
Hi,

I typically install new versions of R on windows using the downloadable 
executable file rather than the full tar. 
I need to now document the success of the installation in addition to my 
preferred procedure of running an old dataset against the new build.
I found quickly that this is all available to me but the tests directory I have 
does not contain the scripts that the tar file does.
How do I check the installation when R is installed in this manner?

Many thanks

Mr Elliott Harrison
Senior Research Associate
Epistem Ltd
48 Grafton Street
Manchester, M13 9XX. UK
Office Tel:  +44 (0) 161 606 7398
Fax:  +44  (0) 161 606 7348
Incorporated and registered in England and Wales under registration 
number: 03901952.  Registered office: The Incubator Building, Grafton Street, 
Manchester M13 9XX. VAT registration number: 903 0350 77.
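For what it's worth (my own suggestion, not something stated in this thread): the test scripts ship with the source tarball, but an installed R can still run its regression checks through the tools package, provided the test files were installed alongside R (the Windows installer may not include them; see the "Testing an Installation" appendix of the R Installation and Administration manual). A possible sketch:

```r
# Assumes the installation includes the test files; the function
# names below are documented in the R-admin manual.
Sys.setenv(LC_COLLATE = "C", LANGUAGE = "en")  # tests assume C collation
tools::testInstalledBasic("basic")             # basic regression tests
tools::testInstalledPackages(scope = "base")   # check the base packages
```

If the test files are missing, they can be copied from the tests directory of the matching source tarball.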

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] initial values for alpha and beta in Gamma inverse

2010-12-02 Thread Rosario Garcia Gil
Hello

I am trying to fit a model using glm function with family=Gamma(link="inverse").

Is it possible to give initial values, and how? I tried searching for
information but did not manage to find an answer.

Kind regards and thanks in advance.
Rosario

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] plot more plots from one matrix

2010-12-02 Thread alcesgabbo

Hi,
I have a dataframe like this:

      procedure     property     sensor_data  sensor_date
 [1,] "PAT_Laser_2" "Distance"   "30.42" "2010-09-30T15:00:12+0200"
 [2,] "PAT_Laser_2" "Distance"   "31.22" "2010-10-31T15:05:07+0100"
 [3,] "PAT_Laser_2" "Distance"   "30.42" "2011-09-30T15:00:12+0200"
 [4,] "PAT_Laser_2" "Distance"   "31.22" "2011-10-31T15:05:07+0100"
 [5,] "PAT_Laser_1" "Distance"   "50.1252"   "2010-09-30T15:00:12+0200"
 [6,] "PAT_Laser_1" "Distance"   "48.532""2010-10-31T15:05:07+0100"
 [7,] "PAT_Laser_1" "Distance"   "50.1252"   "2011-09-30T15:00:12+0200"
 [8,] "PAT_Laser_1" "Distance"   "48.532""2011-10-31T15:05:07+0100"
 [9,] "PAT_Laser_2" "Distance_B" "3.42"  "2010-09-30T15:00:12+0200"
[10,] "PAT_Laser_2" "Distance_B" "131.22""2010-10-31T15:05:07+0100"
[11,] "PAT_Laser_2" "Distance_B" "303.42""2011-09-30T15:00:12+0200"
[12,] "PAT_Laser_2" "Distance_B" "131.22""2011-10-31T15:05:07+0100"
[13,] "PAT_Laser_1" "Distance_B" "530.1252"  "2010-09-30T15:00:12+0200"
[14,] "PAT_Laser_1" "Distance_B" "428.532"   "2010-10-31T15:05:07+0100"
[15,] "PAT_Laser_1" "Distance_B" "530.1252"  "2011-09-30T15:00:12+0200"
[16,] "PAT_Laser_1" "Distance_B" "148.532"   "2011-10-31T15:05:07+0100"


In order to identify each "sensor_data" with the columns "property" and
"procedure" I create a new column "key" in this way:

   procedure    property    sensor_data  sensor_date               key
1  PAT_Laser_2  Distance        30.42    2010-09-30T15:00:12+0200  PAT_Laser_2.Distance
2  PAT_Laser_2  Distance        31.22    2010-10-31T15:05:07+0100  PAT_Laser_2.Distance
3  PAT_Laser_2  Distance        30.42    2011-09-30T15:00:12+0200  PAT_Laser_2.Distance
4  PAT_Laser_2  Distance        31.22    2011-10-31T15:05:07+0100  PAT_Laser_2.Distance
5  PAT_Laser_1  Distance      50.1252    2010-09-30T15:00:12+0200  PAT_Laser_1.Distance
6  PAT_Laser_1  Distance       48.532    2010-10-31T15:05:07+0100  PAT_Laser_1.Distance
7  PAT_Laser_1  Distance      50.1252    2011-09-30T15:00:12+0200  PAT_Laser_1.Distance
8  PAT_Laser_1  Distance       48.532    2011-10-31T15:05:07+0100  PAT_Laser_1.Distance
9  PAT_Laser_2  Distance_B       3.42    2010-09-30T15:00:12+0200  PAT_Laser_2.Distance_B
10 PAT_Laser_2  Distance_B     131.22    2010-10-31T15:05:07+0100  PAT_Laser_2.Distance_B
11 PAT_Laser_2  Distance_B     303.42    2011-09-30T15:00:12+0200  PAT_Laser_2.Distance_B
12 PAT_Laser_2  Distance_B     131.22    2011-10-31T15:05:07+0100  PAT_Laser_2.Distance_B
13 PAT_Laser_1  Distance_B   530.1252    2010-09-30T15:00:12+0200  PAT_Laser_1.Distance_B
14 PAT_Laser_1  Distance_B    428.532    2010-10-31T15:05:07+0100  PAT_Laser_1.Distance_B
15 PAT_Laser_1  Distance_B   530.1252    2011-09-30T15:00:12+0200  PAT_Laser_1.Distance_B
16 PAT_Laser_1  Distance_B    148.532    2011-10-31T15:05:07+0100  PAT_Laser_1.Distance_B

Now I want to plot the "sensor_data" column for each "sensor_date" so I
create a matrix in this way:


m<-tapply(matrix2plot2[,"sensor_data"],matrix2plot2[,c("sensor_date","key")],c)

                           key
sensor_date                PAT_Laser_1.Distance PAT_Laser_1.Distance_B PAT_Laser_2.Distance PAT_Laser_2.Distance_B
  2010-09-30T15:00:12+0200                    9                     10                    4                      3
  2010-10-31T15:05:07+0100                    8                      7                    6                      1
  2011-09-30T15:00:12+0200                    9                     10                    4                      5
  2011-10-31T15:05:07+0100                    8                      2                    6                      1

(1st PROBLEM)
And here is the first problem: instead of the original "sensor_data" values,
the matrix contains a sequence of integer codes 1, 2, ..., 10. (I think this
is some kind of simplification of the matrix, but I would like the original
values.)

Apart this problem when I do:
matplot(m, type="o", xaxt="n", lty=1)

It is ok, It plots the matrix, each line on the plot represents a column.

(2nd PROBLEM)
I would like to have 2 plots, one for the values with "property" = Distance
and the other for "property" = Distance_B. Is there a way to do this without
using the names "Distance" and "Distance_B" directly (so that the same method
also works for other matrices with other values of "property")?


I hope that is clear; I am not sure I explained the problem well.

Thanks!
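A sketch of one possible approach (my own suggestion; it assumes matrix2plot2 can be coerced to a data frame, and that the integer codes came from sensor_data being stored as character or factor):

```r
# Coerce sensor_data to numeric so tapply() keeps the real values,
# then draw one plot per distinct value of "property" without
# hard-coding "Distance" / "Distance_B".
df <- as.data.frame(matrix2plot2, stringsAsFactors = FALSE)
df$sensor_data <- as.numeric(df$sensor_data)       # real values, not codes
m <- tapply(df$sensor_data, df[, c("sensor_date", "key")], c)
for (p in unique(df$property)) {
  cols <- grepl(paste0("\\.", p, "$"), colnames(m))  # keys ending ".<p>"
  matplot(m[, cols, drop = FALSE], type = "o", lty = 1,
          xaxt = "n", main = p)
}
```

Matching the key suffix rather than a fixed name is what makes the loop work for other property values as well.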

-- 
View this message in context: 
http://r.789695.n4.nabble.com/plot-more-plots-from-one-matrix-tp3069545p3069545.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Please help......barplot2

2010-12-02 Thread David Winsemius


On Dec 2, 2010, at 10:36 AM, David Lyon wrote:


Hi everyone

I spent hours trying to figure this out but as a newbie I am stuck...
can someone show me the R code for the following:

If I had a tab delimited file called "file",


 Are you sure that you are not on an OS that hides the file  
extensions by default?




containing 3 rows:
9.56  8.67  9.28  8.81  7.93  9.85  8.92  10.19  8.63
6.36  7.29  6.68  7.11  8.04  6.05  7.04   5.80  7.34
3.22  3.22  3.28  4.37  5.21  3.10  3.37   2.56  5.43

How do I import the file, then plot the mean and standard error bars  
of the data from each row?


If the file's name is really "file" then:

read.table(file="file", header=FALSE, sep="\t")

At this point I will note that this appears to be structured along the  
lines I might expect for a homework problem. I will offer the advice  
that the plotCI function in package plotrix looks suitable, and that we  
have an extensive collection of documentation suitable for beginners  
that probably has similar problems worked out:






Thank you so much

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Odp: Please help......barplot2

2010-12-02 Thread Petr PIKAL
Hi

r-help-boun...@r-project.org napsal dne 02.12.2010 16:36:09:

> Hi everyone 
> 
> I spent hours trying to figure this out but as a newbie I am stuck...
> can someone show me the R code for the following:
> 
> If I had a tab delimited file called "file", containing 3 rows :
> 9.56   8.67   9.28   8.81   7.93   9.85   8.92   10.19   8.63 
> 6.36   7.29   6.68   7.11   8.04   6.05   7.04   5.80   7.34 
> 3.22   3.22   3.28   4.37   5.21   3.10   3.37   2.56   5.43 
> 
> How do I import the file, then plot the mean and standard error bars of 
the 
> data from each row?

Here is how I found how to do it

test=read.table("clipboard")
test
V1   V2   V3   V4   V5   V6   V7V8   V9
1 9.56 8.67 9.28 8.81 7.93 9.85 8.92 10.19 8.63
2 6.36 7.29 6.68 7.11 8.04 6.05 7.04  5.80 7.34
3 3.22 3.22 3.28 4.37 5.21 3.10 3.37  2.56 5.43

test1<-t(test)
??error
library(Hmisc)
?errbar
starting httpd help server ... done
test.m<-colMeans(test1)
test.sd<-apply(test1, 2, sd)
errbar(1:3, test.m, test.m+test.sd, test.m-test.sd)

Is it OK?

Regards
Petr




> 
> 
> Thank you so much
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Please help......barplot2

2010-12-02 Thread David Lyon
Thanks David

Do you have the url link that details the worked solution to my problem:

"that we have an extensive collection of documentation suitable for beginners 
that probably has similar problems worked out:"

If not can someone show me how to 
" How do I import the file, then plot the mean and standard error bars of the 
data from each row?"

Thanks in advance.
Also, this is not a homework problem; I am just a newbie using R at work in
research who normally codes in C, C++ and Java.

Thanks again.









--- On Thu, 12/2/10, David Winsemius  wrote:

> From: David Winsemius 
> Subject: Re: [R] Please help..barplot2
> To: "David Lyon" 
> Cc: r-help@r-project.org
> Date: Thursday, December 2, 2010, 10:57 AM
> 
> On Dec 2, 2010, at 10:36 AM, David Lyon wrote:
> 
> > Hi everyone
> > 
> > I spent hours trying to figure this out but as a
> newbie I am stuck...
> > can someone show me the R code for the following:
> > 
> > If I had a tab delimited file called "file",
> 
>  Are you sure that you are not on an OS that hides the file
> extensions by default?
> 
> 
> > containing 3 rows:
> > 9.56  8.67  9.28  8.81  7.93  9.85  8.92  10.19  8.63
> > 6.36  7.29  6.68  7.11  8.04  6.05  7.04   5.80  7.34
> > 3.22  3.22  3.28  4.37  5.21  3.10  3.37   2.56  5.43
> > 
> > How do I import the file, then plot the mean and
> standard error bars of the data from each row?
> 
> If the files name is really "file" then:
> 
> read.table(file="file", header=FALSE, sep="\t")
> 
> At this point I will note that this appears to be
> structured along the lines I might expect for a homework
> problem. I will offer the advice that the plotCI function in
> package plotrix looks suitable and that we have an extensive
> collection of documentation suitable for beginners that
> probably has similar problems worked out:
> 
> 
> > 
> > 
> > Thank you so much
> > 
> > __
> > R-help@r-project.org
> mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained,
> reproducible code.
> 
> David Winsemius, MD
> West Hartford, CT
> 
> 




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] attempted merge() returns: cannot coerce type 'closure' to vector of type 'any'

2010-12-02 Thread Karl Brand



On 12/2/2010 3:52 PM, David Winsemius wrote:


On Dec 2, 2010, at 5:58 AM, Karl Brand wrote:


On 12/2/2010 11:36 AM, David Winsemius wrote:


On Dec 2, 2010, at 5:22 AM, Karl Brand wrote:


Cheers Bill.







Inserting earlier debris:

I don't understand why i get this error message when attempting to
use merge() -

> temp <- merge(x, y[,17, drop=FALSE], by=rownames, sort=FALSE)
Error in as.vector(x, mode) :
cannot coerce type 'closure' to vector of type 'any'


Try
by="rownames"
instead of
by=rownames


At this point I think Bill should have suggested by="row.names" since
that is the correct argument to merge() when asking for a rownames
match, and you appear to have found that by experimentation since you
didn't seem to have read the help page.



You got me halfway, since:

> temp <- merge(x=x, y=y[,17, drop=FALSE], by="rownames", sort=FALSE)
Error in fix.by(by.x, x) : 'by' must specify valid column(s)

but, using "row.names" instead of "rownames", like:
> temp <- merge(x=x, y=y[,17, drop=FALSE], by="row.names", sort=FALSE)

works (but adds a column "Row.names").


OK. It added a column what's the problem? The help(merge) pages says:
"If the matching involved row names, an extra character column called
Row.names is added at the left, and in all cases the result has
‘automatic’ row names."



Which seems somewhat counterintuitive to me, since I am feeding
two matrices into the merge function, which I understand have
'rownames', not 'row.names' as data frames have, right?




Although the
output of merge() is a data frame...


Right. The merge function takes either dataframes or things which can be
coerced to dataframes as arguments and returns a dataframe. That is
exactly what the help(merge) page states outright (in many places). Just
because your arguments were matrices doesn't mean the returned object
should be one. Had you wanted it to be an "augmented" data.matrix with
rownames as before, you could have gotten that after merge by:

?data.matrix
temp2 <- data.matrix(temp[-1])
rownames(temp2) <- temp$Row.names

(Or by the rather nice manipulations that you described using rownames
as indices.)

I think I understand the "problem" now  that you were expecting
merge() to behave differently than is documented.


Cheers David,

You nailed it; closer attention to the help page would definitely have 
saved us time.


No less, thanks a lot for highlighting this, and also for the merge() 
clarifications.


And insight into what physicists have to deal with :)

cheers,

Karl


--
Karl Brand 
Department of Genetics
Erasmus MC
Dr Molewaterplein 50
3015 GE Rotterdam
P +31 (0)10 704 3211 | F +31 (0)10 704 4743 | M +31 (0)642 777 268

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Help summarizing R data frame

2010-12-02 Thread chris99

I am trying to aggregate data in column 2 to identifiers in col 1

eg..

take this>

identifier   quantity
1 10
1 20
2 30
1 15
2 10
3 20

and make this>

identifier quantity
145
240
320


Thanks in advance for your help!
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Help-summarizing-R-data-frame-tp3069624p3069624.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Please help......barplot2

2010-12-02 Thread David Winsemius


On Dec 2, 2010, at 11:10 AM, David Lyon wrote:


Thanks David

Do you have the url link that details the worked solution to my  
problem:


"that we have an extensive collection of documentation suitable for  
beginners that probably has similar problems worked out:"


http://cran.r-project.org/other-docs.html

My opinion: the Kuhnert and Venables piece is well worth the time to  
download its 3MB extent, which is not to diminish the value of the  
rest of the contributed documentation. I don't think you can go wrong  
starting there. I have in the past used Alzola and Harrell's, and am  
currently working through the piece that Harrell does on literate  
programming (not on the contributed page):


http://biostat.mc.vanderbilt.edu/wiki/Main/StatReport
http://biostat.mc.vanderbilt.edu/wiki/pub/Main/StatReport/summary.pdf




If not can someone show me how to
" How do I import the file, then plot the mean and standard error  
bars of the data from each row?"


Thanks in advance.
Also this is not an homework problem just somebody using it as a  
newbie at work in research who normally codes in C, C++ and JAVA.


I noted that Petr's solution used colMeans; I thought you were asking  
for rowMeans.


--
David.
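For row-wise means, a minimal sketch along those lines (my own example; it assumes the file really is tab-delimited, is named "file", and that the plotrix package is installed):

```r
library(plotrix)                            # for plotCI()
dat <- read.table("file", header = FALSE, sep = "\t")
m   <- rowMeans(dat)                        # one mean per row
se  <- apply(dat, 1, sd) / sqrt(ncol(dat))  # standard error per row
plotCI(seq_len(nrow(dat)), m, uiw = se,
       xlab = "row", ylab = "mean", main = "Row means +/- 1 SE")
```

plotCI's uiw argument gives the half-width of the error bar above each point (liw defaults to the same value below).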


Thanks again.


--- On Thu, 12/2/10, David Winsemius  wrote:


From: David Winsemius 
Subject: Re: [R] Please help..barplot2
To: "David Lyon" 
Cc: r-help@r-project.org
Date: Thursday, December 2, 2010, 10:57 AM

On Dec 2, 2010, at 10:36 AM, David Lyon wrote:


Hi everyone

I spent hours trying to figure this out but as a

newbie I am stuck...

can someone show me the R code for the following:

If I had a tab delimited file called "file",


Are you sure that you are not on an OS that hides the file
extensions by default?



containing 3 rows:
9.56  8.67  9.28  8.81  7.93  9.85  8.92  10.19  8.63
6.36  7.29  6.68  7.11  8.04  6.05  7.04   5.80  7.34
3.22  3.22  3.28  4.37  5.21  3.10  3.37   2.56  5.43


How do I import the file, then plot the mean and

standard error bars of the data from each row?

If the files name is really "file" then:

read.table(file="file", header=FALSE, sep="\t")

At this point I will note that this appears to be
structured along the lines I might expect for a homework
problem. I will offer the advice that the plotCI function in
package plotrix looks suitable and that we have an extensive
collection of documentation suitable for beginners that
probably has similar problems worked out:





Thank you so much

__
R-help@r-project.org

mailing list

https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained,

reproducible code.

David Winsemius, MD
West Hartford, CT








David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] 2D Random walk

2010-12-02 Thread featherbox

I've written some code to simulate a random walk in 2 dimensions on a lattice.
Basically I want to add something to make it plot point by point, so
you can see what is going on.
Here's my code for the random walk in 2d:


RW2D<-function(N)
  {
i<-0
xdir<-0
ydir<-0
xpos<-vector()
xpos[1]<-xdir
ypos<-vector()
ypos[1]<-ydir
for (i in 1:(N-1))  # note: 1:N-1 would parse as (1:N)-1, i.e. 0:(N-1)
  {
r<-runif(1)
if(r<=0.25) {xdir<-xdir+1}
if(r>0.25 && r<=0.5) {xdir<-xdir-1}
if(r>0.5 && r<=0.75) {ydir<-ydir +1}
if(r>0.75) {ydir<-ydir-1}
xpos[i+1]<-xdir
ypos[i+1]<-ydir
  } 
return(cbind(xpos,ypos))
  }
rw<-RW2D(1)

xmin<-min(rw[,1])
xmax<-max(rw[,1])
ymin<-min(rw[,2])
ymax<-max(rw[,2])

plot(rw[,1],rw[,2],type="l",xlab="x",ylab="y",main="Random Walk Simulation
In Two Dimensions",col="green4",xlim=range(xmin:xmax),ylim=range(ymin:ymax))

end<-cbind(rw[nrow(rw),1],rw[nrow(rw),2])  # last position, not the first
start<-cbind(0,0)

points(start,pch=4,col="red")
points(end,pch=4,col="red")

-- 
View this message in context: 
http://r.789695.n4.nabble.com/2D-Random-walk-tp3069557p3069557.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Kalman filter

2010-12-02 Thread Santiago Ramos

Can anyone tell me how the Kalman filter is used in R?

thanks

Regards,
 
JAVIER SANTIAGO PARRA RAMOS

INGENIERO DE SISTEMAS
ESP. EN GERENCIA DE SISTEMAS INFORMATICOS
CEL: (57) 313 416 71 21
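One starting point (my own suggestion, since the question is quite open): base R's stats package applies a Kalman filter internally when fitting structural time-series models with StructTS(); see also ?KalmanRun. A tiny local-level sketch:

```r
# Fit a local-level state-space model to a noisy random walk and
# extract the Kalman-smoothed level estimate.
set.seed(42)
y   <- ts(cumsum(rnorm(100)) + rnorm(100))  # noisy random walk
fit <- StructTS(y, type = "level")          # local-level model
sm  <- tsSmooth(fit)                        # Kalman-smoothed states
plot(y); lines(sm, col = "red")
```

For more general state-space models, the dlm and FKF packages on CRAN expose the filter directly.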

 






  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help summarizing R data frame

2010-12-02 Thread Ivan Calandra

see ?aggregate, and ?summaryBy (in package doBy)
I think ddply (in package plyr) could also do the job

Ivan

On 12/2/2010 17:24, chris99 wrote:

I am trying to aggregate data in column 2 to identifiers in col 1

eg..

take this>

identifier   quantity
1 10
1 20
2 30
1 15
2 10
3 20

and make this>

identifier quantity
145
240
320


Thanks in advance for your help!


--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de

**
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] attempted merge() returns: cannot coerce type 'closure' to vector of type 'any'

2010-12-02 Thread William Dunlap
I didn't change your "rownames" to "row.names"
because I figured you had the name you wanted
but only forgot to include the quotes.  Both "rownames"
and "row.names" are valid inputs, but they mean
different things.  However, without the quotes you were
passing in the contents of the object called
rownames, which happens to be a function (aka a closure),
not the string "rownames".  The quotes are needed to
distinguish between things and names of things.
Some functions (e.g., library() and help()) try
to cover up this difference, leading to occasional
difficulties using them.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  

> -Original Message-
> From: Karl Brand [mailto:k.br...@erasmusmc.nl] 
> Sent: Thursday, December 02, 2010 2:22 AM
> To: William Dunlap
> Cc: r-help@r-project.org; Dimitris Rizopoulos
> Subject: Re: [R] attempted merge() returns: cannot coerce 
> type 'closure' to vector of type 'any'
> 
> Cheers Bill.
> 
> You got me halfway, since:
> 
>  > temp <- merge(x=x, y=y[,17, drop=FALSE], by="rownames", sort=FALSE)
> Error in fix.by(by.x, x) : 'by' must specify valid column(s)
> 
> but, using "row.names" instead of "rownames", like:
>  > temp <- merge(x=x, y=y[,17, drop=FALSE], by="row.names", 
> sort=FALSE)
> 
> works (but adds a column "Row.names").
> 
> Which seems some what counter intuitive to me since i am 
> feeding in two 
> matrices to the merge function, which i understand have 
> 'rownames', not 
> 'row.names' as data frames have, right? Although the output 
> of merge() 
> is a data frame...
> 
> thanks again,
> 
> Karl
> 
> 
> On 12/1/2010 6:08 PM, William Dunlap wrote:
> > Try
> > by="rownames"
> > instead of
> > by=rownames
> >
> > Bill Dunlap
> > Spotfire, TIBCO Software
> > wdunlap tibco.com
> >
> >> -Original Message-
> >> From: r-help-boun...@r-project.org
> >> [mailto:r-help-boun...@r-project.org] On Behalf Of Karl Brand
> >> Sent: Wednesday, December 01, 2010 6:35 AM
> >> To: Dimitris Rizopoulos
> >> Cc: r-help@r-project.org
> >> Subject: [R] attempted merge() returns: cannot coerce type
> >> 'closure' to vector of type 'any'
> >>
> >> Hi Dimtris and esteemed useRs,
> >>
> >> I don't understand why i get this error message when
> >> attempting to use
> >> merge() -
> >>
> >>   >  temp<- merge(x, y[,17, drop=FALSE], by=rownames, sort=FALSE)
> >> Error in as.vector(x, mode) :
> >> cannot coerce type 'closure' to vector of type 'any'
> >>
> >> It should work because:
> >>
> >>   >  all(rownames(x[order(rownames(x)),]) ==
> >> + rownames(y[order(rownames(y[,17, drop=FALSE])),17,
> >> drop=FALSE]) 
> >> [TRUNCATED]
> >> [1] TRUE
> >>
> >> also:
> >>
> >>   >  class(x); class(y[,17, drop=FALSE])
> >> [1] "matrix"
> >> [1] "matrix"
> >>
> >>
> >> Any idea why i cant use merge() in the normal way here? 
> I'm forced to
> >> add the column using:
> >>
> >> temp.b<- cbind(x, y[match(rownames(x), rownames(y)),17])
> >>
> >> All insights appreciated for this leaRner,
> >>
> >> cheers,
> >>
> >> Karl
> >>
> >>
> >> --
> >> Karl Brand
> >> Department of Genetics
> >> Erasmus MC
> >> Dr Molewaterplein 50
> >> 3015 GE Rotterdam
> >> P +31 (0)10 704 3211 | F +31 (0)10 704 4743 | M +31 (0)642 777 268
> >>
> >> __
> >> R-help@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>
> 
> -- 
> Karl Brand 
> Department of Genetics
> Erasmus MC
> Dr Molewaterplein 50
> 3015 GE Rotterdam
> P +31 (0)10 704 3211 | F +31 (0)10 704 4743 | M +31 (0)642 777 268
> 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help summarizing R data frame

2010-12-02 Thread Mike Rennie
see also tapply()

e.g.

a<-c(1,1,1,2,2,2,3,3,3)
b<-c(10,10,10,15,15,15,20,20,20)
c.dat<-data.frame(a,b)

tapply(c.dat[,2],c.dat[,1],sum)

Mike

On Thu, Dec 2, 2010 at 10:33 AM, Ivan Calandra  wrote:

> see ?aggregate, and ?summaryBy (in package doBy)
> I think ddply (in package plyr) could also do the job
>
> Ivan
>
> Le 12/2/2010 17:24, chris99 a écrit :
>
>> I am trying to aggregate data in column 2 to identifiers in col 1
>>
>> eg..
>>
>> take this>
>>
>> identifier   quantity
>> 1 10
>> 1 20
>> 2 30
>> 1 15
>> 2 10
>> 3 20
>>
>> and make this>
>>
>> identifier quantity
>> 145
>> 240
>> 320
>>
>>
>> Thanks in advance for your help!
>>
>
> --
> Ivan CALANDRA
> PhD Student
> University of Hamburg
> Biozentrum Grindel und Zoologisches Museum
> Abt. Säugetiere
> Martin-Luther-King-Platz 3
> D-20146 Hamburg, GERMANY
> +49(0)40 42838 6231
> ivan.calan...@uni-hamburg.de
>
> **
> http://www.for771.uni-bonn.de
> http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Michael Rennie, Research Scientist
Fisheries and Oceans Canada, Freshwater Institute
Winnipeg, Manitoba, CANADA

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help summarizing R data frame

2010-12-02 Thread Patrick Hausmann

Here are some examples with tapply, aggregate, ddply:

x <- read.table("clipboard", head=TRUE)

with(x, tapply(quantity, identifier, sum))

aggregate(x$quantity, by=list(x$identifier), sum)

aggregate(quantity ~ identifier, data = x, sum)

library(plyr)
ddply(x, .(identifier), summarise, quantity=sum(quantity))

HTH
Patrick

Am 02.12.2010 17:24, schrieb chris99:


I am trying to aggregate data in column 2 to identifiers in col 1

eg..

take this>

identifier   quantity
1 10
1 20
2 30
1 15
2 10
3 20

and make this>

identifier quantity
145
240
320


Thanks in advance for your help!


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] 2D Random walk

2010-12-02 Thread Patrick Burns

Hopefully someone has a good suggestion
for your question.  I'd just like to
point out that the generation of the
random walk is pretty much optimized for
worst performance.

A 'for' loop is not necessary and growing
the objects could be a very significant
drag.  See Circles 2 and 3 of 'The R Inferno'.

Hint: 'cumsum' will be useful.

On 02/12/2010 15:58, featherbox wrote:


I've wrote some code to simulate a random walk in 2 dimensions on a lattice.
Basically I want to add something in to make it plot it point by point so
you can see what is going on.
Heres my code for the random walk in 2d


RW2D<-function(N)
   {
 i<-0
 xdir<-0
 ydir<-0
 xpos<-vector()
 xpos[1]<-xdir
 ypos<-vector()
 ypos[1]<-ydir
 for (i in 1:N-1)
   {
r<-runif(1)
if(r<=0.25) {xdir<-xdir+1}
if(r>0.25&&  r<=0.5) {xdir<-xdir-1}
if(r>0.5&&  r<=0.75) {ydir<-ydir +1}
if(r>0.75) {ydir<-ydir-1}
xpos[i+1]<-xdir
ypos[i+1]<-ydir
   }
 return(cbind(xpos,ypos))
   }
rw<-RW2D(1)

xmin<-min(rw[,1])
xmax<-max(rw[,1])
ymin<-min(rw[,2])
ymax<-max(rw[,2])

plot(rw[,1],rw[,2],type="l",xlab="x",ylab="y",main="Random Walk Simulation
In Two Dimensions",col="green4",xlim=range(xmin:xmax),ylim=range(ymin:ymax))

end<-cbind(rw[1,1],rw[1,2])
start<-cbind(0,0)

points(start,pch=4,col="red")
points(end,pch=4,col="red")



--
Patrick Burns
pbu...@pburns.seanet.com
http://www.portfolioprobe.com/blog
http://www.burns-stat.com
(home of 'Some hints for the R beginner'
and 'The R Inferno')

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help summarizing R data frame

2010-12-02 Thread jim holtman
Nice thing about R is there are a number of ways to do things:

> x
  identifier quantity
1  1   10
2  1   20
3  2   30
4  1   15
5  2   10
6  3   20
> require(sqldf)
> sqldf('select identifier, sum(quantity) as quantity from x group by 
> identifier')
  identifier quantity
1  1   45
2  2   40
3  3   20
>

or using 'data.table'

> require(data.table)
Loading required package: data.table
> x <- data.table(x)
> x[, sum(quantity), by = identifier]
 identifier V1
[1,]  1 45
[2,]  2 40
[3,]  3 20


On Thu, Dec 2, 2010 at 11:24 AM, chris99  wrote:
>
> I am trying to aggregate data in column 2 to identifiers in col 1
>
> eg..
>
> take this>
>
> identifier       quantity
> 1                     10
> 1                     20
> 2                     30
> 1                     15
> 2                     10
> 3                     20
>
> and make this>
>
> identifier         quantity
> 1                    45
> 2                    40
> 3                    20
>
>
> Thanks in advance for your help!
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Help-summarizing-R-data-frame-tp3069624p3069624.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help summarizing R data frame

2010-12-02 Thread Henrique Dallazuanna
Try this:

with(DF, rowsum(quantity, identifier))


On Thu, Dec 2, 2010 at 2:24 PM, chris99  wrote:

>
> I am trying to aggregate data in column 2 to identifiers in col 1
>
> eg..
>
> take this>
>
> identifier   quantity
> 1 10
> 1 20
> 2 30
> 1 15
> 2 10
> 3 20
>
> and make this>
>
> identifier quantity
> 1                    45
> 2                    40
> 3                    20
>
>
> Thanks in advance for your help!
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Help-summarizing-R-data-frame-tp3069624p3069624.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] attempted merge() returns: cannot coerce type 'closure' to vector of type 'any'

2010-12-02 Thread Karl Brand

Cheers Bill,

Thank you for the clarification. Probably showing the str() of these 
objects would have made it easier for you to see exactly where I was going 
wrong. Which, as David pointed out, was really a lack of ?merge reading... :)


Karl


On 12/2/2010 5:46 PM, William Dunlap wrote:

I didn't change your "rownames" to "row.names"
because I figured you had the name you wanted
but only forgot to include the quotes.  Both "rownames"
and "row.names" are valid inputs, but they mean
different things.  However, without the quotes you were
passing in the contents of the object called
rownames, which happens to be a function (aka a closure),
not the string "rownames".  The quotes are needed to
distinguish between things and names of things.
Some functions (e.g., library() and help()) try
to cover up this difference, leading to occasional
difficulties using them.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


-Original Message-
From: Karl Brand [mailto:k.br...@erasmusmc.nl]
Sent: Thursday, December 02, 2010 2:22 AM
To: William Dunlap
Cc: r-help@r-project.org; Dimitris Rizopoulos
Subject: Re: [R] attempted merge() returns: cannot coerce
type 'closure' to vector of type 'any'

Cheers Bill.

You got me halfway, since:

  >  temp<- merge(x=x, y=y[,17, drop=FALSE], by="rownames", sort=FALSE)
Error in fix.by(by.x, x) : 'by' must specify valid column(s)

but, using "row.names" instead of "rownames", like:
  >  temp<- merge(x=x, y=y[,17, drop=FALSE], by="row.names",
sort=FALSE)

works (but adds a column "Row.names").

Which seems somewhat counterintuitive to me, since I am
feeding two
matrices into the merge function, which I understand have
'rownames', not
'row.names' as data frames have, right? Although the output
of merge()
is a data frame...

thanks again,

Karl


On 12/1/2010 6:08 PM, William Dunlap wrote:

Try
 by="rownames"
instead of
 by=rownames

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


-Original Message-
From: r-help-boun...@r-project.org
[mailto:r-help-boun...@r-project.org] On Behalf Of Karl Brand
Sent: Wednesday, December 01, 2010 6:35 AM
To: Dimitris Rizopoulos
Cc: r-help@r-project.org
Subject: [R] attempted merge() returns: cannot coerce type
'closure' to vector of type 'any'

Hi Dimtris and esteemed useRs,

I don't understand why i get this error message when
attempting to use
merge() -

   >   temp<- merge(x, y[,17, drop=FALSE], by=rownames, sort=FALSE)
Error in as.vector(x, mode) :
 cannot coerce type 'closure' to vector of type 'any'

It should work because:

   >   all(rownames(x[order(rownames(x)),]) ==
+ rownames(y[order(rownames(y[,17, drop=FALSE])),17,
drop=FALSE]) 
[TRUNCATED]
[1] TRUE

also:

   >   class(x); class(y[,17, drop=FALSE])
[1] "matrix"
[1] "matrix"


Any idea why I can't use merge() in the normal way here?

I'm forced to

add the column using:

temp.b<- cbind(x, y[match(rownames(x), rownames(y)),17])

All insights appreciated for this leaRner,

cheers,

Karl


--
Karl Brand
Department of Genetics
Erasmus MC
Dr Molewaterplein 50
3015 GE Rotterdam
P +31 (0)10 704 3211 | F +31 (0)10 704 4743 | M +31 (0)642 777 268

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Karl Brand
Department of Genetics
Erasmus MC
Dr Molewaterplein 50
3015 GE Rotterdam
P +31 (0)10 704 3211 | F +31 (0)10 704 4743 | M +31 (0)642 777 268



--
Karl Brand 
Department of Genetics
Erasmus MC
Dr Molewaterplein 50
3015 GE Rotterdam
P +31 (0)10 704 3211 | F +31 (0)10 704 4743 | M +31 (0)642 777 268

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] procrustes results affected by order of sites in input file

2010-12-02 Thread Christine Dolph
Hi, Thanks very much for your response.

Unfortunately, using the set.seed() call does not seem to solve my problem.
If I do not use set.seed(), I do indeed get some small differences in
protest() results due to the effect of random starts. But with my sites in a
given order in the input files, and using set.seed(), I get the same results
every time I run protest().

It's only when I change the order of sites in the input files that I get a
big change in results. The differences I see in protest() results after
re-sorting the order of sites in the input file is quite large, larger than
the smaller differences due to random starts I see if I run protest() without
the set.seed. When sites are re-ordered in the input file, the
correlation-like statistic produced by protest() can jump from between 0.28
to 0.52 for a dataset with 38 sites, with a corresponding change in
significance. Without using set.seed the numbers do jump around a bit, but
not to the same extreme degree.

Looking at the NMDS plots to see how ordinations change with the order of
sites in the file, it seems that if I re-order sites, I am simply getting
ordination solutions that are reflections of one another across one of the 3
NMDS axes. These are not real differences in the ordination solutions. Is
there something I am failing to include in the protest() call that should
enable the function to flip one ordination around to maximize its
similarity to the other ordination (i.e., that should orient the two
ordinations the same way in 3-dimensional space)?

Thanks,

--Christy


On Thu, Dec 2, 2010 at 6:02 AM, Gavin Simpson wrote:

> On Wed, 2010-12-01 at 14:19 -0600, Christine Dolph wrote:
> > Dear All,
> >
> > I am using a Procrustes analysis to compare two NMDS ordinations for the
> > same set of sites. One ordination is based on fish data, the other is
> based
> > on invertebrate data. Ordinations were derived using metaMDS() from the
> > {vegan} library as follows:
> >
> > fish.mds<-metaMDS(fish.data, distance="bray", k=3, trymax=100,
> > wascores=TRUE, trace=TRUE, zero="add")
> >
> > invert.mds<-metaMDS(bugcal.a, distance="bray", k=3, trymax=100,
> > wascores=TRUE, trace=TRUE, zero="add"),
> >
> >
> > where fish.data and invert.data are site-by-abundance matrices for fish
> and
> > invertebrates, respectively.
> >
> > I have then used protest() to test the significance between the two
> > ordinations:
>
> Simple things first:
>
> You did have a set.seed() call before the protest() call didn't you? In
> fact, I'd probably put set.seed() calls before the two metaMDS() calls
> as well.
>
> The ordering should have no impact at all, but the random starts in
> metaMDS() and the random permutations in protest() will vary from run to
> run unless you set the random seed to some known value.
>
> G
>
> > protest.results<-protest(fish.mds, invert.mds, scale=TRUE,
> symmetric=TRUE,
> > permutations=999)
> >
> > The problem is, the results of the protest analysis seem to be affected
> by
> > the original ordering of sites in the data input files. That is, if I
> > re-sort one of the input files based on some criteria, I sometimes see a
> > change in the strength and significance of the concordance results. In an
> > attempt to figure out what is happening, I have re-ordered each dataset
> in a
> > number of ways, and plotted the resulting ordinations. I have seen that
> > while the ordering of sites in the input file does not change the spatial
> > relationship between them (i.e., does not change the fundamental
> ordination
> > solution), it can change how the sites are oriented with respect to the
> > different NMDS axes. I was of the belief that a difference in orientation
> > with respect to the NMDS axes should not affect the Procrustes comparison
> of
> > two ordinations, as the protest function should be rotating and
> reflecting
> > one matrix with respect to the other to find the closest match between
> them
> > (i.e., it should be accounting for differences in how the two solutions
> are
> > oriented in space). However, I appear to see different results depending
> on
> > how sites are oriented with respect to each NMDS axis.
> >
> > When comparing two ordination solutions with Protest, is it necessary to
> > ensure that the original data input files are ordered in the same way?
> >
> > Thanks in advance for any insight.
> >
> > Sincerely,
> >
> >
> >
>
> --
> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
>  Dr. Gavin Simpson [t] +44 (0)20 7679 0522
>  ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
>  Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
>  Gower Street, London  [w] 
> http://www.ucl.ac.uk/~ucfagls/
>  UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
>
>


-- 
Christy Dolph
Ph.D. Candidate
Water Resources Science
University of Minnesota
200 Hods

Re: [R] 2D Random walk

2010-12-02 Thread jim holtman
Here is a use of color to show what the path is and following up on
Patrick's post to make the code a little more efficient:

# compute path
n <- 1
rw <- matrix(0, ncol = 2, nrow = n)
# generate the indices to set the deltas
indx <- cbind(seq(n), sample(c(1, 2), n, TRUE))

# now set the values
rw[indx] <- sample(c(-1, 1), n, TRUE)
# cumsum the columns
rw[,1] <- cumsum(rw[, 1])
rw[, 2] <- cumsum(rw[, 2])

plot(0,type="n",xlab="x",ylab="y",main="Random Walk Simulation
In Two Dimensions",col=1:10,xlim=range(rw[,1]),ylim=range(rw[,2]))

# use 'segments' to color each path
segments(head(rw[, 1], -1)
   , head(rw[, 2], -1)
   , tail(rw[, 1], -1)
   , tail(rw[, 2], -1)
   , col = rainbow(nrow(rw) -1)  # a range of colors
   )

end<-cbind(rw[1,1],rw[1,2])
start<-cbind(0,0)

points(start,pch=16,col="green", cex = 3)
points(end,pch=16,col="red", cex = 3)



On Thu, Dec 2, 2010 at 10:58 AM, featherbox
 wrote:
>
> I've written some code to simulate a random walk in 2 dimensions on a lattice.
> Basically I want to add something in to make it plot point by point so
> you can see what is going on.
> Here's my code for the random walk in 2D
>
>
> RW2D<-function(N)
>  {
>    i<-0
>    xdir<-0
>    ydir<-0
>    xpos<-vector()
>    xpos[1]<-xdir
>    ypos<-vector()
>    ypos[1]<-ydir
>    for (i in 1:N-1)
>      {
>        r<-runif(1)
>        if(r<=0.25) {xdir<-xdir+1}
>        if(r>0.25 && r<=0.5) {xdir<-xdir-1}
>        if(r>0.5 && r<=0.75) {ydir<-ydir +1}
>        if(r>0.75) {ydir<-ydir-1}
>        xpos[i+1]<-xdir
>        ypos[i+1]<-ydir
>      }
>    return(cbind(xpos,ypos))
>  }
> rw<-RW2D(1)
>
> xmin<-min(rw[,1])
> xmax<-max(rw[,1])
> ymin<-min(rw[,2])
> ymax<-max(rw[,2])
>
> plot(rw[,1],rw[,2],type="l",xlab="x",ylab="y",main="Random Walk Simulation
> In Two Dimensions",col="green4",xlim=range(xmin:xmax),ylim=range(ymin:ymax))
>
> end<-cbind(rw[1,1],rw[1,2])
> start<-cbind(0,0)
>
> points(start,pch=4,col="red")
> points(end,pch=4,col="red")
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/2D-Random-walk-tp3069557p3069557.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Tukey Test after lme, error: less than two groups

2010-12-02 Thread Lilith Epperlein
Dear R-group,

I am trying desperately to get this Tukey test working. It's my first time here 
so I hope my question is not too stupid, but I couldn't find anything helpful 
in the help or in the forum.

I am analysing a dataset of treated grasses. I want to find out if the grasses 
from different Provenances react differently.
In the aov test I found a significance for the combination Treatment and 
provenance:

summary(PAMaov<-aov(PAMval~Treatmentf*Pretreatmentf*Provenancef+Error(Datef/Code)))

Treatmentf:Provenancef   p-value:  0.008023 ** 

In the Linear fixed effects model lme, I can see that there is a significance 
for two provenances (HU and ES)

summary(PAM.lme<-lme(PAMval~Treatmentf*Provenancef*Pretreatmentf, random= 
~1|Datef/Code,na.action=na.omit))

  Value  Std.Error  DF   t-value 
p-value
(Intercept)   0.6890317 0.06117401 994 11.263473  
0.
TreatmentfF  -0.2897619 0.05484590 467 -5.283201  
0.
ProvenancefDE 0.0105873 0.05484590 467  0.193037  
0.8470

TreatmentfF:ProvenancefES 0.1647302 0.08226884 467  2.002340  
0.0458
TreatmentfF:ProvenancefHU 0.1569524 0.07756381 467  2.023526  
0.0436

Now the big mystery is the Tukey test. I just can't find the mistake; it keeps 
telling me that there are "less than two groups"

summary(glht(PAM.lme, linfct = mcp(Provenancef = "Tukey")))

Fehler in contrMat(table(mf[[nm]]), type = types[pm]) : 
  less than two groups

I guess it's important to know that I made factors out of some of the data. Here 
is the code:


PAMdata$provenance[PAMdata$provenance == "5"] = "ES" 
PAMdata$provenance[PAMdata$provenance == "6"] = "HU"
# etc.

Treatmentf <- factor(PAMdata$treatment, levels=c("C","F"))
Datef <- factor(PAMdata$Date, levels=c( "25.05.10 14:00","26.05.10 
19:00","27.05.2010 7:30","27.05.10 14:00","01.06.10 14:00","02.06.10 
19:00","23.06.10 12:30"),ordered=TRUE)


Pretreatmentf <- as.factor(PAMdata$pretreatment)
Provenancef <- as.factor(PAMdata$provenance)
Greenhousef <- as.factor(PAMdata$greenhouse)
Individualf <- as.factor(PAMdata$individual)

PAMval <- (PAMdata$DataPAM)
Code<-(PAMdata$code)

Thank you for any hint! That Tukey test seems so easy, I just can't find the 
mistake
Thank you very much for your help and greetings from Tanzania,
Lilith



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Hmisc label function applied to data frame

2010-12-02 Thread Krishna Tateneni
Hello,

I'm attempting to create a data frame with correlations between every pair
of variables in a data frame, so that I can then sort by the value of the
correlation coefficient and see which pairs of variables are most strongly
correlated.

The sm2vec function in the corpcor library works very nicely as shown here:

library(Hmisc)
library(corpcor)

# Create example data
x1 = runif(50)
x2 = runif(50)
x3 = runif(50)
d = data.frame(x1=x1,x2=x2,x3=x3)
label(d$x1) = "Variable x1"
label(d$x2) = "Variable x2"
label(d$x3) = "Variable x3"

# Get correlations
cormat = cor(d)

# Get vector form of lower triangular elements
cors = sm2vec(cormat,diag=F)
inds = sm.index(cormat,diag=F)

# Create a data frame
var1 = dimnames(cormat)[[1]][inds[,1]]
var2 = dimnames(cormat)[[2]][inds[,2]]
lbl1 = label(d[,var1])
lbl2 = label(d[,var2])
cor_df = data.frame(Var1=lbl1,Var2=lbl2,Cor=cors)

The issue that I run into is when trying to get the labels in lbl1 and
lbl2.  I get the warning:

In mapply(FUN = label, x = x, default = default, MoreArgs = list(self =
TRUE),  :
  longer argument not a multiple of length of shorter

My usage of label seems ambiguous since the data frame could also have a label
attached to it, aside from labels attached to variables within the data
frame.  However, the code above does work, with the warning.  Aside from
using a loop to get the label of one variable at a time, is there any other
way of getting the labels for all variables in the data frame?

Also, if there is a better way to achieve my goal of getting the
correlations between all variable pairs, I'd love to know.
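One loop-free way to collect the labels, building on the setup above (a sketch; sapply() applies label() to each column separately, which avoids the data-frame ambiguity, and base which(lower.tri(...)) replaces corpcor's sm2vec/sm.index):

```r
library(Hmisc)

# Same example setup as above
d <- data.frame(x1 = runif(50), x2 = runif(50), x3 = runif(50))
label(d$x1) <- "Variable x1"
label(d$x2) <- "Variable x2"
label(d$x3) <- "Variable x3"

# One label per column, named by variable -- no warning, no loop
lbls <- sapply(d, label)

cormat <- cor(d)
# row/column indices of the lower-triangular (unique) pairs
inds <- which(lower.tri(cormat), arr.ind = TRUE)
cor_df <- data.frame(Var1 = lbls[rownames(cormat)[inds[, 1]]],
                     Var2 = lbls[colnames(cormat)[inds[, 2]]],
                     Cor  = cormat[inds])
# strongest correlations first
cor_df[order(-abs(cor_df$Cor)), ]
```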

Thanks in advance for any responses!

--Krishna

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] openNLP package error

2010-12-02 Thread Sascha Wolfer

Hello list,

I seem to have a problem with the openNLP package, I'm actually stuck in 
the very beginning. Here's what I did:

> install.packages("openNLP")
> install.packages("openNLPmodels.de", repos = 
"http://datacube.wu.ac.at/", type = "source")


> library(openNLPmodels.de)
> library(openNLP)

So I installed the main package as well as the supplementary german 
model. Now, I try to use the "sentDetect" function:


> s <- c("Das hier ist ein Satz. Und hier ist noch einer - sogar mit 
Gedankenstrich. Ist das nicht toll?")

> sentDetect(s, language = "de", model = "openNLPmodels.de")

I get the following error message which I can't make any sense of:

Fehler in .jnew("opennlp/maxent/io/SuffixSensitiveGISModelReader", 
.jnew("java.io.File",  :
  java.io.FileNotFoundException: openNLPmodels.de (No such file or 
directory)


The model seems to have been installed just fine but there seems to be a 
directory missing. The documentation of openNLP doesn't mention anything 
like setting a particular working directory or something... also a 
Google search for this error didn't come up with anything useful.


Does anybody have a solution for this?

I'm on R 2.12.0 on Mac OS X 10.6.5

Thanks,
Sascha

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Suitable test for ordinal variable vs continuous variable trend

2010-12-02 Thread Greg Snow
There are many possibilities depending on what you are really looking for 
(strictly increasing, vs. increasing or constant, vs. a general trend, etc.).

I would start with a plot of the data, that may result in a significant 
interocular concussion test and will at least help you understand what is 
happening in your data.

The plotting idea can be extended to a more formal test if you have a clear 
null hypothesis and a way to simulate from the null (e.g. if the null is no 
relationship, then you can just permute one of the variables) by using the 
vis.test function in the TeachingDemos package.

You could calculate the correlation between the continuous variable and 
as.numeric(ordered variable) as a spearman like correlation which would give a 
general trend.

You could do a linear regression (assuming the continuous variable is close 
enough to normal conditional on the ordered one) with contrasts set to measure 
the differences in successive groups. If all the coefficients are 
significant and of the same sign, then you have a clear monotonic trend; if you 
have some that are significant but of different signs, then you clearly don't 
have a monotonic trend; if some are non-significant, then you need to decide 
what that means for you.
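A minimal sketch of the first two suggestions above (simulated data; MASS::contr.sdif supplies successive-difference contrasts, so each coefficient compares a level with the one before it):

```r
library(MASS)   # contr.sdif: successive-difference contrasts

set.seed(1)
ord <- factor(sample(1:4, 200, replace = TRUE), ordered = TRUE)
y   <- as.numeric(ord) + rnorm(200)   # continuous variable with a built-in trend

# General monotonic association: Spearman-type rank correlation
cor.test(y, as.numeric(ord), method = "spearman", exact = FALSE)

# Linear model in which each coefficient is the difference between
# successive levels of the ordinal variable
fit <- lm(y ~ C(ord, contr.sdif))
summary(fit)$coefficients
```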

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
> project.org] On Behalf Of Daniel Brewer
> Sent: Thursday, December 02, 2010 7:17 AM
> To: r-help@r-project.org
> Subject: [R] Suitable test for ordinal variable vs continuous variable
> trend
> 
> Dear all,
> 
> For a population of about 200 I have a continuous variable and an
> ordinal variable.  The question I would like to ask is whether the
> continuously increases (or decreases) as the rank of the ordinal
> variable increases.  I was thinking that a Spearmen's rank correlation
> or or a chi squared trend might be appropriate.  I don't have any
> experience dealing with ordinal variables so I am at a bit of a loss.
> What is the most appropriate test? and is it implemented in R?
> 
> Many thanks
> 
> 
> **
> 
> Daniel Brewer
> 
> Institute of Cancer Research
> Molecular Carcinogenesis
> MUCRC
> 15 Cotswold Road
> Sutton, Surrey SM2 5NG
> United Kingdom
> 
> Tel: +44 (0) 20 8722 4109
> Fax: +44 (0) 20 8722 4141
> 
> Email: daniel.bre...@icr.ac.uk
> 
> **
> 
> The Institute of Cancer Research: Royal Cancer Hospital, a charitable
> Company Limited by Guarantee, Registered in England under Company No.
> 534147 with its Registered Office at 123 Old Brompton Road, London SW7
> 3RP.
> 
> This e-mail message is confidential and for use by the
> a...{{dropped:5}}
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] StatET: Connecting to a remote rterm instance

2010-12-02 Thread WanderingWizard

Is it possible to use Eclipse (StatET) to connect to an rterm instance
running on another computer?  It looks like it should be easy, but when I
select an R Engine it says "Connection failed to: hostname" everytime.  Is
there a guide somewhere?

-- 
View this message in context: 
http://r.789695.n4.nabble.com/StatET-Connecting-to-a-remote-rterm-instance-tp3069931p3069931.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] StatET: Connecting to a remote rterm instance

2010-12-02 Thread Tobias Verbeke

Dear WanderingWizard,

On 12/02/2010 07:58 PM, WanderingWizard wrote:


Is it possible to use Eclipse (StatET) to connect to an rterm instance
running on another computer?


Yes. You can launch, disconnect and reconnect to a remote R console.


It looks like it should be easy, but when I
select an R Engine it says "Connection failed to: hostname" everytime.  Is
there a guide somewhere?


There is Eclipse Help available in the StatET User Guide
(Help >  Help Contents > StatET..)
and additional information can be found (if needed) in
the archives of the StatET user list

http://lists.r-forge.r-project.org/mailman/listinfo/statet-user

searchable here

http://lists.r-forge.r-project.org/mailman/swish.cgi?query=listname%3D%22statet-user%22

Best,
Tobias

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Extreme value probabilities

2010-12-02 Thread Sarah Henderson
Greetings to all --

I have a set of ~1300 radon measurements from homes across British
Columbia in Canada.  We have split these measurements into five groups
according to the tectonic belt in which they fall, with N ranging
from ~160 to ~600 measurements per group.  The values are roughly
log-normally distributed (1) overall and (2) within each group.

For those areas where values are low I would like to be able to
estimate the probability of finding values greater than 200 Bq/m3 (the
new safety standard) even though the maximum measured value is 174
Bq/m3.  I know this is a question of extreme value distributions, and
the reading I have done so far suggests that there are multiple
approaches (maximum likelihood, probability-weighted moments,...).  I
am wondering if someone could point me towards R packages/functions
appropriate for these data?  I assume that it would evir and I have
read through the documentation, but I would appreciate some guidance
if someone with more experience is willing to offer it.

Many thanks,

Sarah
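Not a substitute for a proper extreme-value fit, but since the data are described as roughly log-normal, a quick baseline estimate of P(X > 200) comes from the fitted log-normal tail (a sketch with simulated stand-in data; an evir-style threshold model would refine the tail):

```r
set.seed(1)
radon <- rlnorm(160, meanlog = 3.5, sdlog = 0.8)  # stand-in for one belt's data

# Fit a log-normal by the mean and sd of log(values)
m <- mean(log(radon))
s <- sd(log(radon))

# Estimated probability of exceeding the 200 Bq/m3 standard,
# even if no measured value reaches it
p_exceed <- 1 - plnorm(200, meanlog = m, sdlog = s)
p_exceed
```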

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Filter data

2010-12-02 Thread Diogenas

Hello,
I understand that the question is probably stupid, but ...

I have data (polity IV index)
"country","year","democ","autoc","polity","polity2"
"1","Afghanistan ",1800,1,7,-6,-6
"2","Afghanistan ",1801,1,7,-6,-6
"3","Afghanistan ",1802,1,7,-6,-6

I need to create a new data set which includes only cases after year 1995.
I've tried this command:
poli2<-poli[poli$year > 1994,] ,
 however it generated this:
Error in poli[poli$year > 1994, ] : incorrect number of dimensions.
Any ideas how i can overcome this?

Thanks in advance for your help,





-
Diogenas
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Filter-data-tp3070069p3070069.html
Sent from the R help mailing list archive at Nabble.com.
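For reference, the "incorrect number of dimensions" error usually means the object being subscripted is not actually a data frame (e.g. a plain vector or a list component). Once the data are in a data frame, either form below works (a sketch rebuilt from the posted rows):

```r
# Rebuild a small version of the posted data as a data frame
poli <- data.frame(country = "Afghanistan",
                   year    = c(1800, 1801, 1996, 2001),
                   polity2 = c(-6, -6, -6, -6))
str(poli)   # confirm it really is a data.frame with a 'year' column

# Keep only cases after 1994 -- the two forms are equivalent
poli2 <- poli[poli$year > 1994, ]
poli3 <- subset(poli, year > 1994)
```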

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Women's logic

2010-12-02 Thread Vovan
Do you want to understand women? To know their real
thoughts when they try to hide them behind
completely abstract talk? At the practical
trainings of the Dating Academy you will learn,
in real-life conditions, to be an attentive
and understanding man for women.
http://samec.org.ua/?p=202
Don't believe it? Try it, you'll like it.
I wish you great scientific success.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



[R] latex tables for 3+ dimensional tables/arrays

2010-12-02 Thread Michael Friendly
I'm looking for an R method to produce latex versions of tables for 
table/array objects of 3 or more dimensions,
which, of necessity is flattened to a 2D display, for example with 
ftable(), or vcd::structable, as shown below.

I'd be happy to settle for a flexible solution for the 3D case.

> UCB <- aperm(UCBAdmissions, c(2, 1, 3))
> ftable(UCB)
Dept   A   B   C   D   E   F
Gender Admit
Male   Admitted  512 353 120 138  53  22
   Rejected  313 207 205 279 138 351
Female Admitted   89  17 202 131  94  24
   Rejected   19   8 391 244 299 317
> structable(Dept ~ Admit+Gender, UCBAdmissions)
Dept   A   B   C   D   E   F
AdmitGender
Admitted Male512 353 120 138  53  22
 Female   89  17 202 131  94  24
Rejected Male313 207 205 279 138 351
 Female   19   8 391 244 299 317


I tried using the xtable package but discovered that xtable.table is 
silently
wrong for tables of length(dim(x))>2.  E.g., for a 2 x 2 x 6 table, we 
get only UCBAdmissions[,,1]


> xtable(UCBAdmissions)
% latex table generated in R 2.11.1 by xtable 1.5-6 package
% Tue Nov 16 13:23:44 2010
\begin{table}[ht]
\begin{center}
\begin{tabular}{rrr}
  \hline
& Male & Female \\
  \hline
Admitted & 512.00 & 89.00 \\
  Rejected & 313.00 & 19.00 \\
   \hline
\end{tabular}
\end{center}
\end{table}
>

The author of xtable tells me that only 2-way tables are presently 
implemented, and, while I would normally

look to Hmisc for all things LaTeX, I don't find anything useful there:

> library(Hmisc)
> methods("latex")
 [1] latex.bystats  latex.bystats2
 [3] latex.default  latex.describe
 [5] latex.describe.single  latex.function
 [7] latex.list latex.responseSummary
 [9] latex.summary.formula.cross    latex.summary.formula.response
[11] latex.summary.formula.reverse
> latex(UCB)
Error in x[, j] : incorrect number of dimensions
> traceback()
3: format.df(object, dcolumn = dcolumn, na.blank = na.blank, 
numeric.dollar = numeric.dollar,
   cdot = cdot, math.row.names = math.row.names, math.col.names = 
math.col.names,

   double.slash = double.slash, ...)
2: latex.default(UCB)
1: latex(UCB)


--
Michael Friendly Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University  Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele StreetWeb:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Filter data

2010-12-02 Thread Nordlund, Dan (DSHS/RDA)
> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
> project.org] On Behalf Of Diogenas
> Sent: Thursday, December 02, 2010 12:09 PM
> To: r-help@r-project.org
> Subject: [R] Filter data
> 
> 
> Hello,
> I understand that question is probably stupid, but ...
> 
> I have data (polity IV index)
> "country","year","democ","autoc","polity","polity2"
> "1","Afghanistan ",1800,1,7,-6,-6
> "2","Afghanistan ",1801,1,7,-6,-6
> "3","Afghanistan ",1802,1,7,-6,-6
> 
> I need to create new data sets which includes only cases after year
> 1995.
> I've tried this command:
> poli2<-poli[poli$year > 1994,] ,
>  however it generated this:
> Error in poli[poli$year > 1994, ] : incorrect number of dimensions.
> Any ideas how i can overcome this?
> 

We need more information.  You haven't given us a self-contained, reproducible 
example.  The syntax you show works for me.  What does 

>str(poli)
 
show?

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] latex tables for 3+ dimensional tables/arrays

2010-12-02 Thread David Winsemius


On Dec 2, 2010, at 3:47 PM, Michael Friendly wrote:

I'm looking for an R method to produce latex versions of tables for  
table/array objects of 3 or more dimensions,
which, of necessity is flattened to a 2D display, for example with  
ftable(), or vcd::structable, as shown below.

I'd be happy to settle for a flexible solution for the 3D case.

> UCB <- aperm(UCBAdmissions, c(2, 1, 3))
> ftable(UCB)
   Dept   A   B   C   D   E   F
Gender Admit
Male   Admitted  512 353 120 138  53  22
  Rejected  313 207 205 279 138 351
Female Admitted   89  17 202 131  94  24
  Rejected   19   8 391 244 299 317
> structable(Dept ~ Admit+Gender, UCBAdmissions)
   Dept   A   B   C   D   E   F
AdmitGender
Admitted Male512 353 120 138  53  22
Female   89  17 202 131  94  24
Rejected Male313 207 205 279 138 351
Female   19   8 391 244 299 317


?latex  # yes I know you are aware of this but it seemed as though you  
hadn't read its help page.

# Attached is the output on my device from the following command:
latexVerbatim( ftable(UCB))

# (OK, so I can't speell.)
--
David.

verbatiom.pdf
Description: Adobe PDF document







I tried using the xtable package but discovered that xtable.table is  
silently
wrong for tables of length(dim(x))>2.  E.g., for a 2 x 2 x 6 table,  
we get only UCBAdmissions[,,1]


> xtable(UCBAdmissions)
% latex table generated in R 2.11.1 by xtable 1.5-6 package
% Tue Nov 16 13:23:44 2010
\begin{table}[ht]
\begin{center}
\begin{tabular}{rrr}
 \hline
& Male & Female \\
 \hline
Admitted & 512.00 & 89.00 \\
 Rejected & 313.00 & 19.00 \\
  \hline
\end{tabular}
\end{center}
\end{table}
>

The author of xtable tells me that only 2-way tables are presently  
implemented, and, while I would normally
look to Hmisc for all things LaTeX, I don't find anything useful  
there:


> library(Hmisc)
> methods("latex")
[1] latex.bystats  latex.bystats2
[3] latex.default  latex.describe
[5] latex.describe.single  latex.function
[7] latex.list latex.responseSummary
[9] latex.summary.formula.cross    latex.summary.formula.response
[11] latex.summary.formula.reverse
> latex(UCB)
Error in x[, j] : incorrect number of dimensions
> traceback()
3: format.df(object, dcolumn = dcolumn, na.blank = na.blank,  
numeric.dollar = numeric.dollar,
  cdot = cdot, math.row.names = math.row.names, math.col.names =  
math.col.names,

  double.slash = double.slash, ...)
2: latex.default(UCB)
1: latex(UCB)


--
Michael Friendly Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University  Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele StreetWeb:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Filter data

2010-12-02 Thread Diogenas

Simplified version of str(poli): (real dataset includes 36 variables)

 $ country : chr [1:15846] "Afghanistan " "Afghanistan " "Afghanistan " "Afghanistan " ...
 $ year: num [1:15846] 1800 1801 1802 1803 1804 ...
 $ democ   : num [1:15846] 1 1 1 1 1 1 1 1 1 1 ...
 $ autoc   : num [1:15846] 7 7 7 7 7 7 7 7 7 7 ...
 $ polity  : num [1:15846] -6 -6 -6 -6 -6 -6 -6 -6 -6 -6 ...
 $ polity2 : num [1:15846] -6 -6 -6 -6 -6 -6 -6 -6 -6 -6 ...
- attr(*, "label.table")=List of 36
  ..$ country : NULL
  ..$ year: NULL
  ..$ democ   : NULL
  ..$ autoc   : NULL
  ..$ polity  : NULL
  ..$ polity2 : NULL


 

-
Diogenas
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Filter-data-tp3070069p3070201.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] problem with package rsm: running fit.mult.impute with cph

2010-12-02 Thread Frank Harrell

rms and fit.mult.impute don't necessarily play well with mice.

Note also that you will need a special adjustment for variances for having
done stepwise regression.  One approach is to insert a zero for a
coefficient every time that variable is not 'selected', then compute the
sample variance of the beta hats and zeros [using the bootstrap].
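
The zero-insertion idea above can be sketched as follows. This is an illustrative toy only: the data frame `dat`, its outcome `y`, and predictors `x1`-`x3` are all made up, and `step()` stands in for whatever stepwise procedure was actually used.

```r
# Simulated stand-in data (hypothetical names)
set.seed(42)
n <- 100
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- 1 + 0.5 * dat$x1 + rnorm(n)

B <- 200
all_vars <- c("x1", "x2", "x3")
betas <- matrix(0, B, length(all_vars), dimnames = list(NULL, all_vars))
for (b in seq_len(B)) {
  d <- dat[sample(nrow(dat), replace = TRUE), ]
  fit <- step(lm(y ~ x1 + x2 + x3, data = d), trace = 0)
  cf <- coef(fit)[all_vars]                # NA where a variable was dropped
  betas[b, ] <- ifelse(is.na(cf), 0, cf)   # insert zeros, per the suggestion
}
apply(betas, 2, sd)   # bootstrap SEs that reflect the selection step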

Frank


-
Frank Harrell
Department of Biostatistics, Vanderbilt University
-- 
View this message in context: 
http://r.789695.n4.nabble.com/problem-with-package-rsm-running-fit-mult-impute-with-cph-tp3069319p3070218.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] MANOVA Type III SS

2010-12-02 Thread Ali S
How do I get MANOVA results with Type III sums of squares?


  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Integral of PDF

2010-12-02 Thread Doran, Harold
The integral of any probability density from -Inf to Inf should equal 1, 
correct? I don't understand the last result below.

> integrate(function(x) dnorm(x, 0,1), -Inf, Inf)
1 with absolute error < 9.4e-05

> integrate(function(x) dnorm(x, 100,10), -Inf, Inf)
1 with absolute error < 0.00012

> integrate(function(x) dnorm(x, 500,50), -Inf, Inf)
8.410947e-11 with absolute error < 1.6e-10

> all.equal(integrate(function(x) dnorm(x, 500,50), -Inf, Inf)$value, 0)
[1] TRUE

> sessionInfo()
R version 2.10.1 (2009-12-14)
i386-pc-mingw32

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] statmod_1.4.6  mlmRev_0.99875-1   lme4_0.999375-35   Matrix_0.999375-33 
lattice_0.17-26

loaded via a namespace (and not attached):
[1] grid_2.10.1   nlme_3.1-96   stats4_2.10.1 tools_2.10.1

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] MANOVA Type III SS

2010-12-02 Thread David Winsemius


On Dec 2, 2010, at 4:18 PM, Ali S wrote:


How do I get MANOVA results with Type III sums of squares?



First pull on your asbestos underwear, then visit :

http://search.r-project.org/cgi-bin/namazu.cgi?query=type+III+SS&max=100&result=normal&sort=score&idxname=functions&idxname=Rhelp08&idxname=Rhelp10&idxname=Rhelp02

Followed by:

http://finzi.psych.upenn.edu/R/library/car/html/Anova.html
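
For the MANOVA case specifically, car::Anova() accepts a multivariate lm fit and a `type` argument. A minimal sketch with made-up names (`y1`, `y2` responses; `f1`, `f2` factors; simulated data), noting that Type III tests are only interpretable with sum-to-zero contrasts:

```r
library(car)

# Hypothetical balanced design: two responses, two crossed factors
set.seed(1)
dat <- expand.grid(f1 = factor(1:2), f2 = factor(1:3), rep = 1:10)
dat$y1 <- rnorm(nrow(dat))
dat$y2 <- rnorm(nrow(dat))

# Sum-to-zero contrasts so Type III tests make sense
fit <- lm(cbind(y1, y2) ~ f1 * f2, data = dat,
          contrasts = list(f1 = contr.sum, f2 = contr.sum))
Anova(fit, type = "III")   # multivariate tests with Type III SS
```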


--


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Significance of the difference between two correlation coefficients

2010-12-02 Thread Norm Matloff
Adaikalavan Ramasamy wrote:

> Using the info from that website, I can code up the following to give
> the two-tailed p-value of difference in correlations:
> 
>   diff.corr <- function( r1, n1, r2, n2 ){
> ...

William Revelle also mentioned the r.test in the psych package.

I would add here that inference on second-order quantities, such as
correlation coefficients and variances, is not robust to the assumption
of a normally-distributed population.  (Inference on first-order
quantities such as means and regression coefficients, IS pretty robust
to that assumption.) A good general alternative is the bootstrap,
implemented in R in the boot package.
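
To make the bootstrap suggestion concrete, here is a minimal sketch for two independent samples; all the data are simulated placeholders, and in practice `d1`/`d2` would be the real samples:

```r
library(boot)

# Simulated stand-ins for two independent samples with columns x and y
set.seed(1)
d1 <- data.frame(x = rnorm(50)); d1$y <- d1$x + rnorm(50)
d2 <- data.frame(x = rnorm(60)); d2$y <- 0.3 * d2$x + rnorm(60)

# Statistic: correlation computed on a bootstrap resample
r_stat <- function(d, i) cor(d$x[i], d$y[i])
b1 <- boot(d1, r_stat, R = 999)
b2 <- boot(d2, r_stat, R = 999)

# Since the samples are independent, pair the replicates and take quantiles
diff_reps <- b1$t - b2$t
quantile(diff_reps, c(0.025, 0.975))   # approximate 95% CI for r1 - r2
```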

Norm Matloff

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] parLapply - Error in do.call("fun", lapply(args, enquote)) : could not find function "fun"

2010-12-02 Thread Adrienne Wootten
Hello everybody,

I've got a bit of a problem with parLapply that's left me scratching my head
today.  I've tried this in R 2.11 and the 32-bit Revolution R Enterprise and
gotten the same result; the OS in question is Windows XP, and the package
involved is the snow package.

I've got a list of 20 rain/no-rain (1/0) situations for these two stations i
and j. All the items in this list look like the one below, each with a
different row order:

> setin[[20]]
   ri_1 rj_1
29    0    0
15    0    0
3     1    0
31    0    0
4     0    0
1     1    1
18    1    1
21    0    0
27    0    0
24    0    0
6     0    1
9     1    1
5     0    0
30    0    0
25    0    0
28    0    0
19    1    1
8     0    0
17    1    1
12    0    0
10    0    0
22    1    1
23    0    0
7     1    1
14    0    0
26    0    0
13    0    0
2     0    1
11    1    1
20    1    1
16    1    1

What I'm trying to do is apply the same function I crafted to all the
items in the list with parLapply.  I'm using this instead of lapply since it
allows me to test with R running in parallel, and because it runs faster than
lapply.

M_set <-
parLapply(cl,setin,M.set.find,month=month,n1=n1,n2=n2,MC_1st_obs2=MC_1st_obs2)

using this it gives me the follow error:

Error in do.call("fun", lapply(args, enquote)) :
  could not find function "fun"

but, this works correctly when I just use lapply (it's just a bit slower
than I need it to be).  Also, I know that the clusterCall function works
fine with my homemade function because all the nodes of the cluster return
the appropriate results when I try this:

clusterCall(cl,M.set.find,setin=setin[[1]],month=month,n1=n1,n2=n2,MC_1st_obs2=MC_1st_obs2)

but that will only let me do this calculation one at a time.

I perused the earlier post about this error, and that doesn't work for me, I
don't do anything anywhere in my code to mess with "c".  I also know that my
code is producing the rest of the required parts for the function correctly.

I wish I could provide more on what's happened, but the code involved is
somewhat extensive.  Any ideas all of you have would be wonderful.
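
For what it's worth, the usual cause of snow's "could not find function" error under parLapply is that the workers never received the function (or objects it references), even when a clusterCall with explicit arguments happens to work. A hedged sketch of the standard remedy, reusing the names from the post above and assuming the function and helper objects live in the master's global environment:

```r
library(snow)
cl <- makeCluster(4, type = "SOCK")   # cluster setup is an assumption

# Ship the function and anything it references to every worker first:
clusterExport(cl, list("M.set.find", "month", "n1", "n2", "MC_1st_obs2"))

M_set <- parLapply(cl, setin, M.set.find,
                   month = month, n1 = n1, n2 = n2,
                   MC_1st_obs2 = MC_1st_obs2)
stopCluster(cl)
```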

Thanks in advance!

A
-- 
Adrienne Wootten
Graduate Research Assistant
State Climate Office of North Carolina
Department of Marine, Earth and Atmospheric Sciences
North Carolina State University

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] latex tables for 3+ dimensional tables/arrays

2010-12-02 Thread Michael Friendly

On 12/2/2010 4:04 PM, David Winsemius wrote:


?latex  # yes I know you are aware of this but it seemed as though you 
hadn't read its help page.

# Attached is the output on my device from the following command:
latexVerbatim( ftable(UCB))


Perhaps I should have been more explicit.  I can always use verbatim as 
a fall-back, but what I want is
something to produce the *latex* code to give something actually 
formatted as a latex table.


Below is
something along the lines of what I want (similar to  t(  
structable(Dept ~ Admit+Gender, UCBAdmissions)) ).
This was written *directly* from an object in SAS/IML by routines I 
wrote some time ago.  I'm looking

for an R equivalent, something that can be used in Sweave documents.

% Table berk3 written by md2tex 12Dec99
\begin{table}[htb]
 \caption{Berkeley data: 3-way table}\label{tab:berk3}
 \begin{center}
 \begin{tabular}{|c|rr|rr|}
  \hline
& \multicolumn{2}{c}{Male} & \multicolumn{2}{c|}{Female} \\ \cline{2-5}
  Dept & Admit & Reject & Admit & Reject  \\
  \hline
  A & 512 & 313 & 89 & 19 \\
  B & 353 & 207 & 17 & 8 \\
  C & 120 & 205 & 202 & 391 \\
  D & 138 & 279 & 131 & 244 \\
  E & 53 & 138 & 94 & 299 \\
  F & 22 & 351 & 24 & 317 \\
  \hline
 \end{tabular}
 \end{center}
\end{table}
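
One partial workaround, offered only as a sketch (it emits a plain tabular without the \multicolumn spanner headers shown above), is to flatten the ftable to a character matrix with format() and hand that to xtable:

```r
library(xtable)

UCB <- aperm(UCBAdmissions, c(2, 1, 3))
m <- format(ftable(UCB), quote = FALSE)  # character matrix incl. labels

# Row/column labels are already inside m, so suppress xtable's own names
print(xtable(m), include.rownames = FALSE, include.colnames = FALSE,
      sanitize.text.function = identity)
```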

--
Michael Friendly Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University  Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele StreetWeb:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Filter data

2010-12-02 Thread mohamed . lajnef

Hi Diogenas,

Try this,
poli2 <- subset(poli, year > 1994)

Best,
M

Diogenas  a écrit :



Hello,
I understand that question is probably stupid, but ...

I have data (polity IV index)
"country","year","democ","autoc","polity","polity2"
"1","Afghanistan ",1800,1,7,-6,-6
"2","Afghanistan ",1801,1,7,-6,-6
"3","Afghanistan ",1802,1,7,-6,-6

I need to create new data sets which includes only cases after year 1995.
I've tried this command:
poli2<-poli[poli$year > 1994,] ,
 however it generated this:
Error in poli[poli$year > 1994, ] : incorrect number of dimensions.
Any ideas how i can overcome this?

Thank in advance for your help,





-
Diogenas
--
View this message in context:   
http://r.789695.n4.nabble.com/Filter-data-tp3070069p3070069.html

Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Filter data

2010-12-02 Thread Mike Rennie
Try looking at subset()... I typically use this when I'm trying to isolate
specific cases from a data frame.

Also, Dalgaard's introductory R book has some great examples of parsing
data in the manner you're attempting; you might want to look at that.

Mike
On Thu, Dec 2, 2010 at 2:08 PM, Diogenas  wrote:

>
> Hello,
> I understand that question is probably stupid, but ...
>
> I have data (polity IV index)
> "country","year","democ","autoc","polity","polity2"
> "1","Afghanistan ",1800,1,7,-6,-6
> "2","Afghanistan ",1801,1,7,-6,-6
> "3","Afghanistan ",1802,1,7,-6,-6
>
> I need to create new data sets which includes only cases after year 1995.
> I've tried this command:
> poli2<-poli[poli$year > 1994,] ,
>  however it generated this:
> Error in poli[poli$year > 1994, ] : incorrect number of dimensions.
> Any ideas how i can overcome this?
>
> Thank in advance for your help,
>
>
>
>
>
> -
> Diogenas
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Filter-data-tp3070069p3070069.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Michael Rennie, Research Scientist
Fisheries and Oceans Canada, Freshwater Institute
Winnipeg, Manitoba, CANADA

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Arrange elements on a matrix according to rowSums + short 'apply' Q

2010-12-02 Thread Aaron Polhamus
Ivan and Michael,

Many thanks for the tips, those solved my queries. Still interested in how
to force custom functions to work over rows rather than columns when using
apply, but the MAT/rowSums(MAT) technique is definitely the most efficient
way to go for this application.
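
To close the loop on the "apply over rows" question: apply(MAT, 1, f) does operate on rows, but it assembles each row's result as a *column* of the output, so a final t() is unavoidable with that idiom, which is why the rowSums() form is preferable. A small self-contained sketch:

```r
MAT <- matrix(1:6, nrow = 2)   # rows: (1, 3, 5) and (2, 4, 6)

# apply() returns each row's result as a column, hence the transpose
TMAT1 <- t(apply(MAT, 1, function(x) x / sum(x)))

# Recycling divides each element by its row's sum directly
TMAT2 <- MAT / rowSums(MAT)

all.equal(TMAT1, TMAT2)   # TRUE
```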

Cheers,
Aaron

2010/12/2 Michael Bedward 

> Hi Aaron,
>
> Following up on Ivan's suggestion, if you want the column order to
> mirror the row order...
>
> mo <- order(rowSums(MAT), decreasing=TRUE)
> MAT2 <- MAT[mo, mo]
>
> Also, you don't need all those extra c() calls when creating
> inputData, just the outermost one.
>
> Regarding your second question, your statements...
>
> TMAT <- apply(t(MAT), 2, function(X) X/sum(X))
> TMAT <- t(TMAT)
>
> is actually just a complicated way of doing this...
>
> TMAT <- MAT / rowSums(MAT)
>
> You can confirm that by doing it your way and then this...
>
> TMAT == MAT / rowSums(MAT)
>
> ...and you should see a matrix of TRUE values
>
> Michael
>
>
> On 2 December 2010 20:43, Ivan Calandra 
> wrote:
> > Hi,
> >
> > Here is a not so easy way to do your first step, but it works:
> > MAT2 <- cbind(MAT, rowSums(MAT))
> > MAT[order(MAT2[,6], decreasing=TRUE),]
> >
> > For the second, I don't know!
> >
> > HTH,
> > Ivan
> >
> >
> > Le 12/2/2010 09:46, Aaron Polhamus a écrit :
> >>
> >> Greetings,
> >>
> >> My goal is to create a Markov transition matrix (probability of moving
> >> from
> >> one state to another) with the 'highest traffic' portion of the matrix
> >> occupying the top-left section. Consider the following sample:
> >>
> >> inputData<- c(
> >> c(5, 3, 1, 6, 7),
> >> c(9, 7, 3, 10, 11),
> >> c(1, 2, 3, 4, 5),
> >> c(2, 4, 6, 8, 10),
> >> c(9, 5, 2, 1, 1)
> >> )
> >>
> >> MAT<- matrix(inputData, nrow = 5, ncol = 5, byrow = TRUE)
> >> colnames(MAT)<- c("A", "B", "C", "D", "E")
> >> rownames(MAT)<- c("A", "B", "C", "D", "E")
> >>
> >> rowSums(MAT)
> >>
> >> I wan to re-arrange the elements of this matrix such that the elements
> >> with
> >> the largest row sums are placed to the top-left, in descending order.
> Does
> >> this make sense? In this case the order I'm looking for would be B, D,
> A,
> >> E,
> >> C Any thoughts?
> >>
> >> As an aside, here is the function I've written to construct the
> transition
> >> matrix. Is there a more elegant way to do this that doesn't involve a
> >> double
> >> transpose?
> >>
> >> TMAT<- apply(t(MAT), 2, function(X) X/sum(X))
> >> TMAT<- t(TMAT)
> >>
> >> I tried the following:
> >>
> >> TMAT<- apply(MAT, 1, function(X) X/sum(X))
> >>
> >> But my the custom function is still getting applied over the columns of
> >> the
> >> array, rather than the rows. For a check try:
> >>
> >> rowSums(TMAT)
> >> colSums(TMAT)
> >>
> >> Row sums here should equal 1...
> >>
> >> Many thanks in advance,
> >> Aaron
> >>
> >>[[alternative HTML version deleted]]
> >>
> >> __
> >> R-help@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>
> >
> > --
> > Ivan CALANDRA
> > PhD Student
> > University of Hamburg
> > Biozentrum Grindel und Zoologisches Museum
> > Abt. Säugetiere
> > Martin-Luther-King-Platz 3
> > D-20146 Hamburg, GERMANY
> > +49(0)40 42838 6231
> > ivan.calan...@uni-hamburg.de
> >
> > **
> > http://www.for771.uni-bonn.de
> > http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php
> >
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>



-- 
Aaron Polhamus 
Statistical consultant, Revolution Analytics
MSc Applied Statistics, The University of Oxford, 2009
838a NW 52nd St, Seattle, WA 98107
Cell: +1 (206) 380.3948

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Tukey Test, lme, error: less than two groups

2010-12-02 Thread Steven McKinney
Comments in-line below

> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
> Behalf Of Lilith
> Sent: December-02-10 9:39 AM
> To: r-help@r-project.org
> Subject: [R] Tukey Test, lme, error: less than two groups
> 
> 
> Dear R-group,
> 
> I am trying desperately to get this Tukey test working. Its my first time
> here so I hope my question is not too stupid, but I couldn't find anything
> helpful in the help or in the forum.
> 
> I am analysing a dataset of treated grasses. I want to find out, if the
> grasses from different Provenances react differently.
> In the aov test I found a significance for the combination Treatment and
> provenance:
> 
> summary(PAMaov<-aov(PAMval~Treatmentf*Pretreatmentf*Provenancef+Error(Datef/Code)))
> 
> Treatmentf:Provenancef   p-value:  0.008023 **
> 
> In the Linear fixed effects model lme, I can see that there is a
> significance for two provenances (HU and ES)
> 
> summary(PAM.lme<-lme(PAMval~Treatmentf*Provenancef*Pretreatmentf, random=
> ~1|Datef/Code,na.action=na.omit))
> 
>   Value  Std.Error  DF   t-value
> p-value
> (Intercept)   0.6890317 0.06117401 994 11.263473
> 0.
> TreatmentfF  -0.2897619 0.05484590 467 -5.283201
> 0.
> ProvenancefDE 0.0105873 0.05484590 467  0.193037
> 0.8470
> 
> TreatmentfF:ProvenancefES 0.1647302 0.08226884 467  2.002340
> 0.0458
> TreatmentfF:ProvenancefHU 0.1569524 0.07756381 467  2.023526
> 0.0436
> 
> No the big mystery is the Tukey test. I just can't find the mistake, it
> keeps telling me, that there are " less than two groups"
> 
> summary(glht(PAM.lme, linfct = mcp(Provenancef = "Tukey")))
> 
> Fehler in contrMat(table(mf[[nm]]), type = types[pm]) :
>   less than two groups
> 
> I guess its important to know that I made factors out of some of the data.
> Here is the code:
> 
> 
> PAMdata$provenance[PAMdata$provenance == "5"] = "ES"
> PAMdata$provenance[PAMdata$provenance == "6"] = "HU"
> # etc.
> 
> Treatmentf <- factor(PAMdata$treatment, levels=c("C","F"))
> Datef <- factor(PAMdata$Date, levels=c( "25.05.10 14:00","26.05.10
> 19:00","27.05.2010 7:30","27.05.10 14:00","01.06.10 14:00","02.06.10
> 19:00","23.06.10 12:30"),ordered=TRUE)
> 
> 
> Pretreatmentf <- as.factor(PAMdata$pretreatment)
> Provenancef <- as.factor(PAMdata$provenance)
> Greenhousef <- as.factor(PAMdata$greenhouse)
> Individualf <- as.factor(PAMdata$individual)
> 
> PAMval <- (PAMdata$DataPAM)
> Code<-(PAMdata$code)

I suspect the problem is the creation of all these individual variables.

Try instead

PAMdata$Treatmentf <- factor(PAMdata$treatment, levels=c("C","F"))
PAMdata$Datef <- factor(PAMdata$Date, levels=c( "25.05.10 14:00","26.05.10
 19:00","27.05.2010 7:30","27.05.10 14:00","01.06.10 14:00","02.06.10
 19:00","23.06.10 12:30"),ordered=TRUE)
...
PAMdata$PAMval <- (PAMdata$DataPAM)
PAMdata$Code<-(PAMdata$code)
etc.

so that all of your required variables are variables in the dataframe PAMdata.

When you pass off fitted model objects to additional functions, the
additional functions often require access to the dataframe used in the
initial modeling.  Then call lme with

summary(PAM.lme<-lme(PAMval~Treatmentf*Provenancef*Pretreatmentf, random=
 ~1|Datef/Code, data = PAMdata, na.action=na.omit))

then try

summary(glht(PAM.lme, linfct = mcp(Provenancef = "Tukey")))

again.
 
(If you still get an error, run the

traceback()

command and provide that information.)


I'm also wondering why no term for Pretreatmentf shows in your model output.
After setting up the factor variables in the PAMdata dataframe, what does the
command

with(PAMdata, table(Pretreatmentf, Provenancef, Treatmentf))

show?  Is Pretreatmentf even needed in the model?

The output of the command

sessionInfo() 

is also useful to help people figure out such issues.

Also, if you can share the data, or a mock-up of it, others will
be able to run code examples, and not just guess.

HTH

Steve McKinney

> 
> Thank you for any hint! That Tukey test seems so easy, I just can't find the
> mistake
> Thank you very much fpr your help and greetings from Tanzania,
> Lilith
> 
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Tukey-Test-lme-error-less-than-two-groups-
> tp3069789p3069789.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Integral of PDF

2010-12-02 Thread Hans W Borchers

You can dive into the thread "puzzle with integrate over infinite range"
from September this year. The short answer appears to be: tighten the
relative error tolerance (rel.tol) and raise the subdivision limit.

integrate(function(x) dnorm(x, 500,50), -Inf, Inf,
  subdivisions=500, rel.tol=1e-11)
# 1 with absolute error < 1.1e-12

Hans Werner



Doran, Harold wrote:
> 
> The integral of any probability density from -Inf to Inf should equal 1,
> correct? I don't understand last result below.
> 
>> integrate(function(x) dnorm(x, 0,1), -Inf, Inf)
> 1 with absolute error < 9.4e-05
> 
>> integrate(function(x) dnorm(x, 100,10), -Inf, Inf)
> 1 with absolute error < 0.00012
> 
>> integrate(function(x) dnorm(x, 500,50), -Inf, Inf)
> 8.410947e-11 with absolute error < 1.6e-10
> 
>> all.equal(integrate(function(x) dnorm(x, 500,50), -Inf, Inf)$value, 0)
> [1] TRUE
> 
>> sessionInfo()
> R version 2.10.1 (2009-12-14)
> i386-pc-mingw32
> 
> locale:
> [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United
> States.1252
> [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
> 
> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods   base
> 
> other attached packages:
> [1] statmod_1.4.6  mlmRev_0.99875-1   lme4_0.999375-35  
> Matrix_0.999375-33 lattice_0.17-26
> 
> loaded via a namespace (and not attached):
> [1] grid_2.10.1   nlme_3.1-96   stats4_2.10.1 tools_2.10.1
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
> 
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Integral-of-PDF-tp3070243p3070315.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] latex tables for 3+ dimensional tables/arrays

2010-12-02 Thread RICHARD M. HEIBERGER
Michael,

I think this is what you are looking for.
Rich



library(Hmisc)
tmp <- array(rnorm(60), c(3,4,5),
list(letters[1:3],LETTERS[4:7],letters[8:12]))
## latex(tmp)
ftable(tmp)
dviname <- latex(ftable(tmp))
ft <- ftable(tmp)
dviname <- latex(ft,
 colheads=attr(ft,"col.vars")[[1]],
 rowname=attr(ft,"row.vars")[[2]],
 rgroup=attr(ft,"row.vars")[[1]],
 n.rgroup=c(4,4,4))

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Integral of PDF

2010-12-02 Thread Albyn Jones
To really understand it you will have to look at the Fortran code
underlying integrate.  I tracked it back through a couple of layers
(dqagi, dqagie, ...  just use google, these are old netlib
subroutines) then decided I ought to get back to grading papers :-)

It looks like the integral is split into the sum of two integrals,
(-Inf,0] and [0,Inf).  


> integrate(function(x) dnorm(x, 350,50), 0, Inf)
1 with absolute error < 1.5e-05
> integrate(function(x) dnorm(x, 400,50), 0, Inf)
1.068444e-05 with absolute error < 2.1e-05
> integrate(function(x) dnorm(x, 500,50), 0, Inf)
8.410947e-11 with absolute error < 1.6e-10
> integrate(function(x) dnorm(x, 500,50), 0, 1000)
1 with absolute error < 7.4e-05

The integral from 0 to Inf is the lim_{x -> Inf} of the integral
over [0,x].  I suspect that the algorithm is picking an interval
[0,x], evaluating the integral over that interval, and then performing
some test to decide whether to expand the interval.  When the initial
interval doesn't contain much, expanding a little won't add enough to
trigger another iteration.  

albyn

On Thu, Dec 02, 2010 at 04:21:34PM -0500, Doran, Harold wrote:
> The integral of any probability density from -Inf to Inf should equal 1, 
> correct? I don't understand last result below.
> 
> > integrate(function(x) dnorm(x, 0,1), -Inf, Inf)
> 1 with absolute error < 9.4e-05
> 
> > integrate(function(x) dnorm(x, 100,10), -Inf, Inf)
> 1 with absolute error < 0.00012
> 
> > integrate(function(x) dnorm(x, 500,50), -Inf, Inf)
> 8.410947e-11 with absolute error < 1.6e-10
> 
> > all.equal(integrate(function(x) dnorm(x, 500,50), -Inf, Inf)$value, 0)
> [1] TRUE
> 
> > sessionInfo()
> R version 2.10.1 (2009-12-14)
> i386-pc-mingw32
> 
> locale:
> [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
> [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
> 
> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods   base
> 
> other attached packages:
> [1] statmod_1.4.6  mlmRev_0.99875-1   lme4_0.999375-35   
> Matrix_0.999375-33 lattice_0.17-26
> 
> loaded via a namespace (and not attached):
> [1] grid_2.10.1   nlme_3.1-96   stats4_2.10.1 tools_2.10.1
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 

-- 
Albyn Jones
Reed College
jo...@reed.edu

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] initial values for alpha and beta in Gamma inverse

2010-12-02 Thread Rosario Garcia Gil
Hello

I am trying to fit a model using glm function with family=Gamma(link="inverse").

Is it possible to give initial values, and how? I tried to search for info but 
not managed to get any answer.
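(An editorial sketch, not part of the original question, with made-up data: glm() does accept starting values through its `start` argument, one value per model coefficient with the intercept first; `etastart` and `mustart` are also available.)

```r
set.seed(42)
x <- runif(50, 1, 5)
y <- rgamma(50, shape = 2, rate = 2 / (1 + 0.5 * x))  # positive response

# `start` supplies initial values for the linear-predictor
# coefficients: here (Intercept) = 1 and slope = 0.1.
fit <- glm(y ~ x, family = Gamma(link = "inverse"),
           start = c(1, 0.1))
coef(fit)
```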

Kind regards and thanks in advance.
Rosario

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Integral of PDF

2010-12-02 Thread William Dunlap
You can use trace() to see what is happening

> trace(dnorm, quote(print(round(x))))
> integrate(function(x) dnorm(x, 500,50), -Inf, Inf)
Tracing dnorm(x, 500, 50) on entry
 [1]   1 233   0  38   0  14   0   7   0   4   0   2   0   2   1
Tracing dnorm(x, 500, 50) on entry
 [1]   -1 -233    0  -38    0  -14    0   -7    0   -4    0   -2    0   -2   -1
Tracing dnorm(x, 500, 50) on entry
 [1]   3 467   1  78   1  29   1  14   1   9   2   6   2   4   2
Tracing dnorm(x, 500, 50) on entry
 [1]   -3 -467   -1  -78   -1  -29   -1  -14   -1   -9   -2   -6   -2   -4   -2
Tracing dnorm(x, 500, 50) on entry
 [1] 0 1 0 1 0 1 0 1 0 1 0 1 0 0 0
Tracing dnorm(x, 500, 50) on entry
 [1]  0 -1  0 -1  0 -1  0 -1  0 -1  0 -1  0  0  0
Tracing dnorm(x, 500, 50) on entry
 [1]   7 935   3 156   3  58   3  30   4  18   4  12   5   9   6
Tracing dnorm(x, 500, 50) on entry
 [1]   -7 -935   -3 -156   -3  -58   -3  -30   -4  -18   -4  -12   -5   -9   -6
Tracing dnorm(x, 500, 50) on entry
 [1] 2 3 1 3 1 3 1 3 1 2 1 2 1 2 1
Tracing dnorm(x, 500, 50) on entry
 [1] -2 -3 -1 -3 -1 -3 -1 -3 -1 -2 -1 -2 -1 -2 -1
8.410947e-11 with absolute error < 1.6e-10

You are asking integrate to find the relatively tiny portion of
the floating point real line (from c. -10^300 to 10^300)
on which dnorm(x) is bigger than c. 10^-300.  It found a few
points where it was that big, but apparently not enough to home
in on the answer.  You need to give it better hints about dnorm's
support region.  E.g.,

> integrate(function(x) dnorm(x, 500,50), -100, 900)
Tracing dnorm(x, 500, 50) on entry
 [1] 400 -87 887 -33 833  60 740 183 617 326 474 -98 898 -65 865  10 790
119 681
[20] 253 547
Tracing dnorm(x, 500, 50) on entry
 [1] 150 -93 393 -66 366 -20 320  42 258 113 187 -99 399 -83 383 -45 345
9 291
[20]  76 224
Tracing dnorm(x, 500, 50) on entry
 [1] 650 407 893 434 866 480 820 542 758 613 687 401 899 417 883 455 845
509 791
[20] 576 724
Tracing dnorm(x, 500, 50) on entry
 [1] 525 403 647 417 633 440 610 471 579 506 544 401 649 409 641 427 623
455 595
[20] 488 562
Tracing dnorm(x, 500, 50) on entry
 [1] 775 653 897 667 883 690 860 721 829 756 794 651 899 659 891 677 873
705 845
[20] 738 812
1 with absolute error < 4.0e-07

Making the limits +-10^4 works ok also, but +-Inf is apparently
asking for too much.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  

> -Original Message-
> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-project.org] On Behalf Of Doran, Harold
> Sent: Thursday, December 02, 2010 1:22 PM
> To: r-help@r-project.org
> Subject: [R] Integral of PDF
> 
> The integral of any probability density from -Inf to Inf 
> should equal 1, correct? I don't understand last result below.
> 
> > integrate(function(x) dnorm(x, 0,1), -Inf, Inf)
> 1 with absolute error < 9.4e-05
> 
> > integrate(function(x) dnorm(x, 100,10), -Inf, Inf)
> 1 with absolute error < 0.00012
> 
> > integrate(function(x) dnorm(x, 500,50), -Inf, Inf)
> 8.410947e-11 with absolute error < 1.6e-10
> 
> > all.equal(integrate(function(x) dnorm(x, 500,50), -Inf, 
> Inf)$value, 0)
> [1] TRUE
> 
> > sessionInfo()
> R version 2.10.1 (2009-12-14)
> i386-pc-mingw32
> 
> locale:
> [1] LC_COLLATE=English_United States.1252  
> LC_CTYPE=English_United States.1252
> [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
> 
> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods   base
> 
> other attached packages:
> [1] statmod_1.4.6  mlmRev_0.99875-1   lme4_0.999375-35   
> Matrix_0.999375-33 lattice_0.17-26
> 
> loaded via a namespace (and not attached):
> [1] grid_2.10.1   nlme_3.1-96   stats4_2.10.1 tools_2.10.1
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Minor warning about seq

2010-12-02 Thread Carl Witthoft
I believe the R intro manual and R-inferno have some pretty stern 
warnings about never using 'c' or 't' as variable names, for obvious 
reasons.


Similarly,  I nearly crushed some code once by writing a nice little 
function to find the mode of a data set and calling the function 
"mode()" .  (conflicts w/ an R builtin).  Ended up calling it 'smode' .


Carl
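(An editorial illustration of the shadowing pitfall described above; the function body is a hypothetical statistical-mode helper, not Carl's code. Masking a base function is legal in R, but the builtin is then only reachable via its namespace.)

```r
# Defining a function called `mode` shadows base::mode() for
# ordinary calls; the builtin remains reachable as base::mode().
mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]   # most frequent value
}
smode_result <- mode(c(1, 2, 2, 3))        # 2, the statistical mode
type_result  <- base::mode(c(1, 2, 2, 3))  # "numeric", the builtin
rm(mode)   # remove the shadow to restore the builtin binding
```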


From: Dieter Menne 
Date: Wed, 01 Dec 2010 00:14:17 -0800 (PST)

Prof. John C Nash wrote:
>
> I spent more time than I should have debugging a script because I wanted
> x<-seq(0,100)*0.1
>
> but typed
> x<-seq(O:100)*0.1
>
> seq(0:100) yields 1 to 101,
>

Which leads us to another rule: never use a variable called "O". I 
remember this was a no-no even in my first Algol-course in 1967.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Integral of PDF

2010-12-02 Thread Ravi Varadhan
But that only postpones the misery, Hans Werner!  You can always make the
mean large enough to get a wrong answer from `integrate'.  

integrate(function(x) dnorm(x, 700,50), -Inf, Inf,
  subdivisions=500, rel.tol=1e-11)

integrate(function(x) dnorm(x, -700,50), -Inf, Inf,
  subdivisions=500, rel.tol=1e-11)

The problem is that the quadrature algorithm is not smart enough to
recognize that the center of mass is at `mean'.  The QUADPACK algorithm
(Gauss-Kronrod quadrature) is adaptive, but it does not attempt to identify
regions of high mass.  Algorithms which can locate regions of highest
density, such as those developed by Alan Genz, will do much better in
problems like this.

Genz and Kass (1997), Journal of Computational and Graphical Statistics.

There is a Fortran routine DCUHRE that you might want to try for infinite
regions.  I don't think this has been ported to R (although I may be wrong).
There may be other more recent ones.

A simple solution is to locate the mode of the integrand, which should be
quite easy to do, and then do a coordinate shift to that point and then
integrate the mean-shifted integrand using `integrate'.
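That mean-shifting recipe can be sketched as follows, using the integrand from this thread (a normal density with mean 700); the optimize() search interval is an assumption chosen to bracket the peak:

```r
f <- function(x) dnorm(x, 700, 50)

# Direct integration over (-Inf, Inf) typically returns a value far
# from 1, because the adaptive rule never samples near x = 700.
direct <- integrate(f, -Inf, Inf)$value

# Step 1: locate the mode of the integrand numerically.
peak <- optimize(f, interval = c(0, 2000), maximum = TRUE)$maximum

# Step 2: integrate the mode-shifted integrand; its mass now sits
# at 0, where the quadrature rule concentrates its points.
shifted <- integrate(function(u) f(u + peak), -Inf, Inf)$value
shifted   # close to 1
```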

Ravi.

---
Ravi Varadhan, Ph.D.
Assistant Professor,
Division of Geriatric Medicine and Gerontology School of Medicine Johns
Hopkins University

Ph. (410) 502-2619
email: rvarad...@jhmi.edu


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Hans W Borchers
Sent: Thursday, December 02, 2010 5:16 PM
To: r-help@r-project.org
Subject: Re: [R] Integral of PDF


You can dive into the thread "puzzle with integrate over infinite range"
from September this year. The short answer appears to be: tighten the
relative error tolerance (rel.tol) and raise the subdivision limit.

integrate(function(x) dnorm(x, 500,50), -Inf, Inf,
  subdivisions=500, rel.tol=1e-11)
# 1 with absolute error < 1.1e-12

Hans Werner



Doran, Harold wrote:
> 
> The integral of any probability density from -Inf to Inf should equal 1,
> correct? I don't understand last result below.
> 
>> integrate(function(x) dnorm(x, 0,1), -Inf, Inf)
> 1 with absolute error < 9.4e-05
> 
>> integrate(function(x) dnorm(x, 100,10), -Inf, Inf)
> 1 with absolute error < 0.00012
> 
>> integrate(function(x) dnorm(x, 500,50), -Inf, Inf)
> 8.410947e-11 with absolute error < 1.6e-10
> 
>> all.equal(integrate(function(x) dnorm(x, 500,50), -Inf, Inf)$value, 0)
> [1] TRUE
> 
>> sessionInfo()
> R version 2.10.1 (2009-12-14)
> i386-pc-mingw32
> 
> locale:
> [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United
> States.1252
> [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
> 
> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods   base
> 
> other attached packages:
> [1] statmod_1.4.6  mlmRev_0.99875-1   lme4_0.999375-35  
> Matrix_0.999375-33 lattice_0.17-26
> 
> loaded via a namespace (and not attached):
> [1] grid_2.10.1   nlme_3.1-96   stats4_2.10.1 tools_2.10.1
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
> 
-- 
View this message in context:
http://r.789695.n4.nabble.com/Integral-of-PDF-tp3070243p3070315.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] Integral of PDF

2010-12-02 Thread Ted Harding
Albyn's reply is in line with an hypothesis I was beginning to
formulate (without looking at the underlying FoRTRAN code),
prompted by the hint in '?integrate':

  Details:
 If one or both limits are infinite, the infinite range
 is mapped onto a finite interval.
 For a finite interval, globally adaptive interval
 subdivision is used in connection with extrapolation
 by the Epsilon algorithm.

as a result of some numerical experiments. Following up on
Harold Doran's original dnorm(x, mean, mean/10) pattern (and
quoting a small subset of the experiments ... ):

  integrate(function(x) dnorm(x, 100,10), -Inf, Inf)
  # 1 with absolute error < 0.00012
  integrate(function(x) dnorm(x, 200,20), -Inf, Inf)
  # 1.429508e-08 with absolute error < 2.8e-08

  integrate(function(x) dnorm(x, 140,14), -Inf, Inf)
  # 1 with absolute error < 2.2e-06
  integrate(function(x) dnorm(x, 150,15), -Inf, Inf)
  # 2.261582e-05 with absolute error < 4.4e-05

  integrate(function(x) dnorm(x, 144,14.4), -Inf, Inf)
  # 1 with absolute error < 1.7e-06
  integrate(function(x) dnorm(x, 145,14.5), -Inf, Inf)
  # 5.447138e-05 with absolute error < 0.00011

  integrate(function(x) dnorm(x, 150,15), -33000, 33300)
  # 1 with absolute error < 0.00012
  integrate(function(x) dnorm(x, 150,15), -34000, 34300)
  # 5.239671e-05 with absolute error < 0.00010

  integrate(function(x) dnorm(x, 150,15),
-33768.260234277, 34068.2602334277)
  # 0.5000307 with absolute error < 6.1e-05
  integrate(function(x) dnorm(x, 150,15),
-33768.260234278, 34068.2602334278)
  # 6.139415e-05 with absolute error < 0.00012

So it seems that, depending on how integrate() "maps to a finite
interval", and on how the algorithm goes about its "adaptive
interval subdivision", there are critical points where integrate()
switches from one to another of the following:

[A] The whole of a finite interval containing all but the extreme
  outer tails of dnorm() is integrated over;
[B] The whole of a finite interval containing one half of the
  distribution of dnorm() is integrated over;
[C] The whole of a finite interval lying in the extreme of one
  tail of the dnorm distribution is integrated over.

with result: [A] close to 1; [B] close to 0.5; [C] close to 0.

This is compatible with Albyn's conclusion that the integral is
split into the sum of (-Inf,0) and (0,Inf), with the algorithm
ignoring one, or the other, or both, or neither!

This must be one of the nastiest exemplars of "rounding error" ever!!!
Ted.

On 02-Dec-10 22:39:37, Albyn Jones wrote:
> To really understand it you will have to look at the Fortran code
> underlying integrate.  I tracked it back through a couple of layers
> (dqagi, dqagie, ...  just use google, these are old netlib
> subroutines) then decided I ought to get back to grading papers :-)
> 
> It looks like the integral is split into the sum of two integrals,
> (-Inf,0] and [0,Inf).  
> 
> 
>> integrate(function(x) dnorm(x, 350,50), 0, Inf)
> 1 with absolute error < 1.5e-05
>> integrate(function(x) dnorm(x, 400,50), 0, Inf)
> 1.068444e-05 with absolute error < 2.1e-05
>> integrate(function(x) dnorm(x, 500,50), 0, Inf)
> 8.410947e-11 with absolute error < 1.6e-10
>> integrate(function(x) dnorm(x, 500,50), 0, 1000)
> 1 with absolute error < 7.4e-05
> 
> The integral from 0 to Inf is the lim_{x -> Inf} of the integral
> over [0,x].  I suspect that the algorithm is picking an interval
> [0,x], evaluating the integral over that interval, and then performing
> some test to decide whether to expand the interval.  When the initial
> interval doesn't contain much, expanding a little won't add enough to
> trigger another iteration.  
> 
> albyn
> 
> On Thu, Dec 02, 2010 at 04:21:34PM -0500, Doran, Harold wrote:
>> The integral of any probability density from -Inf to Inf should equal
>> 1, correct? I don't understand last result below.
>> 
>> > integrate(function(x) dnorm(x, 0,1), -Inf, Inf)
>> 1 with absolute error < 9.4e-05
>> 
>> > integrate(function(x) dnorm(x, 100,10), -Inf, Inf)
>> 1 with absolute error < 0.00012
>> 
>> > integrate(function(x) dnorm(x, 500,50), -Inf, Inf)
>> 8.410947e-11 with absolute error < 1.6e-10
>> 
>> > all.equal(integrate(function(x) dnorm(x, 500,50), -Inf, Inf)$value,
>> > 0)
>> [1] TRUE
>> 
>> > sessionInfo()
>> R version 2.10.1 (2009-12-14)
>> i386-pc-mingw32
>> 
>> locale:
>> [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United
>> States.1252
>> [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
>> [5] LC_TIME=English_United States.1252
>> 
>> attached base packages:
>> [1] stats graphics  grDevices utils datasets  methods   base
>> 
>> other attached packages:
>> [1] statmod_1.4.6  mlmRev_0.99875-1   lme4_0.999375-35  
>> Matrix_0.999375-33 lattice_0.17-26
>> 
>> loaded via a namespace (and not attached):
>> [1] grid_2.10.1   nlme_3.1-96   stats4_2.10.1 tools_2.10.1
>> 
>>  [[alternative HTML version deleted]]
>> 
>> __

Re: [R] Arrange elements on a matrix according to rowSums + short 'apply' Q

2010-12-02 Thread Petr Savicky
On Thu, Dec 02, 2010 at 01:13:40PM -0800, Aaron Polhamus wrote:
...
> Many thanks for the tips, those solved my queries. Still interested in how
> to force custom functions to work over rows rather than columns when using
> apply,

In the command

TMAT <- apply(MAT, 1, function(X) X/sum(X))

the custom function is applied over rows of MAT, however, when
collecting the obtained (originally) rows to a matrix, they
become columns, due to the following rule from ?apply

 If each call to ‘FUN’ returns a vector of length ‘n’, then ‘apply’
 returns an array of dimension ‘c(n, dim(X)[MARGIN])’ if ‘n > 1’.

Here, n is the length of the original rows and dim(X)[MARGIN] is
the number of rows of the input matrix. So, the order of the 
dimensions gets reversed in this case.

The following demonstrates that apply(t(MAT), 2, ) and apply(MAT, 1, )
produce the same matrix except of a transposition.

  MAT <- rbind(
  c(5, 3, 1, 6, 7),
  c(9, 7, 3, 10, 11),
  c(1, 2, 3, 4, 5),
  c(2, 4, 6, 8, 10),
  c(9, 5, 2, 1, 1)
  )
  colnames(MAT) <- c("A", "B", "C", "D", "E")
  rownames(MAT) <- c("A", "B", "C", "D", "E")
  
  A <- apply(t(MAT), 2, function(X) X/sum(X))
  A <- t(A)
  
  B <- apply(MAT, 1, function(X) X/sum(X))
  
  all(A == t(B)) # [1] TRUE

The "dual" of M / rowSums(M), i.e. dividing each element
of a matrix M by the sum of its column, may be done, for example, as

  M <- matrix(1:12, nrow=3)
  M1 <- M / rep(colSums(M), each=dim(M)[1])
  M1
            [,1]      [,2]      [,3]      [,4]
  [1,] 0.1666667 0.2666667 0.2916667 0.3030303
  [2,] 0.3333333 0.3333333 0.3333333 0.3333333
  [3,] 0.5000000 0.4000000 0.3750000 0.3636364

  M/M1
       [,1] [,2] [,3] [,4]
  [1,]    6   15   24   33
  [2,]    6   15   24   33
  [3,]    6   15   24   33

Alternatively, it is possible to use

  sweep(M, 2, colSums(M), FUN="/")
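As a side note (an editorial addition, not in the original reply): base R's prop.table() wraps this same column-normalization, which gives a quick cross-check of the sweep() call:

```r
M <- matrix(1:12, nrow = 3)

# prop.table(M, margin = 2) divides each column by its sum,
# matching sweep(M, 2, colSums(M), FUN = "/").
A <- prop.table(M, margin = 2)
B <- sweep(M, 2, colSums(M), FUN = "/")
all.equal(A, B)   # TRUE
```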

Petr Savicky.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Integral of PDF

2010-12-02 Thread Albyn Jones
On Thu, Dec 02, 2010 at 06:23:45PM -0500, Ravi Varadhan wrote:

> A simple solution is to locate the mode of the integrand, which should be
> quite easy to do, and then do a coordinate shift to that point and then
> integrate the mean-shifted integrand using `integrate'.
> 
> Ravi.

Translation:  think before you compute!

albyn

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] kmeans() compared to PROC FASTCLUS

2010-12-02 Thread Andrew Agrimson
Hello all,

I've been comparing results from kmeans() in R to PROC FASTCLUS in SAS and
I'm getting drastically different results with a real life data set. Even
with a simulated data set starting with the same seeds with very well
seperated clusters the resulting cluster means are still different. I was
hoping to look at the source code of kmeans(), but it's in C and FORTRAN and
I'm not quite sure how to get at it. Has anybody looked into the differences
in the implementations or have any thoughts on the matter? Below is the code
I'm using in each case.


fit=kmeans(obs[,-1],centers,nstart=25)

proc fastclus data=std maxclusters=2 maxiter=100 outiter drift
  converge=0.01 outseed=centers out=cluster;
  var x y z;
run;
Thanks,
Andy
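[An editorial aside, not from the original post: one likely source of divergence is the algorithm itself. R's kmeans() defaults to Hartigan-Wong, while FASTCLUS's drift method is closer in spirit to MacQueen's nearest-centroid updating, so comparing against algorithm = "MacQueen" is a reasonable first check. A sketch on simulated, well-separated data:]

```r
set.seed(1)
# Two well-separated clusters in three dimensions (simulated,
# standing in for the poster's data).
obs <- rbind(matrix(rnorm(150, mean = 0), ncol = 3),
             matrix(rnorm(150, mean = 5), ncol = 3))

# Default Hartigan-Wong versus MacQueen, 25 random restarts each.
fit_hw <- kmeans(obs, centers = 2, nstart = 25)
fit_mq <- kmeans(obs, centers = 2, nstart = 25, algorithm = "MacQueen")
fit_hw$centers
fit_mq$centers
```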

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] The behaviour of read.csv().

2010-12-02 Thread Rolf Turner

I have recently been bitten by an aspect of the behaviour of
the read.csv() function.

Some lines in a (fairly large) *.csv file that I read in had
too many entries.  I would have hoped that this would cause
read.csv() to throw an error, or at least issue a warning,
but it read the file without complaint, putting the extra
entries into an additional line.

This behaviour is illustrated by the toy example in the
attached file ``junk.csv''.  Just do

junk <- read.csv("junk.csv",header=TRUE)
junk

to see the problem.

If the offending over-long line were in the fourth line of data
or earlier, an error would be thrown, but if it is in the fifth line
of data or later no error is given.

This is in a way compatible with what the help on read.csv()
says:

The number of data columns is determined by looking at
the first five lines of input (or the whole file if it
has less than five lines), or from the length of col.names
if it is specified and is longer.

However, the help for read.table() says the same thing.  And yet if
one does

gorp <- read.table("junk.csv",sep=",",header=TRUE)

one gets an error, whereas read.csv() gives none.

Am I correct in saying that is inappropriate behaviour on
the part of read.csv(), or am I missing something?
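[An editorial aside, not part of the original message: count.fields() offers a quick way to detect ragged lines before reading. The attachment junk.csv did not survive the archive, so this sketch builds a stand-in toy file inline:]

```r
# Build a toy CSV whose 7th line (6th data line) has an extra field,
# i.e. an over-long line past the first five that read.csv() scans.
lns <- c("a,b,c",
         "1,2,3", "1,2,3", "1,2,3", "1,2,3", "1,2,3",
         "1,2,3,4")
tf <- tempfile(fileext = ".csv")
writeLines(lns, tf)

# count.fields() reports the field count of every line, so ragged
# rows can be spotted before read.csv() silently wraps them.
n <- count.fields(tf, sep = ",")
which(n != n[1])   # 7: the offending line
```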

cheers,

Rolf Turner



P. S.:
> sessionInfo()
R version 2.12.0 (2010-10-15)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_NZ.UTF-8/en_NZ.UTF-8/C/C/en_NZ.UTF-8/en_NZ.UTF-8

attached base packages:
[1] datasets  utils stats graphics  grDevices methods   base 

other attached packages:
[1] misc_0.0-13 gtools_2.6.2spatstat_1.21-2 deldir_0.0-13  
[5] mgcv_1.6-2  fortunes_1.4-0  MASS_7.3-8 

loaded via a namespace (and not attached):
[1] grid_2.12.0lattice_0.19-13Matrix_0.999375-44 nlme_3.1-97   
[5] tools_2.12.0  

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

