Re: [R] Pulling strings from a Flat file

2011-04-06 Thread Bill.Venables
Isn't all you need read.fwf?
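A minimal sketch of that idea (my addition, not from the original reply; it assumes the file is named "Master", as in the code below):

```r
# read.fwf() reads fixed-width fields; with one 7-character field per line:
ids <- read.fwf("Master", widths = 7, colClasses = "character")$V1

# If each "Ennnnnn" string occupies its own line, readLines() already
# returns the desired vector (the "\n" is just the Unix line ending):
ids <- readLines("Master")

# If the strings instead run together in one record, extracting the
# pattern directly is robust to any embedded newlines:
txt <- paste(readLines("Master"), collapse = "")
ids <- regmatches(txt, gregexpr("E[0-9]{6}", txt))[[1]]
```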

From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of 
Kalicin, Sarah [sarah.kali...@intel.com]
Sent: 06 April 2011 09:48
To: r-help@r-project.org
Subject: [R] Pulling strings from a Flat file

Hi,

I have a flat file that contains a bunch of strings that look like this. The 
file was originally in Unix and brought over into Windows:

E123456E234567E345678E456789E567891E678910E. . . .
Basically each string starts with E and is followed by 6 numbers. One 
string = E123456, length = 7 characters. This file contains tens of thousands 
of these strings. I want to separate them into one vector the length of the 
number of strings in the flat file, where each string is its own unique value.

cc<-c(7,7,7,7,7,7,7)
> aa<- file("Master","r", raw=TRUE)
> readChar(aa, cc, useBytes = FALSE)
[1] "E123456"  "\nE23456" "7\nE3456" "78\nE456" "789\nE56" "7891\nE6" "78910\nE"
> close(aa)
> unlink("Master")

The biggest issue is that \n is getting added into the strings, which I am not 
sure where it is coming from, and it splices the strings. Any suggestions on 
getting rid of the \n, and on creating an arbitrarily long sequence of 7's for 
the cc vector of string lengths? Is there a better way to do this?

Sarah



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] qcc.overdispersion-test

2011-04-06 Thread Lao Meng
hi:
Another question about the overdispersion test.

I want to make sure that:
if the p-value < 0.05, then the data are NOT overdispersed;
if the p-value >= 0.05, then the data ARE overdispersed.

I'm not sure whether this is true; I just reached the above conclusion from
simulated data.

Thanks for your help.
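For reference, a small sketch of the test in question (my addition, assuming the 'qcc' package; the null hypothesis is "no overdispersion", so a small p-value suggests the counts ARE overdispersed):

```r
library(qcc)
set.seed(1)
x <- rnbinom(100, mu = 5, size = 0.5)  # deliberately overdispersed counts
qcc.overdispersion.test(x, type = "poisson")
```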



2011/4/2 

> Hi all,
>
> I have made an overdispersion test for a data set and get the following
> result
>
> Overdispersion test Obs.Var/Theor.Var Statistic p-value
>   poisson data  16.24267  47444.85   0
>
>
> after deleting the outliers from the data set I get the following result
>
>
> Overdispersion test Obs.Var/Theor.Var Statistic p-value
>   poisson data  16.27106 0   1
>
>
> The problem is that the overdispersion parameter does not really change,
> but how could the p-value and the statistic change so much that the null
> hypothesis is now accepted?
>
> I would be very grateful if someone could help me.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Support Counting

2011-04-06 Thread Petr Savicky
On Tue, Apr 05, 2011 at 08:43:34AM -0500, psombe wrote:
> Well, I'm using the "arules" package and I'm trying to use the support command.

Hi.

R-help can provide help for some of the frequently used CRAN packages,
but not for all; there are too many of them. It is not clear whether anyone
on R-help uses "arules". One of my students is using Eclat for association
rules directly, but not from R. I am using R, but not for association rules.

Try to determine whether your question is indeed specific to "arules".
If the question can be formulated without "arules", it has a good chance
of being answered here. Otherwise, send a query to the package maintainer.
Package maintainers usually welcome feedback.

> my data is read from a file using the "read.transactions" command, and a line
> of data looks something like this. There are about 88,000 rows and 16,000
> different items:
> > inspect(dset[3])
>   items
> 1 {33, 
> 34, 
> 35} 
> > inspect(dset[1])
>   items
> 1 {0, 1,  10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 2,  20, 21, 22, 23, 24,
> 25, 26, 27, 28, 29, 3, 4,5, 6, 7,  8,  9}  
> 
> So in order to use support I have to make an object of class "itemsets", and
> I'm kind of struggling with the "new" command.
> I made an object of class itemsets by first creating a presence/absence
> matrix, and with something like 16,000 items this is really rather tedious. I
> wonder if there is a better way.
> 
> # Currently I'm doing this
> 
> avec = array(dim=400) # dim is the max number of the item I'm concerned
> with
> avec[1:400] = 0
> avec[27] = 1
> avec[63] = 1 # and so on for all the items I want
> 
> amat = matrix(data = avec, ncol = 400)

Up to here, this may be simplified if the required indices
are stored in a vector, say "indices". For example:

  indices <- c(3, 5, 6, 10)
  avec <- array(0, dim=14)
  avec[indices] <- 1
  amat <- rbind(avec)

or

  amat <- matrix(0, nrow=1, ncol=14)
  amat[1, indices] <- 1
  amat

       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14]
  avec    0    0    1    0    1    1    0    0    0     1     0     0     0     0
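For the 'arules' part specifically, one possible refinement (a hedged sketch from memory; treat the class and function names as assumptions and check the package documentation): encode() can build the sparse item matrix directly from item labels, avoiding a dense presence/absence matrix altogether.

```r
# 'dset' is the transactions object from the original post.
library(arules)
wanted <- list(c("27", "63"))                          # items, by label
imat   <- encode(wanted, itemLabels = itemLabels(dset))
sets   <- new("itemsets", items = imat)
support(sets, dset)                                    # support of the itemset
```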

Hope this helps.

Petr Savicky.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] grImport/ghostscript problems

2011-04-06 Thread guillaume Le Ray
Hi Paul,

I'm using the latest version, 0.7-2, with version 9.00 of Ghostscript on a
Windows XP machine.
The error is still there, and no XML file is generated, but there is a .rds
file...

Guillaume

2011/4/5 Paul Murrell 

> Hi
>
>
> On 5/04/2011 9:30 p.m., guillaume Le Ray wrote:
>
>> Hi Al,
>>
>> I'm facing exactly the same problem as you are. Have you managed to
>> fix it? If yes, I am eager to know the trick.
>>
>
> Al's problem turned out to be a bug in 'grImport', so one thing you can try
> is to install the latest version of 'grImport'.
>
> If that still fails, you might be able to get more information about the
> problem by looking at the end of the XML file that is created by
> PostScriptTrace().  If ghostscript has hit trouble, its error messages will
> hopefully be at the end of that XML file.
>
> Paul
>
>
>
>> Regards,
>>
>> Guillaume
>>
>> 2011/3/27 Al Roark
>>
>>  Paul Murrell  auckland.ac.nz>  writes:
>>>
>>>
 Hi

 On 28/03/2011 8:13 a.m., Al Roark wrote:

>
> Hi All: I've been struggling for a while trying to get grImport
> up and running.  I'm on a Windows 7 (home premium 64 bit)
> machine running R-2.12.2 along with GPL Ghostscript 9.01. I've
> set my Windows PATH variable to point to the Ghostscript \bin
> and \lib directories, and I've created the R_GSCMD environment
> variable pointing to gswin32c.exe. I don't have any experience
> with Ghostscript, but with the setup described above I can view
> the postscript file with the following command to the Windows
> command prompt: gswin32c.exe D:\Sndbx\vasarely.ps However, I
> can't get the PostScriptTrace() function to work on the same
> file.  Submitting PostScriptTrace("D:/Sndbx/vasarely.ps") gives
> me the error: Error in PostScriptTrace("D:/Sndbx/vasarely.ps")
> :   status 127 in running command 'gswin32c.exe -q -dBATCH
> -dNOPAUSE -sDEVICE=pswrite
> -sOutputFile=C:\Users\Al\AppData\Local\Temp\RtmppPjDAf\file5db99cb
>
>
>  -sstdout=vasarely.ps.xml capturevasarely.ps' Your suggestions are
>
>>  much appreciated. Cheers, Al [[alternative HTML version
> deleted]]
>

 You could try running the ghostscript command that is printed in
 the error message at the Windows command prompt to see more info
 about the problem (might need to remove the '-q' so that
 ghostscript prints messages to the screen).

 Paul


>>> Thanks for your reply.
>>>
>>> Perhaps this is a Ghostscript problem. When I run the Ghostscript
>>> command, I'm met with the rather unhelpful error 'GPL Ghostscript
>>> 9.01: Unrecoverable error, exit code 1' (this occurs whether or not
>>> I remove the -q).
>>>
>>> Interestingly, if I remove the final argument (in this case,
>>> capturevasarely.ps) the Ghostscript command executes, placing a
>>> file (appears to be xml) in the temporary directory. However, I'm
>>> not sure what to do with this result.
>>>
>>> __ R-help@r-project.org
>>> mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do
>>> read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>> [[alternative HTML version deleted]]
>>
>> __ R-help@r-project.org
>>
>> mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do
>> read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> --
> Dr Paul Murrell
> Department of Statistics
> The University of Auckland
> Private Bag 92019
> Auckland
> New Zealand
> 64 9 3737599 x85392
> p...@stat.auckland.ac.nz
> http://www.stat.auckland.ac.nz/~paul/
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] syntax to subset for multiple values from a single variable

2011-04-06 Thread SNV Krishna
Hi All,
 
Is it possible to use the subset() function to select data based on multiple
values of a single variable from a data frame.
 
My actual data set is much bigger, so I will illustrate with the
following dataset:
> df = data.frame(x = c('a','b','c','d','e','f','g','h','a','a','b','b'), y
= 1:12)
I would like to select all rows where x = a or b.
 

> subset(df, x == c('a','b')) # this command did not return all rows where x
is equal to a or b
   x  y
1  a  1
2  b  2
9  a  9
12 b 12
 
> df[df$x %in% c('a','b'),] # subsetting using subscripts returned all rows
   x  y
1  a  1
2  b  2
9  a  9
10 a 10
11 b 11
12 b 12
 
I know there might be a problem with the subset syntax I have used, but I
couldn't figure out what it is. Any insights from members will be highly
appreciated. Thanks in advance.
 
Regards,
 
S.N.V. Krishna

 

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] A fortunes candidate?

2011-04-06 Thread Achim Zeileis

On Tue, 5 Apr 2011, Bert Gunter wrote:


A fortunes candidate?


Definitely!

Added to the devel version on R-Forge.
thx,
Z


On Tue, Apr 5, 2011 at 3:59 PM, David Winsemius  wrote:

... 

" Tested solutions offered when reproducible examples are provided. "





--
"Men by nature long to get on to the ultimate truths, and will often
be impatient with elementary studies or fight shy of them. If it were
possible to reach the ultimate truths without the elementary studies
usually prefixed to them, these would not be preparatory studies but
superfluous diversions."

-- Maimonides (1135-1204)

Bert Gunter
Genentech Nonclinical Biostatistics

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.





Re: [R] syntax to subset for multiple values from a single variable

2011-04-06 Thread Ivan Calandra

Hi,

I'm not sure what you're looking for because it looks to me that you 
have the answer already...

Is this what you want:
subset(df, x %in% c('a','b'))
?
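The reason `==` misbehaves here is recycling: the two-element vector c('a','b') is recycled along x, so the comparison tests x[1]=='a', x[2]=='b', x[3]=='a', and so on, rather than membership. A small sketch:

```r
df <- data.frame(x = c('a','b','c','d','e','f','g','h','a','a','b','b'),
                 y = 1:12)
# Recycled element-wise comparison: only rows where the recycled pattern
# happens to line up are kept.
subset(df, x == c('a','b'))
# Membership test: all rows with x equal to 'a' or 'b'.
subset(df, x %in% c('a','b'))
```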

Ivan

Le 4/6/2011 10:45, SNV Krishna a écrit :

Hi All,

Is it possible to use the subset() function to select data based on multiple
values of a single variable from a data frame.

My actual data set is much bigger, so I will illustrate with the
following dataset:

df = data.frame(x = c('a','b','c','d','e','f','g','h','a','a','b','b'), y
= 1:12)

I would like to select all rows where x = a or b.



subset(df, x == c('a','b')) # this command did not return all rows where x

is equal to a or b

   x  y
1  a  1
2  b  2
9  a  9
12 b 12


df[df$x %in% c('a','b'),] # subsetting using subscripts returned all rows

   x  y
1  a  1
2  b  2
9  a  9
10 a 10
11 b 11
12 b 12

I know there might be a problem with the subset syntax I have used, but I
couldn't figure out what it is. Any insights from members will be highly
appreciated. Thanks in advance.

Regards,

S.N.V. Krishna



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de

**
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Layout within levelplot from the lattice package

2011-04-06 Thread Ian Renner
Hi,

I'm a novice with levelplot and need some assistance! Basically, I want a
window which contains 6 levelplots of equal size, presented in 3 columns and
2 rows.

I've tried to approach it two ways. The first way leads to this question:

Is there any way to concatenate levelplots from a factor vertically as opposed 
to horizontally? I'd like to pair the levelplots by factor.2 on top of each 
other with the colorkey at the bottom, resulting in 3 columns of paired 
levelplots. I can only get 3 rows of paired levelplots. Here is some mock code 
to illustrate the point:

start = expand.grid(1:10,1:14)
start2 = rbind(start,start,start,start,start,start)
z = rnorm(840)
factor.1 = c(rep("A", 280), rep("B", 280), rep("C", 280))
factor.2 = c(rep("1", 140), rep("2", 140), rep("1", 140), rep("2", 140), 
rep("1", 140), rep("2", 140))

data = data.frame(start2, z, factor.1, factor.2)
names(data)[1:2] = c("x", "y")

data.A = data[data$factor.1 == "A",]
data.B = data[data$factor.1 == "B",]
data.C = data[data$factor.1 == "C",]

print(levelplot(z~x*y|factor.2,data.A,col.regions=heat.colors,asp="iso",xlab = 
"", ylab = "", colorkey = list(space="bottom"), 
scales=list(y=list(draw=F),x=list(draw=F))),split=c(1,1,1,3))
print(levelplot(z~x*y|factor.2,data.B,col.regions=topo.colors,asp="iso",xlab = 
"", ylab = "", colorkey = list(space="bottom"), 
scales=list(y=list(draw=F),x=list(draw=F))),split=c(1,2,1,3),newpage=FALSE)
print(levelplot(z~x*y|factor.2,data.C,col.regions=terrain.colors,asp="iso",xlab 
= "", ylab = "", colorkey = list(space="bottom"), 
scales=list(y=list(draw=F),x=list(draw=F))),split=c(1,3,1,3),newpage=FALSE)
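One possibility for the vertical pairing (an untested sketch, my addition, using data.A from the code above): lattice's layout argument is c(columns, rows), so layout = c(1, 2) stacks the two factor.2 panels vertically within each printed plot, which can then be placed side by side with split:

```r
library(lattice)
print(levelplot(z ~ x * y | factor.2, data.A, col.regions = heat.colors,
                aspect = "iso", layout = c(1, 2),  # 1 column, 2 rows of panels
                xlab = "", ylab = "",
                colorkey = list(space = "bottom"),
                scales = list(draw = FALSE)),
      split = c(1, 1, 3, 1))  # first of three side-by-side positions
```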

My other approach has been to plot the 6 levelplots individually so that I can 
control the placement. However, I'd like the paired levelplots to touch as 
they do in my first approach, and I'd like them to be the same size, despite 
having the colorkey below only one of the plots. Also, I'd like to put more 
space between the plot and the colorkey. Is there a way to manually control the 
size of the individual plot windows? Here is some additional code to illustrate 
this point:

data.A1 = data.A[data.A$factor.2 == "1",]
data.A2 = data.A[data.A$factor.2 == "2",]
data.B1 = data.B[data.B$factor.2 == "1",]
data.B2 = data.B[data.B$factor.2 == "2",]
data.C1 = data.C[data.C$factor.2 == "1",]
data.C2 = data.C[data.C$factor.2 == "2",]

print(levelplot(z~x*y,data=data.A1,col.regions=heat.colors,asp="iso",xlab = "", 
ylab = "",main="Method 
A",scales=list(y=list(draw=F),x=list(draw=F)),colorkey=FALSE),split=c(1,1,3,2))
print(levelplot(z~x*y,data=data.A2,col.regions=heat.colors,asp="iso",xlab = "", 
ylab = "", 
scales=list(y=list(draw=F),x=list(draw=F)),colorkey=list(space="bottom")),split=c(1,2,3,2),newpage=FALSE)

print(levelplot(z~x*y,data=data.B1,col.regions=topo.colors,asp="iso",xlab = "", 
ylab = "", main="Method 
B",scales=list(y=list(draw=F),x=list(draw=F)),colorkey=FALSE),split=c(2,1,3,2),newpage=FALSE)

print(levelplot(z~x*y,data=data.B2,col.regions=topo.colors,asp="iso",xlab = "", 
ylab = "", 
scales=list(y=list(draw=F),x=list(draw=F)),colorkey=list(space="bottom")),split=c(2,2,3,2),newpage=FALSE)

print(levelplot(z~x*y,data=data.C1,col.regions=terrain.colors,asp="iso",xlab = 
"", ylab = "", main="Method 
C",scales=list(y=list(draw=F),x=list(draw=F)),colorkey=FALSE),split=c(3,1,3,2),newpage=FALSE)

print(levelplot(z~x*y,data=data.C2,col.regions=terrain.colors,asp="iso",xlab = 
"", ylab = "", 
scales=list(y=list(draw=F),x=list(draw=F)),colorkey=list(space="bottom")),split=c(3,2,3,2),newpage=FALSE)


Any help would be greatly appreciated!

Ian
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] function order

2011-04-06 Thread Yan Jiao
Dear All

I'm trying to sort a matrix using the function order.
Something really odd:

e.g.
abc<-cbind(c(1,6,2),c(2,5,3),c(3,2,1))## matrix I want to sort

if I do
abc[ order(abc[,3]), increasing = TRUE]

the result is correct:
     [,1] [,2] [,3]
[1,]    2    3    1
[2,]    6    5    2
[3,]    1    2    3

But if I want to sort in decreasing order:
abc[ order(abc[,3]), decreasing = TRUE]

the result is wrong:
     [,1] [,2] [,3]
[1,]    2    3    1
[2,]    6    5    2
[3,]    1    2    3

Also if I use
abc[ order(abc[,3]), increasing = FALSE]
it returns nothing
[1,]
[2,]
[3,]

Why is that?


Many thanks

Yan

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] CSV file in "tm" package

2011-04-06 Thread Shreyasee
Hi,

I have a .CSV file with data in multiple columns.
I am using the "tm" package in R to extract data from the .CSV file.
The issue is that when I convert the .CSV file into a Corpus and try to
inspect the Corpus, I see numbers like 6, 2, 5, etc. instead of the data
from those respective columns.
Does anybody know a solution for this?
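One common cause (a hedged guess, my addition, not a confirmed diagnosis): building the corpus from the whole data frame makes each column a pseudo-document, so inspect() shows column summaries rather than text. A sketch that feeds only the text column in (the file and column names are assumptions):

```r
library(tm)
dat  <- read.csv("data.csv", stringsAsFactors = FALSE)  # assumed file name
corp <- Corpus(VectorSource(dat$text))  # one document per row of 'text'
inspect(corp)
```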


Regards,
Shreyasee

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] function order

2011-04-06 Thread Jim Lemon

On 04/06/2011 08:35 PM, Yan Jiao wrote:

Dear All

I'm trying to sort a matrix using the function order.
Something really odd:

e.g.
abc<-cbind(c(1,6,2),c(2,5,3),c(3,2,1))## matrix I want to sort

if I do
abc[ order(abc[,3]), increasing = TRUE]

the result is correct:
     [,1] [,2] [,3]
[1,]    2    3    1
[2,]    6    5    2
[3,]    1    2    3

But if I want to sort in decreasing order:
abc[ order(abc[,3]), decreasing = TRUE]

the result is wrong:
     [,1] [,2] [,3]
[1,]    2    3    1
[2,]    6    5    2
[3,]    1    2    3

Also if I use
abc[ order(abc[,3]), increasing = FALSE]
it returns nothing
[1,]
[2,]
[3,]

Why is that?


Hi Yan,
It is because you have put the "decreasing" argument outside the 
parentheses, and it is not being used in the "order" function.


Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] function order

2011-04-06 Thread Philipp Pagel
On Wed, Apr 06, 2011 at 11:35:32AM +0100, Yan Jiao wrote:
> abc<-cbind(c(1,6,2),c(2,5,3),c(3,2,1))## matrix I want to sort
> 
> if I do
> abc[ order(abc[,3]), increasing = TRUE]

Jim already pointed out that the argument needs to go inside the
parentheses of the order function. In addition, order has an argument
called 'decreasing', but none called 'increasing'. Finally, you are
lacking a comma in your subsetting of the matrix:

> abc[ order(abc[,3], decreasing=F)]
[1] 2 6 1

But you probably mean:

> abc[ order(abc[,3], decreasing=F), ]
     [,1] [,2] [,3]
[1,]    2    3    1
[2,]    6    5    2
[3,]    1    2    3

cu
Philipp


-- 
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
Maximus-von-Imhof-Forum 3
85354 Freising, Germany
http://webclu.bio.wzw.tum.de/~pagel/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Calculated mean value based on another column bin from dataframe.

2011-04-06 Thread Fabrice Tourre
Dear list,

I have a dataframe with two columns as follows.

> head(dat)
       V1      V2
1 0.15624 0.94567
2 0.26039 0.66442
3 0.16629 0.97822
4 0.23474 0.72079
5 0.11037 0.83760
6 0.14969 0.91312

I want to get the mean of column V2 within bins of column V1. I wrote the
code as follows. It works, but I think this is not the most elegant way.
Any suggestions?

dat<-read.table("dat.txt",head=F)
ran<-seq(0,0.5,0.05)
mm<-NULL
for (i in c(1:(length(ran)-1)))
{
fil<- dat[,1] > ran[i] & dat[,1]<=ran[i+1]
m<-mean(dat[fil,2])
mm<-c(mm,m)
}
mm

Here is the first 20 lines of my data.

> dput(head(dat,20))
structure(list(V1 = c(0.15624, 0.26039, 0.16629, 0.23474, 0.11037,
0.14969, 0.16166, 0.09785, 0.36417, 0.08005, 0.29597, 0.14856,
0.17307, 0.36718, 0.11621, 0.23281, 0.10415, 0.1025, 0.04238,
0.13525), V2 = c(0.94567, 0.66442, 0.97822, 0.72079, 0.8376,
0.91312, 0.88463, 0.82432, 0.55582, 0.9429, 0.78956, 0.93424,
0.87692, 0.83996, 0.74552, 0.9779, 0.9958, 0.9783, 0.92523, 0.99022
)), .Names = c("V1", "V2"), row.names = c(NA, 20L), class = "data.frame")
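One more compact alternative (a sketch, not from the thread): cut() bins V1 and tapply() averages V2 within each bin, matching the loop's (lower, upper] intervals.

```r
ran  <- seq(0, 0.5, 0.05)
bins <- cut(dat$V1, breaks = ran)  # intervals (0,0.05], (0.05,0.1], ...
mm   <- tapply(dat$V2, bins, mean) # mean of V2 within each V1 bin
mm
```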

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Creating a symmetric contingency table from two vectors with different length of levels in R

2011-04-06 Thread suparna mitra
Hello,
How can I create a symmetric contingency table from two categorical vectors
having different numbers of levels?
For example, one vector has 98 levels:
TotalData1$Taxa.1
 [1] "Aconoidasida" "Actinobacteria (class)"
"Actinopterygii"   "Alphaproteobacteria"
 [5] "Amoebozoa""Amphibia"
"Anthozoa" "Aquificae (class)"
and so on .
98 Levels: Aconoidasida Actinobacteria (class) 

  and the other vector has 105 levels
TotalData1$Taxa.2
[1] FlavobacteriaProteobacteria
Bacteroidetes/Chlorobi group Bacteria
[5] EpsilonproteobacteriaEpsilonproteobacteria
 Epsilonproteobacteria
and so on  ..
105 Levels: Acidobacteria Aconoidasida Actinobacteria (class) 

Now I want to create a symmetric contingency table.
Any quick idea will be really helpful.
Best regards,
Mitra

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Help in kmeans

2011-04-06 Thread Raji
Hi All,

  I was using the following command for performing kmeans for Iris dataset.

Kmeans_model<-kmeans(dataFrame[,c(1,2,3,4)],centers=3)

This was giving proper results for me. But in my application we generate
the R commands dynamically, and there was a requirement that column names
be sent to the R commands instead of column indices. Hence, to
incorporate this, I tried using the R commands in the following way.

kmeans_model<-kmeans((SepalLength+SepalWidth+PetalLength+PetalWidth),centers=3)

or

kmeans_model<-kmeans(as.matrix(SepalLength,SepalWidth,PetalLength,PetalWidth),centers=3)

In both the ways, we found that the results are different from what we saw
with the first command (with column indices).

Can you please let us know what is going wrong here? If so, can you please
let us know how the column names can be used in kmeans to obtain the correct
results?

Many thanks,
Raji 

--
View this message in context: 
http://r.789695.n4.nabble.com/Help-in-kmeans-tp3430433p3430433.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Problem to convert date to number

2011-04-06 Thread Chris82
Hi R users,

I have a maybe small problem which I cannot solve by myself.

I want to convert 

"chron" "dates" "times"

(04/30/06 11:35:00)


to a number with as.POSIXct.

The problem is that I can't choose different timezones. I always get "CEST"
and not "UTC", which is what I need.

date = as.POSIXct(y,tz="UTC")

"2006-04-30 11:35:00 CEST"

Then I tried to use as.POSIXlt.

date = as.POSIXlt(y,tz="UTC")

"2006-04-30 11:35:00 UTC"

The advantage is that I get the time in UTC, but now the problem is that I
can't convert it to a number:

date <- as.double(date)/86400

This works with as.POSIXct but not with as.POSIXlt.
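For what it's worth, a hedged sketch of one workaround (my addition; it relies on chron storing times as fractional days since 1970-01-01, implicitly UTC):

```r
library(chron)
y <- chron("04/30/06", "11:35:00")
# Scale days to seconds since the epoch and interpret as UTC:
date <- as.POSIXct(as.numeric(y) * 86400, origin = "1970-01-01", tz = "UTC")
date                     # prints with the UTC timezone
as.double(date) / 86400  # fractional days, as in the original post
```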

 
Thanks!

With best regards



--
View this message in context: 
http://r.789695.n4.nabble.com/Problem-to-convert-date-to-number-tp3430571p3430571.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] function order

2011-04-06 Thread Dennis Murphy
Hi:

Try this:

abc<-cbind(c(1,6,2),c(2,5,3),c(3,2,1))
abc[order(abc[, 1], decreasing = TRUE), ]
     [,1] [,2] [,3]
[1,]    6    5    2
[2,]    2    3    1
[3,]    1    2    3

HTH,
Dennis

On Wed, Apr 6, 2011 at 3:35 AM, Yan Jiao  wrote:

> Dear All
>
> I'm trying to sort a matrix using function order,
> Some thing really odd:
>
> e.g.
> abc<-cbind(c(1,6,2),c(2,5,3),c(3,2,1))## matrix I want to sort
>
> if I do
> abc[ order(abc[,3]), increasing = TRUE]
>
> the result is correct:
>      [,1] [,2] [,3]
> [1,]    2    3    1
> [2,]    6    5    2
> [3,]    1    2    3
>
> But if I want to sort in decreasing order:
> abc[ order(abc[,3]), decreasing = TRUE]
>
> the result is wrong:
>      [,1] [,2] [,3]
> [1,]    2    3    1
> [2,]    6    5    2
> [3,]    1    2    3
>
> Also if I use
> abc[ order(abc[,3]), increasing = FALSE]
> it returns nothing
> [1,]
> [2,]
> [3,]
>
> Why is that?
>
>
> Many thanks
>
> Yan
>
>[[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help in kmeans

2011-04-06 Thread Christian Hennig
I'm not going to comment on column names, but this is just to make you 
aware that the results of k-means depend on random initialisation.


This means that it is possible that you get different results if you run 
it several times. It basically gives you a local optimum and there may be 
more than one of these.

Use set.seed to see whether this explains your problem.
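A small sketch of that check, using the built-in iris data (my addition; with the seed fixed, selecting columns by name or by index gives identical results):

```r
set.seed(42)
m1 <- kmeans(iris[, c(1, 2, 3, 4)], centers = 3)
set.seed(42)
m2 <- kmeans(iris[, c("Sepal.Length", "Sepal.Width",
                      "Petal.Length", "Petal.Width")], centers = 3)
identical(m1$cluster, m2$cluster)  # TRUE
```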

Best regards,
Christian

On Wed, 6 Apr 2011, Raji wrote:


Hi All,

 I was using the following command for performing kmeans for Iris dataset.

Kmeans_model<-kmeans(dataFrame[,c(1,2,3,4)],centers=3)

This was giving proper results for me. But in my application we generate
the R commands dynamically, and there was a requirement that column names
be sent to the R commands instead of column indices. Hence, to
incorporate this, I tried using the R commands in the following way.

kmeans_model<-kmeans((SepalLength+SepalWidth+PetalLength+PetalWidth),centers=3)

or

kmeans_model<-kmeans(as.matrix(SepalLength,SepalWidth,PetalLength,PetalWidth),centers=3)

In both the ways, we found that the results are different from what we saw
with the first command (with column indices).

Can you please let us know what is going wrong here? If so, can you please
let us know how the column names can be used in kmeans to obtain the correct
results?

Many thanks,
Raji

--
View this message in context: 
http://r.789695.n4.nabble.com/Help-in-kmeans-tp3430433p3430433.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



*** --- ***
Christian Hennig
University College London, Department of Statistical Science
Gower St., London WC1E 6BT, phone +44 207 679 1698
chr...@stats.ucl.ac.uk, www.homepages.ucl.ac.uk/~ucakche

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] executing from .R source file in the src package

2011-04-06 Thread Duncan Murdoch

On 11-04-06 2:13 AM, Larry wrote:

Can I run R code straight from an R source (.R) file instead of .rdb/.rdx? I
of course tried simply unzipping the tar.gz in the R_LIBS directory, but R
complains with "not a valid installed package".

Real issue: I am very new to R and all, so this could be something basic.
  I'm trying to use ess-tracebug (Emacs front-end to trace/browser et al).
  It works great when I trace functions in .R files because browser outputs
filename+line-number when it hits a breakpoint. i.e. something like this:

debug at /home/lgd/test/R/test3.R#1: {

It even moves the cursor as you step through the function.  This is just
lovely as I'm sure everyone knows.  However, in case of a trace on function
in a package (ie loaded from a .rdb/.rdx) there is no filename/linenum
information probably because its not retained.  i.e. it prints something
like this:

debugging in: train(net, P, target, error.criterium = "LMS", report = TRUE,

Any way to work around this?  Thanks for all insights.


I don't know ESS, but to get debug info in a package, you need to 
install the package with debug information included.  By default 
source() includes it, installed packages don't.


Setting the environment variable

R_KEEP_PKG_SOURCE=yes

before running

R CMD INSTALL foo.tar.gz

will install the debug information.  Then the browser, etc. will list 
filename and line number information.  This is also necessary for 
setBreakpoint to work, but then you'll also probably need to say which 
environments to look in, e.g.


setBreakpoint("Sweave.R#70", env=environment(Sweave))

will set a breakpoint in whatever function is defined at line 70 of 
Sweave.R, as long as that function lives in the same environment as Sweave.


Duncan Murdoch



Re: [R] Creating a symmetric contingency table from two vectors with different length of levels in R

2011-04-06 Thread andrija djurovic
Hi:

Here is one solution:

a<-factor(c(1,2,4,5,6))
b<-factor(c(2,2,4,5,5))
b1<-factor(b,levels=c(levels(b),levels(a)[levels(a)%in%levels(b)==FALSE]))
table(a,b1)

but be aware that this works because the levels of b are a subset of the levels of a.
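For the general case, where neither factor's levels need be a subset of the other's, a sketch using the union of the two level sets keeps the table square:

```r
a <- factor(c(1, 2, 4, 5, 6))
b <- factor(c(2, 2, 4, 5, 5))

# Re-level both factors on the combined level set, so the resulting
# contingency table has identical row and column labels.
lev <- union(levels(a), levels(b))
tab <- table(factor(a, levels = lev), factor(b, levels = lev))
```

Here `tab` is a length(lev) x length(lev) table, with all-zero rows or columns for levels absent from one of the factors.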

Andrija

On Wed, Apr 6, 2011 at 10:39 AM, suparna mitra <
mi...@informatik.uni-tuebingen.de> wrote:

> Hello,
> How can I create a symmetric contingency table from two categorical vectors
> having different length of levels?
> For example one vector has 98 levels
> TotalData1$Taxa.1
>  [1] "Aconoidasida" "Actinobacteria (class)"
> "Actinopterygii"   "Alphaproteobacteria"
>  [5] "Amoebozoa""Amphibia"
> "Anthozoa" "Aquificae (class)"
> and so on .
> 98 Levels: Aconoidasida Actinobacteria (class) 
>
>  and the other vector has 105 levels
> TotalData1$Taxa.2
>[1] FlavobacteriaProteobacteria
> Bacteroidetes/Chlorobi group Bacteria
>[5] EpsilonproteobacteriaEpsilonproteobacteria
>  Epsilonproteobacteria
> and so on  ..
> 105 Levels: Acidobacteria Aconoidasida Actinobacteria (class) 
>
> Now I want to create a symmetric contingency table.
> Any quick idea will be really helpful.
> Best regards,
> Mitra
>




Re: [R] Calculated mean value based on another column bin from dataframe.

2011-04-06 Thread Henrique Dallazuanna
Try this:

fil <- sapply(ran[-length(ran)], '<', e2 = dat[,1]) &
sapply(ran[-1], '>=', e2 = dat[,1])
mm <- apply(fil, 2, function(idx) mean(dat[idx, 2]))
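As a hedged alternative, the usual base-R idiom for binned summaries is cut() plus tapply(); a self-contained sketch on a few rows of the posted data:

```r
dat <- data.frame(V1 = c(0.15624, 0.26039, 0.16629, 0.23474),
                  V2 = c(0.94567, 0.66442, 0.97822, 0.72079))
ran <- seq(0, 0.5, 0.05)

# cut() assigns each V1 value to a half-open bin (lo, hi];
# tapply() then averages V2 within each bin (empty bins give NA).
bins <- cut(dat$V1, breaks = ran)
mm <- tapply(dat$V2, bins, mean)
```

The NA entries for empty bins match the behaviour of the original loop.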

On Wed, Apr 6, 2011 at 5:48 AM, Fabrice Tourre  wrote:
> Dear list,
>
> I have a data frame with two columns, as follows.
>
>> head(dat)
>       V1      V2
>  0.15624 0.94567
>  0.26039 0.66442
>  0.16629 0.97822
>  0.23474 0.72079
>  0.11037 0.83760
>  0.14969 0.91312
>
> I want to get the mean of column V2 within bins of column V1. I wrote
> the code as follows. It works, but I don't think this is the most
> elegant way. Any suggestions?
>
> dat<-read.table("dat.txt",head=F)
> ran<-seq(0,0.5,0.05)
> mm<-NULL
> for (i in c(1:(length(ran)-1)))
> {
>    fil<- dat[,1] > ran[i] & dat[,1]<=ran[i+1]
>    m<-mean(dat[fil,2])
>    mm<-c(mm,m)
> }
> mm
>
> Here is the first 20 lines of my data.
>
>> dput(head(dat,20))
> structure(list(V1 = c(0.15624, 0.26039, 0.16629, 0.23474, 0.11037,
> 0.14969, 0.16166, 0.09785, 0.36417, 0.08005, 0.29597, 0.14856,
> 0.17307, 0.36718, 0.11621, 0.23281, 0.10415, 0.1025, 0.04238,
> 0.13525), V2 = c(0.94567, 0.66442, 0.97822, 0.72079, 0.8376,
> 0.91312, 0.88463, 0.82432, 0.55582, 0.9429, 0.78956, 0.93424,
> 0.87692, 0.83996, 0.74552, 0.9779, 0.9958, 0.9783, 0.92523, 0.99022
> )), .Names = c("V1", "V2"), row.names = c(NA, 20L), class = "data.frame")
>



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O



Re: [R] simple save question

2011-04-06 Thread Terry Therneau
--- begin inclusion--
Hi,

When I run the survfit function, I want to get the restricted mean
value and the standard error also. I found out how to do so using the
"print" function, as shown below:


The question is: is there any way to extract these values from the
print command?

-  end inclusion ---

Use sfit <- summary(fit).  Then sfit$table contains the data that the
print method produces.
 
No, there isn't a way to extract them from the print command; the
standard in S/R is for all print methods to return the object passed to
them, without embellishment.

Terry Therneau



Re: [R] Problem to convert date to number

2011-04-06 Thread David Winsemius


On Apr 6, 2011, at 7:55 AM, Chris82 wrote:


Hi R users,

I have a maybe small problem which I cannot solve by myself.

I want to convert

"chron" "dates" "times"

(04/30/06 11:35:00)


Using the example from help(chron)

> as.POSIXlt(x)
# chron times are assumed to be UTC but are printed with the current  
local value

[1] "1992-02-27 18:03:20 EST" "1992-02-27 17:29:56 EST"
[3] "1992-01-13 20:03:30 EST" "1992-02-28 13:21:03 EST"
[5] "1992-02-01 11:56:26 EST"
> as.POSIXlt(x, tz="UTC")
[1] "1992-02-27 23:03:20 UTC" "1992-02-27 22:29:56 UTC"
[3] "1992-01-14 01:03:30 UTC" "1992-02-28 18:21:03 UTC"
[5] "1992-02-01 16:56:26 UTC"
> as.POSIXlt(x, tz="CEST")  # "not working"
[1] "1992-02-27 23:03:20 UTC" "1992-02-27 22:29:56 UTC"
[3] "1992-01-14 01:03:30 UTC" "1992-02-28 18:21:03 UTC"
[5] "1992-02-01 16:56:26 UTC"

So it makes me wonder if as.POSIXct considers CEST to be a valid tz  
value.


> as.POSIXlt(x, tz="XYZST")
[1] "1992-02-27 23:03:20 UTC" "1992-02-27 22:29:56 UTC"
[3] "1992-01-14 01:03:30 UTC" "1992-02-28 18:21:03 UTC"
[5] "1992-02-01 16:56:26 UTC"

> as.POSIXlt(x, tz="EST5EDT")  # where  I am, and seems to be working
[1] "1992-02-27 18:03:20 EST" "1992-02-27 17:29:56 EST"
[3] "1992-01-13 20:03:30 EST" "1992-02-28 13:21:03 EST"
[5] "1992-02-01 11:56:26 EST"

But despite being returned by Sys.time(), that TLA (EDT) does not
"work":

> Sys.time()
[1] "2011-04-06 08:44:01 EDT"
> as.POSIXlt(x, tz="EDT")
# "EDT" printed as UTC values
[1] "1992-02-27 23:03:20 UTC" "1992-02-27 22:29:56 UTC"
[3] "1992-01-14 01:03:30 UTC" "1992-02-28 18:21:03 UTC"
[5] "1992-02-01 16:56:26 UTC"

But an unambiguous version does return the expected offset. All of
this can be specific to your system (not provided) and your locale
setting (also not provided):


> as.POSIXlt(x, tz="UTC+02")
[1] "1992-02-27 21:03:20 UTC" "1992-02-27 20:29:56 UTC"
[3] "1992-01-13 23:03:30 UTC" "1992-02-28 16:21:03 UTC"
[5] "1992-02-01 14:56:26 UTC"

--
David.



to a number with as.POSIXct.

The Problem is that I can't choose different timezones. I always get  
"CEST"

and not "UTC" what I need.

date = as.POSIXct(y,tz="UTC")

"2006-04-30 11:35:00 CEST"
Then I tried to use as.POSIXlt.

date = as.POSIXlt(y,tz="UTC")

"2006-04-30 11:35:00 UTC"

The advantage is I get the time in UTC, but now the problem is that I
cannot calculate numbers.

date <- as.double(date)/86400

It does not work with as.POSIXlt, but it does with as.POSIXct.


Thanks!

With best regards



--
View this message in context: 
http://r.789695.n4.nabble.com/Problem-to-convert-date-to-number-tp3430571p3430571.html
Sent from the R help mailing list archive at Nabble.com.



David Winsemius, MD
West Hartford, CT



Re: [R] frailty and survival curves

2011-04-06 Thread Terry Therneau
With respect to Cox models + frailty, and post-fit survival curves.

 1. There are two possible survival curves, the conditional curve where
we identify which center a subject comes from, and the marginal curve
where we have integrated out center and give survival for an
"unspecified" individual.  I find the first more useful.  More
importantly to your case, the survival package currently has no code to
calculate the second of these.

 2. When the number of centers is large the coxph code may have used a
sparse approximation to the variance matrix, for speed reasons.  In this
particular case one cannot use the "newdata" argument.  The reason is
entirely practical --- the code turned out to be very hard to write.
The need for this comes up very rarely, and the work around is to use
   coxph(... + frailty(center, sparse=1000))
where we set the "sparse computation" threshold to be some number larger
than the number of centers, i.e., force non-sparse computation.

Terry Therneau



Re: [R] lattice xscale.components: different ticks on top/bottom axis

2011-04-06 Thread Boris.Vasiliev
 
> On Sat, Apr 2, 2011 at 1:29 AM,   wrote:
> >
> >> On Fri, Mar 11, 2011 at 12:28 AM,
> >>  wrote:
> >> > Good afternoon,
> >> >
> >> > I am trying to create a plot where the bottom and top 
> >> > axes have the 
> >> > same scale but different tick marks.  I tried user-defined 
> >> > xscale.component function but it does not produce 
> >> > desired results.
> >> > Can anybody suggest where my use of xscale.component function is 
> >> > incorrect?
> >> >
> >> > For example, the code below tries to create a plot where 
> >> > horizontal 
> >> > axes limits are c(0,10), top axis has ticks at odd integers, and 
> >> > bottom axis has ticks at even integers.
> >> >
> >> > library(lattice)
> >> >
> >> > df <- data.frame(x=1:10,y=1:10)
> >> >
> >> > xscale.components.A <- function(...,user.value=NULL) {
> >> >  # get default axes definition list; print user.value
> >> >  ans <- xscale.components.default(...)
> >> >  print(user.value)
> >> >
> >> >  # start with the same definition of bottom and top axes
> >> >  ans$top <- ans$bottom
> >> >
> >> >  # - bottom labels
> >> >  ans$bottom$labels$at <- seq(0,10,by=2)
> >> >  ans$bottom$labels$labels <- paste("B",seq(0,10,by=2),sep="-")
> >> >
> >> >  # - top labels
> >> >  ans$top$labels$at <- seq(1,9,by=2)
> >> >  ans$top$labels$labels <- paste("T",seq(1,9,by=2),sep="-")
> >> >
> >> >  # return axes definition list
> >> >  return(ans)
> >> > }
> >> >
> >> > oltc <- xyplot(y~x,data=df,
> >> >
> >> > scales=list(x=list(limits=c(0,10),at=0:10,alternating=3)),
> >> >               xscale.components=xscale.components.A,
> >> >               user.value=1)
> >> > print(oltc)
> >> >
> >> > The code generates a figure with incorrectly placed 
> >> > bottom and top 
> >> > labels.  Bottom labels "B-0", "B-2", ... are at 0, 1, ... 
> >> > and top labels "T-1", "T-3", ... are at 0, 1, ...  When 
> >> > axis-function runs out of labels, it replaces labels with NA.
> >> >
> >> > It appears that lattice uses top$ticks$at to place labels and 
> >> > top$labels$labels for labels.  Is there a way to override this 
> >> > behaviour (other than to expand the "labels$labels"  vector to 
> >> > be as long as "ticks$at" vector and set necessary elements to "")?
> >>
> >> Well, $ticks$at is used to place the ticks, and 
> >> $labels$at is used to place the labels. They should 
> >> typically be the 
> >> same, but you have changed one and not the other.
> >> Everything seems to work if you set $ticks$at to the same 
> >> values as $labels$at:
> >>
> >>
> >>     ##  - bottom labels
> >> +   ans$bottom$ticks$at <- seq(0,10,by=2)
> >>     ans$bottom$labels$at <- seq(0,10,by=2)
> >>     ans$bottom$labels$labels <- paste("B",seq(0,10,by=2),sep="-")
> >>
> >>     ##  - top labels
> >> +   ans$top$ticks$at <- seq(1,9,by=2)
> >>     ans$top$labels$at <- seq(1,9,by=2)
> >>     ans$top$labels$labels <- paste("T",seq(1,9,by=2),sep="-")
> >>
> >>
> >> > Also, can user-parameter be passed into xscale.components() 
> >> > function? (For example, locations and labels of ticks on the top 
> >> > axis).  In the  code above, print(user.value) returns NULL even 
> >> > though in the xyplot() call user.value is 1.
> >>
> >> No. Unrecognized arguments are passed to the panel function only, not 
> >> to any other function. However, you can always define an inline
> >> function:
> >>
> >> oltc <- xyplot(y~x,data=df,
> >>                scales=list(x=list(limits=c(0,10), at = 0:10,
> >>                            alternating=3)),
> >>                xscale.components = function(...)
> >>                            xscale.components.A(..., user.value=1))
> >>
> >> Hope that helps (and sorry for the late reply).
> >>
> >> -Deepayan
> >>
> >
> > Deepyan,
> >
> > Thank you very much for your reply.  It makes things a bit clearer.
> >
> > In other words, in the list prepared by xscale.components(), the vectors 
> > $ticks$at and $labels$at must be the same.
> > If only every second tick is to be labelled then every second label 
> > should be set explicitly to empty strings:
> 
> Now when you put it that way, the current behaviour does seem 
> wrong (I didn't read your original post carefully enough). I 
> guess this was one of the not-yet-implemented things 
> mentioned in the Details section of ?xscale.components.default.
> 
> I have added support for different ticks$at and labels$at in 
> the SVN sources in r-forge. You can test it from there (your 
> original code works as expected). I won't make a new release 
> on CRAN until after R
> 2.13 is released (we are almost in code freeze now).
> 
> -Deepayan
> 

Great!
Thank you very much.
Boris.



Re: [R] Layout within levelplot from the lattice package

2011-04-06 Thread Dieter Menne

Ian Renner wrote:
> 
> Hi,
> 
> I'm a novice with levelplot and need some assistance! Basically, I want a
> window 
> which contains 6 levelplots of equal size presented in 3 columns and 2
> rows. 
> ...
> Is there any way to concatenate levelplots from a factor vertically as
> opposed 
> to horizontally? 
> 
> 

Thanks for providing a self-contained example. Remembering my early struggles
with lattice, you must have needed some hours to get this working!
Your last question is easy to answer: use a slightly different version of
split (see below). To keep the code more transparent, separating plot
generation from display, I prefer the following scheme:

p = xyplot(...)
print(p, split)

Normally one would try to use a single data frame and panels for your type of
plot, but as far as I can see this cannot be done here, because you want different
color regions, which is not vectorized. So doing it in three runs seems to be fine,
if Deepayan has no other solution.
To get the plots closer together you must find the correct par.settings.
This is one of the tricky parts in lattice, but try

str(trellis.par.get())

to find what is possible. 

Dieter

# ---
library(lattice)
start = expand.grid(1:10,1:14)
start2 = rbind(start,start,start,start,start,start)
z = rnorm(840)
factor.1 = c(rep("A", 280), rep("B", 280), rep("C", 280))
factor.2 = c(rep("1", 140), rep("2", 140), rep("1", 140), rep("2", 140),
rep("1", 140), rep("2", 140))

data = data.frame(start2, z, factor.1, factor.2)
   names(data)[1:2] = c("x", "y")

data.A = data[data$factor.1 == "A",]
data.B = data[data$factor.1 == "B",]
data.C = data[data$factor.1 == "C",]
## End of data generation

doLevels = function(data, col.regions){
  levelplot(z~x*y|factor.2,data,
col.regions=col.regions,asp="iso",xlab = "", 
ylab = "", colorkey = list(space="bottom"),
scales=list(y=list(draw=F),x=list(draw=F)),
par.settings=list(layout.widths=list(
  right.padding=-1,
  left.padding = -1
  ))
)
}

p1 = doLevels(data.A,heat.colors)
p2 = doLevels(data.B,topo.colors)
p3 = doLevels(data.C,terrain.colors)

print(p1,split=c(1,1,3,1),more=TRUE)
print(p2,split=c(2,1,3,1),more=TRUE)
print(p3,split=c(3,1,3,1))


--
View this message in context: 
http://r.789695.n4.nabble.com/Layout-within-levelplot-from-the-lattice-package-tp3430421p3430812.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Package diveMove readTDR problem

2011-04-06 Thread Ben Bolker
mwege  zoology.up.ac.za> writes:

> I am trying to read my TDR data into R using the readTDR function for the
> diveMove package.
> 
> > seal <- readTDR("file location and name here", dateCol=1, depthCol=3,
> > speed=FALSE, subsamp=1, concurrentCols=4:5)
> 
> But I keep getting the following error:
> > Error: all(!is.na(time)) is not TRUE
> 
> All my columns have values in them (there are no empty records).
> 
> The manual and vignette of the diveMove package don't give a proper
> description of how to read data into R. They only describe how to access the
> data in the system file that comes with the package.
> What am I doing wrong?


  It's hard to answer this without a reproducible example, and it's
generally harder to get answers about less-used/more specialized
packages.  I'm going to guess that at least some of your dates are
not in the same format as specified (from the manual page: 
default is "%d/%m/%Y %H:%M:%S" --
you can change this with the 'dtformat' argument).

  You shouldn't need to specify arguments to the function that
are the same as the defaults: I would expect that

seal <- readTDR("file location and name here", 
subsamp=1, concurrentCols=4:5)

would work equally well (or poorly ...)
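One way to check the guess above (a sketch with made-up date strings, not the poster's actual file): parse a few raw date fields with strptime() and look for NAs, which flag rows that don't match the expected format:

```r
# Two hypothetical date strings: the first matches diveMove's default
# "%d/%m/%Y %H:%M:%S" format, the second does not.
dates <- c("25/12/2010 13:45:00", "2010-12-25 13:45:00")
parsed <- strptime(dates, format = "%d/%m/%Y %H:%M:%S")
is.na(parsed)   # TRUE marks an entry whose format does not match
```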

"%d/%m/%Y %H:%M:%S"



[R] Decimal Accuracy Loss?

2011-04-06 Thread Brigid Mooney
This is hopefully a quick question on decimal accuracy.  Is any
decimal accuracy lost when casting a numeric vector as a matrix?  And
then again casting the result back to a numeric?

I'm finding that my calculation values are different when I run for
loops that manually calculate matrix multiplication as compared to
when I cast the vectors as matrices and multiply them using "%*%".
(The errors are very small, but the process is run iteratively
thousands of times, at which point the error between the two
differences becomes noticeable.)

I've read FAQ # 7.31 "Why doesn't R think these numbers are equal?",
but just want to confirm that the differences in values are due to
differences in the matrix multiplication operator and manual
calculation via for loops, rather than information that is lost when
casting a numeric as a matrix and back again.

Thanks in advance for the help,
Brigid



Re: [R] Decimal Accuracy Loss?

2011-04-06 Thread Bert Gunter
Confirmed: "casting" just adds/removes the dim attribute on the
numeric vector/matrix.

-- Bert

On Wed, Apr 6, 2011 at 8:33 AM, Brigid Mooney  wrote:
> This is hopefully a quick question on decimal accuracy.  Is any
> decimal accuracy lost when casting a numeric vector as a matrix?  And
> then again casting the result back to a numeric?
>
> I'm finding that my calculation values are different when I run for
> loops that manually calculate matrix multiplication as compared to
> when I cast the vectors as matrices and multiply them using "%*%".
> (The errors are very small, but the process is run iteratively
> thousands of times, at which point the error between the two
> differences becomes noticeable.)
>
> I've read FAQ # 7.31 "Why doesn't R think these numbers are equal?",
> but just want to confirm that the differences in values are due to
> differences in the matrix multiplication operator and manual
> calculation via for loops, rather than information that is lost when
> casting a numeric as a matrix and back again.
>
> Thanks in advance for the help,
> Brigid
>



-- 
"Men by nature long to get on to the ultimate truths, and will often
be impatient with elementary studies or fight shy of them. If it were
possible to reach the ultimate truths without the elementary studies
usually prefixed to them, these would not be preparatory studies but
superfluous diversions."

-- Maimonides (1135-1204)

Bert Gunter
Genentech Nonclinical Biostatistics



Re: [R] Decimal Accuracy Loss?

2011-04-06 Thread Brigid Mooney
Thanks, Bert.  That's a big help.

-Brigid



On Wed, Apr 6, 2011 at 11:45 AM, Bert Gunter  wrote:
> Confirmed. "Casting" just adds/removes the dim attribute to the
> numeric vector/matrix.
>
> -- Bert
>
> On Wed, Apr 6, 2011 at 8:33 AM, Brigid Mooney  wrote:
>> This is hopefully a quick question on decimal accuracy.  Is any
>> decimal accuracy lost when casting a numeric vector as a matrix?  And
>> then again casting the result back to a numeric?
>>
>> I'm finding that my calculation values are different when I run for
>> loops that manually calculate matrix multiplication as compared to
>> when I cast the vectors as matrices and multiply them using "%*%".
>> (The errors are very small, but the process is run iteratively
>> thousands of times, at which point the error between the two
>> differences becomes noticeable.)
>>
>> I've read FAQ # 7.31 "Why doesn't R think these numbers are equal?",
>> but just want to confirm that the differences in values are due to
>> differences in the matrix multiplication operator and manual
>> calculation via for loops, rather than information that is lost when
>> casting a numeric as a matrix and back again.
>>
>> Thanks in advance for the help,
>> Brigid
>>
>
>
>
> --
> "Men by nature long to get on to the ultimate truths, and will often
> be impatient with elementary studies or fight shy of them. If it were
> possible to reach the ultimate truths without the elementary studies
> usually prefixed to them, these would not be preparatory studies but
> superfluous diversions."
>
> -- Maimonides (1135-1204)
>
> Bert Gunter
> Genentech Nonclinical Biostatistics
>



Re: [R] Decimal Accuracy Loss?

2011-04-06 Thread Petr Savicky
On Wed, Apr 06, 2011 at 11:33:48AM -0400, Brigid Mooney wrote:
> This is hopefully a quick question on decimal accuracy.  Is any
> decimal accuracy lost when casting a numeric vector as a matrix?  And
> then again casting the result back to a numeric?
> 
> I'm finding that my calculation values are different when I run for
> loops that manually calculate matrix multiplication as compared to
> when I cast the vectors as matrices and multiply them using "%*%".
> (The errors are very small, but the process is run iteratively
> thousands of times, at which point the error between the two
> differences becomes noticeable.)
> 
> I've read FAQ # 7.31 "Why doesn't R think these numbers are equal?",
> but just want to confirm that the differences in values are due to
> differences in the matrix multiplication operator and manual
> calculation via for loops, rather than information that is lost when
> casting a numeric as a matrix and back again.

Others already confirmed that casting a numeric as a matrix and back
again does not change the numbers. It is likely that the library
operator "%*%" is more accurate than a straightforward for loop.
For example, sum(x) uses a more accurate algorithm than iteration
of s <- s + x[i] in double precision.
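A quick sketch of the size of such differences (the exact values are platform-dependent): accumulate in a plain loop and compare against sum():

```r
set.seed(1)
x <- runif(1e5)

# Naive left-to-right accumulation in double precision.
s <- 0
for (xi in x) s <- s + xi

# The two results agree to near machine precision, but need not be
# bit-identical -- the same kind of discrepancy as loop vs. %*%.
abs(s - sum(x))
```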

Petr Savicky.



[R] Use of the dot.dot.dot option in functions.

2011-04-06 Thread KENNETH R CABRERA
Hi R users:

I tried this code, where "fun" is a parameter holding the name of a
random-generating function, and I intend to use the "..." argument to pass
the parameters of different random-generating functions.

What am I doing wrong?

f1<-function(nsim=20,n=10,fun=rnorm,...){ 
vp<-replicate(nsim,t.test(fun(n,...),fun(n,...))$p.value)
return(vp)
}

This works!
f1()
f1(n=20,mean=10)

These two fail:
f1(n=10,fun=rexp)
f1(n=10,fun=rbeta,shape1=1,shape2=2)
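One plausible explanation (a hedged guess, worth verifying on your system): replicate() wraps its expression in an anonymous function(...), so the "..." inside the expression refers to that wrapper's own dots (which receive sapply's dummy argument), not to f1's. Forwarding through sapply(), whose anonymous function has no "..." formal and therefore finds the enclosing dots lexically, avoids the shadowing:

```r
f2 <- function(nsim = 20, n = 10, fun = rnorm, ...) {
  # function(i) has no '...' formal, so '...' below resolves to f2's dots
  sapply(seq_len(nsim),
         function(i) t.test(fun(n, ...), fun(n, ...))$p.value)
}

set.seed(42)
p <- f2(nsim = 5, n = 10, fun = rbeta, shape1 = 1, shape2 = 2)
```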

Thank you for your help.

Kenneth




Re: [R] summing values by week - based on daily dates - but with some dates missing

2011-04-06 Thread Dimitri Liakhovitski
Guys, sorry to bother you again:

I am running everything as before (see code below - before the line
with a lot of ##). But now I am getting an error:
Error in eval(expr, envir, enclos) : could not find function "na.locf"
I also noticed that after I run the 3rd line from the bottom: "wk <-
as.numeric(format(myframe$dates, "%Y.%W"))" - there are some weeks
that end with .00
And then, after I run the 2nd line from the bottom: "is.na(wk) <- wk
%% 1 == 0" those weeks turn into NAs.
Whether I run the second line or not - I get the same error about it
not finding the function "na.locf".
Do you know what might be going on?
Thanks a lot!
Dimitri

### Creating a longer example data set:
mydates<-rep(seq(as.Date("2008-12-29"), length = 500, by = "day"),2)
myfactor<-c(rep("group.1",500),rep("group.2",500))
set.seed(123)
myvalues<-runif(1000,0,1)
myframe<-data.frame(dates=mydates,group=myfactor,value=myvalues)
(myframe)
dim(myframe)

## Removing same rows (dates) unsystematically:
set.seed(123)
removed.group1<-sample(1:500,size=150,replace=F)
set.seed(456)
removed.group2<-sample(501:1000,size=150,replace=F)
to.remove<-c(removed.group1,removed.group2);length(to.remove)
to.remove<-to.remove[order(to.remove)]
myframe<-myframe[-to.remove,]
(myframe)
dim(myframe)
names(myframe)# write.csv(myframe,file="x.test.csv",row.names=F)

wk <- as.numeric(format(myframe$dates, "%Y.%W"))
is.na(wk) <- wk %% 1 == 0
solution<-aggregate(value ~ group + na.locf(wk), myframe, FUN = sum)
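An alternative sketch that avoids zoo entirely: cut.Date() with breaks = "week" bins dates into Monday-starting weeks, labels each week by its Monday, and keeps weeks distinct across year boundaries, so aggregate(value ~ group + wk, ...) needs no NA patching:

```r
# Three dates: a Monday, the following Sunday (same week, straddling
# the year boundary), and the next Monday.
d <- as.Date(c("2008-12-29", "2009-01-04", "2009-01-05"))

# Weeks start on Monday by default (start.on.monday = TRUE).
wk <- cut(d, breaks = "week")
as.character(wk)
# "2008-12-29" "2008-12-29" "2009-01-05": the year-straddling week is one bin
```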




###

On Wed, Mar 30, 2011 at 5:25 PM, Henrique Dallazuanna  wrote:
> You're right:
>
> wk <- as.numeric(format(myframe$dates, "%Y.%W"))
> is.na(wk) <- wk %% 1 == 0
> solution<-aggregate(value ~ group + na.locf(wk), myframe, FUN = sum)
>
>
> On Wed, Mar 30, 2011 at 6:10 PM, Dimitri Liakhovitski
>  wrote:
>> Yes, zoo! That's what I forgot. It's great.
>> Henrique, thanks a lot! One question:
>>
>> if the data are as I originally posted - then week numbered 52 is
>> actually the very first week (it straddles 2008-2009).
>> What if the data are much longer (like in the code below - same as before,
>> but more dates), so that we have more than one year to deal with?
>> It looks like this code is lumping everything into 52 weeks. And my
>> goal is to keep each week independent. If I have 2 years, then it
>> should be 100+ weeks. Makes sense?
>> Thank you!
>>
>> ### Creating a longer example data set:
>> mydates<-rep(seq(as.Date("2008-12-29"), length = 500, by = "day"),2)
>> myfactor<-c(rep("group.1",500),rep("group.2",500))
>> set.seed(123)
>> myvalues<-runif(1000,0,1)
>> myframe<-data.frame(dates=mydates,group=myfactor,value=myvalues)
>> (myframe)
>> dim(myframe)
>>
>> ## Removing same rows (dates) unsystematically:
>> set.seed(123)
>> removed.group1<-sample(1:500,size=150,replace=F)
>> set.seed(456)
>> removed.group2<-sample(501:1000,size=150,replace=F)
>> to.remove<-c(removed.group1,removed.group2);length(to.remove)
>> to.remove<-to.remove[order(to.remove)]
>> myframe<-myframe[-to.remove,]
>> (myframe)
>> dim(myframe)
>> names(myframe)
>>
>> library(zoo)
>> wk <- as.numeric(format(myframe$dates, '%W'))
>> is.na(wk) <- wk == 0
>> solution<-aggregate(value ~ group + na.locf(wk), myframe, FUN = sum)
>> solution<-solution[order(solution$group),]
>> write.csv(solution,file="test.csv",row.names=F)
>>
>>
>>
>> On Wed, Mar 30, 2011 at 4:45 PM, Henrique Dallazuanna  
>> wrote:
>>> Try this:
>>>
>>> library(zoo)
>>> wk <- as.numeric(format(myframe$dates, '%W'))
>>> is.na(wk) <- wk == 0
>>> aggregate(value ~ group + na.locf(wk), myframe, FUN = sum)
>>>
>>>
>>>
>>> On Wed, Mar 30, 2011 at 4:35 PM, Dimitri Liakhovitski
>>>  wrote:
 Henrique, this is great, thank you!

 It's almost what I was looking for! Only one small thing - it doesn't
 "merge" the results for weeks that "straddle" 2 years. In my example -
 last week of year 2008 and the very first week of 2009 are one week.
 Any way to "join them"?
 Asking because in reality I'll have many years and hundreds of groups
 - hence, it'll be hard to do it manually.


 BTW - does format(dates,"%Y.%W") always consider weeks as starting with 
 Mondays?

 Thank you very much!
 Dimitri


 On Wed, Mar 30, 2011 at 2:55 PM, Henrique Dallazuanna  
 wrote:
> Try this:
>
> aggregate(value ~ group + format(dates, "%Y.%W"), myframe, FUN = sum)
>
>
> On Wed, Mar 30, 2011 at 11:23 AM, Dimitri Liakhovitski
>  wrote:
>> Dear everybody,
>>
>> I have the following challenge. I have a data set with 2 subgroups,
>> dates (days), and corresponding values (see example code below).
>> Within each subgroup: I need to aggregate (sum) the values by week -
>> for weeks that start on a Monday (for example, 2008-12-29 was a
>> Monday).
>> I find it difficult because I have missing dates in my data - so that
>> sometimes I don't even have the date for some M

Re: [R] summing values by week - based on daily dates - but with some dates missing

2011-04-06 Thread Dimitri Liakhovitski
Sorry - never mind. It turns out I did not load the zoo package. That
was the reason.

On Wed, Apr 6, 2011 at 12:14 PM, Dimitri Liakhovitski
 wrote:
> Guys, sorry to bother you again:
>
> I am running everything as before (see code below - before the line
> with a lot of ##). But now I am getting an error:
> Error in eval(expr, envir, enclos) : could not find function "na.locf"
> I also noticed that after I run the 3rd line from the bottom: "wk <-
> as.numeric(format(myframe$dates, "%Y.%W"))" - there are some weeks
> that end with .00
> And then, after I run the 2nd line from the bottom: "is.na(wk) <- wk
> %% 1 == 0" those weeks turn into NAs.
> Whether I run the second line or not - I get the same error about it
> not finding the function "na.locf".
> Do you know what might be going on?
> Thanks a lot!
> Dimitri
>
> ### Creating a longer example data set:
> mydates<-rep(seq(as.Date("2008-12-29"), length = 500, by = "day"),2)
> myfactor<-c(rep("group.1",500),rep("group.2",500))
> set.seed(123)
> myvalues<-runif(1000,0,1)
> myframe<-data.frame(dates=mydates,group=myfactor,value=myvalues)
> (myframe)
> dim(myframe)
>
> ## Removing same rows (dates) unsystematically:
> set.seed(123)
> removed.group1<-sample(1:500,size=150,replace=F)
> set.seed(456)
> removed.group2<-sample(501:1000,size=150,replace=F)
> to.remove<-c(removed.group1,removed.group2);length(to.remove)
> to.remove<-to.remove[order(to.remove)]
> myframe<-myframe[-to.remove,]
> (myframe)
> dim(myframe)
> names(myframe)# write.csv(myframe,file="x.test.csv",row.names=F)
>
> wk <- as.numeric(format(myframe$dates, "%Y.%W"))
> is.na(wk) <- wk %% 1 == 0
> solution<-aggregate(value ~ group + na.locf(wk), myframe, FUN = sum)
>
>
>
>
> ###
>
> On Wed, Mar 30, 2011 at 5:25 PM, Henrique Dallazuanna  
> wrote:
>> You're right:
>>
>> wk <- as.numeric(format(myframe$dates, "%Y.%W"))
>> is.na(wk) <- wk %% 1 == 0
>> solution<-aggregate(value ~ group + na.locf(wk), myframe, FUN = sum)
>>
>>
>> On Wed, Mar 30, 2011 at 6:10 PM, Dimitri Liakhovitski
>>  wrote:
>>> Yes, zoo! That's what I forgot. It's great.
>>> Henrique, thanks a lot! One question:
>>>
>>> if the data are as I originally posted - then week numbered 52 is
>>> actually the very first week (it straddles 2008-2009).
>>> What if the data much longer (like in the code below - same as before,
>>> but more dates) so that we have more than 1 year to deal with.
>>> It looks like this code is lumping everything into 52 weeks. And my
>>> goal is to keep each week independent. If I have 2 years, then it
>>> should be 100+ weeks. Makes sense?
>>> Thank you!
>>>
>>> ### Creating a longer example data set:
>>> mydates<-rep(seq(as.Date("2008-12-29"), length = 500, by = "day"),2)
>>> myfactor<-c(rep("group.1",500),rep("group.2",500))
>>> set.seed(123)
>>> myvalues<-runif(1000,0,1)
>>> myframe<-data.frame(dates=mydates,group=myfactor,value=myvalues)
>>> (myframe)
>>> dim(myframe)
>>>
>>> ## Removing same rows (dates) unsystematically:
>>> set.seed(123)
>>> removed.group1<-sample(1:500,size=150,replace=F)
>>> set.seed(456)
>>> removed.group2<-sample(501:1000,size=150,replace=F)
>>> to.remove<-c(removed.group1,removed.group2);length(to.remove)
>>> to.remove<-to.remove[order(to.remove)]
>>> myframe<-myframe[-to.remove,]
>>> (myframe)
>>> dim(myframe)
>>> names(myframe)
>>>
>>> library(zoo)
>>> wk <- as.numeric(format(myframe$dates, '%W'))
>>> is.na(wk) <- wk == 0
>>> solution<-aggregate(value ~ group + na.locf(wk), myframe, FUN = sum)
>>> solution<-solution[order(solution$group),]
>>> write.csv(solution,file="test.csv",row.names=F)
>>>
>>>
>>>
>>> On Wed, Mar 30, 2011 at 4:45 PM, Henrique Dallazuanna  
>>> wrote:
 Try this:

 library(zoo)
 wk <- as.numeric(format(myframe$dates, '%W'))
 is.na(wk) <- wk == 0
 aggregate(value ~ group + na.locf(wk), myframe, FUN = sum)



 On Wed, Mar 30, 2011 at 4:35 PM, Dimitri Liakhovitski
  wrote:
> Henrique, this is great, thank you!
>
> It's almost what I was looking for! Only one small thing - it doesn't
> "merge" the results for weeks that "straddle" 2 years. In my example -
> last week of year 2008 and the very first week of 2009 are one week.
> Any way to "join them"?
> Asking because in reality I'll have many years and hundreds of groups
> - hence, it'll be hard to do it manually.
>
>
> BTW - does format(dates,"%Y.%W") always consider weeks as starting with 
> Mondays?
>
> Thank you very much!
> Dimitri
>
>
> On Wed, Mar 30, 2011 at 2:55 PM, Henrique Dallazuanna  
> wrote:
>> Try this:
>>
>> aggregate(value ~ group + format(dates, "%Y.%W"), myframe, FUN = sum)
>>
>>
>> On Wed, Mar 30, 2011 at 11:23 AM, Dimitri Liakhovitski
>>  wrote:
>>> Dear everybody,
>>>
>>> I have the following challenge. I have a data set with 2 subgroups,
>>> dates (days), and correspond

Re: [R] Decimal Accuracy Loss?

2011-04-06 Thread peter dalgaard

On Apr 6, 2011, at 17:58 , Petr Savicky wrote:

> On Wed, Apr 06, 2011 at 11:33:48AM -0400, Brigid Mooney wrote:
>> This is hopefully a quick question on decimal accuracy.  Is any
>> decimal accuracy lost when casting a numeric vector as a matrix?  And
>> then again casting the result back to a numeric?
>> 
>> I'm finding that my calculation values are different when I run for
>> loops that manually calculate matrix multiplication as compared to
>> when I cast the vectors as matrices and multiply them using "%*%".
>> (The errors are very small, but the process is run iteratively
>> thousands of times, at which point the error between the two
>> differences becomes noticeable.)
>> 
>> I've read FAQ # 7.31 "Why doesn't R think these numbers are equal?",
>> but just want to confirm that the differences in values are due to
>> differences in the matrix multiplication operator and manual
>> calculation via for loops, rather than information that is lost when
>> casting a numeric as a matrix and back again.
> 
> Others already confirmed that casting a numeric as a matrix and back
> again does not change the numbers. It is likely that the library
> operator "%*%" is more accurate than a straightforward for loop.
> For example, sum(x) uses a more accurate algorithm than iteration
> of s <- s + x[i] in double precision.

Even more likely, %*% is optimized for speed by reordering the additions and 
multiplications to take advantage of pipelining and CPU caches at several 
levels. This may or may not improve accuracy, but certainly does affect the 
last bits in the results. 
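A small sketch illustrating the effect (nothing specific to %*%; it just shows that a naive accumulation loop and sum() can disagree in the last bits):

```r
set.seed(1)
x <- rnorm(1e5)
s1 <- 0
for (xi in x) s1 <- s1 + xi   # naive left-to-right accumulation
s2 <- sum(x)                  # base R's sum (extended-precision accumulator)
abs(s1 - s2)                  # typically tiny, sometimes nonzero in the last bits
```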

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] metaplot

2011-04-06 Thread cheba meier
Dear all,

I have four variables: Study.Name, OR, 95%LCI and 95%UCI, and I would like to
create a meta-analysis plot. I can't use the meta.MH function with metaplot
because n.trt, n.ctrl, col.trt and col.ctrl are not available. Is there an
alternative way to do it?

Many thanks in advance,
Cheba

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Use of the dot.dot.dot option in functions.

2011-04-06 Thread Duncan Murdoch

On 06/04/2011 12:04 PM, KENNETH R CABRERA wrote:

Hi R users:

I tried this code, where "fun" is a parameter holding a random-generating
function, and I intend to use the "..." parameter to pass the parameters
of different random-generating functions.

What am I doing wrong?

f1<-function(nsim=20,n=10,fun=rnorm,...){
 vp<-replicate(nsim,t.test(fun(n,...),fun(n,...))$p.value)
 return(vp)
}

This works!
f1()
f1(n=20,mean=10)

These two fail:
f1(n=10,fun=rexp)
f1(n=10,fun=rbeta,shape1=1,shape2=2)

Thank you for your help.



I imagine it's a scoping problem: replicate() is probably not evaluating 
the ... in the context you think it is.  You could debug this by writing 
a function like


showArgs <- function(n, ...) {
  print(n)
  print(list(...))
}

and calling f1(n=10, fun=showArgs), but it might be easier just to avoid 
the problem:


f1 <- function(nsim=20,n=10,fun=rnorm,...){
force(fun)
force(n)
localfun <- function() fun(n, ...)
vp<-replicate(nsim,t.test(localfun(), localfun())$p.value)
return(vp)
}
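Running a diagnostic along Duncan's lines suggests why the literal ... fails inside replicate(): replicate() wraps its expression in an anonymous function(...) evaluated via sapply(), so a ... written inside the expression binds to that wrapper's dots (which receive sapply's loop element), not to f1()'s. A hedged sketch:

```r
showArgs <- function(n, ...) list(n = n, dots = list(...))

f1 <- function(nsim = 2, n = 10, fun = rnorm, ...) {
  # the ... below is NOT f1's ...: it belongs to replicate()'s wrapper
  replicate(nsim, fun(n, ...), simplify = FALSE)
}

out <- f1(n = 10, fun = showArgs, extra = 99)
out[[1]]$dots   # not list(extra = 99): the caller's dots never arrive
```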

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] General binary search?

2011-04-06 Thread Martin Morgan

On 04/04/2011 01:50 PM, William Dunlap wrote:

-Original Message-
From: r-help-boun...@r-project.org
[mailto:r-help-boun...@r-project.org] On Behalf Of Stavros Macrakis
Sent: Monday, April 04, 2011 1:15 PM
To: r-help
Subject: [R] General binary search?

Is there a generic binary search routine in a standard library which

a) works for character vectors
b) runs in O(log(N)) time?

I'm aware of findInterval(x,vec), but it is restricted to
numeric vectors.


xtfrm(x) will convert a character (or other) vector to
a numeric vector with the same ordering.  findInterval
can work on that.  E.g.,
>  f0<- function(x, vec) {
tmp<- xtfrm(c(x, vec))
findInterval(tmp[seq_along(x)], tmp[-seq_along(x)])
  }
>  f0(c("Baby", "Aunt", "Dog"), LETTERS)
[1] 2 1 4
I've never looked at its speed.


For a little progress (though no 'generic binary search in a standard 
library'), here's the 'one-liner'


bsearch1 <-
function(val, tab, L=1L, H=length(tab))
{
while (H >= L) {
M <- L + (H - L) %/% 2L
if (tab[M] > val) H <- M - 1L
else if (tab[M] < val) L <- M + 1L
else return(M)
}
return(L - 1L)
}

It seems like a good candidate for the new (R-2.13) 'compiler' package, so

library(compiler)
bsearch2 <- cmpfun(bsearch1)

And Bill's suggestion

bsearch3 <- function(val, tab) {
tmp <- xtfrm(c(val, tab))
findInterval(tmp[seq_along(val)], tmp[-seq_along(val)])
}

which will work best when 'val' is a vector to be looked up.

A quick look at data.table:::sortedmatch seemed to return matches, 
whereas Stavros is looking for lower bounds.


It seems that one could shift the weight more to C code by 'vectorizing' 
the one-liner, first as


bsearch5 <-
function(val, tab, L=1L, H=length(tab))
{
b <- cbind(L=rep(L, length(val)), H=rep(H, length(val)))
i0 <- seq_along(val)
repeat {
M <- b[i0,"L"] + (b[i0,"H"] - b[i0,"L"]) %/% 2L
i <- tab[M] > val[i0]
b[i0 + i * length(val)] <-
ifelse(i, M - 1L, ifelse(tab[M] < val[i0], M + 1L, M))
i0 <- which(b[i0, "H"] >= b[i0, "L"])
if (!length(i0)) break;
}
b[,"L"] - 1L
}

and then a little more thoughtfully (though more room for improvement) as

bsearch7 <-
function(val, tab, L=1L, H=length(tab))
{
b <- cbind(L=rep(L, length(val)), H=rep(H, length(val)))
i0 <- seq_along(val)
repeat {
updt <- M <- b[i0,"L"] + (b[i0,"H"] - b[i0,"L"]) %/% 2L
tabM <- tab[M]
val0 <- val[i0]
i <- tabM < val0
updt[i] <- M[i] + 1L
i <- tabM > val0
updt[i] <- M[i] - 1L
b[i0 + i * length(val)] <- updt
i0 <- which(b[i0, "H"] >= b[i0, "L"])
if (!length(i0)) break;
}
b[,"L"] - 1L
}

none of bsearch 3, 5, or 7 is likely to benefit substantially from 
compilation.


Here's a little test data set converting numeric to character as an easy 
cheat.


set.seed(123L)
x <- sort(as.character(rnorm(1e6)))
y <- as.character(rnorm(1e4))

There seems to be some significant initial overhead, so we warm things 
up (and also introduce the paradigm for multiple look-ups in bsearch 1, 2)


warmup <- function(y, x) {
lapply(y, bsearch1, x)
lapply(y, bsearch2, x)
bsearch3(y, x)
bsearch5(y, x)
bsearch7(y, x)
}
replicate(3, warmup(y, x))

and then time

> system.time(res1 <- unlist(lapply(y, bsearch1, x), use.names=FALSE))
   user  system elapsed
  2.692   0.000   2.696
> system.time(res2 <- unlist(lapply(y, bsearch2, x), use.names=FALSE))
   user  system elapsed
  1.379   0.000   1.380
> identical(res1, res2)
[1] TRUE
> system.time(res3 <- bsearch3(y, x)); identical(res1, res3)
   user  system elapsed
  8.339   0.001   8.350
[1] TRUE
> system.time(res5 <- bsearch5(y, x)); identical(res1, res5)
   user  system elapsed
  0.700   0.000   0.702
[1] TRUE
> system.time(res7 <- bsearch7(y, x)); identical(res1, res7)
   user  system elapsed
  0.222   0.000   0.222
[1] TRUE

Martin



Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com



I'm also aware of various hashing solutions (e.g.
new.env(hash=TRUE) and
fastmatch), but I need the greatest-lower-bound match in my
application.

findInterval is also slow for large N=length(vec) because of the O(N)
checking it does, as Duncan Murdoch has pointed
out:
though
its documentation says it runs in O(n * log(N)), it actually
runs in O(n *
log(N) + N), which is quite noticeable for largish N.  But
that is easy
enough to work around by writing a variant of findInterval which calls
find_interv_vec without checking.

 -s

PS Yes, binary search is a one-liner in R, but I always prefer to use
standard, fast native libraries when possible

binarysearch<- function(val,tab,L,H) {while (H>=L) {
M=L+(H-L) %/% 2; if
(tab[M]>val) H<-M-1 else if (tab[M]<val) L<-M+1 else return(M)}
return(L-1)}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Examples of web-based Sweave use?

2011-04-06 Thread Tal Galili
In case you haven't seen it, it seems that you've got a post answering your
question:
http://biostatmatt.com/archives/1184

 

Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--




On Mon, Apr 4, 2011 at 4:38 PM, Tal Galili  wrote:

> I've written about a bunch of Web R interfaces here:
> *
> http://www.r-statistics.com/2010/04/jeroen-oomss-ggplot2-web-interface-a-new-version-released-v0-2/
> *
>
> http://www.r-statistics.com/2010/04/r-node-a-web-front-end-to-r-with-protovis/
> (And some other posts here:
> http://www.r-statistics.com/category/r-and-the-web/
> )
> I'm not sure which of
> them use Sweave behind them, but you could look around and check.
>
> Hope that helps,
> Tal
>
>
> Contact
> Details:---
> Contact me: tal.gal...@gmail.com |  972-52-7275845
> Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
> www.r-statistics.com (English)
>
> --
>
>
>
>
>
> On Mon, Apr 4, 2011 at 2:31 PM, carslaw  wrote:
>
>> I appreciate that this is OT, but I'd be grateful for pointers to examples
>> of
>> where
>> Sweave has been used for web-based applications.  In particular, examples
>> of
>> where reports/analyses are produced automatically through submission of
>> data
>> to a web-sever.  I am mostly interested in situations where pdf reports
>> have
>> been produced rather than, say, a plot/table etc shown on a web page.
>>
>> I've had limited success finding examples on this.
>>
>> Many thanks.
>>
>> David Carslaw
>>
>>
>> Environmental Research Group
>> MRC-HPA Centre for Environment and Health
>> King's College London
>> Franklin Wilkins Building
>> Stamford Street
>> London SE1 9NH
>>
>> david.cars...@kcl.ac.uk
>>
>>
>> --
>> View this message in context:
>> http://r.789695.n4.nabble.com/Examples-of-web-based-Sweave-use-tp3425324p3425324.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] metaplot

2011-04-06 Thread Scott Chamberlain
 What about the metafor package?


Or just create your own plot. 

For example, using ggplot2 package:

limits <- aes(ymax = `95%UCI`, ymin = `95%LCI`)  # backticks for non-syntactic names
ggplot(dataframe, aes(x = Study.Name, y = OR)) + geom_point() + 
geom_errorbar(limits)

Best, 
Scott
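If neither metafor nor ggplot2 appeals, a minimal base-graphics forest plot is also possible; here is a sketch with made-up data (column names hypothetical, matching the poster's description):

```r
df <- data.frame(Study.Name = c("A", "B", "C"),
                 OR  = c(1.2, 0.8, 1.5),
                 LCI = c(0.9, 0.6, 1.1),
                 UCI = c(1.6, 1.1, 2.0))
n <- nrow(df)
plot(df$OR, n:1, xlim = range(df$LCI, df$UCI), pch = 15,
     yaxt = "n", xlab = "Odds ratio", ylab = "")
segments(df$LCI, n:1, df$UCI, n:1)   # horizontal CI whiskers
abline(v = 1, lty = 2)               # null-effect reference line
axis(2, at = n:1, labels = df$Study.Name, las = 1)
```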
On Wednesday, April 6, 2011 at 11:53 AM, cheba meier wrote:
Dear all,
> 
> I have four variables: Study.Name, OR, 95%LCI and 95%UCI, and I would like to
> create a meta-analysis plot. I can't use the meta.MH function with metaplot
> because n.trt, n.ctrl, col.trt and col.ctrl are not available. Is there an
> alternative way to do it?
> 
> Many thanks in advance,
> Cheba
> 
>  [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] unexpected sort order with merge

2011-04-06 Thread Johann Hibschman
`merge` returns its result sorted as if by character, not by the actual
class of the by-columns.

> tmp <- merge(data.frame(f=ordered(c("a","b","b","a","b"), levels=c("b","a")),
 x=1:5),
  data.frame(f=ordered(c("a","b"), levels=c("b","a")),
 y=c(10,20)))
> tmp
  f x  y
1 a 1 10
2 a 4 10
3 b 2 20
4 b 3 20
5 b 5 20

> tmp[order(tmp$f),]
  f x  y
3 b 2 20
4 b 3 20
5 b 5 20
1 a 1 10
2 a 4 10

I expected the second order, not the first.

I actually ran into this issue when merging zoo yearmon columns, but
that adds a package dependency.  In that context, I observed different
behavior depending on whether I had one key or two:

> library(zoo)
> d1 <- data.frame(date=as.yearmon(2000 + (0:5)/12), icpn=500, foo=1:6)
> d2 <- data.frame(date=as.yearmon(2000 + (0:5)/12), icpn=500, bar=10*1:6)
> merge(d1,d2)
  date icpn foo bar
1 Apr 2000  500   4  40
2 Feb 2000  500   2  20
3 Jan 2000  500   1  10
4 Jun 2000  500   6  60
5 Mar 2000  500   3  30
6 May 2000  500   5  50

> d1 <- data.frame(date=as.yearmon(2000 + (0:5)/12), foo=1:6)
> d2 <- data.frame(date=as.yearmon(2000 + (0:5)/12), bar=10*1:6)
> merge(d1,d2)
  date foo bar
1 Jan 2000   1  10
2 Feb 2000   2  20
3 Mar 2000   3  30
4 Apr 2000   4  40
5 May 2000   5  50
6 Jun 2000   6  60

The first example appears to sort by the name of the date, not by the
actual date value.

The documentation of `merge` says the sort is "lexicographic", but I
assumed that was in the cartesian-product sense, not in some
convert-everything-to-character sense.

Is this behavior expected?

Thanks,
Johann


P.S. 

> sessionInfo()
R version 2.10.1 (2009-12-14) 
x86_64-unknown-linux-gnu 

locale:
[1] C

attached base packages:
[1] grid  splines   stats graphics  grDevices utils datasets 
[8] methods   base 

other attached packages:
[1] ggplot2_0.8.8   reshape_0.8.3   Rauto_1.0   plyr_1.1   
[5] zoo_1.6-4   Hmisc_3.7-0 survival_2.35-8 ascii_0.7  
[9] proto_0.3-8

loaded via a namespace (and not attached):
[1] cluster_1.12.1  digest_0.4.2lattice_0.17-26 tools_2.10.1

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] [SOLVED] Re: Use of the dot.dot.dot option in functions.

2011-04-06 Thread KENNETH R CABRERA
Thank you very much for your help.

It works very well!

Still, it is not very clear why the "replicate" function does not
take the "..." arguments as expected.

- Mensaje original -
De: Duncan Murdoch 
Fecha: Miércoles, 6 de Abril de 2011, 11:56 am
Asunto: Re: [R] Use of the dot.dot.dot option in functions.
A: KENNETH R CABRERA 
CC: r-help@r-project.org

> On 06/04/2011 12:04 PM, KENNETH R CABRERA wrote:
> >Hi R users:
> >
> >I tried this code, where "fun" is a parameter of a random generating
> >function name, and I intend to use the "..." parameter to pass the
> >parameters of different random generating functions.
> >
> >What am I doing wrong?
> >
> >f1<-function(nsim=20,n=10,fun=rnorm,...){
> > vp<-replicate(nsim,t.test(fun(n,...),fun(n,...))$p.value)
> > return(vp)
> >}
> >
> >This works!
> >f1()
> >f1(n=20,mean=10)
> >
> >These two fail:
> >f1(n=10,fun=rexp)
> >f1(n=10,fun=rbeta,shape1=1,shape2=2)
> >
> >Thank you for your help.
> 
> 
> I imagine it's a scoping problem: replicate() is probably not 
> evaluating the ... in the context you think it is.  You 
> could debug this by writing a function like
> 
> showArgs <- function(n, ...) {
>   print(n)
>   print(list(...))
> }
> 
> and calling f1(n=10, fun=showArgs), but it might be easier just 
> to avoid the problem:
> 
> f1 <- function(nsim=20,n=10,fun=rnorm,...){
> force(fun)
> force(n)
> localfun <- function() fun(n, ...)
> vp<-replicate(nsim,t.test(localfun(),
> localfun())$p.value)
> return(vp)
> }
> 
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] A zoo related question

2011-04-06 Thread Bogaso Christofer
Dear all, please consider my following workbook:

 

library(zoo)

lis1 <- vector('list', length = 2)

lis2 <- vector('list', length = 2)

lis1[[1]] <- zooreg(rnorm(20), start = as.Date("2010-01-01"), frequency = 1)

lis1[[2]] <- zooreg(rnorm(20), start = as.yearmon("2010-01-01"), frequency =
12)

 

lis2[[1]] <- matrix(1:40, 20)

lis2[[2]] <- matrix(41:80, 20)

 

Now I want to make each element of 'lis2' a zoo object where the
corresponding indices will be borrowed from 'lis1'. This means:

 

for (i in 1:2) {

lis2[[i]] <- zoo(lis2[[i]],
index(lis1[[i]]))

}

 

However, is there any faster way to do that? I found that if the sizes of
lis1 & lis2 are quite big, it takes a lot of time to complete.

 

Any help will be really appreciated.

 

Thanks,


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] A zoo related question

2011-04-06 Thread Gabor Grothendieck
On Wed, Apr 6, 2011 at 3:40 PM, Bogaso Christofer
 wrote:
> Dear all, please consider my following workbook:
>
>
>
> library(zoo)
>
> lis1 <- vector('list', length = 2)
>
> lis2 <- vector('list', length = 2)
>
> lis1[[1]] <- zooreg(rnorm(20), start = as.Date("2010-01-01"), frequency = 1)
>
> lis1[[2]] <- zooreg(rnorm(20), start = as.yearmon("2010-01-01"), frequency =
> 12)
>
>
>
> lis2[[1]] <- matrix(1:40, 20)
>
> lis2[[2]] <- matrix(41:80, 20)
>
>
>
> Now I want to make each element of 'lis2' as zoo object where the
> corresponding indices will be borrowed from 'lis1'. This means:
>
>
>
> for (i in 1:2) {
>
>                                lis2[[i]] <- zoo(lis2[[i]],
> index(lis1[[i]]))
>
>                }
>
>
>
> However is there any faster way to do that? I found that if the sizes of
> lis1 & lis2 are quite big, it takes a lot of time to complete.
>

Try this:

mapply(function(mat, z) zoo(mat, time(z)), lis2, lis1, SIMPLIFY = FALSE)

or this:

mapply(zoo, lis2, lapply(lis1, time), SIMPLIFY = FALSE)

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] Grid on Map

2011-04-06 Thread MacQueen, Don
Possibly something similar to

  abline(v=seq(long.min, long.max, length=3))
  abline(h=seq(lat.min, lat.max, length=3))

?

The above will add vertical and horizontal lines to an existing plot, and
assumes that the plot is in long/lat coordinates. Of course, this ignores
the fact that long/lat is not a Cartesian coordinate system.

(can't provide more detail without more information)
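A runnable sketch of the idea, with made-up bounding-box values (the real ones would come from the poster's csv):

```r
long.min <- -79.8; long.max <- -71.8   # hypothetical NY-ish bounds
lat.min  <-  40.5; lat.max  <-  45.0
plot(NA, xlim = c(long.min, long.max), ylim = c(lat.min, lat.max),
     xlab = "Longitude", ylab = "Latitude")
# length = 3 gives min, midpoint, max; the midpoint lines split the
# box into exactly 4 quadrants
abline(v = seq(long.min, long.max, length = 3))
abline(h = seq(lat.min, lat.max, length = 3))
```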

-Don

-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





-Original Message-
From: Jaimin Dave 
Date: Mon, 4 Apr 2011 18:39:45 -0700
To: "r-help@r-project.org" 
Subject: [R] Grid on Map

>I am new to R. I want to draw a grid from a csv file which contains latitude
>minimum, latitude maximum, longitude minimum, and longitude maximum. The grid
>should be divided into exactly 4 quadrants. The map is of NY state of the
>USA. I want to know how I can do it.
>Help would be appreciated.
>
>Thanks
>Jaimin
>
>[[alternative HTML version deleted]]
>
>__
>R-help@r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to call data elements

2011-04-06 Thread B77S
assuming this is from a list

mydata[[1]][1]  

and 

mydata[[1]][7]

??
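A self-contained version of the distinction (assuming mydata is a list holding one character vector, as the printed output suggests):

```r
mydata <- list(c("1947", "83", "234.289", "235.6", "159", "107.608",
                 "1947", "60.323"))
mydata[[1]][1]   # "1947" - [[ ]] picks the list element, [ ] indexes into it
mydata[[1]][7]   # the second "1947"
```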




Wonjae Lee wrote:
> 
> Hi, 
> 
> I have a stupid and simple question. 
> Please forgive me. 
> 
> In an example below, please tell me how to call  "1947" in mydata. 
> Thank you in advance. 
> 
> Wonjae 
> 
>> mydata 
> [[1]] 
> [1] "1947""83"  "234.289" "235.6"   "159" "107.608" "1947"   
> [8] "60.323" 
> 
>> mydata[[1],1] 
> error:unexpected ',' in "mydata[[1]," 
>> mydata[1,[1]] 
> error:unexpected '[' in "mydata[1,["
> 

--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-call-data-elements-tp3430859p3430881.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] R and multithread

2011-04-06 Thread David martin

Hello,
Sorry if this question has been posted before, but I couldn't find an 
exact answer to it.


I'm doing bioinformatics, specifically small RNA sequencing, which makes 
use of packages such as DESeq and EDGE. Those familiar with this kind of 
data will know that you end up with large matrices with millions of 
entries, so I guess many people face the same problem of dealing with 
such big matrices and vectors.


I've heard of people compiling R against other math libraries (since the 
default R math library is single-core) to make it use several processors 
on the server. I'm not familiar with the different math libraries 
available (BLAS, ...).


Can I use faster math libraries so that R uses the full processing 
capacity of my server?


I'm running 2.12.2 on a Linux server (16 CPUs with 32 GB RAM).
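One rough way to check what is currently linked (a hedged sketch: with the reference BLAS the multiplication below runs on a single core, while an optimized multithreaded BLAS such as OpenBLAS, ATLAS, or MKL will show several cores busy):

```r
n <- 500
A <- matrix(rnorm(n * n), n)
st <- system.time(B <- A %*% A)   # watch CPU usage (e.g. with top) meanwhile
st["elapsed"]                     # much faster with an optimized BLAS
```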

thanks for your tips,
david

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Search arrays based on similar values

2011-04-06 Thread mjdubya
Petr,
Perfect! Thank you.

--
View this message in context: 
http://r.789695.n4.nabble.com/Search-arrays-based-on-similar-values-tp3429381p3430906.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Creating a symmetric contingency table from two vectors with different length of levels in R

2011-04-06 Thread suparna mitra
Dear Andrija,
 Thank you very much for your quick reply. It looks like working.
Thanks again,
Suparna.

On Wed, Apr 6, 2011 at 2:11 PM, andrija djurovic wrote:

> Hi:
>
> Here is one solution:
>
> a<-factor(c(1,2,4,5,6))
> b<-factor(c(2,2,4,5,5))
> b1<-factor(b,levels=c(levels(b),levels(a)[levels(a)%in%levels(b)==FALSE]))
> table(a,b1)
>
> but be aware that this works because the levels of b are a subset of the levels of a.
>
> Andrija
>
> On Wed, Apr 6, 2011 at 10:39 AM, suparna mitra <
> mi...@informatik.uni-tuebingen.de> wrote:
>
>> Hello,
>> How can I create a symmetric contingency table from two categorical
>> vectors having different numbers of levels?
>> For example one vector has 98 levels
>> TotalData1$Taxa.1
>>  [1] "Aconoidasida" "Actinobacteria (class)"
>> "Actinopterygii"   "Alphaproteobacteria"
>>  [5] "Amoebozoa""Amphibia"
>> "Anthozoa" "Aquificae (class)"
>> and so on .
>> 98 Levels: Aconoidasida Actinobacteria (class) 
>>
>>  and the other vector has 105 levels
>> TotalData1$Taxa.2
>>[1] FlavobacteriaProteobacteria
>> Bacteroidetes/Chlorobi group Bacteria
>>[5] EpsilonproteobacteriaEpsilonproteobacteria
>>  Epsilonproteobacteria
>> and so on  ..
>> 105 Levels: Acidobacteria Aconoidasida Actinobacteria (class) 
>>
>> Now I want to create a symmetric contingency table.
>> Any quick idea will be really helpful.
>> Best regards,
>> Mitra
>>
>>[[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
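Building on the quoted solution, a sketch that drops the subset assumption by giving both factors the union of levels, so the table is square with identical margins:

```r
a <- factor(c(1, 2, 4, 5, 6))
b <- factor(c(2, 2, 4, 5, 5))
lev <- sort(union(levels(a), levels(b)))          # shared level set
tab <- table(factor(a, levels = lev), factor(b, levels = lev))
tab   # 5 x 5, same levels on both margins
```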

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] RMySQL query Help

2011-04-06 Thread Radhouane Aniba
Hello everyone,

I am using RMySQL for a project and I have to deal with a complicated
query; here it is:

tmp2 <-sprintf(paste("select Score from  YR.Transcription_Factor
   inner join YR.Promoter on YR.Transcription_Factor.
Promoter_idPromoter=YR.Promoter.idPromoter
   inner join YR.GP_BIS on
YR.Promoter.idPromoter=YR.GP_BIS.Promoter_idPromoter
   inner join YR.Gene on YR.GP_BIS.Gene_idGene=YR.Gene.idGene
   inner join YR.Cluster on YR.Gene.Cluster_idCluster=YR.Cluster.idCluster
   where Cluster_Name='%s' AND Specie='%s' AND TF_Name='%s' AND
YR.Promoter.Gene_Name in (",toto,")",sep="")

   ,as.character(TRIPLETS[j,1]),species[i,],as.character(TRIPLETS[j,3]),
titi )

where toto and titi are temporary variables containing

toto <-
noquote(paste(rep(shQuote("%s"),length(in_cluster_tf1$Score)),collapse=","))
titi <- (paste(shQuote(in_cluster_tf1$Gene_Name),collapse=","))

the problem is in quotes ( ' ) and ( ` )

To make it simple: how can I create a vector like this

> Vector

"A","B","C" 

from Initial Vector

> Initial vector

[1] "A"  "B"  "C" ..

and then inject Vector into the query, especially after "Where X in
("A","B","C")"
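For the vector-quoting part alone, a minimal sketch (note that pasting values into SQL like this is injection-prone, so it is only reasonable for trusted inputs):

```r
genes <- c("A", "B", "C")
in_list <- paste(sprintf("'%s'", genes), collapse = ",")
in_list   # "'A','B','C'"
query <- sprintf("select Score from YR.Transcription_Factor where Gene_Name in (%s)",
                 in_list)
```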

I tried several methods with sQuote, dQuote and shQuote, but it seems I
have a problem with that.

Any Idea ?

Thanks


-- 
**



Re: [R] Examples of web-based Sweave use?

2011-04-06 Thread Matt Shotwell
That's an interesting idea. I had written a long email describing a
proof-of-concept, but decided to post it to the website below instead.


http://biostatmatt.com/archives/1184

Matt

On 04/04/2011 07:31 AM, carslaw wrote:

I appreciate that this is OT, but I'd be grateful for pointers to examples of
where
Sweave has been used for web-based applications. In particular, examples
where reports/analyses are produced automatically through submission of data
to a web server. I am mostly interested in situations where PDF reports are
produced rather than, say, a plot or table shown on a web page.

I've had limited success finding examples on this.

Many thanks.

David Carslaw


Environmental Research Group
MRC-HPA Centre for Environment and Health
King's College London
Franklin Wilkins Building
Stamford Street
London SE1 9NH

david.cars...@kcl.ac.uk


--
View this message in context: 
http://r.789695.n4.nabble.com/Examples-of-web-based-Sweave-use-tp3425324p3425324.html
Sent from the R help mailing list archive at Nabble.com.





--
Matthew S Shotwell   Assistant Professor   School of Medicine
 Department of Biostatistics   Vanderbilt University



[R] read BUFR format radar data

2011-04-06 Thread flore.mounier

Dear all,
I am looking for a way to open *Binary Universal Form for the
Representation of meteorological data* (*BUFR*) radar data with R
software, and I am wondering if there is any package that could help with
this task. I have looked without success in the question section.

Thanks in advance for your help.
Sincerely,
Flore



Re: [R] pass character vector in instrument field of get.hist.quote function

2011-04-06 Thread algotr8der
Hi Joshua,

Thank you for showing me how to use the getSymbols function. I tried the
following without success:

> tickers
 [1] "SPY" "DIA" "IWM" "SMH" "OIH" "XLY" "XLP" "XLE" "XLI" "XLB" "XLK" "XLU"
"XLV"
[14] "QQQ"
> str(tickers)
 chr [1:14] "SPY" "DIA" "IWM" "SMH" "OIH" "XLY" "XLP" "XLE" ...

> ClosePrices <- do.call(merge, lapply(tickers,myX,start="2001-03-01",
> end="2011-03-11"), Cl(get(myX)))

Error in get(myX) : invalid first argument

The code for function myX is as follows:

myX <- function(tickers, start, end) {
require(quantmod) 
getSymbols(tickers, from=start, to=end)
}

I'm not sure why you are using the Cl(get(x)) function in your do.call
invocation. I would greatly appreciate some guidance.

Thank you

--
View this message in context: 
http://r.789695.n4.nabble.com/pass-character-vector-in-instrument-field-of-get-hist-quote-function-tp3350779p3431001.html
Sent from the R help mailing list archive at Nabble.com.



[R] effect sizes

2011-04-06 Thread netrunner
Dear all,
I used friedman.test.with.post.hoc in my analysis to compare the scores
of three groups, but I would also like to compute effect sizes. Can anyone
help me?

thank you

net

--
View this message in context: 
http://r.789695.n4.nabble.com/effect-sizes-tp3431058p3431058.html
Sent from the R help mailing list archive at Nabble.com.



[R] Curious treatment of entities in xmlTreeParse

2011-04-06 Thread Adam Cooper
Hello!

I am not experienced enough to know whether I have found a bug or
whether I am just ignorant.

I have been trying to use the tm package to read in material from RSS
2.0 feeds, which has required grappling with writing a reader for that
flavour of XML. I get an error - "Error : 1: EntityRef: expecting ';'" -
which I think I've tracked down.

The feed being processed is from Wordpress:
http://scottbw.wordpress.com/feed/

Note that it contains a number of entity references in various places.
The trouble-makers seem to be &amp; references that are the "&" in a URL
query string:
<media:content url="http://0.gravatar.com/avatar/a1033a3e5956f5db65e0cc20f5ea167f?s=96&amp;d=identicon&amp;r=G" medium="image">

AFAIK, this is a correct encoding.

Parsing this with the following two lines, followed by inspecting "t",
shows that the &amp; references have been translated to "&" while other
entity refs have not.

a<-readLines(url(as.character(feeds[2,2])))
t<-XML::xmlTreeParse(a, replaceEntities=FALSE, asText=TRUE)


I'm guessing this is what breaks things when I try to do things with tm:
rss2Reader <- readXML(
spec = list(
Author = list("node", "/item/creator"), 
Content = list("node", "/item/description"),
DateTimeStamp = list("function",function(x)   
as.POSIXlt(Sys.time(),
tz = "GMT")),
Heading = list("node", "/item/title"),
ID = list("function", function(x) tempfile()),
Origin = list("node", "/item/link")),
doc = PlainTextDocument())

rss2Source <- function(x, encoding = "UTF-8")
  XMLSource(x, function(tree)
XML::getNodeSet(XML::xmlRoot(tree),"/rss/channel/item"), rss2Reader,
encoding)

feed.rss2 <- rss2Source(url("http://scottbw.wordpress.com/feed/"))
corp1<-Corpus(feed.rss2, readerControl=list(language="en"))


I've googled around for this problem but got nowhere. Have I missed
something?

Any help will be received gratefully; this was supposed to be the easy
part!

Cheers, Adam



[R] data smoothing

2011-04-06 Thread stefano.giampiccolo
Good morning,
I have a time series of clinical data. I know from the literature that it
must have a linear trend, without seasonal oscillations. Can I use LOESS to
get a qualitative confirmation? I have only 15 measurements: how should I
choose the smoothing parameter?
Thanks, best regards
Stefano Giampiccolo
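A minimal sketch of what such a check might look like (the data here are simulated, not clinical). With only 15 points, degree = 1 (locally linear) and a large span keep the fit from chasing noise:

```r
set.seed(42)
t <- 1:15
y <- 0.5 * t + rnorm(15, sd = 0.5)   # simulated linear trend plus noise

# A large span and locally linear fitting are sensible for so few points
fit <- loess(y ~ t, span = 0.9, degree = 1)

plot(t, y)
lines(t, predict(fit))   # the smoother should track the straight line
```

If the fitted curve stays close to a straight line as the span is varied, that is a qualitative confirmation of the linear trend.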

--
View this message in context: 
http://r.789695.n4.nabble.com/data-smoothing-tp3430797p3430797.html
Sent from the R help mailing list archive at Nabble.com.



[R] How to call data elements

2011-04-06 Thread Wonjae Lee
Hi, 

I have a stupid and simple question. 
Please forgive me. 

In the example below, please tell me how to call "1947" in mydata.
Thank you in advance. 

Wonjae 

> mydata 
[[1]] 
[1] "1947"    "83"      "234.289" "235.6"   "159"     "107.608" "1947"
[8] "60.323" 

> mydata[[1],1] 
error:unexpected ',' in "aa[[1]," 
> mydata[1,[1]] 
error:unexpected '[' in "aa[1,["
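From the printed output, mydata appears to be a list whose first component is a character vector, so two indexing steps are needed: [[1]] to extract the component, then [k] to pick an element. A sketch reconstructing the data:

```r
mydata <- list(c("1947", "83", "234.289", "235.6", "159",
                 "107.608", "1947", "60.323"))

mydata[[1]][1]   # first element of the first component: "1947"
mydata[[1]][7]   # the second occurrence of "1947"
```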

--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-call-data-elements-tp3430859p3430859.html
Sent from the R help mailing list archive at Nabble.com.



[R] Cannot install package RMySQL

2011-04-06 Thread alon.benari
Hello All, 

I have a technical difficulty installing RMySQL. I am running openSUSE 11.1
and R 2.12. I installed MySQL from the website and, following installation,
ran the install as root. This is where the trouble begins:
..
checking mysql.h usability... no
checking mysql.h presence... no
checking for mysql.h... no
checking for mysql_init in -lmysqlclient... no
checking for mysql_init in -lmysqlclient... no
checking for mysql_init in -lmysqlclient... no
checking for mysql_init in -lmysqlclient... no
checking for mysql_init in -lmysqlclient... no
checking for mysql_init in -lmysqlclient... no
checking for mysql_init in -lmysqlclient... no
checking /usr/local/include/mysql/mysql.h usability... no
checking /usr/local/include/mysql/mysql.h presence... no
checking for /usr/local/include/mysql/mysql.h... no
checking /usr/include/mysql/mysql.h usability... no
checking /usr/include/mysql/mysql.h presence... no
checking for /usr/include/mysql/mysql.h... no
checking /usr/local/mysql/include/mysql/mysql.h usability... no
checking /usr/local/mysql/include/mysql/mysql.h presence... no
checking for /usr/local/mysql/include/mysql/mysql.h... no
checking /opt/include/mysql/mysql.h usability... no
checking /opt/include/mysql/mysql.h presence... no
checking for /opt/include/mysql/mysql.h... no
checking /include/mysql/mysql.h usability... no
checking /include/mysql/mysql.h presence... no
checking for /include/mysql/mysql.h... no

Configuration error:
  could not find the MySQL installation include and/or library
  directories.  Manually specify the location of the MySQL
  libraries and the header files and re-run R CMD INSTALL.

INSTRUCTIONS:

1. Define and export the 2 shell variables PKG_CPPFLAGS and
   PKG_LIBS to include the directory for header files (*.h)
   and libraries, for example (using Bourne shell syntax):

  export PKG_CPPFLAGS="-I"
  export PKG_LIBS="-L -lmysqlclient"

   Re-run the R INSTALL command:

  R CMD INSTALL RMySQL_.tar.gz

2. Alternatively, you may pass the configure arguments
  --with-mysql-dir= (distribution directory)
   or
  --with-mysql-inc= (where MySQL header files reside)
  --with-mysql-lib= (where MySQL libraries reside)
   in the call to R INSTALL --configure-args='...' 

   R CMD INSTALL --configure-args='--with-mysql-dir=DIR' RMySQL_.tar.gz

ERROR: configuration failed for package ‘RMySQL’
* removing ‘/root/R/i686-pc-linux-gnu-library/2.12/RMySQL’

Any ideas how to proceed from here? I am quite a newbie in Linux.
How do I find mysql.h? I used "find / -name mysql.h" but nothing was there.
Did I use it right?
Any other ideas? How do I set this right?

Thank you


--
View this message in context: 
http://r.789695.n4.nabble.com/Cannot-install-pakcage-RMySQL-tp3431025p3431025.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] pass character vector in instrument field of get.hist.quote function

2011-04-06 Thread algotr8der
Hi Joshua,

Thank you for showing me how to use getSymbols. I am trying to follow the
example you provided. However, I am having some difficulty with the various
combinations of functions you used. I tried to execute one step at a
time, as follows:

I have a ticker vector that looks like the following:

> tickers
 [1] "SPY" "DIA" "IWM" "SMH" "OIH" "XLY" "XLP" "XLE" "XLI" "XLB" "XLK" "XLU"
"XLV"
[14] "QQQ"
> str(tickers)
 chr [1:14] "SPY" "DIA" "IWM" "SMH" "OIH" "XLY" "XLP" "XLE" ...

I wrote a function called myX to use in the lapply call. It has the
following code: 

myX <- function(tickers, start, end) {
require(quantmod) 
getSymbols(tickers, from=start, to=end)
}


1) Call lapply by itself

>lapply(tickers,myX,start="2001-03-01", end="2011-03-11")

> lapply(tickers,myX,start="2001-03-01", end="2011-03-11")
[[1]]
[1] "SPY"

[[2]]
[1] "DIA"

[[3]]
[1] "IWM"

[[4]]
[1] "SMH"

[[5]]
[1] "OIH"

[[6]]
[1] "XLY"

[[7]]
[1] "XLP"

[[8]]
[1] "XLE"

[[9]]
[1] "XLI"

[[10]]
[1] "XLB"

[[11]]
[1] "XLK"

[[12]]
[1] "XLU"

[[13]]
[1] "XLV"

[[14]]
[1] "QQQ"

So this works fine, and I can inspect the value of any of the tickers, e.g.
SPY.

Now I want to extract the Closing prices. 

2) I did Cl(SPY) and this outputs the data in the Close column as expected.
However, I am not sure how to extract the closing prices of each of the
elements inside the data structure returned by lapply, which I believe is a
list. I want to merge them into one object as you did, but I can't seem to
follow.

Any guidance would be greatly appreciated.
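One possible sketch (untested here; it assumes a quantmod version where getSymbols() accepts auto.assign = FALSE, so that the downloaded xts object is returned directly rather than assigned by name - that is what lets Cl() be applied inside lapply()):

```r
library(quantmod)

tickers <- c("SPY", "DIA", "QQQ")   # shortened list for illustration

# Download each series and keep only its Close column
close.list <- lapply(tickers, function(s)
  Cl(getSymbols(s, from = "2001-03-01", to = "2011-03-11",
                auto.assign = FALSE)))

# merge() the per-ticker xts objects into one multi-column object
ClosePrices <- do.call(merge, close.list)
```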



--
View this message in context: 
http://r.789695.n4.nabble.com/pass-character-vector-in-instrument-field-of-get-hist-quote-function-tp3350779p3431118.html
Sent from the R help mailing list archive at Nabble.com.



[R] Odp: Decimal Accuracy Loss?

2011-04-06 Thread Petr PIKAL
Hi

r-help-boun...@r-project.org wrote on 06.04.2011 17:33:48:

> This is hopefully a quick question on decimal accuracy.  Is any
> decimal accuracy lost when casting a numeric vector as a matrix?  And
> then again casting the result back to a numeric?
> 
> I'm finding that my calculation values are different when I run for
> loops that manually calculate matrix multiplication as compared to
> when I cast the vectors as matrices and multiply them using "%*%".
> (The errors are very small, but the process is run iteratively
> thousands of times, at which point the error between the two
> differences becomes noticeable.)
> 
> I've read FAQ # 7.31 "Why doesn't R think these numbers are equal?",
> but just want to confirm that the differences in values are due to
> differences in the matrix multiplication operator and manual
> calculation via for loops, rather than information that is lost when
> casting a numeric as a matrix and back again.

Without some example it is difficult to see the possible sources of
difference. Clever people may know how the %*% operator really works, but
only those capable of mind reading can know what you do inside your for
loops.
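For what it's worth, a small base-R demonstration (on simulated data) that the two routes can disagree purely through summation order, with nothing lost in the casts themselves:

```r
set.seed(1)
x <- runif(1e5)
y <- runif(1e5)

# Manual dot product: accumulates strictly left to right
s.loop <- 0
for (i in seq_along(x)) s.loop <- s.loop + x[i] * y[i]

# %*% delegates to the BLAS, which may sum in a different order
s.mat <- drop(x %*% y)

s.loop == s.mat           # may well be FALSE
all.equal(s.loop, s.mat)  # TRUE: equal within floating-point tolerance
```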

Regards
Petr


> 
> Thanks in advance for the help,
> Brigid
> 



[R] smoothing bathymetry

2011-04-06 Thread Jens

Dear R Users, 

Using the following R-script I created the first image 

> require(akima) 
> require(spatial) 
> dep <- interp(long, lat, depth, xo=seq(1,990,10), yo=seq(1,990,10), 
+ extrap=FALSE, ncp=0,duplicate = "mean", dupfun = NULL) 

http://r.789695.n4.nabble.com/file/n3431391/Rpics.bmp 

Where "long" are x-coordinates between 1 and 1000, "lat" are y-coordinates
between 1 and 1000, and "depth" are the depth values corresponding to the x
and y coordinates. All data are in vector form.

I would like to extrapolate the data points beyond the data set, by setting
extrap=TRUE, filling out the entire 1000x1000 grid. 
I attempted this using the following script, which generated the second
image. 

>dep <- interp(long, lat, depth, xo=seq(1,990,10), yo=seq(1,990,10), 
+ linear=FALSE,extrap=TRUE, ncp=NULL,duplicate = "mean", dupfun = NULL) 

However, as seen in the image, this results in a messy-looking bathymetry
with errors. Any suggestions on what went wrong, or on other packages or
functions I could try, would be greatly appreciated.

I played around with the krig.image function in the "fields" package, but
didn't have much luck.
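For the extrapolation step, a thin-plate spline from the fields package may behave better than akima's extrap = TRUE; a sketch (untested here, assuming long/lat/depth as in the post and the fields Tps()/predictSurface() interface):

```r
library(fields)

# Thin-plate spline fit to the scattered soundings
fit <- Tps(cbind(long, lat), depth)

# Evaluate the smooth surface on a regular grid, extrapolating
# beyond the convex hull of the data
surf <- predictSurface(fit, nx = 100, ny = 100, extrap = TRUE)
image.plot(surf)
```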

Sincerely
Jens 


--
View this message in context: 
http://r.789695.n4.nabble.com/smoothing-bathymetry-tp3431391p3431391.html
Sent from the R help mailing list archive at Nabble.com.



[R] ROCR - best sensitivity/specificity tradeoff?

2011-04-06 Thread Christian Meesters
Hi,

My question concerns the ROCR package, and I hope somebody here on the list
can help - or point me to some better place.

When evaluating a model's performance, like this:


pred1 <- predict(model, ..., type="response")
pred2 <- prediction(pred1, binary_classifier_vector)
perf  <- performance(pred2, "sens", "spec")

(Where "prediction" and "performance" are ROCR-functions.)

How can I then retrieve the cutoff value for the sensitivity/specificity
tradeoff with regard to the data in the model (e.g. model =
glm(binary_classifier_vector ~ data, family="binomial", data=some_dataset))?
Perhaps I missed something in the manual? Or do I need an entirely different
approach for this? Or is there an alternative solution?
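One possible route, assuming the usual ROCR object layout: the performance object stores the cutoffs in its alpha.values slot, aligned with the x.values/y.values measure pairs, so the cutoff maximising, say, sensitivity + specificity can be read off directly. A sketch using the ROCR.simple example data shipped with the package:

```r
library(ROCR)
data(ROCR.simple)   # example scores and 0/1 labels shipped with ROCR

pred <- prediction(ROCR.simple$predictions, ROCR.simple$labels)
perf <- performance(pred, "sens", "spec")

# alpha.values holds the cutoff matching each (spec, sens) pair
cutoffs <- perf@alpha.values[[1]]
best <- cutoffs[which.max(perf@x.values[[1]] + perf@y.values[[1]])]
best   # cutoff with the largest sensitivity + specificity sum
```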

Thanks,
Christian


--



Re: [R] R and multithread

2011-04-06 Thread Bert Gunter
How about googling it?!!

"R multithread"
"R parallel processing"

Both got numerous apparently relevant hits.

Also please familiarize yourself with CRAN's task views, where you
will find HighPerformanceComputing.

-- Bert

On Wed, Apr 6, 2011 at 5:27 AM, David martin  wrote:
> Hello,
> Sorry if this question has been posted before, but I couldn't find an
> exact answer to the question.
>
> I'm doing bioinformatics, small RNA sequencing, making use of packages
> such as DESeq and EDGE. Those familiar with this data will notice that you
> end up with large matrices with millions of entries. So I guess many people
> might be facing the same problem of dealing with such big matrices and
> vectors.
>
> I've heard of people using other math libraries (since the default R math
> lib is single core) to compile R and make it use several processors on the
> server. I'm not familiar with the different math libs available (BLAS, ...).
>
> Can I use faster math libraries so that R uses the full processor capacity
> of my server?
>
> I'm running 2.12.2 on a Linux server (16 CPUs with 32 GB RAM).
>
> thanks for your tips,
> david
>
>



-- 
"Men by nature long to get on to the ultimate truths, and will often
be impatient with elementary studies or fight shy of them. If it were
possible to reach the ultimate truths without the elementary studies
usually prefixed to them, these would not be preparatory studies but
superfluous diversions."

-- Maimonides (1135-1204)

Bert Gunter
Genentech Nonclinical Biostatistics



[R] Teradata ODBC driver

2011-04-06 Thread William Poling
Hi.

I have had this URL passed to me in order to obtain the necessary driver to
connect my R application to our Teradata warehouse; however, the URL does
not seem to exist anymore, and my Internet Explorer browser fails to connect
for some reason.


http://downloads.teradata.com/download/applications/teradata-r/1.0

Is there an alternative site or location for obtaining the necessary driver?

Thanks

WHP



[R] Treatment of xml-stylesheet processing instructions in XML module

2011-04-06 Thread Adam Cooper
Hello again,
Another stumble here that is defeating me.

I try:
a<-readLines(url("http://feeds.feedburner.com/grokin"))
t<-XML::xmlTreeParse(a, ignoreBlanks=TRUE, replaceEntities=FALSE,
asText=TRUE)
elem<- XML::getNodeSet(XML::xmlRoot(t),"/rss/channel/item")[[1]]

And I get:
Start tag expected, '<' not found
Error: 1: Start tag expected, '<' not found

When I modify the second line in "a" to remove the following (just
leaving the <rss> tag with its attributes), I do not get the error.
I removed:
<?xml-stylesheet type="text/css" href="http://feeds.feedburner.com/~d/styles/itemcontent.css" ?>

I would have expected the PI to be totally ignored by default.
Have I missed something??

Thanks in advance...

Cheers, Adam



Re: [R] Calculated mean value based on another column bin from dataframe.

2011-04-06 Thread Fabrice Tourre
Dear Henrique Dallazuanna,

Thank you very much for your suggestion.

It is obvious that your method is better than mine.

Is it possible to use cut, table, by, etc.? Is there an aggregate
function in R that can do this?

Thanks.
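Since the question mentions cut and table: cut() can bin V1, and tapply() (or aggregate()) can then average V2 within each bin. A sketch on the first rows of the posted data:

```r
dat <- data.frame(V1 = c(0.15624, 0.26039, 0.16629, 0.23474),
                  V2 = c(0.94567, 0.66442, 0.97822, 0.72079))
ran <- seq(0, 0.5, 0.05)

bins <- cut(dat$V1, breaks = ran)   # factor of intervals such as (0.15,0.2]
mm   <- tapply(dat$V2, bins, mean)  # mean of V2 within each V1 bin
```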

On Wed, Apr 6, 2011 at 2:16 PM, Henrique Dallazuanna  wrote:
> Try this:
>
> fil <- sapply(ran, '<', e1 = dat[,1]) & sapply(ran[2:(length(ran) +
> 1)], '>=', e1 = dat[,1])
> mm <- apply(fil, 2, function(idx)mean(dat[idx, 2]))
>
> On Wed, Apr 6, 2011 at 5:48 AM, Fabrice Tourre  wrote:
>> Dear list,
>>
>> I have a dataframe with two column as fellow.
>>
>>> head(dat)
>>       V1      V2
>>  0.15624 0.94567
>>  0.26039 0.66442
>>  0.16629 0.97822
>>  0.23474 0.72079
>>  0.11037 0.83760
>>  0.14969 0.91312
>>
>> I want to get the column V2 mean value based on the bin of column of
>> V1. I write the code as fellow. It works, but I think this is not the
>> elegant way. Any suggestions?
>>
>> dat<-read.table("dat.txt",head=F)
>> ran<-seq(0,0.5,0.05)
>> mm<-NULL
>> for (i in c(1:(length(ran)-1)))
>> {
>>    fil<- dat[,1] > ran[i] & dat[,1]<=ran[i+1]
>>    m<-mean(dat[fil,2])
>>    mm<-c(mm,m)
>> }
>> mm
>>
>> Here is the first 20 lines of my data.
>>
>>> dput(head(dat,20))
>> structure(list(V1 = c(0.15624, 0.26039, 0.16629, 0.23474, 0.11037,
>> 0.14969, 0.16166, 0.09785, 0.36417, 0.08005, 0.29597, 0.14856,
>> 0.17307, 0.36718, 0.11621, 0.23281, 0.10415, 0.1025, 0.04238,
>> 0.13525), V2 = c(0.94567, 0.66442, 0.97822, 0.72079, 0.8376,
>> 0.91312, 0.88463, 0.82432, 0.55582, 0.9429, 0.78956, 0.93424,
>> 0.87692, 0.83996, 0.74552, 0.9779, 0.9958, 0.9783, 0.92523, 0.99022
>> )), .Names = c("V1", "V2"), row.names = c(NA, 20L), class = "data.frame")
>>
>>
>
>
>
> --
> Henrique Dallazuanna
> Curitiba-Paraná-Brasil
> 25° 25' 40" S 49° 16' 22" O
>
>



[R] Odp: Calculated mean value based on another column bin from dataframe.

2011-04-06 Thread Petr PIKAL
Hi


r-help-boun...@r-project.org wrote on 06.04.2011 10:48:04:

> Dear list,
> 
> I have a dataframe with two column as fellow.
> 
> > head(dat)
>V1  V2
>  0.15624 0.94567
>  0.26039 0.66442
>  0.16629 0.97822
>  0.23474 0.72079
>  0.11037 0.83760
>  0.14969 0.91312
> 
> I want to get the column V2 mean value based on the bin of column of
> V1. I write the code as fellow. It works, but I think this is not the
> elegant way. Any suggestions?

Do you want something like this?

#make data
x<-runif(100) 
y<-runif(100)

#cut first column to bins (in your case dat[,1] and ran)
x.c<-cut(x, seq(0,1,.1))

#aggregate column 2 according to bins (in your case dat[,2])
aggregate(y,list(x.c), mean)
 Group.1 x
1(0,0.1] 0.5868734
2  (0.1,0.2] 0.5436263
3  (0.2,0.3] 0.5099366
4  (0.3,0.4] 0.4815855
5  (0.4,0.5] 0.4137687
6  (0.5,0.6] 0.4698156
7  (0.6,0.7] 0.4687639
8  (0.7,0.8] 0.5661048
9  (0.8,0.9] 0.5489297
10   (0.9,1] 0.4812521

Regards
Petr

> 
> dat<-read.table("dat.txt",head=F)
> ran<-seq(0,0.5,0.05)
> mm<-NULL
> for (i in c(1:(length(ran)-1)))
> {
> fil<- dat[,1] > ran[i] & dat[,1]<=ran[i+1]
> m<-mean(dat[fil,2])
> mm<-c(mm,m)
> }
> mm
> 
> Here is the first 20 lines of my data.
> 
> > dput(head(dat,20))
> structure(list(V1 = c(0.15624, 0.26039, 0.16629, 0.23474, 0.11037,
> 0.14969, 0.16166, 0.09785, 0.36417, 0.08005, 0.29597, 0.14856,
> 0.17307, 0.36718, 0.11621, 0.23281, 0.10415, 0.1025, 0.04238,
> 0.13525), V2 = c(0.94567, 0.66442, 0.97822, 0.72079, 0.8376,
> 0.91312, 0.88463, 0.82432, 0.55582, 0.9429, 0.78956, 0.93424,
> 0.87692, 0.83996, 0.74552, 0.9779, 0.9958, 0.9783, 0.92523, 0.99022
> )), .Names = c("V1", "V2"), row.names = c(NA, 20L), class = 
"data.frame")
> 



[R] Wald test with inequality constraints

2011-04-06 Thread sébastien saegesser

Dear Helpers,

I need to do a spanning test with short-sale constraints, so I have to
impose and test multiple inequality restrictions in my panel regression
model. I looked at several methodologies, and for my research the most
efficient methodology is a Wald test as proposed in:

Econometrica, Vol. 50, No. 1 (January, 1982), "Likelihood Ratio Test, Wald
Test, and Kuhn-Tucker Test in Linear Models with Inequality Constraints on
the Regression Parameters" by Christian Gourieroux, Alberto Holly, and Alain
Monfort.

As I'm not good with programming, I was hoping to find a post-estimation
test in R. I found the ic.infer package, which uses the likelihood ratio
test. Does anyone know how to adapt ic.infer from a likelihood ratio test
to a Wald test, or how to program this Wald test with inequality
constraints in R?

Thanks!

Best,
Seb




  


[R] Need a more efficient way to implement this type of logic in R

2011-04-06 Thread Walter Anderson
I have cobbled together the following logic. It works but is very
slow. I'm sure there must be a better R-specific way to implement
this kind of thing, but I have been unable to find or understand one. Any
help would be appreciated.


hh.sub <- households[c("HOUSEID","HHFAMINC")]
for (indx in 1:length(hh.sub$HOUSEID)) {
  if ((hh.sub$HHFAMINC[indx] == '01') | (hh.sub$HHFAMINC[indx] == '02') 
| (hh.sub$HHFAMINC[indx] == '03') | (hh.sub$HHFAMINC[indx] == '04') | 
(hh.sub$HHFAMINC[indx] == '05'))

hh.sub$CS_FAMINC[indx] <- 1 # Less than $25,000
  if ((hh.sub$HHFAMINC[indx] == '06') | (hh.sub$HHFAMINC[indx] == '07') 
| (hh.sub$HHFAMINC[indx] == '08') | (hh.sub$HHFAMINC[indx] == '09') | 
(hh.sub$HHFAMINC[indx] == '10'))

hh.sub$CS_FAMINC[indx] <- 2 # $25,000 to $50,000
  if ((hh.sub$HHFAMINC[indx] == '11') | (hh.sub$HHFAMINC[indx] == '12') 
| (hh.sub$HHFAMINC[indx] == '13') | (hh.sub$HHFAMINC[indx] == '14') | 
(hh.sub$HHFAMINC[indx] == '15'))

hh.sub$CS_FAMINC[indx] <- 3 # $50,000 to $75,000
  if ((hh.sub$HHFAMINC[indx] == '16') | (hh.sub$HHFAMINC[indx] == '17'))
hh.sub$CS_FAMINC[indx] <- 4 # $75,000 to $100,000
  if ((hh.sub$HHFAMINC[indx] == '18'))
hh.sub$CS_FAMINC[indx] <- 5 # More than $100,000
  if ((hh.sub$HHFAMINC[indx] == '-7') | (hh.sub$HHFAMINC[indx] == '-8') 
| (hh.sub$HHFAMINC[indx] == '-9'))

hh.sub$CS_FAMINC[indx] = 0
}
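A vectorized alternative: build the code-to-group lookup once and index it with match(), replacing the whole loop. The hh.sub below is a small hypothetical stand-in for the households subset:

```r
# Hypothetical stand-in for households[c("HOUSEID", "HHFAMINC")]
hh.sub <- data.frame(HHFAMINC = c("03", "07", "12", "17", "18", "-7"),
                     stringsAsFactors = FALSE)

codes  <- c("-9", "-8", "-7", sprintf("%02d", 1:18))
groups <- c(0, 0, 0,       # '-9', '-8', '-7': missing
            rep(1, 5),     # '01'-'05': less than $25,000
            rep(2, 5),     # '06'-'10': $25,000 to $50,000
            rep(3, 5),     # '11'-'15': $50,000 to $75,000
            4, 4,          # '16'-'17': $75,000 to $100,000
            5)             # '18': more than $100,000

# One lookup replaces the entire row-by-row if/else cascade
hh.sub$CS_FAMINC <- groups[match(hh.sub$HHFAMINC, codes)]
```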



[R] Sweave Cairo driver?

2011-04-06 Thread Liviu Andronic
Dear all
I would like to use Sweave with the Cairo() graphics device instead of
pdf(), since the former supports Unicode, allows for easier font
selection out of a greater range of available fonts, and automatically
embeds fonts into the resulting PDF.

Following this older discussion [1], has anyone come up with an Sweave
driver that uses the Cairo graphics device? Thank you
Liviu

[1] http://tolstoy.newcastle.edu.au/R/e12/help/10/10/0986.html



-- 
Do you know how to read?
http://www.alienetworks.com/srtest.cfm
http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
Do you know how to write?
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail



Re: [R] Cannot install package RMySQL

2011-04-06 Thread Radhouane Aniba
Alon, I remember having the same problem as you, but I am using Ubuntu.

I solved it by installing libmysqlclient16-dev, using "sudo apt-get
install libdbd-mysql libmysqlclient16-dev".

I don't know what the equivalent command is on SUSE.
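On openSUSE the equivalent package manager is zypper. The package name below (libmysqlclient-devel) is an assumption based on openSUSE's naming of the MySQL client development files at the time, and may differ by release:

```shell
# Install the MySQL client headers (mysql.h) and library
sudo zypper install libmysqlclient-devel

# Verify the header landed where RMySQL's configure script looks
ls /usr/include/mysql/mysql.h
```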

Radhouane

2011/4/6 alon.benari 

> Hello All,
>
> I have a technical difficulty installing RMySQL. I am running openSUSE11.1
> and R 2.12
> I have installed MySQL from the website.
> and following installation , as root
> This is the part where trouble begin,
> ..
> checking mysql.h usability... no
> checking mysql.h presence... no
> checking for mysql.h... no
> checking for mysql_init in -lmysqlclient... no
> checking for mysql_init in -lmysqlclient... no
> checking for mysql_init in -lmysqlclient... no
> checking for mysql_init in -lmysqlclient... no
> checking for mysql_init in -lmysqlclient... no
> checking for mysql_init in -lmysqlclient... no
> checking for mysql_init in -lmysqlclient... no
> checking /usr/local/include/mysql/mysql.h usability... no
> checking /usr/local/include/mysql/mysql.h presence... no
> checking for /usr/local/include/mysql/mysql.h... no
> checking /usr/include/mysql/mysql.h usability... no
> checking /usr/include/mysql/mysql.h presence... no
> checking for /usr/include/mysql/mysql.h... no
> checking /usr/local/mysql/include/mysql/mysql.h usability... no
> checking /usr/local/mysql/include/mysql/mysql.h presence... no
> checking for /usr/local/mysql/include/mysql/mysql.h... no
> checking /opt/include/mysql/mysql.h usability... no
> checking /opt/include/mysql/mysql.h presence... no
> checking for /opt/include/mysql/mysql.h... no
> checking /include/mysql/mysql.h usability... no
> checking /include/mysql/mysql.h presence... no
> checking for /include/mysql/mysql.h... no
>
> Configuration error:
>  could not find the MySQL installation include and/or library
>  directories.  Manually specify the location of the MySQL
>  libraries and the header files and re-run R CMD INSTALL.
>
> INSTRUCTIONS:
>
> 1. Define and export the 2 shell variables PKG_CPPFLAGS and
>   PKG_LIBS to include the directory for header files (*.h)
>   and libraries, for example (using Bourne shell syntax):
>
>  export PKG_CPPFLAGS="-I"
>  export PKG_LIBS="-L -lmysqlclient"
>
>   Re-run the R INSTALL command:
>
>  R CMD INSTALL RMySQL_.tar.gz
>
> 2. Alternatively, you may pass the configure arguments
>  --with-mysql-dir= (distribution directory)
>   or
>  --with-mysql-inc= (where MySQL header files reside)
>  --with-mysql-lib= (where MySQL libraries reside)
>   in the call to R INSTALL --configure-args='...'
>
>   R CMD INSTALL --configure-args='--with-mysql-dir=DIR' RMySQL_.tar.gz
>
> ERROR: configuration failed for package ‘RMySQL’
> * removing ‘/root/R/i686-pc-linux-gnu-library/2.12/RMySQL’
>
> Any ideas how to proceed from here? I am quite a newbie in Linux.
> How do I find mysql.h? I tried "find / -name mysql.h" but found nothing.
> Did I use it right? Any other ideas? How do I set this right?
>
> Thank you
>
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Cannot-install-pakcage-RMySQL-tp3431025p3431025.html
> Sent from the R help mailing list archive at Nabble.com.
>



-- 
**



Re: [R] Cannot install package RMySQL

2011-04-06 Thread Radhouane Aniba
And then I installed RMySQL from R itself using install.packages().

Cheers


-- 
*Radhouane Aniba*
*Bioinformatics Research Associate*
*Institute for Advanced Computer Studies
Center for Bioinformatics and Computational Biology* *(CBCB)*
*University of Maryland, College Park
MD 20742*



Re: [R] Cannot install package RMySQL

2011-04-06 Thread Phil Spector

You need to install the mysql client development libraries.
On SUSE systems, I believe the package is called
libmysqlclient-devel .
- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu
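
For an openSUSE system, the equivalent of the Ubuntu commands quoted earlier
in the thread would be something along these lines (an editorial sketch; the
package name follows Phil's note and may vary by SUSE release):

```shell
# Install the MySQL client headers and libraries (run as root).
# Package name assumed from the advice above; verify with: zypper search mysql
zypper install libmysqlclient-devel

# Then, inside R, reinstall the package:
#   install.packages("RMySQL")
```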




[R] Getting number of students with zeroes in long format

2011-04-06 Thread Christopher Desjardins
Hi,
I have longitudinal school suspension data on students. I would like to
figure out how many students (id_r) have no suspensions (sus), i.e. have a
code of '0'. My data is in long format and the first 20 records look like
the following:

> suslm[1:20,c(1,7)]
   id_r sus
   11   0
   15  10
   16   0
   18   0
   19   0
   19   0
   20   0
   21   0
   21   0
   22   0
   24   0
   24   0
   25   3
   26   0
   26   0
   30   0
   30   0
   31   0
   32   0
   33   0

Each id_r is unique to a student, and I'd like to know the number of id_r
values that have a 0 for sus, not the total number of 0s. Does that make sense?
Thanks!
Chris



Re: [R] Need a more efficient way to implement this type of logic in R

2011-04-06 Thread Duncan Murdoch

On 06/04/2011 4:02 PM, Walter Anderson wrote:

   I have cobbled together the following logic.  It works but is very
slow.  I'm sure that there must be a better r-specific way to implement
this kind of thing, but have been unable to find/understand one.  Any
help would be appreciated.

hh.sub<- households[c("HOUSEID","HHFAMINC")]
for (indx in 1:length(hh.sub$HOUSEID)) {
if ((hh.sub$HHFAMINC[indx] == '01') | (hh.sub$HHFAMINC[indx] == '02')
| (hh.sub$HHFAMINC[indx] == '03') | (hh.sub$HHFAMINC[indx] == '04') |
(hh.sub$HHFAMINC[indx] == '05'))
  hh.sub$CS_FAMINC[indx]<- 1 # Less than $25,000


The answer is to think in terms of vectors and logical indexing.  The 
code above is equivalent to


hh.sub$CS_FAMINC[hh.sub$HHFAMINC %in% c('01', '02', '03', '04', '05')] <- 1


I've left off the rest of the loop, but I think it's similar.

Duncan Murdoch
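
Duncan's one-liner generalizes to the whole recode. A minimal editorial
sketch on a made-up miniature of the households data (the column values are
assumed from the codes in Walter's post):

```r
# Hypothetical miniature of the households data
hh.sub <- data.frame(HOUSEID  = 1:6,
                     HHFAMINC = c('01', '05', '06', '11', '16', '-8'),
                     stringsAsFactors = FALSE)
hh.sub$CS_FAMINC <- NA
# Each line replaces one if() branch of the original loop
hh.sub$CS_FAMINC[hh.sub$HHFAMINC %in% c('01','02','03','04','05')] <- 1  # < $25,000
hh.sub$CS_FAMINC[hh.sub$HHFAMINC %in% c('06','07','08','09','10')] <- 2  # $25,000-$50,000
hh.sub$CS_FAMINC[hh.sub$HHFAMINC %in% c('11','12','13','14','15')] <- 3  # $50,000-$75,000
hh.sub$CS_FAMINC[hh.sub$HHFAMINC %in% c('16','17')]                <- 4  # $75,000-$100,000
hh.sub$CS_FAMINC[hh.sub$HHFAMINC == '18']                          <- 5  # > $100,000
hh.sub$CS_FAMINC[hh.sub$HHFAMINC %in% c('-7','-8','-9')]           <- 0  # missing codes
hh.sub$CS_FAMINC
# 1 1 2 3 4 0
```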



Re: [R] Need a more efficient way to implement this type of logic in R

2011-04-06 Thread Joshua Wiley
Hi Walter,

Take a look at the function ?cut.  It is designed to take a continuous
variable and categorize it, and will be much simpler and faster.  The
only qualification is that your data would need to be numeric, not
character.  However, if your only values are the ones you put in
quotes in your code ('02' etc), a simple call to
as.numeric(variablename) ought to do the trick.  Beyond being faster,
you can probably get down to one line of code, which should be much
easier on the eyes.  To see some examples with cut(), type (at the
console):

example(cut)

Hope this helps,

Josh

P.S. If you are planning on doing any modelling with this data, why
not leave it continuous?

On Wed, Apr 6, 2011 at 1:02 PM, Walter Anderson  wrote:
>  I have cobbled together the following logic.  It works but is very slow.
>  I'm sure that there must be a better r-specific way to implement this kind
> of thing, but have been unable to find/understand one.  Any help would be
> appreciated.
>
> hh.sub <- households[c("HOUSEID","HHFAMINC")]
> for (indx in 1:length(hh.sub$HOUSEID)) {
>  if ((hh.sub$HHFAMINC[indx] == '01') | (hh.sub$HHFAMINC[indx] == '02') |
> (hh.sub$HHFAMINC[indx] == '03') | (hh.sub$HHFAMINC[indx] == '04') |
> (hh.sub$HHFAMINC[indx] == '05'))
>    hh.sub$CS_FAMINC[indx] <- 1 # Less than $25,000
>  if ((hh.sub$HHFAMINC[indx] == '06') | (hh.sub$HHFAMINC[indx] == '07') |
> (hh.sub$HHFAMINC[indx] == '08') | (hh.sub$HHFAMINC[indx] == '09') |
> (hh.sub$HHFAMINC[indx] == '10'))
>    hh.sub$CS_FAMINC[indx] <- 2 # $25,000 to $50,000
>  if ((hh.sub$HHFAMINC[indx] == '11') | (hh.sub$HHFAMINC[indx] == '12') |
> (hh.sub$HHFAMINC[indx] == '13') | (hh.sub$HHFAMINC[indx] == '14') |
> (hh.sub$HHFAMINC[indx] == '15'))
>    hh.sub$CS_FAMINC[indx] <- 3 # $50,000 to $75,000
>  if ((hh.sub$HHFAMINC[indx] == '16') | (hh.sub$HHFAMINC[indx] == '17'))
>    hh.sub$CS_FAMINC[indx] <- 4 # $75,000 to $100,000
>  if ((hh.sub$HHFAMINC[indx] == '18'))
>    hh.sub$CS_FAMINC[indx] <- 5 # More than $100,000
>  if ((hh.sub$HHFAMINC[indx] == '-7') | (hh.sub$HHFAMINC[indx] == '-8') |
> (hh.sub$HHFAMINC[indx] == '-9'))
>    hh.sub$CS_FAMINC[indx] = 0
> }
>



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/
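
As a concrete illustration of the cut() approach (an editorial sketch; the
breakpoints assume Walter's codes run from '-9' through '18'):

```r
# Hypothetical sample of HHFAMINC codes, converted from character to numeric
inc <- as.numeric(c('-7', '03', '07', '12', '17', '18'))

# Right-closed intervals: (-Inf,0] -> 0 (missing), (0,5] -> 1, (5,10] -> 2,
# (10,15] -> 3, (15,17] -> 4, (17,18] -> 5
cs <- cut(inc,
          breaks = c(-Inf, 0, 5, 10, 15, 17, 18),
          labels = c(0, 1, 2, 3, 4, 5))
as.character(cs)
# "0" "1" "2" "3" "4" "5"
```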



Re: [R] Need a more efficient way to implement this type of logic in R

2011-04-06 Thread Phil Spector

Walter -
   Since your codes represent numbers, you could use something like
this:

chk = as.numeric(hh.sub$HHFAMINC)
hh.sub$CS_FAMINC = cut(chk,c(-10,0,5,10,15,17,18),labels=c(0,1:5))

- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu








Re: [R] Getting number of students with zeroes in long format

2011-04-06 Thread Jorge Ivan Velez
Hi Chris,

Is this what you have in mind?

> sum(with(yourdata, tapply(sus, id_r, function(x) any(x == 0))))
[1] 13

HTH,
Jorge





Re: [R] Need a more efficient way to implement this type of logic in R

2011-04-06 Thread Alexander Engelhardt



Hi,
the for-loop is entirely unnecessary. You can, as a first step, rewrite
the code like this (note that if() is not vectorized, so the comparison
goes into a logical index instead):

hh.sub$CS_FAMINC[(hh.sub$HHFAMINC == '01') | (hh.sub$HHFAMINC == '02') |
                 (hh.sub$HHFAMINC == '03') | (hh.sub$HHFAMINC == '04') |
                 (hh.sub$HHFAMINC == '05')] <- 1 # Less than $25,000

This very basic concept is called "vectorization" in R. You should read
about it; it rocks.


In this case, though, you don't even need to do that:
If you cast the variable HHFAMINC into a number like this:
hh.sub$HHFAMINC <- as.numeric(hh.sub$HHFAMINC)
, then you can apply the cut() function to create a factor variable:

hh.sub$myawesomefactor <- cut(hh.sub$HHFAMINC, breaks=c(5.5, 10.5, 15.5, 
17.5))
or something like that should do the trick. You will then have to rename
the factor values, via levels() or the labels= argument of cut().


Also, this might be my OCD speaking, but I would use NA instead of 0 for 
non-available values.


Have fun,
 Alex



Re: [R] Getting number of students with zeroes in long format

2011-04-06 Thread Douglas Bates

You say you have longitudinal data, so may we assume that a particular
id_r can occur multiple times in the data set?  It is not clear to me
what you want the result to be for students who have no suspensions at
one time but may have a suspension at another time.  Are you
interested in the number of students who have only zeros in the sus
column?

One way to approach this task is to use tapply.  I would create a data
frame and convert id_r to a factor.

df <- within(as.data.frame(suslm), id_r <- factor(id_r))
counts <- with(df, lapply(sus, id_r, function(sus) all(sus == 0)))

The tapply function will split the vector sus according to the levels
of id_r and apply the function to the subvectors.

I just saw Jorge's response; he uses the same tactic, but he is looking
for students who have any value of sus == 0.



Re: [R] ROCR - best sensitivity/specificity tradeoff?

2011-04-06 Thread David Winsemius


On Apr 6, 2011, at 2:27 PM, Christian Meesters wrote:


Hi,

My questions concerns the ROCR package and I hope somebody here on  
the list can help - or point me to some better place.


When evaluating a model's performane, like this:


pred1 <- predict(model, ..., type="response")
pred2 <- prediction(pred1, binary_classifier_vector)
perf  <- performance(pred2, "sens", "spec")

(Where "prediction" and "performance" are ROCR-functions.)

How can I then retrieve the cutoff value for the sensitivity/ 
specificity tradeoff with regard to the data in the model (e.g.  
model = glm(binary_classifier_vector ~ data, family="binomial",  
data=some_dataset)? Perhaps I missed something in the manual?


Or perhaps something was missed in your learning phase regarding decision  
theory? You have not indicated that you understand the need to assign a  
cost to errors of either type before you can talk about a preferred  
cutoff value.



Or do I need an entirely different approach for this? Or is there an  
alternative solution?


Thanks,
Christian


--

David Winsemius, MD
West Hartford, CT



[R] help on pspline in coxph

2011-04-06 Thread Lei Liu

Hi there,

I have a question on how to extract the linear term in the penalized 
spline in coxph. Here is a sample code:


library(survival)  # for coxph(), Surv(), pspline()

n = 100
set.seed(1)
x = runif(n)
f1 = cos(2*pi*x)
hazard = exp(f1)
T = 0
for (i in 1:n) {
  T[i] = rexp(1, hazard[i])
}
C = runif(n)*4
cen = T <= C
y = T*(cen) + C*(1-cen)
data.tr = cbind(y, cen, x)
fit = coxph(Surv(data.tr[,1], data.tr[,2]) ~ pspline(data.tr[,3]))

If I use summary(fit), it will show the following results:

summary(fit)
Call:
coxph(formula = Surv(data.tr[, 1], data.tr[, 2]) ~ pspline(data.tr[,
3]))

  n= 100
                          coef  se(coef) se2   Chisq DF   p
pspline(data.tr[, 3]), li 0.495 0.437    0.437  1.28 1.00 2.6e-01
pspline(data.tr[, 3]), no                      43.79 3.08 1.9e-09

                   exp(coef) exp(-coef) lower .95 upper .95
ps(data.tr[, 3])2     0.4404      2.270   0.08164     2.376
ps(data.tr[, 3])3     0.2065      4.842   0.01701     2.507
ps(data.tr[, 3])4     0.0951     10.512   0.00695     1.302
ps(data.tr[, 3])5     0.0493     20.274   0.00387     0.628
ps(data.tr[, 3])6     0.0280     35.741   0.00230     0.340
ps(data.tr[, 3])7     0.0192     52.068   0.00156     0.237
ps(data.tr[, 3])8     0.0219     45.605   0.00178     0.271
ps(data.tr[, 3])9     0.0473     21.156   0.00399     0.561
ps(data.tr[, 3])10    0.1432      6.983   0.01250     1.640
ps(data.tr[, 3])11    0.3936      2.541   0.03436     4.509
ps(data.tr[, 3])12    0.9449      1.058   0.06885    12.969
ps(data.tr[, 3])13    2.2406      0.446   0.07643    65.683

Iterations: 3 outer, 9 Newton-Raphson
 Theta= 0.697
Degrees of freedom for terms= 4.1
Rsquare= 0.385   (max possible= 0.994 )
Likelihood ratio test= 48.6  on 4.08 df,   p=7.74e-10
Wald test= 45.1  on 4.08 df,   p=4.29e-09

My question is how to extract the linear coefficient (0.495) in 
pspline(data.tr[, 3]). I tried coef(fit) but it fails to display this 
term. Your help is greatly appreciated!


Lei Liu
Associate Professor
Division of Biostatistics and Epidemiology
Department of Public Health Sciences
University of Virginia School of Medicine

http://people.virginia.edu/~ll9f/



[R] mgp.axis.labels

2011-04-06 Thread Judith Flores
Hello,

   I am trying to use mgp.axis.labels to place the x-axis labels at a 
different distance from the one specified for the y-axis. I know I could use 
other functions such as mtext or axis, but I am curious how to use 
'mgp.axis.labels' from the Hmisc package. I tried following the example in 
the documentation without success.

Sample code:

x<-1:10
y<-1:10

mgp.axis.labels(type='x', mgp=c(5,1,2))
plot(x,y, xlab="X")


   I know I am not really understanding the concept, but some guidance would be 
greatly appreciated.

Thank you,

Judith



Re: [R] Getting number of students with zeroes in long format

2011-04-06 Thread Christopher Desjardins
On Wed, Apr 6, 2011 at 4:03 PM, Douglas Bates  wrote:

>
> You say you have longitudinal data so may we assum that a particular
> id_r can occur multiple times in the data set?


Yes an id_r can occur multiple times in the data set.


> It is not clear to me
> what you want the result to be for students who have no suspensions at
> one time but may have a suspension at another time.  Are you
> interested in the number of students who have only zeros in the sus
> column?
>

Yes. Once a student has a value other than zero I don't want to include that
student in the tally. So I want to know how many students never got
suspended during the study.


>
> One way to approach this task is to use tapply.  I would create a data
> frame and convert id_r to a factor.
>
> df <- within(as.data.frame(suslm), id_r <- factor(id_r))
> counts <- with(df, lapply(sus, id_r, function(sus) all(sus == 0)))
>


I am getting the following message:

> df <- within(as.data.frame(suslm), id_r <- factor(id_r))
> counts <- with(df, lapply(sus, id_r, function(sus) all(sus == 0)))
Error in get(as.character(FUN), mode = "function", envir = envir) :
  object 'id_r' of mode 'function' was not found


Thanks,
Chris


> The tapply function will split the vector sus according to the levels
> of id_r and apply the function to the subvectors.
>
> I just say Jorge's response and he uses the same tactic but he is
> looking for students who had any value of sus==0
>



[R] Quiz: Who finds the nicest form of X_1^\prime?

2011-04-06 Thread Marius Hofert
Dear expeRts,

I would like to create a plotmath-label of the form X_1^\prime. Here is how to 
*not* do it [not nicely aligned symbols]:

plot(0,0,main=expression(italic(X*minute[1])))
plot(0,0,main=expression(italic(X[1]*minute)))
plot(0,0,main=expression(italic(X)[1]*minute))

Any suggestions?

Cheers,

Marius


Re: [R] Quiz: Who finds the nicest form of X_1^\prime?

2011-04-06 Thread Peter Ehlers

On 2011-04-06 14:14, Marius Hofert wrote:

Dear expeRts,

I would like to create a plotmath-label of the form X_1^\prime. Here is how to 
*not* do it [not nicely aligned symbols]:

plot(0,0,main=expression(italic(X*minute[1])))
plot(0,0,main=expression(italic(X[1]*minute)))
plot(0,0,main=expression(italic(X)[1]*minute))

Any suggestions?


Hmm; your subject line is a clue:

 expression(italic(X)[1]^minute)

Note the '^'.

Peter Ehlers



Cheers,

Marius


Re: [R] Quiz: Who finds the nicest form of X_1^\prime?

2011-04-06 Thread Peter Ehlers

see inline;

On 2011-04-06 14:22, Peter Ehlers wrote:

On 2011-04-06 14:14, Marius Hofert wrote:

Dear expeRts,

I would like to create a plotmath-label of the form X_1^\prime. Here is how to 
*not* do it [not nicely aligned symbols]:

plot(0,0,main=expression(italic(X*minute[1])))
plot(0,0,main=expression(italic(X[1]*minute)))
plot(0,0,main=expression(italic(X)[1]*minute))

Any suggestions?


Hmm ; your subject line is a clue:

   expression(italic(X)[1]^minute)

Note the '^'.


and if you want a 'straight' prime:

   expression(italic(X)[1]^"\'")

Peter



Peter Ehlers



Cheers,

Marius






Re: [R] Quiz: Who finds the nicest form of X_1^\prime?

2011-04-06 Thread David Winsemius


On Apr 6, 2011, at 5:14 PM, Marius Hofert wrote:


Dear expeRts,

I would like to create a plotmath-label of the form X_1^\prime. Here  
is how to *not* do it [not nicely aligned symbols]:


Not all of us read LaTeX, so this is my initial guess at what you are  
requesting.  The "*" after the '1' is a non-space plotmath separator.


plot(0,0,main=expression(italic(X["`"*1])))


plot(0,0,main=expression(italic(X*minute[1])))
plot(0,0,main=expression(italic(X[1]*minute)))
plot(0,0,main=expression(italic(X)[1]*minute))


If you wanted the tick after the 1, then think of 'minute' as a  
constant rather than as a function:


plot(0,0,main=expression(italic(X[1*minute])))

Or:

plot(0,0,main=expression(italic(X[minute*1])))



Any suggestions?


Greater effort at explanation that does not depend on intuiting your  
goal from erroneous code.


--
David Winsemius, MD
West Hartford, CT



Re: [R] Teradata ODBC driver

2011-04-06 Thread Peter Ehlers

On 2011-04-06 11:23, William Poling wrote:

Hi.

I have had this URL passed to me in order to obtain the necessary driver to 
connect my R application to our Teradata warehouse. However, the URL does not 
seem to exist anymore; my Internet Explorer browser fails to connect for some 
reason.


http://downloads.teradata.com/download/applications/teradata-r/1.0


I just tried this with Firefox 3.6.16 and had no problem.
Proxy server?

Peter Ehlers



Is there an alternative site or location for obtaining the necessary driver?

Thanks

WHP






Re: [R] Calculated mean value based on another column bin from dataframe.

2011-04-06 Thread David Winsemius


On Apr 6, 2011, at 9:46 AM, Fabrice Tourre wrote:


Dear Henrique Dallazuanna,

Thank you very much for your suggestion.

It is obvious that your method is better than mine.

Is it possible to use cut, table, by, etc.? Is there some aggregate
function in R that can do this?

Thanks.

On Wed, Apr 6, 2011 at 2:16 PM, Henrique Dallazuanna wrote:

Try this:

fil <- sapply(ran, '<', e1 = dat[,1]) &
  sapply(ran[2:(length(ran) + 1)], '>=', e1 = dat[,1])
mm <- apply(fil, 2, function(idx) mean(dat[idx, 2]))

On Wed, Apr 6, 2011 at 5:48 AM, Fabrice Tourre wrote:

Dear list,

I have a data frame with two columns, as follows:


head(dat)

  V1  V2
 0.15624 0.94567
 0.26039 0.66442
 0.16629 0.97822
 0.23474 0.72079
 0.11037 0.83760
 0.14969 0.91312

I want to get the mean of column V2 based on bins of column V1. I
wrote the code as follows. It works, but I think this is not the most
elegant way. Any suggestions?

dat<-read.table("dat.txt",head=F)
ran<-seq(0,0.5,0.05)
mm<-NULL
for (i in c(1:(length(ran)-1)))
{
   fil<- dat[,1] > ran[i] & dat[,1]<=ran[i+1]
   m<-mean(dat[fil,2])
   mm<-c(mm,m)
}
mm

Here is the first 20 lines of my data.


dput(head(dat,20))

structure(list(V1 = c(0.15624, 0.26039, 0.16629, 0.23474, 0.11037,
0.14969, 0.16166, 0.09785, 0.36417, 0.08005, 0.29597, 0.14856,
0.17307, 0.36718, 0.11621, 0.23281, 0.10415, 0.1025, 0.04238,
0.13525), V2 = c(0.94567, 0.66442, 0.97822, 0.72079, 0.8376,
0.91312, 0.88463, 0.82432, 0.55582, 0.9429, 0.78956, 0.93424,
0.87692, 0.83996, 0.74552, 0.9779, 0.9958, 0.9783, 0.92523, 0.99022
)), .Names = c("V1", "V2"), row.names = c(NA, 20L), class =  
"data.frame")




Here is how I would have done it with findInterval and tapply, which is  
very similar to using a `cut` and `table` approach:


> dat$grp <- findInterval(dat$V1, seq(0,0.5,0.05) )
> tapply(dat$V2, dat$grp, mean)
        1         2         3         4         5         6         8 
0.9252300 0.8836100 0.9135429 0.9213600 0.8493450 0.7269900 0.6978900 
#---

You do not get exactly the same form of the result as with Henrique's  
method. His yields:

> mm
 [1] 0.9252300 0.8836100 0.9135429 0.9213600 0.8493450 0.7269900       NaN
 [8] 0.6978900       NaN       NaN       NaN



The cut approach would yield this, which is more informatively  
labeled. (I wasn't completely sure the second-to-last word in the  
prior sentence was a real word, but several dictionaries seem to think  
so.):


> dat$grp2 <- cut(dat$V1 , breaks=ran)
> tapply(dat$V2, dat$grp2, mean)
  (0,0.05] (0.05,0.1] (0.1,0.15] (0.15,0.2] (0.2,0.25] (0.25,0.3] 
 0.9252300  0.8836100  0.9135429  0.9213600  0.8493450  0.7269900 
(0.3,0.35] (0.35,0.4] (0.4,0.45] (0.45,0.5] 
        NA  0.6978900         NA         NA 
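
An aggregate() variant gives the same binned means in data-frame form. (A 
sketch; 'dat' is reconstructed from the dput() output above, and note that 
aggregate() simply drops empty bins rather than reporting NA for them.)

```r
# Rebuild the example data from the dput() output in the original post
dat <- data.frame(
  V1 = c(0.15624, 0.26039, 0.16629, 0.23474, 0.11037, 0.14969, 0.16166,
         0.09785, 0.36417, 0.08005, 0.29597, 0.14856, 0.17307, 0.36718,
         0.11621, 0.23281, 0.10415, 0.1025, 0.04238, 0.13525),
  V2 = c(0.94567, 0.66442, 0.97822, 0.72079, 0.8376, 0.91312, 0.88463,
         0.82432, 0.55582, 0.9429, 0.78956, 0.93424, 0.87692, 0.83996,
         0.74552, 0.9779, 0.9958, 0.9783, 0.92523, 0.99022))
ran <- seq(0, 0.5, 0.05)

# One row per non-empty bin; empty bins are dropped, not reported as NA
aggregate(V2 ~ cut(V1, breaks = ran), data = dat, FUN = mean)
```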






David Winsemius, MD
West Hartford, CT



[R] glm predict on new data

2011-04-06 Thread dirknbr
I am aware this has been asked before but I could not find a resolution.

I am doing a logit

lg <- glm(y[1:200] ~ x[1:200,1],family=binomial)

Then I want to predict a new set

pred <- predict(lg,x[201:250,1],type="response")

But I get varying error messages or warnings about the different number of
rows. I have tried data/newdata and also wrapping in data.frame() but cannot
get it to work.

Help would be appreciated.

Dirk.

--
View this message in context: 
http://r.789695.n4.nabble.com/glm-predict-on-new-data-tp3431855p3431855.html
Sent from the R help mailing list archive at Nabble.com.
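
A minimal sketch of the usual fix, with simulated data: predict() matches 
predictors by name, so fit from a data frame and pass newdata with the same 
column name (the names below are illustrative, not from the post):

```r
# Simulate data in the shape described in the post
set.seed(1)
x <- matrix(rnorm(250), ncol = 1)
y <- rbinom(250, 1, plogis(x[, 1]))

# Fit on the first 200 rows, using a data frame so the predictor has a name
train <- data.frame(y = y[1:200], x1 = x[1:200, 1])
lg <- glm(y ~ x1, family = binomial, data = train)

# newdata must be a data frame whose column name matches the formula
pred <- predict(lg, newdata = data.frame(x1 = x[201:250, 1]),
                type = "response")
length(pred)  # 50 predictions, one per new row
```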



Re: [R] Quiz: Who finds the nicest form of X_1^\prime?

2011-04-06 Thread Marius Hofert
Dear Peter, Dear David,

this is also what I tried: plot(0,0,main=expression(italic(X)[1]^minute)) [as 
suggested by Peter]. The problem is that the prime seems so small/short when 
used with "^". Is there a way to get a thicker/larger prime?

Cheers,

Marius

On 2011-04-06, at 23:22 , Peter Ehlers wrote:

> On 2011-04-06 14:14, Marius Hofert wrote:
>> Dear expeRts,
>> 
>> I would like to create a plotmath-label of the form X_1^\prime. Here is how 
>> to *not* do it [not nicely aligned symbols]:
>> 
>> plot(0,0,main=expression(italic(X*minute[1])))
>> plot(0,0,main=expression(italic(X[1]*minute)))
>> plot(0,0,main=expression(italic(X)[1]*minute))
>> 
>> Any suggestions?
> 
> Hmm ; your subject line is a clue:
> 
> expression(italic(X)[1]^minute)
> 
> Note the '^'.
> 
> Peter Ehlers
> 
>> 
>> Cheers,
>> 
>> Marius
> 



[R] corSpatial and nlme

2011-04-06 Thread Robert Baer
I noticed that ?corClasses in package nlme does not list corSpatial among the 
standard classes.  This might either be intentional because corSpatial is not 
"standard" , or it might be simply an oversight that needs correcting.


--
Robert W. Baer, Ph.D.
Professor of Physiology
Kirksville College of Osteopathic Medicine
A. T. Still University of Health Sciences
800 W. Jefferson St.
Kirksville, MO 63501
660-626-2322
FAX 660-626-2965




[R] problem with all/all.equal

2011-04-06 Thread Laura Smith
Hi!

In a function, I may have an instance in which all elements are equal.

> x <- rep(1,5)
>
> x
[1] 1 1 1 1 1
> identical(x)
Error in .Internal(identical(x, y, num.eq, single.NA, attrib.as.set)) :
  'y' is missing
> all.equal(x)
Error in is.expression(x) : 'x' is missing
>

I don't care what particular value it is, I just want to know if they are
all equal.

What am I doing wrong, please?

Thanks,
Laura
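
Two common idioms for this check, as a sketch: exact comparison with all(), 
or comparison with numerical tolerance via all.equal(), which is safer for 
doubles:

```r
x <- rep(1, 5)

# Exact test: every element equals the first
all(x == x[1])                     # TRUE

# Tolerance-based test, robust to floating-point noise
isTRUE(all.equal(max(x), min(x)))  # TRUE
```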




Re: [R] Quiz: Who finds the nicest form of X_1^\prime?

2011-04-06 Thread David Winsemius


On Apr 6, 2011, at 5:58 PM, Marius Hofert wrote:


Dear Peter, Dear David,

this is also what I tried: plot(0,0,main=expression(italic(X) 
[1]^minute)) [as suggested by Peter]. The problem is that the prime  
seems so small/short when used with "^". Is there a way to get a  
thicker/larger prime?


This any better?

plot(0,0,main=expression(italic(X)[1]^bolditalic("'")))



Cheers,

Marius

On 2011-04-06, at 23:22 , Peter Ehlers wrote:


On 2011-04-06 14:14, Marius Hofert wrote:

Dear expeRts,

I would like to create a plotmath-label of the form X_1^\prime.  
Here is how to *not* do it [not nicely aligned symbols]:


plot(0,0,main=expression(italic(X*minute[1])))
plot(0,0,main=expression(italic(X[1]*minute)))
plot(0,0,main=expression(italic(X)[1]*minute))

Any suggestions?


Hmm ; your subject line is a clue:

expression(italic(X)[1]^minute)

Note the '^'.

Peter Ehlers



Cheers,

Marius






David Winsemius, MD
West Hartford, CT



[R] Fisher's Exact test

2011-04-06 Thread Jim Silverton
Hello,
I have a matrix, X2, with 2 columns and I want to do Fisher's exact test
on each row. However, it is too slow and I would like to use the
sage.test command from the library called sagenhaft:

sage.test(datasnp[,1], datasnp[,2], n1 =100, n2 =100)

-- 
Thanks,
Jim.
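
For comparison, a row-wise fisher.test() sketch with toy counts, assuming 
each row of X2 holds two counts tested against fixed totals n1 = n2 = 100 
(mirroring the sage.test call):

```r
# Toy matrix: each row is a pair of counts to compare
X2 <- matrix(c(5, 10, 20, 15), ncol = 2)
n1 <- n2 <- 100

# Build the 2x2 table for each row and extract the p-value
pvals <- apply(X2, 1, function(r)
  fisher.test(matrix(c(r[1], n1 - r[1], r[2], n2 - r[2]),
                     nrow = 2))$p.value)
pvals  # one p-value per row of X2
```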




[R] Fisher exact test approximation?

2011-04-06 Thread Jim Silverton
Hello,
I have a matrix, X2, with 2 columns and I want to do Fisher's exact test
on each row. However, it is too slow and I would like to use the
sage.test command from the sagenhaft library. I used:

sage.test(X2[,1], X2[,2], n1 =100, n2 =100)

but the p-value histogram does not look anything like the one from
Fisher's exact test. Is there a problem here?

-- 
Thanks,
Jim.




[R] Quiz: Who finds the nicest form of X_1^\prime?

2011-04-06 Thread Marius Hofert
thickness looks good, but length... 
it should be something in between the following two:

plot(0,0,main=expression(italic(X)[1]^bolditalic("'")))
plot(0,0,main=expression(italic(X)[1]^bolditalic("|")))


On 2011-04-07, at 24:08 , David Winsemius wrote:

> 
> On Apr 6, 2011, at 5:58 PM, Marius Hofert wrote:
> 
>> Dear Peter, Dear David,
>> 
>> this is also what I tried: plot(0,0,main=expression(italic(X)[1]^minute)) 
>> [as suggested by Peter]. The problem is that the prime seems so small/short 
>> when used with "^". Is there a way to get a thicker/larger prime?
> 
> This any better?
> 
> plot(0,0,main=expression(italic(X)[1]^bolditalic("'")))
> 
>> 
>> Cheers,
>> 
>> Marius
>> 
>> On 2011-04-06, at 23:22 , Peter Ehlers wrote:
>> 
>>> On 2011-04-06 14:14, Marius Hofert wrote:
 Dear expeRts,
 
 I would like to create a plotmath-label of the form X_1^\prime. Here is 
 how to *not* do it [not nicely aligned symbols]:
 
 plot(0,0,main=expression(italic(X*minute[1])))
 plot(0,0,main=expression(italic(X[1]*minute)))
 plot(0,0,main=expression(italic(X)[1]*minute))
 
 Any suggestions?
>>> 
>>> Hmm ; your subject line is a clue:
>>> 
>>> expression(italic(X)[1]^minute)
>>> 
>>> Note the '^'.
>>> 
>>> Peter Ehlers
>>> 
 
 Cheers,
 
 Marius
>>> 
>> 
> 
> David Winsemius, MD
> West Hartford, CT
> 



[R] Quiz: Who finds the nicest form of X_1^\prime?

2011-04-06 Thread Marius Hofert
Haha, I found a hack (using the letter "l"): 

plot(0,0,main=expression(italic(X)[1]^bolditalic("l")))

Cheers,

Marius
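
For comparison, the candidates from this thread can be drawn side by side 
(a sketch; 'minute' is plotmath's built-in prime symbol, the other two are 
the workarounds discussed above):

```r
# Draw the three prime variants in one title for visual comparison
plot(0, 0, type = "n", ann = FALSE)
title(main = expression(
  italic(X)[1]^minute ~ "vs" ~
  italic(X)[1]^bolditalic("'") ~ "vs" ~
  italic(X)[1]^bolditalic("l")
))
```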


