Re: [R] Reshuffling algorithm (Thiago Souza)
On Sat, 2008-07-19 at 11:07 -0300, Thiago Souza wrote:
> Hi folks!
>
> I'm a beginner and I'm learning how to use R.
>
> I need to calculate the value of the Simpson Diversity Index (D = 1 - sum(pi^2),
> where pi is the relative abundance of species i) for one sample (e.g., the
> diversity of bromeliad spiders) and test the significance of this value. For this
> I'll need to randomize samples while conserving the observed species abundance
> and sample-size distributions. After this, I need to repeat these randomizations
> 10,000 times to form a null distribution and assess statistical significance by
> comparing the observed D value with the null distribution.
> Unfortunately, I don't know how to run this procedure in R.
>
> Please, can anybody help me?
>
> Thank you in advance

Thiago,

If I understood your problem, you can solve it using sample(); look at this example:

data <- rep(letters[1:6], 10)
resample <- sample(data, 10)
resample
table(resample)

resample <- sample(data, 10)
resample
table(resample)

To repeat the sampling 10,000 times, use a "for" loop:

for (i in 1:10000) {
  # commands
}

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
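A fuller sketch of the randomization test described above, with made-up species
counts and a simple bootstrap-of-individuals resampling scheme; this is only one
of several possible null models and not necessarily the one that conserves the
sample-size distribution the poster mentions:

obs <- c(12, 7, 3, 2, 1)                        # observed abundance per species (invented)
simpson <- function(n) 1 - sum((n / sum(n))^2)  # D = 1 - sum(pi^2)
D.obs <- simpson(obs)
pool <- rep(seq_along(obs), obs)                # individuals labelled by species
null <- replicate(10000, {
  res <- sample(pool, size = sum(obs), replace = TRUE)   # resample individuals
  simpson(table(factor(res, levels = seq_along(obs))))
})
mean(null >= D.obs)                             # one-sided p-value against this null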
Re: [R] calculate differences - strange outcome
On Thu, 2008-07-17 at 11:47 +0200, Kunzler, Andreas wrote:
> Dear List,
>
> I ran into some trouble calculating differences. For me it is important
> whether differences are exactly 0 or not.
>
> So I don't understand the outcome of this calculation:
>
> 865.56-(782.86+0+63.85+18.85+0)
> [1] -1.136868e-13
>
> I run R version 2.7.1 on WinXP

Hi Andreas Kunzler,

Your problem is caused by the way computers represent numbers (floating-point
arithmetic); this topic is explained in R FAQ 7.31.

One workaround for your calculation is a "guard digits" approach: if you need to solve

865.56 - (782.86+0+63.85+18.85+0)

use

865.56/100 - (782.86+0+63.85+18.85+0)/100

More details in http://docs.sun.com/source/806-3568/ncg_goldberg.html
(this link is given in R FAQ 7.31).

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
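A short sketch of the more usual remedy, comparing with a tolerance instead of
rescaling by hand (the numbers are taken from the post above):

a <- 865.56
b <- 782.86 + 0 + 63.85 + 18.85 + 0
a - b                    # tiny non-zero value, as reported above
isTRUE(all.equal(a, b))  # TRUE: equal up to numerical tolerance
abs(a - b) < 1e-9        # an explicit tolerance test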
Re: [R] boxplot help
On Fri, 2008-08-01 at 14:16 -0700, Rajasekaramya wrote:
> hi
>
> I have a list of matrices of length 61 containing the mean values. I want to
> make a boxplot for each of the matrices. I used a for loop but I can't figure
> out a way to save the boxplots.
>
> all.the.mean
> [[1]]
> mean
> 0.5
> 0.6
> 0.8
> [[2]]
> 0.6
> 0.6
> 0.9
>
> Now I want the boxplot for each of the matrices in a separate window. My code:
>
> for(i in 1:length(all.the.mean))
> {
>   windows()
>   boxplot(all.the.mean[[i]])
> }
>
> This prints all the boxplots, but they can't be saved and I don't know how to
> save them. Please help me with this.

Rajasekaramya,

There are two ways to save a plot: with savePlot(), or with a file device such as
jpeg(), png() or pdf(). In your case:

for (i in 1:length(all.the.mean)) {
  windows()
  boxplot(all.the.mean[[i]])
  fname <- paste("Mean", i, sep = "-")
  savePlot(filename = fname, type = "png")
}

OR

for (i in 1:length(all.the.mean)) {
  fname <- paste(i, "png", sep = ".")
  png(filename = fname)
  boxplot(all.the.mean[[i]])
  dev.off()
}

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
Re: [R] error in installing R packages
On Wed, 2008-08-06 at 15:51 -0700, Eva Winter wrote:
> Hello,
>
> I am trying to install R packages under Linux. Some of the packages cannot be
> installed and I got the following errors. Could anybody give me a suggestion
> on where the problem is and how to fix it? Thanks -e
> (...)
> make: *** [cmeans.o] Error 1
> chmod: cannot access `/usr/share/R/library/e1071/libs/*': No such file or directory
> ERROR: compilation failed for package 'e1071'
> ** Removing '/usr/share/R/library/e1071'
>
> The downloaded packages are in
> /tmp/RtmpZ0Gndg/downloaded_packages

Eva,

Sorry for the direct question, but did you type "sudo R" to invoke R? Installing
into /usr/share/R/library requires write permission on that directory.

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
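If running R as root is not an option, a minimal sketch of an alternative is to
install into a personal library instead; the path below is only an assumption,
any writable directory works:

dir.create("~/R/library", recursive = TRUE, showWarnings = FALSE)
.libPaths("~/R/library")                                   # put it first on the search path
install.packages("e1071", lib = "~/R/library", dependencies = TRUE)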
Re: [R] Tcl/tk not supported on this system
On Fri, 2008-08-08 at 13:27 -0400, Michael Gormley wrote:
> In trying to install the pbatR package, I was greeted with the error:
>
> Error: package 'tcltk' does not have a name space
> Execution halted
>
> Directly installing the package tcltk2 returned the following error:
>
> Loading required package: tcltk
> Error in firstlib(which.lib.loc, package) :
>   Tcl/Tk support is not available on this system
>
> I have seen from previous posts that tcl/tk must be present when R is
> installed. I tried reinstalling R from source using
>
> ./configure
> make
>
> and although the installation went fine I received the same errors.
>
> Typing rpm -qa | grep tcl returns:
> tclx-8.3.5-4
> tcl-8.4.7-2
> tcl-8.4.7-2

Hi Michael,

If I understand correctly, you installed tcl/tk but not the tcl/tk development
packages. In Ubuntu/Debian there are two packages for tcl (tcl8.x and tcl8.x-dev)
and two packages for tk (tk8.x and tk8.x-dev). On your system the equivalents
would be tcl-8.4.7-2 and tcl-devel-8.4.7-2.

I think that if you install the tcl-devel and tk-devel packages (and then re-run
./configure and make), you will solve your problem.

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
Re: [R] Comparison of demographics between 2 study samples
On Wed, 2008-08-13 at 19:14 -0700, Mark Home wrote:
> Dear All:
>
> I have a clinical study where I would like to compare the demographic
> information for 2 samples in a study. The demographics include both categorical
> and continuous variables. I would like to be able to say whether the
> demographics are significantly different or not.
>
> The majority of papers that I have read use multiple techniques to achieve this
> (e.g., t-test for the continuous variables and either Fisher exact or chi-square
> for categorical). I wonder whether this might lead to spurious differences due
> to multiple significance tests. Is there a better way to do this?
>
> Thanks in advance for your advice,
>
> Mark

Mark,

You need to apply a multiple-comparison correction. The best-known correction is
Bonferroni: the new significance threshold is alpha/n, where alpha is the
significance level planned for your study and n is the number of tests you run.

For example, if my clinical study uses alpha = 0.05 and I run 10 tests, the new
threshold is 0.05/10 = 0.005.

--
[]s
Tura
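A small sketch of the same idea with p.adjust(), which applies the correction
directly to a vector of raw p-values (the values below are invented):

p.raw <- c(0.001, 0.004, 0.020, 0.049, 0.300)
p.adjust(p.raw, method = "bonferroni")  # multiplies by the number of tests, capped at 1
p.adjust(p.raw, method = "holm")        # a less conservative alternative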
Re: [R] Confidence Interval
On Tue, 2008-08-19 at 23:25 -0300, Raphael Saldanha wrote:
> Hi!
>
> With the following script, I'm trying to make a demonstration of a Confidence
> Interval, but I'm observing some differences on the tails.

Raphael,

If you want a demonstration of a confidence interval, why not use ci.examp()
from the TeachingDemos package? It is a very good example.

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
Re: [R] Interpreting Logistic Regression
On Wed, 2008-08-20 at 23:54 -0700, Madhavi Bhave wrote:

Hi Madhavi,

> Hi !
>
> This is Madhavi from Mumbai, India. Incidentally this is my first post.

You are welcome!

> I am working on a credit scoring model and, using R, I have run the logistic
> regression. I have received the following output.
>
> I have two questions:
>
> (a) What is the significance of "family = binomial(link = logit)"? Why do I
> have to mention binomial? Is it because my dependent variable assumes only two
> values, 0 and 1? Can I write the name of some other statistical distribution
> (say Poisson or negative binomial) in place of binomial? How will it affect my
> results?

Logistic regression is a generalized linear model. Your specification is the
"binomial" family with the "logit" link function, so summary() reports that
specification. The binomial family is chosen because your data have a
theoretical binomial distribution (two possible outcomes, a fixed probability of
the outcome, and independent observations). Other distributions use other
families and link functions; see the details in:

?family

> (b) How do I interpret the "R" result as given below? I know all the variables
> are significant. How do I get the log-likelihood ratio, odds ratio, etc.?

For the odds ratios:

model <- glm(formula, family = binomial)
exp(coef(model))

For the log-likelihood, use logLik(model); model$deviance gives the residual
deviance, which for a binomial model with binary outcomes equals minus twice the
log-likelihood.

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
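A sketch with simulated data; the data, variable names and coefficients below are
all invented and stand in for the original credit-scoring model:

set.seed(1)
x <- rnorm(200)
y <- rbinom(200, 1, plogis(-1 + 0.8 * x))      # simulate a 0/1 outcome
model <- glm(y ~ x, family = binomial(link = logit))
exp(coef(model))                               # odds ratios
exp(confint.default(model))                    # Wald-type 95% confidence intervals
logLik(model)                                  # log-likelihood of the fitted model
anova(model, test = "Chisq")                   # likelihood-ratio test for the predictor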
Re: [R] correlation between two 2D point patterns?
On Sun, 2009-08-30 at 07:51 +0100, William Simpson wrote:
> Suppose I have two sets of (x,y) points like this:
>
> x1 <- runif(n=10)
> y1 <- runif(n=10)
> A <- cbind(x1, y1)
>
> x2 <- runif(n=10)
> y2 <- runif(n=10)
> B <- cbind(x2, y2)
>
> I would like to measure how similar the two sets of points are. Something like
> a correlation coefficient, where 0 means the two patterns are unrelated, and 1
> means they are identical. And in addition I'd like to be able to assign a
> p-value to the calculated statistic.
>
> cor(x1, x2)
> cor(y1, y2)
> gives two numbers instead of one.
>
> cor(A, B)
> gives a correlation matrix.
>
> I have looked a little at spatial statistics. I have seen methods that, for
> each point, search in some neighbourhood around it and then compute the
> correlation as a function of search radius. That is not what I am looking for.
> I would like a single number that summarises the strength of the relationship
> between the two patterns.
>
> I will do Procrustes on the two point sets first, so that if A is just a
> rotated, translated, scaled, reflected version of B the two patterns will
> superimpose and the statistic I'm looking for will say there is perfect
> correspondence.
>
> Thanks very much for any help in finding such a statistic and calculating it
> using R.
>
> Bill

Hi Bill,

If your two point sets are similar, I expect the Euclidean distances between
corresponding points to be close to 0, so I suggest this script:

d <- sqrt((x1-x2)^2 + (y1-y2)^2)  # Euclidean distance between matched points
t.test(d)                         # test whether the mean distance is 0

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
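As an aside not mentioned in the reply above, and only under the assumption that
the vegan package is acceptable: since the poster plans a Procrustes step anyway,
vegan's protest() runs a Procrustes randomization test and returns a single
correlation-like statistic together with a permutation p-value:

require(vegan)
set.seed(1)
A <- cbind(runif(10), runif(10))
B <- cbind(runif(10), runif(10))
protest(A, B, permutations = 999)   # printing shows the Procrustes correlation and its significance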
Re: [R] Color index in image function
On Sat, 2009-09-05 at 04:14 -0700, FMH wrote:
> Dear All,
>
> I was looking for the color index in the image function, such as from
> topo.colors(n) etc., but still have not found it. For instance, from the help
> page:
>
> ###
> # Volcano data visualized as matrix. Need to transpose and flip
> # matrix horizontally.
> image(t(volcano)[ncol(volcano):1,])
>
> # A prettier display of the volcano
> x <- 10*(1:nrow(volcano))
> y <- 10*(1:ncol(volcano))
> image(x, y, volcano, col = terrain.colors(100), axes = FALSE)
> contour(x, y, volcano, levels = seq(90, 200, by = 5),
>         add = TRUE, col = "peru")
> axis(1, at = seq(100, 800, by = 100))
> axis(2, at = seq(100, 600, by = 100))
> box()
> title(main = "Maunga Whau Volcano", font.main = 4)
> ###
>
> The script above yields a beautiful image of the volcano with a variety of
> colors, but I have to list the color index that shows the meaning of each
> color in my thesis.
>
> Could someone please help me to extract this color index?
>
> Thank you
> Fir

If I understand your question, you need to change the palette of the image plot.
For that you can use colorRampPalette(); look at my example:

Brazilian.Palette <- colorRampPalette(c("green", "yellow", "blue"))
image(x, y, volcano, col = Brazilian.Palette(50), axes = FALSE)

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
Re: [R] Color index in image function
On Wed, 2009-09-09 at 02:33 -0700, FMH wrote:
> Thank you for the hints, but how could I add the grid lines with numbers
> representing the height of the volcano on the image?
>
> Thank you

I think this script is what you need:

Brazilian.Palette <- colorRampPalette(c("green", "yellow", "blue"))
require(fields)
image.plot(volcano, col = Brazilian.Palette(50), axes = FALSE)
contour(volcano, levels = seq(90, 200, by = 5), add = TRUE)

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil

> ----- Original Message -----
> From: Bernardo Rangel Tura
> To: FMH
> Sent: Tuesday, September 8, 2009 10:14:07 AM
> Subject: Re: [R] Color index in image function
>
> On Mon, 2009-09-07 at 07:59 -0700, FMH wrote:
> > Thank you for the tips. I have managed to run your script, but have still
> > not found a way to include the color index beside the image, which would
> > explain the intensity of the color from the lower index (green) to the
> > higher index (blue). This color index might be represented by an increasing
> > scale in a bar beside the image, starting from green, followed by
> > green-yellow, yellow, yellow-blue and blue?
> >
> > Could someone please advise on this matter?
> >
> > Cheers
> > Fir
>
> Hi FMH,
>
> If you want a color index (legend) in a plain image() plot, I don't know how
> to solve your problem. But in your script you use image() with
>
> image(x, y, volcano, col = terrain.colors(100), axes = FALSE)
> contour(x, y, volcano, levels = seq(90, 200, by = 5),
>         add = TRUE, col = "peru")
>
> In this case I suggest you use
>
> Brazilian.Palette <- colorRampPalette(c("green", "yellow", "blue"))
> filled.contour(volcano, color = Brazilian.Palette)
Re: [R] Color index in image function
On Thu, 2009-09-10 at 05:27 -0700, FMH wrote:
> Thank you for the scripts, but the labels and the values on the x and y axes
> suddenly disappear, even when the 'axis' function is used as in the commands
> below. Could you help with this?
>
> axis(1, at = seq(100, 800, by = 100))
> axis(2, at = seq(100, 600, by = 100))
>
> How could I add a title on top of the color index?
>
> Thank you
> Fir

Try this:

Brazilian.Palette <- colorRampPalette(c("green", "yellow", "blue"))
require(fields)
image.plot(volcano, col = Brazilian.Palette(50), legend.lab = "Scale")
contour(volcano, levels = seq(90, 200, by = 5), add = TRUE)

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
Re: [R] Problem in performing goodness of fit test in R.
On Sun, 2010-02-14 at 12:42 +0500, Faiz Rasool wrote:
> I am trying to perform a goodness-of-fit test using R. I am using this website
> http://wiener.math.csi.cuny.edu/Statistics/R/simpleR/stat013.html for help.
> However, I am unable to carry out the test successfully. My code follows; it
> is taken from the website just mentioned.
>
> freq=c(22,21,22,27,22,36) # frequencies obtained after rolling the dice 150 times.
> prob=c(1,1,1,1,1,1)/6 # specify expected frequency for each category.
> chisq.test(freq,p=prob) # I do not know what this line means. I just followed
> the instructions on the website.
>
> The error I receive is "error in chisq.test(freq,p=prob)/6 probabilities must
> sum to 1".
>
> I am very new to R, so any help would be appreciated.
> Faiz.

Faiz,

Well... on my computer (Phenom X4 9650, running Ubuntu 9.10 and R 2.10.1) the
script works:

> freq=c(22,21,22,27,22,36) # frequencies obtained after rolling the dice 150 times.
> prob=c(1,1,1,1,1,1)/6     # specify expected probability for each category.
> chisq.test(freq,p=prob)

        Chi-squared test for given probabilities

data:  freq
X-squared = 6.72, df = 5, p-value = 0.2423

About the third line: you should read ?chisq.test to get to know the command
better, but it performs a chi-square goodness-of-fit test against a uniform
probability distribution.

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
Re: [R] fisher.test gives p>1
On Thu, 2010-03-04 at 11:15 -0500, Jacob Wegelin wrote:
> The purpose of this email is to
>
> (1) report an example where fisher.test returns p > 1
>
> (2) ask if there is a reliable way to avoid p > 1 with fisher.test.
>
> If one has designed one's code to return an error when it finds a
> "nonsensical" probability, of course a value of p > 1 can cause havoc.
>
> Example:
>
> junk <- data.frame(score=c(rep(0,14), rep(1,29), rep(2, 16)))
> junk <- rbind(junk, junk)
> junk$group <- c(rep("DOG", nrow(junk)/2), rep("kitty", nrow(junk)/2))
> table(junk$score, junk$group)
>
>     DOG kitty
>   0  14    14
>   1  29    29
>   2  16    16
>
> dput(fisher.test(junk$score, junk$group)$p.value)
> 1.12

Hi Jacob,

I think this is covered in R FAQ 7.31; look at this command:

all.equal(dput(fisher.test(matrix(c(14,14,29,29,16,16), byrow=TRUE, ncol=2))$p.value), 1)
1.000012
[1] TRUE

The p-value exceeds 1 only by an amount within floating-point tolerance, so
all.equal() treats it as equal to 1.

P.S. R FAQ 7.31:
http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
Re: [R] different forms of nls recommendations
On Sat, 2010-03-20 at 14:55 -0800, emorway wrote:
> Hello,
>
> Using this data:
> http://n4.nabble.com/file/n1676330/US_Final_Values.txt US_Final_Values.txt
>
> and the following code I got the image at the end of this message:
>
> US.final.values <- read.table("c:/tmp/US_Final_Values.txt", header=T, sep=" ")
> US.nls.1 <- nls(US.final.values$ECe~a*US.final.values$WTD^b+c,
>                 data=US.final.values, start=list(a=2.75,b=-0.95,c=0.731), trace=TRUE)
> f.US1 <- function(x){coef(US.nls.1)["a"]*x^coef(US.nls.1)["b"]+coef(US.nls.1)["c"]}
> xvals.US1 <- seq(min(US.final.values$WTD), max(US.final.values$WTD), length.out=75)
> yvals.US1 <- f.US1(xvals.US1)
> Rsq.nls.1 <- sum((predict(US.nls.1)-mean(US.final.values$ECe))^2/sum((US.final.values$ECe-mean(US.final.values$ECe))^2))
> plot(US.final.values$WTD, US.final.values$ECe, col="red", pch=19, cex=.75)
> lines(xvals.US1, yvals.US1, col="blue")
>
> but the r^2 wasn't so hot:
>
> Rsq.nls.1
> [1] 0.2377306
>
> So I wanted to try a different equation of the general form a/(b+c*x^d):
>
> US.nls.2 <- nls(US.final.values$ECe~(a/(b+c*US.final.values$WTD^d)),
>                 data=US.final.values,
>                 start=list(a=100.81,b=73.7299,c=0.0565,d=-6.043),
>                 trace=TRUE, algorithm="port")
>
> but that ended with "Convergence failure: false convergence (8)". I tried

Hi emorway,

You have 657 observations and 4 parameters to fit. In my opinion that is too few
to estimate all four parameters freely in one go, so I would fit in steps,
freeing one parameter at a time and using each fit's estimate as the starting
value for the next:

US.nls.2  <- nls(ECe ~ (a/(b + c * WTD^d)), data=US.final.values,
                 start=list(a=100.81, b=73.7299, c=0.0565, d=-6.043),
                 trace=TRUE, algorithm="port")
temp_nls1 <- nls(ECe ~ (100/(73 + .05 * WTD^d)), data=US.final.values,
                 start=list(d=-6.043), trace=TRUE, algorithm="port")
temp_nls2 <- nls(ECe ~ (100/(73 + .05 * WTD^d)), data=US.final.values,
                 start=list(d=-1.01613), trace=TRUE, algorithm="port")
temp_nls3 <- nls(ECe ~ (100/(73 + c * WTD^(-1.01613))), data=US.final.values,
                 start=list(c=0.05), trace=TRUE, algorithm="port")
temp_nls4 <- nls(ECe ~ (100/(73 + c * WTD^(-1.01613))), data=US.final.values,
                 start=list(c=-14.7127), trace=TRUE, algorithm="port")
temp_nls5 <- nls(ECe ~ (100/(b - 14.7127 * WTD^(-1.01613))), data=US.final.values,
                 start=list(b=73), trace=TRUE, algorithm="port")
temp_nls6 <- nls(ECe ~ (100/(b - 14.7127 * WTD^(-1.01613))), data=US.final.values,
                 start=list(b=70.4936), trace=TRUE, algorithm="port")
temp_nls7 <- nls(ECe ~ (a/(70.4936 - 14.7127 * WTD^(-1.01613))), data=US.final.values,
                 start=list(a=100), trace=TRUE, algorithm="port")

  0:     2243.9898:  100.000
  1:     2122.8218:  106.219
  2:     1359.8819:  187.530
  3:     1359.8819:  187.530

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
Re: [R] SQL on R
On Sat, 2009-07-04 at 22:24 -0700, JoK LoQ wrote:
> I'm dealing with lots of columns and conditions; what's the best way to deal
> with that? How do I work with SQL in R? The manual is quite confusing about
> that. Do I need a package?

I don't fully understand your question, but if you want to use SQL on your data
frames, use the sqldf package
(http://cran.r-project.org/web/packages/sqldf/index.html). If you want to use a
database server and access it from R, I suggest RMySQL
(http://cran.r-project.org/web/packages/RMySQL/index.html).

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
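A small sketch of sqldf on an ordinary data frame; the table and column names
below are invented:

require(sqldf)
sales <- data.frame(region = c("N", "S", "N", "S"),
                    amount = c(10, 20, 30, 40))
sqldf("SELECT region, SUM(amount) AS total FROM sales GROUP BY region")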
Re: [R] queue simulation
On Mon, 2010-03-22 at 02:49 -0600, Carlos Ernesto Lopez Nataren wrote:
> Hello everybody :) I am trying to simulate a queue with times of arrival to
> the queue and time taken to dispatch every member of the queue coming from two
> exponential distributions. I am interested in knowing the number of people at
> any time and the time that it takes every member of this queue to be
> dispatched.
>
> I thought this was going to be an easy task, but I've failed trying to
> simulate this. Is there any package that does this already?
>
> Any help will be greatly appreciated.
>
> Thank you very much
> Carlos

Carlos,

If I understand your problem, you need to know the time for each person in the
queue to be dispatched. You say these times follow an exponential distribution,
so you have a rate of dispatch (x dispatches per unit time).

To generate 10 dispatch times with rate 0.4, use:

times <- rexp(10, 0.4)

For the total delay up to each person, use:

cumsum(times)

For the average time in the queue, use:

mean(cumsum(times))

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
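A minimal single-server FIFO queue sketch that also tracks each person's total
time in the system; the arrival and service rates below are assumptions, not
values from the original question:

set.seed(1)
n       <- 100
arrive  <- cumsum(rexp(n, rate = 0.5))   # arrival times
service <- rexp(n, rate = 0.4)           # service durations
start   <- numeric(n)
finish  <- numeric(n)
for (i in seq_len(n)) {
  start[i]  <- max(arrive[i], if (i > 1) finish[i - 1] else 0)  # wait if the server is busy
  finish[i] <- start[i] + service[i]
}
time.in.system <- finish - arrive
mean(time.in.system)                     # average time from arrival to dispatch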
Re: [R] Changing content of column in data.frame + efficient join extraction between 2 data.frames
On Tue, 2010-03-23 at 09:37 +0000, Corrado wrote:
> Dear R users,
>
> I have 2 SpatialPointsDataFrames, pcs and East.
>
> The column str_1 in the first (pcs) is:
>
> > pcs[0:4,]
>        coordinates cat   str_1  int_1  int_2    dbl_1     dbl_2
> 1 (101000, 263000)   1 "SM06B" 101000 263000 4.978915 -4.293668
> 2 (101000, 265000)   2 "SM06C" 101000 265000 4.960478 -4.266742
> 3 (101000, 267000)   3 "SM06D" 101000 267000 4.912984 -4.246849
> 4 (101000, 269000)   4 "SM06E" 101000 269000 4.613309 -4.185405
>
> The column str_1 in the second (East) is:
>
> > East[0:4,]
>        coordinates str_1
> 1 (489000, 215000) sp81x
> 2 (489000, 217000) sp81y
> 3 (493000, 209000) sp90j
> 4 (495000, 209000) sp90p
>
> I would like to do 2 things:
>
> 1) I would like to change the format of the column str_1 in the first to be
> the same as in the second, that is, I need to remove the inverted commas " and
> I need to make it lower case.
>
> 2) I would like to extract the rows from the first one (pcs) where pcs$str_1
> is the same as East$str_1.
>
> I have even tried regexps, but cannot modify the content of pcs$str_1 to
> remove the inverted commas " and change the case to lowercase.
>
> How do I do that?
>
> Regards

Hi Corrado!

First: tolower(pcs$str_1) changes the text to lower case.

Second: try merge(East, pcs, by.x = "str_1", by.y = "str_1") to join the data
frames.

Third: I could not recreate your East data set on my computer; could you send a
small part of it so I can try to solve your problem?

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
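A toy sketch of both steps (the values are invented); gsub() is one way to drop
the embedded inverted commas before lower-casing and joining:

pcs.str <- c('"SM06B"', '"SM06C"')
cleaned <- tolower(gsub('"', '', pcs.str))   # remove the quotes and lower-case
cleaned
# then, once both str_1 columns have the same format:
# merge(East, pcs, by = "str_1")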
Re: [R] compare two fingerprint images
On Fri, 2010-04-02 at 20:52 +0200, Juan Antonio Gil Pascual wrote:
> Hello
> I wanted to compare two fingerprint images. How do you do that with R?
> Is there a function for cross-correlation of images?
>
> Thanks

Hi Juan,

You were quite vague in your question, so I don't know exactly what you need,
but try this:

require(ReadImages)
x  <- read.jpeg(image1)   # image1, image2: paths to the two JPEG files
x1 <- read.jpeg(image2)
table(x1 == x)

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
Re: [R] compare two fingerprint images
On Sat, 2010-04-03 at 21:18 +0200, Juan Antonio Gil Pascual wrote:
> Hi Bernardo
>
> I need to compare two fingerprint images and would like to know if this can be
> done with R. I have used the minutiae technique, but it seems to work better
> with cross-correlation, and I wanted to know if that can be done with R.
>
> Thank you very much
>
> Juan
>
> Bernardo Rangel Tura wrote:
> > On Fri, 2010-04-02 at 20:52 +0200, Juan Antonio Gil Pascual wrote:
> >> Hello
> >> I wanted to compare two fingerprint images. How do you do that with R?
> >> Is there a function for cross-correlation of images?
> >>
> >> Thanks
> >
> > Hi Juan,
> >
> > You were quite vague in your question, so I don't know exactly what you
> > need, but try this:
> >
> > require(ReadImages)
> > x  <- read.jpeg(image1)
> > x1 <- read.jpeg(image2)
> > table(x1 == x)

Juan,

I don't know a cross-correlation for images, but I know one for time series; in
this case:

ccf(ts(as.numeric(x)), ts(as.numeric(x1)))

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
Re: [R] Generate Numbers
On Sun, 2010-04-04 at 04:03 -0700, kende jan wrote:
> Hello,
> How can I randomly generate in R a sample of skewed data whose first quartile
> is 540 and third quartile is 715?
> I need a sample of 100 cases.
>
> Thank you for your help
>
> Jan

Hi Jan,

Sorry for the direct approach, but why don't you use

data <- c(runif(25, number, 540), runif(50, 540, 715), runif(25, 715, number))

where "number" is a lower bound below 540 in the first call and an upper bound
above 715 in the last call?

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
Re: [R] PCA or CA
On Tue, 2009-09-29 at 17:02 +0000, Paul Dennis wrote:
> Dear all
>
> I have a data set for which PCA-based between-group analysis (BGA) gives
> significant results but CA-BGA does not.
>
> I am having difficulty finding a reliable method for deciding which ordination
> technique is most appropriate.
>
> I have been told to do a one-table CA and, if the 1st axis is > 2 units, go
> for CA; if not, then PCA.
>
> Another approach is that described in the Canoco manual: perform DCA and then
> look at the length of the axes. I used decorana in vegan and it gives axis
> lengths. I assume that these are measured in SD units. Anyway, the manual says
> if the axis length is < 3 go for PCA, > 4 use CA, and if intermediate use
> either.
>
> Are either of these approaches good/valid/recommended or is there a better
> method?
>
> Thanks
>
> Paul

Hi Paul,

I take it that CA is correspondence analysis and PCA is principal component
analysis, right? In that case, if all variables are numeric you should use PCA;
if all variables are factors you should use CA. If you have mixed variables you
have a problem; in most cases I convert the numeric variables to factors (with a
loss of information) and run CA.

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
Re: [R] approximate a titration curve to the measured data
On Wed, 2009-09-30 at 10:55 -0700, awayguy wrote:
> Hello
>
> I'm studying chemistry. Today we did an experiment and I have to draw a
> titration curve for my measurement data. We can do it on mm paper, or we can
> also use a program; people from chemistry recommend "R".
> Last year I studied civil engineering and we used Matlab. As I see it, R is
> very similar, but it has other commands. I think R would be a good help for
> some exercises.
>
> So my main question is: I have some measurement data from my titration, and I
> want to fit a curve to this data. Is it possible to do this with R?
>
> A titration curve looks like this:
>
> http://www.nabble.com/file/p25685986/acetic-acid-titration-curve.png
>
> I hope you can help me, and if you know something like a tutorial, I would be
> glad if you could post it.
>
> With regards

Hi,

I think you need to use the drc package:

drc: Analysis of dose-response curves. Analysis of one or multiple curves with
focus on concentration-response, dose-response and time-response curves used,
for example, in biology, environmental sciences, medicine, pharmacology and
toxicology.

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
Re: [R] How to use Subpopulation data?
On Thu, 2009-10-01 at 10:06 +0000, KABELI MEFANE wrote:
> Dear Helpers
>
> I have a sample frame and I have sampled from it using three methods, and now
> I want to calculate the statistics, but I only get the population parameters.
>
> H <- matrix(rnorm(100, mean=5, sd=5000))
> sampleframe=data.frame(type=c(rep("H",100)),value=c(H))
> sampleframe
>
> str=strata(sampleframe,c("type"),size=c(20,), method="srswor")

Try using

str = strata(sampleframe, c("type"), size = 20, method = "srswor")

or better

str <- strata(sampleframe, c("type"), size = 20, method = "srswor")

(size = c(20,) in your call is not valid syntax; with a single stratum you can
simply pass size = 20.)

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
Re: [R] How to use Subpopulation data?
On Thu, 2009-10-01 at 13:34 +0000, KABELI MEFANE wrote:
> ## package sampling
>
> stra=strata(sampleframe,c("type","value","rating"),size=c(20,80,200,300,400), method="srswor")
> sample.strat<-getdata(sampleframe,stra)
> sample.strat

Try:

stra <- strata(sampleframe, size = c(20, 80, 200, 300, 400), method = "srswor")
sample.strat <- getdata(sampleframe, stra)
sample.strat

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
Re: [R] How to use Subpopulation data?
On Fri, 2009-10-02 at 00:42 +0000, KABELI MEFANE wrote:
> Thanks
>
> But it seems like you don't get my problem. Do you mean that there is
> something wrong with the code? It seems like what you are doing is suggesting
> different ways to write the code.
>
> Will I get to use the variables that have been named previously? For example,
> if I want to calculate the standard deviation of the Hypermarket stratum in a
> sample (not the population), the first check that would help me is whether the
> length() of the different levels is the same as in the original data.
>
> Best Regards
>
> I rest my case, I might dream it

Sorry, I did not understand your problem earlier.

In your case you have two data frames, sampleframe and sample.strat, so you need
to tell R in which one to calculate sd, length or summary.

set.seed(1)  # only to make this example reproducible; don't use it in your code
stra <- strata(sampleframe, c("type"), size = c(20, 80, 200, 300, 400), method = "srswor")
sample.strat <- getdata(sampleframe, stra)

Length of value in sampleframe where type is Hypermarket:

> length(sampleframe$value[sampleframe$type == "Hypermarket"])
[1] 100

SD of value in sampleframe where type is Hypermarket:

> sd(sampleframe$value[sampleframe$type == "Hypermarket"])
[1] 4586.854

Length of value in sample.strat where type is Hypermarket:

> length(sample.strat$value[sample.strat$type == "Hypermarket"])
[1] 20

SD of value in sample.strat where type is Hypermarket:

> sd(sample.strat$value[sample.strat$type == "Hypermarket"])
[1] 4679.336

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
Re: [R] approximate a titration curve to the measured data
On Thu, 2009-10-01 at 10:18 -0700, awayguy wrote:
> Yes, hello, thank you.
> My measurement data:
>
> v2 <- c(0, 2, 4, 6, 6.2, 6.4, 6.6, 6.8, 7, 7.2, 7.4, 7.6, 7.8, 8, 10, 12, 14)
> ph2 <- c(12.10, 11.94, 11.68, 11.11, 10.91, 10.74, 10.47, 9.71, 7.1, 4.24,
>          3.3, 3.08, 2.98, 2.86, 2.33, 2.11, 1.98)
>
> With regards

OK:

require(drc)
temp <- data.frame(ph2, v2)
> drm(temp, fct = LL.4())

A 'drc' model.

Call:
drm(formula = temp, fct = LL.4())

Coefficients:
b:(Intercept)  c:(Intercept)  d:(Intercept)  e:(Intercept)
       39.420          2.425         11.487          7.002

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
Re: [R] Satellite ocean color palette?
On Fri, 2009-10-09 at 11:51 -0700, Tim Clark wrote:
> Dear List,
>
> Is there a color palette available similar to what is used in satellite ocean
> color imagery? I.e. a gradient with blue on one end and red on the other, with
> yellow in the middle? I have tried topo.colors(n) but that comes out more
> yellow on the end. I am looking for something similar to what is found on the
> CoastWatch web page:
>
> http://oceanwatch.pifsc.noaa.gov/imagery/GA2009281_2009282_sst_2D_eddy.jpg
>
> Thanks!
>
> Tim
>
> Tim Clark
> Department of Zoology
> University of Hawaii

Tim,

You can build such a palette in R with colorRampPalette(); look at this example:

Satellite.Palette <- colorRampPalette(c("blue3", "cyan", "aquamarine", "yellow", "orange", "red"))
require(fields)
image.plot(volcano, col = Satellite.Palette(500), legend.lab = "Scale")
contour(volcano, levels = seq(90, 200, by = 5), add = TRUE)

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
Re: [R] Setting a mirror "permanently" on R on ubuntu
On Sat, 2009-10-10 at 04:54 +0100, Lazarus Mramba wrote:
> Dear all,
>
> I seem to have many problems as I run R on my Ubuntu system. I want to set a
> mirror so that any time I use the command "install.packages", it does not ask
> me which mirror to use but goes there directly. This is because of the error I
> keep getting below, and I don't know how to solve it.
>
> Please help.
>
> Kind regards,
> Lazarus

Lazarus,

Use

options(repos = "mirror URL")

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
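To make the choice persist across sessions, one sketch is to put the option in
~/.Rprofile; the mirror URL below is just an example:

# in ~/.Rprofile, executed at the start of every R session
options(repos = c(CRAN = "http://cran.r-project.org"))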
Re: [R] row selection
On Thu, 2009-10-08 at 16:14 -0400, Ashta wrote:
> Hi all,
> I have a matrix named x with N rows and C columns.
> I want to select every 5th row from matrix x.
> I used the following code:
>
> n <- nrow(x)
> for(i in 1:n){
>   b <- a[i+5,]
>   b
> }
> Error: subscript out of bounds
>
> Can anybody point out the problem?

Hi Ashta,

If I understand your request, you need to select rows 5, 10, 15, ... In this
case you can use:

x[1:nrow(x) %% 5 == 0, ]

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
Re: [R] stochastic process
On Tue, 2009-10-13 at 14:43 +0800, 刘哲 wrote:
> Hi,
>
> I'm a student in China, and I have never used R before.
>
> I'm now wondering how to simulate a sample of a Markov chain of, say, 500
> entries with a certain transition matrix.
>
> Thanks a lot.

Hi,

The naive script for your problem is something like this:

#
# initial condition
#
initial <- matrix(c(300, 200, 0), nrow = 1)
initial
     [,1] [,2] [,3]
[1,]  300  200    0

#
# transition matrix (the third state is absorbing)
#
transition <- matrix(c(
  .5, .3, .2,
  .3, .4, .3,
   0,  0,  1
), nrow = 3, byrow = TRUE)
transition
     [,1] [,2] [,3]
[1,]  0.5  0.3  0.2
[2,]  0.3  0.4  0.3
[3,]  0.0  0.0  1.0

#
# first step
#
S1 <- initial %*% transition
S1
     [,1] [,2] [,3]
[1,]  210  170  120

#
# second step
#
S2 <- S1 %*% transition
S2
     [,1] [,2] [,3]
[1,]  156  131  213

If you need a large number of steps, it is more practical to use the mtx.exp
command of the Biodem package, something like this:

require(Biodem)
#
# 10th step directly
#
initial %*% mtx.exp(transition, 10)

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
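The script above propagates the distribution of counts across states; to
simulate one realisation of the chain itself (the "sample of 500 entries" the
poster asks for), a sketch is to draw each new state from the row of the
transition matrix given the current state. The starting state below is an
assumption:

set.seed(1)
transition <- matrix(c(0.5, 0.3, 0.2,
                       0.3, 0.4, 0.3,
                       0.0, 0.0, 1.0), nrow = 3, byrow = TRUE)
n <- 500
chain <- numeric(n)
chain[1] <- 1                        # assumed starting state
for (t in 2:n)
  chain[t] <- sample(1:3, size = 1, prob = transition[chain[t - 1], ])
table(chain)                         # how often each state was visited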
Re: [R] attach
On Wed, 2009-10-14 at 07:21 +0200, Christophe Dutang wrote:
> Hi all,
>
> I have a question regarding the memory usage of the attach function. Say I
> have a data.frame inputdata that I create with read.csv.
>
> I would like to know what happens on the memory side when I use
> attach(inputdata).
>
> Is there a second allocation of memory for inputdata?
>
> Then I'm using eval on an expression which depends on the columns of
> inputdata. Is it better not to use the attach function?
>
> Thanks in advance
>
> Christophe

Well, if you attach a data.frame twice, it uses the memory twice. I don't use
attach; I prefer

with(data.frame, command)

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
Re: [R] operation with if/else on a dataframe
On Thu, 2009-10-29 at 01:47 -0700, Fran100681 wrote:
> Hi to all,
> I have this dataframe (I show the first six rows):
>
> > head(table)
>               A R Fold.Change P.Value Count1 Count2
> 1 ENSRNOE002_at 0        1.13   0.601      1
> 2 ENSRNOE009_at 0       -1.04   0.733      3
> 3 ENSRNOE020_at 0       -1.08   0.680      0
> 4 ENSRNOE021_at 0       -1.31   0.201      2
> 5 ENSRNOE023_at 0       -1.06   0.643      3
> 6 ENSRNOE024_at 0       -1.14   0.403      3
>
> I would like to generate a function that determines a new value for each row
> (resulting in a new vector of values to add to the dataframe). The function
> should give for every row the same value shown in column "R". However, I need
> this because not all the R values reported in this table are correctly
> determined following the criteria mentioned below.
>
> These new values calculated by the function must be:
>
> 1) "UP" if all the following conditions are verified in a certain row:
>    Fold.Change >= +1.5, P.Value < 0.05, Count1 >= 2
>
> 2) "DOWN" if all the following conditions are verified in a certain row:
>    Fold.Change <= -1.5, P.Value < 0.05, Count2 >= 2
>
> 3) 0 if neither of the previous conditions is verified.
>
> So I have set the following parameters (new objects) because I'll have to
> repeat this procedure on different dataframes in which the order of the
> columns of interest might change (so I can change these parameters depending
> on the order of the columns in any given table):
>
> Fold.change <- 3 # (because in this table, Fold.Change is the third column, and so on...)
> P.Value <- 4
> Count1 <- 5
> Count2 <- 6
>
> [ Quote text ]
>
> This is my problem: I cannot use the function to recalculate the values in the
> R column for all rows in my dataframe. I don't understand where the problem
> is; can someone help me?
> Thanks a lot!!
>
> Francesco

Francesco,

I think you can solve this problem in a simpler way. Remember that in R most
functions and operations are vectorized, so look at this example:

set.seed(123)
x <- rpois(20, 5)
y <- rpois(20, 15)
z <- rpois(20, 10)
dta <- data.frame(x, y, z)
dta

dta$NEW <- ifelse(x > 5 & y > 15 & z > 10, "UP",
           ifelse(x < 5 & y < 15 & z < 10, "DOWN", "0"))
dta

First, I use the ifelse command to simplify your nested conditional situation.
Second, R evaluates these nested conditions element-wise and in order, so the
first position of the result tests x[1], y[1] and z[1], the second position
tests x[2], y[2] and z[2], and so on. The resulting vector is in the same order
as the original data.frame, so I use dta$NEW to create a new column in the
data.frame.

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
Re: [R] negative log likelihood
On Sun, 2009-11-08 at 11:12 -0800, mat7770 wrote:
> I have two related variables, each with 16 points (x and y). I am given the
> variance and the y-intercept. I know how to create a regression line and find
> the residuals, but here is my problem. I have to make a loop that uses the
> seq() function, so that it changes the slope value of the y = mx + B equation,
> ranging from 0 to 5 in increments of 0.01. The loop also needs to calculate
> the negative log-likelihood at each slope value and determine the lowest one.
> I know that R can compute the best regression line by using lm(y~x), but I
> need to see if that value matches the loop's result.

Hi,

If I understand your question, you need to extract the log-likelihood of a
linear model; for that you can use the logLik command. See this example:

set.seed(1)
x <- rpois(16, 6)
y <- 2*x + 3 + rnorm(16, sd = 3)
model <- lm(y ~ x)
logLik(model)
'log Lik.' -40.1177 (df=3)

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
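A sketch of the brute-force search the poster describes, repeating the simulated
data from the example above; the intercept B and variance sigma2 used here are
assumptions standing in for the "given" values:

set.seed(1)
x <- rpois(16, 6)
y <- 2 * x + 3 + rnorm(16, sd = 3)
B      <- 3                    # given intercept (assumed here)
sigma2 <- 9                    # given residual variance (assumed here)
slopes <- seq(0, 5, by = 0.01)
negLL  <- sapply(slopes, function(m) {
  r <- y - (m * x + B)
  -sum(dnorm(r, mean = 0, sd = sqrt(sigma2), log = TRUE))
})
slopes[which.min(negLL)]       # slope with the smallest negative log-likelihood
coef(lm(y ~ x))["x"]           # compare with the least-squares slope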
Re: [R] "subset" or "condition" as argument to a function
On Fri, 2009-11-20 at 17:40 -0800, Santosh wrote:
> Dear Rxperts!
>
> I was wondering if it is possible to write a function which can take a subset
> or condition as an argument. Of course, I am aware of the alternative methods
> like coplot, par.plot, xyplot etc... I am specifically interested in using
> conditions/subsets with "plot".
>
> A simple fragmented example is shown here:
>
> pltit <- function(y,x,dat,dat1,dat2,sbst) {
>   plot(y~x, data=dat, subset=sbst)
>   lines(y~x, data=dat1, subset=sbst)
>   points(y~x, data=dat2, subset=sbst)
> }
>
> pltit(profit, weeks, dat=zone1, sbst='group==1&sales>200')
>
> Could you also suggest pointers/references/examples on efficient ways to plot
> simulated data overlaid with observed data?
>
> Have a good weekend!

Hi Santosh,

In my humble opinion, since the condition arrives as a character string, you can
evaluate it inside the function and subset each data frame before plotting,
something along these lines:

pltit <- function(y, x, dat, dat1, dat2, sbst) {
  f <- as.formula(paste(deparse(substitute(y)), "~", deparse(substitute(x))))
  keep <- function(d) d[eval(parse(text = sbst), d), ]  # apply the condition string
  plot(f, data = keep(dat))
  lines(f, data = keep(dat1))
  points(f, data = keep(dat2))
}

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
Re: [R] areg (stata) equivalent in R?
On Mon, 2009-11-30 at 22:20 -0500, John K. Williams wrote:
> Hi, I am looking to reproduce in R a study done in Stata, where a regression
> was done while absorbing a categorical variable. I am new to R; I've installed
> the Design package but haven't been able to find an applicable function.
> Thanks for any help.

John,

Could you show an example of this command? I don't know Stata, so without one I
can't help you.

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
Re: [R] Problem with "Cannot compute correct p-values with ties"
On Wed, 2009-12-02 at 16:52 +0800, Zhijiang Wang wrote:
> Dear All,
> 1. Why did the problem happen?
> 2. How can I solve it?
>
> Best wishes,
> Zhijiang Wang

Well... the algorithm for the Mann-Whitney test has a problem with ties. To work
around it you can use jitter:

a <- 1:10
b <- 1:10
wilcox.test(a, b)

        Wilcoxon rank sum test with continuity correction

data:  a and b
W = 50, p-value = 1
alternative hypothesis: true location shift is not equal to 0

Warning message:
In wilcox.test.default(a, b) : cannot compute exact p-value with ties

wilcox.test(a, jitter(b))

        Wilcoxon rank sum test

data:  a and jitter(b)
W = 49, p-value = 0.9705
alternative hypothesis: true location shift is not equal to 0

See ?jitter for more information.

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
Re: [R] R and McAfee 8.5
On Mon, 2009-05-11 at 09:54 -0300, Joyce, Warren wrote:
> Hi,
>
> I have been working with R for the last year, using the UKFSST package to look
> at satellite tag track data and SST information. For those not familiar, the
> package uses the positions estimated by the satellite tags themselves and the
> associated SST data from servers (in this case, from the University of Hawaii
> and the NOAA CoastWatch website) for the time period, to estimate a reasonable
> track the animal took while it was at liberty. I'm currently having trouble
> downloading data from two different servers, which I never had a problem with
> before I received an upgraded computer with McAfee 8.5. When the package goes
> to the various servers to download SST information, I keep getting a message
> in R:
> (...)
> In my web browser, I get prompted with a file download window which quickly
> downloads an empty file. It looks like the data are available, I just can't
> seem to download them.
>
> I've deduced that it must be a problem with the new firewall features of
> McAfee 8.5 and was wondering if anyone else has run into a problem like this.
> I've tried using the exclusion options in 8.5 to exclude any R files but have
> come up empty-handed. I am still waiting on our informatics division to try
> and solve the problem as well, but I thought I might try asking here too.
>
> Thanks and I appreciate any assistance!
>
> Warren N. Joyce

Hi Joyce,

In my opinion you need to think about two things:

1. This is a McAfee 8.5 problem, not an R problem, so have you contacted McAfee
   support to fix it?

2. I presume you use Windows, but I don't know your version (XP, Vista, 95,
   etc.) or your R configuration (do you start R with "R --internet2" or not?).

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
[R] Help with a cumulative Hazard Ratio plot
Hi R-masters,

I need help to make a modified cumulative hazard ratio plot. I need to create a
common plot, but with the number of subjects at risk at each tick time, for two
different groups, at the bottom of the plot (I put one example in the
attachment). Do you know a routine for this? Is it possible to create a routine
for this? In that case, with which commands?

Thanks in advance!

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
Re: [R] Convert a list to matrix
On Sun, 2009-06-07 at 00:14 -0400, Manisha Brahmachary wrote:
> Hello,
>
> This is an urgent request. I want to convert a list of 3 elements into a
> matrix and I am not sure how to do it.
>
> The list looks like this:
>
> List of 3
>  $ : num [1:15364, 1] 0.133 0.622 0.588 1.024 0.583 ...
>   ..- attr(*, "dimnames")=List of 2
>   .. ..$ : chr [1:15364] "6420681" "3610072" "2260458" "60689" ...
>   .. ..$ : NULL
>  $ : num [1:15364, 1] 3.159 0.265 0.522 1.912 3.380 ...
>   ..- attr(*, "dimnames")=List of 2
>   .. ..$ : chr [1:15364] "6420681" "3610072" "2260458" "60689" ...
>   .. ..$ : NULL
>  $ : num [1:15364, 1] 3.214 0.277 1.447 1.827 2.054 ...
>   ..- attr(*, "dimnames")=List of 2
>   .. ..$ : chr [1:15364] "6420681" "3610072" "2260458" "60689" ...
>   .. ..$ : NULL

I'm not sure I understood your question, but look at this code:

a <- list(B = 1:3, C = 2:4, D = 3:5)
matrix(unlist(a), ncol = 3)

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
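A sketch of an alternative that keeps the dimnames which matrix(unlist(a)) drops;
it assumes every list element has the same rows in the same order, as in the
str() output above:

a <- list(B = 1:3, C = 2:4, D = 3:5)
m <- do.call(cbind, a)   # binds the list elements as columns, keeping names
m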
Re: [R] fisher.test - FEXACT error 7
On Fri, 2009-03-20 at 18:29 +0000, Helena Mouriño wrote:
> Dear all,
>
> I'm having an awkward problem in R. When I write the command
> fisher.test(school.data, workspace=2e+07), where school.data is the matrix
> corresponding to the data set, I get the error message:
>
> FEXACT error 7.
> LDSTP is too small for this problem.
> Try increasing the size of the workspace.
>
> Increasing the workspace,
> fisher.test(school.data, workspace=1e+10),
> I get a different message, but it still doesn't work:
>
> NAs in foreign function call (arg 10)
> In addition: Warning message:
> In fisher.test(dados, workspace = 1e+10, alternative = "two.sided") :
>   NAs introduced by coercion

Hi Helena,

In this case you can try 3 solutions:

1. chisq.test(school.data), but pay attention to whether the expected value of
   any cell is < 5.

2. fisher.test(school.data, workspace=2e+07, hybrid=TRUE). From the help page:
   for larger than 2 by 2 tables and 'hybrid = TRUE', asymptotic chi-squared
   probabilities are only used if the 'Cochran conditions' are satisfied, that
   is, if no cell has count zero and more than 80% of the cells have counts of
   at least 5.

3. Use the "large tables" approach of Sir David Cox:
   Law, G. R., Cox, D. R., Machonochie, N. E. S., Simpson, J., Roman, E. and
   Carpenter, L. M. (2001) Large tables. Biostatistics 2(2), 163-171.

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
Re: [R] any package for connecting berkeley db in R?
On Fri, 2009-03-20 at 17:23 -0400, Zheng, Xin (NIH) [C] wrote:
> Hi there,
>
> Is there any such package? I searched but found none. Thanks in advance.
>
> Xin Zheng

Hi Xin,

Have you tried the DBI package?

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
Re: [R] Search "Errors"
On Tue, 2009-03-24 at 11:45 -0700, CE.KA wrote:
> Hi R users,
> I have a big program, so in Rgui I can't see all of its execution.
> Is there a way to ask R whether there are errors in my program?
> Sincerely yours

Hi,

Normally I use three functions to debug R routines: trace, browser and debug. In
special cases I use the debug package; see the article "Debugging without (too
many) tears" in http://cran.r-project.org/doc/Rnews/Rnews_2003-3.pdf

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
Re: [R] very fast OLS regression?
On Wed, 2009-03-25 at 22:11 +0100, Dimitris Rizopoulos wrote:
> check the following options:
>
> ols1 <- function (y, x) {
>   coef(lm(y ~ x - 1))
> }
>
> ols2 <- function (y, x) {
>   xy <- t(x) %*% y
>   xxi <- solve(t(x) %*% x)
>   b <- as.vector(xxi %*% xy)
>   b
> }
>
> ols3 <- function (y, x) {
>   XtX <- crossprod(x)
>   Xty <- crossprod(x, y)
>   solve(XtX, Xty)
> }
>
> ols4 <- function (y, x) {
>   lm.fit(x, y)$coefficients
> }
>
> # check timings
> MC <- 500
> N <- 1
>
> set.seed(0)
> x <- matrix(rnorm(N*MC), N, MC)
> y <- matrix(rnorm(N*MC), N, MC)
>
> invisible({gc(); gc(); gc()})
> system.time(for (mc in 1:MC) ols1(y[, mc], x[, mc]))
>
> invisible({gc(); gc(); gc()})
> system.time(for (mc in 1:MC) ols2(y[, mc], x[, mc]))
>
> invisible({gc(); gc(); gc()})
> system.time(for (mc in 1:MC) ols3(y[, mc], x[, mc]))
>
> invisible({gc(); gc(); gc()})
> system.time(for (mc in 1:MC) ols4(y[, mc], x[, mc, drop = FALSE]))

Hi Dimitris and Ivo,

I think this is not a fair comparison; look at this:

x[8, 100] <- NA

system.time(for (mc in 1:MC) ols1(y[, mc], x[, mc]))
   user  system elapsed
  8.765   0.000   8.762

system.time(for (mc in 1:MC) ols2(y[, mc], x[, mc]))
Error in solve.default(t(x) %*% x) :
  system is computationally singular: reciprocal condition number = 0
Timing stopped at: 0 0 0.002

system.time(for (mc in 1:MC) ols3(y[, mc], x[, mc]))
Error in solve.default(XtX, Xty) :
  system is computationally singular: reciprocal condition number = 0
Timing stopped at: 0 0 0.001

system.time(for (mc in 1:MC) ols4(y[, mc], x[, mc, drop = FALSE]))
Error in lm.fit(x, y) : NA/NaN/Inf in foreign function call (arg 1)
Timing stopped at: 0 0 0.001

So ols2, ols3 and ols4 only work on a complete matrix: if there is even one NA,
these functions fail, while ols1 (via lm) still runs.

--
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
Re: [R] how to increase the limit for max.print in R
On Tue, 2009-03-31 at 14:47 +0530, pooja arora wrote: > Hi All, > > > > I am using DNAcopy package in R for copy number analysis of 500K chip. > > The final output which I get from DNA copy is too big to be printed in a > file. > > So I am getting an error as "reached getOption("max.print") -- omitted > 475569 rows " > > Can somebody please provide me the pointers with how to increase the limit > for max.print . > > > > Thanks, > > > > Pooja Hi Pooja, You must use the options command, something like this: options(max.print=5.5E5) For more information type: ?options -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to increase the limit for max.print in R
On Tue, 2009-03-31 at 15:51 +0530, pooja arora wrote: > Thanks, it Worked. > Do you have any idea how much is the max limit for max.print > > > Thanks and Regards, > Pooja Arora Hi Pooja, In this case the maximum is: options(max.print=.Machine$double.xmax) In my case I have a compiled R 2.8.1 on an AMD Phenom running Ubuntu 8.10 AMD64: > .Machine$double.xmax [1] 1.797693e+308 -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] A query about na.omit
On Wed, 2009-04-01 at 16:49 +0100, Jose Iparraguirre D'Elia wrote: > Dear all, > > Say I have the following dataset: > > > DF > x y z > [1] 1 1 1 > [2] 2 2 2 > [3] 3 3NA > [4] 4 NA 4 > [5] NA 5 5 > > And I want to omit all the rows which have NA, but only in columns X and Y, > so that I get: > > x y z > 1 1 1 > 2 2 2 > 3 3 NA > > If I use na.omit(DF), I would delete the row for which z=NA, obtaining thus > > x y z > 1 1 1 > 2 2 2 > > But this is not what I want, of course. > If I use na.omit(DF[,1:2]), then I obtain > > x y > 1 1 > 2 2 > 3 3 > > which is OK for x and y columns, but I wouldn't get the corresponding values > for z (ie 1 2 NA) > > Any suggestions about how to obtain the desired results efficiently (the > actual dataset has millions of records and almost 50 columns, and I would > apply the procedure on 12 of these columns)? > > Sincerely, > > Jose Luis > > Jose Luis Iparraguirre > Senior Research Economist > Economic Research Institute of Northern Ireland > Hi Jose Luis, I think this script is sufficient for your problem: tab<-matrix(c(1,1,1,2,2,2,3,3,NA,4,NA,4,NA,5,5),ncol=3,byrow=T) tab[!is.na(tab[,1])&!is.na(tab[,2]),] -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
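A sketch of the same idea on a data frame, using complete.cases, which scales to many rows and lets you name just the columns you care about (DF and the column names follow the example in the question):

DF <- data.frame(x = c(1, 2, 3, 4, NA), y = c(1, 2, 3, NA, 5), z = c(1, 2, NA, 4, 5))
keep <- complete.cases(DF[, c("x", "y")])   # TRUE where neither x nor y is NA
DF[keep, ]                                  # keeps z as-is, including its NA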
Re: [R] Installation of all packages under Unix
On Thu, 2009-04-02 at 14:37 +0200, nboe...@uni-potsdam.de wrote: > Dear List-Members, > > does anyone know a simple way to automatically install all R packages > on a unix system to the default library path? Not from inside R, it > should rather work as a shell script - job at startup. > > Something like R cmd install -l pkgs ### where pkgs should mean all > packages (is there an option to control overwriting?) > > Thanks a lot, > > N. Boehme To install all CRAN packages that are not already installed: install.packages(new.packages(), dep=TRUE, clean=TRUE) -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] threshold distribution
On Sun, 2009-04-05 at 01:13 -0400, jim holtman wrote: > Here is what I get from using 'fitdistr' in R to fit to a lognormal. > The resulting density plot from the distribution seems to be a reason > match to the data. > > > x <- scan() > 1: 0.80010 0.72299 0.69893 0.99597 0.89200 0.69312 0.73613 1.13559 > 9: 0.85009 0.85804 0.73324 1.04826 0.84002 > 14: 1.76330 0.71980 0.89416 0.89450 0.98670 0.83571 0.73833 0.66549 > 22: 0.93641 0.80418 0.95285 0.76876 0.82588 > 27: 1.09394 1.00195 1.14976 0.80008 1.11947 1.09484 0.81494 0.68696 > 35: 0.82364 0.84390 0.71402 0.80293 1.02873 > 40: > Read 39 items > > plot(density(x)) > > library(MASS) > > fitdistr(x, 'lognormal') > meanlog sdlog > -0.13480636 0.19118861 > ( 0.03061468) ( 0.02164785) > > lines(dlnorm(x, -.1348, .1911), col='red') Hi Jim, I agree with your solution, but my plot does not look right even though I obtain the same estimates: > fitdistr(x, 'lognormal') meanlog sdlog -0.13480636 0.19118861 ( 0.03061468) ( 0.02164785) In the plot, when I use points (blue) and curve (green) the fit of the lognormal to density(data) looks fine, but when I use lines (red) the fit is bad (I attach a PDF of my output). Do you know why this happens? Thanks in advance -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil P.S. my script is x <- scan() 0.80010 0.72299 0.69893 0.99597 0.89200 0.69312 0.73613 1.13559 0.85009 0.85804 0.73324 1.04826 0.84002 1.76330 0.71980 0.89416 0.89450 0.98670 0.83571 0.73833 0.66549 0.93641 0.80418 0.95285 0.76876 0.82588 1.09394 1.00195 1.14976 0.80008 1.11947 1.09484 0.81494 0.68696 0.82364 0.84390 0.71402 0.80293 1.02873 require(MASS) fitdistr(x, 'lognormal') pdf("adj.pdf") plot(density(x)) lines(dlnorm(x, -.1348, .1911), col='red') points(x,dlnorm(x, -.1348, .1911), col='blue') curve(dlnorm(x, -.1348, .1911), col='green',add=T) dev.off() __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
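A likely explanation, offered as a sketch rather than a definitive diagnosis of the attached plot: lines() called with a single vector plots the values against the index 1:length(x), and even lines(x, y) connects the points in the order they appear, so an unsorted x produces a scribble. Sorting x first makes lines() agree with curve():

x.s <- sort(x)
plot(density(x))
lines(x.s, dlnorm(x.s, -.1348, .1911), col = "red")        # now matches the green curve
curve(dlnorm(x, -.1348, .1911), col = "green", add = TRUE)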
Re: [R] How to save the variables in R to a file
On Mon, 2009-04-06 at 11:05 +0200, Cetinyürek Aysun wrote: > Dear list members, > > I have f.mean matrix of size 500 X 83 in R, and I want to save this into a > file. > I used the command > save("f.mean.Rdata",file="D:/Users/Ays/Documents/Results") > it saves but when I open the file in notepad it is just some characters > meaningless. > > Thank you in advance, > > Kind Regards, > > Aysun Hi Aysun, Try write.csv, something like this: write.csv(f.mean,"D:/Users/Ays/Documents/Results/f_mean.csv") -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Sequences
On Tue, 2009-04-07 at 05:16 -0700, Melissa2k9 wrote: > Hi, > > I am trying to make a sequence and am using a for loop for this. I want to > start off with an initial value ie S[0]=0 then use the loop to create other > values. This is what I have so far but I just keep getting error messages. > > #To calculate the culmulative sums: > > s<-rep(0,207)#as this is the length of the > vector I know I will have > s<-as.vector(s) > s[0]<-0 > for (i in 1:length(lambs))# where lambs is a vector of > length 207 consisting of temperature > values > > > { > s[i]<-s[i-1]-mean(lambs) > } > > I continually get the error message: > > Error in s[i] <- s[i - 1] - mean(lambs) : replacement has length zero > > > When I merely use s[i]<-i-mean(lambs) it works so there is obviously > something wrong with the s[i-1] but i cant see what. All I want is for each > S[i] to be the previous value for S - the mean. Hi Melissa, I think this is an indexing problem ... your code is: s<-rep(0,207) s<-as.vector(s) s[0]<-0 for (i in 1:length(lambs)){ s[i]<-s[i-1]-mean(lambs) } But try this code: s<-rep(0,207) s<-as.vector(s) s[1]<-0 # not s[0] for (i in 2:length(lambs)){ #not 1:length(lambs) s[i]<-s[i-1]-mean(lambs) } Vectors in R begin at 1, not at 0... -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
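A loop-free sketch of the same sequence: each step just subtracts the constant mean(lambs), so the whole vector can be built with cumsum (lambs is the temperature vector from the question):

m <- mean(lambs)
s <- cumsum(c(0, rep(-m, length(lambs) - 1)))   # s[1] = 0, s[i] = s[i-1] - m
# equivalently: s <- -(seq_along(lambs) - 1) * m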
Re: [R] newbie query: simple crosstabs
On Tue, 2009-04-07 at 16:33 -0400, Donald Braman wrote: > I've been playing around with various table tools, trying to construct a > fairly simple cross-tab. It shouldn't be hard, but for some reason it > turning out to be (for me). > > If I want to see how many men and how many women agree with a agree/disagree > question (coded 1,0), I can do this: > > >attach(mydata) > >mytable <- table(male, q1.bin) # gender and a binary response variable > >prop.table(mytable, 1) # row percentages > q1.bin > male 0 1 >0 0.3988 0.6012 >1 0.2879 0.7121 > > I can repeat that for each of the items I want gender breakdowns for (q2, > q3, q4 ). But what I really want is a table that shows the percentage > answering yes (coded as 1) across many, many binary response items. E.g., > > > male q1.bin q2.bin q3.bin ... >0 0.6012 0.3421 0.9871 ... >1 0.7121 0.6223 0.0198 ... > > I've tried various combinations of apply & cbind, but to no avail. It would > be easy in SPSS crosstabs, but darnit, I want to use R! Donald, Another solution is to use the CrossTable command from the gmodels package. -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
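For the one-row-per-gender summary of many binary items, a small sketch using aggregate (mydata, male and the q*.bin column names follow the question; it assumes the items are coded 0/1):

items <- grep("\\.bin$", names(mydata), value = TRUE)   # q1.bin, q2.bin, ...
aggregate(mydata[items], by = list(male = mydata$male), FUN = mean, na.rm = TRUE)
# one row per value of male, one column per item, entries = proportion answering 1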
Re: [R] Dummy (factor) based on a pair of variables
On Sat, 2009-04-18 at 08:55 +0200, Serguei Kaniovski wrote: > > Dear All! > > my data is on pairs of countries, i and j, e.g.: > > y,i,j > 1,AUT,BEL > 2,AUT,GER > 3,BEL,GER > > I would like to create a dummy (indicator) variable for use in regression > (using factor?), such that it takes the value of 1 if the country is in the > pair (i.e. EITHER an i-country OR an j-country). > > Thank you for your help, > Serguei Hi Serguei, If I understand your question, the solution is something like this: for example, to flag the pairs that contain AUT (as either the i-country or the j-country), with columns named i and j as in your data, use output ~ I(i=="AUT" | j=="AUT") -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
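A small sketch that builds the indicator as a column first, which also makes it easy to turn into a factor (the data frame name dat and the choice of AUT are just for illustration):

dat <- data.frame(y = 1:3, i = c("AUT", "AUT", "BEL"), j = c("BEL", "GER", "GER"))
dat$has.AUT <- factor(as.integer(dat$i == "AUT" | dat$j == "AUT"))
dat
# then e.g. lm(y ~ has.AUT, data = dat), or the same term inside any other model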
Re: [R] omit empty cells in crosstab?
On Fri, 2009-04-24 at 13:12 -0700, sjaffe wrote: > small example: > > a<-c(1.1, 2.1, 9.1) > b<-cut(a,0:10) > c<-data.frame(b,b) > d<-table(c) > dim(d) > ##result: c(10, 10) > > But only 9 of the 100 cells are non-zero. > If there were 10 columns, the table have 10 dimensions each of length 10, so > have 10^10 elements, too much even to fit in memory Hi Steve, In your example only 3 cells are > 0: > d b.1 b(0,1] (1,2] (2,3] (3,4] (4,5] (5,6] (6,7] (7,8] (8,9] (9,10] (0,1] 0 0 0 0 0 0 0 0 0 0 (1,2] 0 1 0 0 0 0 0 0 0 0 (2,3] 0 0 1 0 0 0 0 0 0 0 (3,4] 0 0 0 0 0 0 0 0 0 0 (4,5] 0 0 0 0 0 0 0 0 0 0 (5,6] 0 0 0 0 0 0 0 0 0 0 (6,7] 0 0 0 0 0 0 0 0 0 0 (7,8] 0 0 0 0 0 0 0 0 0 0 (8,9] 0 0 0 0 0 0 0 0 0 0 (9,10] 0 0 0 0 0 0 0 0 0 1 If you want simple code to find only the cells > 0, use this: table(interaction(c,drop=T)) (1,2].(1,2] (2,3].(2,3] (9,10].(9,10] 1 1 1 -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Division ?
On Sat, 2009-05-02 at 10:34 -0700, bogdanno wrote: > It seems division with numbers bigger than 10 000 000 doesn't work > 20000000/21 > [1] 952381 > > /23 > [1] 2415459 > > Thank you Hi bogdanno, First of all, look at this: > all.equal(21*20000000/21,20000000) [1] TRUE So the division works correctly, but if I type > all.equal(952381*21,20000000) [1] "Mean relative difference: 5e-08" This does not mean that R's division is wrong: if you use the command > format(20000000/21,digits=22) [1] "952380.952380952" you see that the result 952381 is a rounded number, not the real result of the division. This occurs because R prints only 7 significant digits by default; if you test > all.equal(21*952380.952380952,20000000) [1] TRUE -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help with function "fitdistr" in "MASS"
On Sat, 2010-01-02 at 23:20 -0800, Saji Ren wrote: > Hi, R users: > > I want to fit my data into a normal distribution by using the command > "fitdistr" in "MASS". > I changed my data class from "ts" to "numeric" by > > >class(mydata)="numeric" > > but after using "fitdistr", I got the result below > > >fitdistr(mydata,"normal") > mean sd > NA NA > (NA) (NA) > > the help doc of "fitdistr" does not mention anything about that, thus I need > your help. > > Thank you in advanced, > Saji from Shanghai Hi Saji, You have NA in your data; try: fitdistr(na.exclude(mydata),"normal") -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] using tabble
On Mon, 2010-01-18 at 17:17 +0800, Fan Dan wrote: > Dear Helper: > I am using R to finish a project which is near the dealine. > Please help me ! > I need to construct a table. The name of the table is the sequence from 1 to > 7 with some missing values arbitrarily. > For example 1 to 7 > 12 36 7 > 12 3 45 5 7 > I want to add the missing numbers of the title which should be seq(1,7) > In this case, 4 and 5 > and the corresponding data is 0 > So after revised the table should be like > 12 3 4 5 6 7 > 12 3 45 0 0 5 7 > Please tell me how > I really appreaciate your time and helpn > Yours > Wolfgang AMadeurs > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > Hi! If I understood your problem, it is easy. First create some fake data, for example: set.seed(123) x<-rep(sample(1:7,5),3) table(x) x 3 4 5 6 7 3 3 3 3 3 In this example the values "1" and "2" are missing. Solution: change your variable to a factor and specify its levels: x.factor<-factor(x,levels=1:7) table(x.factor) x.factor 1 2 3 4 5 6 7 0 0 3 3 3 3 3 -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] identifying odd or even number
On Thu, 2010-07-01 at 08:40 -0700, Yemi Oyeyemi wrote: > Hi, I run into problem when writing a syntax, I don't know syntax that will > return true or false if an integer is odd or even. > Thanks > > OYEYEMI, Gafar Matanmi > > Department of Statistics > > University of Ilorin Hi Yemi! Your problem is solved with "!" and "%%". Look at this example: > x <- 1:11 > x [1] 1 2 3 4 5 6 7 8 9 10 11 > x%%2 [1] 1 0 1 0 1 0 1 0 1 0 1 > !(x%%2) [1] FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Best way to compute a sum
On Fri, 2010-07-02 at 21:23 -0700, Roger Deangelis wrote: > > Although it does not apply to your series and is impractical, it seems to > me that the most accurate algorithm might be to add all the rational numbers > whose sum and components can be represented without error in binary first, > ie 2.5 + .5 or 1/16 + 1/16 + 1/8. > > You could also get very clever and investigate a sum that should have an > exact binary representation when the individual components do not, ie .1 + > .2 + .2 = .5 and correct the sum. > > Roger Roger, I think you should read "What Every Computer Scientist Should Know About Floating-Point Arithmetic" ( http://docs.sun.com/source/806-3568/ncg_goldberg.html ). I think your question, and others like it, is answered in that paper. -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Best way to compute a sum
On Sat, 2010-07-03 at 17:27 -0700, Roger Deangelis wrote: > Hi Bernado, > > In many financial applications if you convert the dollars and cents to > pennies( ie $1.10 to 110) and divide by 100 at the vary end you can get > maintain higher precision. This applies primarily to sums. This is similar > to keeping track of the decimal fractions which have exact representations > in floating point. It is good to know where the errors come from. OK > > I suppose you could also improve the sum by understanding the decimal to > binary rounding algorithms. I have noticed differences in computations > between Sun hardware(FPUs) and Intel(FPUs). When presented the same set of > operations, it appears that the floating point hardware do not do the > operations in the same order. It's true > > > Consider > >sprintf("%a", sum(rep(1.1,100))) # overestimates the sum > "0x1.b8001p+6" > "0x1.b8001p+6" > > sprintf("%a", sum(rep(11,100))/10) # this gives the correct answer > "0x1.b8p+6" > 110 = 10 1110 = 32 + 8 + 4 + 2 - "0x1.b8p+6" - tricky due to 53 bit and > little endian (I think this is right) > "0x1.b8p+6" > > note >cmp <- ifelse (sum(rep(1.1,100))==sum(rep(11,100))/10, "equal", > "unequal") > > [1] "unequal" > >cmp <- ifelse (sum(rep(1.1,100))>sum(rep(11,100))/10, "greater than", > "less than orequal") > > [1] "greater than" Hi Roger, First of all, it is true that > sum(rep(1.1,100))==sum(rep(11,100))/10 [1] FALSE But this is also true: > all.equal(sum(rep(1.1,100)),(sum(rep(11,100))/10)) [1] TRUE I think you should read about the "guard digits" approach. -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to determine if R is 64 bit compiled under Unix-alike?
On Mon, 2010-07-05 at 19:25 +0200, Przemek Grabowicz wrote: > Under MacOS I had R64 executive and it was clear. Under Ubuntu, which I > do not have administrative rights to, there is only R executive. It > seems that I can allocate more than 3GB of memory, however not > everything seems to work the same/right as with R64 under MacOS. > > Pms. > Type .Machine$sizeof.pointer If the result is 8, your R is a 64-bit build. -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] F# vs. R
On Wed, 2010-07-07 at 17:31 +0200, Sergey Goriatchev wrote: > Hello, everyone > > F# is now public. Compiled code should run faster than R. > > Anyone has opinion on F# vs. R? Just curious > > Best, > S Sergey, F# is public, but it is not open source. F# runs on Windows, but does it run on AIX, Linux, Mac, UNIX, etc.? Compiled code should run faster than R, but is it as precise? Compiled code should run faster than R, but is it as reliable? Compiled code should run faster than R, but does it have 2,440 packages to extend its capabilities? Compiled code should run faster than R, but R's critical code is already written in C or FORTRAN. So I think F# is not a good alternative; if your concern is speed, take a look at littler: http://code.google.com/p/littler/ -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] levene.test
On Wed, 2010-07-14 at 01:37 -0700, martanair wrote: > I am trying to use Levene's test (of package car), but I do > not understand quite well how to use it. '?levene.test' does > not unfortunately provide any example. My data are in a data > frame and correspond to 1 factor plus response. Could > someone please give me an example about how to use the command > > levene.test(y, group) > > Thanks in advance, > > marta Hi Marta, levene.test is a deprecated function in the car package, as its help page says (?levene.test). Use leveneTest instead; if you type ?leveneTest you get examples. -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
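A minimal sketch with a made-up data frame (dat, y and group are illustrative names; the formula interface is shown first):

library(car)
dat <- data.frame(y = rnorm(60), group = gl(3, 20, labels = c("a", "b", "c")))
leveneTest(y ~ group, data = dat)             # default center = median (Brown-Forsythe variant)
leveneTest(dat$y, dat$group, center = mean)   # classical Levene test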
Re: [R] Calculating with tolerances
On Thu, 2010-09-09 at 09:16 +0430, Jan private wrote: > Dear list, > > I am from an engineering background, accustomed to work with tolerances. > > For example, I have measured > > Q = 0.15 +- 0.01 m^3/s > H = 10 +- 0.1 m > > and now I want to calculate > > P = 5 * Q * H > > and get a value with a tolerance +- > > What is the elegant way of doing this in R? > > Thank you, > Jan Hi Jan, If I understood your problem, this script solves it: q<-0.15 + c(-.01,0,.01) h<-10 + c(-.1,0,.1) 5*q*h [1] 6.93 7.50 8.08 -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] read columns of quoted numbers as factors
On Mon, 2010-10-04 at 09:39 -0700, james hirschorn wrote: > Suppose I have a data file (possibly with a huge number of columns), where > the > columns with factors are coded as "1", "2", "3", etc ... The default behavior > of > read.table is to convert these columns to integer vectors. > > Is there a way to get read.table to recognize that columns of quoted numbers > represent factors (while unquoted numbers are interpreted as integers), > without > explicitly setting them with colClasses ? > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. Hi James, I think you can solve your problem using the colClasses option of the read.table command, something like this: read.table('name.of.table',colClasses=c(rep('integer',30),rep('numeric',5),etc)) -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] can't find and install reshape2??
On Mon, 2010-10-04 at 10:27 +0930, Chris Howden wrote: > Hi everyone, > > > > Im trying to install reshape2. > > > > But when I click on install package its not coming up!?!?! Im getting > reshape, but no reshape2? > > > > Ive also tried download.packages(reshape2, destdir="c:\\") & > download.packages(Reshape2, destdir="c:\\")but no luck!!! > > > > Does anyone have any ideas what could be going on? > > > > Chris Howden > Hi Chris, I have two guesses: 1- You don't have the 'stringr' package installed 2- Your R is outdated Try these two things and after that mail me. -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Simple question on binning data
On Thu, 2010-05-13 at 17:31 -0400, Carl Witthoft wrote: > It's very simple to write a "binit()" function. If all you want to do > is e.g., bin 107 values into sums of 10 at a time, then write a loop > that sums x[10*i:11*i-1] (not tested and not syntactically correct). > > The one I wrote for myself discards any partial bin (101-107 in my > example) and leaves a warning note that this took place. > > Carl Hi Carl, I think the syntactically correct form is x[(10*i):(11*i-1)], since ":" binds more tightly than "*". -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
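A small sketch of a binit() along the lines Carl describes: it sums consecutive blocks of 10 values, discards any partial bin at the end, and warns when it does so:

binit <- function(x, width = 10) {
  n.full <- length(x) %/% width        # number of complete bins
  if (length(x) %% width != 0)
    warning("dropping ", length(x) %% width, " trailing value(s) in a partial bin")
  colSums(matrix(x[seq_len(n.full * width)], nrow = width))
}

binit(1:107)   # 10 sums; the values 101:107 are discarded with a warning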
Re: [R] Minimization problem
On Thu, 2010-05-20 at 01:35 -0700, Fred wrote: > Dear R users, > > I am trying to minimize two function simultaneously in R, > > function(x) > > minimize x[1],x[2],x[3] > > mean(distribution(x1,x2,x3) ) - observed mean > > std(distribution(x1,x2,x3)) - observed std > > What I want to achieve is that simulated mean and standard deviation > of distribution related to x1 x2 x3 would be close to observed mean > and observed standard deviation. > > > is there any function in R can reach this? > > Thank you for the help first . > > F. Hi! You could use optim; it minimizes a single value, so combine the two errors into one objective, something like this (distribution, observed.mean and observed.sd stand for your own function and observed values): test <- function(parameters){ x1 <- parameters[1] x2 <- parameters[2] x3 <- parameters[3] m.error <- mean(distribution(x1,x2,x3)) - observed.mean sd.error <- sd(distribution(x1,x2,x3)) - observed.sd m.error^2 + sd.error^2 } and then call optim(start.values, test). -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How sample without replacement on more than one variables?
On Sun, 2010-05-23 at 00:56 -0700, dusadrian wrote: > This might help, depending on your exact needs: > > v1 <- sample(letters[1:2], 10, replace=TRUE) > > v2 <- sample(letters[3:4], 10, replace=TRUE) > > v3 <- sample(letters[5:6], 10, replace=TRUE) > > aa <- data.frame(v1=v1, v2=v2, v3=v3) And now it is simple: sample the rows of the data frame (without replacement by default) aa[sample(1:nrow(aa),3),] -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Calculation of r squared from a linear regression
On Fri, 2010-06-11 at 01:16 -0700, Sandra Hawthorne wrote: > Hi, > > I'm trying to verify the calculation of coefficient of determination (r > squared) for linear regression. I've done the calculation manually with a > simple test case and using the definition of r squared outlined in > summary(lm) help. There seems to be a discrepancy between the what R produced > and the manual calculation. Does anyone know why this is so? What does the > multiple r squared reported in summary(lm) represent? > > # The test case: > x <- c(1,2,3,4) > y <- c(1.6,4.4,5.5,8.3) > dummy <- data.frame(x, y) > fm1 <- lm(y ~ x-1, data = dummy) > summary(fm1) > betax <- fm1$coeff[x] * sd(x) / sd(y) > # cd is coefficient of determination > cd <- betax * cor(y, x) > > Thanks. Sorry Sandra, but the problem is in your script. Look at this: x <- c(1,2,3,4) y <- c(1.6,4.4,5.5,8.3) dummy <- data.frame(x, y) fm1 <- lm(y ~ x, data = dummy) summary(fm1) Call: lm(formula = y ~ x, data = dummy) Residuals: 1 2 3 4 -0.17 0.51 -0.51 0.17 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) -0.3500 0.6584 -0.532 0.6481 x 2.1200 0.2404 8.818 0.0126 * --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Residual standard error: 0.5376 on 2 degrees of freedom Multiple R-squared: 0.9749, Adjusted R-squared: 0.9624 F-statistic: 77.76 on 1 and 2 DF, p-value: 0.01262 betax <- fm1$coeff[2] * sd(x) / sd(y) # cd is coefficient of determination cd <- betax * cor(y, x) cd x 0.974924 The formula "fm1$coeff[2] * sd(x) / sd(y)" is valid only when the model has an intercept... -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Predictions with missing inputs
On Fri, 2011-02-11 at 20:51 -0500, Axel Urbiz wrote: > Dear users, > > I'll appreciate your help with this (hopefully) simple problem. > > I have a model object which was fitted to inputs X1, X2, X3. Now, I'd like > to use this object to make predictions on a new data set where only X1 and > X2 are available (just use the estimated coefficients for these variables in > making predictions and ignoring the coefficient on X3). Here's my attempt > but, of course, didn't work. > > #fit some linear model to random data > > x=matrix(rnorm(100*3),100,3) > y=sample(1:2,100,replace=TRUE) > mydata <- data.frame(y,x) > mymodel <- lm(y ~ ns(X1, df=3) + X2 + X3, data=mydata) > summary(mymodel) > > #create new data with 1 missing input > > mynewdata <- data.frame(matrix(rnorm(100*2),100,2)) > mypred <- predict(mymodel, mynewdata) > Thanks in advance for your help! > > Axel. Axel, I think the mice package (multiple imputation of the missing input) solves your problem. -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Complex Time Series
Hi R masters, In my work I analyse a time series of the number of births in the State of Rio de Janeiro. After studying the ACF and PACF I conclude the model is: a non-seasonal AR(1), a seasonal MA(1) at lag 7 (weekly period) and a seasonal MA(1) at lag 364 (yearly period). If the time series were a non-seasonal AR(1) with one seasonal MA(1) it would be simple to fit with the arima command, but I don't know how to fit an arima model to a time series with 2 different seasonal periods. I attach the file ts.csv with the series. Thanks in advance.... Bernardo Rangel Tura, M.D, M.P.H, PhD National Institute of Cardiology Rio de Janeiro - Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Ignore blank columns when reading a row from a table
On Tue, 2008-07-01 at 13:49 -0400, [EMAIL PROTECTED] wrote: > Hi, > > I am extracting data from a table where the rows have different column lengths, > and empty columns have NA in them. Whenever I extract a row with some empty > columns, the resulting vector carries all the NAs. Is there a way to ignore the > empty columns? > > Thanks, > -Nina Nina, You can use scan() and its 'what' argument; if necessary you can use the argument fill=TRUE to handle the different column lengths. Use ?scan to get help. -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil P.S. I learned this tip reading "Data Manipulation with R" __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Simulating Conditional Distributions
On Fri, 2008-03-21 at 22:37 -0700, Sherri Rose wrote: > Dear R-Help List, > I'm trying to simulate data from a conditional distribution, and > haven't been able to modify my existing code to do so. I searched > the archives, but didn't find any previous post that matched my > question. > > n=1 > pop = data.frame(W1 = rbinom(n, 1, .2), >W2 = runif(n, min = 3, max = 8), W3 = rnorm(n, mean=0, sd=2)) > pop = transform(pop, >A = rbinom(n, 1, .5)) > pop = transform(pop, >Y = rbinom(n, 1, 1/(1+exp(-(1.5*A-.05*W1-2*W2-2*W3+2*A*W1) > > In this population the probability of being "diseased" (Y=1) is > approx 0.030. What I want to be able to do is specify a conditional > distribution of (A, W1, W2, W3) given that Y=1 and one for (A, W1, > W2, W3) given that Y=0. Then I can sample diseased and non-diseased > individuals from these distributions without having to simulate a > large base population. This will be particularly useful when the > probability of being "diseased" is even smaller and I want a large > number of diseased individuals. > > Any pointers to do this would be extremely helpful! > Thank you, > Sherri Rose > > UC Berkeley > [[alternative HTML version deleted]] If I understand your problem, this script solve your question: n<-1 Y<-rbinom(n,1,.3) A<-ifelse(Y==0,1,rbinom(n, 1, .5)) W1<-ifelse(Y==0,1,rbinom(n, 1, .2)) W2<-ifelse(Y==0,1,runif(n, min = 3, max = 8)) W3<-ifelse(Y==0,1,rnorm(n, mean=0, sd=2)) pop<-data.frame(Y,A,W1,W2,W3) pop -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] biplot
On Mon, 2007-11-19 at 13:51 -0500, Weiwei Shi wrote: > Hi, > > I am wondering how to draw biplot with the same scales on both plots? > For example, if the two plots have much different scales, generally > the two x-y's are scaled so that the two plots are sitting in the > center automatically. How to disable this? > > Thanks Hi Weiwei, To solve your problem you must use the xlim and ylim options in your biplot calls: par(mfrow=c(2,1)) biplot(...,xlim=c(minimum,maximum),ylim=c(minimum,maximum),...) biplot(...,xlim=c(minimum,maximum),ylim=c(minimum,maximum),...) -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Naming elements of a list
On Thu, 2007-11-22 at 03:16 -0500, Thomas L Jones, PhD wrote: > I have a numeric vector of lenth 1. I am trying to use it inside > a function just by giving its name, rather than specifying it as > an argument to the function. I am aware that there is an attach > function which you need to call. The attach function will accept a > list. However, I don't seem to be able to create the list properly. (Or > should I use a frame instead?) > > free_driver <- function (){ > i <- numeric (1) > attach (as.list (i)) > i <- 25 > > free_test () > > } > free_test <- function (){ > print ("i =") > print (i) > return () > > } > > Anyway, here is the output, starting with the load operation: > > -- > > > free_driver <- function (){ > + i <- numeric (1) > + attach (as.list (i)) > + i <- 25 > + > + free_test () > + > + } > > free_test <- function (){ > + print ("i =") > + print (i) > + return () > + > + } > > free_driver () > Error in attach(as.list(i)) : all elements of a list must be named > > > -- > > Is there an easy way to name all elements of a list? > > Your advice? Hi Tom, If I understand your question, it is possible to solve your problem with this code: free_driver <- function (){ i<-as.numeric(25) i <- list(i) free_test (i) } free_test <- function (i){ cat ("i =",i[[1]],"\n") } free_driver() -- Bernardo Rangel Tura, M.D,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] testing independence of categorical variables
On Thu, 2007-11-22 at 16:16 +0500, Shoaaib Mehmood wrote: > hi, > > is there a way of calculating of measuring dependence between two > categorical variables. i tried using the chi square test to test for > independence but i got error saying that the lengths of the two > vectors don't match. Suppose X and Y are two factors. X has 5 levels > and Y has 7 levels. This is what i tried doing > > >temp<-chisq.test(x,y) > > but got error "the lengths of the two vectors don't match". any help > will be appreciated Hi Shoaaib, Try using chisq.test(table(x,y)). Also note that the error about the lengths means x and y do not have the same number of observations, so check length(x) and length(y) first. -- Bernardo Rangel Tura, M.D,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] problem with plotting table
On Tue, 2007-11-27 at 13:01 -0500, Carlos Gershenson wrote: > Hi all, > > Let us have: > > x<-1:10 > y<-x/2 > plot(table(x), type="p") > points(table(y), pch=2) > > > Why does the last command plots the values of table(y) using the x > coordinates of table(x)??? > Am I doing something wrong? > What would be a way of plotting the points of table(y) on their place? > > #this problem also occurs with: > plot(table(y), type="p") > points(table(x), pch=2) > > > Thank you very much, > Carlos Hi Carlos, In your command plot(table(y), type="p") you plot the numeric part of table(y), in this case: y 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 1 1 1 1 1 1 1 1 1 1 So you always plot the number 1; the same happens for table(x). There are two possible solutions: plot(as.numeric(names(table(x))), type="p") points(as.numeric(names(table(y))), pch=2) OR x<-1:10 y<-x/2 plot(x, type="p") points(y, pch=2) If you need more help, send a mail. -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] intercept in multiple regression
On Tue, 2007-11-20 at 16:39 +0100, Irene Mantzouni wrote: > Hi all! > > Is it possible to model a multiple regression in which the response becomes > zero when one of the two covariates is zero? > lm(y~ x1+x2) and y=0 if x1=0. > However, when x1=0, y=x2+1(intercept). > Does this mean I cannot have a second covariate and intercept or should I > eliminate only the intercept? > > thank you! Well, if y = x2 + 1, that is the same as y - 1 = x2, so you can use lm(y-1~x2-1) Look at this example: set.seed(123) y<-rnorm(100,sd=2) x2<-rpois(100,25) lm(y-1~x2-1) Call: lm(formula = y - 1 ~ x2 - 1) Coefficients: x2 -0.03041 Your regression is y-1 = -0.03041*x2, or y = 1 - 0.03041*x2 -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Is R portable?
Well, yesterday I put a linux version of R 2.6.0 in a USB stick of 2Gb and it runs very well... Bernardo Rangel Tura, MD,MPH,Phd National Cardiology Institute -- Original Message --- From: John Kane <[EMAIL PROTECTED]> To: Roland Rau <[EMAIL PROTECTED]>, Tom Backer Johnsen <[EMAIL PROTECTED]>Cc: Sent: Tue, 4 Dec 2007 12:22:36 -0500 (EST) Subject: Re: [R] Is R portable? > I simply installed R onto a USB stick, downloaded my > normal packages to it and it works fine under Windows. > > --- Roland Rau <[EMAIL PROTECTED]> wrote: > > > Hi Tom, > > > > did you check the R for Windows FAQ? > > > http://cran.r-project.org/bin/windows/base/rw-FAQ.html#Can-I-run-R- > from-a-CD-or-USB-drive_003f > > > > Hope this helps, > > Roland > > > > > > Tom Backer Johnsen wrote: > > > Recently I came across an interesting web site: > > > http://portableapps.com/. The idea is simple, > > this is software that > > > is possible to install and run on some type of USB > > memory, a stick or > > > one of these hard disks. I can think of a number > > of situations where > > > this could be handy. In addition memory sticks > > are getting cheaper > > > and more powerful by the day. > > > > > > So: Is it possible to run R off one of these > > sticks? > > > > > > I am also informed that it is possible to run > > Latex in this manner. > > > > > > Tom __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] level significance
On Sat, 2007-12-08 at 22:51 +0100, Irene Mantzouni wrote: > Hi all! > > I am fitting a (mixed) model with a factor (F) and continuous response and > predictor: > y~F+F:x > > (How) can I check the significance of the model at each factor level (i.e. > the model could be significant only at one of the levels)? > > Thank you! Irene, If I understand your doubt, you only need to use the summary command. Example: set.seed(123) x<-rnorm(300,sd=2) F<-sample(rep(letters[1:3],100)) y<-rnorm(300,mean=2,sd=1.5) model<-lm(y~F+F:x) summary(model) (..) Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 2.118128 0.151543 13.977 <2e-16 *** Fb -0.052866 0.213729 -0.247 0.805 Fc -0.229110 0.213726 -1.072 0.285 Fa:x 0.036987 0.074549 0.496 0.620 Fb:x 0.059439 0.083823 0.709 0.479 Fc:x 0.003769 0.082373 0.046 0.964 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 (...) In this case only the intercept is significant. -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] CART analysis
On Sun, 2007-12-09 at 21:24 +0100, [EMAIL PROTECTED] wrote: > I would like to know if is it possible implemet a partitioning tree > using a function like rpart, or mvpart, and with formula a glm-object > (as a logistic models) or a robust linear regression (as least sum of > absolute errors). > In this case, the appropriate "method" to use is "mrt"? Or another one? > Thanks, > Anna Maria Paganoni Well, I think the tree package solves your problem. For your response variable you don't need the logistic approach; if your variable is numeric, just use the standard formula interface. For any doubt read http://cran.r-project.org/doc/packages/tree.pdf and mail me. -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] the observed "log odds" in logistic regression
On Mon, 2007-12-10 at 19:42 -0800, Bin Yue wrote: (...) >My problem is this : in my data set , the IVs are continuous variables, > do I still have to generate such a table and compute the log odds for each > level of IV according to which the log odds are calculated? If the IV is a continuous variable it isn't possible to create a contingency table, because there are no levels. Similarly, it is not possible to calculate the log odds of P(IV = x), but it is possible to calculate the log odds of P(IV < x). >In R , fitted(fit) gives the fitted probability for DV to be 1. Dose the > observed probability exist ? If it does exist , how can I extract it ? If > the IV is cartegorical , the DV can readily changed to be a tow-culumned > matrix, thus log(the observed probabily/(1-the observed probability) might > be the "log odds". I wonder what if the IV is continuous ? > And about the residuals. It seems that the residual is not the actual > DV minus the fitted probability. For in my model extreme residuals lie well > beyond (0,1). I wonder how the residual is computed. > Would you please help me ? Thank all very much again. To help you, please send a small part of your data and a reproducible example, because it is easier to understand your question that way. -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Non-Linear Quantile Regression
On Tue, 2008-01-01 at 15:16 +0100, Humberto Marotta wrote: > Please, > > I have a problem with nonlinear quantile regression. > > My data shows a large variability and the quantile regression seemed perfect > to relate two given variables. I got to run the linear quantile regression > analysis and to build the graph in the R (with quantreg package). However, > the > up part of my data dispersion seems a positive exponential curve, while the > down part seems a negative exponential curve. The median part seems linear > (maybe non-significant). I think that I needs to run a non-linear quantile > regression for this dataset (including a tau = 0.1, 0.25, 0.5, 0.75 and 0.90 > ). > > The problem is: I read very much many manuals about Quantile regression and > the operational use of R, but I did not get to put parametrs in R to run > this non-linear analysis. > > Might the function in R be the following? > nlrq(formula, data=parent.frame(), start, tau=0.5, control, > trace=FALSE,method="L-BFGS-B") > > What's the formula I could put here for my data?? How I put my file in this > function? Might it be as [scan(file=read.dat" ]?? But where? Well, first you must read your data into R, so you need to type something like base<-scan(file="read.dat") (or base<-read.table("read.dat") if the file has several columns). Second, it is necessary to look at your data to understand the relations between your variables; I suggest: attach(base) plot(independent variable, dependent variable) Based on this plot you can choose the formula for your data. > > At last, one confirmation: Can I run log X log analysis in nlrq?? > > Please, I need very much any response, as I don't know what make in this > moment... > > Thank you very much, > Humberto Marotta > > phD Student from Federal University of Rio de Janeiro > -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Multiple correspondence analysis
Hi R-people, I am trying to use mca (multiple correspondence analysis) on evocation data (the file feminino2.csv attached to this mail). In this database there are 4 evocations for each of 120 persons. If I use this script: base<-read.csv("feminino2.csv") require(MASS) plot(mca(base,abbrev=T),rows = F) it works, but results in an ugly plot, so I try to drop the levels with low frequency: niveis<-unique(c(levels(base$p1),levels(base$p2),levels(base$p3),levels(base$p4))) v<-table(factor(base$p1,levels=niveis))+table(factor(base$p2,levels=niveis))+table(factor(base$p3,levels=niveis))+table(factor(base$p4,levels=niveis)) n2<-niveis[v>=5] #where 5 is the cutoff frequency evoc.df<-data.frame(factor(base$p1,levels=n2),factor(base$p2,levels=n2),factor(base$p3,levels=n2),factor(base$p4,levels=n2)) names(evoc.df)<-c("p1","p2","p3","p4") plot(mca(evoc.df,abbrev=T),rows = F) This doesn't work and results in an error: Error in svd(X) : infinite or missing values in 'x' Where am I wrong? Can anybody help me? Thanks in advance -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] question regarding hypothesis testing in contingency tables
On Thu, 2008-01-10 at 09:47 -0800, eugen pircalabelu wrote: > Hi R users! > > I have the following example: > a<-data.frame(cat=c(5,10,15), dog=c(5,10, 15), mouse=c(10,10,20)) > b<-data.frame(cat=c(15,10,5), dog=c(15, 10, 5), mouse=c(20,10,10)) > rownames(b)<-c("scared", "happy", "sad") > rownames(a)<-c("scared", "happy", "sad") > (...) > Another thing that i want to see , is which cells differ one from another? Eg > is the 5 percent of scared dogs from sample a, > different from 15% of scared dogs form sample b? I would like something like > the "adjusted standardized reziduals" test from SPSS? (...) > Thank you and have a great day! Hi Eugen! The adjusted standardized residuals are available in the gmodels package: require(gmodels) ?CrossTable -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
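A small sketch with the table from the question, assuming I remember the CrossTable arguments correctly (asresid requests the adjusted standardized residuals):

require(gmodels)
a <- data.frame(cat = c(5, 10, 15), dog = c(5, 10, 15), mouse = c(10, 10, 20))
rownames(a) <- c("scared", "happy", "sad")
CrossTable(as.matrix(a), expected = TRUE, chisq = TRUE, asresid = TRUE)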
Re: [R] cannot allocate memory
On Tue, 2008-09-23 at 21:42 -0400, DumpsterBear wrote: > I am getting "Error: cannot allocate vector of size 197 MB". > I know that similar problems were discussed a lot already, but I > didn't find any satisfactory answers so far! > > Details: > *** I have XP (32bit) with 4GB ram. At the time when the problem > appeared I had 1.5GB of available physical memory. > *** I increased R memory limit to 3GB via memory.limit(3000) > *** I did gs() and got > used (Mb) gc trigger (Mb) max used (Mb) > Ncells147534 4.0 407500 10.9407500 10.9 > Vcells 104939449 800.7 186388073 1422.1 185874684 1418.2 > > The garbage collection didn't help. > > Any ideas? Many thanks in advance! > Adam, First, can 32-bit XP really use all of your 4GB? Second, I think you mean "gc" when you write "gs"; on my computer (Ubuntu 64-bit with 4GB): > gc() used (Mb) gc trigger (Mb) max used (Mb) Ncells 188975 10.1 407500 21.8 35 18.7 Vcells 169133 1.3 786432 6.0 786378 6.0 > gc(reset=T) used (Mb) gc trigger (Mb) max used (Mb) Ncells 188951 10.1 407500 21.8 188951 10.1 Vcells 168893 1.3 786432 6.0 168893 1.3 If you read the gc help: reset: logical; if 'TRUE' the values for maximum space used are reset to the current values. Another issue is the options of the rgui command: there is an option "--max-mem-size" that you can modify to expand the RAM available to R. -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] birthday problem (factorial limit)
On Sun, 2008-09-28 at 19:43 +0200, Jörg Groß wrote: > Hi, > > I tried to calculate the formula for the birthday problem > (the probability that at least two people out of a group of n people > share the same birthday) > > But the factorial-function allows me only to calculate factorials up > to 170. > > > So is there a way to push that limit? > > to solve this formula: > > (factorial(365) / factorial((365-23))) / (365^23) > > (n=23) Use the log expression: n<-23 exp(sum(log(1:365))-sum(log(1:(365-n)))-n*log(365)) [1] 0.4927028 -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
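Two small additions along the same lines: base R already has lfactorial() for the log of a factorial, and pbirthday() for the birthday problem itself (note that pbirthday returns the probability of at least one shared birthday, the complement of the value above):

n <- 23
exp(lfactorial(365) - lfactorial(365 - n) - n * log(365))   # 0.4927028, probability of no shared birthday
pbirthday(n)                                                # about 0.5073, at least one match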
Re: [R] Logistic regression problem
On Sat, 2008-09-27 at 10:51 -0700, milicic.marko wrote: > I have a huge data set with thousands of variable and one binary > variable. I know that most of the variables are correlated and are not > good predictors... but... > > It is very hard to start modeling with such a huge dataset. What would > be your suggestion. How to make a first cut... how to eliminate most > of the variables but not to ignore potential interactions... for > example, maybe variable A is not good predictor and variable B is not > good predictor either, but maybe A and B together are good > predictor... > > Any suggestion is welcomed milicic.marko, I think you could start with rpart("binary variable"~.) This shows you a set of variables with which to start a model, and starting cutoffs for the continuous variables. -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Logistic regression problem
On Tue, 2008-09-30 at 18:56 -0500, Frank E Harrell Jr wrote: > Bernardo Rangel Tura wrote: > > Em Sáb, 2008-09-27 às 10:51 -0700, milicic.marko escreveu: > >> I have a huge data set with thousands of variable and one binary > >> variable. I know that most of the variables are correlated and are not > >> good predictors... but... > >> > >> It is very hard to start modeling with such a huge dataset. What would > >> be your suggestion. How to make a first cut... how to eliminate most > >> of the variables but not to ignore potential interactions... for > >> example, maybe variable A is not good predictor and variable B is not > >> good predictor either, but maybe A and B together are good > >> predictor... > >> > >> Any suggestion is welcomed > > > > > > milicic.marko > > > > I think do you start with a rpart("binary variable"~.) > > This show you a set of variables to start a model and the start set to > > curoff for continous variables > > I cannot imagine a worse way to formulate a regression model. Reasons > include > > 1. Results of recursive partitioning are not trustworthy unless the > sample size exceeds 50,000 or the signal to noise ratio is extremely high. > > 2. The type I error of tests from the final regression model will be > extraordinarily inflated. > > 3. False interactions will appear in the model. > > 4. The cutoffs so chosen will not replicate and in effect assume that > covariate effects are discontinuous and piecewise flat. The use of > cutoffs results in a huge loss of information and power and makes the > analysis arbitrary and impossible to interpret (e.g., a high covariate > value:low covariate value odds ratio or mean difference is a complex > function of all the covariate values in the sample). > > 5. The model will not validate in new data. Professor Frank, Thank you for your explanation. Well, if my first idea is wrong, what is your opinion of the following approach? 1- Run a PCA on the data, excluding the binary variable 2- Put the principal components in the logistic model 3- Afterwards, convert the principal components back to the original variables (only if that is interesting for milicic.marko) If this approach is wrong too, what is your approach? -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] cannot open connection: Authorization Required
On Fri, 2008-10-03 at 11:22 -0700, Spencer Graves wrote: > Hi, All: > > Is there a way in R to access a file / web site that requires > permission? > > Consider for example the following: > > > > readLines('http://www.r-project.org/', 4) > [1] "" > [2] "" > [3] "" > [4] "The R Project for Statistical Computing" > > > readLines(URL) # URL = web address, which I can see via manual access. > Error in file(con, "r") : cannot open the connection > In addition: Warning message: > In file(con, "r") : > cannot open: HTTP status was '401 Authorization Required' > > > Thanks, > Spencer Graves Spencer, this seems to be a problem local to your system; on my machine readLines('http://www.r-project.org/') returns the 26 lines of the R home page without any error. I run R 2.7.2 on an Ubuntu AMD64 machine. -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Load a program at the front end
Em Qui, 2008-10-02 às 14:36 -0400, Gang Chen escreveu: > I want to run a R program, prog.R, interactively. My question is, is > there a way I can start prog.R on the shell terminal when invoking R, > instead of using source() inside R? > > TIA, > Gang
Hi Gang, on my system I just type: R --no-save < prog.R
-- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] interpreting Shapiro-Wilks test result
Em Qua, 2008-10-08 às 17:41 -0700, Halizah Basiron escreveu: > Hi all, >I am newbie in using R software and also doing statistical test. I want to > know if my data in in normal distribution. I have 2 groups of data and I did > calculate Shapiro Wilks using R software. Here is the results: > > Group 1: W = 0.9206, p-value = 0.01683 > Group 2: W = 0.9626, p-value = 0.4694 > > I am not quite sure what default confidence level (CF) is used in calculating > Shapiro Wilks. Else, may I choose my own CF. Let say, if I choose CF = 0.01, > therefore I may approve that Group 1 and Group 2 data are normal. If not, > what should I do. My next plan is to apply T Test for both groups (assuming > both are normal). I really need advice from experts here. > > Cheers, > Halizah
Hi Halizah, in my opinion there is no "confidence level of the test"; what exists is the confidence level of your experiment. If you plan your experiment with a significance level alpha, you have a 100*(1-alpha)% confidence level, and a p-value needs to be smaller than alpha to be called "significant". So if you work at the 99% confidence level (alpha = 0.01), neither p-value above (0.01683 or 0.4694) is below 0.01, both groups can be treated as normal, and you can use t.test.
-- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
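PS: a minimal sketch of the two steps, assuming your measurements are stored in two hypothetical vectors g1 and g2:

shapiro.test(g1)   # normality check for group 1
shapiro.test(g2)   # normality check for group 2
# If neither p-value is below your chosen alpha (here 0.01),
# proceed with the two-sample t test:
t.test(g1, g2)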
Re: [R] help about how can R compute AIC?
Em Ter, 2008-10-14 às 17:13 +0200, Arnau Mir Torres escreveu: > Hello. > > I need to know how can R compute AIC when I study a regression model? > For example, if I use these data: >growth tannin > 1 12 0 > 2 10 1 > 3 8 2 > 4 11 3 > 5 6 4 > 6 7 5 > 7 2 6 > 8 3 7 > 9 3 8 > and I do > model <- lm (growth ~ tannin) > AIC(model) > > R responses: > 38.75990 > > I know the following formula to compute AIC: > AIC= -2*log-likelihood + 2*(p+1) > > In my example, it would be: > AIC=-2*log-likelihood + 2*2 > but I don't know how R computes log-likelihood: > > logLik(model) > 'log Lik.' -16.37995 (df=3)
Arnau,

logLik = -16.37995
AIC = -2*log-likelihood + 2*(p+1)
AIC = -2*(-16.37995) + 2*(p+1)
AIC = 32.7599 + 2*(p+1)
# This is the important point: the model has two regression
# parameters (intercept and slope), and sigma is a parameter too,
# so p+1 = 2+1 = 3:
AIC = 32.7599 + 2*(2+1)
AIC = 32.7599 + 6
AIC = 38.7599

-- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
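PS: a small sketch that reproduces these numbers with the growth/tannin data from your message:

growth <- c(12, 10, 8, 11, 6, 7, 2, 3, 3)
tannin <- 0:8
model <- lm(growth ~ tannin)
logLik(model)                            # 'log Lik.' -16.37995 (df=3)
-2 * as.numeric(logLik(model)) + 2 * 3   # 38.7599
AIC(model)                               # 38.7599, the same value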
Re: [R] legend
Em Ter, 2008-10-14 às 18:44 -0700, Lavan escreveu: > Hi R users, > > I'm trying to have the symbols for sigma[1] in my legend. the code is given > below, please have a look. > > Thanks, > > lavan
Lavan, try legend("bottom", legend = expression(sigma[1]), ...)
> > > legend("bottom",legend=paste("Sigma1=", c(0.01,0.1,0.2,0.5,1,1.5,2,4,6,9.5), > sep=""), > fill=c("red","green","blue","black","pink","brown","purple","yellow","lightblue","orange")) >
-- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
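PS: to get the sigma[1] symbol together with each of the ten values, something along these lines should work (untested sketch):

vals <- c(0.01, 0.1, 0.2, 0.5, 1, 1.5, 2, 4, 6, 9.5)
labs <- as.expression(lapply(vals, function(v) bquote(sigma[1] == .(v))))
legend("bottom", legend = labs,
       fill = c("red", "green", "blue", "black", "pink", "brown",
                "purple", "yellow", "lightblue", "orange"))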
Re: [R] Saving results of Kruskal Walis test
Em Qui, 2008-10-16 às 22:31 +0200, Himanshu Ardawatia escreveu: > Hello, > > I am running Kruskal-Walis test in R. When I try to save results using > write.table it gives me the following error : > > Error in as.data.frame.default(x[[i]], optional = TRUE, stringsAsFactors = > stringsAsFactors) : > cannot coerce class "htest" into a data.frame > > The overall code is as follows : > > >data_file = read.table("~/DATA.dir/data_file.txt", header=T) > > >attach(data_file) > > >data_file.out <- krukal.test(data_file) > > >write.table(data_file.out, "~/DATA/results/data_file_out.txt") > > Error in as.data.frame.default(x[[i]], optional = TRUE, stringsAsFactors = > stringsAsFactors) : > cannot coerce class "htest" into a data.frame > Results do come in data_file.out after analysis as seen below: > > >data_file.out > > > Kruskal-Wallis rank sum test > > data: value by pathway > Kruskal-Wallis chi-squared = 5.6031, df = 3, p-value = 0.1326 > > > I am wondering if I am making a mistake with using write.table (It works > very well saving results from anova analysis) or is there any other way to > save results in a file for future use.. > > Thanks > Himanshu
Hi Himanshu, the output of an htest is a list, so data_file.out is a list too. You cannot put such a list directly into a data.frame, so you need to do something like data.frame(unlist(data_file.out)) first.
-- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
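PS: a small sketch of two ways to write the result to a file, using the object and path from your message:

# flatten the htest list into a one-column data frame and save it
write.table(data.frame(unlist(data_file.out)),
            "~/DATA/results/data_file_out.txt")

# or simply save the printed output as plain text
capture.output(data_file.out, file = "~/DATA/results/data_file_out.txt")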
Re: [R] about "granova" library
Em Qui, 2008-10-16 às 21:11 +1030, Fernando Marmolejo Ramos escreveu: > Dear all > > Recently the "granova" package was launched. I installed but after when I > invoked it in R it requested for other libraries. They were downloaded and > install automatically. > > I tried to run the example syntax of “granova.1w” and “granova.2w” but two > things happened: i) either a file called “granova.rdb” wasn’t existent or ii) > the GUI clashed and R shut down. > > Has anyone else experience this? Do the developers have an answer for this > troubleshot? > > I’m using a Windows Vista system and I have the R version 2.7.2. > > Cheers, > > Fer
Fernando, I am using R version 2.7.2 and Ubuntu 8.04 on my computer: granova.1w runs fine; granova.2w does not run fine, because only 1 of the 2 graphical windows actually shows a plot (the rgl surface).
-- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Installing R in Linux: problems with JAVA packages (rJava, RWeka, ...) ?
Em Ter, 2008-10-21 às 15:36 +0100, Paulo Cortez escreveu: > Hi, > > While in MacOS it is quite simple to install R and Java packages, the > same is not true for Linux. I surfed the web and it seems that other > users also have similar problems. Perhaps a nice FAQ answer or HOWTO > would help... > > But here is my situation: I have 2 linux servers, one with Fedora 9 and > the other with CentOS5. > > I have installed R (2.7.2 version, with yum) and I tryed to install the > RWeka and rJava packages. > > After receiving an error, I installed jdk1.6 from java (file: > jdk-6u10-linux-i586-rpm.bin?AuthParam=1224512972_e5a9932e886a02f44dfdfe48aad02db8&TicketId=B%2Fw2nBuFSltLQRRFM1JblgDk&GroupName=CDS&FilePath=%2FESD5%2FJSCDL%2Fjdk%2F6u10%2Fjdk-6u10-linux-i586-rpm.bin&File=jdk-6u10-linux-i586-rpm.bin) > in both machines. (...)
Hi Paulo, I think you need to install both the JDK and the JRE to use rJava.
-- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
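PS: an untested sketch of the usual sequence once the JDK is on the machine (the javareconf step has to be run as root in a shell, not in R):

## first, in a shell as root:  R CMD javareconf
install.packages(c("rJava", "RWeka"), dependencies = TRUE)
library(rJava)
.jinit()   # initialises the JVM; an error here means Java is still not found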
Re: [R] pchisq error
On Mon, 2009-01-19 at 09:54 +0100, Jeremy Silver wrote: > Dear R experts, (...) > > pchisq(5.464342,1,lower.tail = FALSE) > > [1] 0.01940836 > > > reproduceError(5.464342) > > stat = 5.464342, p = 0.019408 > > > pchisq(5.464342,1,lower.tail = FALSE) > > [1] NaN > > Warning messages: > > 1: In pchisq(5.464342, 1, lower.tail = FALSE) : > >** NON-convergence in pgamma()'s pd_lower_cf() f= nan. > > 2: In pchisq(q, df, lower.tail, log.p) : NaNs produced > > (...) > > > sessionInfo() > > R version 2.8.0 (2008-10-20) > > i686-pc-linux-gnu > > locale: > > LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C > > attached base packages: > > [1] stats graphics grDevices utils datasets methods base
Hi Jeremy, on my computer your error does not occur. Look at this:

> pchisq(5.464342,1,lower.tail = FALSE)
[1] 0.01940836
> pchisq(5.464342,1,lower.tail = FALSE)
[1] 0.01940836
> sessionInfo()
R version 2.8.1 Patched (2009-01-16 r47630)
x86_64-unknown-linux-gnu
locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base

Well, have you already tried updating your R?
-- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] easiest way to integrate own functions on startup
On Tue, 2009-01-20 at 01:13 +0100, Jörg Groß wrote: > Hi, > > I am currently writing some own functions that I frequently need. > > So, it would be perfect if I could load these functions at the > beginning of each R-session with a small command. > > > I tried to generate a R-package and install it that way. > > But it seems that it is not so easy to add new functions to an > existing R-package. > So I am not so flexible by that. > > > Is there a way to just load an .R-file which contains the function- > definitions with a small command? > So that the functions are useable in the current session? > > > I tried load() but I get an error message...
Hi Jörg, why do you not use .First? Something like this:

.First <- function(){
  source("MyFunction1.txt")
  source("MyFunction2.txt")
  source("MyFunction3.txt")
  ...
  source("MyFunctionn.txt")
}

-- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
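PS: a minimal sketch of how this is usually wired up; the .First definition goes into your ~/.Rprofile so that it runs at the start of every session (the ~/Rfunctions directory is a hypothetical location for your .R files):

## in ~/.Rprofile
.First <- function() {
  files <- list.files("~/Rfunctions", pattern = "\\.R$", full.names = TRUE)
  for (f in files) source(f)
  cat("Loaded", length(files), "function files\n")
}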