date:20130503

[R] untar() error

2013-05-03 Thread Hakim Abdi

Dear List,

I have a list of 600+ *.gz files that I would like to extract and read the
geotiffs contained within them. I tried using the untar() function to
simplify this task but I am stumped by an error. I've combed the Internet
for a solution without luck. The details are below, and any help in solving
this matter is appreciated.

> files = list.files(path = "J:/GIMMS/NDVI", pattern = "data.tif.gz",
all.files = TRUE, full.names = TRUE, recursive = TRUE, ignore.case = TRUE,
include.dirs = TRUE)

> lapply(files, untar)
Error in rawToChar(block[seq_len(ns)]) :
  embedded nul in string: 'II*\0Ã <\001Â´
\0\0`G\0\0\fn\0\0Â¸â\0\0dÂ»\0\0\020Ã¢\0\0Â¼\b\001\0h/\001\0\024V\001\0Ã|\001\0lÂ£\001\0\030Ã\001\0ÃÃ°\001\0p\027\002\0\034>\002\0Ãd\002\0tâ¹\002\0
Â²\002\0ÃÃ\002\0xÃ¿\002\0$&\003\0ÃL\003\0|s\003'

> untar(files[1])
Error in rawToChar(block[seq_len(ns)]) :
  embedded nul in string: 'II*\0Ã <\001Â´
\0\0`G\0\0\fn\0\0Â¸â\0\0dÂ»\0\0\020Ã¢\0\0Â¼\b\001\0h/\001\0\024V\001\0Ã|\001\0lÂ£\001\0\030Ã\001\0ÃÃ°\001\0p\027\002\0\034>\002\0Ãd\002\0tâ¹\002\0
Â²\002\0ÃÃ\002\0xÃ¿\002\0$&\003\0ÃL\003\0|s\003'

> untar("J:/GIMMS/NDVI/1981/81aug15a.n07-VIg/81aug15a.n07-VIg_data.tif.gz")
Error in rawToChar(block[seq_len(ns)]) :
  embedded nul in string: 'II*\0Ã <\001Â´
\0\0`G\0\0\fn\0\0Â¸â\0\0dÂ»\0\0\020Ã¢\0\0Â¼\b\001\0h/\001\0\024V\001\0Ã|\001\0lÂ£\001\0\030Ã\001\0ÃÃ°\001\0p\027\002\0\034>\002\0Ãd\002\0tâ¹\002\0
Â²\002\0ÃÃ\002\0xÃ¿\002\0$&\003\0ÃL\003\0|s\003'

> traceback()
3: rawToChar(block[seq_len(ns)])
2: untar2(tarfile, files, list, exdir)
1: untar(files[1])

> sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base



___

Hakim Abdi
Doctoral Student

Physical Geography and Ecosystem Science
Lund University
Sölvegatan 12, 223 62 Lund, Sweden

Office: +46 (0) 46 2223132
Mobile: +46 (0) 73 9300116

Email: hakim.a...@nateko.lu.se

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Edmonton course: Regression, GLM & GAM with R intro

2013-05-03 Thread Highland Statistics Ltd




We would like to announce the following statistics course:
Data exploration, regression, GLM & GAM. With introduction to R

When: 26 - 30 August 2013.
Where: Edmonton, Canada

For details, see: http://www.highstat.com/statscourse.htm
Course flyer: http://www.highstat.com/Courses/Flyer2013_09Canada.pdf



Kind regards,

Alain Zuur


[R] significant test of two quadratic regression models (lm)

2013-05-03 Thread Elaine Kuo

Hello,

I am working with two quadratic regression models

y=ax^2+bx+c with the function of lm.

y1= observed migration distance of butterflies(y1=a1x^2+b1x+c1)

y2= predicted migration distance of butterflies (based on body mass)

(y2=a2x^2+b2x+c2)

x= body mass of butterflies


Now I would like to check whether the two regression models differ

by testing whether the coefficients (a, b, c) of the y1 and the y2 models differ

(null hypothesis: a1=a2 and b1=b2 and c1=c2)


Please kindly advise on any significance test in R for this purpose.

Also, please kindly advise how to apply the Bonferroni procedure in the test if
necessary.
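
One possible approach, sketched with simulated data since the thread gives none: stack the two responses, fit one model with a common quadratic and one where all three coefficients may differ between the groups, and compare the two fits with an F test. The variable names and simulated numbers are placeholders, and the test assumes independent errors with equal variance in both groups, which may not hold if the predicted distances are a deterministic function of body mass.

```r
set.seed(1)
x  <- runif(40, 0.1, 2)                                   # body mass (simulated)
y1 <- 5 + 3 * x - 1.0 * x^2 + rnorm(40, sd = 0.3)         # "observed" distance
y2 <- 5 + 3 * x - 0.5 * x^2 + rnorm(40, sd = 0.3)         # "predicted" distance

# Stack both responses and label which model each row belongs to.
d <- data.frame(y = c(y1, y2), x = c(x, x),
                model = factor(rep(c("observed", "predicted"), each = 40)))

reduced <- lm(y ~ x + I(x^2), data = d)            # a1=a2, b1=b2, c1=c2
full    <- lm(y ~ model * (x + I(x^2)), data = d)  # all three may differ
cmp     <- anova(reduced, full)                    # joint F test on 3 df
cmp

# Per-coefficient differences (intercept, linear, quadratic), Bonferroni-adjusted.
ct        <- summary(full)$coefficients
diff_rows <- grep("^modelpredicted", rownames(ct))
p_bonf    <- p.adjust(ct[diff_rows, "Pr(>|t|)"], method = "bonferroni")
p_bonf
```

The `anova()` comparison tests the three equalities jointly; the Bonferroni step only matters if the three differences are examined one at a time.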

Thank you in advance.


Elaine


Re: [R] Self-developed package -- installation

2013-05-03 Thread Uwe Ligges




On 03.05.2013 07:54, PIKAL Petr wrote:

Hi

Probably others can give you some better insight, but copying the package 
folder from one machine to another is possible until reinstallation is 
required by a new version of R (about every 3 years).


Reinstallation may be required more often, and we expect that packages 
need to be reinstalled at least if x or y are increased in a new R-x.y.z 
release. In rather rare cases this also happens for patch level updates.
There are examples where a reinstallation is not required that often, but 
that is not guaranteed.
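
A rough way to check this on a given machine, as a sketch only: compare the R version each installed package was built under with the running R version at the x.y level. The helper name `xy` is made up for this illustration.

```r
# Keep only the "x.y" part of a version string such as "2.15.2".
xy <- function(v) sub("^([0-9]+\\.[0-9]+).*$", "\\1", as.character(v))

built   <- installed.packages()[, "Built"]   # R version each package was built under
running <- xy(getRversion())

# Packages built under a different x.y series than the running R;
# these are candidates for reinstallation after copying or upgrading.
stale <- names(built)[xy(built) != running]
stale
```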


Best,
Uwe Ligges






Petr


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of Hui Du
Sent: Thursday, May 02, 2013 6:55 PM
To: r-help@r-project.org
Subject: [R] Self-developed package -- installation

Hi All,

I have a question about package installation in R. We have developed a
package, say 'ABC'. We have installed it on two machines, A and B, by
running 'Install Package(s) from local zip file'. Everything was fine.
Now suppose that the package got damaged on machine A and our zip
file is gone. My question is whether I may directly copy ../library/ABC
from machine B to machine A rather than running 'Install Package(s)
from local zip file' (I don't have that zip file anymore).


Thanks.

HXD




Re: [R] Likelihood

2013-05-03 Thread S Ellison

> I have run a regression and want to calculate the likelihood 
> of obtaining the sample.
> Is there a way in which I can use R to get this likelihood value?

See ?logLik

And see also ?help.search and ??. You would have found the above by typing 
??likelihood at the command line in R
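
A minimal sketch of what `logLik` returns, using the built-in `cars` data as a stand-in for the poster's regression:

```r
fit <- lm(dist ~ speed, data = cars)   # any fitted lm object would do
ll  <- logLik(fit)
ll                    # log-likelihood, printed with its degrees of freedom
exp(as.numeric(ll))   # the likelihood itself, usually a very small number
```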


S Ellison
 


[R] R does not subset

2013-05-03 Thread Katarzyna Kulma

Hi everyone,

I know there have been several requests regarding subsetting before, but
none of them really helps with my problem:

I'm trying to subset only infected individuals from the REC2 data.frame:

> str(REC2)
'data.frame':   362 obs. of  7 variables:
 $ RINGNO   : Factor w/ 370 levels "BL17546","BL17577",..: 78 81 67 41 58
66 17
 $ year : Factor w/ 8 levels "Y2002","Y2003",..: 1 2 1 2 1 1 2 1 1 3 ...
 $ ccFLEDGE : int  6 6 6 5 6 7 6 7 6 5 ...
 $ rec2012  : int  2 1 2 2 1 2 1 1 1 0 ...
 $ binage   : Factor w/ 2 levels "ad","juv": 1 2 1 1 1 1 1 1 1 1 ...
 $ INFECTION: Factor w/ 2 levels "Infected ","Uninfected ": 2 1 2 1 2 2 1 2
2 1 ...
 $ all.rsLD : num  -4.62 -6.19 -3.62 -4.19 -2.62 ...

using either

RECinf<-REC2[which (REC2$INFECTION=="Infected"),]

or

RECinf<-subset(REC2,  INFECTION=="Infected")

in both cases I get an empty data frame (0 observations):

> str(RECinf)
'data.frame':   0 obs. of  7 variables:
 $ RINGNO   : Factor w/ 370 levels "BL17546","BL17577",..:
 $ year : Factor w/ 8 levels "Y2002","Y2003",..:
 $ ccFLEDGE : int
 $ rec2012  : int
 $ binage   : Factor w/ 2 levels "ad","juv":
 $ INFECTION: Factor w/ 2 levels "Infected ","Uninfected ":
 $ all.rsLD : num

When subsetting, R doesn't return any warning or error message. Besides, I
have used the same code many times before and it worked perfectly well. Any ideas
why this case is different?

Thanks for your help,
Kasia


Re: [R] R does not subset

2013-05-03 Thread David Kulp

You have an extra space in the INFECTION factors.

Use REC2[REC2$INFECTION=="Infected ",]
or
subset(REC2, INFECTION=="Infected ")

No need to use which here.
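
If the trailing space is not wanted at all, it can be removed from the factor levels once. A sketch with a toy data frame standing in for REC2 (the `toy` object and its values are made up):

```r
# Toy stand-in for REC2; only the INFECTION column matters here.
toy <- data.frame(id = 1:4,
                  INFECTION = factor(c("Uninfected ", "Infected ",
                                       "Uninfected ", "Infected ")))

levels(toy$INFECTION)   # the level labels carry a trailing space

# Strip trailing whitespace from the labels; the data themselves are untouched.
levels(toy$INFECTION) <- sub("[[:space:]]+$", "", levels(toy$INFECTION))

infected <- toy[toy$INFECTION == "Infected", ]
nrow(infected)
```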


Re: [R] Size of a refClass instance

2013-05-03 Thread David Kulp

Good tip.  Thanks Morgan.
I agree that a different structure might (necessarily) be in order.  I wanted 
to create a tree where nodes in a tree were of different derived sub-classes -- 
possibly holding more data and behaving polymorphically.  OO programming seemed 
ideal for this: lots of small things with specialized behavior -- but this 
isn't R's strength.
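
For reference, the "few large objects" layout suggested in the quoted reply might look roughly like this. It is a sketch with a made-up five-node tree; the column names and the `kind` field standing in for a subclass are assumptions.

```r
# Edges of a small tree stored as two parallel columns instead of
# one object per node.
edges <- data.frame(from = c(1L, 1L, 2L, 2L),
                    to   = c(2L, 3L, 4L, 5L))

# Per-node data lives in a second table; 'kind' plays the role of the subclass.
nodes <- data.frame(id   = 1:5,
                    kind = c("root", "inner", "leaf", "leaf", "leaf"),
                    stringsAsFactors = FALSE)

# Named list mapping each parent id to the ids of its children.
children_of <- split(edges$to, edges$from)
children_of[["2"]]

# Vectorised lookup: the kinds of all children of node 2.
nodes$kind[match(children_of[["2"]], nodes$id)]
```

Behaviour that depends on the node type can then dispatch on `kind` for whole vectors of nodes at once rather than per object.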

On May 2, 2013, at 4:57 PM, Martin Morgan wrote:

> On 05/01/2013 11:20 AM, David Kulp wrote:
>> I'm using refClass for a complex multi-directional tree structure with
>> possibly 100,000s of nodes.  The refClass design is very impressive and I'd
>> love to use it, but I've found that the size of refClass instances are very
>> large and creation time is slow.  For example, below is a RefClass and normal
>> S4 class.  The RefClass requires about 4KB per instance vs 500B for the S4
>> class -- based on adding the Ncells and Vcells of used memory reported by
>> gc().  And instantiation is more than twice as slow for a RefClass.  (R
>> 2.14.2)
>> 
>> Anyone have thoughts on this and whether there's any hope for improving
>> resources on either front?
> 
> Hi David -- not necessarily helpful but creating a few large objects is 
> always better than creating many small in R, so perhaps re-conceptualize your 
> data structure? As a rough analogy, instead of constructing a graph as a 
> large number of 'Node' instances each pointing to one another, a graph could 
> be represented as a data.frame containing columns of 'from' and 'to' indexes 
> (neighbour-edge list, a few large objects) or as an adjacency matrix. One 
> would also implement creation and update of the few large objects in an 
> R-friendly (vectorized) way.
> 
> Perhaps there are existing packages that already model the data you're 
> interested in? If your multi-directional tree can be represented as a graph, 
> then perhaps
> 
>  http://bioconductor.org/packages/release/bioc/html/graph.html
> 
> including facilities in the Boost graph library (RBGL, on the Bioconductor 
> web site, too) or the igraph package can be put to use.
> 
> Martin
> 
>> 
>> I wonder what others are doing.  I've been thinking about lightweight
>> alternative implementations, but nothing particularly elegant has come to
>> mind, yet!
>> 
>> Thanks!
>> 
>> 
>> simple <- setRefClass('simple', fields = list(a = "character", b="numeric"))
>> gc()
>> system.time(simple.list <- lapply(1:10, function(i) {
>>   simple$new(a='foo',b=i)
>> }))
>> gc()
>>
>> setClass('simple2', representation(a="character",b="numeric"))
>> setMethod("initialize", "simple2", function(.Object, a, b) {
>>   .Object@a <- a
>>   .Object@b <- b
>>   .Object
>> })
>>
>> gc()
>> system.time(simple2.list <- lapply(1:10, function(i) {
>>   new('simple2',a='foo',b=i)
>> }))
>> gc()
>> 
>> 
> 
> 
> -- 
> Computational Biology / Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N.
> PO Box 19024 Seattle, WA 98109
> 
> Location: Arnold Building M1 B861
> Phone: (206) 667-2793


[R] cURL ?

2013-05-03 Thread jawad hussain

Dear Sir 
I tried to find cURL on the web but could not find a reliable file; there are some 
files on http://curl.haxx.se/, but I do not know which is suitable for R or 
how to install it.
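
For context, a sketch assuming the goal is still the download.file() call quoted below: method = "curl" runs a curl command-line program found on the PATH, which is separate from the RCurl package, so a first step is to check whether R can see such a program. The fallback to "auto" is an assumption made for illustration.

```r
fileUrl <- "https://data.baltimorecity.gov/api/views/dz54-2aru/rows.csv?accessType=DOWNLOAD"

curl_path <- Sys.which("curl")   # "" if no curl executable is on the PATH
method    <- if (nzchar(curl_path)) "curl" else "auto"
method

# Not run here, since it needs network access:
# download.file(fileUrl, destfile = "Cameras.csv", method = method)
```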
Kind Regards 

 
Jawad Hussain Ashraf 
VPO Aroop, Tehsil and District Gujranwala
Mobile phone# 03016673275


> Date: Sun, 28 Apr 2013 19:07:05 +0100
> From: rip...@stats.ox.ac.uk
> To: miyanja...@hotmail.com
> CC: r-help@r-project.org
> Subject: Re: [R] unsupported url scheme
> 
> On 28/04/2013 15:32, jawad hussain wrote:
> > fileUrl <- "https://data.baltimorecity.gov/api/views/dz54-2aru/rows.csv?accessType=DOWNLOAD"
> > download.file(fileUrl,destfile="./data/Cameras.csv",method="curl")
> >
> > I tried it after installing package "RCurl" but it gives an error message:
> >
> > Error in download.file(fileUrl, destfile = "Cameras.csv") :
> >   unsupported URL scheme
> >
> > Can you help me to solve this problem?
> > JAWAD HUSSAIN ASHRAF
> 
> 
> Yes, simply install a version of cURL which supports that scheme, then 
> re-install RCurl.
> 
>   
> > [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> 
> That does apply to you, too.  No HTML, tell us your sessionInfo() 
> 
> -- 
> Brian D. Ripley,  rip...@stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel:  +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UKFax:  +44 1865 272595

  

Re: [R] R does not subset

2013-05-03 Thread Luis Iván Ortiz Valencia

$ INFECTION: Factor w/ 2 levels "Infected ","Uninfected ": 2 1 2 1 2 2 1 2

it is a factor variable, so it takes numeric values, for "Infected "  it is
assigned value 1.

subset(REC2,  INFECTION==1)





-- 
Luis Iván Ortiz Valencia
Doutorando Saúde Pública - Epidemiologia, IESC, UFRJ
Estatístico Msc.
Spatial Analyst Msc.


Re: [R] R does not subset

2013-05-03 Thread Katarzyna Kulma

Hi Luis,

thanks for the suggestion, but still nothing:

> RECinf2<-subset(REC2,  INFECTION==1)
> head(RECinf2)
[1] RINGNO    year      ccFLEDGE  rec2012   binage    INFECTION all.rsLD
<0 rows> (or 0-length row.names)
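
A sketch of why this comparison comes back empty, using a toy data frame as a stand-in for REC2: `==` on a factor compares against the level labels, not the underlying integer codes, so the codes have to be extracted explicitly with `as.integer()`.

```r
toy <- data.frame(id = 1:4,
                  INFECTION = factor(c("Uninfected ", "Infected ",
                                       "Uninfected ", "Infected ")))

sum(toy$INFECTION == 1)               # compares the labels with "1"
sum(as.integer(toy$INFECTION) == 1)   # compares the integer codes

by_code <- subset(toy, as.integer(INFECTION) == 1)
by_code
```

Selecting by code depends on the order of the levels, so matching on the label is usually the safer choice.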

cheers,
Kasia


Katarzyna Kulma

PhD Student
Department of Ecology and Genetics
Institute of Ecology and Evolution/Animal Ecology
Uppsala University
Norbyvägen 18D
SE-752 36 Uppsala, Sweden

email: katarzyna.ku...@ebc.uu.se
Tel.+46 (0)18 471 2672
Fax.+46 18 471 6484



Re: [R] R does not subset

2013-05-03 Thread Jorge I Velez

Hi Kasia,

You need

subset(REC2,  INFECTION=="Infected ")

(note the space after "Infected").

HTH,
Jorge.-



Re: [R] R does not subset

2013-05-03 Thread Katarzyna Kulma

Jorge, thanks for your suggestions, but they give the same (empty) result:

> RECinf<-subset(REC2,  INFECTION=="Infected")
> head(RECinf)
[1] RINGNO    year      ccFLEDGE  rec2012   binage    INFECTION all.rsLD
<0 rows> (or 0-length row.names)

but David's suggestion worked! :

> RECinf<-REC2[REC2$INFECTION=="Infected ",]
> head(RECinf)
    RINGNO  year ccFLEDGE rec2012 binage INFECTION   all.rsLD
2  BX23298 Y2003        6       1    juv Infected  -6.1938776
4  BT53646 Y2003        5       2     ad Infected  -4.1938776
7  BT53248 Y2003        6       1     ad Infected  -2.1938776
11 BY75833 Y2004        5       0     ad Infected  -4.6574803
13 BX23067 Y2004        6       0     ad Infected  -3.6574803
17 BX24240 Y2004        6       0     ad Infected   0.3425197


still not sure why the subset() function didn't work, though.
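
In the transcript above, the subset() call uses "Infected" without the trailing space while the bracket call uses "Infected " with it. A sketch with a toy data frame (a stand-in for REC2) suggests the two forms select the same rows once the strings match:

```r
toy <- data.frame(id = 1:4,
                  INFECTION = factor(c("Uninfected ", "Infected ",
                                       "Uninfected ", "Infected ")))

via_bracket <- toy[toy$INFECTION == "Infected ", ]      # with trailing space
via_subset  <- subset(toy, INFECTION == "Infected ")    # with trailing space
no_space    <- subset(toy, INFECTION == "Infected")     # without it

rownames(via_bracket)
rownames(via_subset)
nrow(no_space)
```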

Thanks for your help!



Katarzyna Kulma

PhD Student
Department of Ecology and Genetics
Institute of Ecology and Evolution/Animal Ecology
Uppsala University
Norbyvägen 18D
SE-752 36 Uppsala, Sweden

email: katarzyna.ku...@ebc.uu.se
Tel.+46 (0)18 471 2672
Fax.+46 18 471 6484



Re: [R] untar() error

2013-05-03 Thread Prof Brian Ripley


On 03/05/2013 08:31, Hakim Abdi wrote:

Dear List,

I have a list of 600+ *.gz files that I would like to extract and read the
geotiffs contained within them. I tried using the untar() function to
simplify this task but I am stumped by an error. I've combed the Internet
for a solution without luck. The details are below, and any help in solving
this matter is appreciated.


Those are most likely not tar files.  What does file (the command-line 
program contained in Rtools) say they are?
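
The 'II*\0' at the start of the error text is the byte signature of a little-endian TIFF, which is consistent with each .gz holding a single compressed GeoTIFF rather than a tar archive. A minimal sketch of plain gzip decompression in base R under that assumption; `gunzip_base` is a hypothetical helper name, not an existing function.

```r
# Decompress one gzip file (not a tar archive) by streaming it through a
# gzfile() connection. By default the output path is the input path
# with the ".gz" suffix removed.
gunzip_base <- function(src,
                        dest = sub("\\.gz$", "", src, ignore.case = TRUE),
                        chunk = 1e6) {
  inn <- gzfile(src, open = "rb")
  on.exit(close(inn), add = TRUE)
  out <- file(dest, open = "wb")
  on.exit(close(out), add = TRUE)
  repeat {
    buf <- readBin(inn, what = "raw", n = chunk)   # read up to 'chunk' bytes
    if (length(buf) == 0L) break                   # end of compressed stream
    writeBin(buf, out)
  }
  invisible(dest)
}
```

With the `files` vector from the original post, `lapply(files, gunzip_base)` would write each .tif next to its .gz; the resulting GeoTIFFs can then be read with whatever raster reader is in use.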









--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595


Re: [R] R does not subset

2013-05-03 Thread Mihai Nica

Hi:

"(note the space after "Infected")"

Since I lost a morning too with this issue, I am just curious, why is there a 
space? 

I know, it must be a dumb question, a reasonable programming rule, but that's 
my level :-)
 
mike


>
> From: Jorge I Velez 
>To:Katarzyna Kulma  
>Cc: R mailing list  
>Sent: Friday, May 3, 2013 6:01 AM
>Subject: Re: [R] R does not subset
> 
>
>Hi Kasia,
>
>You need
>
>subset(REC2,  INFECTION=="Infected ")
>
>(note the space after "Infected").
>
>HTH,
>Jorge.-
>
>
>On Fri, May 3, 2013 at 7:48 PM, Katarzyna Kulma
>wrote:
>
>> Hi everyone,
>>
>> I know there have been several requests regarding subsetting before, but
>> none of them really helps with my problem:
>>
>> I'm trying to subset only infected individuals from the REC2 data.frame:
>>
>> > str(REC2)
>> 'data.frame':    362 obs. of  7 variables:
>>  $ RINGNO   : Factor w/ 370 levels "BL17546","BL17577",..: 78 81 67 41 58
>> 66 17
>>  $ year     : Factor w/ 8 levels "Y2002","Y2003",..: 1 2 1 2 1 1 2 1 1 3
>> ...
>>  $ ccFLEDGE : int  6 6 6 5 6 7 6 7 6 5 ...
>>  $ rec2012  : int  2 1 2 2 1 2 1 1 1 0 ...
>>  $ binage  : Factor w/ 2 levels "ad","juv": 1 2 1 1 1 1 1 1 1 1 ...
>>  $ INFECTION: Factor w/ 2 levels "Infected ","Uninfected ": 2 1 2 1 2 2 1 2
>> 2 1 ...
>>  $ all.rsLD : num  -4.62 -6.19 -3.62 -4.19 -2.62 ...
>>
>> using either
>>
>> RECinf<-REC2[which (REC2$INFECTION=="Infected"),]
>>
>> or
>>
>> RECinf<-subset(REC2,  INFECTION=="Infected")
>>
>> in both cases I get empty data frame (0 observations):
>>
>> > str(RECinf)
>> 'data.frame':    0 obs. of  7 variables:
>>  $ RINGNO   : Factor w/ 370 levels "BL17546","BL17577",..:
>>  $ year     : Factor w/ 8 levels "Y2002","Y2003",..:
>>  $ ccFLEDGE : int
>>  $ rec2012  : int
>>  $ binage  : Factor w/ 2 levels "ad","juv":
>>  $ INFECTION: Factor w/ 2 levels "Infected ","Uninfected ":
>>  $ all.rsLD : num
>>
>> When subsetting, R doesn't return any warning or error message. Besides, I
>> used the same code many times before and it worked perfectly well. Any ideas
>> why this case is different?
>>
>> Thanks for your help,
>> Kasia
>>

[R] Very basic statistics in R

2013-05-03 Thread Xavier Prudent

Dear all,

Very simple question, but apparently not easy to solve in R:

I have a sample of a variable x: (3, 4, 5, 2, ...)

I want to know:
 - the mean -> mean(x)
 - the uncertainty on the mean -> std.error(x)? Or sd(x)?
 - the standard deviation of x -> ?
 - the uncertainty on the standard deviation -> ?

Does anyone have an idea?

Thanks in advance,

regards,
Xavier



-- 
Xavier Prudent

Computational biology and evolutionary genomics

Guest scientist at the Max-Planck-Institut für Physik komplexer Systeme
(MPI-PKS)
Noethnitzer Str. 38
01187 Dresden

Max Planck-Institute for Molecular Cell Biology and Genetics
(MPI-CBG)
Pfotenhauerstraße 108
01307 Dresden

Phone: +49 351 210-2621
Mail: prudent [ at ] mpi-cbg.de


[R] Courses: Statistical Analysis with R - Bayesian Data Analysis with R and WinBUGS

2013-05-03 Thread Dr. Pablo E. Verde


Dear list members,

Apologies for cross-posting. Please find below information on
two statistical courses with R:

1) Statistical Analysis with R
2) Bayesian Data Analysis with R and WinBUGS

If you have any questions, don't hesitate to contact me.

Best regards,

Pablo

++
*Two-day course in: Statistical Analysis with R
*Where:  Linux Hotel, Essen-Horst, Germany
*When:
14.06-15.06.2013
22.11-23.11.2013
13.12-14.12.2013
*Instructor:
Dr. Pablo E. Verde
++
*Target audience:
Data analysts with basic knowledge of statistics will benefit from this
course.

The course is intended as a first course in R but not as a first course in
statistics or data analysis.
++
*Course content:
Day 1:
*Introduction to statistical analysis with R
*Classical graphical functions (scatter plots, conditional plots,
histograms, etc.)
*Data management with R (indexing and other advanced techniques)
*Advanced graphical techniques for data analysis: lattice plots and
ggplot2

Day 2:
*Statistical analysis based on computer simulation (bootstrap methods)
*Regression modeling (linear/non-linear/logistic regression)
*Issues in regression modeling (variable selection, model checking,
etc.)

*Prices:
Public sector and commercial: 737.8 Euro (two-day course, VAT included)
Student: 450 Euro (two-day course, VAT included). Some of the courses are
frequently fully booked, so please note that you may have to try several
times before you get a place.
++

++
*Three-day course in: Bayesian Data Analysis with R and WinBUGS
*Where: Linux Hotel, Essen-Horst, Germany
*When:
11.07-13.07.2013
07.11-09.11.2013
*Instructor:
Dr. Pablo E. Verde
++
*Target audience:
This course is for data analysts who are familiar with classical statistics
and want to gain a working knowledge of Bayesian analysis. This is a 3-day
intensive training course with 8 hours per day including lectures and
exercises. The course presentation is practical with many worked examples.
To attend the course you do NOT need experience with R or with WinBUGS.
Lectures are given in English. Discussions can be in English, German or
Spanish.
++
*Course content:
Day 1
*Lecture 1: Introduction to Bayesian Inference
*Lecture 2: Bayesian analysis for single parameter models
*Lecture 3: Prior distributions: univariate

Day 2
*Lecture 4: Bayesian analysis for multiple parameter models
*Lecture 5: An introduction to WinBUGS
*Lecture 6: Multivariate models with WinBUGS

Day 3
*Lecture 7: An introduction to MCMC computations
*Lecture 8: Bayesian regression with WinBUGS
*Lecture 9: Introduction to Hierarchical Statistical modeling

*Prices:
Public sector and commercial: 1088.85 Euro (three-day course, VAT included)
Student: 675 Euro (three-day course, VAT included). Some of the courses are
frequently fully booked, so please note that you may have to try several
times before you get a place.

++
**For more information, please contact:  i...@linuxhotel.de
++


Re: [R] cURL ?

2013-05-03 Thread Jeff Newmiller

If you don't know, we certainly don't. This is not a question about R or RCurl 
anymore... it is a question about cURL. You need to know what operating system 
your computer uses and how to enable SSL for cURL on that operating system... 
perhaps you need local technical assistance.
---
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:                                  Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

jawad hussain  wrote:

>Dear Sir,
>I tried to find cURL on the web but could not find a reliable file; there
>are some files on http://curl.haxx.se/, but I do not know which one is
>suitable for R or how to install it.
>Kind Regards
>
>
>Jawad Hussain Ashraf
>VPO Aroop, Tehsil and District Gujranwala
>Mobile phone# 03016673275
>
>
>> Date: Sun, 28 Apr 2013 19:07:05 +0100
>> From: rip...@stats.ox.ac.uk
>> To: miyanja...@hotmail.com
>> CC: r-help@r-project.org
>> Subject: Re: [R] unsupported url scheme
>> 
>> On 28/04/2013 15:32, jawad hussain wrote:
>> > fileUrl <- "https://data.baltimorecity.gov/api/views/dz54-2aru/rows.csv?accessType=DOWNLOAD"
>> > download.file(fileUrl, destfile="./data/Cameras.csv", method="curl")
>> >
>> > I tried it after installing package "RCurl" but it gives an error message:
>> > Error in download.file(fileUrl, destfile = "Cameras.csv") :
>> >   unsupported URL scheme
>> > Can you help me to solve this problem?
>> > JAWAD HUSSAIN ASHRAF
>>
>>
>> Yes, simply install a version of cURL which supports that scheme, then
>> re-install RCurl.
>> 
>>  
>> >[[alternative HTML version deleted]]
>> >
>> > __
>> > R-help@r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>> 
>> That does apply to you, too.  No HTML, tell us your sessionInfo()
>
>> 
>> -- 
>> Brian D. Ripley,                  rip...@stats.ox.ac.uk
>> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
>> University of Oxford,             Tel:  +44 1865 272861 (self)
>> 1 South Parks Road,                     +44 1865 272866 (PA)
>> Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Re: [R] R does not subset

2013-05-03 Thread Jeff Newmiller

This typically occurs because of sloppy manual data entry outside of R. To 
relieve further analysis pain, you can manually clean the data (usually only 
effective for one-time analyses) or use R to fix problems right after loading 
the data (there are multiple methods for doing this... I prefer using ?sub on 
character data before creating the factor).
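
A minimal sketch of that clean-up, using a small made-up data frame in place of REC2 (the values are illustrative, not Kasia's data). It strips trailing whitespace with sub() on the character values and only then builds the factor, so the plain "Infected" comparison matches:

```r
# Toy stand-in for REC2; the trailing spaces mimic the imported data
REC2 <- data.frame(
  RINGNO    = c("BL17546", "BL17577", "BL17580"),
  INFECTION = c("Infected ", "Uninfected ", "Infected "),
  stringsAsFactors = FALSE
)

# Strip trailing whitespace from the character data, then create the factor
REC2$INFECTION <- factor(sub("[[:space:]]+$", "", REC2$INFECTION))

levels(REC2$INFECTION)                     # levels no longer end in a space
RECinf <- subset(REC2, INFECTION == "Infected")
nrow(RECinf)
```

If the column is already a factor, the same sub() call can be applied to levels(REC2$INFECTION) instead.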

Mihai Nica  wrote:

>Hi:
>
>"(note the space after "Infected")"
>
>Since I lost a morning too with this issue, I am just curious, why is
>there a space?
>
>I know, it must be a dumb question, a reasonable programming rule, but
>that's my level :-)
>
>mike
>

Re: [R] Very basic statistics in R

2013-05-03 Thread S Ellison

 

>  - the mean-> mean(x)
>  - the uncertainty on the mean -> std.error(x) ? Or sd(x)?
>  - the standard deviation of x  -> ?
>  - the uncertainty on the standard deviation -> ?
> 
> Anyone has an idea?

1. Use R's help system to look up 'standard deviation' and 'mean'
e.g.:
??'standard deviation' 
??'mean'

For the other two questions, consult your basic stats textbook; the answers can 
be calculated from the two above together with the number of observations.
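
For readers who want the arithmetic spelled out, a sketch in base R follows. The sample values are made up. The standard error of the mean is sd/sqrt(n); the last quantity uses the common approximation sd/sqrt(2(n-1)) for the standard error of the standard deviation, which assumes roughly normal data and is an assumption added here, not something stated in the thread.

```r
x <- c(3, 4, 5, 2)               # illustrative sample
n <- length(x)

m    <- mean(x)                  # sample mean
s    <- sd(x)                    # sample standard deviation
se_m <- s / sqrt(n)              # standard error of the mean
se_s <- s / sqrt(2 * (n - 1))    # approximate standard error of sd (normal data)

c(mean = m, se.mean = se_m, sd = s, se.sd = se_s)
```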


[R] print multiple plots to jpeg, one lattice and one ggplot2

2013-05-03 Thread Christophe Bouffioux

hello everybody,

I want to print two plots in one png file. I tried several options but
did not succeed: the first plot (bwplot) prints at the defined position,
but the second (ggplot) doesn't.
Any ideas?
Thanks a lot
Christophe


#   Example:
#-

library(ggplot2)
library(lattice)
library(grid)

one <- bwplot(decrease ~ treatment, OrchardSprays, groups = rowpos,
              panel = "panel.superpose",
              panel.groups = "panel.linejoin",
              xlab = "treatment",
              key = list(lines = Rows(trellis.par.get("superpose.line"),
                                      c(1:7, 1)),
                         text = list(lab = as.character(unique(OrchardSprays$rowpos))),
                         columns = 4, title = "Row position"))


df <- data.frame(gp = factor(rep(letters[1:3], each = 10)),
 y = rnorm(30))
# Compute sample mean and standard deviation in each group
library(plyr)
ds <- ddply(df, .(gp), summarise, mean = mean(y), sd = sd(y))

two <- ggplot(df, aes(x = gp, y = y)) +
 geom_point() +
 geom_point(data = ds, aes(y = mean),
  colour = 'red', size = 3)



# 1. not working
jpeg(file=paste(pathgraph,'/fig03_profiltot','.png',sep=''),width = 600,
height = 400, units="px", res=100)
print(one, position=c(0,0,0.5,1), more=TRUE)
print(two, position=c(0.5,0,1,1), )
dev.off()


# 2 not working
jpeg(file=paste(pathgraph,'/fig03_profiltot','.png',sep=''),width = 600,
height = 400, units="px", res=100)
 grid.newpage()
 pushViewport(viewport(layout = grid.layout(1, 2)))

  print(one, vp = viewport(layout.pos.row = 1, layout.pos.col = 1))
# this does not work
  print(two, vp = viewport(layout.pos.row = 1, layout.pos.col = 2))
dev.off()



# 3 not working
jpeg(file=paste(pathgraph,'/fig03_profiltot','.png',sep=''),width = 600,
height = 400, units="px", res=100)
 par(mfrow=c(1,2))
  one
  two
dev.off()


Re: [R] Calculating distance matrix for large dataset

2013-05-03 Thread David Carlson

Here's the result on R 3.0.0 64 bit under Windows 8:

> A<-matrix(1:365000*144,nrow=365000,ncol=144)
> dim(A)
[1] 365000    144
> d <- dist(mydata_nor, method = "euclidean")
Error in as.matrix(x) : object 'mydata_nor' not found
> d <- dist(A, method = "euclidean")
Error: cannot allocate vector of size 496.3 Gb
In addition: Warning messages:
1: In dist(A, method = "euclidean") :
  Reached total allocation of 8078Mb: see help(memory.size)
2: In dist(A, method = "euclidean") :
  Reached total allocation of 8078Mb: see help(memory.size)
3: In dist(A, method = "euclidean") :
  Reached total allocation of 8078Mb: see help(memory.size)
4: In dist(A, method = "euclidean") :
  Reached total allocation of 8078Mb: see help(memory.size)

Your message suggests that your system could not accurately compute the
requirements. Unless you have access to a computer with 500 gigabytes, you
need to consider alternate approaches such as aggregating the data into
longer time blocks or using kmeans.
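
The size of the request can be checked by hand: a "dist" object on n rows stores n(n-1)/2 doubles of 8 bytes each. The sketch below (dist_size is a hypothetical helper name) reproduces the 496.3 Gb figure from the error above and shows that the element count exceeds the largest 32-bit integer, which is presumably why an older R reported a negative length instead.

```r
# Number of doubles and memory needed for a "dist" object on n rows
dist_size <- function(n) {
  n <- as.numeric(n)                  # work in doubles to avoid integer overflow
  elements <- n * (n - 1) / 2         # lower triangle, no diagonal
  c(elements = elements, GiB = elements * 8 / 2^30)
}

dist_size(365000)
dist_size(365000)[["elements"]] > .Machine$integer.max
```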

-
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of HJ YAN
Sent: Thursday, May 2, 2013 6:02 PM
To: r-help@r-project.org
Subject: [R] Calculating distance matrix for large dataset

Dear R users


I wondered whether any of you have ever tried to calculate a distance matrix
for a very large data set, and whether anyone can confirm that the error
message I got actually means that my data is too large for this task:

negative length vectors are not allowed


My data size and the code used:

> dim(mydata_nor)
[1] 365000    144
> d <- dist(mydata_nor, method = "euclidean")



My data has 1000 samples, each with one year of data observed daily at
10-minute intervals (144 values per day), so the size is (365 * 1000) * 144.


I checked the manual for the function 'dist' but cannot see an upper limit on
the size allowed, and I bet there should be one, so any hints are appreciated.


I would also be grateful for advice on any other method for calculating a
distance matrix for a large data set.



I appreciate that reproducible code should be provided, so try the code
below if needed:

> A<-matrix(1:365000*144,nrow=365000,ncol=144)
> dim(A)
[1] 365000    144
> d1<-dist(A,method="euclidean")
Error in dist(A, method = "euclidean") :
  negative length vectors are not allowed




Many thanks in advance!

HJ


Re: [R] Very basic statistics in R

2013-05-03 Thread Jeff Newmiller

I recommend you read the Introduction to R document that comes with R. Look for 
making vectors with the c() function, and using the mean() and sd() functions. 

Note that this is not a homework help forum (read the Posting Guide mentioned 
at the bottom of every message). If this is not homework, you are going to need 
to do quite a bit of self study before you can ask questions clearly enough to 
get useful responses on this list. See

http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example

Xavier Prudent  wrote:

>Dear all,
>
>Very simple question, but apparently not easy to solve in R:
>
>I have a sample of a variable x: (3, 4, 5, 2, ...)
>
>I want to know:
> - the mean -> mean(x)
> - the uncertainty on the mean -> std.error(x)? Or sd(x)?
> - the standard deviation of x -> ?
> - the uncertainty on the standard deviation -> ?
>
>Does anyone have an idea?
>
>Thanks in advance,
>
>regards,
>Xavier


Re: [R] untar() error

2013-05-03 Thread Jeff Newmiller

untar != gunzip
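
To spell that out: a .gz file is a single gzip-compressed file, not a tar archive, so it needs decompressing rather than untarring. A minimal base-R sketch follows; gunzip_file is a hypothetical helper name (not an existing function) and the chunk size is arbitrary. It reads through a gzfile() connection, which decompresses transparently, and writes the raw bytes next to the source file.

```r
# Decompress one .gz file; returns the path of the decompressed file
gunzip_file <- function(src, dest = sub("\\.gz$", "", src), chunk = 1e6) {
  inp <- gzfile(src, open = "rb")      # decompresses while reading
  on.exit(close(inp), add = TRUE)
  out <- file(dest, open = "wb")
  on.exit(close(out), add = TRUE)
  repeat {
    buf <- readBin(inp, what = "raw", n = chunk)
    if (length(buf) == 0L) break       # end of input
    writeBin(buf, out)
  }
  invisible(dest)
}

# Applied to the vector of paths from list.files(), this would be:
# tifs <- vapply(files, gunzip_file, character(1))
```

The resulting .tif files can then be opened with whichever GeoTIFF reader is in use.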

Prof Brian Ripley  wrote:

>On 03/05/2013 08:31, Hakim Abdi wrote:
>> Dear List,
>>
>> I have a list of 600+ *.gz files that I would like to extract and read the
>> geotiffs contained within them. I tried using the untar() function to
>> simplify this task but I am stumped by an error. I've combed the Internet
>> for a solution without luck. The details are below, and any help in solving
>> this matter is appreciated.
>
>Those are most likely not tar files.  What does file (the command-line 
>program contained in Rtools) say they are?
>

Re: [R] Size of a refClass instance

2013-05-03 Thread Jeff Newmiller

Interesting conclusion. Alternatively, that representation of your object model 
may not be computationally effective. This discrepancy may be less exaggerated 
in C++, but you may still find that large numbers of objects are less efficient 
in their use of memory or cpu time than vector processing even there. I would 
read the point of Martin's response as "Don't confuse your mental model of the 
solution with its implementation".
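
As an illustration of the vectorized alternative Martin describes in the quoted message, here is a sketch of a tiny tree held in a few vectors rather than one object per node. The five-node tree and the node_type labels are made up for the example.

```r
# Hypothetical five-node tree stored as two columns instead of five objects
edges <- data.frame(from = c(1L, 1L, 2L, 2L), to = c(2L, 3L, 4L, 5L))

# Per-node data lives in parallel vectors indexed by node id
node_type <- c("root", "inner", "leaf", "leaf", "leaf")

# Children of every node, computed in one vectorized call
children <- split(edges$to, edges$from)
children[["2"]]

# Parent lookup as a plain integer vector (NA for the root)
parent <- rep(NA_integer_, length(node_type))
parent[edges$to] <- edges$from
parent
```

Type-specific behaviour can then be dispatched on node_type over whole subsets of nodes at once, rather than through one method call per object.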

David Kulp  wrote:

>Good tip.  Thanks Morgan.
>I agree that a different structure might (necessarily) be in order.  I
>wanted to create a tree where nodes in a tree were of different derived
>sub-classes -- possibly holding more data and behaving polymorphically.
>OO programming seemed ideal for this: lots of small things with
>specialized behavior -- but this isn't R's strength.
>
>On May 2, 2013, at 4:57 PM, Martin Morgan wrote:
>
>> On 05/01/2013 11:20 AM, David Kulp wrote:
>>> I'm using refClass for a complex multi-directional tree structure with
>>> possibly 100,000s of nodes.  The refClass design is very impressive and I'd
>>> love to use it, but I've found that the size of refClass instances are very
>>> large and creation time is slow.  For example, below is a RefClass and normal
>>> S4 class.  The RefClass requires about 4KB per instance vs 500B for the S4
>>> class -- based on adding the Ncells and Vcells of used memory reported by
>>> gc().  And instantiation is more than twice as slow for a RefClass.  (R
>>> 2.14.2)
>>>
>>> Anyone have thoughts on this and whether there's any hope for improving
>>> resources on either front?
>> 
>> Hi David -- not necessarily helpful but creating a few large objects is
>> always better than creating many small in R, so perhaps re-conceptualize
>> your data structure? As a rough analogy, instead of constructing a graph
>> as a large number of 'Node' instances each pointing to one another, a
>> graph could be represented as a data.frame containing columns of 'from'
>> and 'to' indexes (neighbour-edge list, a few large objects) or as an
>> adjacency matrix. One would also implement creation and update of the few
>> large objects in an R-friendly (vectorized) way.
>>
>> Perhaps there are existing packages that already model the data you're
>> interested in? If your multi-directional tree can be represented as a
>> graph, then perhaps
>>
>>  http://bioconductor.org/packages/release/bioc/html/graph.html
>>
>> including facilities in the Boost graph library (RBGL, on the Bioconductor
>> web site, too) or the igraph package can be put to use.
>> 
>> Martin
>> 
>>> 
>>> I wonder what others are doing.  I've been thinking about lightweight
>>> alternative implementations, but nothing particularly elegant has come to
>>> mind, yet!
>>> 
>>> Thanks!
>>> 
>>> 
>>> simple <- setRefClass('simple',
>>>                       fields = list(a = "character", b = "numeric"))
>>> gc()
>>> system.time(simple.list <- lapply(1:10, function(i) {
>>>   simple$new(a='foo', b=i)
>>> }))
>>> gc()
>>>
>>> setClass('simple2', representation(a="character", b="numeric"))
>>> setMethod("initialize", "simple2", function(.Object, a, b) {
>>>   .Object@a <- a
>>>   .Object@b <- b
>>>   .Object
>>> })
>>>
>>> gc()
>>> system.time(simple2.list <- lapply(1:10, function(i) {
>>>   new('simple2', a='foo', b=i)
>>> }))
>>> gc()
>>> 
>>> 
>> 
>> 
>> -- 
>> Computational Biology / Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N.
>> PO Box 19024 Seattle, WA 98109
>> 
>> Location: Arnold Building M1 B861
>> Phone: (206) 667-2793
>

Re: [R] Write date class as number of days from 1970

2013-05-03 Thread arun

Hi,
Maybe this helps:
set.seed(24)
dat1<- 
data.frame(date1=sample(seq(as.Date("2012-09-14",format="%Y-%m-%d"),length.out=40,by="day"),20,replace=FALSE),
 value=sample(1:60,20,replace=TRUE))
dat1$days1<- as.numeric(difftime(dat1$date1,as.Date("1970-01-01")))
#or
library(lubridate) 
dat1$days2<- days(dat1$date1)$day
head(dat1)
#       date1 value days1 days2
#1 2012-09-25     6 15608 15608
#2 2012-09-22    34 15605 15605
#3 2012-10-10    44 15623 15623
#4 2012-10-03     9 15616 15616
#5 2012-10-07    14 15620 15620
#6 2012-10-16    42 15629 15629
#or
library(chron)
as.numeric(as.chron(dat1$date1)-chron(0))
 #[1] 15608 15605 15623 15616 15620 15629 15606 15622 15631 15604 15615 15607
#[13] 15626 15624 15635 15619 15601 15598 15636 15599
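
A shorter base-R route, sketched here with a small made-up data frame: a Date is stored internally as the number of days since 1970-01-01, so as.numeric() on the column gives the day count directly and the column can be converted just before writing. The file name comes from tempfile() and is only for illustration.

```r
dat <- data.frame(date1 = as.Date(c("1970-01-01", "2012-09-25")),
                  value = c(1, 6))

# Replace the Date column by its day count since 1970-01-01
dat$date1 <- as.numeric(dat$date1)
dat$date1

# Write the converted data; the dates now appear as plain numbers
out <- tempfile(fileext = ".csv")
write.csv(dat, out, row.names = FALSE)
readLines(out)
```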


A.K.

>Dear all,
>
>I have a dataset with one column being of class Date. When I write the
>output, I would like that column being written as number of days from
>1970-01-01. I could not find anywhere a way to do it.
>
>Thanks,
>Marco


[R] read .csv file and plot a graph

2013-05-03 Thread Vahe nr

Hi all,

I have a big .csv file (21Mb with 100 rows) it has this shape:
x
1 NaN
2 NaN
3 0.23

and so on.

So the first column has x as a header followed by the row number, and the
second column contains values between -1 and 1, with NaN for empty values.

What I need to do is create a new .csv file from this one, excluding the
NaN values, and plot a line graph using the new .csv file.

Or can I use the old .csv file to plot a graph excluding the NaN values?

Thanks in advance for any help or suggestions.

Regards,
 Vahe


[R] Write date class as number of days from 1970

2013-05-03 Thread Manta

Dear all,

I have a dataset with one column being of class Date. When I write the
output, I would like that column being written as number of days from
1970-01-01. I could not find anywhere a way to do it.

Thanks,
Marco




Re: [R] Problems with reading data by readWorksheetFromFile of XLConnect Package

2013-05-03 Thread Anthony Damico

sorry, i had assumed readWorksheetFromFile would give you back a data
frame.  all of the operations i recommended work on data.frame objects

at different points in the code, check if it's a data.frame or a matrix..

class( temp )

..you can check its current class at any point.


and if it's a matrix, you can convert it to a data frame with

temp <- as.data.frame( temp )
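
A small sketch of why the object changes class, and of one way to keep it a data frame throughout. The two-column data frame is a hypothetical stand-in for what readWorksheetFromFile() returned.

```r
# Hypothetical stand-in for the worksheet that was read in.
temp <- data.frame(Col1 = c("647,853", "38,604"),
                   Col2 = c("1,413", "-"),
                   stringsAsFactors = FALSE)

# sapply() over a data frame simplifies to a matrix, and a second sapply()
# over that matrix walks its cells one by one and returns a flat vector.
class(sapply(temp, function(x) gsub(",", "", x)))

# Assigning into temp[] keeps the data frame and its column names.
temp[] <- lapply(temp, function(x) gsub(",", "", x))
temp[] <- lapply(temp, function(x) suppressWarnings(as.numeric(x)))
temp[is.na(temp)] <- 0   # optional: treat "-" as zero
temp
```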







On Fri, May 3, 2013 at 2:00 AM, jpm miao  wrote:

> Hi Anthony,
>
>Thank you very much. It works very well. However, after this line
>
> > temp <- sapply( temp , as.numeric )
>
>the data becomes a series of numbers instead of a matrix. Is there any
> way to keep it a matrix?
>
>Thanks,
>
> Miao
>
>
>
>
> > temp<-readWorksheetFromFile("130502temp.xlsx", sheet=1, header=FALSE,
> startRow=2, endRow= 11, startCol=2, endCol=5)
> > temp <- sapply( temp , function( x ) gsub( ',' , '' , x ) )
> > temp
>       Col1     Col2   Col3    Col4
>  [1,] "647853" "1413" "57662" "27897"
>  [2,] "491400" "1365" "40919" "20411"
>  [3,] "38604"  "-"    "5505"  "985"
>  [4,] "576"    "-"    "20"    "54"
>  [5,] "80845"  "21"   "10211" "4494"
>  [6,] "36428"  "27"   "1007"  "1953"
>  [7,] "269915" "587"  "32988" "12779"
>  [8,] "224494" "-"    "30554" "9184"
>  [9,] "11858"  "587"  "-"     "686"
> [10,] "3742"   "-"    "81"    "415"
>  > temp <- sapply( temp , as.numeric )
> Warning messages:
> 1: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
> 2: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
> 3: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
> 4: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
> 5: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
> > temp
> 647853 491400  38604    576  80845  36428 269915
> 647853 491400  38604    576  80845  36428 269915
> 224494  11858   3742   1413   1365      -      -
> 224494  11858   3742   1413   1365     NA     NA
>     21     27    587      -    587      -  57662
>     21     27    587     NA    587     NA  57662
>  40919   5505     20  10211   1007  32988  30554
>  40919   5505     20  10211   1007  32988  30554
>      -     81  27897  20411    985     54   4494
>     NA     81  27897  20411    985     54   4494
>   1953  12779   9184    686    415
>   1953  12779   9184    686    415
> > temp[ is.na( temp ) ] <- 0
> > temp
> 647853 491400  38604    576  80845  36428 269915
> 647853 491400  38604    576  80845  36428 269915
> 224494  11858   3742   1413   1365      -      -
> 224494  11858   3742   1413   1365      0      0
>     21     27    587      -    587      -  57662
>     21     27    587      0    587      0  57662
>  40919   5505     20  10211   1007  32988  30554
>  40919   5505     20  10211   1007  32988  30554
>      -     81  27897  20411    985     54   4494
>      0     81  27897  20411    985     54   4494
>   1953  12779   9184    686    415
>   1953  12779   9184    686    415
>
>
> 2013/5/2 Anthony Damico 
>
>> try adding colTypes = 'numeric' to your readWorkSheetFromFile() call
>>
>>
>>
>> if that doesn't work, try a few other steps
>>
>>
>> # view what data types your file is being read in as
>> sapply( temp , class )
>>
>>
>> # convert all fields to character if they're factor variables.. but i
>> don't think you need this, readWorksheet defaults to `character`
>> temp <- sapply( temp , as.character )
>>
>>
>> # you can also convert a subset like this
>> temp[ , c( 1 , 3:4 ) ] <- sapply( temp[ , c( 1 , 3:4 ) ] , as.character )
>>
>>
>>
>> # remove commas from character strings
>> temp <- sapply( temp , function( x ) gsub( ',' , '' , x ) )
>>
>> # convert all fields to numeric
>> temp <- sapply( temp , as.numeric )
>>
>> # convert all NA fields to zeroes if you prefer
>> temp[ is.na( temp ) ] <- 0
>>
>>
>>
>>
>>
>> On Wed, May 1, 2013 at 11:55 PM, jpm miao  wrote:
>>
>>> Hi,
>>>
>>>Attached are two datasheet to be read.
>>>My raw data "130502temp.xlsx" contains numbers with ' symbols, and
>>> they
>>> can't be read as numbers. Even if I copy and paste as numbers to form a
>>> new
>>> file "130502temp_number1.xlsx", they could not be read smoothly.
>>>
>>>1. How can I read the datasheet as numbers?
>>>2. How can I treat the notation "-" as (1) "NA" or (2) zero?
>>>
>>>Thanks,
>>>
>>> Miao
>>>
>>>
>>>
>>>
>>> > temp<-readWorksheetFromFile("130502temp.xlsx", sheet=1, header=FALSE,
>>> startRow=2, endRow= 11, startCol=2, endCol=5)
>>>
>>> > temp
>>>
>>>       Col1  Col2   Col3   Col4
>>> 1  647,853 1,413 57,662 27,897
>>> 2  491,400 1,365 40,919 20,411
>>> 3   38,604     -  5,505    985
>>> 4      576     -     20     54
>>> 5   80,845    21 10,211  4,494
>>> 6   36,428    27  1,007  1,953
>>> 7  269,915   587 32,988 12,779
>>> 8  224,494     - 30,554  9,184
>>> 9   11,858   587      -    686
>>> 10   3,742     -     81    415
>>>
>>> > temp[2,2]
>>>
>>> [1] "1,365"
>>>
>>> > temp[2,2]+3
>>>
>>> Error in temp[2, 2] + 3 : non-numeric argument to bi

Re: [R] print multiple plots to jpeg, one lattice and one ggplot2

2013-05-03 Thread Felipe Carrillo

Something like this?
library(gridExtra)
grid.arrange(one,two)
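
A slightly fuller sketch of the same idea, written to a file. It assumes lattice, ggplot2 and gridExtra are installed; the two plots are simplified stand-ins for `one` and `two`, and the temporary output file is hypothetical. Note that the attempts quoted below open a jpeg() device while naming the file .png, so png() is used here to match the extension.

```r
# Only draw when the required packages and a PNG device are available.
have_pkgs <- all(vapply(c("lattice", "ggplot2", "gridExtra"),
                        requireNamespace, logical(1), quietly = TRUE))
can_draw <- have_pkgs && capabilities("png")[[1]]

if (can_draw) {
  # Simplified stand-ins for the lattice and ggplot2 plots in the question.
  one <- lattice::bwplot(decrease ~ treatment, data = OrchardSprays)
  two <- ggplot2::ggplot(OrchardSprays,
                         ggplot2::aes(x = treatment, y = decrease)) +
    ggplot2::geom_point()

  out_file <- tempfile(fileext = ".png")
  png(out_file, width = 600, height = 400)
  # Both plot types are grid-based, so one call can place them side by side.
  gridExtra::grid.arrange(one, two, ncol = 2)
  dev.off()
}
```

par(mfrow = ...) only affects base graphics, which is why the third attempt in the question has no effect on lattice or ggplot2 output.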

Felipe D. Carrillo
Supervisory Fishery Biologist
Department of the Interior
US Fish & Wildlife Service
California, USA
http://www.fws.gov/redbluff/rbdd_jsmp.aspx



>
>From: Christophe Bouffioux 
>To: "r-help@r-project.org"  
>Sent: Friday, May 3, 2013 6:33 AM
>Subject: [R] print multiple plots to jpeg, one lattice and one ggplot2
>
>
>hello everybody,
>
>I want to print two plots in one png file, I tried several options but i
>didn't succeed
>the first plot (bwplot) print to the defined position, but the second
>(ggplot) doesn't
>Any idea?
>Thanks a lot
>Christophe
>
>
>#  Example:
>#-
>
>library(ggplot2)
>library(lattice)
>library(grid)
>
>one <- bwplot(decrease ~ treatment, OrchardSprays, groups = rowpos,
>      panel = "panel.superpose",
>      panel.groups = "panel.linejoin",
>      xlab = "treatment",
>      key = list(lines = Rows(trellis.par.get("superpose.line"),
>                  c(1:7, 1)),
>                  text = list(lab =
>as.character(unique(OrchardSprays$rowpos))),
>                  columns = 4, title = "Row position"))
>
>
>df <- data.frame(gp = factor(rep(letters[1:3], each = 10)),
>                y = rnorm(30))
># Compute sample mean and standard deviation in each group
>library(plyr)
>ds <- ddply(df, .(gp), summarise, mean = mean(y), sd = sd(y))
>
>two <- ggplot(df, aes(x = gp, y = y)) +
>    geom_point() +
>    geom_point(data = ds, aes(y = mean),
>              colour = 'red', size = 3)
>
>
>
># 1. not working
>jpeg(file=paste(pathgraph,'/fig03_profiltot','.png',sep=''),width = 600,
>height = 400, units="px", res=100)
>    print(one, position=c(0,0,0.5,1), more=TRUE)
>    print(two, position=c(0.5,0,1,1), )
>dev.off()
>
>
># 2 not working
>jpeg(file=paste(pathgraph,'/fig03_profiltot','.png',sep=''),width = 600,
>height = 400, units="px", res=100)
>grid.newpage()
>pushViewport(viewport(layout = grid.layout(1, 2)))
>
>      print(one, vp = viewport(layout.pos.row = 1, layout.pos.col = 1))
># this does not work
>      print(two, vp = viewport(layout.pos.row = 1, layout.pos.col = 2))
>dev.off()
>
>
>
># 3 not working
>jpeg(file=paste(pathgraph,'/fig03_profiltot','.png',sep=''),width = 600,
>height = 400, units="px", res=100)
>    par(mfrow=c(1,2))
>      one
>      two
>dev.off()
>

[R] Declare a set (list?) of many dataframes or matrices

2013-05-03 Thread jpm miao

Hi,

   I would like to read several datasets and would like to create a set
(list? sequence?) of many empty dataframes. How could this be done? How
could I declare a  set (list? sequence?) of many empty matrices?

   Thanks,

Miao


[R] Why can't R understand if(num!=NA)?

2013-05-03 Thread jpm miao

I have a program, when I write

if(num!=NA)

it yields an error message.

However, if I write

if(is.na(num)==FALSE)

it works.

Why doesn't the first statement work?

Thanks,

Miao


Re: [R] cURL ?

2013-05-03 Thread R. Michael Weylandt

On Fri, May 3, 2013 at 11:31 AM, jawad hussain  wrote:
> Dear Sir
> I tried to find cURL on web but I do not find reliable file; there are some 
> files on http://curl.haxx.se/. But I do not know which is suitable for R and 
> how to install?
> Kind Regards

As usual, the OS is relevant here. What are you running?

Linux package managers should be able to handle this for you. And I'd
have guessed this was a "Just works" for OS X.

MW

>
>
> Jawad Hussain Ashraf
> VPO Aroop, Tehsil and District GujranwalaMobile phone# 03016673275
>
>
>> Date: Sun, 28 Apr 2013 19:07:05 +0100
>> From: rip...@stats.ox.ac.uk
>> To: miyanja...@hotmail.com
>> CC: r-help@r-project.org
>> Subject: Re: [R] unsupported url scheme
>>
>> On 28/04/2013 15:32, jawad hussain wrote:
>> > fileUrl <- "https://data.baltimorecity.gov/api/views/dz54-2aru/rows.csv?accessType=DOWNLOAD"
>> > download.file(fileUrl,destfile="./data/Cameras.csv",method="curl")
>> > I tried it after installing package "RCurl" but it gives the error message:
>> > Error in download.file(fileUrl, destfile = "Cameras.csv") :
>> >   unsupported URL scheme
>> > Can you help me to solve this problem? JAWAD HUSSAIN ASHRAF
>>
>>
>> Yes, simply install a version of cURL which supports that scheme, then
>> re-install RCurl.
>>
>>
>> > [[alternative HTML version deleted]]
>> >
>> > __
>> > R-help@r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide 
>> > http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>>
>> That does apply to you, too.  No HTML, tell us your sessionInfo() 
>>
>> --
>> Brian D. Ripley,  rip...@stats.ox.ac.uk
>> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
>> University of Oxford, Tel:  +44 1865 272861 (self)
>> 1 South Parks Road, +44 1865 272866 (PA)
>> Oxford OX1 3TG, UKFax:  +44 1865 272595
>

[R] MANOVA summary.manova(m) :" residuals have rank"

2013-05-03 Thread Ozgul Inceoglu

Dear All, I am trying to perform MANOVA. I have a table with 504 columns (species)
and 36 rows, with two grouping factors (season and location).
 
Zx <- Z[c(4:504)]
Zxm <- as.matrix(Z)
m<- manova(Zxm~Season*location, data=Z)

When I run summary.aov, I get a response for each species, but summary.manova(m)
fails with the error: "residuals have rank 24 < 501".

What could be the reason for this error message?

Thank you,

Ozgul

 Below you can see part of the table.
name          Season  location  Acetobacter  Aerococcus   Alishewanella  Amaricoccus
xls-nord-01   J       w         0            0,024078979  0              0
bxls-sud-01   J       w         0            0            0              0
brux-nord-04  A       w         0            0            0              0
brux-sud-04   A       w         0            0            0              0
br-nord-07    Ju      w         0            0            0              0
br-sud-07     Ju      w         0            0            0              0
b-nord-10     O       w         0            0            0              0
bsud-10       O       w         0,107836089  0            0,107836089    0,035945363
Z1-01         J       u         0            0            0              0,040567951
Z3-01         J       u         0            0            0              0
Z5-01         J       d         0,023116043  0            0              0
Z7-01         J       d         0,014130281  0            0              0
Z9-01         J       d         0            0            0              0
Z10-01        J       d         0            0            0              0
Z12-01        J       d         0            0            0              0
Z1-04         A       u         0            0            0              0
Z3-04         A       u         0            0            0              0
Z5-04         A       d         0            0            0              0
Z7-04         A       d         0            0            0              0
Z9-04         A       d         0            0,013839873  0              0
Z10-04        A       d         0            0            0              0
Z12-04        A       d         0            0            0              0
Z1-07         Ju      u         0            0            0              0
Z3-07         Ju      u         0            0            0              0
Z5-07         Ju      d         0            0            0              0
Z7-07         Ju      d         0            0            0              0
Z9-07         Ju      d         0            0            0              0
Z10-07        Ju      d         0            0            0              0
Z12-07        Ju      d         0            0,022301517  0              0
Z1-10         O       u         0            0            0              0
Z3-10         O       u         0            0            0              0
Z5-10         O       d         0            0            0              0
Z7-10         O       d         0            0            0,052924054    0
Z9-10         O       d         0            0            0,035050824    0
Z10-10        O       d         0            0            0              0,040783034
Z12-10        O       d         0            0            0              0


[R] Is it a "Headless problem"? - Same code runs well in interactive R shell, but never terminates with Rscript

2013-05-03 Thread Asis Hallab

Dear R-Experts,

I seem to be dealing with a so called "headless" problem in R.

I wrote a quite extensive program that generates a Bayesian network
from a query protein's Phylogenetic Tree and subsequently uses a
message passing algorithm to infer the most likely annotation for the
query leaf in the tree using the other leaves known -and proven-
protein function annotations.

The program uses the following libraries:
library(tools)
library(Biostrings)
library(RCurl)
library(stringr)
library(ape)
library(gRain) # gRain implements the message passing algorithm
library(RMySQL)
library(XML)
library(parallel)
library(brew)
library(xtable)

When the program is run from the command line as:
Rscript prog.r inp.file
with certain input data "inp" it gets stuck and does not terminate
ever. Memory usage sky-rockets and the process spends almost all of
its time on system calls.

Using the identical R code inside an interactive R shell with the very
same input data "inp" the script does not have any problems and
finishes actually amazingly fast.

I am flabbergasted and do require help.
Hence my questions:

* Is anything known about a problem similar to mine appearing when
using the above libraries?

* What is the difference -aside from the obvious missing
interactiveness- between running the very same R code inside an
interactive R shell or inside a file as an argument to Rscript?

* Does my problem indeed fall into the "headless" category?
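
On the second question, one documented difference for R of this vintage is that Rscript does not attach the methods package by default, whereas an interactive session does. Whether that is related to the hang described here is not known; the following sketch only shows how a script could report its run mode and attach methods explicitly. The name run_info is made up for illustration.

```r
# Report settings that commonly differ between the two ways of running.
run_info <- list(
  interactive      = interactive(),
  methods_attached = "package:methods" %in% search(),
  args             = commandArgs(trailingOnly = TRUE)
)
str(run_info)

# Attaching methods explicitly makes S4-based packages see the same
# search path under Rscript as in an interactive session.
library(methods)
```

Putting these lines at the top of prog.r and comparing the output of both runs would at least rule this difference in or out.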

The problem occurs in
R version 2.15.2 (2012-10-26) -- "Trick or Treat"
on Debian 6.0.2
uname -or gives
3.2.0-0.bpo.3-amd64 GNU/Linux

Any help will be much appreciated.
Have a pleasant day!


Re: [R] Package survey: singularities in linear regression models

2013-05-03 Thread Sebastian Weirich

Well, I have uploaded the data to the public folder of my Dropbox. Due
to data confidentiality, I had to change the labels. To load the data:

con <- url("http://dl.dropboxusercontent.com/u/101865137/datEx.rda")
print(load(con))

# The replicate weights were created according to the jackknife (JK2)
# procedure in the same way as implemented in WesVar.
# With 100 JK zones, 100 replicate weights result. The replicate
# weights are labelled "totwgtM_1" to "totwgtM_100".
# The regression I want to specify is achievement on group and origin.
# Both predictors are factors.

library(survey)
design   <- svrepdesign(data = datEx[, c("origin", "group", 
"achievement")], weights = datEx[ ,"pweight"],
 type="JKn", scale = 1, rscales = 1, repweights = 
datEx[,grep("^totwgtM_", colnames(datEx))], combined.weights = TRUE, mse 
= TRUE)

# This works
mod1 <- svyglm(formula = achievement ~ origin + group, design = 
design, return.replicates = FALSE, family = gaussian(link="identity"))

# I get the error message when specifying the interaction
mod2 <- svyglm(formula = achievement ~ origin * group, design = 
design, return.replicates = FALSE, family = gaussian(link="identity"))

# The output of the conventional glm() function reports singularities 
for one coefficient of the interaction
mod3 <- glm(formula = achievement ~ origin * group, data = datEx, 
family = gaussian(link = "identity"))

Thanks again,
Sebastian

-- 
Sebastian Weirich, Dipl.-Psych.

Institut zur Qualitätsentwicklung im Bildungswesen
Humboldt-Universität zu Berlin
Sitz: Hannoversche Straße 19, 10115 Berlin
Postadresse: Unter den Linden 6, 10099 Berlin

Tel: +49-(0)30-2093-46512

On 02.05.2013 22:02, Thomas Lumley wrote:
> On Fri, May 3, 2013 at 2:27 AM, Sebastian Weirich wrote:
>
> Hello,
>
> I want to specify a linear regression model in which the metric
> outcome is predicted by two factors and their interaction. glm()
> computes effects for each factor level and the levels of the
> interaction. In the case of singularities glm() displays "NA" for
> the corresponding coefficients. However, svyglm() aborts with an
> error message. Is there a possibility that svyglm() provides
> output for coefficients without singularities like glm()?
>
>
> It's not true that svyglm() aborts with an error message whenever 
> there are singularities, eg
>
> > svyglm(enroll~stype+I(stype),design=dclus1)
> 1 - level Cluster Sampling design
> With (15) clusters.
> svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)
>
> Call:  svyglm(formula = enroll ~ stype + I(stype), design = dclus1)
>
> Coefficients:
> (Intercept)       stypeH       stypeM    I(stype)H    I(stype)M
>       432.9        697.4        464.9           NA           NA
>
> Degrees of Freedom: 182 Total (i.e. Null);  12 Residual
> Null Deviance:   2483
> Residual Deviance: 1512 AIC: 2599
>
>
> So, perhaps you could show us what you actually did, and what actually 
> happened, as the posting guidelines request.
>
> -thomas
>
> -- 
> Thomas Lumley
> Professor of Biostatistics
> University of Auckland



Re: [R] Why can't R understand if(num!=NA)?

2013-05-03 Thread S Ellison

 

> -Original Message-
> if(num!=NA)
> it yields an error message.

> Why doesn't the first statement work?
Because you just compared something with NA (usually interpreted as 'missing')  
and because of that the comparison result is also NA. 
'if' then tells you that you have a missing value where you need either TRUE or 
FALSE.
Play with
num!=NA #returns NA
and
if(NA) "Not there"  #returns error

is.na() returns TRUE for NA's, so 'if' knows what to do with the answer.
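
The behaviour described above can be reproduced with a few lines; the value assigned to num is arbitrary.

```r
num <- 5

num != NA                      # NA: comparing with a missing value is unknown
res <- try(if (num != NA) "ok", silent = TRUE)
inherits(res, "try-error")     # TRUE: if() needs TRUE or FALSE, not NA

# is.na() always answers TRUE or FALSE, so if() can use it.
if (!is.na(num)) "num is not missing"
```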

S Ellison


Re: [R] Why can't R understand if(num!=NA)?

2013-05-03 Thread Leandro Marino

You can use only

if(!is.na(num))


Re: [R] (no subject)

2013-05-03 Thread David Winsemius

On May 2, 2013, at 4:15 PM, T P Kharel wrote:

> I have posted a R copula question yesterday but it is not accepted yet. How
> long does it take?

Generally moderated postings are accepted within 4-6 hours, usually sooner.

> I am waiting if some one can help me on my Copula
> package related question. Thanks

I do not see any posting from a sender with a name containing the letters  
"kharel" on May 1, 2, or 3 in the archives and since I just cleared the 
moderation queue it was not waiting there.  Some postings from non-subscribed 
individuals are tossed away automatically by the spam filter and are never seen 
by the moderators as they(we) process the moderation queue. But in your case I 
see that you have subscribed. I am unable to explain why your posting did not 
reach the list. You should be able to see whether your posting was received by 
looking at the May 2013 threads at: https://stat.ethz.ch/pipermail/r-help/

>   [[alternative HTML version deleted]]

The HTML notice is evidence that you have not yet understood parts of the 
Posting Guide.

> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA


Re: [R] Declare a set (list?) of many dataframes or matrices

2013-05-03 Thread Rui Barradas


Hello,

I can't say I understand the question, but if you want a list of empty 
dfs and a list of empty matrices, the following will do.


replicate(10, data.frame())
replicate(10, matrix(NA, nrow = 0, ncol = 0))
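
A variant of the same idea, sketched under the assumption that the lists will later be filled with datasets read one at a time. The dataset names and the stand-in for the read step are hypothetical.

```r
n <- 3
files <- paste0("dataset_", seq_len(n))   # hypothetical dataset names

# simplify = FALSE guarantees a list regardless of what the expression returns.
empty_dfs  <- replicate(n, data.frame(), simplify = FALSE)
empty_mats <- replicate(n, matrix(NA, nrow = 0, ncol = 0), simplify = FALSE)

# A pre-sized list can be filled by index as each dataset is read.
datasets <- vector("list", n)
names(datasets) <- files
for (i in seq_len(n)) {
  datasets[[i]] <- data.frame(id = seq_len(i))   # stand-in for read.csv(files[i])
}
sapply(datasets, nrow)
```

In practice the empty placeholders are often unnecessary, since lapply() over the file names returns the filled list directly.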


Hope this helps,

Rui Barradas

On 03-05-2013 16:20, jpm miao wrote:

Hi,

I would like to read several datasets and would like to create a set
(list? sequence?) of many empty dataframes. How could this be done? How
could I declare a  set (list? sequence?) of many empty matrices?

Thanks,

Miao


Re: [R] Why can't R understand if(num!=NA)?

2013-05-03 Thread David Carlson

A logical operation involving NA returns NA, never TRUE or FALSE:

See the 8th Circle of the R Inferno (8.1.4):

http://www.burns-stat.com/pages/Tutor/R_inferno.pdf

> num <- 1
> num==NA
[1] NA
> is.na(num)
[1] FALSE

-
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of jpm miao
Sent: Friday, May 3, 2013 10:25 AM
To: r-help
Subject: [R] Why can't R understand if(num!=NA)?

I have a program, when I write

if(num!=NA)

it yields an error message.

However, if I write

if(is.na(num)==FALSE)

it works.

Why doesn't the first statement work?

Thanks,

Miao


Re: [R] Problems with reading data by readWorksheetFromFile of XLConnect Package

2013-05-03 Thread David Winsemius


On May 2, 2013, at 11:00 PM, jpm miao wrote:

> Hi Anthony,
> 
>   Thank you very much. It works very well. However, after this line
> 
>> temp <- sapply( temp , as.numeric )
> 
>   the data becomes a series of numbers instead of a matrix. Is there any
> way to keep it a matrix?

Perhaps (assuming this were a data.frame to be coerced):

temp <- matrix( sapply( temp , as.numeric ), dim(temp)[1]) 

But the persistence of the "-"'s is puzzling. You should (as always) have 
posted the output from dput(temp).
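
A sketch of the matrix case, assuming temp is by then a character matrix as in the printed output. The small matrix below is a hypothetical stand-in. The "-" entries that persist in the printed vector are probably the element names that sapply() attaches when it is given character input, not values.

```r
# Hypothetical character matrix like the one printed in the thread.
temp <- matrix(c("647853", "38604", "1413", "-"), nrow = 2,
               dimnames = list(NULL, c("Col1", "Col2")))

# as.numeric() drops the dim attribute, so rebuild the matrix around it.
# "-" cannot be parsed and becomes NA (the coercion warning is silenced here).
num <- matrix(suppressWarnings(as.numeric(temp)),
              nrow = nrow(temp), dimnames = dimnames(temp))
num[is.na(num)] <- 0   # optional: treat "-" as zero
num
```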



  Thanks,
> 
> Miao
> 
> 
> 
> 
>> temp<-readWorksheetFromFile("130502temp.xlsx", sheet=1, header=FALSE,
> startRow=2, endRow= 11, startCol=2, endCol=5)
>> temp <- sapply( temp , function( x ) gsub( ',' , '' , x ) )
>> temp
>       Col1     Col2   Col3    Col4
>  [1,] "647853" "1413" "57662" "27897"
>  [2,] "491400" "1365" "40919" "20411"
>  [3,] "38604"  "-"    "5505"  "985"
>  [4,] "576"    "-"    "20"    "54"
>  [5,] "80845"  "21"   "10211" "4494"
>  [6,] "36428"  "27"   "1007"  "1953"
>  [7,] "269915" "587"  "32988" "12779"
>  [8,] "224494" "-"    "30554" "9184"
>  [9,] "11858"  "587"  "-"     "686"
> [10,] "3742"   "-"    "81"    "415"
>> temp <- sapply( temp , as.numeric )
> Warning messages:
> 1: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
> 2: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
> 3: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
> 4: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
> 5: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
>> temp
> 647853 491400  38604    576  80845  36428 269915
> 647853 491400  38604    576  80845  36428 269915
> 224494  11858   3742   1413   1365      -      -
> 224494  11858   3742   1413   1365     NA     NA
>     21     27    587      -    587      -  57662
>     21     27    587     NA    587     NA  57662
>  40919   5505     20  10211   1007  32988  30554
>  40919   5505     20  10211   1007  32988  30554
>      -     81  27897  20411    985     54   4494
>     NA     81  27897  20411    985     54   4494
>   1953  12779   9184    686    415
>   1953  12779   9184    686    415
>> temp[ is.na( temp ) ] <- 0
>> temp
> 647853 491400  38604    576  80845  36428 269915
> 647853 491400  38604    576  80845  36428 269915
> 224494  11858   3742   1413   1365      -      -
> 224494  11858   3742   1413   1365      0      0
>     21     27    587      -    587      -  57662
>     21     27    587      0    587      0  57662
>  40919   5505     20  10211   1007  32988  30554
>  40919   5505     20  10211   1007  32988  30554
>      -     81  27897  20411    985     54   4494
>      0     81  27897  20411    985     54   4494
>   1953  12779   9184    686    415
>   1953  12779   9184    686    415
> 
> 
> 2013/5/2 Anthony Damico 
> 
>> try adding colTypes = 'numeric' to your readWorkSheetFromFile() call
>> 
>> 
>> 
>> if that doesn't work, try a few other steps
>> 
>> 
>> # view what data types your file is being read in as
>> sapply( temp , class )
>> 
>> 
>> # convert all fields to character if they're factor variables.. but i
>> don't think you need this, readWorksheet defaults to `character`
>> temp <- sapply( temp , as.character )
>> 
>> 
>> # you can also convert a subset like this
>> temp[ , c( 1 , 3:4 ) ] <- sapply( temp[ , c( 1 , 3:4 ) ] , as.character )
>> 
>> 
>> 
>> # remove commas from character strings
>> temp <- sapply( temp , function( x ) gsub( ',' , '' , x ) )
>> 
>> # convert all fields to numeric
>> temp <- sapply( temp , as.numeric )
>> 
>> # convert all NA fields to zeroes if you prefer
>> temp[ is.na( temp ) ] <- 0
>> 
>> 
>> 
>> 
>> 
>> On Wed, May 1, 2013 at 11:55 PM, jpm miao  wrote:
>> 
>>> Hi,
>>> 
>>>   Attached are two datasheet to be read.
>>>   My raw data "130502temp.xlsx" contains numbers with ' symbols, and they
>>> can't be read as numbers. Even if I copy and paste as numbers to form a
>>> new
>>> file "130502temp_number1.xlsx", they could not be read smoothly.
>>> 
>>>   1. How can I read the datasheet as numbers?
>>>   2. How can I treat the notation "-" as (1) "NA" or (2) zero?
>>> 
>>>   Thanks,
>>> 
>>> Miao
>>> 
>>> 
>>> 
>>> 
 temp<-readWorksheetFromFile("130502temp.xlsx", sheet=1, header=FALSE,
>>> startRow=2, endRow= 11, startCol=2, endCol=5)
>>> 
 temp
>>> 
>>>       Col1  Col2   Col3   Col4
>>> 1  647,853 1,413 57,662 27,897
>>> 2  491,400 1,365 40,919 20,411
>>> 3   38,604     -  5,505    985
>>> 4      576     -     20     54
>>> 5   80,845    21 10,211  4,494
>>> 6   36,428    27  1,007  1,953
>>> 7  269,915   587 32,988 12,779
>>> 8  224,494     - 30,554  9,184
>>> 9   11,858   587      -    686
>>> 10   3,742     -     81    415
>>> 
 temp[2,2]
>>> 
>>> [1] "1,365"
>>> 
 temp[2,2]+3
>>> 
>>> Error in temp[2, 2] + 3 : non-numeric argument to binary operator
>>> 
 temp_num<-readWorksheetFromFile("130502temp_number1.xlsx", sheet=1,
>>> header=FALSE, startRow=2, endRow= 11, startCol=2, endCol=5

Re: [R] Why can't R understand if(num!=NA)?

2013-05-03 Thread Marc Schwartz

On May 3, 2013, at 10:24 AM, jpm miao  wrote:

> I have a program, when I write
> 
> if(num!=NA)
> 
> it yields an error message.
> 
> However, if I write
> 
> if(is.na(num)==FALSE)
> 
> it works.
> 
> Why doesn't the first statement work?
> 
> Thanks,
> 
> Miao

NA is undefined:

> NA == NA
[1] NA

> NA != NA
[1] NA

Therefore the equality you are attempting does not return a TRUE or FALSE 
result, it is unknown and NA is returned. ?is.na was designed specifically to 
test for the presence of an NA value and return a TRUE or FALSE result which 
can then be tested.
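
Building on that, a sketch of how a test can be combined with a missing-value check so that NA never reaches if(). The helper name is_positive and the example values are made up for illustration.

```r
# Hypothetical helper: TRUE only when num is a single, non-missing value
# that also passes the test. && stops at the first FALSE, so the
# comparison is never evaluated for a missing value.
is_positive <- function(num) {
  length(num) == 1L && !is.na(num) && num > 0
}

is_positive(3)    # TRUE
is_positive(NA)   # FALSE
is_positive(-1)   # FALSE

# For vectors, is.na() works element-wise.
x <- c(1, NA, 3)
x[!is.na(x)]      # 1 3
sum(is.na(x))     # 1
```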

See: http://cran.r-project.org/doc/manuals/r-release/R-intro.html#Missing-values

Regards,

Marc Schwartz


Re: [R] Why can't R understand if(num!=NA)?

2013-05-03 Thread David Winsemius

On May 3, 2013, at 8:24 AM, jpm miao wrote:

> I have a program, when I write
> 
> if(num!=NA)
> 
> it yields an error message.
> 
> However, if I write
> 
> if(is.na(num)==FALSE)
> 
> it works.
> 
> Why doesn't the first statement work?

Read the "manual":

  ?"NA"

-- 

David Winsemius
Alameda, CA, USA


Re: [R] MANOVA summary.manova(m) :" residuals have rank"

2013-05-03 Thread peter dalgaard


On May 3, 2013, at 14:59 , Ozgul Inceoglu wrote:

> Dear All, I am trying to perform MANOVA. I have table with 504 
> columns(species) and 36 rows) with two grouping (season and location)
> 
> Zx <- Z[c(4:504)]
> Zxm <- as.matrix(Z)
> m<- manova(Zxm~Season*location, data=Z)
> 
> when I do summary.aov, I get respond for each species but summary.manova
> summary.manova(m) :" residuals have rank" 24<501.
> 
> What can it be the reason for this error message?

Too many columns and too few rows. Multivariate tests require more degrees of 
freedom than response variables.
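
The point can be illustrated with simulated data. This is only a sketch: the design echoes the 36 rows and the two factors from the question, but the response values are random and the object names are made up.

```r
set.seed(1)
# Simulated design: 36 rows, 4 seasons x 3 locations, 3 rows per cell.
d <- data.frame(Season   = factor(rep(c("J", "A", "Ju", "O"), each = 9)),
                location = factor(rep(c("w", "u", "d"), times = 12)))

Y_small <- matrix(rnorm(36 * 3),  nrow = 36)   # 3 response variables
Y_big   <- matrix(rnorm(36 * 30), nrow = 36)   # 30 response variables

fit_small <- manova(Y_small ~ Season * location, data = d)
fit_big   <- manova(Y_big   ~ Season * location, data = d)

fit_big$df.residual                 # 36 rows - 12 cell means = 24
res_small <- summary(fit_small)     # works: 3 responses < 24
res_big   <- try(summary(fit_big), silent = TRUE)
inherits(res_big, "try-error")      # TRUE: 30 responses > 24 residual df
```

With 501 species and the same 24 residual degrees of freedom, the multivariate test cannot be computed, while the per-species summary.aov() results remain available.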


> 
> Thank you,
> 
> Ozgul
> 
> Below you can see part of the table.
> name          Season  location  Acetobacter  Aerococcus   Alishewanella  Amaricoccus
> xls-nord-01   J       w         0            0,024078979  0              0
> bxls-sud-01   J       w         0            0            0              0
> brux-nord-04  A       w         0            0            0              0
> brux-sud-04   A       w         0            0            0              0
> br-nord-07    Ju      w         0            0            0              0
> br-sud-07     Ju      w         0            0            0              0
> b-nord-10     O       w         0            0            0              0
> bsud-10       O       w         0,107836089  0            0,107836089    0,035945363
> Z1-01         J       u         0            0            0              0,040567951
> Z3-01         J       u         0            0            0              0
> Z5-01         J       d         0,023116043  0            0              0
> Z7-01         J       d         0,014130281  0            0              0
> Z9-01         J       d         0            0            0              0
> Z10-01        J       d         0            0            0              0
> Z12-01        J       d         0            0            0              0
> Z1-04         A       u         0            0            0              0
> Z3-04         A       u         0            0            0              0
> Z5-04         A       d         0            0            0              0
> Z7-04         A       d         0            0            0              0
> Z9-04         A       d         0            0,013839873  0              0
> Z10-04        A       d         0            0            0              0
> Z12-04        A       d         0            0            0              0
> Z1-07         Ju      u         0            0            0              0
> Z3-07         Ju      u         0            0            0              0
> Z5-07         Ju      d         0            0            0              0
> Z7-07         Ju      d         0            0            0              0
> Z9-07         Ju      d         0            0            0              0
> Z10-07        Ju      d         0            0            0              0
> Z12-07        Ju      d         0            0,022301517  0              0
> Z1-10         O       u         0            0            0              0
> Z3-10         O       u         0            0            0              0
> Z5-10         O       d         0            0            0              0
> Z7-10         O       d         0            0            0,052924054    0
> Z9-10         O       d         0            0            0,035050824    0
> Z10-10        O       d         0            0            0              0,040783034
> Z12-10        O       d         0            0            0              0
> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com


Re: [R] Why can't R understand if(num!=NA)?

2013-05-03 Thread Berend Hasselman

On 03-05-2013, at 17:24, jpm miao  wrote:

> I have a program, when I write
> 
> if(num!=NA)
> 
> it yields an error message.
> 

it?
What is unclear about the error message?

> However, if I write
> 
> if(is.na(num)==FALSE)
> 
> it works.
> 
> Why doesn't the first statement work?
> 

Read section 2.5 "Missing values" of the manual "An Introduction to R".

Berend

> Thanks,
> 
> Miao
> 
>   [[alternative HTML version deleted]]
> 


Re: [R] read .csv file and plot a graph

2013-05-03 Thread jim holtman

Just read in and plot the data.  The NaN will not be plotted:

> input <- read.table(text = "x
+ 1 NaN
+ 2 NaN
+ 3 0.23
+ 4 .34
+ 5 .55
+ 6 .66
+ 7 NaN
+ 8 .88", header = TRUE)
> plot(input$x)
>
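
If a cleaned copy of the file is also wanted, something along these lines might do. It is only a sketch: the output path is a temporary file chosen for illustration, and the object names `clean` and `out_file` are not from the original post.

```r
input <- read.table(text = "x
1 NaN
2 NaN
3 0.23
4 .34
5 .55
6 .66
7 NaN
8 .88", header = TRUE)

# Keep only rows with a real value; is.na() is TRUE for both NA and NaN.
clean <- input[!is.na(input$x), , drop = FALSE]

# Write the cleaned data to a new file (a temporary path is used here).
out_file <- tempfile(fileext = ".csv")
write.csv(clean, out_file)

# Line graph of the remaining values against their original row numbers.
plot(as.numeric(rownames(clean)), clean$x, type = "l",
     xlab = "row", ylab = "x")
```

Plotting against the original row numbers keeps the gaps visible on the x axis; plot(clean$x, type = "l") would instead close them up.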



On Fri, May 3, 2013 at 9:49 AM, Vahe nr  wrote:

> Hi all,
>
> I have a big .csv file (21Mb with 100 rows) it has this shape:
> x
> 1 NaN
> 2 NaN
> 3 0.23
>
> and so on.
>
> So the first column has x as a header then row number, the second column
> contains values between -1,1 and NaN for empty values.
>
> What should I need to do is: create a new .csv file from this one excluding
> NaN values and plot a line graph using the new .csv file.
>
> Or can I use the old .csv file to plot a graph excluding NaN values.
>
> Thanks in advance for any help or suggestions.
>
> Regards,
>  Vahe
>
> [[alternative HTML version deleted]]
>
>



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

[[alternative HTML version deleted]]


Re: [R] Declare a set (list?) of many dataframes or matrices

2013-05-03 Thread arun





Hi,
I am not sure about what you meant.
lapply(1:5,function(i) data.frame())
[[1]]
data frame with 0 columns and 0 rows

[[2]]
data frame with 0 columns and 0 rows

[[3]]
data frame with 0 columns and 0 rows

[[4]]
data frame with 0 columns and 0 rows

[[5]]
data frame with 0 columns and 0 rows

A.K.
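
The question also asked about matrices. A minimal sketch of a few equivalent ways to pre-allocate such lists, assuming five slots are wanted (the names `dfs`, `mats` and `slots` are only illustrative):

```r
n <- 5

# A list of n empty data frames.
dfs <- replicate(n, data.frame(), simplify = FALSE)

# A list of n empty (0 x 0) numeric matrices.
mats <- replicate(n, matrix(numeric(0), nrow = 0, ncol = 0), simplify = FALSE)

# Often it is enough to allocate an empty list and fill it while reading.
slots <- vector("list", n)
slots[[1]] <- data.frame(a = 1:3)

str(dfs[[1]])
dim(mats[[1]])
```

When the datasets come from files, lapply() over the file names returns the filled list directly, so the empty placeholders may not be needed at all.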


- Original Message -
From: jpm miao 
To: r-help 
Cc: 
Sent: Friday, May 3, 2013 11:20 AM
Subject: [R] Declare a set (list?) of many dataframes or matrices

Hi,

   I would like to read several datasets and would like to create a set
(list? sequence?) of many empty dataframes. How could this be done? How
could I declare a  set (list? sequence?) of many empty matrices?

   Thanks,

Miao

    [[alternative HTML version deleted]]


Re: [R] Why can't R understand if(num!=NA)?

2013-05-03 Thread arun

 num1<- c(0,NA,1,3)
 num1==NA
#[1] NA NA NA NA
 num1!=NA
#[1] NA NA NA NA
 is.na(num1)
#[1] FALSE  TRUE FALSE FALSE
A.K.



- Original Message -
From: jpm miao 
To: r-help 
Cc: 
Sent: Friday, May 3, 2013 11:24 AM
Subject: [R] Why can't R understand if(num!=NA)?

I have a program, when I write

if(num!=NA)

it yields an error message.

However, if I write

if(is.na(num)==FALSE)

it works.

Why doesn't the first statement work?

Thanks,

Miao

    [[alternative HTML version deleted]]


Re: [R] Change Selected Variables from Numeric to Factors

2013-05-03 Thread arun

Hi ST,
Try this:
set.seed(51)
df1<- as.data.frame(matrix(sample(1:40,60,replace=TRUE),ncol=10))
df2<- df1
check<- c("V3","V7","V9")
 
df1[,match(check,colnames(df1))]<-lapply(df1[,match(check,colnames(df1))],as.factor)

str(df1)
#'data.frame':    6 obs. of  10 variables:
# $ V1 : int  32 9 12 40 9 34
# $ V2 : int  31 17 39 5 21 28
# $ V3 : Factor w/ 6 levels "1","6","7","10",..: 3 5 1 6 2 4
# $ V4 : int  26 4 8 18 39 2
# $ V5 : int  39 21 4 26 6 21
# $ V6 : int  27 33 35 8 17 8
# $ V7 : Factor w/ 5 levels "4","8","9","24",..: 2 3 4 1 3 5
# $ V8 : int  4 12 12 32 13 37
# $ V9 : Factor w/ 5 levels "10","31","33",..: 1 4 2 3 5 5
# $ V10: int  13 26 20 22 14 5

#or
 df2[check]<- lapply(check,function(x) as.factor(df2[[x]]))
# str(df2)
#'data.frame':    6 obs. of  10 variables:
# $ V1 : int  32 9 12 40 9 34
# $ V2 : int  31 17 39 5 21 28
# $ V3 : Factor w/ 6 levels "1","6","7","10",..: 3 5 1 6 2 4
# $ V4 : int  26 4 8 18 39 2
# $ V5 : int  39 21 4 26 6 21
# $ V6 : int  27 33 35 8 17 8
# $ V7 : Factor w/ 5 levels "4","8","9","24",..: 2 3 4 1 3 5
# $ V8 : int  4 12 12 32 13 37
# $ V9 : Factor w/ 5 levels "10","31","33",..: 1 4 2 3 5 5
# $ V10: int  13 26 20 22 14 5


A.K.

>I have a dataframe df with several columns. I need to change some of 
these to factors. What colums I need to change to factors is in another 
vector >check. 
>I am using this command 
>sapply(check , function(x) df[[x]] <- as.factor(df[[x]])) 
>
>But this is not working. Can someone please advise. 
>
>Thanks. 
>-ST
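
As for why the sapply() call in the question appears to do nothing: the assignment inside the anonymous function changes a local copy of df, not the data frame in the calling environment. A small sketch with a made-up data frame (column names are hypothetical):

```r
df <- data.frame(a = 1:3, b = 4:6, c = 7:9)
check <- c("a", "c")

# The assignment happens inside the function, on a local copy of df.
invisible(sapply(check, function(x) df[[x]] <- as.factor(df[[x]])))
unchanged <- sapply(df, is.factor)
print(unchanged)

# Assigning the result back to the selected columns does change df.
df[check] <- lapply(df[check], as.factor)
changed <- sapply(df, is.factor)
print(changed)
```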


Re: [R] Problems with reading data by readWorksheetFromFile of XLConnect Package

2013-05-03 Thread jim holtman

you can also try:

temp[] <- lapply(temp, as.numeric)
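
A sketch of that idiom on a small made-up data frame (the values imitate the ones quoted below; `raw`, `clean` and `m` are illustrative names). It assumes temp is still a data frame of character columns, which is when assigning into temp[] keeps the rectangular shape:

```r
raw <- data.frame(
  Col1 = c("647,853", "38,604", "576"),
  Col2 = c("1,413", "-", "-"),
  stringsAsFactors = FALSE
)

clean <- raw
clean[] <- lapply(raw, function(x) {
  x <- gsub(",", "", x, fixed = TRUE)  # drop thousands separators
  x[x == "-"] <- NA                    # treat "-" as missing
  as.numeric(x)
})

# Convert to a numeric matrix only at the end, if one is needed.
m <- as.matrix(clean)
print(m)
```

Replacing the NA values with zero afterwards is then a single step, m[is.na(m)] <- 0.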


On Fri, May 3, 2013 at 11:54 AM, David Winsemius wrote:

>
> On May 2, 2013, at 11:00 PM, jpm miao wrote:
>
> > Hi Anthony,
> >
> >   Thank you very much. It works very well. However, after this line
> >
> >> temp <- sapply( temp , as.numeric )
> >
> >   the data becomes a series of numbers instead of a matrix. Is there any
> > way to keep it a matrix?
>
> Perhaps (assuming this were a data.frame to be coerced):
>
> temp <- matrix( sapply( temp , as.numeric ), dim(temp)[1])
>
> But the persistence of the "-"'s is puzzling. You should (as always) have
> posted the output from dput(temp).
>
>
>
>   Thanks,
> >
> > Miao
> >
> >
> >
> >
> >> temp<-readWorksheetFromFile("130502temp.xlsx", sheet=1, header=FALSE,
> > startRow=2, endRow= 11, startCol=2, endCol=5)
> >> temp <- sapply( temp , function( x ) gsub( ',' , '' , x ) )
> >> temp
> >       Col1     Col2   Col3    Col4
> >  [1,] "647853" "1413" "57662" "27897"
> >  [2,] "491400" "1365" "40919" "20411"
> >  [3,] "38604"  "-"    "5505"  "985"
> >  [4,] "576"    "-"    "20"    "54"
> >  [5,] "80845"  "21"   "10211" "4494"
> >  [6,] "36428"  "27"   "1007"  "1953"
> >  [7,] "269915" "587"  "32988" "12779"
> >  [8,] "224494" "-"    "30554" "9184"
> >  [9,] "11858"  "587"  "-"     "686"
> > [10,] "3742"   "-"    "81"    "415"
> >> temp <- sapply( temp , as.numeric )
> > Warning messages:
> > 1: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
> > 2: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
> > 3: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
> > 4: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
> > 5: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
> >> temp
> > 647853 491400  38604    576  80845  36428 269915
> > 647853 491400  38604    576  80845  36428 269915
> > 224494  11858   3742   1413   1365      -      -
> > 224494  11858   3742   1413   1365     NA     NA
> >     21     27    587      -    587      -  57662
> >     21     27    587     NA    587     NA  57662
> >  40919   5505     20  10211   1007  32988  30554
> >  40919   5505     20  10211   1007  32988  30554
> >      -     81  27897  20411    985     54   4494
> >     NA     81  27897  20411    985     54   4494
> >   1953  12779   9184    686    415
> >   1953  12779   9184    686    415
> >> temp[ is.na( temp ) ] <- 0
> >> temp
> > 647853 491400  38604    576  80845  36428 269915
> > 647853 491400  38604    576  80845  36428 269915
> > 224494  11858   3742   1413   1365      -      -
> > 224494  11858   3742   1413   1365      0      0
> >     21     27    587      -    587      -  57662
> >     21     27    587      0    587      0  57662
> >  40919   5505     20  10211   1007  32988  30554
> >  40919   5505     20  10211   1007  32988  30554
> >      -     81  27897  20411    985     54   4494
> >      0     81  27897  20411    985     54   4494
> >   1953  12779   9184    686    415
> >   1953  12779   9184    686    415
> >
> >
> > 2013/5/2 Anthony Damico 
> >
> >> try adding colTypes = 'numeric' to your readWorksheetFromFile() call
> >>
> >>
> >>
> >> if that doesn't work, try a few other steps
> >>
> >>
> >> # view what data types your file is being read in as
> >> sapply( temp , class )
> >>
> >>
> >> # convert all fields to character if they're factor variables.. but i
> >> don't think you need this, readWorksheet defaults to `character`
> >> temp <- sapply( temp , as.character )
> >>
> >>
> >> # you can also convert a subset like this
> >> temp[ , c( 1 , 3:4 ) ] <- sapply( temp[ , c( 1 , 3:4 ) ] , as.character )
> >>
> >>
> >>
> >> # remove commas from character strings
> >> temp <- sapply( temp , function( x ) gsub( ',' , '' , x ) )
> >>
> >> # convert all fields to numeric
> >> temp <- sapply( temp , as.numeric )
> >>
> >> # convert all NA fields to zeroes if you prefer
> >> temp[ is.na( temp ) ] <- 0
> >>
> >>
> >>
> >>
> >>
> >> On Wed, May 1, 2013 at 11:55 PM, jpm miao  wrote:
> >>
> >>> Hi,
> >>>
> >>>   Attached are two datasheet to be read.
> >>>   My raw data "130502temp.xlsx" contains numbers with ' symbols, and
> they
> >>> can't be read as numbers. Even if I copy and paste as numbers to form a
> >>> new
> >>> file "130502temp_number1.xlsx", they could not be read smoothly.
> >>>
> >>>   1. How can I read the datasheet as numbers?
> >>>   2. How can I treat the notation "-" as (1) "NA" or (2) zero?
> >>>
> >>>   Thanks,
> >>>
> >>> Miao
> >>>
> >>>
> >>>
> >>>
>  temp<-readWorksheetFromFile("130502temp.xlsx", sheet=1, header=FALSE,
> >>> startRow=2, endRow= 11, startCol=2, endCol=5)
> >>>
>  temp
> >>>
> >>>       Col1  Col2   Col3   Col4
> >>> 1  647,853 1,413 57,662 27,897
> >>> 2  491,400 1,365 40,919 20,411
> >>> 3   38,604     -  5,505    985
> >>> 4      576     -     20     54
> >>> 5   80,845    21 10,211  4,494
> >>> 6   36,428    27  1,007  1,953
> >>> 7  269,915   587 32,988 12,779
> >>> 8  224,494 -

Re: [R] Why can't R understand if(num!=NA)?

2013-05-03 Thread William Dunlap

> if(num!=NA)
> Why doesn't the first statement work?

An NA value means that the value is unknown.  E.g.,
  age <- NA
means that you do not know the age of your subject.
(The subject has an age, NA means you did not collect
that data.)  Thus you do not know the value of 
  age == 6
either, the subject might be 6 or it might not be.
Hence R makes the value of age==6  NA.

Since R does not have different evaluation rules for literal values
and expressions that means that NA==6 and NA==someAge
must evaluate to NA as well.

The second part of the question is why
   if (NA) { } else { }
causes an error.  It is a bit arbitrary, but there is a mismatch
between a 2-way 'if' statement and 3-valued logical data
and R deals with it by insisting that the condition in
   if (condition) { } else {}
be either TRUE or FALSE, not NA.
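
Both parts can be seen in a few lines. This is only an illustrative sketch; the variable names are made up, and tryCatch() is used so that the error text can be printed instead of stopping the script.

```r
num <- NA

# Comparing with NA gives NA, not TRUE or FALSE.
cmp <- num != NA
print(cmp)

# Using that NA as an 'if' condition is an error.
msg <- tryCatch(
  if (num != NA) "different" else "same",
  error = function(e) conditionMessage(e)
)
print(msg)

# is.na() always returns TRUE or FALSE, so it is safe inside 'if'.
status <- if (!is.na(num)) "known" else "missing"
print(status)

# isTRUE() is another way to collapse a possibly-NA condition to FALSE.
also <- isTRUE(num != NA)
print(also)
```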

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
> Behalf
> Of jpm miao
> Sent: Friday, May 03, 2013 8:25 AM
> To: r-help
> Subject: [R] Why can't R understand if(num!=NA)?
> 
> I have a program, when I write
> 
> if(num!=NA)
> 
> it yields an error message.
> 
> However, if I write
> 
> if(is.na(num)==FALSE)
> 
> it works.
> 
> Why doesn't the first statement work?
> 
> Thanks,
> 
> Miao
> 
>   [[alternative HTML version deleted]]
> 

Re: [R] Why can't R understand if(num!=NA)?

2013-05-03 Thread Kevin Wright

At a minimum, the first statement needs "==".

Also, is.na() gives TRUE/FALSE, while a logical comparison to NA gives NA
as a value.

Kevin

On Fri, May 3, 2013 at 10:24 AM, jpm miao  wrote:

> I have a program, when I write
>
> if(num!=NA)
>
> it yields an error message.
>
> However, if I write
>
> if(is.na(num)==FALSE)
>
> it works.
>
> Why doesn't the first statement work?
>
> Thanks,
>
> Miao
>
> [[alternative HTML version deleted]]
>
>

-- 
Kevin Wright

[[alternative HTML version deleted]]


Re: [R] R does not subset

2013-05-03 Thread Daniel Nordlund

> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> On Behalf Of Katarzyna Kulma
> Sent: Friday, May 03, 2013 4:21 AM
> To: David Kulp
> Cc: r-help@r-project.org
> Subject: Re: [R] R does not subset
> 
> Jorge, thanks for your suggestions, but they give the same (empty) result:
> 
> > RECinf<-subset(REC2,  INFECTION=="Infected")
> > head(RECinf)
> [1] RINGNO    year      ccFLEDGE  rec2012   binage    INFECTION all.rsLD
> <0 rows> (or 0-length row.names)
> 
> but David's suggestion worked! :
> 
> > RECinf<-REC2[REC2$INFECTION=="Infected ",]
> > head(RECinf)
>     RINGNO  year ccFLEDGE rec2012 binage INFECTION   all.rsLD
> 2  BX23298 Y2003        6       1    juv Infected  -6.1938776
> 4  BT53646 Y2003        5       2     ad Infected  -4.1938776
> 7  BT53248 Y2003        6       1     ad Infected  -2.1938776
> 11 BY75833 Y2004        5       0     ad Infected  -4.6574803
> 13 BX23067 Y2004        6       0     ad Infected  -3.6574803
> 17 BX24240 Y2004        6       0     ad Infected   0.3425197
> 
> 
> still not sure why the subset() function didn't work, though.
> 
> Thanks for your help!
> 
> 
> 

Maybe it didn't work because you still didn't have a space at the end of the 
value you were comparing (apparently the factor was defined with a space). 
Try the following (and notice the space at the end of "Infected "):

RECinf<-subset(REC2,  INFECTION=="Infected ")

David's suggestion worked because you did include a space there.
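
An alternative is to strip the trailing blanks from the factor levels once, right after loading the data, so that later comparisons need no space. A sketch on a made-up stand-in for REC2 (only the INFECTION column mirrors the original):

```r
# Hypothetical miniature version of REC2 with trailing spaces in the levels.
REC2 <- data.frame(
  RINGNO = c("A1", "A2", "A3", "A4"),
  INFECTION = factor(c("Infected ", "Uninfected ", "Infected ", "Uninfected "))
)

# No match, because the stored level is "Infected " (with a space).
n_before <- nrow(subset(REC2, INFECTION == "Infected"))

# Remove trailing whitespace from the levels.
levels(REC2$INFECTION) <- sub("\\s+$", "", levels(REC2$INFECTION))

n_after <- nrow(subset(REC2, INFECTION == "Infected"))
print(c(before = n_before, after = n_after))
```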


Dan

Daniel Nordlund
Bothell, WA USA
 


Re: [R] Counting number of consecutive occurrences per rows

2013-05-03 Thread zuzana zajkova

Hi,

I'm sorry that it takes me so much time to respond, finally yesterday I got
time to try your suggestions. Thank you for them!

I tried both, they give the same results, but in both there are some things
I still need to solve. I would appreciate your help.
I include a somewhat bigger data frame (test2, at the end of this email), with
more differences in the variables, to better explain what I would
like to calculate in addition.

*Jim's code:*
I needed to make some changes in assigning the key. Yours worked ok for
that small "test" data, but when I tried it on my dataframe which has
around 25000rows, it didn't work properly.

test2$key[test2$act == 0] <- 1
test2$key[test2$act > 0 & test2$act < 200] <- 2
test2$key[test2$act == 200] <- 3

# this works ok
test2$resChange <- cumsum(c(1, abs(diff(test2$key))))
test2$res <- ave(test2$resChange, test2$resChange, FUN = length)
# I added new column by jul date
test2$resJ <- ave(test2$resChange, test2$resChange, test2$juln, FUN = length)
# this works fine as well, for dividing between day 0 and day 1
test2$resJD <- ave(test2$resChange, test2$resChange, test2$juln, test2$day,
                   FUN = length)
# resume by day (test2 must be a data.table for this syntax)
test2Resume_day <- test2[ , list(maxres = max(res)
   , minres = min(res)
   , sumres = length(unique(resChange)))
   , keyby = c('day', 'key')]
# change 'key'
 test2Resume_day$key <- c('0', '1-199', '200')[test2Resume_day$key]
 test2Resume_day
   day   key maxres minres sumres
1:   0     0      2      2      3
2:   0 1-199      3      1      9
3:   0   200      6      1      7
4:   1     0      1      1      1
5:   1 1-199     10      1      7
6:   1   200      6      1      6

# resume by juln
 test2Resume_jul <- test2[ , list(maxres = max(res)
   , minres = min(res)
  , sumres = length(unique(resChange)))
  , keyby = c('juln', 'key')]  # by juln
 # change 'key'
 test2Resume_jul$key <- c('0', '1-199', '200')[test2Resume_jul$key]
 test2Resume_jul
    juln   key maxres minres sumres
1: 15173     0      2      2      1
2: 15173 1-199      3      1      7
3: 15173   200      6      1      6
4: 15174     0      2      1      3
5: 15174 1-199     10      1      8
6: 15174   200      6      1      6

It is ok, but what I would like to get is a summary by juln and by the variable
day (0 and 1) as well.
Like this:

juln   day  key    maxres  minres  sumres
15173  0    0
15173  0    1-199
15173  0    200
15173  1    0
15173  1    1-199
15173  1    200
15174  0    0
15174  0    1-199
15174  0    200
15174  1    0
15174  1    1-199
15174  1    200
...

The other thing is that I would like to calculate "sumres" as the sum
of the occurrence values for each "key".
For example, if in the test2 data frame the res values for key 200 (juln 15173)
are 1, 1, 2, 2, 1, 2, the sumres should be 9 (1+1+2+2+1+2), not 6 (which I
suppose comes from the number of unique occurrences).
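
One possible way to get a table of that shape in base R is sketched below. It rests on assumptions: the toy data frame `test` stands in for test2, a run is taken to end whenever key, juln or day changes, and sumres is read as the total length of all runs in a group, which is only one interpretation of the request.

```r
# Hypothetical stand-in for test2 with the three relevant columns.
test <- data.frame(
  juln = rep(c(15173, 15174), each = 6),
  day  = rep(rep(c(0, 1), each = 3), times = 2),
  key  = c(1, 1, 3, 3, 3, 2, 2, 2, 1, 3, 3, 3)
)

# A new run starts whenever key, juln or day differs from the previous row.
changed <- diff(test$key) != 0 | diff(test$juln) != 0 | diff(test$day) != 0
test$run <- cumsum(c(TRUE, changed))

# One row per run, with its length in rows.
test$len <- 1
runs <- aggregate(len ~ juln + day + key + run, data = test, FUN = sum)

# Longest run, shortest run and total run length per juln/day/key.
summ <- aggregate(len ~ juln + day + key, data = runs,
                  FUN = function(x) c(maxres = max(x), minres = min(x),
                                      sumres = sum(x)))
summ <- do.call(data.frame, summ)   # flatten the matrix column
summ <- summ[order(summ$juln, summ$day, summ$key), ]
print(summ)
```

Combinations of juln, day and key that never occur are simply absent from the result rather than shown as empty rows.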


*Petr's code:*
This works fine also; the thing is that for the aggregation I would need
the intervals to be like this
[0, 1)
[1, 199]
(199, 200]
which I don't know is possible... I checked the help for cut, but I found
that intervals can only be closed on the right or on the left...
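
If act only takes whole-number values between 0 and 200 (an assumption about the data, not something stated above), left-closed breaks at 0, 1 and 200 give the same three groups as [0, 1), [1, 199], (199, 200]. A minimal sketch with made-up values:

```r
act <- c(0, 1, 57, 199, 200)

# right = FALSE makes the intervals [0,1), [1,200) and [200,Inf).
key <- cut(act, breaks = c(0, 1, 200, Inf), right = FALSE,
           labels = c("0", "1-199", "200"))
print(data.frame(act, key))
```

With non-integer values between 199 and 200 this would not match the requested intervals, so the breaks would need adjusting in that case.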

Thank you very much for your time and sharing your knowledge!

Zuzana


## here is the bigger test2 dataframe
> dput(test2)
structure(list(daten = structure(c(15173, 15173, 15173, 15173,
15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173,
15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173,
15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174,
15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174,
15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174,
15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174,
15174, 15174), class = "Date"), juln = c(15173, 15173, 15173,
15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173,
15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173,
15173, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174,
15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174,
15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174,
15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174,
15174, 15174, 15174), fen = c("win", "win", "win", "win", "win",
"win", "win", "win", "win", "win", "win", "win", "win", "win",
"win", "win", "win", "win", "win", "win", "win", "win", "win",
"win", "win", "win", "win", "win", "win", "win", "win", "win",
"win", "win", "win", "win", "win", "win", "win", "win", "win",
"win", "win", "win", "win", "win", "win", "win", "win", "win",
"win", "win", "win", "win", "win", "win", "win", "win", "win",
"win"), night = structure(c(1310962792, 1310963392, 1310963992,
1310964592, 1310965192, 1310965792, 1310966392, 131096699

Re: [R] Why can't R understand if(num!=NA)?

2013-05-03 Thread peter dalgaard

On May 3, 2013, at 17:24 , jpm miao wrote:

> I have a program, when I write
> 
> if(num!=NA)
> 
> it yields an error message.
> 
> However, if I write
> 
> if(is.na(num)==FALSE)
> 
> it works.
> 
> Why doesn't the first statement work?

Because comparison with an unknown value yields an unknown result. 

By the way, comparing a logical value to FALSE is silly: 

if ( !is.na(num) ) will do it.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com


Re: [R] Calculating distance matrix for large dataset

2013-05-03 Thread Uwe Ligges




On 03.05.2013 15:36, David Carlson wrote:

Here's the result on R 3.0.0 64 bit under Windows 8:


A<-matrix(1:365000*144,nrow=365000,ncol=144)
dim(A)

[1] 365000    144

d <- dist(mydata_nor, method = "euclidean")

Error in as.matrix(x) : object 'mydata_nor' not found

d <- dist(A, method = "euclidean")

Error: cannot allocate vector of size 496.3 Gb
In addition: Warning messages:
1: In dist(A, method = "euclidean") :
   Reached total allocation of 8078Mb: see help(memory.size)
2: In dist(A, method = "euclidean") :
   Reached total allocation of 8078Mb: see help(memory.size)
3: In dist(A, method = "euclidean") :
   Reached total allocation of 8078Mb: see help(memory.size)
4: In dist(A, method = "euclidean") :
   Reached total allocation of 8078Mb: see help(memory.size)

Your message suggests that your system could not accurately compute the
requirements. Unless you have access to a computer with 500 gigabytes, you
need to consider alternate approaches such as aggregating the data into
longer time blocks or using kmeans.



Or, to show how this can be calculated:
you need to calculate 365000 * (365000-1) / 2 = 
66612317500 distances and, with 8 bytes each, you need 66612317500 
* 8 = 532898540000 bytes = 532898540000 / (1024)^3 GB ~= 496.3 GB to 
store them in memory.
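
The same arithmetic in R, as a quick check before calling dist() on a large matrix. The comparison with .Machine$integer.max is added here as a plausible explanation for the original "negative length vectors" message, not something confirmed in the thread.

```r
n <- 365000                 # number of rows in the data matrix

n_dist <- n * (n - 1) / 2   # number of pairwise distances
bytes  <- n_dist * 8        # 8 bytes per double
gib    <- bytes / 1024^3

print(n_dist)
print(round(gib, 1))

# The number of distances is far beyond what a 32-bit integer length can hold.
too_long <- n_dist > .Machine$integer.max
print(too_long)
```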


Best,
Uwe Ligges






-
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of HJ YAN
Sent: Thursday, May 2, 2013 6:02 PM
To: r-help@r-project.org
Subject: [R] Calculating distance matrix for large dataset

Dear R users


I wondered if any of you ever tried to calculate distance matrix with very
large data set, and if anyone out there can confirm this error message I got
actually mean that my data is too large for this task.

negative length vectors are not allowed


My data size and code used

> dim(mydata_nor)
[1] 365000    144
> d <- dist(mydata_nor, method = "euclidean")



Here my data has 1000 samples each has a year data observed by 10 minutes
interval daily, so the size is  (365* 1000) * 144.


I checked the manual of function 'dist' but can not see the upper limit size
allowed, and I bet there should be one, so any hints is appreciated.


I would also be grateful if any other method for calculating distance matrix
for large dataset could be advised.



I appreciate reproducible code should be provided for your advice, so try
below if needed:

> A<-matrix(1:365000*144,nrow=365000,ncol=144)
> dim(A)
[1] 365000    144
> d1<-dist(A,method="euclidean")
Error in dist(A, method = "euclidean") :
   negative length vectors are not allowed




Many thanks in advance!

HJ

[[alternative HTML version deleted]]


[R] Empirica Copula

2013-05-03 Thread T P Kharel

Dear users
I am reposting this and hope it will be accepted this time.

I am using copula package to fit my bivariate data and simulation. As
explained in package documentation we can use our own data distribution to
feed on copula as long as we have d, p and q (pdf, cdf and quantile)
functions are available.  Hence my code for those are:

# Make the functions for data distribution
dSAR<-function(SAR){dexp(SAR, rate=0.5)}
pSAR<-function(SAR){pexp(SAR, rate=0.5)}
qSAR<-function(SAR){qexp(c(seq(0,1, .01)),SAR, rate=0.5)}


dper<-function(per) {dexp(per,rate=0.5)}
pper<-function(per){pexp(per,rate=0.5)}
qper<-function(per){qexp(c(seq(0,1,.01)),per, rate=0.5)}

gmb<-gumbelCopula(3,dim=2) # create bivariate copula object with dim=2

#tau(gmb)
## construct a bivariate distribution with defined marginals
 myCDF<- mvdc(gmb, margins=c("exp","exp"),
paramMargins=list(list(rate=0.5),list(rate=0.5)))

# Use own data for bivariate CDF construction
myCDF2<- mvdc(gmb, margins=c("SAR","per"),
paramMargins=list(list(rate=.5),list(rate=.5)))

# Generate (bivariate) random numbers from that, and visualize
x <- rMvdc(1000, myCDF2)

And I get this error message every time:
> x <- rMvdc(1000, myCDF2)
Error in qSAR(x, rate = 0.5) : unused argument(s) (rate = 0.5)

It works fine with myCDF and generates bivariate data:
x <- rMvdc(1000, myCDF)

But my problem is simulated data (using myCDF) does not show the same
relationship as in original data.  Hence I want to use my own empirical
distribution (myCDF2) to simulate data.  It looks like it is not taking the
quantile function, qSAR. Is there any other way I can define my data
distribution and feed  to copula ?   Thanks for help.
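
The error text suggests that the quantile function is being called as qSAR(x, rate = 0.5), that is, with the probabilities as the first argument and each paramMargins entry passed by name. Under that assumption, margin functions would need signatures like the ones sketched here. The signatures are a guess based on the error message, and these functions still wrap the exponential distribution; a truly empirical margin would need its own d/p/q definitions.

```r
# Margin functions: value first, distribution parameter by name.
dSAR <- function(x, rate) dexp(x, rate = rate)
pSAR <- function(q, rate) pexp(q, rate = rate)
qSAR <- function(p, rate) qexp(p, rate = rate)

# Mimic the call shown in the error message: qSAR(x, rate = 0.5).
u <- c(0.1, 0.5, 0.9)
x <- do.call("qSAR", c(list(u), list(rate = 0.5)))

# The distribution function should map the quantiles back to u.
round_trip <- pSAR(x, rate = 0.5)
print(cbind(u, x, round_trip))
```

With definitions of this form, and the same for dper, pper and qper, the mvdc() call with margins = c("SAR", "per") should at least no longer fail on the unused rate argument.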

[[alternative HTML version deleted]]


Re: [R] R does not subset

2013-05-03 Thread Mihai Nica

This is great! Thank you so much for taking the time to give us this hint.

mike


>
> From: Jeff Newmiller 
>To:Mihai Nica ; Mihai Nica ; Jorge I 
>Velez ; Katarzyna Kulma  
>Cc: R mailing list  
>Sent: Friday, May 3, 2013 8:16 AM
>Subject: Re: [R] R does not subset
> 
>
>This typically occurs because of sloppy manual data entry outside of R. To 
>relieve further analysis pain, you can manually clean the data (usually only 
>effective for one-time analyses) or use R to fix problems right after loading 
>the data (there are multiple methods for doing this... I prefer using ?sub on 
>character data before creating the factor).
>---
>Jeff Newmiller                        The     .       .  Go Live...
>DCN:        Basics: ##.#.       ##.#.  Live Go...
>                                      Live:   OO#.. Dead: OO#..  Playing
>Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
>/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
>---
>Sent from my phone. Please excuse my brevity.
>
>Mihai Nica  wrote:
>
>>Hi:
>>
>>"(note the space after "Infected")"
>>
>>Since I lost a morning too with this issue, I am just curious, why is
>>there a space?
>>
>>I know, it must be a dumb question, a reasonable programming rule, but
>>that's my level :-)
>>
>>mike
>>
>>
>>>
>>> From: Jorge I Velez 
>>>To:Katarzyna Kulma  
>>>Cc: R mailing list  
>>>Sent: Friday, May 3, 2013 6:01 AM
>>>Subject: Re: [R] R does not subset
>>> 
>>>
>>>Hi Kasia,
>>>
>>>You need
>>>
>>>subset(REC2, INFECTION=="Infected ")
>>>
>>>(note the space after "Infected").
>>>
>>>HTH,
>>>Jorge.-
>>>
>>>
>>>On Fri, May 3, 2013 at 7:48 PM, Katarzyna Kulma
>>>wrote:
>>>
 Hi everyone,

 I know there have been several requests regarding subsetting before, but
 none of them really helps with my problem:

 I'm trying to subset only infected individuals from the REC2 data.frame:

 > str(REC2)
 'data.frame':    362 obs. of  7 variables:
  $ RINGNO   : Factor w/ 370 levels "BL17546","BL17577",..: 78 81 67 41 58 66 17
  $ year     : Factor w/ 8 levels "Y2002","Y2003",..: 1 2 1 2 1 1 2 1 1 3 ...
  $ ccFLEDGE : int  6 6 6 5 6 7 6 7 6 5 ...
  $ rec2012  : int  2 1 2 2 1 2 1 1 1 0 ...
  $ binage   : Factor w/ 2 levels "ad","juv": 1 2 1 1 1 1 1 1 1 1 ...
  $ INFECTION: Factor w/ 2 levels "Infected ","Uninfected ": 2 1 2 1 2 2 1 2 2 1 ...
  $ all.rsLD : num  -4.62 -6.19 -3.62 -4.19 -2.62 ...

 using either

 RECinf<-REC2[which (REC2$INFECTION=="Infected"),]

 or

 RECinf<-subset(REC2, INFECTION=="Infected")

 in both cases I get empty data frame (0 observations):

 > str(RECinf)
 'data.frame':    0 obs. of  7 variables:
  $ RINGNO   : Factor w/ 370 levels "BL17546","BL17577",..:
  $ year     : Factor w/ 8 levels "Y2002","Y2003",..:
  $ ccFLEDGE : int
  $ rec2012  : int
  $ binage   : Factor w/ 2 levels "ad","juv":
  $ INFECTION: Factor w/ 2 levels "Infected ","Uninfected ":
  $ all.rsLD : num

 When subsetting, R doesn't return any warning or error message. Besides, I
 used the same code many times before and it worked perfectly well. Any ideas
 why this case is different?

 Thanks for your help,
 Kasia

ï¿½ ï¿½ ï¿½ ï¿½Â  [[alternative HTML version deleted]]


>>>
>>>
>>>
>>>
>>Â Â Â  [[alternative HTML version deleted]]
>>
>>
>>
>>
>>
>>__
>>R-help@r-project.org mailing list
>>https://stat.ethz.ch/mailman/listinfo/r-help
>>PLEASE do read the posting guide
>>http://www.R-project.org/posting-guide.html
>>and provide commented, minimal, self-contained, reproducible code.
>
>
>
>
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.c

Re: [R] Create and read symbolic links in Windows

2013-05-03 Thread Santosh

Thanks for your suggestion... I upgraded to R 3.0.0 in a 64-bit Windows 7
environment.

This time when I use file.link,
I get the following error message: "Cannot create a file when that file
already exists"
and I don't see the link.

The other function, file.copy, correctly copies to the target location.

Still confused by the error messages...

Thanks,
Santosh


On Thu, May 2, 2013 at 11:42 PM, Prof Brian Ripley wrote:

> On 03/05/2013 07:33, Santosh wrote:
>
>> Thanks for the suggestions. In windows (Windows 7, 64-bit), I couldn't
>> get "file.symlink" to work, but "file.link" did return the result to be
>> "TRUE" but at the target location, I did not see any link.
>>
>> Not sure I am missing anything more.. Hope it's nothing to do with
>> administrator accounts and administrator rights... Is it something I
>> should check with my system administrator?
>>
>
> You may need to update your R: although the posting guide asked you to do
> that before posting.  There was a relevant bug fix in 2.15.3.
>
>
>> Thanks,
>> Santosh
>>
>>
>> On Thu, May 2, 2013 at 12:22 PM, Prof Brian Ripley
>> <rip...@stats.ox.ac.uk> wrote:
>>
>> On 02/05/2013 19:50, Santosh wrote:
>>
>> Dear Rxperts..
>> Got a couple of quick q's..
>> I am using R in windows environment (both 32-bit and 64-bit)
>> a) Is there a way to create symbolic links to some data files?
>>
>>
>> See ?file.symlink.  ??'symbolic link' should have got you there.
>>
>> Note that this is not very useful for files, but that is a Windows
>> and not an R restriction.
>>
>>
>>  > b) How do I read data from symbolic links?
>>
>> The same ways you read data from files.
>>
>>
>> Thanks so much..
>> Santosh
>>
>>
>>
>> --
>> Brian D. Ripley, rip...@stats.ox.ac.uk
>> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
>> University of Oxford, Tel: +44 1865 272861 (self)
>> 1 South Parks Road, +44 1865 272866 (PA)
>> Oxford OX1 3TG, UK  Fax: +44 1865 272595
>>
>
> --
> Brian D. Ripley,  rip...@stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel:  +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK  Fax:  +44 1865 272595
>


Re: [R] Create and read symbolic links in Windows

2013-05-03 Thread Santosh

Just got it right; please ignore the previous posting...

It worked!
 Prof Ripley made my day!! :) THANK YOU!
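
For readers landing on this thread, here is a minimal sketch of the calls discussed, run in a scratch directory. The file names are made up. Whether file.symlink() succeeds on Windows depends on the R version and on account privileges, as the thread suggests, so the sketch falls back to a hard link via file.link(). Neither function overwrites an existing `to` path, which is one possible (unconfirmed) reading of the "already exists" message quoted below.

```r
# Hypothetical file names inside a scratch directory.
dir <- tempfile("linkdemo")
dir.create(dir)
target <- file.path(dir, "data.csv")
link   <- file.path(dir, "data_link.csv")

write.csv(data.frame(x = 1:3), target, row.names = FALSE)

# 'to' must not exist yet; both calls return FALSE (with a warning) otherwise.
ok <- suppressWarnings(file.symlink(target, link))
if (!ok) ok <- suppressWarnings(file.link(target, link))

# A link is read exactly like a regular file.
dat <- if (ok) read.csv(link) else read.csv(target)
```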


On Fri, May 3, 2013 at 11:23 AM, Santosh  wrote:

> Thanks for your suggestion... I upgraded to R 3.0.0 in a 64-bit Windows 7
> environment.
>
> This time when I use file.link,
> I get the following error message: "Cannot create a file when that file
> already exists"
> and I don't see the link.
>
> The other function, file.copy, correctly copies to the target location.
>
> Still confused by the error messages...
>
> Thanks,
> Santosh
>
>
> On Thu, May 2, 2013 at 11:42 PM, Prof Brian Ripley 
> wrote:
>
>> On 03/05/2013 07:33, Santosh wrote:
>>
>>> Thanks for the suggestions. In windows (Windows 7, 64-bit), I couldn't
>>> get "file.symlink" to work, but "file.link" did return the result to be
>>> "TRUE" but at the target location, I did not see any link.
>>>
>>> Not sure I am missing anything more.. Hope it's nothing to do with
>>> administrator accounts and administrator rights... Is it something I
>>> should check with my system administrator?
>>>
>>
>> You may need to update your R: although the posting guide asked you to do
>> that before posting.  There was a relevant bug fix in 2.15.3.
>>
>>
>>> Thanks,
>>> Santosh
>>>
>>>
>>> On Thu, May 2, 2013 at 12:22 PM, Prof Brian Ripley
>>> <rip...@stats.ox.ac.uk> wrote:
>>>
>>> On 02/05/2013 19:50, Santosh wrote:
>>>
>>> Dear Rxperts..
>>> Got a couple of quick q's..
>>> I am using R in windows environment (both 32-bit and 64-bit)
>>> a) Is there a way to create symbolic links to some data files?
>>>
>>>
>>> See ?file.symlink.  ??'symbolic link' should have got you there.
>>>
>>> Note that this is not very useful for files, but that is a Windows
>>> and not an R restriction.
>>>
>>>
>>>  > b) How do I read data from symbolic links?
>>>
>>> The same ways you read data from files.
>>>
>>>
>>> Thanks so much..
>>> Santosh
>>>
>>>
>>>
>>> --
>>> Brian D. Ripley, rip...@stats.ox.ac.uk
>>> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
>>> University of Oxford, Tel: +44 1865 272861 (self)
>>> 1 South Parks Road, +44 1865 272866 (PA)
>>> Oxford OX1 3TG, UK  Fax: +44 1865 272595
>>>
>>
>> --
>> Brian D. Ripley,  rip...@stats.ox.ac.uk
>> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
>> University of Oxford, Tel:  +44 1865 272861 (self)
>> 1 South Parks Road, +44 1865 272866 (PA)
>> Oxford OX1 3TG, UK  Fax:  +44 1865 272595
>>
>
>


Re: [R] Why can't R understand if(num!=NA)?

2013-05-03 Thread David Winsemius

> 
> On May 3, 2013, at 17:24 , jpm miao wrote:
> 
>> I have a program, when I write
>> 
>> if(num!=NA)
>> 
> snipped

On May 3, 2013, at 10:46 AM, peter dalgaard wrote:

> Because comparison with an unknown value yields an unknown result. 

Anything else would violate the Second Law of Thermodynamics. We cannot have 
comparisons reducing entropy, now can we? Uncertainty cannot run uphill.

-- 

David Winsemius
Alameda, CA, USA


[R] Fortune candidate! Re: Why can't R understand if(num!=NA)?

2013-05-03 Thread Sarah Goslee

On Fri, May 3, 2013 at 3:36 PM, David Winsemius  wrote:
>>
>> On May 3, 2013, at 17:24 , jpm miao wrote:
>>
>>> I have a program, when I write
>>>
>>> if(num!=NA)
>>>
>> snipped
>
> On May 3, 2013, at 10:46 AM, peter dalgaard wrote:
>
>> Because comparison with an unknown value yields an unknown result.
>
> Anything else would violate the Second Law of Thermodynamics. We cannot have 
> comparisons reducing entropy, now can we? Uncertainty cannot run uphill.
>
> --
>
> David Winsemius
> Alameda, CA, USA
>


[R] (no subject)

2013-05-03 Thread Tien trung Dinh

Hi.

After I installed the R 3.0.0.pkg for Mac, when I click the R icon to start
it up, I receive a message in red informing me that something is wrong, but I
do not know how to fix it.
R version 3.0.0 (2013-04-03) -- "Masked Marvel"
Copyright (C) 2013 The R Foundation for Statistical Computing
Platform: x86_64-apple-darwin10.8.0 (64-bit)

"R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

During startup - Warning messages:
1: Setting LC_CTYPE failed, using "C" 
2: Setting LC_COLLATE failed, using "C" 
3: Setting LC_TIME failed, using "C" 
4: Setting LC_MESSAGES failed, using "C" 
5: Setting LC_PAPER failed, using "C" 
[R.app GUI 1.60 (6476) x86_64-apple-darwin10.8.0]

WARNING: You're using a non-UTF8 locale, therefore only ASCII characters will 
work.
Please read R for Mac OS X FAQ (see Help) section 9 and adjust your system 
preferences accordingly.
[History restored from /Users/dinhtientrung/.Rapp.history]

starting httpd help server ... done
> 
"

Would you mind sharing your experience with this situation, please?

Thank you so much.

I hope to hear from you soon.

Trung

Re: [R] Why can't R understand if(num!=NA)?

2013-05-03 Thread David Winsemius

On May 3, 2013, at 10:46 AM, peter dalgaard wrote:

> 
> On May 3, 2013, at 17:24 , jpm miao wrote:
> 
>> I have a program, when I write
>> 
>> if(num!=NA)
>> 
>> it yields an error message.
>> 
>> However, if I write
>> 
>> if(is.na(num)==FALSE)
>> 
>> it works.
>> 
>> Why doesn't the first statement work?
> 
> 
> Because comparison with an unknown value yields an unknown result. 

Anything else would violate the Second Law of Thermodynamics. We cannot have 
comparisons reducing entropy, now can we? Uncertainty cannot run uphill.
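
In more concrete terms, a small sketch with an arbitrary value for `num`: the comparison with NA evaluates to NA, if() cannot act on NA and raises an error, and is.na() is the test that always returns TRUE or FALSE.

```r
num <- 3

# Comparing with NA gives NA, never TRUE or FALSE.
cmp <- num != NA

# if() needs a single TRUE/FALSE, so an NA condition is an error.
# tryCatch() captures the error message instead of stopping the script.
msg <- tryCatch(
  if (num != NA) "not missing",
  error = function(e) conditionMessage(e)
)

# is.na() always returns TRUE or FALSE, so this form is safe.
safe <- if (!is.na(num)) "not missing" else "missing"
```

Writing `!is.na(num)` is the more idiomatic spelling of `is.na(num)==FALSE` from the original question.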

-- 
David Winsemius
Alameda, CA, USA


Re: [R] (no subject)

2013-05-03 Thread David Winsemius


On May 3, 2013, at 9:44 AM, Tien trung Dinh wrote:

> Hi.
> 
> After I installed the R 3.0.0.pkg for Mac, when I click the R icon to start 
> it up, I receive a message in red informing me that something is wrong, but I 
> do not know how to fix it.
> R version 3.0.0 (2013-04-03) -- "Masked Marvel"
> Copyright (C) 2013 The R Foundation for Statistical Computing
> Platform: x86_64-apple-darwin10.8.0 (64-bit)
> 
> "R is free software and comes with ABSOLUTELY NO WARRANTY.
> You are welcome to redistribute it under certain conditions.
> Type 'license()' or 'licence()' for distribution details.
> 
>   Natural language support but running in an English locale
> 
> R is a collaborative project with many contributors.
> Type 'contributors()' for more information and
> 'citation()' on how to cite R or R packages in publications.
> 
> Type 'demo()' for some demos, 'help()' for on-line help, or
> 'help.start()' for an HTML browser interface to help.
> Type 'q()' to quit R.
> 
> During startup - Warning messages:
> 1: Setting LC_CTYPE failed, using "C" 
> 2: Setting LC_COLLATE failed, using "C" 
> 3: Setting LC_TIME failed, using "C" 
> 4: Setting LC_MESSAGES failed, using "C" 
> 5: Setting LC_PAPER failed, using "C" 
> [R.app GUI 1.60 (6476) x86_64-apple-darwin10.8.0]
> 
> WARNING: You're using a non-UTF8 locale, therefore only ASCII characters will 
> work.
> Please read R for Mac OS X FAQ (see Help) section 9 and adjust your system 
> preferences accordingly.
> [History restored from /Users/dinhtientrung/.Rapp.history]
> 
> starting httpd help server ... done
>>  
> "
> 
> Would you mind sharing your experience with this situation, please?

Why have you stopped at this point?  (My experiences have been quite good with 
following advice.)  You have been given a very specific warning (not an error). 
It is telling you where to find additional information. It is your 
responsibility to educate yourself further. The document referred to can be 
found by pulling down the Help menu (while running R.app)  and choosing "R 
Help".
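
The warnings say that R could not set a UTF-8 locale at startup. A small sketch for inspecting the situation from within R follows. The locale name "en_US.UTF-8" is an assumption and must exist on the system, and the change only lasts for the current session; the FAQ section named in the warning describes the permanent fix.

```r
# Show what R is currently using; "C" here matches the startup warnings.
current <- Sys.getlocale("LC_CTYPE")
info    <- l10n_info()   # info$`UTF-8` is TRUE in a UTF-8 locale

# Try a UTF-8 locale for this session only. The name is an assumption;
# Sys.setlocale() returns "" (with a warning) if the system lacks it.
res <- suppressWarnings(Sys.setlocale("LC_CTYPE", "en_US.UTF-8"))
if (identical(res, "")) message("Locale not available; see the FAQ.")
```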

-- 

David Winsemius
Alameda, CA, USA


[R] color by group in ggplot

2013-05-03 Thread Ye Lin

Hey,

I have a dataset like this:

ID  Var1  Var2  Group
A1  1     1     BB
A2  1     2     AA
B1  2     1     CC
B2  1     3     DD
C1  1     2     EE

I would like to plot the points of Var1 and Var2, use "ID" as X-axis, but
color the points by "Group". I can only manage to color the points by "ID"
after transforming the dataset to "tall" format using the "reshape" package.

Thanks for your help!


Re: [R] selecting certain rows from data frame

2013-05-03 Thread arun



Hi,
You can use ?split()
lst1<-split(DF,DF$ID)
lst1[1:2]
#$`1`
#  ID  drugs month
#1  1 drug x     1
#4  1 drug x     1
#5  1 drug y     2
#6  1 drug z     3
#
#$`2`
#  ID  drugs month
#2  2 drug y     2
#7  2 drug x     1

mean(sapply(lst1,nrow))
#[1] 2.4
#or
library(plyr)
mean(ddply(DF,.(ID),nrow)[,2])
#[1] 2.4
#or
mean(with(DF,tapply(ID,ID,FUN=length)))
#[1] 2.4
A.K.
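
Since DF itself is not shown in this message, here is a self-contained sketch of the same count using only base R. The data frame is made up: it has 12 rows for 5 people with the per-ID row counts listed in the question below (4, 2, 3, 2, 1), and the drug and month values are placeholders.

```r
# Made-up data: 12 rows for 5 people, matching the counts in the question.
DF <- data.frame(
  ID    = c(1, 2, 3, 1, 1, 1, 2, 3, 3, 4, 4, 5),
  drugs = c("drug x", "drug y", "drug x", "drug x", "drug y", "drug z",
            "drug x", "drug y", "drug z", "drug x", "drug y", "drug x"),
  month = c(1, 2, 1, 1, 2, 3, 1, 2, 3, 1, 2, 1)
)

rows_per_id <- table(DF$ID)                 # number of rows (drugs) per ID
mean_rows   <- mean(as.vector(rows_per_id)) # average rows per person
```

The same average is also nrow(DF) / length(unique(DF$ID)).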





From: Sarah Jo Sinnott <105405...@umail.ucc.ie>
To: arun  
Sent: Friday, May 3, 2013 4:35 PM
Subject: Re: selecting certain rows from data frame



Yes - but if I can count the number of rows for each ID, this equates to number 
of drugs per each ID. So that way I can get a mean #rows(drugs). 

e.g., 

ID 1 = 4 rows (approx=4drugs)
ID2= 2 rows
ID 3 = 3 rows
ID 4 = 2 rows
ID 5 = 1 row

12 rows/5people = 2.4rows/person

that is 2.4 drugs per person. 

Do you think it is possible to isolate the number of rows per unique ID? It 
would be great if you could! I've tried reorganising my data into wide format, 
but it doesn't work very well, so I'm left with this option really!

Thank you for your help thus far


Re: [R] color by group in ggplot

2013-05-03 Thread David Winsemius

On May 3, 2013, at 1:37 PM, Ye Lin wrote:

> Hey,
> 
> I have a dataset like this:
> 
> ID  Var1  Var2  Group
> A1  1     1     BB
> A2  1     2     AA
> B1  2     1     CC
> B2  1     3     DD
> C1  1     2     EE
> 
> I would like to plot the points of Var1 and Var2, use "ID" as X-axis, but
> color the points by "Group". I can only manage to color the points by "ID"
> after transform the dataset to "tall" using "reshape" package.

If I were given the task of designing a plotting system that would "decide" 
what to do with a categorical "x-axis" request, it would probably deliver a 
barplot. My guess is that you do not want that. But what do you mean by a 
"point" whose x-value is "A1"?

-- 

David Winsemius
Alameda, CA, USA


Re: [R] color by group in ggplot

2013-05-03 Thread Ye Lin

I want to plot the values of "Var1" and "Var2" on the same plot, with
x-axis labeling as the list of IDs. But I want to color the points by
their category in "Group". Is it possible to do in ggplot, or do I have to
plot from scratch using base graphics?


On Fri, May 3, 2013 at 1:49 PM, David Winsemius wrote:

>
> On May 3, 2013, at 1:37 PM, Ye Lin wrote:
>
> > Hey,
> >
> > I have a dataset like this:
> >
> > ID  Var1  Var2  Group
> > A1  1     1     BB
> > A2  1     2     AA
> > B1  2     1     CC
> > B2  1     3     DD
> > C1  1     2     EE
> >
> > I would like to plot the points of Var1 and Var2, use "ID" as X-axis, but
> > color the points by "Group". I can only manage to color the points by
> "ID"
> > after transform the dataset to "tall" using "reshape" package.
>
> If I were given the task of designing a plotting system that would
> "decide" what to do with a categorical "x-axis" request, it would probably
> deliver a barplot. My guess is that you do not want that. But what do you
> mean by a "point" whose x-value is "A1"?
>
> --
>
> David Winsemius
> Alameda, CA, USA
>
>

Re: [R] color by group in ggplot

2013-05-03 Thread arun

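Hi,
Maybe this helps:

# Read the example data (whitespace-separated, with a header row)
dat1<- read.table(text="
ID    Var1  Var2    Group
A1    1    1    BB
A2    1    2    AA
B1    2    1    CC
B2    1    3    DD
C1    1    2    EE
",sep="",header=TRUE)
# Reshape to long format: one row per ID/Group/variable combination
library(reshape2)
dat2<-melt(dat1,id.vars=c("ID","Group"))
# Plot both variables against ID, coloured by Group
library(ggplot2)
ggplot(dat2,aes(x=ID,y=value,group=Group,colour=Group))+geom_point()
A.K.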



- Original Message -
From: Ye Lin 
To: R help 
Cc: 
Sent: Friday, May 3, 2013 4:37 PM
Subject: [R] color by group in ggplot

Hey,

I have a dataset like this:

ID  Var1  Var2  Group
A1  1     1     BB
A2  1     2     AA
B1  2     1     CC
B2  1     3     DD
C1  1     2     EE

I would like to plot the points of Var1 and Var2, use "ID" as X-axis, but
color the points by "Group". I can only manage to color the points by "ID"
after transform the dataset to "tall" using "reshape" package.

Thanks for your help!

    [[alternative HTML version deleted]]


Re: [R] color by group in ggplot

2013-05-03 Thread Ye Lin

Thanks, A.K.
I also added "shape=variable" so that it is much easier to tell the two
variables apart by color and shape.
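
A sketch of what that combined call might look like, using the example data from this thread. To keep it self-contained the long format is built by hand in base R rather than with melt(), and the plotting step only runs if ggplot2 is installed; the point size is an arbitrary choice.

```r
dat1 <- data.frame(
  ID    = c("A1", "A2", "B1", "B2", "C1"),
  Var1  = c(1, 1, 2, 1, 1),
  Var2  = c(1, 2, 1, 3, 2),
  Group = c("BB", "AA", "CC", "DD", "EE")
)

# Long format built by hand; reshape2::melt(dat1, id.vars = c("ID", "Group"))
# produces equivalent ID, Group, variable and value columns.
dat2 <- data.frame(
  ID       = rep(dat1$ID, 2),
  Group    = rep(dat1$Group, 2),
  variable = rep(c("Var1", "Var2"), each = nrow(dat1)),
  value    = c(dat1$Var1, dat1$Var2)
)

# Colour encodes Group, point shape encodes Var1 versus Var2.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(dat2, aes(x = ID, y = value, colour = Group, shape = variable)) +
    geom_point(size = 3)
  print(p)
}
```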



On Fri, May 3, 2013 at 2:14 PM, arun  wrote:

> HI,
> May be this helps:
>
> dat1<- read.table(text="
> ID    Var1  Var2    Group
> A1    1    1    BB
> A2    1    2    AA
> B1    2    1    CC
> B2    1    3    DD
> C1    1    2    EE
> ",sep="",header=TRUE)
> library(reshape2)
> dat2<-melt(dat1,id.var=c("ID","Group"))
> library(ggplot2)
> ggplot(dat2,aes(x=ID,y=value,group=Group,colour=Group))+geom_point()
> A.K.
>
>
>
> - Original Message -
> From: Ye Lin 
> To: R help 
> Cc:
> Sent: Friday, May 3, 2013 4:37 PM
> Subject: [R] color by group in ggplot
>
> Hey,
>
> I have a dataset like this:
>
> ID  Var1  Var2  Group
> A1  1     1     BB
> A2  1     2     AA
> B1  2     1     CC
> B2  1     3     DD
> C1  1     2     EE
>
> I would like to plot the points of Var1 and Var2, use "ID" as X-axis, but
> color the points by "Group". I can only manage to color the points by "ID"
> after transform the dataset to "tall" using "reshape" package.
>
> Thanks for your help!
>

Re: [R] Vector allocation problem while trying to plot 6 MB data file

2013-05-03 Thread Uwe Ligges




On 02.05.2013 14:37, Ramon Hofer wrote:

Hi all

I'm trying to analyse the network speed and used iperf to create a csv
file containing the link test data. It's only about 6 MB big but
contains about 40'000 samples.

I can do boxplots (apart from printing the number of samples but I ask
separately for that).

To find the behaviour over time I wanted to plot the throughput. So I
have this command:

plot(A$Timestamp, A$Bandwidth.bit.sec., xlab = "Timestamp", ylab =
"Bandwidth [bit/s]", ylim = quantile(A$Bandwidth.bit.sec., c(0, .99),
na.rm = TRUE))

Unfortunately I get this:
Error: cannot allocate vector of size 12.5 Gb


40'000 samples and 6 MB can't be the issue unless this is not a regular 
plot but the classes of A$Timestamp or A$Bandwidth.bit.sec. are rather 
special.


What do
str(A$Timestamp)
str(A$Bandwidth.bit.sec.)
tell us?

Can you make a reproducible example available?

Best,
Uwe Ligges
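
A sketch of the check being asked for, on a tiny made-up data frame. It assumes, and this is not confirmed anywhere in the thread, that the timestamps are stored as YYYYMMDDhhmmss strings and were read in as a factor; converting them to POSIXct gives plot() an ordinary time axis. The three rows and their values are hypothetical.

```r
# Hypothetical three-row stand-in for the iperf data.
A <- data.frame(
  Timestamp = c("20130502143700", "20130502143701", "20130502143702"),
  Bandwidth.bit.sec. = c(9.4e8, 9.1e8, 9.3e8),
  stringsAsFactors = TRUE
)

str(A$Timestamp)            # a factor here, not a number or a date-time
str(A$Bandwidth.bit.sec.)   # num

# Convert via character so the factor codes are not used by mistake.
A$Time <- as.POSIXct(as.character(A$Timestamp),
                     format = "%Y%m%d%H%M%S", tz = "UTC")

plot(A$Time, A$Bandwidth.bit.sec.,
     xlab = "Timestamp", ylab = "Bandwidth [bit/s]")
```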





Is there a way around this problem or will I have to split the data?


Best
Ramon


Re: [R] Write date class as number of days from 1970

2013-05-03 Thread Uwe Ligges




On 03.05.2013 15:59, Manta wrote:

Dear all,

I have a dataset with one column being of class Date. When I write the
output, I would like that column being written as number of days from
1970-01-01. I could not find anywhere a way to do it.



as.numeric(x)

where x is the Date object.

Uwe Ligges
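
A short sketch of that conversion applied before writing a file. The data frame, its column names and the temporary output path are made up for illustration.

```r
x <- as.Date(c("1970-01-01", "1970-01-02", "2013-05-03"))
days <- as.numeric(x)          # 0, 1, 15828

# Convert the column before writing, so the file holds plain numbers.
dat <- data.frame(id = 1:3, when = x)   # hypothetical data
dat$when <- as.numeric(dat$when)

out <- tempfile(fileext = ".csv")
write.csv(dat, out, row.names = FALSE)
back <- read.csv(out)          # 'when' comes back as day counts
```

as.Date(back$when, origin = "1970-01-01") turns the numbers back into dates.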




Thanks,
Marco



--
View this message in context: 
http://r.789695.n4.nabble.com/Write-date-class-as-number-of-days-from-1970-tp4666155.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] R CMD building SPEEDY

2013-05-03 Thread Uwe Ligges




On 02.05.2013 05:10, ren_az wrote:

Hello everyone,

I get the following warning when building my R package with R-3.0.0.



building 'SPEEDY.tar.gz' Warning in utils::tar(filepath, pkgname,
compression = "gzip", compression_level = 9L, : number of items to replace
is not a multiple of replacement length thanks Michael



I have no idea what causes this. Can you help me?


Can you show us the package?
I am not able to generate a problem like this one with R-3.0.0, i.e. 
that R tries to create a file called "SPEEDY.tar.gz" without version number.


Best,
Uwe Ligges




Best regards

Ren Aizhen

r...@bi.cs.titech.ac.jp

Tokyo Institute of Technology, Graduate School of Information Science and Engineering, Department of Computer Science, Akiyama Laboratory (Building 8, Room E507)

Tel:03-5734-3645, Fax:03-5734-3646


-





Re: [R] color by group in ggplot

2013-05-03 Thread David Winsemius


On May 3, 2013, at 1:57 PM, Ye Lin wrote:

> I want to plot the values of "Var1" and "Var2" on the same plot, with x-axis 
> labeling as the list of IDs. Sth like this:
> 
> 
>  But I want to color the points based on the category in "Group", I dont know 
> how to do it with ggplot.

You didn't say what class the ID variable was, but if it were a factor ( as is 
most likely), then:

plot(  as.numeric(dfrm$ID), dfrm$Var1)
points( as.numeric(dfrm$ID), dfrm$Var2) 

With whatever means of distinguishing overlapping points (pch, col, jittering)  
might suit you.
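
A fuller base-graphics sketch of this idea, using the example data from the thread and assuming ID and Group are factors. The plotting symbols, the colour mapping (factor codes into the default palette) and the legend position are arbitrary choices, not something specified in the thread.

```r
dfrm <- data.frame(
  ID    = factor(c("A1", "A2", "B1", "B2", "C1")),
  Var1  = c(1, 1, 2, 1, 1),
  Var2  = c(1, 2, 1, 3, 2),
  Group = factor(c("BB", "AA", "CC", "DD", "EE"))
)

xs   <- as.numeric(dfrm$ID)       # factor codes 1..5 as x positions
cols <- as.integer(dfrm$Group)    # one palette colour per Group level

# Var1 as filled circles, Var2 as filled triangles, both coloured by Group.
plot(xs, dfrm$Var1, col = cols, pch = 16, xaxt = "n",
     xlab = "ID", ylab = "value", ylim = range(dfrm$Var1, dfrm$Var2))
points(xs, dfrm$Var2, col = cols, pch = 17)

# Label the x-axis with the ID names instead of the numeric codes.
axis(1, at = xs, labels = as.character(dfrm$ID))
legend("topleft", legend = levels(dfrm$Group),
       col = seq_along(levels(dfrm$Group)), pch = 15, title = "Group")
```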

-- 
David.

> Thanks!
> 
> 
> On Fri, May 3, 2013 at 1:49 PM, David Winsemius  
> wrote:
> 
> On May 3, 2013, at 1:37 PM, Ye Lin wrote:
> 
> > Hey,
> >
> > I have a dataset like this:
> >
> > ID  Var1  Var2  Group
> > A1  1     1     BB
> > A2  1     2     AA
> > B1  2     1     CC
> > B2  1     3     DD
> > C1  1     2     EE
> >
> > I would like to plot the points of Var1 and Var2, use "ID" as X-axis, but
> > color the points by "Group". I can only manage to color the points by "ID"
> > after transform the dataset to "tall" using "reshape" package.
> 
> If I were given the task of designing a plotting system that would "decide" 
> what to do with a categorical "x-axis" request, it would probably deliver a 
> barplot. My guess is that you do not want that. But what do you mean by a 
> "point" whose x-value is "A1"?
> 
> --
> 
> David Winsemius
> Alameda, CA, USA
> 
> 

David Winsemius
Alameda, CA, USA


[R] A problem of splitting the right screen in 3 or more independent vertical boxes:

2013-05-03 Thread Aldi Kraja


Hi,
Based on par function, I can split the screen into  two parts left and 
right.
I wish x occupies the half left screen, and all plants occupy half right 
screen, which happens right now.


But I wish the right screen, to be split in 3 or more vertical parts 
where each pair of the same type of plant, are together in its own block 
of boxplot, because each plant has its own unit of measure.
Let's say wheat is measured in ton, tomato in pound and cucumbers as 
counts. :-)


x<-rnorm(1000,mean=0,sd=1,main="Right screen")

wheat1<-rnorm(100,mean=0,sd=1)
wheat2<-rnorm(150,mean=0,sd=2)
tomatos3<-rnorm(200,mean=0,sd=3)
tomatos4<-rnorm(250,mean=0,sd=4)
cucumbers5<-rnorm(300,mean=0,sd=5)
cucumbers6<-rnorm(400,mean=0,sd=6)
par(mfrow=c(1,2))

hist(x, main="Left screen OK")

boxplot(wheat1,wheat2,tomatos3,tomatos4,cucumbers5,cucumbers6)
title ("Right screen: boxplot with plants")

Thank you in advance for any suggestions,

Aldi

--


Re: [R] A problem of splitting the right screen in 3 or more independent vertical boxes:

2013-05-03 Thread Aldi Kraja


Hmm,
I pasted a typo by mistake into my x vector.
It has to be:

x<-rnorm(1000,mean=0,sd=1)
wheat1<-rnorm(100,mean=0,sd=1)
wheat2<-rnorm(150,mean=0,sd=2)
tomatos3<-rnorm(200,mean=0,sd=3)
tomatos4<-rnorm(250,mean=0,sd=4)
cucumbers5<-rnorm(300,mean=0,sd=5)
cucumbers6<-rnorm(400,mean=0,sd=6)
par(mfrow=c(1,2))

hist(x, main="Left screen OK")

boxplot(wheat1,wheat2,tomatos3,tomatos4,cucumbers5,cucumbers6)
title ("Right screen: boxplot with plants")

Thanks,

Aldi

On 5/3/2013 4:46 PM, Aldi Kraja wrote:

Hi,
Based on par function, I can split the screen into  two parts left and 
right.
I wish x occupies the half left screen, and all plants occupy half 
right screen, which happens right now.


But I wish the right screen, to be split in 3 or more vertical parts 
where each pair of the same type of plant, are together in its own 
block of boxplot, because each plant has its own unit of measure.
Let's say wheat is measured in ton, tomato in pound and cucumbers as 
counts. :-)


x<-rnorm(1000,mean=0,sd=1,main="Right screen")

wheat1<-rnorm(100,mean=0,sd=1)
wheat2<-rnorm(150,mean=0,sd=2)
tomatos3<-rnorm(200,mean=0,sd=3)
tomatos4<-rnorm(250,mean=0,sd=4)
cucumbers5<-rnorm(300,mean=0,sd=5)
cucumbers6<-rnorm(400,mean=0,sd=6)
par(mfrow=c(1,2))

hist(x, main="Left screen OK")

boxplot(wheat1,wheat2,tomatos3,tomatos4,cucumbers5,cucumbers6)
title ("Right screen: boxplot with plants")

Thank you in advance for any suggestions,

Aldi

--


Re: [R] A problem of splitting the right screen in 3 or more independent vertical boxes:

2013-05-03 Thread Sarah Goslee

Hi Aldi,

You might want
?layout
instead.

Sarah

On Fri, May 3, 2013 at 5:54 PM, Aldi Kraja  wrote:
> Hmm,
> I had a typo paste by mistake in my x vector
> It has to be:
>
> x<-rnorm(1000,mean=0,sd=1)
> wheat1<-rnorm(100,mean=0,sd=1)
> wheat2<-rnorm(150,mean=0,sd=2)
> tomatos3<-rnorm(200,mean=0,sd=3)
> tomatos4<-rnorm(250,mean=0,sd=4)
> cucumbers5<-rnorm(300,mean=0,sd=5)
> cucumbers6<-rnorm(400,mean=0,sd=6)
> par(mfrow=c(1,2))
>
> hist(x, main="Left screen OK")
>
> boxplot(wheat1,wheat2,tomatos3,tomatos4,cucumbers5,cucumbers6)
> title ("Right screen: boxplot with plants")
>
> Thanks,
>
> Aldi
>
> On 5/3/2013 4:46 PM, Aldi Kraja wrote:
>>
>> Hi,
>> Based on par function, I can split the screen into  two parts left and
>> right.
>> I wish x occupies the half left screen, and all plants occupy half right
>> screen, which happens right now.
>>
>> But I wish the right screen, to be split in 3 or more vertical parts where
>> each pair of the same type of plant, are together in its own block of
>> boxplot, because each plant has its own unit of measure.
>> Let's say wheat is measured in ton, tomato in pound and cucumbers as
>> counts. :-)
>>
>> x<-rnorm(1000,mean=0,sd=1,main="Right screen")
>>
>> wheat1<-rnorm(100,mean=0,sd=1)
>> wheat2<-rnorm(150,mean=0,sd=2)
>> tomatos3<-rnorm(200,mean=0,sd=3)
>> tomatos4<-rnorm(250,mean=0,sd=4)
>> cucumbers5<-rnorm(300,mean=0,sd=5)
>> cucumbers6<-rnorm(400,mean=0,sd=6)
>> par(mfrow=c(1,2))
>>
>> hist(x, main="Left screen OK")
>>
>> boxplot(wheat1,wheat2,tomatos3,tomatos4,cucumbers5,cucumbers6)
>> title ("Right screen: boxplot with plants")
>>
>> Thank you in advance for any suggestions,
>>
>> Aldi
>>


-- 
Sarah Goslee
http://www.functionaldiversity.org


[R] R package for bootstrapping (comparing two quadratic regression models)

2013-05-03 Thread Elaine Kuo

Hello ,

I want to compare two quadratic regression models with non-parametric
bootstrap.
However, I do not know which R package can serve the purpose,
such as boot, rms, or bootstrap, DeltaR.
Please kindly advise and thank you.

Elaine

The two quadratic regression models are

y1=a1x^2+b1x+c1

y1= observed migration distance of butterflies

y2=a2x^2+b2x+c2

y2= predicted migration distance of butterflies (based on body mass)

x= body mass of butterflies


null hypothesis: a1=a2 and b1=b2 and c1=c2

bootstrap to test if the coefficients (a, b, c) of the y1 and the y2 model
differ
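
Any of the packages named above could do the resampling, but the idea also fits in a few lines of base R. The sketch below is only an illustration under stated assumptions: the data are simulated (no real butterfly measurements are used), the column names `x`, `y1`, `y2` and the helper `coef_diff` are hypothetical, and whole rows are resampled so that the observed and predicted distance of one butterfly stay paired.

```r
set.seed(42)                       # arbitrary seed, for reproducibility only

# Simulated stand-in data: both responses share the same true curve.
n   <- 60
x   <- runif(n, 0.1, 2)
dat <- data.frame(x  = x,
                  y1 = 3 * x^2 + 2 * x + 1 + rnorm(n, sd = 0.5),
                  y2 = 3 * x^2 + 2 * x + 1 + rnorm(n, sd = 0.5))

# Difference between the two fitted coefficient vectors
# (order: intercept c, linear b, quadratic a).
coef_diff <- function(d) {
  coef(lm(y1 ~ x + I(x^2), data = d)) - coef(lm(y2 ~ x + I(x^2), data = d))
}

# Non-parametric bootstrap: resample rows with replacement, refit both models.
B <- 500
boot_diffs <- t(replicate(B, coef_diff(dat[sample(n, replace = TRUE), ])))

# Percentile intervals for c1 - c2, b1 - b2 and a1 - a2.
ci <- apply(boot_diffs, 2, quantile, probs = c(0.025, 0.975))
ci
```

A percentile interval that excludes 0 would suggest the two models differ in that coefficient. Note that this looks at the coefficients one at a time, not at the joint null hypothesis.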


Re: [R] A problem of splitting the right screen in 3 or more independent vertical boxes:

2013-05-03 Thread David Winsemius


On May 3, 2013, at 3:21 PM, Sarah Goslee wrote:

> Hi Aldi,
> 
> You might want
> ?layout
> instead.
> 

Indeed. In particular a matrix argument might be:

matrix(c(1,2,3, 4,4,4))


> Sarah
> 
> On Fri, May 3, 2013 at 5:54 PM, Aldi Kraja  wrote:
>> Hmm,
>> I had a typo paste by mistake in my x vector
>> It has to be:
>> 
>> x<-rnorm(1000,mean=0,sd=1)
>> wheat1<-rnorm(100,mean=0,sd=1)
>> wheat2<-rnorm(150,mean=0,sd=2)
>> tomatos3<-rnorm(200,mean=0,sd=3)
>> tomatos4<-rnorm(250,mean=0,sd=4)
>> cucumbers5<-rnorm(300,mean=0,sd=5)
>> cucumbers6<-rnorm(400,mean=0,sd=6)
>> par(mfrow=c(1,2))
>> 
>> hist(x, main="Left screen OK")
>> 
>> boxplot(wheat1,wheat2,tomatos3,tomatos4,cucumbers5,cucumbers6)

I think you will need a separate call to boxplot for each grouping. The 
`boxplot` function will not be able to access the device specifications.
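
Putting the two suggestions together (layout() plus one boxplot() call per plant type) might look like the sketch below. It is an assumption-laden illustration, not the posters' code: the layout matrix is one possible choice, "vertical parts" is read here as three panels stacked in the right half, and the panel titles are made up.

```r
set.seed(1)                        # arbitrary seed, for reproducibility only
x          <- rnorm(1000, mean = 0, sd = 1)
wheat1     <- rnorm(100,  mean = 0, sd = 1)
wheat2     <- rnorm(150,  mean = 0, sd = 2)
tomatos3   <- rnorm(200,  mean = 0, sd = 3)
tomatos4   <- rnorm(250,  mean = 0, sd = 4)
cucumbers5 <- rnorm(300,  mean = 0, sd = 5)
cucumbers6 <- rnorm(400,  mean = 0, sd = 6)

# Column 1 is a single tall panel (1); column 2 holds panels 2, 3 and 4.
lay <- matrix(c(1, 1, 1,
                2, 3, 4), nrow = 3)
n_panels <- layout(lay)            # layout() returns the number of panels

hist(x, main = "Left half")
# One boxplot() call per plant type, so each panel gets its own y axis.
boxplot(wheat1, wheat2, names = c("wheat1", "wheat2"),
        main = "Wheat (ton)")
boxplot(tomatos3, tomatos4, names = c("tomatos3", "tomatos4"),
        main = "Tomatoes (pound)")
boxplot(cucumbers5, cucumbers6, names = c("cucumbers5", "cucumbers6"),
        main = "Cucumbers (count)")

layout(1)                          # back to a single panel
```

Using a one-row matrix such as matrix(1:4, nrow = 1) instead would place the three boxplot panels side by side.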


-- 
David.


>> title ("Right screen: boxplot with plants")
>> 
>> Thanks,
>> 
>> Aldi
>> 
>> On 5/3/2013 4:46 PM, Aldi Kraja wrote:
>>> 
>>> Hi,
>>> Based on par function, I can split the screen into  two parts left and
>>> right.
>>> I wish x occupies the half left screen, and all plants occupy half right
>>> screen, which happens right now.
>>> 
>>> But I wish the right screen, to be split in 3 or more vertical parts where
>>> each pair of the same type of plant, are together in its own block of
>>> boxplot, because each plant has its own unit of measure.
>>> Let's say wheat is measured in ton, tomato in pound and cucumbers as
>>> counts. :-)
>>> 
>>> x<-rnorm(1000,mean=0,sd=1,main="Right screen")
>>> 
>>> wheat1<-rnorm(100,mean=0,sd=1)
>>> wheat2<-rnorm(150,mean=0,sd=2)
>>> tomatos3<-rnorm(200,mean=0,sd=3)
>>> tomatos4<-rnorm(250,mean=0,sd=4)
>>> cucumbers5<-rnorm(300,mean=0,sd=5)
>>> cucumbers6<-rnorm(400,mean=0,sd=6)
>>> par(mfrow=c(1,2))
>>> 
>>> hist(x, main="Left screen OK")
>>> 
>>> boxplot(wheat1,wheat2,tomatos3,tomatos4,cucumbers5,cucumbers6)
>>> title ("Right screen: boxplot with plants")
>>> 
>>> Thank you in advance for any suggestions,
>>> 
>>> Aldi
>>> 
> 
> 


David Winsemius
Alameda, CA, USA


[R] how to parallelize 'apply' across multiple cores on a Mac

2013-05-03 Thread David Romano

Hi everyone,

I'm trying to use apply (with a call to zoo's rollapply within) on the
columns of a 1.5Kx165K matrix, and I'd like to make use of the other cores
on my machine to speed it up. (And hopefully also leave more memory free: I
find that after I create a big object like this, I have to save my
workspace and then close and reopen R to be able to recover memory tied up
by R, but maybe that's a separate issue -- if so, please let me know!)

It seems the package 'multicore' has a parallel version of 'lapply', which
I suppose I could combine with a 'do.call' (I think) to gather the elements
of the output list into a matrix, but I was wondering whether there might
be another route.
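
For reference, the lapply-plus-do.call route described above could be sketched as follows. This uses `mclapply` from the `parallel` package (distributed with R since 2.14.0 and based on multicore's forking approach, so it needs a Unix-alike such as a Mac). The function `process_column` is a hypothetical stand-in for the real per-column work, and the matrix is a small simulated one.

```r
library(parallel)   # provides mclapply() and detectCores()

# Hypothetical stand-in for the real per-column computation.
process_column <- function(v) v - mean(v)

set.seed(1)                                   # arbitrary seed
m <- matrix(rnorm(1500 * 20), nrow = 1500)    # small stand-in for 1.5K x 165K

# Use at most two cores here; fall back to one if the count is unknown.
n_cores <- max(1L, min(2L, detectCores(), na.rm = TRUE))

# One list element per column, computed in forked worker processes.
cols <- mclapply(seq_len(ncol(m)),
                 function(j) process_column(m[, j]),
                 mc.cores = n_cores)

# Gather the list back into a matrix with the original shape.
result <- do.call(cbind, cols)
```

Whether this saves memory is a separate question: each forked worker shares the parent's data until it writes to it, but the list of results and the final cbind() each need room of their own.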

And, in case the particular way I constructed the call to 'apply' might be
the source of the problem, here is a deconstructed version of what I did to
each column, for easier parsing:
-  begin call to 'apply'

Step 1:  Identify several disjoint subsequences of fixed length, say length
three, of a column.

column.values <- 1:16
desired.subseqs <- c( NA, NA, NA, 1, 1, 1, NA, 1, 1, 1, NA, NA, 1, 1, 1, NA )
# this vector is used for every column.
desired.values <- desired.subseqs * column.values

Step 2:  Find the average value of each subsequence.

# put mean in highest index of subsequence and retain original vector length
desired.means <- rollapply( desired.values, 3, mean, fill = NA, align = "right",
                            na.rm = FALSE )
desired.means
[1] NA NA NA NA NA 5 NA NA NA 9 NA NA NA NA 14 NA

Step 3:   Shift values forward by one index value, retaining original
vector length.

desired.means <- zoo( desired.means )  # in order to be able to use lag.zoo
desired.means <- lag( desired.means, k = -1, na.pad = TRUE)
desired.means
[1] NA NA NA NA NA NA 5 NA NA NA 9 NA NA NA NA 14

Step 4:   Use last-observation-carried-forward, retaining original vector
length.

desired.means <- na.locf( desired.means, na.rm = FALSE )
desired.means
[1] NA NA NA NA NA NA 5 5 5 5 9 9 9 9 9 14

Step 5:  Use next-observation-carried-backward to assign values to initial
sequence of NAs.

desired.means <- na.locf( desired.means, fromLast = TRUE)
desired.means
[1] 5 5 5 5 5 5 5 5 5 5 9 9 9 9 9 14

Step 6:  Convert back to vector (from zoo object), and subtract from column.

desired.column <- column.values - coredata(desired.means)
desired.column
[1] -4 -3 -2 -1 0 1 2 3 4 5 2 3 4 5 6 2
-  end call to 'apply' 
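
For comparison, steps 1 to 6 can also be written for a single column in base R, without zoo. This is only a sketch under assumptions: the function name `shift_and_center` and its `width` argument are hypothetical, and it has been checked only against the 16-element example above, not against the full matrix.

```r
shift_and_center <- function(column.values, desired.subseqs, width = 3) {
  n    <- length(column.values)
  vals <- desired.subseqs * column.values            # step 1: mask the column

  # Step 2: right-aligned rolling mean; any NA in a window gives NA.
  roll <- c(rep(NA, width - 1), rowMeans(embed(vals, width)))

  # Step 3: shift forward by one index, keeping the original length.
  shifted <- c(NA, roll[-n])

  # Step 4: last observation carried forward.
  last_seen <- cummax(ifelse(is.na(shifted), 0, seq_len(n)))
  filled    <- c(NA, shifted)[last_seen + 1]

  # Step 5: give the leading NAs the first available value.
  filled[is.na(filled)] <- filled[!is.na(filled)][1]

  # Step 6: subtract from the column.
  column.values - filled
}

column.values   <- 1:16
desired.subseqs <- c(NA, NA, NA, 1, 1, 1, NA, 1, 1, 1, NA, NA, 1, 1, 1, NA)
desired.column  <- shift_and_center(column.values, desired.subseqs)
desired.column
```

A function of this shape can be handed directly to apply() or to a parallel lapply over column indices.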

Thanks,
David


[R] how to best add columns to a matrix with many columns

2013-05-03 Thread David Romano

Hi everyone,

I have a large data frame, say df1,  with 165K columns, and all but the first
four columns of df1 are numeric.   I transformed the numeric data and
obtained a matrix, call it data.m, with 165K - 4 columns, and then tried to
create a second data frame by replacing the numeric columns of df1 by
data.m.  I did this in two ways, and both ways instantly used up all the
available memory, so I was wondering whether there was a better way to do
this.

Here's what I tried:

df2 <- df1
df2[ ,5:length(df1)] <- data.m

and

df2 <- cbind( df1[1:4], data.m)
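
One further variant, shown here on a toy data frame, is list-style replacement, which indexes the columns without the empty row index. This is a sketch only: the toy column names are made up, and whether it actually lowers memory use at 165K columns has not been measured here.

```r
# Toy stand-in: four non-numeric/id columns followed by two numeric ones.
df1 <- data.frame(id   = 1:3,
                  grp  = c("a", "b", "c"),
                  site = c("x", "y", "z"),
                  yr   = 2011:2013,
                  v1   = c(1, 2, 3),
                  v2   = c(4, 5, 6),
                  stringsAsFactors = FALSE)

# Stand-in for the transformed numeric data.
data.m <- as.matrix(df1[5:ncol(df1)]) * 10

df2 <- df1
# Single-index (list-style) replacement of the numeric columns.
df2[5:ncol(df1)] <- as.data.frame(data.m)
df2
```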

Thanks,
David


Re: [R] mean for each observation

2013-05-03 Thread arun

Hi,
Not sure I understand it correctly.

dat1<- read.table(text="
site Year doy fish Feed swim agr_1 agr_2 agr_3 rest hide
3 2012 203 1 1 0 0 0 0 0 0
3 2012 203 1 0 1 0 0 0 0 0
3 2012 203 1 0 1 0 0 0 0 0
3 2012 203 2 0 0 0 0 0 1 0
3 2012 203 2 1 0 0 0 0 0 0
3 2012 203 2 1 0 0 0 0 0 0
4 2012 197 1 0 0 0 0 0 1 0
4 2012 197 1 1 0 0 0 0 0 0
4 2012 197 1 0 1 0 0 0 0 0
4 2012 197 3 0 0 0 0 0 0 1
4 2012 197 3 1 0 0 0 0 0 0
",sep="",header=TRUE) 
dat2<-reshape(dat1,direction="long",varying=7:9,sep="_")
row.names(dat2)<- 1:nrow(dat2)
 head(dat2)
#  site Year doy fish Feed swim rest hide time agr id
#1    3 2012 203    1    1    0    0    0    1   0  1
#2    3 2012 203    1    0    1    0    0    1   0  2
#3    3 2012 203    1    0    1    0    0    1   0  3
#4    3 2012 203    2    0    0    1    0    1   0  4
#5    3 2012 203    2    1    0    0    0    1   0  5
#6    3 2012 203    2    1    0    0    0    1   0  6

library(plyr)
#fish, year, site
ddply(dat2,.(fish,Year,site),function(x) numcolwise(mean)(x[,c(5:8)])) 
#  fish Year site  Feed  swim  rest hide
#1    1 2012    3 0.333 0.667 0.000  0.0
#2    1 2012    4 0.333 0.333 0.333  0.0
#3    2 2012    3 0.667 0.000 0.333  0.0
#4    3 2012    4 0.500 0.000 0.000  0.5

#fish 

 ddply(dat2,.(fish),function(x) numcolwise(mean)(x[,c(5:8)]))
#  fish  Feed swim  rest hide
#1    1 0.333  0.5 0.167  0.0
#2    2 0.667  0.0 0.333  0.0
#3    3 0.500  0.0 0.000  0.5
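
A base-R alternative, if "combine" means adding the three agr columns into one, might be the following sketch. The combined column name `agr` and the use of a sum are assumptions about what was wanted.

```r
dat1 <- read.table(text = "
site Year doy fish Feed swim agr_1 agr_2 agr_3 rest hide
3 2012 203 1 1 0 0 0 0 0 0
3 2012 203 1 0 1 0 0 0 0 0
3 2012 203 1 0 1 0 0 0 0 0
3 2012 203 2 0 0 0 0 0 1 0
3 2012 203 2 1 0 0 0 0 0 0
3 2012 203 2 1 0 0 0 0 0 0
4 2012 197 1 0 0 0 0 0 1 0
4 2012 197 1 1 0 0 0 0 0 0
4 2012 197 1 0 1 0 0 0 0 0
4 2012 197 3 0 0 0 0 0 0 1
4 2012 197 3 1 0 0 0 0 0 0
", header = TRUE)

# 1. Combine agr_1, agr_2 and agr_3 into one column (here: their row sum).
dat1$agr <- rowSums(dat1[, c("agr_1", "agr_2", "agr_3")])

# 2. Mean of each behavior for each fish.
res <- aggregate(cbind(Feed, swim, agr, rest, hide) ~ fish,
                 data = dat1, FUN = mean)
res
```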
A.K.

>Hi 
>I did fish behavior at different sites. 
>Each fish represent a rep at each site. 
>e.g for my data 
>site Year doy fish Feed swim agr_1 agr_2 agr_3 rest hide
>3    2012 203 1    1    0    0     0     0     0    0
>3    2012 203 1    0    1    0     0     0     0    0
>3    2012 203 1    0    1    0     0     0     0    0
>3    2012 203 2    0    0    0     0     0     1    0
>3    2012 203 2    1    0    0     0     0     0    0
>3    2012 203 2    1    0    0     0     0     0    0
>4    2012 197 1    0    0    0     0     0     1    0
>4    2012 197 1    1    0    0     0     0     0    0
>4    2012 197 1    0    1    0     0     0     0    0
>4    2012 197 3    0    0    0     0     0     0    1
>4    2012 197 3    1    0    0     0     0     0    0
>
>1. I would like to combine column agr_1, agr_2 and agr_3 
>2. How to calculate mean for each fish for each behavior 
>Any suggestion is appreciated. 
Thanks 



Re: [R] how to best add columns to a matrix with many columns

2013-05-03 Thread Jeff Newmiller

I am not seeing any good justification in your description for converting to a
matrix if you are planning to convert it back to a data frame. Memory is going to
be used inefficiently if you do this.
---
Jeff NewmillerThe .   .  Go Live...
DCN:Basics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

David Romano  wrote:

>Hi everyone,
>
>I have large data frame, say df1,  with 165K columns, and all but the
>first
>four columns of df1 are numeric.   I transformed the numeric data and
>obtained a matrix, call it data.m, with 165K - 4 columns, and then
>tried to
>create a second data frame by replacing the numeric columns of df1 by
>data.m.  I did this in two ways, and both ways instantly used up all
>the
>available memory, so I was wondering whether there was a better way to
>do
>this.
>
>Here's what I tried:
>
>df2 <- df1
>df2[ ,5:length(df1)] <- data.m
>
>and
>
>df2 <- cbind( df1[1:4], data.m)
>
>Thanks,
>David
>


Re: [R] Calculating distance matrix for large dataset

2013-05-03 Thread steven mosher

I have a version on my blog that uses bigmemory, but it computes distance on
a sphere for a 36K x 36K matrix, not hundreds of GB, so I don't know if the
approach will work for you.


http://stevemosher.wordpress.com/2012/04/12/nick-stokes-distance-code-now-with-big-memory/


Steve

However,  I never tested it with
On May 2, 2013 9:40 PM, "HJ YAN"  wrote:

> Dear R users
>
>
> I wondered if any of you have ever tried to calculate a distance matrix for a
> very large data set, and if anyone out there can confirm that this error
> message I got actually means that my data is too large for this task.
>
> negative length vectors are not allowed
>
>
> My data size and code used
>
> > dim(mydata_nor)
> [1] 365000    144
> > d <- dist(mydata_nor, method = "euclidean")
>
>
>
> My data has 1000 samples, each with a year of data observed daily at
> 10-minute intervals, so the size is (365 * 1000) x 144.
>
>
> I checked the manual of function 'dist' but can not see the upper limit
> size allowed, and I bet there should be one, so any hints is appreciated.
>
>
> I would also be grateful if any other method for calculating distance
> matrix for large dataset could be advised.
>
>
>
> I appreciate reproducible code should be provided for your advice, so try
> below if needed:
>
> > A<-matrix(1:365000*144,nrow=365000,ncol=144)
> > dim(A)
> [1] 365000    144
> > d1<-dist(A,method="euclidean")
> Error in dist(A, method = "euclidean") :
>   negative length vectors are not allowed
>
>
>
>
> Many thanks in advance!
>
> HJ
>
>
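
A quick back-of-the-envelope check of the size involved: dist() stores one value per pair of rows, so the result needs n(n-1)/2 doubles. The arithmetic below is a sketch; the reading that the "negative length" message comes from that count overflowing a 32-bit integer is an inference, not something confirmed in this thread.

```r
n <- 365000                          # number of rows in the data

# Number of pairwise distances dist() has to store.
n_pairs <- n * (n - 1) / 2           # computed in double precision
n_pairs

# Largest length an ordinary integer-indexed vector can have.
.Machine$integer.max

# Does the result exceed that limit?
too_long <- n_pairs > .Machine$integer.max

# Memory needed at 8 bytes per double, in GiB.
gib <- n_pairs * 8 / 2^30
gib
```

At roughly 66.6 billion distances (close to 500 GiB), the full matrix is out of reach for an in-memory dist() call regardless of the vector length limit.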


[R] R 2.15.2 "Failed to load sRGB colorspace file"

2013-05-03 Thread Beto .

Hello,

I built R 2.15.2 on Solaris X64,
I have an issue when trying to execute the "check" target to test if everything 
goes ok.
Do you have any idea what could be causing this issue?

Error message:

Examples/tools-Ex.Rout.fail

> cat("Time elapsed: ", proc.time() - get("ptime", pos = 'CheckExEnv'),"\n")
Time elapsed:  1.483 0.045 2.195 0 0
> grDevices::dev.off()
Error in grDevices::dev.off() : Failed to load sRGB colorspace file
Execution halted



Thanks for your help,
Humberto.

  

[R] Factor deletion criteria

2013-05-03 Thread Iuri Gavronski

Hi,
I would like to know the criteria by which R removes a factor in linear
models. For example, I have a four level factor, and R creates 3 dummies to
estimate coefficients. Which level is chosen? Can I change it?
Thanks,
Iuri


Re: [R] how to best add columns to a matrix with many columns

2013-05-03 Thread David Romano

Sorry, Jeff, I misspoke:  the 'matrix' data.m is really a data frame -- I
was just thinking about it as a matrix since it's the numeric part of df1,
and didn't realize the thought had made its way into the message.   So the
memory issues are unrelated to converting between data frames and
matrices.  -David

On Fri, May 3, 2013 at 8:20 PM, Jeff Newmiller wrote:

> I am not seeing any good justification in your description for converting
> to matrix if you are planning to convert it back to data frame. Memory is
> going to be inefficiently-used if you do this.
> ---
> Jeff NewmillerThe .   .  Go Live...
> DCN:Basics: ##.#.   ##.#.  Live
> Go...
>   Live:   OO#.. Dead: OO#..  Playing
> Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
> /Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
> ---
> Sent from my phone. Please excuse my brevity.
>
> David Romano  wrote:
>
> >Hi everyone,
> >
> >I have large data frame, say df1,  with 165K columns, and all but the
> >first
> >four columns of df1 are numeric.   I transformed the numeric data and
> >obtained a matrix, call it data.m, with 165K - 4 columns, and then
> >tried to
> >create a second data frame by replacing the numeric columns of df1 by
> >data.m.  I did this in two ways, and both ways instantly used up all
> >the
> >available memory, so I was wondering whether there was a better way to
> >do
> >this.
> >
> >Here's what I tried:
> >
> >df2 <- df1
> >df2[ ,5:length(df1)] <- data.m
> >
> >and
> >
> >df2 <- cbind( df1[1:4], data.m)
> >
> >Thanks,
> >David
> >
>


Re: [R] Factor deletion criteria

2013-05-03 Thread David Winsemius

On May 3, 2013, at 3:32 PM, Iuri Gavronski wrote:

> Hi,
> I would like to know the criteria by which R removes a factor in linear
> models. For example, I have a four level factor, and R creates 3 dummies to
> estimate coefficients. Which level is chosen? Can I change it?

The default order is alphabetical. Lowest lexical sorted item is the reference 
level.

Changing levels is possible:

?levels
?factor
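
As a small illustration of the above, relevel() from the stats package moves a chosen level to the front so that it becomes the reference. The factor `grp` and the response `y` below are made-up toy data.

```r
# Toy four-level factor and a made-up response.
grp <- factor(c("low", "mid", "high", "top", "low", "mid", "high", "top"))
y   <- c(1.0, 2.1, 2.9, 4.2, 1.2, 1.9, 3.1, 3.8)

# Default: levels are sorted, so "high" comes first and is the reference.
levels(grp)
fit_default <- lm(y ~ grp)
names(coef(fit_default))     # no coefficient for "high"

# Make "low" the reference level instead.
grp2 <- relevel(grp, ref = "low")
levels(grp2)
fit_relevel <- lm(y ~ grp2)
names(coef(fit_relevel))     # no coefficient for "low"
```

The same effect can be had at creation time by passing an explicit `levels` argument to factor().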

-- 
David Winsemius
Alameda, CA, USA


Re: [R] read .csv file and plot a graph

2013-05-03 Thread Jim Lemon


On 05/03/2013 11:49 PM, Vahe nr wrote:

Hi all,

I have a big .csv file (21Mb with 100 rows) it has this shape:
x
1 NaN
2 NaN
3 0.23

and so on.

So the first column has x as a header then row number, the second column
contains values between -1,1 and NaN for empty values.

What should I need to do is: create a new .csv file from this one excluding
NaN values and plot a line graph using the new .csv file.

Or can I use the old .csv file to plot a graph excluding NaN values.


Hi Vahe,
If you want to plot the line ignoring the NaN values, rather than having 
the line break at each NaN, use this:


vndat<-data.frame(1:10,
 x=c(-1,-0.6,-0.4,NaN,-0.2,0.2,0.4,NaN,0.6,0.8))
plot(vndat$x[complete.cases(vndat$x)],type="l")
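
For the first part of the question (writing a new .csv without the NaN rows), a sketch along the following lines may help. The file layout is an assumption: a comma-separated file with row numbers in the first column, as write.csv() would produce, created here as a small temporary file since the real one is not available.

```r
# Build a small stand-in for the real file (assumed layout).
csv_in  <- tempfile(fileext = ".csv")
csv_out <- tempfile(fileext = ".csv")
writeLines(c('"","x"',
             '"1",NaN', '"2",NaN', '"3",0.23',
             '"4",-0.5', '"5",NaN', '"6",0.9'),
           csv_in)

# Read it back, using the first column as row names.
dat <- read.csv(csv_in, row.names = 1)

# Keep only the rows whose value is not NaN.
clean <- dat[!is.nan(dat$x), , drop = FALSE]

# Write the cleaned data to a new file and plot it against the row number.
write.csv(clean, csv_out)
plot(as.integer(rownames(clean)), clean$x, type = "l",
     xlab = "row", ylab = "x")
```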

Jim (the other one)


Re: [R] Why can't R understand if(num!=NA)?

2013-05-03 Thread peter dalgaard

On May 3, 2013, at 21:36 , David Winsemius wrote:

> 
> On May 3, 2013, at 10:46 AM, peter dalgaard wrote:
>> 
>> 
>> Because comparison with an unknown value yields an unknown result. 
> 
> Anything else would violate the Second Law of Thermodynamics. We cannot have 
> comparisons reducing entropy, now can we? Uncertainty cannot run uphill.

Now what does this say about SAS, where the missing value is smaller than all 
regular numbers? I.e.,

DATA;
  iteen = (age < 20);

turns people of unknown age into instant teenagers.
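
For contrast, the R behavior the thread title asks about can be seen in a few lines. The variable names are illustrative only.

```r
num <- 5

# Comparing with NA gives NA, so if (num != NA) has no TRUE/FALSE to act on.
cmp <- num != NA

# The test for missingness is is.na().
ok <- !is.na(num)
if (ok) message("num is not missing")

# The same rule applied to the age example: unknown age stays unknown.
age     <- c(15, NA, 42)
is_teen <- age < 20
is_teen
```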

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com
