Re: [R] Pie charts using plotGooglemaps

2014-04-16 Thread Endri Raco
Hi Boris,
Your code works wonderfully, but I still need the pies plotted over a map of a
country.
For now I get axes where the coordinates are shown, and that works great.
Regards


On Wed, Apr 16, 2014 at 12:09 AM, Boris Steipe wrote:

> Here you go:
>
> install.packages("plotrix")
> library(plotrix)
>
> ?floating.pie
>
> dat<- data.frame(x=c(19.9437314, 20.2214171, 20.0955891, 20.9506636),
>  y=c(40.7086377, 41.4924336, 39.9481364, 40.6447347),
>  City=c(120,1, 4, 10),
>  Village=c(425, 1, 4, 15)
> )
>
> # first, set up an existing plot to plot your pies into, for example
> plot(c(min(dat$x)-0.5, max(dat$x)+0.5),
>  c(min(dat$y)-0.5, max(dat$y)+0.5),
>  xlab = "", ylab = "", type = "n"
> )
>
> # your pie-chart sectors are rows of the City and Village columns
> sectors = cbind(dat$City,dat$Village)
>
> # plot each pie chart in turn
> for (i in 1:length(dat$x)) {
> floating.pie(dat$x[i],
>  dat$y[i],
>  sectors[i,],
>  radius=0.1
> )
> }
>
> Enjoy,
> B.
>
>
> On 2014-04-15, at 2:43 PM, DrunkenPhD wrote:
>
> > Sorry Jim not for my beginner level :(((
> >
> > So if my data are like this :
> >
> > x y   City  Village
> >
> > 19.9437314  40.7086377  120 425
> >
> > 20.2214171  41.4924336  1  1
> >
> > 20.0955891  39.9481364  4  4
> >
> > 20.9506636  40.6447347  10   15
> >
> > how do I plot for every point(x,y) a pie chart with slices(City,Village)?
> > Regards
> >
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
>


-- 
 Endri Raço PhD
_
Polytechnic University of Tirana
Faculty of Mathematical Engineering and Physics Engineering
Department of Mathematical Engineering
Address:  Sheshi Nene Tereza, Nr. 4, Tirana - ALBANIA
Mobile: ++ 355 682061988



Re: [R] I can't program routine comp()

2014-04-16 Thread PIKAL Petr
Hi
see in line.

From: Endy BlackEndy [mailto:pert...@gmail.com]
Sent: Tuesday, April 15, 2014 5:38 PM
To: PIKAL Petr
Subject: Re: [R] I can't program routine comp()

Dear Petr,
thank you for your advice. Unfortunately it does not work, as you can see from
the commands below.

>d=read.table("C:\\Program 
>Files\\R\\Data\\Survival\\Sasdata.txt",fill=TRUE,header=TRUE)

Ah I see, you got an error when reading data as I did.

> d=read.table("C:\\Program 
> Files\\R\\Data\\Survival\\Sasdata.txt",fill=TRUE,header=TRUE)
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
  cannot open file 'C:\Program Files\R\Data\Survival\Sasdata.txt': No such file 
or directory
>

You need to have Sasdata.txt in your Survival directory.
Do you get the same error?
What does your nn look like?

A mind-reading function has not been developed yet. Sorry.

> nn<-names(d)
> s=survfit(Surv(get(nn[2]), get(nn[3]))~get(nn[1]), data=d)
> comp(s)
Error in get(t1, loc1) : object 'get(nn[2])' not found
> a=dput(d, "foo")
>

Also, I do not know how to use the command 'dput'. I used the command
'a=dput(d,"foo")'. I hope it will work.
I would greatly appreciate any other suggestion.

What did you find on the ?dput help page?
Please try to do some homework and read the docs.

I do not know anything about survival analysis, but just by reading help
pages and about 10 minutes of trial and error I could do the following:

library(survival)
?survfit
head(heart)
  start stop eventage  year surgery transplant id
1 0   50 1 -17.155373 0.1232033   0  0  1
2 06 1   3.835729 0.2546201   0  0  2
3 01 0   6.297057 0.2655715   0  0  3
4 1   16 1   6.297057 0.2655715   0  1  3
5 0   36 0  -7.737166 0.4900753   0  0  4
636   39 1  -7.737166 0.4900753   0  1  4
nn<-names(heart)
survfit(with(heart, Surv(get(nn[1]), get(nn[2]), get(nn[3]))) ~ 1)
Call: survfit(formula = with(heart, Surv(get(nn[1]), get(nn[2]), get(nn[3]))) ~ 1)

records   n.max n.start  events  median 0.95LCL 0.95UCL
172 103   0  75 100  72 263
plot(survfit(with(heart, Surv(get(nn[1]), get(nn[2]), get(nn[3]))) ~ 1))

So my suggestion actually works with the right data and a bit of effort to get
the right code.
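As an aside: an alternative to get() when the variable names are only known at run time is to build the formula explicitly, e.g. with as.formula(). The resulting object then carries the real variable names, which helps functions that inspect the call. A sketch using the heart data as above (not tested against comp() itself):

```r
library(survival)

nn <- names(heart)[1:3]   # "start", "stop", "event"
# Paste the variable names into a formula instead of using get(),
# so the fitted object records the actual column names.
f <- as.formula(paste0("Surv(", nn[1], ", ", nn[2], ", ", nn[3], ") ~ 1"))
s <- survfit(f, data = heart)
s
```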

Regards
Petr

Many thanks
Endy

2014-04-15 12:26 GMT+03:00 PIKAL Petr 
mailto:petr.pi...@precheza.cz>>:
Hi

It is better to use dput() for presenting data, as its output is directly
usable by anybody. Since the data were not provided that way, I did not test it.

Try to get names for appropriate columns

nn<-names(d)

and use get

s=survfit(Surv(get(nn[2]), get(nn[3]))~get(nn[1]), data=d)

It works with lm and shall work with survfit too.

Regards
Petr

> -Original Message-
> From: r-help-boun...@r-project.org 
> [mailto:r-help-bounces@r-
> project.org] On Behalf Of Endy
> Sent: Monday, April 14, 2014 12:56 PM
> To: r-help@r-project.org
> Subject: [R] I can't program routine comp()
>
> Dear R users.
> I am trying to program the comp() routine in package survMisc. I am
> reading the data below with d=read.table( "C:\\. .
> .",fill=TRUE,header=TRUE) Then I load the packages 'survival' and
> 'survMisc', library(survival), library(survMisc)
>  and I run the commands
>   s=survfit(Surv(d[,2], d[,3])~d[,1], data=d)
>   comp(s)
>  and I am getting the error
>   Error in get(t1, loc1) : object 'd[, 2]' not found
> If instead I use the commands
>   s=survfit(Surv(T, Status)~Group, data=d)
>   comp(s)
> routine comp() runs perfectly. However, when I am programming I can't
> see a way to know the variable names in advance in order to use them.
> Can anybody give me a suggestion?
>
>  Thanks in advance
>Endy
>
> NB. The data must be stacked in three (3) columns before being read.
> They are repeated in nine (9) columns to save space.
>
> Group T Status   Group T Status   Group T Status
>
> 120810155124141
> 116020111222041
> 11496011071210631
> 1146201110124811
> 1143301332121051
> 11377022569026411
> 11330022506023901
> 1996022409022881
> 1226022218024211
> 1119902185702791
> 1021829027481
> 1530021562024861
> 1118202147002481
> 11167021363022721
> 14181210300210741
> 138312860023811
> 127612125802101
> 110412224602531
> 160912187002801
> 117212179902351
> 1487121709022481
> 1662121674027041
> 1194121568022111
> 1230121527022191
> 1526121324026061
> 1122129570
> 1129129320
> 174128470
> 1122128480
> 1861218500
> 14661218430
> 11921215350
> 11091214470
> 1551213840



Re: [R] netCDF to raster and spatial projection

2014-04-16 Thread Frede Aakmann Tøgersen
Hi Beatriz

Try to skip this step

>  # Reprojecting into CH1903_LV03
>  # First, change the coordinate reference system (crs)
>  proj4string(rasterDF1) <- "+init=epsg:21781"


And just do this

>  # Second, reproject the raster
>  rasterDF1.CH <- spTransform(rasterDF1, crs("+init=epsg:21781"))
>

Also there is a spTransform both in the rgdal and sp packages. So are they 
masking each other? rgdal should be before sp in the search() list.

I cannot be of more help since you provided no data. 
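As a side note, for RasterLayer objects reprojection is usually done with raster::projectRaster() rather than spTransform(). A minimal sketch with made-up values standing in for rasterDF1 (rgdal still needs to be installed):

```r
library(raster)

# A small WGS84 raster with illustrative values.
r <- raster(nrows = 10, ncols = 10,
            xmn = 5.74, xmx = 10.76, ymn = 45.74, ymx = 47.89,
            crs = "+proj=longlat +datum=WGS84")
values(r) <- runif(ncell(r))

# Reproject to CH1903/LV03; cell values are resampled (bilinear by default).
r.ch <- projectRaster(r, crs = "+init=epsg:21781")
```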


Yours sincerely / Med venlig hilsen


Frede Aakmann Tøgersen
Specialist, M.Sc., Ph.D.
Plant Performance & Modeling

Technology & Service Solutions
T +45 9730 5135
M +45 2547 6050
fr...@vestas.com
http://www.vestas.com

Company reg. name: Vestas Wind Systems A/S
This e-mail is subject to our e-mail disclaimer statement.
Please refer to www.vestas.com/legal/notice
If you have received this e-mail in error please contact the sender. 

> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> On Behalf Of Beatriz R. Gonzalez Dominguez
> Sent: 16. april 2014 08:22
> To: r-help@r-project.org
> Subject: [R] netCDF to raster and spatial projection
> 
> I've recently started using R for spatial data. I'd be really grateful
> if you could help me with this. Thanks!
> 
> Sorry I don't provide a reproducible example. Please ask me if you have
> any questions about the data.
> 
> I've extracted data from a multidimensional netCDF file. This file had
> longitude, latitude and temperature data (for 12 months of a specific year).
> 
>  From this netCDF I've got a data frame for January with these
> variables: longitude, latitude, temperature.
> 
> With this data frame I've created a raster.
> 
>  # Packages
>  library("sp")
>  library("raster")
>  library("rgdal")
>  library("ncdf")
>  library("maptools")
>  library("rgeos")
>  library("sm")
>  library("chron")
> 
>  # Dataframe to raster
>  # Create spatial points data frame
>  coordinates(tmp.df01) <- ~ lon + lat
>  # Coerce to SpatialPixelsDataFrame
>  gridded(tmp.df01) <- T
>  # Coerce to raster
>  rasterDF1 <- raster(tmp.df01)
>  > print(tmp.df01)
>  class   : SpatialPixelsDataFrame
>  dimensions  : 103, 241, 24823, 1  (nrow, ncol, npixels, nlayers)
>  resolution  : 0.0208, 0.0208  (x, y)
>  extent  : 5.739583, 10.76042, 45.73958, 47.88542  (xmin, xmax,
> ymin, ymax)
>  coord. ref. : NA
>  names   :   TabsM_1
>  min values  : -18.1389980316162
>  max values  :  2.26920962333679
> 
> There is no value for 'coord. ref.'
> 
> The projection of the original netCDF was WGS84. So I gave this
> projection to the raster.
> 
>  proj4string(rasterDF1) <- "+proj=longlat +datum=WGS84 +ellps=WGS84
> +towgs84=0,0,0"
> 
> Then, I wanted to reproject my raster to another projection:
> 
>  # Reprojecting into CH1903_LV03
>  # First, change the coordinate reference system (crs)
>  proj4string(rasterDF1) <- "+init=epsg:21781"
>  # Second, reproject the raster
>  rasterDF1.CH <- spTransform(rasterDF1, crs("+init=epsg:21781"))
> 
> At this point I get the following error:
> 
>  Error in spTransform(rasterDF1, crs("+init=epsg:21781")) :
>load package rgdal for spTransform methods
> 
> But the package rgdal is already loaded! There must be something wrong in
> the code!
> 


Re: [R] netCDF to raster and spatial projection

2014-04-16 Thread Bea GD
Hi Frede,

Thanks so much for your reply!

I've tried what you said but I get the following error:

Error in spTransform(rasterDF1, crs("+init=epsg:21781")) :
   load package rgdal for spTransform methods


I've checked search() and rgdal is before sp.

>search()
  [1] ".GlobalEnv""package:chron" "package:sm"
"package:rgeos"
  [5] "package:maptools"  "package:ncdf"  "package:rgdal" 
"package:raster"
  [9] "package:sp""tools:rstudio" "package:stats" 
"package:graphics"
[13] "package:grDevices" "package:utils" "package:datasets"  
"package:methods"
[17] "Autoloads" "package:base"


Also, when I do library("rgdal") I get this message:

>library("rgdal")
rgdal: version: 0.8-16, (SVN revision 498)
Geospatial Data Abstraction Library extensions to R successfully loaded
Loaded GDAL runtime: GDAL 1.10.1, released 2013/08/26
Path to GDAL shared files: 
C:/Users/bgonzale/Documents/R/win-library/3.0/rgdal/gdal
GDAL does not use iconv for recoding strings.
Loaded PROJ.4 runtime: Rel. 4.8.0, 6 March 2012, [PJ_VERSION: 480]
Path to PROJ.4 shared files: 
C:/Users/bgonzale/Documents/R/win-library/3.0/rgdal/proj


Maybe the problem is something to do with this? I don't know how to fix it.

I'd like to provide you with some data; what would be the best way to
post/share a raster?

Thanks!




[R] R regular expression and locales

2014-04-16 Thread Luca Cerone
Hi everybody,
I am developing some functions that use regular expressions
and grepl, to check whether certain strings match a given pattern or not.

In the regular expression I use some shortcuts such as [:alnum:].

Reading the documentation for regular expression there is one sentence that
is not entirely clear to me:

<< The only portable way to specify all ASCII letters is to list them all
as the character class
[ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz].
(The current implementation uses numerical order of the encoding.)

Certain named classes of characters are predefined. Their interpretation
depends on the locale (see locales); the interpretation below is that of
the POSIX locale.

[:alnum:]
Alphanumeric characters: [:alpha:] and [:digit:]. >>

Does this mean that I can use [:alnum:] safely to check for letters and
numbers?
Or is there a risk that the code won't work on a computer using a
different locale?
If so, can I tell grepl to use the POSIX locale to interpret the
alphanumeric characters?
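For what it is worth, the locale dependence can be demonstrated with a small sketch (the first result varies with the active locale; the explicitly listed ASCII class does not):

```r
x <- c("abc123", "caf\u00e9")

# POSIX class: whether "\u00e9" counts as alphanumeric depends on the locale.
grepl("^[[:alnum:]]+$", x)

# Explicitly listed ASCII letters plus digits: locale-independent,
# as recommended by ?regex.
ascii <- "^[ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789]+$"
grepl(ascii, x)   # second element is FALSE in any locale
```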

Thanks a lot in advance for the help!
Best,
Luca



Re: [R] Save multiple plots as pdf or jpeg

2014-04-16 Thread starter
David, you mentioned that one can deliver jpegs embedded in other formats like
pdf. Do you know which programs allow me to do that, or how I can go about it?

Or does it involve some kind of loop whereby it saves the plots as jpeg and
then collates them into a pdf? So it's a multi-page document, but the graphs
are pictures saved as jpegs (or png/bitmap).
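One way to do this entirely in R (a sketch using the png package; file names are illustrative) is to save each figure as a bitmap and then draw the bitmaps onto the pages of a pdf device with rasterImage():

```r
library(png)

# Step 1: write each plot to its own PNG file.
png("plot%03d.png", width = 800, height = 800)
for (k in 1:3) plot(rnorm(100), main = paste("plot", k))
dev.off()

# Step 2: embed the bitmaps, one per page, in a multi-page PDF.
pdf("combined.pdf")
par(mar = c(0, 0, 0, 0))
for (f in sprintf("plot%03d.png", 1:3)) {
  img <- readPNG(f)
  plot.new()
  rasterImage(img, 0, 0, 1, 1)   # fill the plotting region
}
dev.off()
```

The pages then load quickly because each one is a single bitmap rather than millions of vector segments, at the cost of resolution-dependent output.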






Re: [R] Fw: Save multiple plots as pdf or jpeg

2014-04-16 Thread Pavneet Arora
Hello Uwe

Can you explain what you mean by "try to plot the density only with a few
hundred segments"?
Do you mean that I should take some kind of sample and do it? And if so,
what's the best number for the sample, and how do I ensure that it's not
biased?






From:   Uwe Ligges 
To: Pavneet Arora/UK/RoyalSun@RoyalSun, r-help@r-project.org, 
pavnee...@yahoo.co.uk
Date:   15/04/2014 14:08
Subject:Re: [R] Fw: Save multiple plots as pdf or jpeg



You have > 1e6 observations and your lines() have that many segments;
try to plot the density with only a few hundred segments.

Best,
Uwe Ligges
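Uwe's suggestion can be read as limiting the number of evaluation points of density(), so that the drawn curve consists of only a few hundred line segments (an interpretation; n = 512 is the default):

```r
set.seed(42)
x <- rnorm(1e6)                # about as many observations as in the post

plot(density(x), type = "n")   # set up the axes only
# Evaluate the density at 256 points: the drawn curve then has only
# 256 segments regardless of the number of observations.
lines(density(x, n = 256), col = "darkred", lwd = 2)
```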


On 15.04.2014 12:27, Pavneet Arora wrote:
> Hello All,
>
> I have multiple plots that I want to save in a single file. At the moment,
> I am saving as pdf. Each plot has a qqnorm, qqline, and a tiny histogram in
> the top-left corner with its density drawn on top of it and the normal
> superimposed on the histogram.
>
> As a result, the pdf takes forever to load and doesn't let me print, as it
> runs out of memory. I was wondering if there is a way I can save each
> plot as jpeg and then export it to pdf. Will that make it quicker to
> load? If so, how can I do this? And if not, then what are the
> alternatives?
>
> The code that I have at the moment is as follows:
>
> 
~
> pdf(file="C:/qqnorm/qqnorms.pdf") ##- Saves all plots in the same pdf file
>
>
> for (k in 1:ncol(nums2)){
>  par(mfrow=c(1,1))
>
> ##- QQNorm
>  qqnorm(nums2[,k],col="lightblue",main=names(nums2)[k])
>  qqline(nums2[,k],col="red",lwd=2)
>
> ##- Tiny Histogram
> op = par(fig=c(.02,.5,0.4,0.98), new=TRUE)
>  hist(nums2[,k],freq=F,col="blue",xlab="", ylab="", main="",
>   axes=F,density=20)
>
> ##- Density of the variable
>  lines(density(nums2[,k],na.rm=T), col="darkred", lwd=2)
>
> ##- Super-imposed Normal Density
> curve(dnorm(x,mean=mean(nums2[,k],na.rm=T),sd=sd(nums2[,k],na.rm=T)),
>col="black",lwd=3,add=T) ##- Footnote: title1 <- "nums2[k]"
>
> library(moments)
>  s_kurt <- kurtosis (nums2[,k])
>  s_skew <- skewness (nums2[,k])
>  mtxt <- paste ("Variable=",title1, ":" ,
> "Kurt=",round(s_kurt,digits=4), "Skew=",
> round(s_skew,digits=4), sep=" ")
>
> mtext (mtxt,col="green4",side=1,line=15.6,adj=0.0,cex=0.8,font=2,las=1)
>
> }
> dev.off()
> 
~
>
> Structure of My data:
>
> str(nums2)
> 'data.frame':1355615 obs. of  39 variables:
>   $ month   : int  1 1 1 1 1 1 1 1 1 1 ...
>   $ Location_Easting_OSGR   : int  525680 524170 524520 526900
> 528060 524770 524220 525890 527350 524550 ...
>   $ Location_Northing_OSGR  : int  178240 181650 182240 177530
> 179040 181160 180830 179710 177650 180810 ...
>   $ Longitude   : num  -0.191 -0.212 -0.206 -0.174
> -0.157 ...
>   $ Latitude: num  51.5 51.5 51.5 51.5 51.5 ...
>   $ Police_Force: int  1 1 1 1 1 1 1 1 1 1 ...
>   $ Number_of_Vehicles  : int  1 1 2 1 1 2 2 1 2 2 ...
>   $ Number_of_Casualties: int  1 1 1 1 1 1 1 2 2 5 ...
>   $ Day_of_Week : int  3 4 5 6 2 3 5 6 7 7 ...
>   $ Local_Authority__District_  : int  12 12 12 12 12 12 12 12 12 12
> ...
>   $ X_1_st_Road_Class   : int  3 4 5 3 6 6 5 3 3 4 ...
>   $ X_1_st_Road_Number  : int  3218 450 0 3220 0 0 0 315 
3212
> 450 ...
>   $ Road_Type   : int  6 3 6 6 6 6 6 3 6 6 ...
>   $ Speed_limit : int  30 30 30 30 30 30 30 30 30 30
> ...
>   $ Junction_Detail : int  0 6 0 0 0 0 3 0 6 3 ...
>   $ Junction_Control: int  -1 2 -1 -1 -1 -1 4 -1 2 4 ...
>   $ X_2_nd_Road_Class   : int  -1 5 -1 -1 -1 -1 6 -1 4 5 ...
>   $ X_2_nd_Road_Number  : int  0 0 0 0 0 0 0 0 304 0 ...
>   $ Pedestrian_Crossing_Human_Contro: int  0 0 0 0 0 0 0 0 0 0 ...
>   $ Pedestrian_Crossing_Physical_Fac: int  1 5 0 0 0 0 0 0 5 8 ...
>   $ Light_Conditions: int  1 4 4 1 7 1 4 1 4 1 ...
>   $ Weather_Conditions  : int  2 1 1 1 1 2 1 1 1 1 ...
>   $ Road_Surface_Conditions : int  2 1 1 1 2 2 1 1 1 1 ...
>   $ Special_Conditions_at_Site  : int  0 0 0 0 0 6 0 0 0 0 ...
>   $ Carriageway_Hazards : int  0 0 0 0 0 0 0 0 0 0 ...
>   $ Urban_or_Rural_Area : int  1 1 1 1 1 1 1 1 1 1 ...
>   $ Did_Police_Officer_Attend_Scene_: int  1 1 1 1 1 1 1 1 1 1 ...
>   $ year: int  2005 2005 2005 2005 2005 2005
> 2005 2005 2005 2005 ...
>   $ m   : int  1 1 1 1 1 1 1 1 1 1 ...
>   $ Qrtr: int  1 1 1 1 1 1 1 1 1 1 ...
>   $ h2  : int  17 17 0 10 21 12 20 17 22 16 
...
>   $ NumberVehGr

Re: [R] Meta-analysis of prevalence at the country level with mgcv/gamm4

2014-04-16 Thread Julien Riou
Hi Michael,

Thank you for your input. Actually the idea of using the model to estimate
a number of sick people by age group came later in the project, and I
missed the issue with the ecological nature of the analysis.

Initially, I intended to use the model to predict a single prevalence for
each country at the median age of the country population. Should I go back
to this approach? If I'm not mistaken, this could be interpreted as
age-adjusted pooled prevalence at the country level. In this case, should I
use the model with the study-level random effect?

Thanks again for your time,
Julien
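For readers following along, the random-effects pooling with metaprop() mentioned in the thread can be sketched as follows (illustrative numbers, not the study data):

```r
library(meta)

ev <- c(12, 30, 8, 45)      # events (cases) in four hypothetical studies
n  <- c(100, 250, 60, 400)  # corresponding sample sizes

# Logit-transformed proportions; the summary reports both common-effect
# and random-effects pooled prevalence plus heterogeneity statistics.
m <- metaprop(ev, n, sm = "PLOGIT")
summary(m)
```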





On 11 April 2014 13:15, Michael Dewey  wrote:

> At 13:17 10/04/2014, Julien Riou wrote:
>
>> >
>> > Message: 11
>> > Date: Wed, 09 Apr 2014 18:39:30 +0100
>> > From: Michael Dewey 
>> > To: Julien Riou , r-help@r-project.org
>> > Subject: Re: [R] Meta-analysis of prevalence at the country level with
>> > mgcv/gamm4
>> > Message-ID: 
>> > Content-Type: text/plain; charset="us-ascii"; format=flowed
>> >
>>
>>
>> Hi Michael,
>>
>> Thank you for taking the time to help me.
>>
>>  So the first UK study with a median age of 25 is going to be used to
>> > estimate prevalence over a range of ages? You are going to have to
>> > make some very strong assumptions here which I personally would not
>> > want to make.
>> >
>>
>> I'm a little confused by this. In my understanding, the mixed-effects
>> model
>> does not do that. The slope of the relation between age and prevalence
>> will
>> be estimated from the full pool of studies, and the country-level random
>> intercept will be estimated from all studies in the country. So the
>> assumption here is that the relation between age and incidence is the same
>> in every country, which is quite reasonable. Of course, there will be more
>> uncertainty with the estimation of the random intercept if there is few
>> studies in a country, or if there is a strong inter-study variance in a
>> country. This will influence the confidence interval of the random
>> intercept, and so the CI of the predicted prevalence for this country.
>>
>
> Your studies are ecological. You are estimating the relationship between
> prevalence and being in a study of median age X which is not necessarily
> the same as the relationship between prevalence and being a person of age X.
>
>
>
>  Is there any possibility that in the real dataset you can fit your
>> > model to those studies which do provide age-specific prevalences and
>> > then use that to impute?
>> >
>> > You do not say when these studies were published but I would ask the
>> > authors of the primary studies if they can make the information
>> > available to you. You may have already done that of course. I referee
>> > quite a few papers on systematic reviews and my impression is that
>> > some authors are amenable to doing the work for you. You mileage may
>> > vary of course.
>> >
>>
>> Yes, it would be easier to have prevalence for age subgroups of studies,
>> but we did not have access to that information for most studies even after
>> contacting the authors.
>>
>>
>> > >*Standard random-effect meta-analysis* with package meta.
>> > >
>> > >I used metaprop() to get a first estimate of the prevalence in each
>> > country
>> > >without taking age into account, and to obtain weights. As expected,
>> > >heterogeneity was very high, so I used weights from the random-effects
>> > >model.
>> >
>> > Which will be nearly equal and so hardly worth using in my opinion
>> > but again your mileage may vary.
>> >
>>
>> The weights from the random-effects method were actually far from equal,
>> as sample size was highly variable between studies. With the RE method,
>> small studies have much more impact.
>>
>>
>> > I am afraid that is the way with systematic reviews, you can only
>> > synthesise what you find, not what you would like to have found.
>> > Anyone who has done a review will sympathise with you, not that that
>> > is any consolation.
>> >
>>
>> I'm not sure I'm following your point. My objective is to synthesise the
>> included studies, while taking the age factor into account, since it is
>> strongly linked to prevalence and very heterogeneous. The alternative is
>> to
>> only include studies with low median age, but I would lose a lot of
>> information.
>>
>> Thank you again,
>> Julien
>>
>> >
>> >
>>
>>
>
> Michael Dewey
> i...@aghmed.fsnet.co.uk
> http://www.aghmed.fsnet.co.uk/home.html
>
>



Re: [R] Pie charts using plotGooglemaps

2014-04-16 Thread Jim Lemon

On 04/16/2014 04:43 AM, DrunkenPhD wrote:

Sorry Jim not for my beginner level :(((

So if my data are like this :

x          y          City Village
19.9437314 40.7086377 120  425
20.2214171 41.4924336 1    1
20.0955891 39.9481364 4    4
20.9506636 40.6447347 10   15

how do I plot for every point(x,y) a pie chart with slices(City,Village)?

Regards


how do I plot for every point(x,y) a pie chart with slices(City,Village)?



Okay, let's try again.

# get yer data
sampledf<-read.table(text="x y City Village
 19.9437314 40.7086377 120 425
 20.2214171 41.4924336 1 1
 20.0955891 39.9481364 4 4
 20.9506636 40.6447347 10 15",header=TRUE)
# make the graphics device high enough to hold Albania
x11(height=10,width=5)
# display the map upon which you want the pies (map() is in the maps package)
library(maps)
map(xlim=c(19,21.1),ylim=c(39.5,42.6))
# display the pies
library(plotrix)
for(pp in 1:4) floating.pie(sampledf[pp,"x"],sampledf[pp,"y"],
 unlist(sampledf[pp,c("City","Village")]),radius=0.1)

Jim



[R] Writing a function in a corrrect way

2014-04-16 Thread Ashis Deb
Hi all, I am about to make a package using gWidgets and RGtk2, so I put
all my code into an otherwise empty function in order to invoke it afterward
by the function name, like trade(). Here is the code example:



trade <- function() {   # the wrapper ("empty") function

  library(RGtk2)
  library(gWidgets)
  library(gWidgets2)
  library(gWidgetsRGtk2)
  library(FuzzyToolkitUoN)
  library(splines)
  library(plyr)
  library(Cairo)
  library(cairoDevice)
  library(e1071)
  library(quantmod)
  library(TTR)
  library(xts)

  # main GUI code
  win <- gwindow(" PREDICTIVE TRADER ", visible = FALSE)

  group <- ggroup(cont = win)
  left_group <- ggroup(cont = group, horizontal = FALSE)
  right_group <- ggroup(cont = group, horizontal = FALSE)

  frame <- gframe("widgets", cont = left_group, expand = TRUE)
  size(frame) <- c(400, 450)
  tbl <- glayout(cont = frame)

  tbl[7, 1]  <- gbutton("BUYANDSELLSIGNALS ", cont = tbl, handler = BSS)
  tbl[5, 1]  <- glabel("ENTER SYMBOL:-", cont = tbl)
  tbl[5, 2]  <- gedit("", cont = tbl, coerce.with = as.character)
  tbl[11, 1] <- glabel("ENTER PROFIT TARGET:-", cont = tbl)
  tbl[11, 2] <- gedit("", cont = tbl, coerce.with = as.numeric)
  tbl[17, 1] <- gbutton("RUNBACKTEST  ", cont = tbl, handler = RBS)
  tbl[22, 1] <- plut <<- gbutton("PLOT GRAPH ", cont = tbl, handler = ploit)
  tbl[22, 2] <- rprt <<- gbutton("TRADEREPORT", cont = tbl, handler = report)
  tbl[1, 1]  <- img <- gimage("E:/clusterring/pr.1.png")
  tbl[19, 1] <- b <<- gbutton("GENERATING BACKTEST SUMMARY ", cont = tbl, handler = gbs)
  tbl[24, 1] <- gbutton("CLEAR", cont = tbl, handler = clear)
  tbl[13, 1] <- glabel("STOP LOSS TARGET:-")
  tbl[13, 2] <- gedit("", container = tbl, coerce.with = as.numeric)

  visible(win) <- TRUE
}




I didn't include the function handlers and the remaining code because of
their length. My issue is that the function misbehaves: when I execute the
function and open the GUI using trade(), the GUI is shown, but when I click
any button it shows an error like the one below:


Error in svalue(tbl[5, 2]) :

 error in evaluating the argument 'obj' in selecting a method for function
'svalue': Error in (function (classes, fdef, mtable)  :

 unable to find an inherited method for function ‘.tag’ for signature
‘"", "guiWidgetsToolkitRGtk2"’


Can anybody help me with this issue? I am sorry if I have not followed any
rules here; I will learn if taught.



Thank  You

ASHIS  DEB



Re: [R] netCDF to raster and spatial projection

2014-04-16 Thread Frede Aakmann Tøgersen
Well, no need to have your data. Usually one can find suitable data in the
help pages for the functions in question. So here is a reproducible example
from ?meuse.grid.

> data(meuse.grid)
> coordinates(meuse.grid) <- ~x+y
> proj4string(meuse.grid) <- CRS("+init=epsg:28992") # see ?meuse for this crs
> gridded(meuse.grid) <- TRUE
> summary(meuse.grid)

Object of class SpatialPixelsDataFrame
Coordinates:
     min    max
x 178440 181560
y 329600 333760
Is projected: TRUE
proj4string :
[+init=epsg:28992 +proj=sterea +lat_0=52.156160
+lon_0=5.387639 +k=0.9999079 +x_0=155000 +y_0=463000
+ellps=bessel +units=m +no_defs]
Number of points: 3103
Grid attributes:
  cellcentre.offset cellsize cells.dim
x            178460       40        78
y            329620       40       104
Data attributes:
     part.a           part.b            dist        soil      ffreq
 Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   1:1665   1: 779
 1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.1193   2:1084   2:1335
 Median :0.0000   Median :1.0000   Median :0.2715   3: 354   3: 989
 Mean   :0.3986   Mean   :0.6014   Mean   :0.2971
 3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:0.4402
 Max.   :1.0000   Max.   :1.0000   Max.   :0.9926

> rasterMG <- raster(meuse.grid)
> print(rasterMG)

class   : RasterLayer
dimensions  : 104, 78, 8112  (nrow, ncol, ncell)
resolution  : 40, 40  (x, y)
extent  : 178440, 181560, 329600, 333760  (xmin, xmax, ymin, ymax)
coord. ref. : +init=epsg:28992 +proj=sterea +lat_0=52.156160 
+lon_0=5.387639 +k=0.079 +x_0=155000 +y_0=463000 +ellps=bessel 
+units=m +no_defs
data source : in memory
names   : part.a
values  : 0, 1  (min, max)

> rasterMG.proj <- spTransform(rasterMG, CRS("+init=epsg:21781"))
Error in spTransform(rasterMG, CRS("+init=epsg:21781")) :
  load package rgdal for spTransform methods
>

Now perhaps doing the projection before casting to a raster will work (yes, 
I'm guessing, since the error thrown above is not very informative and traceback() 
does not give anything useful).

> meuse.grid.proj <- spTransform(meuse.grid, CRS("+init=epsg:21781"))
Warning message:
In spTransform(meuse.grid, CRS("+init=epsg:21781")) :
  Grid warping not available, coercing to points

> summary(meuse.grid.proj)

Object of class SpatialPointsDataFrame
Coordinates:
   min  max
x 479029.8 482197.1
y 646927.4 651003.9
Is projected: TRUE
proj4string :
[+init=epsg:21781 +proj=somerc +lat_0=46.952406
+lon_0=7.4395833 +k_0=1 +x_0=60 +y_0=20 +ellps=bessel
+towgs84=674.374,15.056,405.346,0,0,0,0 +units=m +no_defs]
Number of points: 3103
Data attributes:
 part.a   part.bdistsoil ffreq
 Min.   :0.   Min.   :0.   Min.   :0.   1:1665   1: 779
 1st Qu.:0.   1st Qu.:0.   1st Qu.:0.1193   2:1084   2:1335
 Median :0.   Median :1.   Median :0.2715   3: 354   3: 989
 Mean   :0.3986   Mean   :0.6014   Mean   :0.2971
 3rd Qu.:1.   3rd Qu.:1.   3rd Qu.:0.4402
 Max.   :1.   Max.   :1.   Max.   :0.9926

> gridded(meuse.grid.proj) <- TRUE
suggested tolerance minimum: 0.737332
Error in points2grid(points, tolerance, round) :
  dimension 1 : coordinate intervals are not constant
>

A warning and an error indicate that the projection results in something that 
is not on a regular grid.

I don't know what to do but to read some more of the documentation for sp, 
rgdal, etc.

Hopefully somebody comes up with something that helps us.

Yours sincerely / Med venlig hilsen


Frede Aakmann Tøgersen
Specialist, M.Sc., Ph.D.
Plant Performance & Modeling

Technology & Service Solutions
T +45 9730 5135
M +45 2547 6050
fr...@vestas.com
http://www.vestas.com

Company reg. name: Vestas Wind Systems A/S
This e-mail is subject to our e-mail disclaimer statement.
Please refer to www.vestas.com/legal/notice
If you have received this e-mail in error please contact the sender.

From: Bea GD [mailto:aguitatie...@hotmail.com]
Sent: 16. april 2014 11:00
To: Frede Aakmann Tøgersen; r-help@r-project.org
Subject: Re: [R] netCDF to raster and spatial projection

Hi Frede,

Thanks so much for your reply!

I've tried what you said but I get the following error:



Error in spTransform(rasterDF1, crs("+init=epsg:21781")) :

  load package rgdal for spTransform methods

I've checked search() and rgdal is before sp.


> search()

 [1] ".GlobalEnv""package:chron" "package:sm""package:rgeos"

 [5] "package:maptools"  "package:ncdf"  "package:rgdal" 
"package:raster"

 [9] "package:sp""tools:rstudio" "package:stats" 
"package:graphics"

[13] "package:grDevices" "package:utils" "package:datasets"  
"package:methods"

[17] "Autoloads" "package:base"

Also, when I do library("rgdal") I get this message:


> library("rgdal")

rgdal: version: 0.8-16, (SVN revision 498)

Geospatial Data Abst

Re: [R] Pie charts using plotGooglemaps

2014-04-16 Thread Jim Lemon

On 04/16/2014 07:59 PM, DrunkenPhD wrote:

Ok this seems to work perfect,

Don't want to be boring, but what about plotting on a real map, something like:

hdf <- get_map(location = c(20.166065, 41.270679), zoom = 8, maptype =
'roadmap')
ggmap(hdf, extent = 'device')

I'm not sure that this will work, as ggmap is built on ggplot2 and thus 
based on grid graphics. The plotrix package is written for base 
graphics. If there have been any transformations of coordinates, the 
pies might be in the wrong place or not appear at all. Just substitute 
the above call for the map call in my example and "print" the resulting 
object, then see what happens with the pies.


Jim



Re: [R] Pie charts using plotGooglemaps

2014-04-16 Thread Jim Lemon

See inline

On 04/16/2014 08:13 PM, DrunkenPhD wrote:

No, nothing.
I tried:
 > hdf <- get_map(location = c(20.166065, 41.270679), zoom = 8, maptype
= 'roadmap')
 > sampledf<-read.table(text="x y City Village

19.9437314 40.7086377 120 425
20.2214171 41.4924336 1 1
20.0955891 39.9481364 4 4
20.9506636 40.6447347 10 15",header=TRUE)
# make the graphics device high enough to hold Albania
x11(height=10,width=5)
# display the map upon which you want the pies


hdf <- get_map(location = c(20.166065, 41.270679), zoom = 8,
 maptype = 'roadmap')
ggmap(hdf, extent = 'device')
# I don't know if you have to do anything else here


# display the pies
library(plotrix)
for(pp in 1:4) floating.pie(sampledf[pp,"x"],sampledf[pp,"y"],
unlist(sampledf[pp,c("City","Village")]),radius=0.1)

but still nothing.
What to do?


Look at the documentation for ggmap (I don't have the package on my 
system, so I can't do it for you). Try par("usr") to see the coordinates 
of the plot.
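To make Jim's point concrete: base-graphics functions such as floating.pie() draw into the user coordinate system reported by par("usr"), which a grid-based ggmap plot does not set. A minimal base-graphics sketch (an editorial aside, assuming only the plotrix package; coordinates are from the thread's data):

```r
# floating.pie() places pies in the user coordinate system of the
# *current base plot*, which par("usr") reports.
library(plotrix)

# set up a base plot covering roughly Albania's lon/lat range
plot(c(19, 21.1), c(39.5, 42.6), type = "n", xlab = "lon", ylab = "lat")
print(par("usr"))  # xmin, xmax, ymin, ymax (slightly padded by default)

# a pie centred inside that range is drawn where expected
floating.pie(19.9437314, 40.7086377, c(120, 425), radius = 0.1)

# ggmap() draws with grid graphics instead, so par("usr") is not updated
# to the map's lon/lat range -- which is why the pies appear to do nothing
```

This is a diagnostic sketch, not a way to put pies on a ggmap; mixing the two graphics systems needs a grid-aware approach.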


Jim



Re: [R] netCDF to raster and spatial projection

2014-04-16 Thread Frede Aakmann Tøgersen
Hi Bea

Well, the first hit led me to projectRaster {raster}. Will this suit you?


> rasterMG.proj <- projectRaster(rasterMG, crs=CRS("+init=epsg:21781"))
> print(rasterMG.proj)

class   : RasterLayer 
dimensions  : 116, 91, 10556  (nrow, ncol, ncell)
resolution  : 40.1, 40.1  (x, y)
extent  : 478794.9, 482444, 646645.8, 651297.4  (xmin, xmax, ymin, ymax)
coord. ref. : +init=epsg:21781 +proj=somerc +lat_0=46.952406 
+lon_0=7.4395833 +k_0=1 +x_0=60 +y_0=20 +ellps=bessel 
+towgs84=674.374,15.056,405.346,0,0,0,0 +units=m +no_defs 
data source : in memory
names   : part.a 
values  : 0, 1  (min, max)


The difference can be seen by plotting.

plot(rasterMG)
plot(rasterMG.proj)
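Beyond plotting, the change can be checked numerically. A hedged, self-contained sketch (rebuilding rasterMG from ?meuse.grid as earlier in the thread; note that newer PROJ/rgdal versions may handle the "+init=epsg:..." strings differently):

```r
# Rebuild the raster from the meuse.grid example and reproject it, then
# compare CRS and cell geometry before and after the warp.
library(sp)
library(raster)

data(meuse.grid)
coordinates(meuse.grid) <- ~x+y
proj4string(meuse.grid) <- CRS("+init=epsg:28992")
gridded(meuse.grid) <- TRUE

rasterMG      <- raster(meuse.grid)
rasterMG.proj <- projectRaster(rasterMG, crs = CRS("+init=epsg:21781"))

res(rasterMG)        # 40, 40 (metres) per the print() output above
res(rasterMG.proj)   # ~40.1, 40.1 after warping to EPSG:21781
identical(projection(rasterMG), projection(rasterMG.proj))  # FALSE
```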




Yours sincerely / Med venlig hilsen


Frede Aakmann Tøgersen



Re: [R] Pie charts using plotGooglemaps

2014-04-16 Thread Jim Lemon

On 04/16/2014 08:29 PM, Endri Raco wrote:

I see function *segmentGoogleMaps* from package *plotGoogleMaps* does
exactly what I need with data from dataset *meuse*.
When I ran code:
coordinates(sampledf)<-~x+y # convert to SPDF
proj4string(sampledf) <- CRS('+init=epsg:28992')# adding Coordinate
Referent Sys.
# Create web map of Point data
m<-plotGoogleMaps(sampledf,filename='myMap1.htm')

Everything works just fine. When I try to run code:

# colPalette defines colors for plot
m<-segmentGoogleMaps(sampledf, zcol=c('City','Village'),
mapTypeId='ROADMAP',
filename='myMap4.htm',colPalette=c('#E41A1C','#377EB8'),
strokeColor='black')

everything goes wrong and all points seem to show at one single
coordinate pair.

The problem, I think, is that the data in the meuse dataset are in a
different coordinate system from mine (UTM, I think).
When I tried to convert, I got the same result.
I desperately need to see a pie chart on a Google map, so please help
solve this issue.

Whoa! All bets are off! plotGoogleMaps generates overlays on an HTML 
page linked to Google Maps. This is very different from what you do with R 
graphics devices. Your best bet is to search:


pie chart google maps

on Google and start reading. I have used plotGoogleMaps, but never put 
pie charts on them, and I really don't have time to learn how at the moment.


Jim



Re: [R] Trend test for hazard ratios

2014-04-16 Thread Göran Broström

On 04/15/2014 10:51 PM, David Winsemius wrote:


On Apr 15, 2014, at 6:32 AM, Therneau, Terry M., Ph.D. wrote:


You can do statistical tests within a single model, for whether
portions of it fit or do not fit.  But one cannot take three
separate fits and compare them.  The program needs context to know
how the three relate to one another.  Say that "group" is your
strata variable, trt the variable of interest, and x1, x2 are
adjusters.

fit <- coxph(Surv(time,status) ~ trt * strata(group) + x1 + x2,
data=mydata)

will fit a model with a separate treatment coefficient for each of
the groups, and a separate baseline hazard for each.  One can now
create a contrast that corresponds to your trend test, using
vcov(fit) for the variance matrix and coef(fit) to retrieve the
coefficients.



I have at the moment on my workspace a dataset of breast cancer cases
extracted from SEER that has a factor representing three grades of
histology: $Grade.abb. $ Grade.abb : Factor w/ 3 levels
"Well","Moderate", "Poorly"

It would be reasonable to test whether the grading has a "trend" in
its effect when controlling for other factors (and it would be
surprising to a medical audience if there were no effect.). So I
compare across models using  AgeStgSiz.NG.Rad as my "linear" trend"
model (with one df for an `as.numeric` version, AgeStgSizRad as my
no-grade-included baseline, and AgeStgSiz.FG.Rad as my full factor
version:


David, your problem is different from the original one; I think you can 
solve yours by transforming the (unordered) factor to an ordered one.


Try

AgeStgSiz.NG.Rad <- coxph(Surv(Survmon,
Vital.status.recode..study.cutoff.used.=="Dead") ~ AgeDx+
SEER.historic.stage.A+ as.numeric(Size.cat) + ordered(Grade.abb)
 + Rgrp , data=BrILR)

and contrasts based on orthogonal polynomials are used for Grade.abb

see ?contr.poly

Göran B.



AgeStgSiz.NG.Rad <- coxph( Surv(Survmon,
Vital.status.recode..study.cutoff.used.=="Dead") ~ AgeDx+
SEER.historic.stage.A+ as.numeric(Size.cat) + as.numeric(Grade.abb)
+ Rgrp , data=BrILR) AgeStgSizRad <- coxph( Surv(Survmon,
Vital.status.recode..study.cutoff.used.=="Dead") ~ AgeDx+
SEER.historic.stage.A+ as.numeric(Size.cat) + Rgrp , data=BrILR)
AgeStgSiz.FG.Rad <- coxph( Surv(Survmon,
Vital.status.recode..study.cutoff.used.=="Dead") ~ AgeDx+
SEER.historic.stage.A+ as.numeric(Size.cat) + Grade.abb + Rgrp ,
data=BrILR) 2*diff(summary(AgeStgSizRad)[['loglik']])

[1] 5166.291

2*diff(summary(AgeStgSiz.NG.Rad)[['loglik']])

[1] 5888.492

2*diff(summary(AgeStgSiz.FG.Rad)[['loglik']])

[1] 5889.379

So there is strong evidence that adding grade to the existing
covariates improves the fit but that representing as separate factor
values with one extra degree of freedom may not be needed. When I add
grade as a stratum I get a very different 2*loglikelihood:


AgeStgSiz.SG.Rad <- coxph( Surv(Survmon,
Vital.status.recode..study.cutoff.used.=="Dead") ~ AgeDx+
SEER.historic.stage.A+ as.numeric(Size.cat) + strata(Grade.abb) +
Rgrp , data=BrILR) 2*diff(summary(AgeStgSiz.SG.Rad)[['loglik']])

[1] 3980.908


dput( vcov(AgeStgSiz.SG.Rad) )

structure(c(9.00241385282728e-06, -4.45446264398645e-07,
5.18927440846587e-07, 2.62020260612094e-07, 7.47434378232446e-07,
-4.45446264398645e-07, 0.0046168537719431, 0.00445692601518848,
-8.67833275051278e-07, -3.68093395861629e-05, 5.18927440846587e-07,
0.00445692601518848, 0.00464685164887969, -1.61616621634903e-05,
-3.77256742079467e-05, 2.62020260612094e-07, -8.67833275051278e-07,
-1.61616621634903e-05, 7.84049821976807e-06, 1.8221575745622e-06,
7.47434378232446e-07, -3.68093395861629e-05, -3.77256742079467e-05,
1.8221575745622e-06, 0.000313989310316303), .Dim = c(5L, 5L),
.Dimnames = list(c("AgeDx", "SEER.historic.stage.ALocalized",
"SEER.historic.stage.ARegional", "as.numeric(Size.cat)", "RgrpTRUE"),
c("AgeDx", "SEER.historic.stage.ALocalized",
"SEER.historic.stage.ARegional", "as.numeric(Size.cat)", "RgrpTRUE"
)))

dput(coef(AgeStgSiz.SG.Rad))

structure(c(0.0393472050734995, 0.901971276489615, 1.46695753267995,
0.108860100649677, -0.273688779502084), .Names = c("AgeDx",
"SEER.historic.stage.ALocalized", "SEER.historic.stage.ARegional",
"as.numeric(Size.cat)", "RgrpTRUE" ))

I'm not particularly facile with contrast construction from var-covar
matrices, so I am hoping for a worked example. Also wondering whether the
cross-model comparisons are invalid or merely less powerful.



Terry T.



On 04/15/2014 05:00 AM, r-help-requ...@r-project.org wrote:

Hello,

I have the following problem. I stratified my patient cohort into
three ordered groups and performed multivariate adjusted Cox
regression analysis on each group separately. Now I would like to
calculate a p for trend across the hazard ratios that I got for
the three groups. How can I do that if I only have the HR and the
confidence interval? For example I got the following HRs for one
endpoint:

1.09(0.68-1.74),1.29(0.94-1.76) and 1.64(1.01-2.68).

There is a trend but how do I calculate if it is significant?
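Terry's suggestion above — building a trend contrast from coef(fit) and vcov(fit) — can be sketched as a generic Wald test in base R. Note that wald_contrast is a hypothetical helper written for illustration (not part of the survival package), and trt_idx stands for whichever indices pick out the group-specific treatment coefficients of the interaction fit:

```r
# Generic Wald test of a linear contrast c'beta = 0, given a coefficient
# vector and its variance-covariance matrix from any fitted model.
wald_contrast <- function(beta, V, cvec) {
  est <- sum(cvec * beta)                    # c'beta
  se  <- sqrt(drop(t(cvec) %*% V %*% cvec))  # sqrt(c' V c)
  z   <- est / se
  c(estimate = est, se = se, z = z, p = 2 * pnorm(-abs(z)))
}

# e.g. a linear trend across three group-specific treatment coefficients:
# wald_contrast(coef(fit)[trt_idx], vcov(fit)[trt_idx, trt_idx], c(-1, 0, 1))
```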

Re: [R] The explanation of ns() with df =2

2014-04-16 Thread John Fox
Dear Xing,

My point wasn't that you should add 1 df for a constant to each cubic but that 
you need not subtract 1 for continuity. I'm sorry that I didn't make that 
clear. When you use the natural spline in a regression, there's a constant in 
the model as a separate term with 1 df, which in effect adjusts the level of 
the regression curve (assuming one X for simplicity). If you add a constant to 
each component cubic, the constant in the model is rendered redundant, 
accounting for the extra df.
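The accounting can be checked directly: with df = 2 the ns() basis has two columns, one interior knot and two boundary knots, and the model's intercept supplies the third free parameter (a small check; splines ships with base R):

```r
library(splines)

B <- ns(1:15, df = 2)
ncol(B)                    # 2 basis columns, matching df = 2
attr(B, "knots")           # one interior knot (the median of x, here 8)
attr(B, "Boundary.knots")  # two boundary knots (the range, 1 and 15)
```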

I'm afraid that if this is still not sufficiently clear, I'll have to defer to 
someone with greater powers of explanation.

Best,
John

-
John Fox
McMaster University
Hamilton, Ontario, Canada
http://socserv.socsci.mcmaster.ca/jfox

> On Apr 16, 2014, at 12:10 AM, Xing Zhao  wrote:
> 
> Dear John
> 
> Sorry, I used 3 degrees of freedom for the cubic spline. After using 4, it
> is still not 2. I may have made some naive mistake, but I cannot figure
> it out. Where is the problem?
> 
> 4 (cubic on the right side of the *interior* knot 8)
> + 4 (cubic on the left side of the *interior* knot 8)
> - 1 (two curves must be continuous at the *interior* knot 8)
> - 1 (two curves must have 1st order derivative continuous at the
> *interior* knot 8)
> - 1 (two curves must have 2nd order derivative continuous at the
> *interior* knot 8)
> - 1 (right side cubic curve must have 2nd order derivative = 0 at the
> boundary knot 15 due to the linearity constraint)
> - 1 (similar for the left)
> = 3, not 2
> 
> Thanks
> Xing
> 
>> On Tue, Apr 15, 2014 at 10:54 AM, John Fox  wrote:
>> Dear Xing,
>> 
>>> -Original Message-
>>> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
>>> project.org] On Behalf Of Xing Zhao
>>> Sent: Tuesday, April 15, 2014 1:18 PM
>>> To: John Fox
>>> Cc: r-help@r-project.org; Michael Friendly
>>> Subject: Re: [R] The explanation of ns() with df =2
>>> 
>>> Dear Michael and Fox
>>> 
>>> Thanks for your elaboration. Combining your explanations would, to my
>>> understanding, lead to the following  calculation of degree of
>>> freedoms.
>>> 
>>> 3 (cubic on the right side of the *interior* knot 8)
>>> + 3 (cubic on the left side of the *interior* knot 8)
>>> - 1 (two curves must be continuous at the *interior* knot 8)
>> 
>> You shouldn't subtract 1 for continuity since you haven't allowed a
>> different level on each side of the knot (that is your initial counting of 3
>> parameters for the cubic doesn't include a constant).
>> 
>> Best,
>> John
>> 
>>> - 1 (two curves must have 1st order derivative continuous at the
>>> *interior* knot 8)
>>> - 1 (two curves must have 2nd order derivative continuous at the
>>> *interior* knot 8)
>>> - 1 (right side cubic curve must have 2nd order derivative = 0 at the
>>> boundary knot 15 due to the linearity constraint)
>>> - 1 (similar for the left)
>>> = 1, not 2
>>> 
>>> Where is the problem?
>>> 
>>> Best,
>>> Xing
>>> 
 On Tue, Apr 15, 2014 at 6:17 AM, John Fox  wrote:
 Dear Xing Zhao,
 
 To elaborate slightly on Michael's comments, a natural cubic spline
>>> with 2 df has one *interior* knot and two boundary knots (as is
>>> apparent in the output you provided). The linearity constraint applies
>>> beyond the boundary knots.
 
 I hope this helps,
 John
 
 
 John Fox, Professor
 McMaster University
 Hamilton, Ontario, Canada
 http://socserv.mcmaster.ca/jfox/
 
 On Tue, 15 Apr 2014 08:18:40 -0400
 Michael Friendly  wrote:
> No, the curves on each side of the know are cubics, joined
> so they are continuous.  Se the discussion in \S 17.2 in
> Fox's Applied Regression Analysis.
> 
>> On 4/15/2014 4:14 AM, Xing Zhao wrote:
>> Dear all
>> 
>> I understand the definition of Natural Cubic Splines are those
>>> with
>> linear constraints on the end points. However, it is hard to think
>> about how this can be implement when df=2. df=2 implies there is
>>> just
>> one knot, which, according the the definition, the curves on its
>>> left
>> and its right should be both be lines. This means the whole line
>> should be a line. But when making some fits. the result still
>>> looks
>> like 2nd order polynomial.
>> 
>> How to think about this problem?
>> 
>> Thanks
>> Xing
>> 
>> ns(1:15,df =2)
>>   1   2
>>  [1,] 0.000  0.
>>  [2,] 0.1084782 -0.07183290
>>  [3,] 0.2135085 -0.13845171
>>  [4,] 0.3116429 -0.19464237
>>  [5,] 0.3994334 -0.23519080
>>  [6,] 0.4734322 -0.25488292
>>  [7,] 0.5301914 -0.24850464
>>  [8,] 0.5662628 -0.21084190
>>  [9,] 0.5793481 -0.13841863
>> [10,] 0.5717456 -0.03471090
>> [11,] 0.5469035  0.09506722
>> [12,] 0.5082697  0.24570166
>> [13,] 0.4592920  0.41197833
>> [14,] 0.4034184  0.58868315
>> [15,] 0

Re: [R] netCDF to raster and spatial projection

2014-04-16 Thread Bea GD

Hi Frede,

Thanks so much! It seems to work perfectly, exactly what I wanted to do! :)

All the best,

Bea


Re: [R] Testing correlation of equation in a SUR model fitted by systemfit

2014-04-16 Thread Arne Henningsen
Dear Paul

On 15 April 2014 19:23, Paul Smith  wrote:
> How to test whether the correlations in the correlation matrix of a
> two-equation SUR model fitted by package systemfit are significant?

You can use a likelihood-ratio test to compare the SUR model with the
corresponding OLS model. The only difference is that the OLS model
assumes that all off-diagonal elements of the covariance matrix of the
residuals of the different equations are zero. An example:

library( "systemfit" )

# load data set
data( "Kmenta" )

# specify system of equations
eqDemand <- consump ~ price + income
eqSupply <- consump ~ price + farmPrice + trend
system <- list( demand = eqDemand, supply = eqSupply )

# estimate OLS model
fitols <- systemfit( system, data=Kmenta )

# estimate SUR model
fitsur <- systemfit( system, "SUR", data = Kmenta )

# LR test: OLS vs. SUR
lrtest( fitols, fitsur )

# estimate iterated SUR model
fititsur <- systemfit( system, "SUR", data = Kmenta, maxit = 100 )

# LR test: OLS vs. SUR
lrtest( fitols, fititsur )
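The statistic that lrtest() reports can also be computed by hand from the two log-likelihoods; for two equations the OLS model imposes one restriction (the single off-diagonal element of the residual covariance matrix is zero). A hedged sketch, assuming the fitols and fitsur objects from the code above:

```r
# Manual likelihood-ratio test: LR = 2 * (logLik_SUR - logLik_OLS),
# compared to a chi-squared distribution with df = number of
# off-diagonal covariance elements restricted to zero (here 1).
lr   <- 2 * (as.numeric(logLik(fitsur)) - as.numeric(logLik(fitols)))
pval <- pchisq(lr, df = 1, lower.tail = FALSE)
c(LR = lr, p.value = pval)
```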


If you have further questions regarding the systemfit package, you can
also use the "help" forum at systemfit's R-Forge site:

https://r-forge.r-project.org/projects/systemfit/

... and please do not forget to cite systemfit in your publications
(see output of the R command 'citation("systemfit")').

Best regards,
Arne

-- 
Arne Henningsen
http://www.arne-henningsen.name



Re: [R] Pie charts using plotGooglemaps

2014-04-16 Thread Endri Raco
I see function *segmentGoogleMaps* from package *plotGoogleMaps* does
exactly what I need with data from dataset *meuse*.
When I ran code:
coordinates(sampledf)<-~x+y # convert to SPDF
proj4string(sampledf) <- CRS('+init=epsg:28992')# adding Coordinate
Referent Sys.
# Create web map of Point data
m<-plotGoogleMaps(sampledf,filename='myMap1.htm')

Everything works just fine. When I try to run code:

# colPalette defines colors for plot
m<-segmentGoogleMaps(sampledf, zcol=c('City','Village'),
mapTypeId='ROADMAP',
filename='myMap4.htm',colPalette=c('#E41A1C','#377EB8'),
strokeColor='black')

everything goes wrong and all points seem to show at one single coordinate
pair.

The problem, I think, is that the data in the meuse dataset are in a
different coordinate system from mine (UTM, I think).
When I tried to convert, I got the same result.
I desperately need to see a pie chart on a Google map, so please help
solve this issue.
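One likely cause (an editorial aside, hedged): the sample coordinates are longitude/latitude degrees, so the CRS assigned before plotting should be WGS84 (EPSG:4326), not EPSG:28992, which is the projected Dutch grid that the meuse example happens to use. A minimal sketch with sp:

```r
library(sp)

sampledf <- read.table(text = "x y City Village
19.9437314 40.7086377 120 425
20.2214171 41.4924336 1 1
20.0955891 39.9481364 4 4
20.9506636 40.6447347 10 15", header = TRUE)

coordinates(sampledf) <- ~x+y                    # convert to SpatialPointsDataFrame
proj4string(sampledf) <- CRS("+init=epsg:4326")  # lon/lat degrees, WGS84
# plotGoogleMaps()/segmentGoogleMaps() can then place the points correctly
```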




On Wed, Apr 16, 2014 at 12:17 PM, Jim Lemon  wrote:

> See inline
>
>
> On 04/16/2014 08:13 PM, DrunkenPhD wrote:
>
>> No nothing :
>> I tried :
>>  > hdf <- get_map(location = c(20.166065, 41.270679), zoom = 8, maptype
>> = 'roadmap')
>>  > sampledf<-read.table(text="x y City Village
>>
>> 19.9437314 40.7086377 120 425
>> 20.2214171 41.4924336 1 1
>> 20.0955891 39.9481364 4 4
>> 20.9506636 40.6447347 10 15",header=TRUE)
>> # make the graphics device high enough to hold Albania
>> x11(height=10,width=5)
>> # display the map upon which you want the pies
>>
>
> hdf <- get_map(location = c(20.166065, 41.270679), zoom = 8,
>  maptype = 'roadmap')
> ggmap(hdf, extent = 'device')
> # I don't know if you have to do anything else here
>
>
>  # display the pies
>> library(plotrix)
>> for(pp in 1:4) floating.pie(sampledf[pp,"x"],sampledf[pp,"y"],
>> unlist(sampledf[pp,c("City","Village")]),radius=0.1)
>>
>> but p nothing
>> What to do
>>
>
> Look at the documentation for ggmap (I don't have the package on my
> system, so I can't do it for you). Try par("usr") to see the coordinates of
> the plot.
>
> Jim
>



-- 
 Endri Raço PhD
_
Polytechnic University of Tirana
Faculty of Mathematical Engineering and Physics Engineering
Department of Mathematical Engineering
Address:  Sheshi Nene Tereza, Nr. 4, Tirana - ALBANIA
Mobile: ++ 355 682061988



Re: [R] Pie charts using plotGooglemaps

2014-04-16 Thread DrunkenPhD
Ok this seems to work perfect,

Don't want to be boring, but what about plotting on a real map, something like:

hdf <- get_map(location = c(20.166065, 41.270679), zoom = 8, maptype = 
'roadmap')
ggmap(hdf, extent = 'device')

Regards

On Wednesday, April 16, 2014 11:51:43 AM UTC+2, Jim Lemon wrote:
>
> On 04/16/2014 04:43 AM, DrunkenPhD wrote: 
> > Sorry Jim not for my beginner level :((( 
> > 
> > So if my data are like this : 
> > 
> > x  y  City  Village 
> > 
> > 19.9437314  40.7086377  120  425 
> > 
> > 20.2214171  41.4924336  1  1 
> > 
> > 20.0955891  39.9481364  4  4 
> > 
> > 20.9506636  40.6447347  10  15 
> > 
> > how do I plot for every point(x,y) a pie chart with 
> slices(City,Village)? 
> > 
> > Regards 
> > 
> > 
> > 
> Okay, let's try again. 
>
> # get yer data 
> sampledf<-read.table(text="x y City Village 
>   19.9437314 40.7086377 120 425 
>   20.2214171 41.4924336 1 1 
>   20.0955891 39.9481364 4 4 
>   20.9506636 40.6447347 10 15",header=TRUE) 
> # make the graphics device high enough to hold Albania 
> x11(height=10,width=5) 
> # display the map upon which you want the pies (requires the 'maps' package) 
> library(maps) 
> map(xlim=c(19,21.1),ylim=c(39.5,42.6)) 
> # display the pies 
> library(plotrix) 
> for(pp in 1:4) floating.pie(sampledf[pp,"x"],sampledf[pp,"y"], 
>   unlist(sampledf[pp,c("City","Village")]),radius=0.1) 
>
> Jim 
>
>
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Pie charts using plotGooglemaps

2014-04-16 Thread DrunkenPhD
No, nothing. I tried:

hdf <- get_map(location = c(20.166065, 41.270679), zoom = 8, maptype =
   'roadmap')
sampledf <- read.table(text="x y City Village
 19.9437314 40.7086377 120 425
 20.2214171 41.4924336 1 1
 20.0955891 39.9481364 4 4
 20.9506636 40.6447347 10 15", header=TRUE)
# make the graphics device high enough to hold Albania
x11(height=10, width=5)
# display the map upon which you want the pies
# (note: this wraps ggmap() inside map(), which is not valid syntax)
map(hdf <- get_map(location = c(20.166065, 41.270679), zoom = 8, maptype =
   'roadmap')
ggmap(hdf, extent = 'device'))
# display the pies
library(plotrix)
for(pp in 1:4) floating.pie(sampledf[pp,"x"], sampledf[pp,"y"],
   unlist(sampledf[pp,c("City","Village")]), radius=0.1)

but nothing is plotted.
What should I do?






On Wednesday, April 16, 2014 11:59:45 AM UTC+2, DrunkenPhD wrote:
>
> Ok this seems to work perfect,
>
> Dont want to be boring but what if plotting in real map smth like:
>
> hdf <- get_map(location = c(20.166065, 41.270679), zoom = 8, maptype = 
> 'roadmap')
> ggmap(hdf, extent = 'device')
>
> Regards
>
> On Wednesday, April 16, 2014 11:51:43 AM UTC+2, Jim Lemon wrote:
>>
>> On 04/16/2014 04:43 AM, DrunkenPhD wrote: 
>> > Sorry Jim not for my beginner level :((( 
>> > 
>> > So if my data are like this : 
>> > 
>> > x           y           City  Village 
>> > 
>> > 19.9437314  40.7086377  120   425 
>> > 
>> > 20.2214171  41.4924336  1     1 
>> > 
>> > 20.0955891  39.9481364  4     4 
>> > 
>> > 20.9506636  40.6447347  10    15 
>> > 
>> > how do I plot for every point(x,y) a pie chart with 
>> slices(City,Village)? 
>> > 
>> > Regards 
>> > 
>> > 
>> > how do I plot for every point(x,y) a pie chart with 
>> slices(City,Village)? 
>> > 
>> > 
>> Okay, let's try again. 
>>
>> # get yer data 
>> sampledf<-read.table(text="x y City Village 
>>   19.9437314 40.7086377 120 425 
>>   20.2214171 41.4924336 1 1 
>>   20.0955891 39.9481364 4 4 
>>   20.9506636 40.6447347 10 15",header=TRUE) 
>> # make the graphics device high enough to hold Albania 
>> x11(height=10,width=5) 
>> # display the map upon which you want the pies 
>> map(xlim=c(19,21.1),ylim=c(39.5,42.6)) 
>> # display the pies 
>> library(plotrix) 
>> for(pp in 1:4) floating.pie(sampledf[pp,"x"],sampledf[pp,"y"], 
>>   unlist(sampledf[pp,c("City","Village")]),radius=0.1) 
>>
>> Jim 
>>
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Lazy loading of CSV file

2014-04-16 Thread Luca Cerone
Hi, in a package I am developing some functions need to use some external
data.
I have these data as a set of .csv files that I have placed in the
inst/extdata folder.

At the moment I have a file "db-internal.r" where I load all the internal
databases that could be used by the functions in my package,
and assign them to some global (to the package) variables (all with the
prefix db_ in front of them).
For example (I couldn't come up with a better name, sorry):

db_italian_cities = read.csv(system.file("extdata/italian_cities.csv"))

This way I can use db_italian_cities in my functions.

Some of these datasets are quite big and really slow down loading the
package, and for some of the tasks the package is meant to solve they might
not even be required.
I would like to be able to lazy-load these datasets only when needed; how
can I achieve this without creating special databases?

Some of them could change, so I intend to be able to download the most
recent ones through a function that ensures the package is using the most
recent version,
so I would really prefer to simply use CSV files.

Thanks a lot in advance for the help!

Cheers,
Luca

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] multiple plots on same sheet in R

2014-04-16 Thread eliza botto
  Dear useRs,
I drew 12 separate raster maps. I want to combine them so that they 
appear on the same sheet in R, which will later be saved. Each row should 
contain three raster maps, so in total we should have 4 rows, each row 
containing 3 rasters. I know that mfrow() can do it, but I don't know 
exactly how.

Eliza 
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] multiple plots on same sheet in R

2014-04-16 Thread Pascal Oettli
Hi,

Did you have a look at the "rasterVis" package?
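A hedged sketch of what that might look like (toy data standing in for the real rasters):

```r
library(raster)
library(rasterVis)
# build a toy 12-layer stack; replace with your own rasters
s <- stack(lapply(1:12, function(i) raster(matrix(runif(100), 10, 10))))
# lay the layers out as 3 columns x 4 rows with a shared colour key
levelplot(s, layout = c(3, 4))
```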

Regards,
Pascal

On Wed, Apr 16, 2014 at 9:33 PM, eliza botto  wrote:
>   Dear useRs,
> I drew 12 separate raster maps. I want to combine them in such a way the they 
> appear on the same sheet in R, which will later on be saved. Each row should 
> contain three raster maps, so in total we should have 4 rows with each row 
> containing 3 rasters. I know that mfrow() can do it but I don't exactly know 
> how?
>
> Eliza



-- 
Pascal Oettli
Project Scientist
JAMSTEC
Yokohama, Japan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Lazy loading of CSV file

2014-04-16 Thread Jeff Newmiller
The standard way to put data into a package is to convert it to RDA as 
described in the Writing R Extensions document. This is faster and more compact 
than CSV.
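A hedged sketch of that workflow (paths hypothetical):

```r
# one-off conversion from CSV to a compressed .rda in the package sources
italian_cities <- read.csv("inst/extdata/italian_cities.csv")
save(italian_cities, file = "data/italian_cities.rda", compress = "xz")
# with LazyData: true in DESCRIPTION, the installed package then loads
# 'italian_cities' only when it is first used
```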
---
Jeff NewmillerThe .   .  Go Live...
DCN:Basics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

On April 16, 2014 5:10:03 AM PDT, Luca Cerone  wrote:
>Hi, in a package I am developing some functions need to use some
>external
>data.
>I have these data as a set of .csv files that I have placed in the
>inst/extdata folder.
>
>At the moment I have a file "db-internal.r" where I  load all the
>internal
>databases that could be used by the functions in my package;
>and assign them to some global (to the package) variables (all with the
>prefix db_ in front of them)
>For example (I didn't come out with a better name, sorry)
>
>db_italian_cities = read.csv(system.file("extdata/italian_cities.csv")
>
>like this I can use db_italian_cities in my functions.
>
>Some of these datasets are quite big and really slow down loading the
>package, plus for some of the task the package is meant to solve they
>might
>not even be required.
>I would like to be able to lazyload these datasets only when needed,
>how
>can I possibly achieve this without creating special databases?
>
>Some of them could change, so I intend to be able to download the most
>recent ones through a function that ensure the package is using the
>most
>recent version,
>so I would really prefer to simply use csv files.
>
>Thanks a lot in advance for the help!
>
>Cheers,
>Luca
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Lazy loading of CSV file

2014-04-16 Thread Luca Cerone
Hi Jeff,
thanks for your reply.

I read about it in the documentation and I plan to build the functions to
download the csv and convert to rda in the future.
At the moment, however, I am looking for a quick and dirty solution to lazy
load the files I already have.
(I know I could convert them straight to .rda files and will do it, I am
using the occasion to learn some other
way to do it).

For example I have tried doing something like

lazyLoadFun <- function(path){
   ds <- NULL
   read.data <- function(){
      if (is.null(ds)){
         message("importing dataset the first time")
         ds <<- read.csv(path)  # '<<-' updates 'ds' in the enclosing
                                # environment, so the cache actually persists
      } else {
         message("using cached copy of the dataset")
      }
      return(ds)
   }
   return(read.data)
}

get.italian.cities <- lazyLoadFun(system.file("extdata/italian_cities.csv"))


In the functions where I need to use italian_cities I then do:
italian_cities <- get.italian.cities()
The first time I call get.italian.cities the file is loaded and cached in
"ds", while on subsequent calls the cached variable "ds" is retrieved.

The problem is that R doesn't pass variables by reference, so that ds is
copied into italian_cities, which makes things quite slow for big datasets
and makes using cached data useless.

I am figuring out, for example, if it is possible to use "assign" and avoid
copying the data into italian_cities.

Thanks again for your reply, I will definitely look into .rda files!

Luca


2014-04-16 15:39 GMT+02:00 Jeff Newmiller :

> The standard way to put data into a package is to convert it to RDA as
> described in the Writing R Extensions document. This is faster and more
> compact than CSV.
> ---
> Jeff NewmillerThe .   .  Go Live...
> DCN:Basics: ##.#.   ##.#.  Live
> Go...
>   Live:   OO#.. Dead: OO#..  Playing
> Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
> /Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
> ---
> Sent from my phone. Please excuse my brevity.
>
> On April 16, 2014 5:10:03 AM PDT, Luca Cerone 
> wrote:
> >Hi, in a package I am developing some functions need to use some
> >external
> >data.
> >I have these data as a set of .csv files that I have placed in the
> >inst/extdata folder.
> >
> >At the moment I have a file "db-internal.r" where I  load all the
> >internal
> >databases that could be used by the functions in my package;
> >and assign them to some global (to the package) variables (all with the
> >prefix db_ in front of them)
> >For example (I didn't come out with a better name, sorry)
> >
> >db_italian_cities = read.csv(system.file("extdata/italian_cities.csv")
> >
> >like this I can use db_italian_cities in my functions.
> >
> >Some of these datasets are quite big and really slow down loading the
> >package, plus for some of the task the package is meant to solve they
> >might
> >not even be required.
> >I would like to be able to lazyload these datasets only when needed,
> >how
> >can I possibly achieve this without creating special databases?
> >
> >Some of them could change, so I intend to be able to download the most
> >recent ones through a function that ensure the package is using the
> >most
> >recent version,
> >so I would really prefer to simply use csv files.
> >
> >Thanks a lot in advance for the help!
> >
> >Cheers,
> >Luca
> >
>
>


-- 
*Luca Cerone*

Tel: +34 692 06 71 28
Skype: luca.cerone

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] indexing names for looping across computations done on pairs of matrices

2014-04-16 Thread Cade, Brian
A.K.  Wonderful - thank you.  The last set of code using sapply(),
seq_along(), and get() worked like a charm and was what I was missing.

Brian

Brian S. Cade, PhD

U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818

email:  ca...@usgs.gov 
tel:  970 226-9326



On Tue, Apr 15, 2014 at 8:11 PM, arun  wrote:

> Hi,
>
> If the two pairs of matrices are in a list:
> set.seed(42)
> lst1 <- lapply(1:11, function(x) matrix(sample(40, 20, replace=TRUE), 5,4))
>
> names(lst1) <- paste0("gs", paste0("4.",seq(0,100,by=10)))
> set.seed(585)
>
> lst2 <- lapply(1:11, function(x) matrix(sample(40, 20, replace=TRUE), 5,4))
>
> names(lst2) <- paste0("ps", paste0("1.",seq(0,100,by=10)))
> mean.comb <- data.frame(simulation=0:10,
> mean.horses.removed=sapply(seq_along(lst1),function(i)
> mean(rowSums(lst1[[i]]*lst2[[i]]))))
>
> #or
>
> # if the datasets are standalone with no particular order
> #just for demonstration
> attach(lst1)
>
> attach(lst2)
>
> vec1 <- sample(paste0("gs", paste0("4.",seq(0,100,by=10))), 11,
> replace=FALSE)
> vec2 <- sample(paste0("ps", paste0("1.",seq(0,100,by=10))), 11,
> replace=FALSE)
>
> vec1New <- vec1[order(as.numeric(gsub(".*\\.","",vec1)))]
>
> vec2New <- vec2[order(as.numeric(gsub(".*\\.","",vec2)))]
>
> mean.comb2 <- data.frame(simulation=0:10,
> mean.horses.removed=sapply(seq_along(vec1New),function(i)
> mean(rowSums(get(vec1New[i])* get(vec2New[i])))))
> identical(mean.comb,mean.comb2)
>
> #[1] TRUE
>
>
> A.K.
>
>
>
>
>
> On Tuesday, April 15, 2014 4:30 PM, "Cade, Brian"  wrote:
> So I know I must be missing something simple and obvious for the following
> data manipulation where I have (in this example) 11 pairs of matrices
> (gs4.0 to gs4.100 and ps1.0 to ps1.100) from some population simulations
> (all with same dimensions) where I want to get some summary statistics on
> the products of the cells in a pair (e.g., gs4.0 * ps1.0).  The code I
> wrote below works fine, but it seems like there ought to be a simple way to
> index the extensions on the names (.0 to .100) in a for loop to simplify
> this code greatly.  I've spent some time trying various things using
> paste() and assign() and have had no success.
>
> mean.comb <- as.matrix(0:10,nrow = 11, ncol=1)
> mean.comb <- cbind(mean.comb,0)
>
> ###to see list of files created
> gs4files <- ls(pattern="gs4.*0")
> ps1files <- ls(pattern="ps1.*0")
>
> mean.comb[1,2] <- mean(apply(gs4.0 * ps1.0,1,sum))
> mean.comb[2,2] <- mean(apply(gs4.10 * ps1.10,1,sum))
> mean.comb[3,2] <- mean(apply(gs4.20 * ps1.20,1,sum))
> mean.comb[4,2] <- mean(apply(gs4.30 * ps1.30,1,sum))
> mean.comb[5,2] <- mean(apply(gs4.40 * ps1.40,1,sum))
> mean.comb[6,2] <- mean(apply(gs4.50 * ps1.50,1,sum))
> mean.comb[7,2] <- mean(apply(gs4.60 * ps1.60,1,sum))
> mean.comb[8,2] <- mean(apply(gs4.70 * ps1.70,1,sum))
> mean.comb[9,2] <- mean(apply(gs4.80 * ps1.80,1,sum))
> mean.comb[10,2] <- mean(apply(gs4.90 * ps1.90,1,sum))
> mean.comb[11,2] <- mean(apply(gs4.100 * ps1.100,1,sum))
>
> mean.comb<- data.frame(mean.comb)
> colnames(mean.comb) <- c("simulation", "mean.horses.removed")
>
>
> Brian
>
> Brian S. Cade, PhD
>
> U. S. Geological Survey
> Fort Collins Science Center
> 2150 Centre Ave., Bldg. C
> Fort Collins, CO  80526-8818
>
> email:  ca...@usgs.gov 
> tel:  970 226-9326
>
>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] interpreting weight in meta-analysis of proportion

2014-04-16 Thread petretta

Dear all,

I use R 3.0 for Windows.

I performed a meta-analysis of the prevalence (single proportion)
reported in 14 different studies using the command:


res <- metaprop(case, n, sm="PFT", comb.fixed=FALSE, comb.random=TRUE,
                studlab = paste(Study))


print(res)

A referee asks for a brief explanation of the W statistic reported in the
results; in particular, why the sum of the individual weights of all
the studies is 100%.


Any suggestion is welcome.


--
Mario Petretta
Department of Translational Medical Sciences
Naples University Federico II
Italy

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Lazy loading of CSV file

2014-04-16 Thread Luca Cerone
I just found out that the function "delayedAssign" is what I was looking
for.
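For the archive, a minimal sketch of that approach (with a temporary file standing in for the packaged CSV):

```r
# stand-in for the real data file
csv <- tempfile(fileext = ".csv")
write.csv(data.frame(city = c("Roma", "Milano"), pop = c(2873000, 1352000)),
          csv, row.names = FALSE)

# defer read.csv() until the variable is first touched; the promise caches
# the result, so the file is read at most once
delayedAssign("db_italian_cities", {
  message("importing dataset the first time")
  read.csv(csv)
})

nrow(db_italian_cities)  # forces the promise: reads the file, prints the message
nrow(db_italian_cities)  # cached: no message the second time
```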

2014-04-16 16:51 GMT+02:00 Luca Cerone :

> Hi Jeff,
> thanks for your reply.
>
> I read about it in the documentation and I plan to build the functions to
> download the csv and convert to rda in the future.
> At the moment, however, I am looking for a quick and dirty solution to
> lazy load the files I already have.
> (I know I could convert them straight to .rda files and will do it, I am
> using the occasion to learn some other
> way to do it).
>
> For example I have tried doing something like
>
> lazyLoadFun <- function(path){
> ds <- NULL
> read.data <- function(){
>if (is.null(ds)){
>   message("importing dataset the first time")
>   ds <- read.csv(path)
>} else {
>   message("using cached copy of the dataset")
>}
>return(ds)
> }
> return(read.data)
> }
>
> get.italian.cities <-
> lazyLoadFun(system.file("extdata/italian_cities.csv"))
>
>
> In the functions where I need to use italian_cities I then do:
> italian_cities <- get.italian.cities()
> The first time I use get.italian.cities the file is loaded and cached into
> "ds", while the other time the cached variable "ds" is retrieved.
>
> The problem is that R doesn't pass variables by reference, so that ds is
> copied into italian_cities, which makes things quite slow for big datasets
> and makes using cached data useless.
>
> I am figuring out, for example, if it is possible to use "assign" and
> avoid copying the data into italian_cities.
>
> Thanks again for your reply, I will definitively look into .rda files!
>
> Luca
>
>
> 2014-04-16 15:39 GMT+02:00 Jeff Newmiller :
>
> The standard way to put data into a package is to convert it to RDA as
>> described in the Writing R Extensions document. This is faster and more
>> compact than CSV.
>>
>> ---
>> Jeff NewmillerThe .   .  Go
>> Live...
>> DCN:Basics: ##.#.   ##.#.  Live
>> Go...
>>   Live:   OO#.. Dead: OO#..  Playing
>> Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
>> /Software/Embedded Controllers)   .OO#.   .OO#.
>>  rocks...1k
>>
>> ---
>> Sent from my phone. Please excuse my brevity.
>>
>> On April 16, 2014 5:10:03 AM PDT, Luca Cerone 
>> wrote:
>> >Hi, in a package I am developing some functions need to use some
>> >external
>> >data.
>> >I have these data as a set of .csv files that I have placed in the
>> >inst/extdata folder.
>> >
>> >At the moment I have a file "db-internal.r" where I  load all the
>> >internal
>> >databases that could be used by the functions in my package;
>> >and assign them to some global (to the package) variables (all with the
>> >prefix db_ in front of them)
>> >For example (I didn't come out with a better name, sorry)
>> >
>> >db_italian_cities = read.csv(system.file("extdata/italian_cities.csv")
>> >
>> >like this I can use db_italian_cities in my functions.
>> >
>> >Some of these datasets are quite big and really slow down loading the
>> >package, plus for some of the task the package is meant to solve they
>> >might
>> >not even be required.
>> >I would like to be able to lazyload these datasets only when needed,
>> >how
>> >can I possibly achieve this without creating special databases?
>> >
>> >Some of them could change, so I intend to be able to download the most
>> >recent ones through a function that ensure the package is using the
>> >most
>> >recent version,
>> >so I would really prefer to simply use csv files.
>> >
>> >Thanks a lot in advance for the help!
>> >
>> >Cheers,
>> >Luca
>> >
>>
>>
>
>
> --
> *Luca Cerone*
>
> Tel: +34 692 06 71 28
> Skype: luca.cerone
>



-- 
*Luca Cerone*

Tel: +34 692 06 71 28
Skype: luca.cerone

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] multiple plots on same sheet in R

2014-04-16 Thread Federico Lasa
see: ?par

Does running

par(mfrow=c(4,3))

do the job?
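A minimal sketch (placeholder plots standing in for the rasters):

```r
# 4 rows x 3 columns of base-graphics panels on one device
old <- par(mfrow = c(4, 3))
for (i in 1:12) plot(runif(10), main = paste("map", i))
par(old)  # restore the previous layout
```

With the raster package, calling plot() on each of the 12 rasters inside such a loop should fill the grid the same way (legend placement may need tweaking).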


On Wed, Apr 16, 2014 at 7:33 AM, eliza botto  wrote:
>   Dear useRs,
> I drew 12 separate raster maps. I want to combine them in such a way the they 
> appear on the same sheet in R, which will later on be saved. Each row should 
> contain three raster maps, so in total we should have 4 rows with each row 
> containing 3 rasters. I know that mfrow() can do it but I don't exactly know 
> how?
>
> Eliza

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] paper on cross-validation pitfalls

2014-04-16 Thread Damjan
Dear All,
We have recently published an article on cross-validation pitfalls:
http://www.jcheminf.com/content/6/1/10/
In the article, amongst other things, we discuss nested cross-validation and
variable selection with cross-validation, which I am sure is of interest to
the statistical modelling community. I am keen to hear a critical assessment
of our paper from the community.
So far it was brought to my attention that:
a) In the section "Preprocessing" I failed to mention that we used the caret
R package and failed to reference Max Kuhn's 2008 paper "Building Predictive
Models in R Using the caret Package".
b) In the section "Nested cross-validation" I failed to mention the
Bioinformatics 2004 paper by Statnikov et al., "A comprehensive evaluation of
multicategory classification methods...", because they also suggested nested
cross-validation.
c) In the section "Discussion" I should have explained that nested
cross-validation may also be applied to cases where variable selection is
performed with cross-validation. There is a Bioinformatics 2010 paper by
Lagani and Tsamardinos, "Structure-Based Variable Selection for Survival
Data", where they applied nested cross-validation for variable selection
with cross-validation.
Looking forward to hearing from you.
DK
P.S. I am the corresponding author; please use my email address specified in
the paper.
  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Mean not working in function

2014-04-16 Thread arun
Hi,
Try:
set.seed(48)
dat1 <- data.frame(matrix(sample(c(NA, 1:20), 4 * 20, replace = TRUE), ncol = 4),
                   fac1 = sample(LETTERS, 20, replace = TRUE))

mysummary <- function(dataf) {
  indx <- sapply(dataf, is.numeric)
  for (name in names(dataf)[indx]) {
    cat("Variable name: ", name, ": Mean=", mean(dataf[, name], na.rm = TRUE), "\n")
  }
}

mysummary2 <- function(dataf) {
  indx <- sapply(dataf, is.numeric)
  cat(paste(paste("Variable name: ", names(dataf)[indx], ": Mean=",
                  format(colMeans(dataf[, indx], na.rm = TRUE), digits = 7),
                  collapse = "\n"), "\n"))
}

A.K.



Thanks for the reply.
How can I avoid getting factor variables in this loop? I think I need to use
dataf[sapply(df2, is.numeric)], but I am not able to combine it with
mean(dataf[,name], na.rm=TRUE).


On Friday, April 11, 2014 12:47 PM, arun  wrote:
Hi,
Try:
mysummary <- function(dataf){
  for(name in names(dataf)){
    cat("Variable name: ", name, ": Mean=", mean(dataf[,name], na.rm=TRUE), "\n")
  }
}

##Using some example data:

set.seed(48)
dat1 <- as.data.frame(matrix(sample(c(NA,1:20),4*20,replace=TRUE),ncol=4)) 
mysummary(dat1)
#Variable name:  V1 : Mean= 11.64706
#Variable name:  V2 : Mean= 10.9
#Variable name:  V3 : Mean= 12.35 

#Variable name:  V4 : Mean= 10.52632 


#Another way would be:

mysummary2 <- function(dataf){
  cat(paste(paste("Variable name: ", names(dataf), ": Mean=",
                  format(colMeans(dataf, na.rm=TRUE), digits=7),
                  collapse="\n"), "\n"))
}
mysummary2(dat1) 

#Variable name:  V1 : Mean= 11.64706
#Variable name:  V2 : Mean= 10.9
#Variable name:  V3 : Mean= 12.35000
#Variable name:  V4 : Mean= 10.52632 


A.K.


I am trying the following function:

mysummary <- function(dataf){
  for(name in names(dataf)){
    cat("Variable name: ", name, ": Mean=", mean(name), "\n")
  }
}

The variable name is properly picked up, but the mean shows as NA. The
command warnings() shows:
In mean.default(name) : argument is not numeric or logical: returning NA
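For the archive, a hedged note on the cause of that warning: inside the loop, name is a character string (the column's name), not the column itself, so mean(name) is averaging a string; indexing the data frame with the name fixes it:

```r
dat <- data.frame(a = c(1, 2, NA), b = 4:6)
for (name in names(dat)) {
  # dat[[name]] extracts the column whose name is stored in 'name'
  cat("Variable name: ", name, ": Mean=", mean(dat[[name]], na.rm = TRUE), "\n")
}
```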

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Extracting Width and Length of Each Shape in EPS File

2014-04-16 Thread Muhammad Abdur Rehman Khawaja
Respected Fellows,
I need a little guidance on how I can extract/calculate/measure the
width and length of each and every shape in an .eps file. The EPS file is
generated from Adobe Illustrator.
I have used the grImport library in R to import the EPS file into the R
environment, but I could not understand what format the data is in or
how I can manipulate it.
I shall be thankful for your cooperation.
--
Kind Regards
Khawaja Muhammad Abdur Rehman
Mechatronics Engineer NUST
Professional Profile:
http://pk.linkedin.com/in/khawajamechatronicscaps/

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Problems installing packages

2014-04-16 Thread Anton FRANCOIS
Hello,

I am working on Ubuntu 13.10 (saucy),

and my version() function returns:

version

               _
platform       i686-pc-linux-gnu
arch           i686
os             linux-gnu
system         i686, linux-gnu
status
major          3
minor          1.0
year           2014
month          04
day            10
svn rev        65387
language       R
version.string R version 3.1.0 (2014-04-10)
nickname       Spring Dance



With my class (we are all on Ubuntu 13.10) we tried to install several
packages, such as rDeducer, ggplot2 and doBy.
Originally I was working with R 3.0.1, and I managed to install ggplot2 but
not the others.
So I upgraded to version 3.1.0, and now no package will install.
Each time I get the warning message:


Warning message:

In install.packages("ggplot2") :

installation of package ‘ggplot2’ had non-zero exit status

Maybe the problem comes from rJava, or a bad compiler?

I really do not know what to do to make it work.
Can you help?

Thanks for your answer.
Sincerely

Anton Francois

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] plot legend in filled.contour plot with infinite limits

2014-04-16 Thread jlehm
Dear R-users,

 

I would like to manipulate the legend bar of a filled.contour plot in the
same way as it is done in the attached example I found on the web.

In particular, I would like to limit my z-range and then have triangles
at the ends of the legend indicating that values higher than max(z-range)
or lower than min(z-range) are included in the last color given at
the ends of the legend.

Does anyone have an idea how to do this?

Any help would be highly appreciated as I just can't find a solution myself.



--
View this message in context: 
http://r.789695.n4.nabble.com/plot-legend-in-filled-contour-plot-with-infinite-limits-tp4688905.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Values cells

2014-04-16 Thread Alexsandro Cândido de Oliveira Silva
Hi folks,

I’m using the raster package.

I need to get the min and max values among the cells of a raster,
something like this: [min, max]. How?





[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Values cells

2014-04-16 Thread Frede Aakmann Tøgersen
Well, if you are using the print method on a raster object  what do you get 
then?

Try this:

> library(raster)
Loading required package: sp
>  logo <- raster(system.file("external/rlogo.grd", package="raster")) 
> print(logo)
class   : RasterLayer 
band: 1  (of  3  bands)
dimensions  : 77, 101, 7777  (nrow, ncol, ncell)
resolution  : 1, 1  (x, y)
extent  : 0, 101, 0, 77  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=merc 
data source : c:\Programmer\R\R-3.0.2\library\raster\external\rlogo.grd 
names   : red 
values  : 0, 255  (min, max)

See the values entry. Since this is a color image there are 3 bands. What you 
see above is for the first band ("red").

Going from 0 to 255 for red is no surprise. You will probably see the same for 
green and blue as well.

What are you really looking for?
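A hedged sketch of pulling those extremes out programmatically:

```r
library(raster)
logo <- raster(system.file("external/rlogo.grd", package = "raster"))
minValue(logo)          # stored minimum of the layer
maxValue(logo)          # stored maximum of the layer
cellStats(logo, range)  # computed from the cell values: c(min, max)
```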

Yours sincerely / Med venlig hilsen


Frede Aakmann Tøgersen
Specialist, M.Sc., Ph.D.
Plant Performance & Modeling

Technology & Service Solutions
T +45 9730 5135
M +45 2547 6050
fr...@vestas.com
http://www.vestas.com

Company reg. name: Vestas Wind Systems A/S
This e-mail is subject to our e-mail disclaimer statement.
Please refer to www.vestas.com/legal/notice
If you have received this e-mail in error please contact the sender. 

> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> On Behalf Of Alexsandro Cândido de Oliveira Silva
> Sent: 16. april 2014 20:07
> To: r-help@r-project.org
> Subject: [R] Values cells
> 
> Hi folks,
> 
> I'm using the raster package.
> 
> I need to get the min and max values between the cells of a raster,
> something like this: [min, max]. How?
> 
> 
> 
> 
> 



[R] Using pre-defined IRT parameters in ltm package

2014-04-16 Thread Luana Marotta
Hi all,

I want to estimate scores using IRT when the model parameters are known.
I'd like to define my parameters a priori using the ltm package, but
haven't found much information on the manual.

I appreciate any help on this.

Thanks,

Luana




[R] help incorporating data subset lengths in function with ddply

2014-04-16 Thread Steve E.
Dear R Community,

I am having some trouble with a task that I hope you might be able to help
with. I have a dataset that includes the time and corresponding stream
discharge from numerous storms (example of structure with simplified data
below). I would like to produce a field that details the duration of each
storm, where each storm is a subset of the data and the duration runs from
zero to end for each unique storm. I have been trying to accomplish this
with ddply but to no avail as I am unable to provide ddply (e.g., below)
with the length of the storm (i.e., subset of data). Thank you in advance,
any help would be appreciated.


existing df:
storm,Q_time,Q
s1,2008-08-07 21:15:00,0.000
s1,2008-08-07 21:16:00,3.020
s1,2008-08-07 21:17:00,6.041
s1,2008-08-07 21:18:00,9.061
s1,2008-08-07 21:19:00,12.082
s1,2008-08-07 21:20:00,15.102
s1,2008-08-07 21:21:00,18.123
s1,2008-08-07 21:22:00,11.143
s1,2008-08-07 21:23:00,0.000
s2,2010-10-05 21:00:00,0.000
s2,2010-10-05 21:01:00,1.812
s2,2010-10-05 21:02:00,3.625
s2,2010-10-05 21:03:00,5.437
s2,2010-10-05 21:04:00,7.249
s2,2010-10-05 21:05:00,9.061
s2,2010-10-05 21:06:00,0.874
s2,2010-10-05 21:07:00,0.000

desired df:
storm,Q_time,Q, duration
s1,2008-08-07 21:15:00,0.000,1
s1,2008-08-07 21:16:00,3.020,2
s1,2008-08-07 21:17:00,6.041,3
s1,2008-08-07 21:18:00,9.061,4
s1,2008-08-07 21:19:00,12.082,5
s1,2008-08-07 21:20:00,15.102,6
s1,2008-08-07 21:21:00,18.123,7
s1,2008-08-07 21:22:00,11.143,8
s1,2008-08-07 21:23:00,0.000,9
s2,2010-10-05 21:00:00,0.000,1
s2,2010-10-05 21:01:00,1.812,2
s2,2010-10-05 21:02:00,3.625,3
s2,2010-10-05 21:03:00,5.437,4
s2,2010-10-05 21:04:00,7.249,5
s2,2010-10-05 21:05:00,9.061,6
s2,2010-10-05 21:06:00,0.874,7
s2,2010-10-05 21:07:00,0.000,8

I have been trying variations of the following statement, but I cannot seem
to get the length of the subset correct as I receive an error of the type
'Error: arguments imply differing number of rows: 2401, 0'.

newdf <- ddply(df, "storm", transform, FUN = function(x)
{duration=seq(from=1, by=1, length.out=nrow(x))})

I would really like to get a handle on ddply in this instance as it will be
quite helpful for many other similar calculations that I need to do with
this dataset.

Thanks again,
Stevan







Re: [R] help incorporating data subset lengths in function with ddply

2014-04-16 Thread Frede Aakmann Tøgersen
Hi

Do you seek something like this:


mydat <- read.table(text="
storm,Q_time,Q, duration
s1,2008-08-07 21:15:00,0.000,1
s1,2008-08-07 21:16:00,3.020,2
s1,2008-08-07 21:17:00,6.041,3
s1,2008-08-07 21:18:00,9.061,4
s1,2008-08-07 21:19:00,12.082,5
s1,2008-08-07 21:20:00,15.102,6
s1,2008-08-07 21:21:00,18.123,7
s1,2008-08-07 21:22:00,11.143,8
s1,2008-08-07 21:23:00,0.000,9
s2,2010-10-05 21:00:00,0.000,1
s2,2010-10-05 21:01:00,1.812,2
s2,2010-10-05 21:02:00,3.625,3
s2,2010-10-05 21:03:00,5.437,4
s2,2010-10-05 21:04:00,7.249,5
s2,2010-10-05 21:05:00,9.061,6
s2,2010-10-05 21:06:00,0.874,7
s2,2010-10-05 21:07:00,0.000,8",
h = TRUE, sep =",", stringsAsFactors = FALSE)

mydat$Q_time <- as.POSIXct(strptime(mydat$Q_time, format = "%Y-%m-%d %H:%M:%S"))

tmp <- aggregate(Q_time ~ storm, data = mydat, FUN = function(x) diff(range(x)))

tmp

str(tmp)



Yours sincerely / Med venlig hilsen


Frede Aakmann Tøgersen
Specialist, M.Sc., Ph.D.
Plant Performance & Modeling



> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> On Behalf Of Steve E.
> Sent: 16. april 2014 21:01
> To: r-help@r-project.org
> Subject: [R] help incorporating data subset lengths in function with ddply
> 
> Dear R Community,
> 
> I am having some trouble with a task that I hope you might be able to help
> with. I have a dataset that includes the time and corresponding stream
> discharge from numerous storms (example of structure with simplified data
> below). I would like to produce a field that details the duration of each
> storm, where each storm is a subset of the data and the duration runs from
> zero to end for each unique storm. I have been trying to accomplish this
> with ddply but to no avail as I am unable to provide ddply (e.g., below)
> with the length of the storm (i.e., subset of data). Thank you in advance,
> any help would be appreciated.
> 
> 
> existing df:
> storm,Q_time,Q
> s1,2008-08-07 21:15:00,0.000
> s1,2008-08-07 21:16:00,3.020
> s1,2008-08-07 21:17:00,6.041
> s1,2008-08-07 21:18:00,9.061
> s1,2008-08-07 21:19:00,12.082
> s1,2008-08-07 21:20:00,15.102
> s1,2008-08-07 21:21:00,18.123
> s1,2008-08-07 21:22:00,11.143
> s1,2008-08-07 21:23:00,0.000
> s2,2010-10-05 21:00:00,0.000
> s2,2010-10-05 21:01:00,1.812
> s2,2010-10-05 21:02:00,3.625
> s2,2010-10-05 21:03:00,5.437
> s2,2010-10-05 21:04:00,7.249
> s2,2010-10-05 21:05:00,9.061
> s2,2010-10-05 21:06:00,0.874
> s2,2010-10-05 21:07:00,0.000
> 
> desired df:
> storm,Q_time,Q, duration
> s1,2008-08-07 21:15:00,0.000,1
> s1,2008-08-07 21:16:00,3.020,2
> s1,2008-08-07 21:17:00,6.041,3
> s1,2008-08-07 21:18:00,9.061,4
> s1,2008-08-07 21:19:00,12.082,5
> s1,2008-08-07 21:20:00,15.102,6
> s1,2008-08-07 21:21:00,18.123,7
> s1,2008-08-07 21:22:00,11.143,8
> s1,2008-08-07 21:23:00,0.000,9
> s2,2010-10-05 21:00:00,0.000,1
> s2,2010-10-05 21:01:00,1.812,2
> s2,2010-10-05 21:02:00,3.625,3
> s2,2010-10-05 21:03:00,5.437,4
> s2,2010-10-05 21:04:00,7.249,5
> s2,2010-10-05 21:05:00,9.061,6
> s2,2010-10-05 21:06:00,0.874,7
> s2,2010-10-05 21:07:00,0.000,8
> 
> I have been trying variations of the following statement, but I cannot seem
> to get the length of the subset correct as I receive an error of the type
> 'Error: arguments imply differing number of rows: 2401, 0'.
> 
> newdf <- ddply(df, "storm", transform, FUN = function(x)
> {duration=seq(from=1, by=1, length.out=nrow(x))})
> 
> I would really like to get a handle on ddply in this instance as it will be
> quite helpful for many other similar calculations that I need to do with
> this dataset.
> 
> Thanks again,
> Stevan
> 
> 
> 
> 
> 



Re: [R] Regarding snp data

2014-04-16 Thread David Winsemius

On Apr 15, 2014, at 4:15 PM, Manan Amin wrote:

> 
> Hello,
> 
> I have a data file of 108 MB with SNP data coded 0, 1, 2. I am new to the
> program and have to do this project. My first step is to get the p-value
> using the chisq.test function. I have tried running the function, but since
> I am new to the language I am not sure how to do this step. The error I am
> getting is 'cannot allocate vector of size'. I have the data loaded from a
> text file into a variable called data, and I am passing that variable to
> chisq.test with a df value of 2.
> Sent from my iPhone

Since you have gotten no response in 20 hours, you should start asking yourself 
whether anyone will be able to offer assistance with what little information 
you have provided. Try reading the Posting Guide to see what level of detail is 
expected for communication on a technical newsgroup such as R-help.



David Winsemius
Alameda, CA, USA



Re: [R] help incorporating data subset lengths in function with ddply

2014-04-16 Thread Steve E.
Hi Frede - Thank you for responding. Not quite what I am after. Notice that I
included two data sets in my post, the first is the raw data whereas the
second (the desired df) is similar but has a column of sequential numbers in
another column at the end - that column of sequential numbers for each storm
(i.e., subset of data) is what I am after. Thanks again, Stevan
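A sketch along these lines (an editorial addition, untested, assuming the df from the original post) should produce the duration column, since plyr evaluates the transform arguments within each storm's subset:

```r
## Hedged sketch: per-storm sequence numbers (1..n within each storm)
library(plyr)
newdf <- ddply(df, "storm", transform, duration = seq_along(Q))

## base-R equivalent using ave():
df$duration <- ave(seq_len(nrow(df)), df$storm, FUN = seq_along)
```

The original call failed because `FUN = function(x) ...` is not how `transform` receives new columns; the column definition must be passed directly, as above.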






[R] Help in analysis of similarity in Iramuteq

2014-04-16 Thread Ana Karla Silva Soares
I am conducting the analysis of similarity and when I request the 3D graph
the following error appears:

erreur R
None1
none
none

Can anyone help me?
Thank you.

Ana Karla S Soares


-- 
Ana Karla Silva Soares
PhD candidate in Social Psychology - UFPB
Master's in Social Psychology - UFPB
BA in Psychology - UFPB
CV Lattes:
http://buscatextual.cnpq.br/buscatextual/visualizacv.jsp?id=K4206524T0
Contact: a kssoa...@gmail.com / (83) 9652-8800




Re: [R] Trend test for hazard ratios

2014-04-16 Thread David Winsemius

On Apr 16, 2014, at 4:07 AM, Göran Broström wrote:

> On 04/15/2014 10:51 PM, David Winsemius wrote:
>> 
>> On Apr 15, 2014, at 6:32 AM, Therneau, Terry M., Ph.D. wrote:
>> 
>>> You can do statistical tests within a single model, for whether
>>> portions of it fit or do not fit.  But one cannot take three
>>> separate fits and compare them.  The program needs context to know
>>> how the three relate to one another.  Say that "group" is your
>>> strata variable, trt the variable of interest, and x1, x2 are
>>> adjusters.
>>> 
>>> fit <- coxph(Surv(time,status) ~ trt * strata(group) + x1 + x2,
>>> data=mydata)
>>> 
>>> Will fit a model with a separate treatment coefficient for each of
>>> the groups, and a separate baseline hazard for each.  One can now
>>> create a contrast that corresponds to your trend test, using
>>> vcov(fit) for the variance matrix and coef(fit) to retrieve the
>>> coefficients.
>>> 
>> 
>> I have at the moment on my workspace a dataset of breast cancer cases
>> extracted from SEER that has a factor representing three grades of
>> histology: $Grade.abb. $ Grade.abb : Factor w/ 3 levels
>> "Well","Moderate", "Poorly"
>> 
>> It would be reasonable to test whether the grading has a "trend" in
>> its effect when controlling for other factors (and it would be
>> surprising to a medical audience if there were no effect.). So I
>> compare across models using  AgeStgSiz.NG.Rad as my "linear" trend"
>> model (with one df for an `as.numeric` version, AgeStgSizRad as my
>> no-grade-included baseline, and AgeStgSiz.FG.Rad as my full factor
>> version:
> 
> David, your problem is different from the original one; I think you can solve 
> yours by transforming the (unordered) factor to an ordered one.

Thank you for your interest, I don't see that it is different. I think I do 
understand model comparison when the models do not have strata.

> 
> Try
> 
> AgeStgSiz.NG.Rad <- coxph(Surv(Survmon,
> Vital.status.recode..study.cutoff.used.=="Dead") ~ AgeDx+
> SEER.historic.stage.A+ as.numeric(Size.cat) + ordered(Grade.abb)
> + Rgrp , data=BrILR)
> 
> and contrasts based on orthogonal polynomials are used for Grade.abb
> 
> see ?contr.poly

Yes. That should give similar (and hopefully identical)  results to those 
obtained with the nested models for which I supplied the deviance estimates. 
The as.numeric() version would correspond to the linear term in an ordered 
factor polynomial contrast (although it would not be identical since it is not 
in the full model.) It should be reported as "highly significant" since it is 
associated with a change in deviance of 5888.492 - 5166.291 on one d.f.. 
Because there was no material difference in deviance (change of less than 1)  
with adding the covariate as an unordered factor, I would predict that the 
quadratic term should report a p-value in the conventionally "insignificant" 
range and that is what is seen:

(AgeStgSiz.OG.Rad <- coxph( Surv(Survmon, 
Vital.status.recode..study.cutoff.used.=="Dead") ~ AgeDx+ 
SEER.historic.stage.A+ as.numeric(Size.cat) + ordered(Grade.abb) + Rgrp , 
data=BrILR))
Call:
coxph(formula = Surv(Survmon, Vital.status.recode..study.cutoff.used. == 
"Dead") ~ AgeDx + SEER.historic.stage.A + as.numeric(Size.cat) + 
ordered(Grade.abb) + Rgrp, data = BrILR)


  coef exp(coef) se(coef)   zp
AgeDx   0.0393 1.040   0.0030  13.099 0.00
SEER.historic.stage.ALocalized  0.8986 2.456   0.0679  13.225 0.00
SEER.historic.stage.ARegional   1.4663 4.333   0.0682  21.513 0.00
as.numeric(Size.cat)0.1101 1.116   0.0028  39.362 0.00
ordered(Grade.abb).L0.5291 1.697   0.0234  22.608 0.00
ordered(Grade.abb).Q   -0.0166 0.984   0.0177  -0.941 0.35
RgrpTRUE   -0.2696 0.764   0.0177 -15.212 0.00

Likelihood ratio test=5889  on 7 df, p=0  n= 59583, number of events= 13070

Notice that : 1- pchisq( 2*diff(summary(AgeStgSiz.FG.Rad)[['loglik']]) 
-2*diff(summary(AgeStgSiz.NG.Rad)[['loglik']]), 1)
[1] 0.3462616
 
... is exactly the p-value reported for the quadratic term. (And the 
z-statistic is the negative of the square root of the difference in deviance.)

I was hoping for a demonstration of taking the coef() and vcov() values to 
construct contrast estimates and variances? I've tried hacking the estimable 
function in the gmodels package so that it would accept an object that had a 
coef and vcov object, but I'm "not quite there yet".
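A hypothetical sketch (an editorial addition, not from the thread) of the contrast construction Terry described, assuming the three per-group treatment coefficients sit at positions `idx` of the coefficient vector:

```r
## Hedged sketch: a linear-trend contrast from coef() and vcov() of a fit
b   <- coef(fit)
V   <- vcov(fit)
idx <- 1:3                      # assumed positions of the per-group trt coefficients
w   <- c(-1, 0, 1)              # linear trend weights across the three groups
est <- sum(w * b[idx])
se  <- sqrt(drop(t(w) %*% V[idx, idx] %*% w))
z   <- est / se
p   <- 2 * pnorm(-abs(z))
```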

-- 
David.

> 
> Göran B.
>> 
>>> AgeStgSiz.NG.Rad <- coxph( Surv(Survmon,
>>> Vital.status.recode..study.cutoff.used.=="Dead") ~ AgeDx+
>>> SEER.historic.stage.A+ as.numeric(Size.cat) + as.numeric(Grade.abb)
>>> + Rgrp , data=BrILR) AgeStgSizRad <- coxph( Surv(Survmon,
>>> Vital.status.recode..study.cutoff.used.=="Dead") ~ AgeDx+
>>> SEER.historic.stage.A+ as.numeric(Size.cat) + Rgrp , data=BrILR)
>>> AgeStgSiz.FG.Rad <- coxph( Surv(Survmon,
>>> Vital.status.recode..study.cuto

Re: [R] problem on package "survey" , function svyglm,

2014-04-16 Thread Thomas Lumley
On Tue, Apr 15, 2014 at 7:07 AM, Milan Bouchet-Valat wrote:

> Le lundi 14 avril 2014 à 13:59 -0400, Hanze Zhang a écrit :
> > Hi,
> >
> > I want to do logistic regression based on a complex sample design. I used
> > package survey, but when I ran svyglm, an error message came out:
> > Error in onestrat(x[index, , drop = FALSE], clusters[index],
> > nPSU[index][1],  :
> >   Stratum (16) has only one PSU at stage 1
> >
> 
> > How to solve this issue? Thank you.
> You need to merge manually the stratum with only one PSU with another
> stratum. See 3.2.1 in http://books.google.fr/books?id=L96ludyhFBsC
> (look for "single" in the whole book to find it).
>
>
Or set options(survey.lonely.psu) to one of the other values. But merging
strata is probably better.
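(A hedged sketch of the option approach, added editorially; see the survey
package's help for the accepted values:)

```r
## How a single-PSU stratum is handled, set before calling svyglm()
options(survey.lonely.psu = "adjust")  # or "certainty", "average", "remove"
```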

   -thomas

-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland




Re: [R] Problems installing packages

2014-04-16 Thread Uwe Ligges



On 16.04.2014 14:59, Anton FRANCOIS wrote:

Hello,

I am working on Ubuntu 13.10 (Saucy).

And version() returns:

version

               _
platform       i686-pc-linux-gnu
arch           i686
os             linux-gnu
system         i686, linux-gnu
status
major          3
minor          1.0
year           2014
month          04
day            10
svn rev        65387
language       R
version.string R version 3.1.0 (2014-04-10)
nickname       Spring Dance



With my class (we are all on Ubuntu 13.10) we tried to install several
packages, like rDeducer, ggplot2 and doBy.
Originally I was working with R 3.0.1, and I managed to install ggplot2 but
not the others.
So I upgraded to version 3.1.0, and now no package will install.
Each time I get the warning message:


Warning message:
In install.packages("ggplot2") :
  installation of package ‘ggplot2’ had non-zero exit status

Maybe the problem comes from Java, or a bad compiler?

I really do not know what to do to make it work.
Can you help?


Not without the output and the error messages.

Best,
Uwe Ligges




Thanks for the answer.
Sincerely,

Anton Francois









Re: [R] Values cells

2014-04-16 Thread MacQueen, Don
I'm pretty sure this is described and not hard to find in the raster
package documentation, which you can download from the CRAN package page.
In addition, r-sig-geo is generally the best place to ask questions related
to R's 'geo' packages (which include raster).

-Don

-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





On 4/16/14 11:06 AM, "Alexsandro Cândido de Oliveira Silva"
 wrote:

>Hi folks,
>
>I'm using the raster package.
>
>I need to get the min and max values between the cells of a raster,
>something like this: [min, max]. How?
>
>
>
>
>
>



Re: [R] Fw: Save multiple plots as pdf or jpeg

2014-04-16 Thread Uwe Ligges



On 16.04.2014 11:00, Pavneet Arora wrote:

Hello Uwe

Can you explain what you mean by "try to plot the density only with a few
hundred segments"?

Do you mean that I should take some kind of sample? And if so, what is the
best sample size, and how do I ensure that it is not biased?


Just ensure that your various plots do not contain too many elements to 
be plotted. If you really need that many, a bitmap graphics device may 
exceptionally be a better idea.
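(An editorial sketch of both suggestions, assuming `x` stands for one of
the columns being plotted:)

```r
## Hedged sketch: a density curve with fewer segments, plus a bitmap fallback
d <- density(x, n = 128)       # 128 evaluation points instead of the default 512
lines(d, col = "darkred", lwd = 2)

## or render each plot to a bitmap device instead of pdf():
png("qqnorm_%03d.png", width = 800, height = 800)
## ... plotting code ...
dev.off()
```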


Best,
Uwe Ligges










From:   Uwe Ligges 
To: Pavneet Arora/UK/RoyalSun@RoyalSun, r-help@r-project.org,
pavnee...@yahoo.co.uk
Date:   15/04/2014 14:08
Subject:Re: [R] Fw: Save multiple plots as pdf or jpeg



You have > 1e6 observations and your lines() have that many segments;
try to plot the density with only a few hundred segments.

Best,
Uwe Ligges


On 15.04.2014 12:27, Pavneet Arora wrote:

Hello All,

I have multiple plots that I want to save in a single file. At the moment,
I am saving them as a pdf. Each plot has a qqnorm, qqline, and a tiny
histogram in the top left with its density drawn on top of it and the
normal superimposed on the histogram.

As a result, the pdf takes forever to load and doesn't let me print, as it
runs out of memory. I was wondering if there is a way I can save each plot
as a jpeg and then export it to pdf. Will that make it quicker to load?
If so, how can I do this? And if not, what are the alternatives?

The code that I have at the moment is as follows:



~

pdf(file="C:/qqnorm/qqnorms.pdf") ##- Saves all plots in the same pdf file



for (k in 1:ncol(nums2)){
  par(mfrow=c(1,1))

##- QQNorm
  qqnorm(nums2[,k],col="lightblue",main=names(nums2)[k])
  qqline(nums2[,k],col="red",lwd=2)

##- Tiny Histogram
op = par(fig=c(.02,.5,0.4,0.98), new=TRUE)
  hist(nums2[,k],freq=F,col="blue",xlab="", ylab="", main="",
   axes=F,density=20)

##- Density of the variable
  lines(density(nums2[,k],na.rm=T), col="darkred", lwd=2)

##- Super-imposed Normal Density
curve(dnorm(x,mean=mean(nums2[,k],na.rm=T),sd=sd(nums2[,k],na.rm=T)),
col="black",lwd=3,add=T) ##- Footnote: title1 <- "nums2[k]"

library(moments)
  s_kurt <- kurtosis (nums2[,k])
  s_skew <- skewness (nums2[,k])
  mtxt <- paste ("Variable=",title1, ":" ,
                 "Kurt=",round(s_kurt,digits=4), "Skew",
                 round(s_skew,digits=4), sep=" ")

mtext (mtxt,col="green4",side=1,line=15.6,adj=0.0,cex=0.8,font=2,las=1)

}
dev.off()


~


Structure of My data:

str(nums2)
'data.frame':1355615 obs. of  39 variables:
   $ month   : int  1 1 1 1 1 1 1 1 1 1 ...
   $ Location_Easting_OSGR   : int  525680 524170 524520 526900
528060 524770 524220 525890 527350 524550 ...
   $ Location_Northing_OSGR  : int  178240 181650 182240 177530
179040 181160 180830 179710 177650 180810 ...
   $ Longitude   : num  -0.191 -0.212 -0.206 -0.174
-0.157 ...
   $ Latitude: num  51.5 51.5 51.5 51.5 51.5 ...
   $ Police_Force: int  1 1 1 1 1 1 1 1 1 1 ...
   $ Number_of_Vehicles  : int  1 1 2 1 1 2 2 1 2 2 ...
   $ Number_of_Casualties: int  1 1 1 1 1 1 1 2 2 5 ...
   $ Day_of_Week : int  3 4 5 6 2 3 5 6 7 7 ...
   $ Local_Authority__District_  : int  12 12 12 12 12 12 12 12 12 12
...
   $ X_1_st_Road_Class   : int  3 4 5 3 6 6 5 3 3 4 ...
   $ X_1_st_Road_Number  : int  3218 450 0 3220 0 0 0 315

3212

450 ...
   $ Road_Type   : int  6 3 6 6 6 6 6 3 6 6 ...
   $ Speed_limit : int  30 30 30 30 30 30 30 30 30 30
...
   $ Junction_Detail : int  0 6 0 0 0 0 3 0 6 3 ...
   $ Junction_Control: int  -1 2 -1 -1 -1 -1 4 -1 2 4 ...
   $ X_2_nd_Road_Class   : int  -1 5 -1 -1 -1 -1 6 -1 4 5 ...
   $ X_2_nd_Road_Number  : int  0 0 0 0 0 0 0 0 304 0 ...
   $ Pedestrian_Crossing_Human_Contro: int  0 0 0 0 0 0 0 0 0 0 ...
   $ Pedestrian_Crossing_Physical_Fac: int  1 5 0 0 0 0 0 0 5 8 ...
   $ Light_Conditions: int  1 4 4 1 7 1 4 1 4 1 ...
   $ Weather_Conditions  : int  2 1 1 1 1 2 1 1 1 1 ...
   $ Road_Surface_Conditions : int  2 1 1 1 2 2 1 1 1 1 ...
   $ Special_Conditions_at_Site  : int  0 0 0 0 0 6 0 0 0 0 ...
   $ Carriageway_Hazards : int  0 0 0 0 0 0 0 0 0 0 ...
   $ Urban_or_Rural_Area : int  1 1 1 1 1 1 1 1 1 1 ...
   $ Did_Police_Officer_Attend_Scene_: int  1 1 1 1 1 1 1 1 1 1 ...
   $ year: int  2005 2005 2005 2005 2005 2005
2005 2005 2005 2005 ...
   $ m   : int  1 1 1 1 1 1 1 1 1 1 ...
   $ Qrtr:

Re: [R] Extracting Width and Length of Each Shape in EPS File

2014-04-16 Thread Jeff Newmiller
Have you read the vignettes that accompany that package?

You should also read the Posting Guide for this mailing list, as HTML email is 
not in general a good idea on this list.
---
Jeff NewmillerThe .   .  Go Live...
DCN:Basics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

On April 16, 2014 6:35:55 AM PDT, Muhammad Abdur Rehman Khawaja 
 wrote:
>Respected Fellows,
>I need a little guidance on how I can extract/calculate/measure the
>width and length of every shape in an .eps file. The EPS file is
>generated from Adobe Illustrator.
>I have used the grImport library in R to import the eps file into the R
>environment, but I could not understand what format the data is in and
>how I can manipulate it.
>I shall be thankful for your cooperation.
>--
>Kind Regards
>Khawaja Muhammad Abdur Rehman
>Mechatronics Engineer NUST
>Professional Profile:
>http://pk.linkedin.com/in/khawajamechatronicscaps/
>
>



[R] [R-pkgs] Updates to Hmisc, rms, and greport packages

2014-04-16 Thread Frank Harrell

The Hmisc package has had a number of updates/fixes:

Changes in version 3.14-4 (2014-04-13)
   * rcspline.eval: stop new logic for ignoring outer values when there 
are many ties when there are also many ties on interior values.  Added 
new logic to use interior unique values of x when the number of unique x 
is small.

   * latexBuild: generalized with insert argument
   * latex.default: modified to use mods of latexBuild, fixed bug with 
caption.loc='bottom' (thanks: YacineH)
   * latex.default: fixed bug where comma did not appear after 
caption={} for ctable (thanks: Johannes Hofrichter)
   * tests: fit.mult.impute.bootstrap.r: added new example (thanks: 
Jane Cook)
   * fit.mult.impute: added call for fit.mult.impute in returned 
object, replacing call from fitter; makes update work for fit.mult.impute
   * summary.formula: fixed recognition of multiple left-hand-side 
variables to trigger call to summaryM (thanks: Kevin Thorpe)
   * summaryM: changed group label to '' instead of 0 for formulas like 
age + sex ~ 1

   * Ecdf: added what argument to all functions
   * nobsY: return constructed id vector
   * addMarginal: instead of .marginal. being logical, make it contain 
names of variables being marginalized over

   * mChoice.c: fixed some inconsistencies

The rms package has also had some changes, and the survfit.formula 
function has been REMOVED, replaced by the npsurv function:


Changes in version 4.2-0 (2014-04-13)
   * Deprecated survfit.formula so would not overlap with function in 
survival

   * Added function npsurv, survplot.npsurv
   * REMOVED survfit.formula
   * Used new type argument to label.Surv for fitting functions
   * cph: added weights argument to residuals.coxph (Thanks: Thomas Lumley)
   * survfit.cph: fixed bug in using wrong type variable.  Thanks: 
Zhiyuan Sun

   * cph: added weighted=TRUE in call to residuals.coxph (Thanks: T Lumley)
   * orm.fit: improved ormfit to not try to deal with NaN in V, 
assuming that step-halving will happen


Some improvements have been made in the greport package:

Changes in version 0.5-1 (2014-04-15)
   * survReport: changed to use npsurv instead of survfit.formula
   * exReport: changed order of output so that analysis of randomized 
patients marked for exclusions appears last; use LaTeX chngpage package 
to allow detailed table to go into left margin so as to be centered on page
   * exReport: added adjustwidth argument
   * accrualReport: allowed enrollment target N to be omitted
   * exReport: fine tuning
   * nriskReport: new report to show number of subjects still being 
followed at each day
   * Merge: added support for data.table
   * nriskReport: added id() variable
   * exReport: fixed bug when there is an exclusion with 0 frequency
   * accrualReport: improved graphics formatting, added minrand argument
   * accrualReport: added enrollmax argument, clarified notation
   * exReport: added ignoreExcl, ignoreRand arguments
   * all: added greportOption texwhere; default is 'gentex'; can 
specify texwhere='' to write non-appendix LaTeX code to console as for knitr
   * dReport: for byx for discrete Y, sense when Y is binary and use 
Wilson interval instead of bootstrap; adjust SE using confidence 
interval if proportion is 0 or 1
   * dReport: changed discreteness non-binary classification to use 
maximum number of unique values or Y instead of minimum

   * add globalVariables call to nriskReport


--
Frank E Harrell Jr Professor and Chairman  School of Medicine
   Department of Biostatistics Vanderbilt University

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages



Re: [R] Extracting Width and Length of Each Shape in EPS File

2014-04-16 Thread Paul Murrell

Hi

Here is a demonstration that might give you some ideas ...

library(grImport)
PostScriptTrace("flower.ps", "flower.xml")
flower <- readPicture("flower.xml")

grid.newpage()
grid.picture(flower)

# Extract each path, then look at the 'summary' for the path
for (i in 1:flower@summary@numPaths) {
  bb <- flower[i]@summary
  # Draw the result as a check
  grid.polygon(c(bb@xscale[1], bb@xscale[2],
 bb@xscale[2], bb@xscale[1]),
   c(bb@yscale[1], bb@yscale[1],
 bb@yscale[2], bb@yscale[2]),
   default.units="native",
   gp=gpar(col=NA, fill=adjustcolor(i, alpha=.5)),
   vp="picture.shape::picture.scale")
}
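Building on the bounding boxes above, the widths and heights the original poster asked for can be read directly off the `xscale`/`yscale` slots (an editorial addition to Paul's example):

```r
## Hedged sketch: width and height of each path's bounding box
for (i in 1:flower@summary@numPaths) {
  bb <- flower[i]@summary
  cat("path", i,
      "width =",  diff(bb@xscale),
      "height =", diff(bb@yscale), "\n")
}
```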

The flower.ps file in that example is available here ...

https://www.stat.auckland.ac.nz/~paul/R/grImport/importFiles.tar.gz

Hope that helps.

Paul

On 04/17/14 10:56, Jeff Newmiller wrote:

Have you read the vignettes that accompany that package?

You should also read the Posting Guide for this mailing list, as HTML email is 
not in general a good idea on this list.
---
Jeff Newmiller, Research Engineer (Solar/Batteries/Software/Embedded Controllers)
Sent from my phone. Please excuse my brevity.
---

On April 16, 2014 6:35:55 AM PDT, Muhammad Abdur Rehman Khawaja wrote:

Respected Fellows,
I need a little guidance on how I can extract/calculate/measure the width
and length of each shape in an .eps file. The EPS file is generated from
Adobe Illustrator.
I have used the grImport library in R to import the EPS file into the R
environment, but I could not understand what format the data is in or how
to manipulate it.
I shall be thankful for your cooperation.
--
Kind Regards
Khawaja Muhammad Abdur Rehman
Mechatronics Engineer NUST
Professional Profile:
http://pk.linkedin.com/in/khawajamechatronicscaps/






--
Dr Paul Murrell
Department of Statistics
The University of Auckland
Private Bag 92019
Auckland
New Zealand
64 9 3737599 x85392
p...@stat.auckland.ac.nz
http://www.stat.auckland.ac.nz/~paul/



[R] xyplot y scale label help

2014-04-16 Thread Ahmed Attia
Hi R users,

I would like to specify the y-axis tick marks in xyplot. I tried yaxp, but it does not work.

Here is my code:

trellis.device(col = FALSE)
xyplot(GY ~ N | S, data = Ahmed,
       layout = c(5, 1, 1),
       xlab = expression(paste("N rate (kg ", ha^-1, ")")),
       ylab = expression(paste("Biomass yield (Mg ", ha^-1, ")")),
       ylim = c(12, 20),  # xlim = c(-50, 700),
       strip = function(...) strip.default(..., style = 1), col = 0,
       par.strip.text = list(cex = 0.7),
       scales = list(alternating = FALSE, cex = 0.7, yaxp = c(12, 20, 8)),
       aspect = 2,
       panel = function(x, y, subscripts, groups, ...) {
         # panel.grid(-1, -1)
         m1 <- lm(y ~ x)
         x.seq <- data.frame(x = seq(min(x) * 0.95, max(x) * 1.05, max(x) / 10))
         pred <- predict(m1, x.seq, se.fit = TRUE)
         grid.polygon(c(x.seq$x, rev(x.seq$x)),
                      c(pred$fit + 2 * pred$se.fit, rev(pred$fit - 2 * pred$se.fit)),
                      gp = gpar(col = 0, fill = "light grey", alpha = 1),
                      default.units = "native")
         llines(x.seq$x, pred$fit, lty = 1)
         panel.xyplot(x, y)
         print(round(summary(m1)$coef[2, c(1, 4)], 3))
         # ltext(0, 0.5, paste("Slope = ", round(summary(m1)$coef[2], 3), sep = ""),
         #   cex = 0.65, adj = 0)
         # ltext(0, 0.4, paste("P(slope) = ", round(summary(m1)$coef[2, 4], 3), sep = ""),
         #   cex = 0.65, adj = 0)
         print(summary(m1))
       }
)
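As an aside, `yaxp` is a base-graphics `par()` setting that lattice does not consult; in `xyplot` the usual way to request specific y-axis tick positions is `at` inside `scales`. A sketch of the replacement `scales` argument (an assumption, untested against the poster's data):

```r
scales = list(alternating = FALSE, cex = 0.7,
              y = list(at = seq(12, 20, by = 1)))
```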

Any ideas?

Thank you


Ahmed M. Attia


Research Assistant
Dept. of Soil&Crop Sciences
Texas A&M University
ahmed.at...@ag.tamu.edu
Cell phone: 001-979-248-5215
FAX: 001-308-455-4024



Re: [R] help incorporating data subset lengths in function with ddply

2014-04-16 Thread Jeff Newmiller
Note that ddply is a heavyweight solution, and as your data gets larger 
you may find that using it for little things like this hits performance.


Also, "df" is a base function that you might actually want to use someday,
and you also introduce confusion in the mind of someone reading your code
if you redefine it this way.

existingdf <- read.csv( text=
"storm,Q_time,Q
s1,2008-08-07 21:15:00,0.000
s1,2008-08-07 21:16:00,3.020
s1,2008-08-07 21:17:00,6.041
s1,2008-08-07 21:18:00,9.061
s1,2008-08-07 21:19:00,12.082
s1,2008-08-07 21:20:00,15.102
s1,2008-08-07 21:21:00,18.123
s1,2008-08-07 21:22:00,11.143
s1,2008-08-07 21:23:00,0.000
s2,2010-10-05 21:00:00,0.000
s2,2010-10-05 21:01:00,1.812
s2,2010-10-05 21:02:00,3.625
s2,2010-10-05 21:03:00,5.437
s2,2010-10-05 21:04:00,7.249
s2,2010-10-05 21:05:00,9.061
s2,2010-10-05 21:06:00,0.874
s2,2010-10-05 21:07:00,0.000
", as.is=TRUE )

library(plyr)
# plyr solution
newdf <- ddply( existingdf
              , "storm"
              , function( DF ) {
                  transform( DF
                           , duration = seq.int( length.out = nrow( DF ) ) )
                }
              )

# base R solution
newdf2 <- transform( existingdf
                   , duration = ave( rep( 1, nrow(existingdf) )
                                   , storm
                                   , FUN = cumsum ) )
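Another base-R idiom for the same duration column, shown on a toy vector since it only needs the storm ids: `sequence()` over the run lengths yields the within-group counters.

```r
# Run lengths of the storm id give each storm's 1..n duration counter
storm <- c(rep("s1", 9), rep("s2", 8))
duration <- sequence(rle(storm)$lengths)
duration
# [1] 1 2 3 4 5 6 7 8 9 1 2 3 4 5 6 7 8
```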


On Wed, 16 Apr 2014, Steve E. wrote:


Dear R Community,

I am having some trouble with a task that I hope you might be able to help
with. I have a dataset that includes the time and corresponding stream
discharge from numerous storms (example of structure with simplified data
below). I would like to produce a field that details the duration of each
storm, where each storm is a subset of the data and the duration runs from
zero to end for each unique storm. I have been trying to accomplish this
with ddply but to no avail as I am unable to provide ddply (e.g., below)
with the length of the storm (i.e., subset of data). Thank you in advance,
any help would be appreciated.


existing df:
storm,Q_time,Q
s1,2008-08-07 21:15:00,0.000
s1,2008-08-07 21:16:00,3.020
s1,2008-08-07 21:17:00,6.041
s1,2008-08-07 21:18:00,9.061
s1,2008-08-07 21:19:00,12.082
s1,2008-08-07 21:20:00,15.102
s1,2008-08-07 21:21:00,18.123
s1,2008-08-07 21:22:00,11.143
s1,2008-08-07 21:23:00,0.000
s2,2010-10-05 21:00:00,0.000
s2,2010-10-05 21:01:00,1.812
s2,2010-10-05 21:02:00,3.625
s2,2010-10-05 21:03:00,5.437
s2,2010-10-05 21:04:00,7.249
s2,2010-10-05 21:05:00,9.061
s2,2010-10-05 21:06:00,0.874
s2,2010-10-05 21:07:00,0.000

desired df:
storm,Q_time,Q, duration
s1,2008-08-07 21:15:00,0.000,1
s1,2008-08-07 21:16:00,3.020,2
s1,2008-08-07 21:17:00,6.041,3
s1,2008-08-07 21:18:00,9.061,4
s1,2008-08-07 21:19:00,12.082,5
s1,2008-08-07 21:20:00,15.102,6
s1,2008-08-07 21:21:00,18.123,7
s1,2008-08-07 21:22:00,11.143,8
s1,2008-08-07 21:23:00,0.000,9
s2,2010-10-05 21:00:00,0.000,1
s2,2010-10-05 21:01:00,1.812,2
s2,2010-10-05 21:02:00,3.625,3
s2,2010-10-05 21:03:00,5.437,4
s2,2010-10-05 21:04:00,7.249,5
s2,2010-10-05 21:05:00,9.061,6
s2,2010-10-05 21:06:00,0.874,7
s2,2010-10-05 21:07:00,0.000,8

I have been trying variations of the following statement, but I cannot seem
to get the length of the subset correct as I receive an error of the type
'Error: arguments imply differing number of rows: 2401, 0'.

newdf <- ddply(df, "storm", transform, FUN = function(x)
{duration=seq(from=1, by=1, length.out=nrow(x))})

I would really like to get a handle on ddply in this instance as it will be
quite helpful for many other similar calculations that I need to do with
this dataset.

Thanks again,
Stevan




--
View this message in context: 
http://r.789695.n4.nabble.com/help-incorporating-data-subset-lengths-in-function-with-ddply-tp4688926.html
Sent from the R help mailing list archive at Nabble.com.







Re: [R] help incorporating data subset lengths in function with ddply

2014-04-16 Thread Jeff Newmiller
Note that the solution you asked for is not robust in the presence of missing 
data, though Frede's suggestion or something like it would be.

On April 16, 2014 1:02:16 PM PDT, "Steve E."  wrote:
>Hi Frede - Thank you for responding. Not quite what I am after. Notice
>that I
>included two data sets in my post, the first is the raw data whereas
>the
>second (the desired df) is similar but has a column of sequential
>numbers in
>another column at the end - that column of sequential numbers for each
>storm
>(i.e., subset of data) is what I am after. Thanks again, Stevan
>
>
>



[R] ggplot2: using coord_trans for logit -> probability

2014-04-16 Thread Michael Friendly
I'm trying to see if and how I can use coord_trans() with ggplot2 to transform the Y axis of a plot on the logit scale to the probability scale, as opposed to recalculating everything "manually" and constructing a new plot.

Here is a simple example of the 'base' plot I'd like to transform:

data(Arthritis, package="vcdExtra")
Arthritis$Better <- as.numeric(Arthritis$Improved > "None")
arth.logistic <- glm(Better ~ Age, data=Arthritis, family=binomial)

# get fitted values on the logit scale
pred <- data.frame(Arthritis,
   predict(arth.logistic, se.fit=TRUE))
library(ggplot2)
library(scales)
# plot on logit scale
gg <- ggplot(pred, aes(x=Age, y=fit)) +
  geom_line(size = 2) + theme_bw() +
  geom_ribbon(aes(ymin = fit - 1.96 * se.fit,
                  ymax = fit + 1.96 * se.fit), alpha = 0.2,
              color = "transparent") +

  labs(x = "Age", y = "Log odds (Better)")
gg

Things I've tried that don't work:

> gg + coord_trans(ytrans="logis")
Error in get(as.character(FUN), mode = "function", envir = envir) :
  object 'logis_trans' of mode 'function' was not found
>
> gg + coord_trans(ytrans=probability_trans("logis"))
Error in if (zero_range(range)) { : missing value where TRUE/FALSE needed
In addition: Warning message:
In qfun(x, ...) : NaNs produced
>

Doing what I want "manually":

# doing it manually
pred2 <- within(pred, {
 prob  <- plogis(fit)
 lower <- plogis(fit - 1.96 * se.fit)
 upper <- plogis(fit + 1.96 * se.fit)
 })


gg2 <- ggplot(pred2, aes(x=Age, y=prob)) +
  geom_line(size = 2) + theme_bw() +
  geom_ribbon(aes(ymin = lower,
  ymax = upper), alpha = 0.2,  color = "transparent") +
  labs(x = "Age", y = "Probability (Better)")
gg2
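One more route worth sketching, as an assumption (names follow the scales API of that era and are untested here): build the inverse-logit transformation explicitly with `trans_new()`, so logit-scale fits are displayed at probability positions. The earlier error arose because `probability_trans("logis")` applies `qlogis` to values outside (0, 1).

```r
library(scales)
# Map logit-scale values to probability positions for display
inv_logit_trans <- trans_new("inv-logit",
                             transform = plogis,  # logit -> probability
                             inverse   = qlogis)  # probability -> logit
gg + coord_trans(ytrans = inv_logit_trans)
```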



--
Michael Friendly Email: friendly AT yorku DOT ca
Professor, Psychology Dept. & Chair, Quantitative Methods
York University  Voice: 416 736-2100 x66249 Fax: 416 736-5814
4700 Keele StreetWeb:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA



Re: [R] ggplot2: using coord_trans for logit -> probability

2014-04-16 Thread Tim Marcella
I think all you have to do is add type="response" to your call for the
predictions.

Does this work for you?

# get fitted values on the logit scale
pred <- data.frame(Arthritis,
   predict(arth.logistic, se.fit=TRUE,type="response"))

library(ggplot2)
library(scales)
# plot on logit scale
gg <- ggplot(pred, aes(x=Age, y=fit)) +
  geom_line(size = 2) + theme_bw() +
  geom_ribbon(aes(ymin = fit - 1.96 * se.fit,
  ymax = fit + 1.96 * se.fit,), alpha = 0.2,  color =
"transparent") +
  labs(x = "Age", y = "Log odds (Better)")
gg

-Tim





-- 
Tim Marcella
508.498.2989
timmarce...@gmail.com



Re: [R] problem on package "survey" , function svyglm,

2014-04-16 Thread Hanze Zhang
Thank you, guys. I merged strata and fitted my model.


On Wed, Apr 16, 2014 at 5:33 PM, Thomas Lumley  wrote:

>
> On Tue, Apr 15, 2014 at 7:07 AM, Milan Bouchet-Valat wrote:
>
>> On Monday, April 14, 2014 at 13:59 -0400, Hanze Zhang wrote:
>> > Hi,
>> >
>> > I want to do logistic regression based on a complex sample design. I
>> used
>> > package survey, but when I ran svyglm, an error message came out:
>> > Error in onestrat(x[index, , drop = FALSE], clusters[index],
>> > nPSU[index][1],  :
>> >   Stratum (16) has only one PSU at stage 1
>> >
>> 
>>
>> > How to solve this issue? Thank you.
>> You need to merge manually the stratum with only one PSU with another
>> stratum. See 3.2.1 in http://books.google.fr/books?id=L96ludyhFBsC
>> (look for "single" in the whole book to find it).
>>
>>
> Or set options(survey.lonely.psu) to one of the other values. But merging
> strata is probably better.
>
>-thomas
>
> --
> Thomas Lumley
> Professor of Biostatistics
> University of Auckland
>
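For reference, Thomas's alternative can be sketched as follows; the option values are from the survey package documentation ('adjust' centres a lonely PSU at the grand mean rather than failing):

```r
library(survey)
# Treat single-PSU strata as non-certainty, centred at the grand mean
options(survey.lonely.psu = "adjust")  # or "average", "certainty", "remove"
# ...then rebuild the svydesign object and refit svyglm() as before
```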



Re: [R] ggplot2: using coord_trans for logit -> probability

2014-04-16 Thread Tim Marcella
Sorry I jumped the gun. That does not provide you with the same plot as gg2
that you are aiming for.

-T





-- 
Tim Marcella
508.498.2989
timmarce...@gmail.com



[R] Labeling/identifying observations in plot after MclustDR from Mclust

2014-04-16 Thread Mark Manger
Hi,

I’m trying to figure out how to label points in a contour plot produced from 
the output of MclustDR, the dimension reduction function in the Mclust package.

The original data frame has row names

       RRcoarse govtDeficit   res_gdp GrossFixedCapForm_UN fuelExports
AUS_76    2.000  -4.5125520 1.5197260              26.5169  12.7851900
AUS_81    3.500  -3.6041720 3.7029600              27.1058  20.8597300
AUS_91    4.000  -1.2131010 3.8150380              24.1444  20.5832700
AUS_01    4.000   0.0279132 4.4617860              26.9491  25.5349700
AUT_76    2.000  -2.7850080 7.0378630              25.7365   1.6797000
AUT_81    1.000  -3.4403770 6.4757080              23.2928   1.4688700

I then use these variables to fit the mixture model:

imbalances.selections.def = subset(imbalances.all, select = c(5:8, 10:12), cu_gdp < 0)
clusters.defs = Mclust(imbalances.selections.def, G = 1:10)

I use MclustDR to reduce the dimensions to plot the results, which works beautifully and produces a nice contour plot.

dr.defs = MclustDR(clusters.defs)
plot(dr.defs, what = "contour")

But the plot is not particularly informative unless I know which observation is 
which.

How can I label the points displayed in the plot to identify the observations? 
I can probably extract this from the dr.defs object whose str() is below, but I 
don’t know how to do that, or somehow from the original data frame row names. 
Help would be greatly appreciated. I know this is trivial in a regular 
plot(x,y) situation.

str(dr.defs)

List of 22
 $ call : language MclustDR(object = clusters.defs)
 $ type : chr "Mclust"
 $ x: num [1:49, 1:7] 2 3.5 4 4 2 1 1 2 2 2 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:49] "AUS_76" "AUS_81" "AUS_91" "AUS_01" ...
  .. ..$ : chr [1:7] "RRcoarse" "govtDeficit" "res_gdp" "GrossFixedCapForm_UN" 
...
 $ Sigma: num [1:7, 1:7] 0.875 0.116 -0.781 -0.525 2.339 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:7] "RRcoarse" "govtDeficit" "res_gdp" "GrossFixedCapForm_UN" …
etc etc etc
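A labelling sketch, under the assumption that `dr.defs$dir` holds the observations projected onto the estimated directions (its rows should line up with the row names of `dr.defs$x`); component names may differ across mclust versions:

```r
plot(dr.defs, what = "contour")
# Overlay each observation's row name at its projected coordinates
text(dr.defs$dir[, 1], dr.defs$dir[, 2],
     labels = rownames(dr.defs$x), cex = 0.6, pos = 3)
```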

Many thanks in advance,

Mark S. Manger 
Assistant Professor

Munk School of Global Affairs and Department of Political Science

University of Toronto
Observatory Site | 315 Bloor Street West | Room 212
Toronto, ON   M5S 0A7

mark.man...@utoronto.ca



Re: [R] : Quantile and rowMean from multiple files in a folder

2014-04-16 Thread arun
Hi,
Use this code after `lst2`.

lapply(seq_along(lst2), function(i) {
    lstN <- lapply(lst2[[i]], function(x) {
        datN <- as.data.frame(matrix(NA, nrow = 101, ncol = length(names1),
                                     dimnames = list(NULL, names1)))
        x1 <- x[, -1]
        qt <- numcolwise(function(y) quantile(y, seq(0, 1, by = 0.01),
                                              na.rm = TRUE))(x1)
        datN[, match(names(x1), names(datN))] <- qt
        datN
    })
    arr1 <- array(unlist(lstN), dim = c(dim(lstN[[1]]), length(lstN)),
                  dimnames = list(NULL, names1))
    res <- rowMeans(arr1, dims = 2, na.rm = TRUE)
    colnames(res) <- gsub(" ", "_", colnames(res))
    res1 <- data.frame(Percentiles = paste0(seq(0, 100, by = 1), "%"), res,
                       stringsAsFactors = FALSE)
    write.csv(res1, paste0(paste(getwd(), "final",
                                 paste(names(lst1)[[i]], "Quantile", sep = "_"),
                                 sep = "/"), ".csv"),
              row.names = FALSE, quote = FALSE)
})

ReadOut1 <- lapply(list.files(recursive = TRUE)[grep("Quantile",
                   list.files(recursive = TRUE))],
                   function(x) read.csv(x, header = TRUE, stringsAsFactors = FALSE))
sapply(ReadOut1, function(x) dim(x))

lstNew <- simplify2array(ReadOut1)
nrow(lstNew)
#[1] 258
dir.create("Indices")
lapply(2:nrow(lstNew), function(i) {
    dat1 <- data.frame(Percentiles = lstNew[1], do.call(cbind, lstNew[i, ]),
                       stringsAsFactors = FALSE)
    colnames(dat1) <- c("Percentiles",
                        paste(names(lst2), rep(rownames(lstNew)[i], length(lst2)),
                              sep = "_"))
    write.csv(dat1, paste0(paste(getwd(), "Indices",
                                 gsub(" ", "_", rownames(lstNew)[i]), sep = "/"),
                           ".csv"),
              row.names = FALSE, quote = FALSE)
})

## Output2:
ReadOut2 <- lapply(list.files(recursive = TRUE)[grep("Indices",
                   list.files(recursive = TRUE))],
                   function(x) read.csv(x, header = TRUE, stringsAsFactors = FALSE))
names(ReadOut2) <- gsub(".*\\/(.*)\\.csv", "\\1",
                        list.files(recursive = TRUE)[grep("Indices",
                            list.files(recursive = TRUE))])
ReadOut2$pint_DJF[1:3, 1:3]
#  Percentiles G100_pint_DJF G101_pint_DJF
#1          0%      0.982001      1.020892
#2          1%      1.005563      1.039288
#3          2%      1.029126      1.057685
any(is.na(ReadOut2$pint_DJF))
# [1] FALSE
A.K.
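The averaging step in the code above reduces to `rowMeans()` on a 3-D array; in miniature:

```r
# Stack matched tables along a 3rd margin, then average element-wise
m1 <- matrix(1:4, nrow = 2)  # columns 1:2, 3:4
m2 <- matrix(5:8, nrow = 2)  # columns 5:6, 7:8
arr <- array(c(m1, m2), dim = c(2, 2, 2))
rowMeans(arr, dims = 2)  # element-wise mean of m1 and m2
#      [,1] [,2]
# [1,]    3    5
# [2,]    4    6
```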







On Wednesday, April 16, 2014 12:34 PM, Zilefac Elvis wrote:
Hi AK,
I tried the updated "Quantilecode.txt". It works well, but when I open the files in "Indices" I find some columns filled with NAs. This should not be the case, given that I am working with simulations and there are no missing values in the process. The ##not correct section yielded no NAs. Check, for example, pint_..._DJF in "Indices".

Let us be sure we are on the same page. I removed the ##not correct section of the code and ran the code from beginning to end, Q1 and then Q2. My results are found in the "Indices" folder.
Thanks,
Atem.



Re: [R] The explanation of ns() with df =2

2014-04-16 Thread Xing Zhao
Dear John,

Thanks for your patience, got your idea

Best,
Xing

On Wed, Apr 16, 2014 at 4:11 AM, John Fox  wrote:
> Dear Xing,
>
> My point wasn't that you should add 1 df for a constant to each cubic but 
> that you need not subtract 1 for continuity. I'm sorry that I didn't make 
> that clear. When you use the natural spline in a regression, there's a 
> constant in the model as a separate term with 1 df, which in effect adjusts 
> the level of the regression curve (assuming one X for simplicity). If you add 
> a constant to each component cubic, the constant in the model is rendered 
> redundant, accounting for the extra df.
>
> I'm afraid that if this is still not sufficiently clear, I'll have to defer 
> to someone with greater powers of explanation.
>
> Best,
> John
>
> -
> John Fox
> McMaster University
> Hamilton, Ontario, Canada
> http://socserv.socsci.mcmaster.ca/jfox
>
>> On Apr 16, 2014, at 12:10 AM, Xing Zhao  wrote:
>>
>> Dear John
>>
>> Sorry I use 3 degree of freedom for  cubic spline. After using 4, it
>> is still not 2. I may make some naive mistake, but I cannot figure
>> out. Where is the problem?
>>
>> 4 (cubic on the right side of the *interior* knot 8)
>> + 4 (cubic on the left side of the *interior* knot 8)
>> - 1 (two curves must be continuous at the *interior* knot 8)
>> - 1 (two curves must have 1st order derivative continuous at the
>> *interior* knot 8)
>> - 1 (two curves must have 2nd order derivative continuous at the
>> *interior* knot 8)
>> - 1 (right side cubic curve must have 2nd order derivative = 0 at the
>> boundary knot 15 due to the linearity constraint)
>> - 1 (similar for the left)
>> = 3, not 2
>>
>> Thanks
>> Xing
>>
>>> On Tue, Apr 15, 2014 at 10:54 AM, John Fox  wrote:
>>> Dear Xing,
>>>
 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Xing Zhao
 Sent: Tuesday, April 15, 2014 1:18 PM
 To: John Fox
 Cc: r-help@r-project.org; Michael Friendly
 Subject: Re: [R] The explanation of ns() with df =2

 Dear Michael and Fox

 Thanks for your elaboration. Combining your explanations would, to my understanding, lead to the following calculation of degrees of freedom:

 3 (cubic on the right side of the *interior* knot 8)
 + 3 (cubic on the left side of the *interior* knot 8)
 - 1 (two curves must be continuous at the *interior* knot 8)
>>>
>>> You shouldn't subtract 1 for continuity since you haven't allowed a
>>> different level on each side of the knot (that is your initial counting of 3
>>> parameters for the cubic doesn't include a constant).
>>>
>>> Best,
>>> John
>>>
 - 1 (two curves must have 1st order derivative continuous at the
 *interior* knot 8)
 - 1 (two curves must have 2nd order derivative continuous at the
 *interior* knot 8)
 - 1 (right side cubic curve must have 2nd order derivative = 0 at the
 boundary knot 15 due to the linearity constraint)
 - 1 (similar for the left)
 = 1, not 2

 Where is the problem?

 Best,
 Xing

> On Tue, Apr 15, 2014 at 6:17 AM, John Fox  wrote:
> Dear Xing Zhao,
>
> To elaborate slightly on Michael's comments, a natural cubic spline
 with 2 df has one *interior* knot and two boundary knots (as is
 apparent in the output you provided). The linearity constraint applies
 beyond the boundary knots.
>
> I hope this helps,
> John
>
> 
> John Fox, Professor
> McMaster University
> Hamilton, Ontario, Canada
> http://socserv.mcmaster.ca/jfox/
>
> On Tue, 15 Apr 2014 08:18:40 -0400
> Michael Friendly  wrote:
>> No, the curves on each side of the know are cubics, joined
>> so they are continuous.  Se the discussion in \S 17.2 in
>> Fox's Applied Regression Analysis.
>>
>>> On 4/15/2014 4:14 AM, Xing Zhao wrote:
>>> Dear all
>>>
>>> I understand the definition of Natural Cubic Splines are those
 with
>>> linear constraints on the end points. However, it is hard to think
>>> about how this can be implement when df=2. df=2 implies there is
 just
>>> one knot, which, according the the definition, the curves on its
 left
>>> and its right should be both be lines. This means the whole line
>>> should be a line. But when making some fits. the result still
 looks
>>> like 2nd order polynomial.
>>>
>>> How to think about this problem?
>>>
>>> Thanks
>>> Xing
>>>
>>> ns(1:15,df =2)
>>>   1   2
>>>  [1,] 0.000  0.
>>>  [2,] 0.1084782 -0.07183290
>>>  [3,] 0.2135085 -0.13845171
>>>  [4,] 0.3116429 -0.19464237
>>>  [5,] 0.3994334 -0.23519080
>>>  [6,] 0.4734322 -0.25488292
>>>  [7,] 0.5301914 -0.24850464
>>>  [8,

Re: [R] : Quantile and rowMean from multiple files in a folder

2014-04-16 Thread Zilefac Elvis
Hi AK,
Thanks very much.
Atem.


