[R] Plotting survival curves from a Cox model with time dependent covariates
Dear all,

Let's assume I have a clinical trial with two treatments and a time to event outcome. I am trying to fit a Cox model with a time dependent treatment effect and then plot the predicted survival curve for one treatment (or both).

library(survival)
test <- list(time = runif(100, 0, 10),
             event = sample(0:1, 100, replace = T),
             trmt = sample(0:1, 100, replace = T))
model1 <- coxph(Surv(time, event) ~ tt(trmt), data = test,
                tt = function(x, t, ...) pspline(x + t))
newdat1 <- data.frame(trmt = 1, time = list(0, 1, 2, 3, 4, 5))
plot(survfit(model1, newdata = newdat1, individual = T),
     xlab = "Years", ylab = "Survival")

Where I think I am failing is with how to correctly specify what I want the survfit function to do. My understanding on reading the documentation for the survival package is that I should use newdata to not only specify the treatment, but also the timepoints for which I want survival estimates, and that this is the scenario for which the individual=T option can be appropriate. However, I just seem to fail to figure out exactly how I should specify this.

It would be greatly appreciated if someone who has done this before or knows how to do it could give me a quick (or extensive, of course) hint.

Many thanks,
Björn

PS: Yes, I realise that a Kaplan-Meier plot would do something like the above very nicely, but once I get this to work, I am actually looking at something a bit more complicated where a KM plot would not help me.

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Add a density line to a cumulative histogram - second try
Thanks, I found the function ecdf() which does the job. plot( ecdf( nvtpoints), col="BLUE", lwd=1, add=TRUE ) -- View this message in context: http://r.789695.n4.nabble.com/Add-a-density-line-to-a-cumulative-histogram-second-try-tp3666969p3669310.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
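For readers landing on this thread, a minimal sketch of the same idea, using simulated data in place of nvtpoints (which is not shown in the thread):

set.seed(1)
nvtpoints <- rnorm(200)                        # stand-in for the real data

h <- hist(nvtpoints, plot = FALSE)
h$counts <- cumsum(h$counts) / sum(h$counts)   # counts -> cumulative proportions
plot(h, freq = TRUE, main = "Cumulative histogram with ECDF",
     ylab = "Cumulative proportion")
plot(ecdf(nvtpoints), col = "blue", lwd = 1, add = TRUE)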
Re: [R] Stacked bar plot of frequency vs time
Thank you for the solutions! I have the first one working and it does exactly what I am looking for. Unfortunately I have to put the plot in a common figure alongside other plots made in the basic environment (challenging!). With the second method, I was unable to make the stacked bars locate to the appropriate positions along the X axis (i.e. the appropriate time), which, though unconventional, is required for my figure. So I am still looking for a complete solution in the basic plotting environment. I have boiled my problem down to this minimal example:

# Made-up data
tC <- textConnection("
Time Type1 Type2 Type3
1.3 .50 .25 .25
4.5 .55 .25 .20
5.2 .65 .20 .15
")
data1 <- read.table(header=TRUE, tC)
data2 <- data.frame(Time=rep(data1$Time, 3), stack(data1[,2:4]))
close.connection(tC)

# PLOT1 Scatterplot
attach(data1)
par(mar=c(1,1,1,1))
plot(Time, Type1, frame=T, ylab="Divergence",
     col=rgb(0,100,0,50,maxColorValue=255), main="plot 1",
     xlim=c(0,6), ylim=c(0, 1), axes=FALSE, xlab=" ")
detach(data1)

# PLOT2 barplot
require(lattice)
attach(data2)
barchart(values ~ Time, group=ind, data=data2, stack=TRUE,
         horizontal=FALSE, main="not there yet")
plot2 <- xyplot(values ~ Time, group=ind, data=data2, stack=TRUE,
                horizontal=FALSE, panel=panel.barchart,
                ylim=c(-0.05,1.05), xlim=c(0,6),
                main="Plot 2 - how can I plot below plot1?")
print(plot2)
detach(data2)

The only thing left is to get both plots to be vertically aligned, one above the other on the same figure. Is this possible? Thanks for all of your thoughts.

Marcel

-- View this message in context: http://r.789695.n4.nabble.com/Stacked-bar-plot-of-frequency-vs-time-tp3659715p3669311.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
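For anyone picking this thread up, here is a rough base-graphics sketch of both open points: bars placed at the observed Time values (drawn with rect()) and the two panels stacked with par(mfrow). The bar half-width and colours are arbitrary choices; the data block is the made-up data from the post.

tC <- textConnection("
Time Type1 Type2 Type3
1.3 .50 .25 .25
4.5 .55 .25 .20
5.2 .65 .20 .15")
data1 <- read.table(header = TRUE, tC); close(tC)

op <- par(mfrow = c(2, 1), mar = c(4, 4, 2, 1))

# Panel 1: the scatterplot
plot(data1$Time, data1$Type1, xlim = c(0, 6), ylim = c(0, 1),
     xlab = "", ylab = "Divergence", main = "plot 1")

# Panel 2: stacked bars located at the observed Time values
plot(NA, xlim = c(0, 6), ylim = c(0, 1), xlab = "Time",
     ylab = "Proportion", main = "plot 2")
half <- 0.15                          # half the bar width (arbitrary)
cols <- c("darkgreen", "orange", "steelblue")
for (i in seq_len(nrow(data1))) {
  tops <- cumsum(unlist(data1[i, 2:4]))
  rect(data1$Time[i] - half, c(0, head(tops, -1)),
       data1$Time[i] + half, tops, col = cols)
}
par(op)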
Re: [R] combining elements in a data frame
Hi: Q1: Try something like # Formula interface works for R-2.11.0 and later versions RTavg <- aggregate(RT ~ word, data = alldat, FUN = mean) merge(CCCW, RTavg, by.x = row.names(CCCW), by.y = 'word', all = TRUE) If the merge doesn't work (which is entirely possible), you might want to define a variable word in CCCW first and then try again with something like merge(CCCW, RTavg, by = 'word', all = TRUE) IIRC, all = TRUE keeps words from both data frames filling missing values with NA, all.x keeps everything from the first argument of merge() and all.y keeps everything from the second argument. If you omit the option, it returns only the words that occur in both data frames. From your description, it appears you want all.x = TRUE where x = CCCW. See ?merge for specific details. Q2: See ?table and ?ftable Utterly untested code in the absence of a reproducible example, so caveat emptor. Dennis On Thu, Jul 14, 2011 at 9:17 PM, Lee Averell wrote: > Hi all, > I have 2 data frames the first contains a list with repeats of words > and an associated response time (RT) measure for each word. The second is a > tabulation of each unique word and other information such as the amount and > of responses for each word. I need to determine the mean RT for each word and > add that as a column in the second data frame. > Any help would be appreciated > Cheers > Lee > > Data frame 1 > >> head(alldat) > s expt session cycle trial left.right freq concr word rt resp > Response correct corrResp > 121 1a a 1 C1 1 1 lf hc pianist 1529 old > hi FALSE new > 122 1a a 1 C1 2 1 hf hc sweat 1518 new > hi TRUE new > 123 1a a 1 C1 3 1 lf lc carnage 1046 old > hi TRUE old > 124 1a a 1 C1 4 1 lf hc nymph 1142 old > hi TRUE old > 125 1a a 1 C1 5 1 hf lc hank 1487 new > hi TRUE new > 126 1a a 1 C1 6 1 lf hc waist 1199 new > hi TRUE new > respType > 121 s > 122 s > 123 s > 124 s > 125 s > 126 s >> > > Data frame 2 > >> head(CCCW) > FALSE TRUE CC propCC lo hi > abode 2 11 TRUE 0.8461538 4 9 > abyss 1 12 TRUE 0.9230769 2 11 > accord 2 11 TRUE 0.8461538 2 11 > account 0 0 FALSE NaN 0 0 > acre 4 9 TRUE 0.6923077 4 9 > adage 0 0 FALSE NaN 0 0 > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
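A small self-contained illustration of the aggregate() + merge() idea, with invented data that only mimics the structure of alldat and CCCW:

alldat <- data.frame(word = c("abode", "abyss", "abode", "acre"),
                     RT   = c(1100, 950, 1300, 1200))
CCCW <- data.frame(propCC = c(0.85, 0.92, 0.69),
                   row.names = c("abode", "abyss", "acre"))

RTavg <- aggregate(RT ~ word, data = alldat, FUN = mean)  # mean RT per word
CCCW$word <- rownames(CCCW)                               # promote row names to a column
merge(CCCW, RTavg, by = "word", all.x = TRUE)             # keep every row of CCCW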
Re: [R] LME and overall treatment effects
Dear Mark,

Interpreting one of the main effects when they are part of an interaction is, AFAIK, not possible. Your statement about comparing treatments when Year is continuous is not correct. The parameters of treatment assume that Year == 0! Which might lead to very strange effects when Year is not centered to a year close to the observed years. Have a look at the example below:

set.seed(123456)
dataset <- expand.grid(cYear = 0:20, Treatment = factor(c("A", "B", "C")), Obs = 1:3)
dataset$Year <- dataset$cYear + 2000
Trend <- c(A = 1, B = 0.1, C = -0.5)
TreatmentEffect <- c(A = 2, B = -1, C = 0.5)
sdNoise <- 1
dataset$Value <- with(dataset, TreatmentEffect[Treatment] + Trend[Treatment] * cYear) +
  rnorm(nrow(dataset), sd = sdNoise)
lm(Value ~ Year * Treatment, data = dataset)
lm(Value ~ cYear * Treatment, data = dataset)

If you want to focus on the treatment effect alone but take the year effect into account, then add Year as a random effect.

library(lme4)
lmer(Value ~ 0 + Treatment + (0 + Treatment|Year), data = dataset)

In your case you want to cross the random effect of year with those of plot. Crossed random effects are hard to do with the nlme package but easy with the lme4 package.

Model <- lmer(Species ~ 0 + Treatment + (0 + Treatment|Year) + (1|Plot/Quadrat),
              na.action = na.omit, data = UDD)

Best regards,

Thierry

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Gaverstraat 4, 9500 Geraardsbergen, Belgium
tel. + 32 54/436 185
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey

> -----Oorspronkelijk bericht-----
> Van: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> Namens Mark Bilton
> Verzonden: donderdag 14 juli 2011 22:05
> Aan: Bert Gunter
> CC: r-help@r-project.org
> Onderwerp: Re: [R] LME and overall treatment effects
>
> Ok... lets try again with some code...
>
> Hello fellow R users,
>
> I am having a problem finding the estimates for some overall treatment effects
> for my mixed models using 'lme' (package nlme). I hope someone can help.
>
> Firstly then, the model:
> The data: Plant biomass (log transformed)
> Fixed Factors: Treatment (x3: Dry, Wet, Control), Year (x8: 2002-2009)
> Random Factors: 5 plots per treatment, 10 quadrats per plot (N = 1200 (3*5*10*8)).
>
> I am modelling this in two ways, firstly with year as a continuous variable
> (interested in the difference in estimated slope over time in each treatment,
> 'year*treatment'), and secondly with year as a categorical variable
> (interested in differences between 'treatments').
>
> ie: (with Year as either numeric or factor)
>
> Model <- lme(Species ~ Year*Treatment, random = ~1|Plot/Quadrat,
>              na.action = na.omit, data = UDD)
>
> When using Year as a continuous variable, the output of the lme means that I
> can compare the 3 treatments within my model...
> i.e. it takes one of the Treatment*year interactions as the baseline and
> compares (contrasts) the other two to that.
> ie
>
> Fixed effects: Species ~ Year * Treatment
>                      Value Std.Error   DF   t-value p-value
> (Intercept)      1514.3700  352.7552 1047  4.292978  0.0000
> Year               -0.7519    0.1759 1047 -4.274786  0.0000
> Treatment0       -461.9500  498.8711   12 -0.925991  0.3727
> Treatment1      -1355.0450  498.8711   12 -2.716222  0.0187
> Year:Treatment0     0.2305    0.2488 1047  0.926537  0.3544
> Year:Treatment1     0.6776    0.2488 1047  2.724094  0.0066
>
> so Year:Treatment0 differs from baseline Year:Treatment-1 by 0.2305 and
> Year:Treatment1 is significantly different (p=0.0066) from Year:Treatment-1
>
> I can then calculate the overall treatment*year effect using 'anova.lme(Model)'.
>
> anova.lme(Model1)
>                numDF denDF   F-value p-value
> (Intercept)        1  1047 143.15245  <.0001
> Year               1  1047  19.56663  <.0001
> Treatment          2    12   3.73890  0.0547
> Year:Treatment     2  1047   3.83679  0.0219
>
> so there is an overall difference in slope between treatments (Year:Treatment
> interaction) p=0.0219
>
> However, the problem comes when I use Year as a categorical variable.
> Here, I am interested
[R] searching and replacing in a data frame.
Dear R helpers,

Please have a look at the following. Note: My goal is to find and replace all Inf's in a data array with 0.

> t <- data.frame(A=c(Inf,0,0), B=c(1,2,3))
> t
    A B
1 Inf 1
2   0 2
3   0 3
> str(t)
'data.frame':   3 obs. of  2 variables:
 $ A: num Inf 0 0
 $ B: num 1 2 3
> t[which(t==Inf, arr.ind=T)]
[1] Inf
> t[which(t==Inf, arr.ind=T)] <- 0
Error in `[<-.data.frame`(`*tmp*`, which(t == Inf, arr.ind = T), value = 0) :
  only logical matrix subscripts are allowed in replacement

Query: Why does the search work but the replace not work?

Many thanks for your time and efforts.

Ashim

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to order each element according to alphabet
Hi: Is this what you're looking for? Lines <- " ASG,UXW,AFODJEL E,TDIWE,ROFD" # Read in the above lines (for purposes of this example only) # Note the stringsAs Factors = FALSE option! df <- read.csv(textConnection(Lines), header = FALSE, stringsAsFactors = FALSE) closeAllConnections() dm <- as.matrix(df) # convert to a character matrix # Function to sort a character string in alphabetical (lexical) order sortfun <- function(x) paste(sort(unlist(strsplit(x, ''))), collapse = '') # Apply to the rows of the matrix t(apply(df, 1, function(x) sapply(x, sortfun))) Result: V1V2 V3 [1,] "AGS" "UWX" "ADEFJLO" [2,] "E" "DEITW" "DFOR" If you need to do this for only a subset of your variables, create a character submatrix and follow the script above on that, after which you would need to do some post-processing on your own. HTH, Dennis On Thu, Jul 14, 2011 at 6:18 PM, onthetopo wrote: > Hi there, > > I have a large amino acid csv file like this: > > input.txt: > P,LV,Q,Z > P,VL,Q,Z > P,ML,QL,Z > > There is a problem with this file, since LV and VL are in fact the same > thing. > How do I order each element according to alphabetical order so that the > desired output would look like: > > output.txt: > P,LV,Q,Z > P,LV,Q,Z > P,LM,LQ,Z > > > > > -- > View this message in context: > http://r.789695.n4.nabble.com/how-to-order-each-element-according-to-alphabet-tp3668997p3668997.html > Sent from the R help mailing list archive at Nabble.com. > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Smart legend ???
On 07/14/2011 05:20 AM, JIA Pei wrote: Hi, all: Is there an automatic smart legend for R? Since my R code is running in a row, which will produce a bunch of R plots in a single run, some of the produced plots are really "ridiculous". Because my legend is fixed to "topleft", sometimes, which occludes the key parts of the figure/plots, but most of the time, the legend works just fine. I'm wondering is there a smart legend in R? Whenever I set "topleft" but occlude the actual plots, the smart legend may reset from "topleft" to "topright". Or, just try "topleft", "topright", "bottomleft" and "bottomright" in a particular sequence, and calculate the occlusion ratio. Pick up either the legend with the least occlusion, or the first priority legend when some legends are of the same occlusion? Hi JIA Pei, The "emptyspace" function in the plotrix package may be helpful. This tries to find the largest empty rectangle on a plot and returns the coordinates of the center of that rectangle. Jim __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
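A rough sketch of how that might be used (assuming plotrix is installed; unlist() is used defensively since the returned coordinates may be a vector or a list, and xjust/yjust centre the legend box on the returned point):

library(plotrix)
set.seed(42)
x <- rnorm(50); y <- rnorm(50)
plot(x, y, pch = 19)
es <- unlist(emptyspace(x, y))     # centre of the largest empty rectangle
legend(es[1], es[2], legend = "series A", pch = 19, xjust = 0.5, yjust = 0.5)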
[R] plot a vertical column of colored rectangles
Hi, I've been really struggling with this. If I have a vector like

dat <- c(0,0,0,0,1,1,1,0,0,0,1,1,0,0,0,1,0,0,0)

I want to plot each element as a colored rectangle (red = 1, blue = 0), in the right order, so they all stack up forming a vertical column on the graph. Sort of like a building, with each floor in the appropriate color. Any ideas? I've tried using ggplot and geom_tile, but my data has a million elements and the plots take forever to generate. I've also tried using a heatmap, but I need at least 2 columns, and I only have 1.

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
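One possible base-graphics approach, sketched below: treat the vector as a one-column image, which scales to very long vectors far better than drawing individual rectangles. The colour coding assumes red = 1 and blue = 0, and useRaster requires R >= 2.13.0.

dat <- c(0,0,0,0,1,1,1,0,0,0,1,1,0,0,0,1,0,0,0)   # swap in the real vector here
image(x = c(0, 1), y = 0:length(dat), z = matrix(dat, nrow = 1),
      col = c("blue", "red"), axes = FALSE, xlab = "", ylab = "Element",
      useRaster = TRUE)                           # fast raster drawing
axis(2)
box()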
Re: [R] Correct behavior of Hmisc::capitalize()?
On 14.07.2011 23:32, Henrik Bengtsson wrote: Hi, from example(capitalize) of the Hmisc package (v 0.8.3) you get: capitalize(c("Hello", "bob", "daN")) [1] "Hello" "Bob" "daN" Is that "daN" correct? If so, then this behavior that only *all lowercase strings*, which the code indicates, will be capitalized is not documented. Hmisc::capitalize function (string) { capped<- grep("^[^A-Z]*$", string, perl = TRUE) substr(string[capped], 1, 1)<- toupper(substr(string[capped], 1, 1)) return(string) } There are also some misspelled words in help("capitalize"). sessionInfo() R version 2.13.1 Patched (2011-07-09 r56344) Platform: x86_64-pc-mingw32/x64 (64-bit) locale: [1] LC_COLLATE=English_United States.1252 [2] LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] splines stats graphics grDevices utils datasets methods [8] base other attached packages: [1] Hmisc_3.8-3 survival_2.36-9 loaded via a namespace (and not attached): [1] cluster_1.14.0 grid_2.13.1 lattice_0.19-30 tools_2.13.1 /Henrik (Hmisc maintainer cc:ed) I don't see you CCed. The Hmisc maintainer is the only one who can answer your message appropriately. Best, Uwe __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
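If the goal is simply to capitalize the first character regardless of the case of the rest (so "daN" becomes "DaN"), a one-line alternative outside Hmisc is:

cap1 <- function(x) sub("^(.)", "\\U\\1", x, perl = TRUE)
cap1(c("Hello", "bob", "daN"))
## [1] "Hello" "Bob"   "DaN"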
Re: [R] crr - computationally singular
Hi!

I guess that you have solved this error by now, but I figured I'd post the result of my 12 hour debugging session in case anyone else has the same issue. Let's start with a more intuitive example than the one crr offers:

# CODE START #
# Define a set size
my_set_size <- 1000

# Create the covariables
cov <- cbind(rbinom(my_set_size, 1, .5),
             rbinom(my_set_size, 1, .05),
             rbinom(my_set_size, 1, .1))
dimnames(cov)[[2]] <- c('gender', 'risk factor 1', 'risk factor 2')

# Create random time to failure/cens periods
ftime <- rexp(my_set_size)

# Create events
my_event1 <- rbinom(my_set_size, 1, .04)
my_event2 <- rbinom(my_set_size, 1, .20)

# The competing event can't happen if 1 has already occurred
my_event2[my_event1 > 0] <- 0
fstatus <- my_event1 + my_event2*2

# Factor the censor variable
fstatus <- factor(fstatus, levels=c(0,1,2),
                  labels=c("censored", "re-operation", "death"))

# Check that it seems Ok
table(fstatus)

# Do the test
test_results <- crr(ftime, fstatus, cov,
                    failcode="re-operation", cencode="censored")

# Output the results
summary(test_results)
# CODE END #

Ok, so the error occurs in the .Fortran call to "crrval" (I think it was called), which returns an empty variable if you forget to specify the failcode as a factor level, in other words if you exchange the crr call above for:

test_results <- crr(ftime, fstatus, cov, failcode=1, cencode="censored")

And you get:

Error in solve.default(v[[1]]) :
  Lapack routine dgesv: system is exactly singular

Another way to get a singular error is to have a covariate that is all 0. Try exchanging the covariates for this:

cov <- cbind(rbinom(my_set_size, 1, .5),
             rbinom(my_set_size, 1, .05),
             rbinom(my_set_size, 1, .1)*0)

And you get:

Error in drop(.Call("La_dgesv", a, as.matrix(b), tol, PACKAGE = "base")) :
  Lapack routine dgesv: system is exactly singular

This code has been checked with R 2.13.1 and cmprsk ver. 2.2.2.

I'm not so familiar with R, but I believe that this is actually a bug in the cmprsk package, which should check whether the variables are factors and then handle them as expected. I've noticed similar issues with the cuminc function, which doesn't behave as expected when provided with factored censoring variables. I haven't seen any issues with factoring the covariates, although I've used Scrucca's factor2ind function (http://www.stat.unipg.it/~luca/R/crr-addson.R) when I've had non-binomial factors.

I hope someone out there will be able to avoid my 12 hours of debugging with this post.

Max Gordon

-- View this message in context: http://r.789695.n4.nabble.com/crr-computationally-singular-tp891659p3669639.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to order each element according to alphabet
On Jul 15, 2011, at 12:23 AM, onthetopo wrote:

dd
     [,1] [,2]
[1,] "OP" "SU"
[2,] "XA" "YQ"

sapply(lapply(strsplit(dd, split=""), sort), paste, collapse="")
[1] "OP" "AX" "SU" "QY"

The result is not what I intended since it is a single line. It should be:
     [,1] [,2]
[1,] "OP" "SU"
[2,] "AX" "QY"

sortvec <- function(x) paste(sapply(strsplit(x, split=""), sort), collapse="")
apply(dd, 1:2, sortvec)
     [,1] [,2]
[1,] "OP" "SU"
[2,] "AX" "QY"

-- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Drawing a histogram from a massive dataset
Dear All, I have a massive dataset from which I would like to draw a histogram. Any ideas on how to accomplish this? Thanks in advance, Paul __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Out of Sample Prediction Interval/Point Estimate
Hi All,

I have been requested to come up with an out-of-sample prediction interval and point estimate. I have never done this, and I am hoping for help from you all. First, can R do this? If so, what are the steps? What do I need? I have a data file that I can include, if that would help. I'm between a beginner and intermediate user of R, so if it is complicated, I may be asking for a lot of help.

Thanks so much,
Zeda

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
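If "out of sample" here means predicting new observations from a fitted regression model, then R can do this with predict() and interval = "prediction". A minimal sketch with invented data (the real model and data will of course differ):

set.seed(1)
d <- data.frame(x = 1:30)
d$y <- 3 + 2 * d$x + rnorm(30, sd = 4)
fit <- lm(y ~ x, data = d)

newdata <- data.frame(x = c(31, 35))            # out-of-sample x values
predict(fit, newdata, interval = "prediction")  # point estimate (fit) plus lwr/upr bounds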
Re: [R] Export Unicode characters from R
On 11-07-14 7:11 PM, Sverre Stausland wrote: > funny.g<- "\u1E21" > funny.g [1] "ḡ" > data.frame (funny.g) -> funny.g > funny.g$funny.g [1] ḡ Levels: I think the problem is in the data.frame code, not in writing. Data.frames try to display things in a readable way, and since you're on Windows where UTF-8 is not really supported, the code helpfully changes that character to the "" string. for display. You should be able to write the Unicode character to file if you use lower level methods such as cat(), on a connection opened using the file() function with the encoding set explicitly. Duncan Murdoch > write.table (funny.g, file = "C:/~funny.g.txt", col.names = FALSE, row.names = FALSE, quote = FALSE, fileEncoding = "UTF-8") __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
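A sketch of that suggestion: open the connection with an explicit encoding and write the character directly, bypassing the data.frame display code. writeLines() is used here; cat() on the same connection should behave similarly, and the file name is just an example.

funny.g <- "\u1E21"
con <- file("funny_g.txt", open = "w", encoding = "UTF-8")  # example file name
writeLines(funny.g, con)
close(con)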
Re: [R] searching and replacing in a data frame.
On Jul 15, 2011, at 5:20 AM, Ashim Kapoor wrote:

> Dear R helpers,
> Please have a look at the following. Note: My goal is to find and replace all Inf's in a data array with 0.
>
> t <- data.frame(A=c(Inf,0,0), B=c(1,2,3))
> t
>     A B
> 1 Inf 1
> 2   0 2
> 3   0 3
> str(t)
> 'data.frame':   3 obs. of  2 variables:
>  $ A: num Inf 0 0
>  $ B: num 1 2 3
> t[which(t==Inf, arr.ind=T)]
> [1] Inf

Several problems here. `t` is a perfectly good function name, so using it as an object name is confusing.

> t[which(t==Inf, arr.ind=T)] <- 0
> Error in `[<-.data.frame`(`*tmp*`, which(t == Inf, arr.ind = T), value = 0) :
>   only logical matrix subscripts are allowed in replacement
>
> Query: Why does the search work but the replace not work?

Because you gave a numeric matrix as an argument to `[<-.data.frame` and it wanted a different mode. I think it would have worked if `t` were a matrix.

> Many thanks for your time and efforts.

Two methods that would accomplish the task:

ttt <- data.frame(A=c(Inf,0,0), B=c(1,2,3))
ttt[is.infinite(as.matrix(ttt))] <- 0

Or:

apply(ttt, 1:2, function(x) { x[is.infinite(x)] <- 0; x })

> Ashim

David Winsemius, MD West Hartford, CT

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
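A third, column-wise option (not from the original reply) that keeps the object as a data frame throughout:

ttt <- data.frame(A = c(Inf, 0, 0), B = c(1, 2, 3))
ttt[] <- lapply(ttt, function(col) replace(col, is.infinite(col), 0))
ttt
##   A B
## 1 0 1
## 2 0 2
## 3 0 3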
Re: [R] fixed effects Tobit, Honore style?
A cleaner and slightly more tested version is at http://davidhughjones.blogspot.com/2011/07/honore-style-fixed-effects-estimators.html David Hugh-Jones Research Associate CAGE, Department of Economics University of Warwick http://davidhughjones.googlepages.com On 13 July 2011 15:33, David Hugh-Jones wrote: > True! Here's my attempt -- use at your own risk. > > honore <- function (b, dataset, x1, x2) { > dxb <- (x2 - x1) %*% b > y1 <- # insert your y variable here > y2 <- # insert your y variable here > sum( > (pmax(y1, dxb) - pmax(y2, dxb) - dxb)^2 + > 2*(y1 < dxb)*(dxb-y1)*y2 + > 2*(y2 < -dxb)* (-dxb-y2)*y1 > ) > } > > fetobit <- function (dataset, form) { > x2 <- model.matrix(form, dataset[,T=2]) > x1 <- model.matrix(form, dataset[,T=1]) ># could maybe set initial values to something different > res <- optim(rep(0, ncol(x1)), fn=honore, x1=x1, x2=x2, > dataset=dataset, method="BFGS", control=list(maxit=1000)) > if (res$convergence != 0) warning("Didn't converge") > res$par > } > > For standard errors, bootstrap. > > > David Hugh-Jones > Research Associate > CAGE, Department of Economics > University of Warwick > http://davidhughjones.googlepages.com > > > > On 12 July 2011 21:38, Daniel Malter wrote: > >> Not that I know of, but the paper says that they are easy to compute. If >> you >> did, you could contribute the code. >> >> Best, >> Daniel >> >> >> David Hugh-Jones-3 wrote: >> > >> > Hi all, >> > >> > Is there any code to run fixed effects Tobit models in the style of >> Honore >> > (1992) in R? >> > (The original Honore article is here: >> > >> http://www.jstor.org/sici?sici=0012-9682%28199205%2960%3A3%3C533%3ATLALSE%3E2.0.CO%3B2-2 >> ) >> > >> > Cheers >> > David >> > >> > [[alternative HTML version deleted]] >> > >> > __ >> > R-help@r-project.org mailing list >> > https://stat.ethz.ch/mailman/listinfo/r-help >> > PLEASE do read the posting guide >> > http://www.R-project.org/posting-guide.html >> > and provide commented, minimal, self-contained, reproducible code. >> > >> >> -- >> View this message in context: >> http://r.789695.n4.nabble.com/fixed-effects-Tobit-Honore-style-tp3662246p3663464.html >> Sent from the R help mailing list archive at Nabble.com. >> >> __ >> R-help@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Writing Complex Formulas
On Jul 14, 2011, at 20:19 , Duncan Murdoch wrote: > On 14/07/2011 12:46 PM, warmstron1 wrote: >> I resolved this issue. It appears that "^" won't work for this case, but >> "**" worked. I can't find any reference to this, but where "^" seems to be >> used to raise a value to a numerical function, "**" is used for a y raised >> to the power of x where x it a computation. > > Those should be equivalent. Can you post the code that wasn't working, and > describe what "not working" meant? More easily, demonstrate that code that _is_ working stops working if you replace "**" with "^". Or stop spreading misinformation! -- Peter Dalgaard Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
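For the record, the parser translates "**" into "^" before evaluation, so the two operators cannot differ in behaviour:

quote(y ** x)
## y^x
identical(quote(y ** x), quote(y ^ x))
## [1] TRUE
all((1:10) ** 2.5 == (1:10) ^ 2.5)
## [1] TRUE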
Re: [R] cbind in aggregate formula - based on an existing object (vector)
For a little lateral thinking, consider the use of "." on the LHS. That could play out as follows: > myvars <- c("Ozone","Wind") > f <- . ~ Month > j <- union(all.vars(f[[3]]), myvars) > aggregate(. ~ Month, data=airquality[j], mean, na.rm=T) MonthOzone Wind 1 5 23.61538 11.457692 2 6 29.4 12.18 3 7 59.11538 8.523077 4 8 59.96154 8.565385 5 9 31.44828 10.075862 (and of course, when you play with something unusual, a buglet pops up: it doesn't work with f instead of the explicit formula in the call to aggregate.) On Jul 15, 2011, at 00:10 , Dennis Murphy wrote: > Hi: > > I think Bill's got the right idea for your problem, but for the fun of > it, here's how Bert's suggestion would play out: > > # Kind of works, but only for the first variable in myvars... >> aggregate(get(myvars) ~ group + mydate, FUN = sum, data = example) > group mydate get(myvars) > 1 group1 2008-12-01 4 > 2 group2 2008-12-01 6 > 3 group1 2009-01-01 40 > 4 group2 2009-01-01 60 > 5 group1 2009-02-01 400 > 6 group2 2009-02-01 600 > > # Maybe sapply() with get as the function will work... >> aggregate(sapply(myvars, get) ~ group + mydate, FUN = sum, data = example) > group mydate myvars get > 1 group1 2008-12-01 4 4.2 > 2 group2 2008-12-01 6 6.2 > 3 group1 2009-01-01 40 40.2 > 4 group2 2009-01-01 60 60.2 > 5 group1 2009-02-01400 400.2 > 6 group2 2009-02-01600 600.2 > > Apart from the variable names, it matches example.agg1. OTOH, Bill's > suggestion matches example.agg1 exactly and has an advantage in terms > of code clarity: > > byVars <- c('group', 'mydate') >> aggregate(example[myvars], by = example[byVars], FUN = sum) > group mydate value1 value2 > 1 group1 2008-12-01 44.2 > 2 group2 2008-12-01 66.2 > 3 group1 2009-01-01 40 40.2 > 4 group2 2009-01-01 60 60.2 > 5 group1 2009-02-01400 400.2 > 6 group2 2009-02-01600 600.2 > > FWIW, > Dennis > > On Thu, Jul 14, 2011 at 12:05 PM, Dimitri Liakhovitski > wrote: >> Hello! >> >> I am aggregating using a formula in aggregate - of the type: >> aggregate(cbind(var1,var2,var3)~factor1+factor2,sum,data=mydata) >> >> However, I actually have an object (vector of my variables to be aggregated): >> myvars<-c("var1","var2","var3") >> >> I'd like my aggregate formula (its "cbind" part) to be able to use my >> "myvars" object. Is it possible? >> Thanks for your help! >> >> Dimitri >> >> Reproducible example: >> >> mydate = rep(seq(as.Date("2008-12-01"), length = 3, by = "month"),4) >> value1=c(1,10,100,2,20,200,3,30,300,4,40,400) >> value2=c(1.1,10.1,100.1,2.1,20.1,200.1,3.1,30.1,300.1,4.1,40.1,400.1) >> >> example<-data.frame(mydate=mydate,value1=value1,value2=value2) >> example$group<-c(rep("group1",3),rep("group2",3),rep("group1",3),rep("group2",3)) >> example$group<-as.factor(example$group) >> (example);str(example) >> >> example.agg1<-aggregate(cbind(value1,value2)~group+mydate,sum,data=example) >> # this works >> (example.agg1) >> >> ### Building my object (vector of 2 names - in reality, many more): >> myvars<-c("value1","value2") >> example.agg1<-aggregate(cbind(myvars)~group+mydate,sum,data=example) >> ### does not work >> >> >> -- >> Dimitri Liakhovitski >> Ninah Consulting >> www.ninah.com >> >> __ >> R-help@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. 
>> > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. -- Peter Dalgaard Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] cbind in aggregate formula - based on an existing object (vector)
THAT'S IT, Bill - exactly what I was looking for! Thanks a lot for the input, everyone. I find the "by" method the most straigtfoward and clear. Dimitri On Thu, Jul 14, 2011 at 5:12 PM, William Dunlap wrote: > You may find it easier to use the data.frame method for aggregate > instead of the formula method when you are using vectors of column > names. E.g., > > responseVars <- c("mpg", "wt") > byVars <- c("cyl", "gear") > aggregate(mtcars[responseVars], by=mtcars[byVars], FUN=median) > > gives the same result as > > aggregate(cbind(mpg, wt) ~ cyl + gear, FUN=median, data=mtcars) > > Bill Dunlap > Spotfire, TIBCO Software > wdunlap tibco.com > > -Original Message- > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On > Behalf Of Dimitri Liakhovitski > Sent: Thursday, July 14, 2011 1:45 PM > To: David Winsemius > Cc: r-help > Subject: Re: [R] cbind in aggregate formula - based on an existing object > (vector) > > Thanks a lot! > > actually, what I tried to do is very simple - just passing tons of > variable names into the formula. Maybe that "get" thing suggested by > Bert would work... > > Dimitri > > > On Thu, Jul 14, 2011 at 4:01 PM, David Winsemius > wrote: >> Dmitri: >> >> as.matrix makes a matrix out of the dataframe that is passed to it. >> >> As a further note I attempted and failed for reasons that are unclear to me >> to construct a formula that would (I hoped) preserve the column names which >> are being mangle in the posted effort: >> >> form <- as.formula(paste( >> "cbind(", >> paste( myvars, collapse=","), >> ") ~ group+mydate", >> sep=" ") ) >>> myvars<-c("value1","value2") >>> example.agg1<-aggregate(formula=form,data=example, FUN=sum) >> Error in m[[2L]][[2L]] : object of type 'symbol' is not subsettable >>> traceback() >> 2: aggregate.formula(formula = form, data = example, FUN = sum) >> 1: aggregate(formula = form, data = example, FUN = sum) >> >>> form >> cbind(value1, value2) ~ group + mydate >>> parse(text=form) >> expression(~ >> cbind(value1, value2), group + mydate) >> >> So it seems to be correctly dispatched to aggregate.formula but not passing >> some check or another. Also tried with formula() rather than as.formula with >> identical error message. Also tried including without naming the argument. >> >> -- >> David >> >> >> On Jul 14, 2011, at 3:32 PM, Dimitri Liakhovitski wrote: >> >>> Thank you, David, it does work. >>> Could you please explain why? What exactly does changing it to "as matrix" >>> do? >>> Thank you! >>> Dimitri >>> >>> On Thu, Jul 14, 2011 at 3:25 PM, David Winsemius >>> wrote: On Jul 14, 2011, at 3:05 PM, Dimitri Liakhovitski wrote: > Hello! > > I am aggregating using a formula in aggregate - of the type: > aggregate(cbind(var1,var2,var3)~factor1+factor2,sum,data=mydata) > > However, I actually have an object (vector of my variables to be > aggregated): > myvars<-c("var1","var2","var3") > > I'd like my aggregate formula (its "cbind" part) to be able to use my > "myvars" object. Is it possible? > Thanks for your help! 
> Not sure I have gotten all the way there, but this does work: example.agg1<-aggregate(as.matrix(example[myvars])~group+mydate,sum,data=example) > example.agg1 group mydate example[myvars] NA 1 group1 2008-12-01 4 4.2 2 group2 2008-12-01 6 6.2 3 group1 2009-01-01 40 40.2 4 group2 2009-01-01 60 60.2 5 group1 2009-02-01 400 400.2 6 group2 2009-02-01 600 600.2 > Dimitri > > Reproducible example: > > mydate = rep(seq(as.Date("2008-12-01"), length = 3, by = "month"),4) > value1=c(1,10,100,2,20,200,3,30,300,4,40,400) > value2=c(1.1,10.1,100.1,2.1,20.1,200.1,3.1,30.1,300.1,4.1,40.1,400.1) > > example<-data.frame(mydate=mydate,value1=value1,value2=value2) > > > example$group<-c(rep("group1",3),rep("group2",3),rep("group1",3),rep("group2",3)) > example$group<-as.factor(example$group) > (example);str(example) > > > > example.agg1<-aggregate(cbind(value1,value2)~group+mydate,sum,data=example) > # this works > (example.agg1) > > ### Building my object (vector of 2 names - in reality, many more): > myvars<-c("value1","value2") > example.agg1<-aggregate(cbind(myvars)~group+mydate,sum,data=example) > ### does not work > > > -- > Dimitri Liakhovitski > Ninah Consulting > www.ninah.com > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproduc
Re: [R] Very slow optim()
As I'm at least partly responsible for CG in optim, and packager of Rcgmin, I'll recommend the latter based on experience since it was introduced. I've so far seen no example where CG does better than Rcgmin, though I'm sure there are cases to be found. However, Ben is right that if ADMB does so well (it uses effectively analytic derivatives), then use it. Rcgmin really wants you to provide gradient code, and that is work. JN On 07/14/2011 06:00 AM, r-help-requ...@r-project.org wrote: > Message: 85 Date: Wed, 13 Jul 2011 20:20:47 + From: Ben Bolker > To: > Subject: Re: [R] Very slow optim() Message-ID: > Content-Type: text/plain; > charset="utf-8" > Hamazaki, Hamachan (DFG alaska.gov> writes: >> > >> > Dear list, >> > >> > I am using optim() function to MLE ~55 parameters, but it is very slow to > converge (~ 25 min), whereas I can do >> > the same in ~1 sec. using ADMB, and ~10 sec using MS EXCEL Solver. >> > >> > Are there any tricks to speed up? >> > >> > Are there better optimization functions? >> > > There's absolutely no way to tell without knowing more about your code. You > might try method="CG": > > Method ?"CG"? is a conjugate gradients method based on that by > Fletcher and Reeves (1964) (but with the option of Polak-Ribiere > or Beale-Sorenson updates). Conjugate gradient methods will > generally be more fragile than the BFGS method, but as they do not > store a matrix they may be successful in much larger optimization > problems. > > If ADMB works better, why not use it? You can use the R2admb > package (on R forge) to wrap your ADMB calls in R code, if you > prefer that workflow. > > Ben > > > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
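As a small illustration of supplying gradient code, here is the Rosenbrock test function with its analytic gradient passed to optim(method = "CG"); the commented Rcgmin call assumes its interface mirrors optim's fn/gr arguments.

fr <- function(p) 100 * (p[2] - p[1]^2)^2 + (1 - p[1])^2          # Rosenbrock function
grr <- function(p) c(-400 * p[1] * (p[2] - p[1]^2) - 2 * (1 - p[1]),
                      200 * (p[2] - p[1]^2))                       # analytic gradient

optim(c(-1.2, 1), fr, grr, method = "CG")$par

## library(Rcgmin)                  # if installed
## Rcgmin(c(-1.2, 1), fr, grr)$par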
Re: [R] Plotting survival curves from a Cox model with time dependent covariates
The time-transform (tt() arguments) feature is the most recent addition to coxph. Most of the follow-up functions, in particular survfit(fit) have not yet been updated to deal with such models. Your message points out that I need to at least update them to add a "not yet available" error message. I'm still learning what can be done with the tt() option. If I assume that your trmt variable is 0/1, then the code below is a clever way to look at time dependent treatment effects. I had not thought of it. Terry Therneau --- begin included message --- Let's assume I have a clinical trial with two treatments and a time to event outcome. I am trying to fit a Cox model with a time dependent treatment effect and then plot the predicted survival curve for one treatment (or both). library(survival) test <- list(time=runif(100,0,10),event=sample(0:1,100,replace=T),trmt=sample(0:1,100,replace=T)) model1 <- coxph(Surv(time, event) ~ tt(trmt), data=test, tt=function(x, t, ...) pspline(x + t)) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] cbind in aggregate formula - based on an existing object (vector)
On Jul 15, 2011, at 15:06 , peter dalgaard wrote: > For a little lateral thinking, consider the use of "." on the LHS. That could > play out as follows: > >> myvars <- c("Ozone","Wind") >> f <- . ~ Month >> j <- union(all.vars(f[[3]]), myvars) >> aggregate(. ~ Month, data=airquality[j], mean, na.rm=T) > MonthOzone Wind > 1 5 23.61538 11.457692 > 2 6 29.4 12.18 > 3 7 59.11538 8.523077 > 4 8 59.96154 8.565385 > 5 9 31.44828 10.075862 > > (and of course, when you play with something unusual, a buglet pops up: it > doesn't work with f instead of the explicit formula in the call to aggregate.) > ...however, once you go down that road, you might as well construct the LHS directly: > lhs <- as.call(lapply(c("cbind", myvars), as.name)) > eval(bquote(aggregate(.(lhs) ~ Month, data=airquality, mean, na.rm=T))) MonthOzone Wind 1 5 23.61538 11.457692 2 6 29.4 12.18 3 7 59.11538 8.523077 4 8 59.96154 8.565385 5 9 31.44828 10.075862 -- Peter Dalgaard Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Reorganize data fram
Hi, thanks for your reply. I didn't get cast() to work and didn't know how to find information about it either. I used reshape but then I had to subset only those columns (actually I have 28 columns of other data) Could cast or reshape work also with more columns? Angelica -- View this message in context: http://r.789695.n4.nabble.com/Reorganize-data-fram-tp3662123p3669899.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Reorganize data fram
Thank you! I used this one and it worked really great. /Angelica -- View this message in context: http://r.789695.n4.nabble.com/Reorganize-data-fram-tp3662123p3669782.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] use pcls to solve least square fitting with constraints
Hi, I need help with imposing constraints on GAM parameters, maybe through pcls.. I have a GAM model without intercept with several strictly parametric and smooth parameters. I need to set a linear constraint such that sum of parametric coefficients and first derivatives of the smoothes is equal to 1. I saw examples with monotonicity and inequality constraints, but can't figure out how to adapt them for my case.. appreciate any help. Rgrds, -- View this message in context: http://r.789695.n4.nabble.com/use-pcls-to-solve-least-square-fitting-with-constraints-tp3074869p3669806.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] grey colored lines and overwriting labels i qqplot2
Okay, seems like ddply is not the right method to add my model. That is okay, though. I already calculated the slopes and intercepts fore each for the treatments and country. How can I add those 14 lines? -- View this message in context: http://r.789695.n4.nabble.com/grey-colored-lines-and-overwriting-labels-i-qqplot2-tp3657119p3669823.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
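One way to draw pre-computed lines in ggplot2 is to put the intercepts and slopes in a small data frame (one row per treatment-by-country combination) and hand it to geom_abline(). The variable names and numbers below are made up; substitute your own 14 rows of coefficients:

library(ggplot2)
set.seed(1)
dat <- data.frame(country = rep(c("NO", "SE"), each = 20),
                  treatment = rep(c("A", "B"), times = 20),
                  x = runif(40, 0, 10))
dat$y <- 2 + 0.5 * dat$x + rnorm(40)

# pre-computed intercepts and slopes, one row per treatment within country
coefs <- data.frame(country   = rep(c("NO", "SE"), each = 2),
                    treatment = rep(c("A", "B"), 2),
                    intercept = c(1.8, 2.2, 2.0, 1.9),
                    slope     = c(0.45, 0.55, 0.50, 0.48))

ggplot(dat, aes(x, y, colour = treatment)) +
  geom_point() +
  geom_abline(data = coefs,
              aes(intercept = intercept, slope = slope, colour = treatment)) +
  facet_wrap(~ country)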
[R] Querying RData Files, SQL style?
Hello. Is there a package or functionality available somewhere which will allow for complex searches (such as what SQL can do) of collections of RData files? Search capability within a given RData file at a time (which could be put in a loop) would be good, but the capability to perform joins to data across multiple RData files would be great. These queries might be ad-hoc, so writing an R program to get(load(...)) each file and customize the search in home-grown R code isn't feasible. This shouldn't be dependent on environment details, but just in case: I'm running version 2.13.0 in a Unix environment (but could easily run in Windows too). Thanks very much. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Querying RData Files, SQL style?
http://cran.r-project.org/web/packages/RMySQL/ On 15.07.2011 17:29, Bhushan, Vipul wrote: Hello. Is there a package or functionality available somewhere which will allow for complex searches (such as what SQL can do) of collections of RData files? Search capability within a given RData file at a time (which could be put in a loop) would be good, but the capability to perform joins to data across multiple RData files would be great. These queries might be ad-hoc, so writing an R program to get(load(...)) each file and customize the search in home-grown R code isn't feasible. This shouldn't be dependent on environment details, but just in case: I'm running version 2.13.0 in a Unix environment (but could easily run in Windows too). Thanks very much. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
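If the data frames do get loaded into R, the sqldf package lets you write the joins in SQL; a sketch in which the file names, data frame names and columns are all hypothetical:

library(sqldf)
load("file1.RData")    # hypothetical file containing a data frame df1
load("file2.RData")    # hypothetical file containing a data frame df2

sqldf("SELECT a.id, a.x, b.y
         FROM df1 a
         JOIN df2 b ON a.id = b.id")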
Re: [R] WLS regression, lm() with weights as a matrix
Dear All,

Now I am thinking to use a for loop:

for (i in 1:200) {
  Results <- lm(R[,i] ~ F, weights=W[,i])
}

The thing is, I can get WLS regression coefficients and residuals for each company, each with a unique weight, but I am wondering how to easily combine all coefficients and residuals for ALL companies? Any suggestions would be greatly appreciated.

-- View this message in context: http://r.789695.n4.nabble.com/WLS-regression-lm-with-weights-as-a-matrix-tp3668577p3670176.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
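One possible way to keep all 200 fits and extract the pieces afterwards, assuming R and W are matrices with one column per company and F is the common regressor, as in the loop above (residuals only bind into a matrix cleanly if every fit keeps the same number of observations):

fits <- lapply(seq_len(ncol(R)),
               function(i) lm(R[, i] ~ F, weights = W[, i]))

all_coefs  <- sapply(fits, coef)         # one column of coefficients per company
all_resids <- sapply(fits, residuals)    # one column of residuals per company
colnames(all_coefs) <- colnames(all_resids) <- colnames(R)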
Re: [R] Drawing a histogram from a massive dataset
Hi:

I would suggest that you avoid the histogram and make a density plot instead. It would be more informative and probably require a lot less time and ink. If you're married to the histogram concept, try taking a random sample of about 1,000 observations and get a histogram of that instead. The result shouldn't be much different from that of the entire sample - to test out this hypothesis, take several random samples of size 1,000 and compare the histograms. If they're not much different in shape, it's likely that the full sample is close to the same. If there are noticeable differences, try 5,000 or 10,000 instead (rinse and repeat).

HTH,
Dennis

On Fri, Jul 15, 2011 at 4:21 AM, Paul Smith wrote:
> Dear All,
> I have a massive dataset from which I would like to draw a histogram.
> Any ideas on how to accomplish this?
> Thanks in advance,
> Paul
> __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
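In code, with x standing in for the full (large) vector and the sample size following the suggestion above:

x <- rnorm(1e6)                   # placeholder for the real data vector

plot(density(x))                  # density estimate of the whole data set

idx <- sample(length(x), 1000)    # histogram of a random subsample
hist(x[idx], breaks = 50)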
[R] Drawing a histogram from a massive dataset
Hello,

I assume you have imported the dataset. You can use hist() from the graphics package in base R. A tricky part is that freq=TRUE (the default) plots frequencies and freq=FALSE plots probability densities, not the percent in each histogram cell. You can sum the counts and calculate the percent before plotting:

hist1 <- hist(varname, plot=FALSE)
sum <- sum(hist1$counts)
hist1$counts <- hist1$counts/sum*100
plot(hist1,
     main=paste("Histogram of", deparse(substitute(varname))),
     xlab=deparse(substitute(varname)),
     ylab="Percent")

Also, if you are new to R, there are very useful manuals and guides at http://cran.r-project.org/manuals.html . You can look up documentation in R, such as the ?hist command for documentation for the hist function.

Regards,
Kyaw Sint (Joe)

> Dear All,
> I have a massive dataset from which I would like to draw a histogram.
> Any ideas on how to accomplish this?
> Thanks in advance,
> Paul

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Querying RData Files, SQL style?
On Jul 15, 2011, at 10:29 AM, Bhushan, Vipul wrote:

> Hello. Is there a package or functionality available somewhere which will allow for complex searches (such as what SQL can do) of collections of RData files? Search capability within a given RData file at a time (which could be put in a loop) would be good, but the capability to perform joins to data across multiple RData files would be great. These queries might be ad-hoc, so writing an R program to get(load(...)) each file and customize the search in home-grown R code isn't feasible.

As I read the question, it appears that you are not expecting to load the data into R and are rather asking for a program other than R (or Rscript or littler) to read .RData files and perform database joins. As I understand it, that is not available. As I understand it, there is not even a package that can look at .RData files for their object names and structure without actually loading them. Hoping to be corrected on either of these points.

> This shouldn't be dependent on environment details, but just in case: I'm running version 2.13.0 in a Unix environment (but could easily run in Windows too). Thanks very much.

-- David Winsemius, MD West Hartford, CT

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Writing Complex Formulas
Forgive me. I had a legitimate problem that I found resolvable using "**" instead of "^". I can't seem to recreate the problem to obtain the error message that I was receiving. "Incomplete information" is perhaps more appropriate than "*mis*information." Here is the exact code I used (still not correct to my original question, but a step along the way): > J <- 3 > r_ <- 1.959 > q_ <- 1.45 > scale_ <- 0.3 > N <- 4 > fq <- seq(0, N-1, 1) > fq [1] 0 1 2 3 > center_frequencies <- function(J = 3, r_ = 1.959, q_ = 1.45, scale_ = 0.3){ + j <- seq(0,J-1,1) + fc <- (q_ + j)^r_/scale_ + } > fc <- center_frequencies(3,r_,q_,scale_) > fc [1] 6.902377 19.286575 37.710853 > cf <- t(fc) > cf [,1] [,2] [,3] [1,] 6.902377 19.28657 37.71085 > lambda <- function(cf, J = 3, scale_ = 0.3){ + B <- cf*scale_ + } > B <- lambda(cf, 3, 0.3) > B [,1] [,2] [,3] [1,] 2.070713 5.785972 11.31326 > Fc <- 1/cf > Fc [,1] [,2] [,3] [1,] 0.1448776 0.05184954 0.02651757 > dummy <- fq%*%Fc > dummy [,1] [,2] [,3] [1,] 0.000 0. 0. [2,] 0.1448776 0.05184954 0.02651757 [3,] 0.2897553 0.10369908 0.05303513 [4,] 0.4346329 0.15554862 0.07955270 > U <- -dummy+1 > for(j in 1:J) + { + Z <- dummy**B[j] + U <- (-dummy+1)**B[j] + } > Z [,1] [,2] [,3] [1,] 0.00e+00 0.00e+00 0.00e+00 [2,] 3.222504e-10 2.881245e-15 1.462288e-18 [3,] 8.200170e-07 7.331782e-12 3.721022e-15 [4,] 8.053568e-05 7.200705e-10 3.654498e-13 > U [,1] [,2] [,3] [1,] 1.0 1.000 1.000 [2,] 0.170223061 0.5475283 0.7378244 [3,] 0.020842075 0.2897999 0.5398325 [4,] 0.001577800 0.1476795 0.3914810 > for(i in 1:4) + { + for(j in 1:3){ + U[i,j]<-ifelse( U[i,j]>30,30,U[i,j]) + } + U <- exp(U) > W <- Z*U > U [,1] [,2] [,3] [1,] 2.718282 2.718282 2.718282 [2,] 1.185569 1.728974 2.091381 [3,] 1.021061 1.336160 1.715719 [4,] 1.001579 1.159141 1.479170 > W [,1] [,2] [,3] [1,] 0.00e+00 0.00e+00 0.00e+00 [2,] 3.820502e-10 4.981599e-15 3.058201e-18 [3,] 8.372872e-07 9.796435e-12 6.384229e-15 [4,] 8.066285e-05 8.346635e-10 5.405623e-13 > I can now get W using: W <- as.complex(((fq/cf[j])^B[j])*(exp(-(fq/cf[j])+1)^B[j])) where j is the index for cf (i.e., each center frequency is run individually and written to a table). Still not the most efficient way to accomplish this step, but it is working for me. Jeff On Fri, Jul 15, 2011 at 5:34 AM, peter dalgaard wrote: > > On Jul 14, 2011, at 20:19 , Duncan Murdoch wrote: > > > On 14/07/2011 12:46 PM, warmstron1 wrote: > >> I resolved this issue. It appears that "^" won't work for this case, > but > >> "**" worked. I can't find any reference to this, but where "^" seems to > be > >> used to raise a value to a numerical function, "**" is used for a y > raised > >> to the power of x where x it a computation. > > > > Those should be equivalent. Can you post the code that wasn't working, > and describe what "not working" meant? > > More easily, demonstrate that code that _is_ working stops working if you > replace "**" with "^". Or stop spreading misinformation! > > -- > Peter Dalgaard > Center for Statistics, Copenhagen Business School > Solbjerg Plads 3, 2000 Frederiksberg, Denmark > Phone: (+45)38153501 > Email: pd@cbs.dk Priv: pda...@gmail.com > > -- *W. Jeffrey Armstrong, Ph.D. 
*Assistant Professor Exercise Science *Managing Editor Clinical Kinesiology* Official Journal of the American Kinesiotherapy Association [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Reorganize data fram
Hi: On Fri, Jul 15, 2011 at 6:04 AM, anglor wrote: > Hi, thanks for your reply. > > I didn't get cast() to work and didn't know how to find information about it > either. Hadley Wickham's home page http://had.co.nz/ has a link (last one under the heading 'R Packages') to the reshape package page, where some documentation resides. Could you post a representative sample of your input data frame using dput()? Here's an example: df <- data.frame(x = 1:4, y = rnorm(4), z = rpois(4, 3)) > dput(df) structure(list(x = 1:4, y = c(-0.49491054748322, -1.53013240418216, 0.0189088048735591, -0.0766510981813545), z = c(2, 2, 3, 4)), .Names = c("x", "y", "z"), row.names = c(NA, -4L), class = "data.frame") Copy and paste the result of dput() into your e-mail. This is the preferred way to transport data that is readable on all platforms while guaranteeing that a potential R-helper sees the same data structure you do. Clearly, you don't want to send 700,000 observations with dput(), but a small sample that is sufficient to illustrate the problem is desirable. If possible, also send the code that you tried and the expected result, as you did in your initial post. I used reshape but then I had to subset only those columns (actually > I have 28 columns of other data) Could cast or reshape work also with more > columns? Are these columns 'constant' apart from Temperature? If so, then the following should work, but this needs the 'new and improved' reshape2 package instead. I'm using the same data frame d as before with a couple added 'constant' variables: d$age <- 12 d$region <- 'NW' d$zone <- 'CET' d Date Temperature Category age region zone 1 2007102 16A 12 NW CET 2 2007102 17B 12 NW CET 3 2007102 18C 12 NW CET library(reshape2) dcast(d, ... ~ Category, value_var = 'Temperature') Date age region zone A B C 1 2007102 12 NW CET 16 17 18 If they're not (all) constant, then you need to post some data per above and describe your desired outcome. HTH, Dennis > > Angelica > > > > -- > View this message in context: > http://r.789695.n4.nabble.com/Reorganize-data-fram-tp3662123p3669899.html > Sent from the R help mailing list archive at Nabble.com. > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Problem in installing rJava from source
Hi all, I was trying to install rJava package (some older version) from source. However could not achieve using "Rcmd build -binary rJava" syntax with windows cmd. The building process stopped with following error: ERROR*> JavaSoft\{JRE|JDK} can't open registry keys. ERROR: cannot find Java Development Kit. Please set JAVA_HOME to specify it's location normally ERROR: configuration failed for package 'rJava' With this error it seems that, I need to install some additional tool(s), however I have Duncan's Rtools installed. Can somebody through some light on this issue, what I should do with this error? Thanks, [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem in installing rJava from source
Hi, rJava depends on having appropriate version of Java installed. You can download the JDK from oracle for free at their website. It should automatically set the appropriate environment variables, but if you are having difficulty with that still, you may need to set JAVA_HOME to the directory where you installed Java. Cheers, Josh P.S. Duncan Murdoch's Rtools gives you what you need to build R and many packages but not all third party software a particular package may depend on. On Fri, Jul 15, 2011 at 9:40 AM, Bogaso Christofer wrote: > Hi all, I was trying to install rJava package (some older version) from > source. However could not achieve using "Rcmd build -binary rJava" syntax > with windows cmd. The building process stopped with following error: > > > > ERROR*> JavaSoft\{JRE|JDK} can't open registry keys. > > ERROR: cannot find Java Development Kit. > > Please set JAVA_HOME to specify it's location normally > > ERROR: configuration failed for package 'rJava' > > > > With this error it seems that, I need to install some additional tool(s), > however I have Duncan's Rtools installed. > > > > Can somebody through some light on this issue, what I should do with this > error? > > > > Thanks, > > > > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles https://joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] grey colored lines and overwriting labels i qqplot2
Hi: What did you do and what do you mean by 'add[ing] those 14 lines'? A reproducible example would be helpful. I've used plyr successfully to get model coefficients before, so I'm interested in what you mean by 'ddply is not the right method to add my model.' Here's a toy reproducible example to counter your claim: library('plyr') set.seed(1036) df <- data.frame(gp = rep(1:5, each = 10), x = 1:10, y = 1.5 + 2 * rep(1:10, 5) + rnorm(50)) # function to generate the model coefficients for a generic data frame lmfun <- function(d) coef(lm(y ~ x, data = d)) # Apply the function to each sub-data frame associated with groups: ddply(df, .(gp), lmfun) gp (Intercept)x 1 1 1.2481481 2.011974 2 2 1.3125070 1.977223 3 3 0.5988811 2.212524 4 4 0.8575467 2.075925 5 5 2.1428869 1.903015 Internally, ddply() splits df into five sub-data frames corresponding to each level of gp. The function lmfun() is applied to each sub-data frame. Notice that the function argument is a data frame (observe that data = d inside lm()). It is often advantageous to run lm() by group, exporting the output to a list of lists (since the output from lm() is a list), from which plyr can use the ldply() function to pick off pieces of output from each group. I've done this several times before in this forum, so I'm not going to repeat it here. If you post what you tried that didn't work, perhaps I or someone else can get it to work for you. As mentioned above, reproducible code and data (with dput()) is ideal. Dennis On Fri, Jul 15, 2011 at 5:26 AM, Sigrid wrote: > Okay, seems like ddply is not the right method to add my model. That is okay, > though. I already calculated the slopes and intercepts fore each for the > treatments and country. How can I add those 14 lines? > > -- > View this message in context: > http://r.789695.n4.nabble.com/grey-colored-lines-and-overwriting-labels-i-qqplot2-tp3657119p3669823.html > Sent from the R help mailing list archive at Nabble.com. > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
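A sketch of the list-of-models route mentioned above, reusing the toy data frame df and plyr: dlply() returns one fitted lm per group, and ldply() then picks off whatever piece of each fit you need.

fits <- dlply(df, .(gp), function(d) lm(y ~ x, data = d))  # one lm per level of gp
ldply(fits, coef)                                          # per-group coefficients
ldply(fits, function(m) summary(m)$sigma)                  # e.g. per-group residual SE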
[R] Odd behaviour of as.POSIXct
Dear all, how come the first loop in the below fails, but the second performs as expected? days <- as.Date( c("2000-01-01", "2000-01-02") ) for(day in days) { as.POSIXct(day) } for( n in 1:length(days) ) { show(as.POSIXct(days[n])) } Many thanks, Jo [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] grey colored lines and overwriting labels i qqplot2
> You should only have one scale_ call for each scale type. Here, you have > three scale_colour_ calls, the first selecting a grey scale, the second > defining a single break with its label (and thus implicitly subsetting on > that single break value), and a second which defines a different > break/label/subset. Only the last one has any effect. Just to clarify: breaks/labels control the appearance of the legend/axis, limits modify what data is shown on the plot. Hadley -- Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University http://had.co.nz/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem in installing rJava from source
If you cannot build a java program outside R then you won't be able to do so inside R. Find a Java development resource (the JDK is one such) and get command-line ability to compile java enabled, and then come back to interfacing R with Java. --- Jeff Newmiller The . . Go Live... DCN: Basics: ##.#. ##.#. Live Go... Live: OO#.. Dead: OO#.. Playing Research Engineer (Solar/Batteries O.O#. #.O#. with /Software/Embedded Controllers) .OO#. .OO#. rocks...1k --- Sent from my phone. Please excuse my brevity. Bogaso Christofer wrote: Hi all, I was trying to install rJava package (some older version) from source. However could not achieve using "Rcmd build -binary rJava" syntax with windows cmd. The building process stopped with following error: ERROR*> JavaSoft\{JRE|JDK} can't open registry keys. ERROR: cannot find Java Development Kit. Please set JAVA_HOME to specify it's location normally ERROR: configuration failed for package 'rJava' With this error it seems that, I need to install some additional tool(s), however I have Duncan's Rtools installed. Can somebody through some light on this issue, what I should do with this error? Thanks, [[alternative HTML version deleted]] _ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Odd behaviour of as.POSIXct
day doesn't exist? That would be the 1st problem. Johannes Egner wrote: > > Dear all, > > how come the first loop in the below fails, but the second performs as > expected? > > days <- as.Date( c("2000-01-01", "2000-01-02") ) > > for(day in days) > { > as.POSIXct(day) > } > > for( n in 1:length(days) ) > { > show(as.POSIXct(days[n])) > } > > Many thanks, Jo > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- View this message in context: http://r.789695.n4.nabble.com/Odd-behaviour-of-as-POSIXct-tp3670414p3670454.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] turning coefficients into an lm obect
I'm working with a dataset and fitting and comparing various lms. I also have a fitted model parameter values and SE estimated from the literature. In doing my comparison, I'd like to turn these estimates into an lm object itself for ease of use with some of the code I'm writing. While putting in the coefficients is a simple matter - just take a fitted model object and change the values of the mylm$coefficients, for example, it is not transparent to me how I could incorporate the parameter variance and, say, the unexplained variance in the previous fit. Although, thinking about it further, the unexplained variance is specific to that dataset - so, I shouldn't have to worry about that. But how can I incorporate known variance in the parameter estimates? Thanks! -Jarrett __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Using str() in a function.
Thanks, everybody, this has been very edifying. One last question: It seems that sometimes when a function returns something and you don't assign it, it prints to the console, and sometimes it doesn't. I'm not sure I understand which is which. My best current theory is that, if the function returns NULL, by itself and not as part of some larger object, it does not print it, but non-null values are printed. Is that correct? Thanks! Andrew -- View this message in context: http://r.789695.n4.nabble.com/Using-str-in-a-function-tp3655785p3670513.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Error Message Help: Differing Number of Rows
Hello all, I'm relatively new to "R" and programming in general - I had previously used MATLAB, but decided to make the transition to R, as the computational times are much better! Anyway, I'm trying to use R to run a gamma distribution model to estimate mean transit times of water moving through a hydrological catchment. My inputs are 3 .txt files, as follows:

Precipitation_18o: 2 columns - column 1 is a date (Excel number format) and column 2 is an isotope ratio (e.g. -8.12)
Runoff_18o: same as above
Daily_Precip: 2 columns - column 1 is the same date format but column 2 is a weekly bulk precipitation value (e.g. 10mm)

When running the script, I keep getting the following error message:

Error in data.frame(cQ.ou[ind.mea], cQ[cal.cQ:nrow(cQ), 2]) : arguments imply differing number of rows: 42, 44

Now, I know it's not the script, as it runs perfectly for one site but not the other; but having read previous threads on other forums, they suggest that there isn't the same number of values in all 3 input files. A quick use of str(FILENAME) confirms that all 3 input files have 322 entries with 2 variables. I'd be grateful for ANY help, as this is really hampering my research progress right now! Cheers, Scott_M -- View this message in context: http://r.789695.n4.nabble.com/Error-Message-Help-Differing-Number-of-Rows-tp3670451p3670451.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
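Before anything else, it may help to check the lengths of the two arguments named in the error message directly (the object names below are taken from that message and are assumed to exist in your script):

length(cQ.ou[ind.mea])          # 42 in the reported error
length(cQ[cal.cQ:nrow(cQ), 2])  # 44 -- these must be equal before data.frame() will accept them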
Re: [R] Using str() in a function.
On Jul 15, 2011, at 1:31 PM, andrewH wrote: Thanks, everybody, this has been very edifying. One last question: It seems that sometimes when a function returns something and you don't assign it, it prints to the console, and sometimes it doesn't. I'm not sure I understand which is which. My best current theory is that, if the function returns NULL, by itself and not as part of some larger object, it does not print it, but non-null values are printed. Is that correct? I think you should start testing your theories: fn <- function() return(NULL) fn() -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Using str() in a function.
Below. -- Bert On Fri, Jul 15, 2011 at 10:31 AM, andrewH wrote: > Thanks, everybody, this has been very edifying. One last question: > > It seems that sometimes when a function returns something and you don't > assign it, it prints to the console, and sometimes it doesn't. I'm not sure > I understand which is which. My best current theory is that, if the function > returns NULL, by itself and not as part of some larger object, it does not > print it, but non-null values are printed. Is that correct? -- No. It depends on whether the function uses invisible() in the return, ?invisible If invisible() is not used and the value is not assigned, it's printed. Otherwise not.cf: f <- function()NULL g <- function()invisible(NULL) f() ## NULL is printed g() ## nothing printed z1 <- f() ## nothing printed z2 <- g() ## nothing printed z1 ## NULL z2 ##NULL Cheers, Bert > > Thanks! Andrew > > > -- > View this message in context: > http://r.789695.n4.nabble.com/Using-str-in-a-function-tp3655785p3670513.html > Sent from the R help mailing list archive at Nabble.com. > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- "Men by nature long to get on to the ultimate truths, and will often be impatient with elementary studies or fight shy of them. If it were possible to reach the ultimate truths without the elementary studies usually prefixed to them, these would not be preparatory studies but superfluous diversions." -- Maimonides (1135-1204) Bert Gunter Genentech Nonclinical Biostatistics __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Adding rows based on column value
Dear all, I have one problem and did not find any solution. I have attached the question in a text file as well, because sometimes the spacing is not good in mail. I have a file (file.txt) attached to this mail. I am reading it using this code to make a data frame (file)-

file=read.table("file.txt",fill=T,colClasses = "character",header=T)

file looks like this-

Chr  Pos        CaseA  CaseC  CaseG  CaseT
10   135344110   0.00  24.00   0.00   0.00
10   135344110   0.00   0.00  24.00   0.00
10   135344110   0.00   0.00  24.00   0.00
10   135344113   0.00   0.00  24.00   0.00
10   135344114  24.00   0.00   0.00   0.00
10   135344114  24.00   0.00   0.00   0.00
10   135344116   0.00   0.00   0.00  24.00
10   135344118   0.00  24.00   0.00   0.00
10   135344118   0.00   0.00   0.00  24.00
10   135344122  24.00   0.00   0.00   0.00
10   135344122   0.00  24.00   0.00   0.00
10   135344123   0.00  24.00   0.00   0.00
10   135344123   0.00  24.00   0.00   0.00
10   135344123   0.00   0.00   0.00  24.00
10   135344126   0.00   0.00  24.00   0.00

Now some of the values in the column Pos are the same. For these same positions I want to add the values of columns 3:6. I will explain with an example- the output for the first position should be-

Chr  Pos        CaseA  CaseC  CaseG  CaseT
10   135344110   0.00  24.00  48.00   0.00

because the first three rows have the same value in the Pos column. So the whole output for the above input should be-

Chr  Pos        CaseA  CaseC  CaseG  CaseT
10   135344110   0.00  24.00  48.00   0.00
10   135344113   0.00   0.00  24.00   0.00
10   135344114  48.00   0.00   0.00   0.00
10   135344116   0.00   0.00   0.00  24.00
10   135344118   0.00  24.00   0.00  24.00
10   135344122  24.00  24.00   0.00   0.00
10   135344123   0.00  48.00   0.00  24.00
10   135344126   0.00   0.00  24.00   0.00

Can you please help me.

Thanking you, Warm Regards Vikas Bansal Msc Bioinformatics Kings College London
Re: [R] Odd behaviour of as.POSIXct
On 15/07/2011 12:15 PM, Johannes Egner wrote: Dear all, how come the first loop in the below fails, but the second performs as expected? days<- as.Date( c("2000-01-01", "2000-01-02") ) for(day in days) { as.POSIXct(day) } "day" in the loop above is an integer without a class, it's not a Date. If you did for (day in days) { class(day) <- class(days) print(as.POSIXct(day)) } you won't get an error. (I don't know if you'll be happy with what you get; the time zone is an issue.) Duncan Murdoch for( n in 1:length(days) ) { show(as.POSIXct(days[n])) } Many thanks, Jo [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] transforming year.weeknumber into dates
Hello! I know how to transform dates into year.weeknumber format using zoo: library(zoo) as.numeric(format(as.Date("2010-10-02"), "%Y.%W")) But is there a straightforward way to do the opposite - to transform character strings like "2009.12" or "2009.30" back into dates (assuming that weeks start on Monday)? Thanks a lot! -- Dimitri Liakhovitski __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Export Unicode characters from R
On 15/07/2011 1:42 PM, Sverre Stausland wrote: >>> >>> >funny.g<- "\u1E21" >>> >funny.g >> >> [1] "ḡ" >> >>> >data.frame (funny.g) ->funny.g >>> >funny.g$funny.g >> >> [1] ḡ >> Levels: > > I think the problem is in the data.frame code, not in writing. Data.frames > try to display things in a readable way, and since you're on Windows where > UTF-8 is not really supported, the code helpfully changes that character to > the "" string. for display. I thought the data.frame function didn't alter the unicode coding, since funny.g$funny.g above still displays the right unicode character (although it does list the levels as). > You should be able to write the Unicode character to file if you use lower > level methods such as cat(), on a connection opened using the file() > function with the encoding set explicitly. I'm sorry, but I don't understand what it means "to use cat() on a connection opened using the file() function". Could you please clarify that? Sorry, I think my suggestion was wrong. What I meant was something like file <- file("your filename", encoding="UTF-8") cat("\u1E21", file= file) but this doesn't appear to work. Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Using str() in a function.
On 15/07/2011 1:44 PM, Bert Gunter wrote: Below. -- Bert On Fri, Jul 15, 2011 at 10:31 AM, andrewH wrote: > Thanks, everybody, this has been very edifying. One last question: > > It seems that sometimes when a function returns something and you don't > assign it, it prints to the console, and sometimes it doesn't. I'm not sure > I understand which is which. My best current theory is that, if the function > returns NULL, by itself and not as part of some larger object, it does not > print it, but non-null values are printed. Is that correct? -- No. It depends on whether the function uses invisible() in the return, ?invisible If invisible() is not used and the value is not assigned, it's printed. Otherwise not.cf: f<- function()NULL g<- function()invisible(NULL) f() ## NULL is printed g() ## nothing printed z1<- f() ## nothing printed z2<- g() ## nothing printed z1 ## NULL z2 ##NULL Right. And what invisible() does is set a flag so that the console is told "don't print this". You can see the flag if you use the withVisible() function. For example, with Bert's definitions, > withVisible(f()) $value NULL $visible [1] TRUE > withVisible(g()) $value NULL $visible [1] FALSE Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Querying RData Files, SQL style?
Hi: If you load the data into R, there is a package called sqldf that allows one to apply SQL syntax to an R data frame. Is that what you had in mind? If so, Google 'sqldf R' and you should get a pointer to its home page. HTH, Dennis On Fri, Jul 15, 2011 at 7:29 AM, Bhushan, Vipul wrote: > Hello. Is there a package or functionality available somewhere which will > allow for complex searches (such as what SQL can do) of collections of RData > files? Search capability within a given RData file at a time (which could be > put in a loop) would be good, but the capability to perform joins to data > across multiple RData files would be great. These queries might be ad-hoc, so > writing an R program to get(load(...)) each file and customize the search in > home-grown R code isn't feasible. > > This shouldn't be dependent on environment details, but just in case: I'm > running version 2.13.0 in a Unix environment (but could easily run in Windows > too). > > Thanks very much. > > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
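A minimal sqldf sketch, assuming the .RData files have already been load()ed into the session and contain data frames; the object and column names below (df1, df2, id, x, y) are purely illustrative:

library(sqldf)
sqldf("select a.id, a.x, b.y
       from df1 a inner join df2 b on a.id = b.id
       where a.x > 10")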
Re: [R] Writing Complex Formulas
warmstron1 wrote: > >> for(j in 1:J) > + { > + Z <- dummy**B[j] > + U <- (-dummy+1)**B[j] > + } >> Z > I replaced ** with ^ and got the same results as you. But why are you doing a for loop here? At each iteration you are overwriting the previous results of Z and U and retaining only the values obtained for j=J. You could just as well do Z <- dummy^B[J]# J and not j U <- (-dummy+1)^B[J] # same I got the same results as you. >> for(i in 1:4) > + { > + for(j in 1:3){ > + U[i,j]<-ifelse( U[i,j]>30,30,U[i,j]) > + } > + U <- exp(U) > The second closing } seems to be missing. Why are you using ifelse elementwisely? It is a vectorized function. This is equivalent to what you are doing U <- ifelse(U>30,30,U) and still gives the same results. Berend -- View this message in context: http://r.789695.n4.nabble.com/Writing-Complex-Formulas-tp3638379p3670624.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
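A slightly more direct way to write the capping step described above, as a sketch (pmin() takes element-wise minima, so it has the same effect as the vectorised ifelse()):

U <- pmin(U, 30)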
Re: [R] Adding rows based on column value
Hi: This seems to work: library(plyr) # select the variables to summarize: vars <- paste('Case', c('A', 'C', 'G', 'T'), sep = '') # Alternatively, # vars <- names(df)[grep('Case', names(df))] # One way: the ddply() function in package plyr in # conjunction with the colwise() function > ddply(df, .(Pos), colwise(sum, vars)) Pos CaseA CaseC CaseG CaseT 1 135344110 02448 0 2 135344113 0 024 0 3 13534411448 0 0 0 4 135344116 0 0 024 5 135344118 024 024 6 1353441222424 0 0 7 135344123 048 024 8 135344126 0 024 0 The colwise() function applies the same function (here, sum) to each variable in the variable list given by vars. The wrapper function ddply() applies the colwise() function to each subset of the data defined by a unique value of Pos. Another way is to use the aggregate() function from base R. The following code comes from another thread on this list in the past couple of days due to Bill Dunlap. > aggregate(df[vars], by = df['Pos'], FUN = sum) Pos CaseA CaseC CaseG CaseT 1 135344110 02448 0 2 135344113 0 024 0 3 13534411448 0 0 0 4 135344116 0 0 024 5 135344118 024 024 6 1353441222424 0 0 7 135344123 048 024 8 135344126 0 024 0 HTH, Dennis 2011/7/15 Bansal, Vikas : > Dear all, > > I have one problem and did not find any solution. > I have attached the question in text file also because sometimes spacing is > not good in mail. > > I have a file(file.txt) attached with this mail.I am reading it using this > code to make a data frame (file)- > > file=read.table("file.txt",fill=T,colClasses = "character",header=T) > > file looks like this- > > Chr Pos CaseA CaseC CaseG CaseT > 10 135344110 0.00 24.00 0.00 0.00 > 10 135344110 0.00 0.00 24.00 0.00 > 10 135344110 0.00 0.00 24.00 0.00 > 10 135344113 0.00 0.00 24.00 0.00 > 10 135344114 24.00 0.00 0.00 0.00 > 10 135344114 24.00 0.00 0.00 0.00 > 10 135344116 0.00 0.00 0.00 24.00 > 10 135344118 0.00 24.00 0.00 0.00 > 10 135344118 0.00 0.00 0.00 24.00 > 10 135344122 24.00 0.00 0.00 0.00 > 10 135344122 0.00 24.00 0.00 0.00 > 10 135344123 0.00 24.00 0.00 0.00 > 10 135344123 0.00 24.00 0.00 0.00 > 10 135344123 0.00 0.00 0.00 24.00 > 10 135344126 0.00 0.00 24.00 0.00 > > Now some of the values in column Pos are same.For these same positions i want > to add the values of columns 3:6 > I will explain with an example- > The output of first row should be- > > Chr Pos CaseA CaseC CaseG CaseT > 10 135344110 0.00 24.00 48.00 0.00 > > because first three rows have same value in Pos column. > > so the whole output for above input should be- > > Chr Pos CaseA CaseC CaseG CaseT > 10 135344110 0.00 24.00 48.00 0.00 > 10 135344113 0.00 0.00 24.00 0.00 > 10 135344114 48.00 0.00 0.00 0.00 > 10 135344116 0.00 0.00 0.00 24.00 > 10 135344118 0.00 24.00 0.00 24.00 > 10 135344122 24.00 24.00 0.00 0.00 > 10 135344123 0.00 48.00 0.00 24.00 > 10 135344126 0.00 0.00 24.00 0.00 > > Can you please help me. > > > Thanking you, > Warm Regards > Vikas Bansal > Msc Bioinformatics > Kings College London > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
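One more base R option, as a sketch, assuming the Case columns of df are numeric: rowsum() sums the rows of a data frame (or matrix) by a grouping vector. Note that it returns the Pos values as row names rather than as a column.

rowsum(df[vars], group = df$Pos)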
Re: [R] Export Unicode characters from R
On 15/07/2011 1:42 PM, Sverre Stausland wrote: >>> >>> >funny.g<- "\u1E21" >>> >funny.g >> >> [1] "ḡ" >> >>> >data.frame (funny.g) ->funny.g >>> >funny.g$funny.g >> >> [1] ḡ >> Levels: > > I think the problem is in the data.frame code, not in writing. Data.frames > try to display things in a readable way, and since you're on Windows where > UTF-8 is not really supported, the code helpfully changes that character to > the "" string. for display. I thought the data.frame function didn't alter the unicode coding, since funny.g$funny.g above still displays the right unicode character (although it does list the levels as). > You should be able to write the Unicode character to file if you use lower > level methods such as cat(), on a connection opened using the file() > function with the encoding set explicitly. I'm sorry, but I don't understand what it means "to use cat() on a connection opened using the file() function". Could you please clarify that? I just checked on how R does it. We use UTF-8 encodings in the help pages, regardless of what kind of system you're running on. It converts the strings to UTF-8 internally first (your funny.g is already encoded that way; see Encoding(funny.g)) then uses writeLines( ..., useBytes=TRUE) to write it. The useBytes argument says not to try to make the file readable on the local system, just write out the bytes. Another way to do it is to get your strings in the UTF-8 encoding, convert them to raw vectors, and use writeBin() to write those out. For example, funny.g<- "\u1E21" rawstuff<- charToRaw(funny.g) writeBin(rawstuff, "funny.g.txt") All of this appears hard, because you're thinking of UTF-8 as text, but on Windows, R thinks of it as a binary encoding. Modern Windows systems can handle UTF-8, but not all programs on them can. Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
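To check the round trip, a sketch: read the bytes back while declaring the encoding (expect a warning about an incomplete final line, since writeBin() did not write an end-of-line):

readLines("funny.g.txt", encoding = "UTF-8")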
Re: [R] Export Unicode characters from R
>>> >>> > funny.g<- "\u1E21" >>> > funny.g >> >> [1] "ḡ" >> >>> > data.frame (funny.g) -> funny.g >>> > funny.g$funny.g >> >> [1] ḡ >> Levels: > > I think the problem is in the data.frame code, not in writing. Data.frames > try to display things in a readable way, and since you're on Windows where > UTF-8 is not really supported, the code helpfully changes that character to > the "" string. for display. I thought the data.frame function didn't alter the unicode coding, since funny.g$funny.g above still displays the right unicode character (although it does list the levels as ). > You should be able to write the Unicode character to file if you use lower > level methods such as cat(), on a connection opened using the file() > function with the encoding set explicitly. I'm sorry, but I don't understand what it means "to use cat() on a connection opened using the file() function". Could you please clarify that? Thanks Sverre __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Querying RData Files, SQL style?
Thanks very much for your response. This sqldf package looks promising. I just need to figure out if a dbms needs to be running/installed in our environment (to hold the temporary SQLite DB it creates). The examples in the documentation are helpful too. -Original Message- From: Dennis Murphy [mailto:djmu...@gmail.com] Sent: Friday, July 15, 2011 2:10 PM To: Bhushan, Vipul Cc: r-help@r-project.org Subject: Re: [R] Querying RData Files, SQL style? Hi: If you load the data into R, there is a package called sqldf that allows one to apply SQL syntax to an R data frame. Is that what you had in mind? If so, Google 'sqldf R' and you should get a pointer to its home page. HTH, Dennis On Fri, Jul 15, 2011 at 7:29 AM, Bhushan, Vipul wrote: > Hello. Is there a package or functionality available somewhere which will > allow for complex searches (such as what SQL can do) of collections of RData > files? Search capability within a given RData file at a time (which could be > put in a loop) would be good, but the capability to perform joins to data > across multiple RData files would be great. These queries might be ad-hoc, so > writing an R program to get(load(...)) each file and customize the search in > home-grown R code isn't feasible. > > This shouldn't be dependent on environment details, but just in case: I'm > running version 2.13.0 in a Unix environment (but could easily run in Windows > too). > > Thanks very much. > > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] running previous versions of R
I'm having problems trying to get an older version of R (2.9.2) running in a Linux terminal. I have both R 2.9.2 and 2.12 installed and typing 'R' into the terminal results in version 2.12 running. I am trying to use a program that requires version 2.4 or greater, but will not run on version 2.10 or higher. Anyone have an idea of what to do? Thanks! -- View this message in context: http://r.789695.n4.nabble.com/running-previous-versions-of-R-tp3670587p3670587.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Executing a function correctly
Marc, Many thanks. -- View this message in context: http://r.789695.n4.nabble.com/Executing-a-function-correctly-tp3665765p3670602.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Querying RData Files, SQL style?
Thank you for your response. To clarify, I don't mind if R loads the data (in the background), but was hoping to have to only specify the query as a simple request and the list of input files. I'd like to do this relatively efficiently, so searching across ~100 RData files (10 to 100 KB each) only takes many seconds and not lots of minutes or hours. -Original Message- From: David Winsemius [mailto:dwinsem...@comcast.net] Sent: Friday, July 15, 2011 11:42 AM To: Bhushan, Vipul Cc: r-help@r-project.org Subject: Re: [R] Querying RData Files, SQL style? On Jul 15, 2011, at 10:29 AM, Bhushan, Vipul wrote: > Hello. Is there a package or functionality available somewhere which > will allow for complex searches (such as what SQL can do) of > collections of RData files? Search capability within a given RData > file at a time (which could be put in a loop) would be good, but the > capability to perform joins to data across multiple RData files > would be great. These queries might be ad-hoc, so writing an R > program to get(load(...)) each file and customize the search in home- > grown R code isn't feasible. As I read the question it appears that your are not expecting to load the data into R and are rather asking for a program other than R (or Rscript or littler) to read .Rdata files and perform database joins. As I understand it, that is not available. As I understand it, there is not even a package that can look at .Rdata files for their object names and structure without actual loading them. Hoping to be corrected on either of these points. > > This shouldn't be dependent on environment details, but just in > case: I'm running version 2.13.0 in a Unix environment (but could > easily run in Windows too). > > Thanks very much. -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Querying RData Files, SQL style?
My understanding is that sqldf works in conjunction with the sqlite and H2 DBMSs. You should be able to verify that from the sqldf home page; if I'm wrong, Gabor will quickly correct me :) Dennis On Fri, Jul 15, 2011 at 11:51 AM, Bhushan, Vipul wrote: > Thanks very much for your response. This sqldf package looks promising. I > just need to figure out if a dbms needs to be running/installed in our > environment (to hold the temporary SQLite DB it creates). The examples in the > documentation are helpful too. > > -Original Message- > From: Dennis Murphy [mailto:djmu...@gmail.com] > Sent: Friday, July 15, 2011 2:10 PM > To: Bhushan, Vipul > Cc: r-help@r-project.org > Subject: Re: [R] Querying RData Files, SQL style? > > Hi: > > If you load the data into R, there is a package called sqldf that > allows one to apply SQL syntax to an R data frame. Is that what you > had in mind? If so, Google 'sqldf R' and you should get a pointer to > its home page. > > HTH, > Dennis > > On Fri, Jul 15, 2011 at 7:29 AM, Bhushan, Vipul > wrote: >> Hello. Is there a package or functionality available somewhere which will >> allow for complex searches (such as what SQL can do) of collections of RData >> files? Search capability within a given RData file at a time (which could be >> put in a loop) would be good, but the capability to perform joins to data >> across multiple RData files would be great. These queries might be ad-hoc, >> so writing an R program to get(load(...)) each file and customize the search >> in home-grown R code isn't feasible. >> >> This shouldn't be dependent on environment details, but just in case: I'm >> running version 2.13.0 in a Unix environment (but could easily run in >> Windows too). >> >> Thanks very much. >> >> >> [[alternative HTML version deleted]] >> >> __ >> R-help@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] running previous versions of R
On Jul 15, 2011, at 1:06 PM, jstevens wrote: > I'm having problems trying to get an older version of R (2.9.2) running in a > Linux terminal. I have both R 2.9.2 and 2.12 installed and typing 'R' into > the terminal results in version 2.12 running. I am trying to use a program > that requires version 2.4 or greater, but will not run on version 2.10 or > higher. > > Anyone have an idea of what to do? > > Thanks! How did you install R? Did you build from source or use pre-compiled binaries (eg. RPMs or .debs)? Which Linux distribution are you running? If in fact, you have two versions of R installed, the likelihood is that the 2.12.??? binary is in your $PATH and 2.9.2 is not or is after the former. Hence, the newer version will be found and run. Typically, there is a symlink to the R executable placed in a common location such as /usr/bin or perhaps /usr/local/bin, which is in the default $PATH so that R can be run easily. This may be as simple as knowing where the 2.9.2 installation is located on your HD and running PATH.TO.THE.R.EXECUTABLE/R from the command line. Are you running a CRAN package that has not been updated for more recent versions of R, or are you replicating an analysis that was done some time ago and have to use the same versions? If the former, be sure to contact the package maintainer to request that it be fixed, if they have not already orphaned it. Cheers, Marc Schwartz __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] transforming year.weeknumber into dates
try this: > x <- c('2009.12', '2009.30') > as.Date(paste(x, '1'), format = "%Y.%W %w") [1] "2009-03-23" "2009-07-27" > > On Fri, Jul 15, 2011 at 1:54 PM, Dimitri Liakhovitski wrote: > Hello! > > I know how to transform dates into year.weeknumber format using zoo: > > library(zoo) > as.numeric(format(as.Date("2010-10-02"), "%Y.%W")) > > But is there a straightforward way to do the opposite - to transform > character strings like "2009.12" or "2009.30" back into dates > (assuming that weeks start on Monday)? > Thanks a lot! > > > -- > Dimitri Liakhovitski > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
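A quick round-trip check of the suggestion above, as a sketch: formatting the parsed dates back to year.week should recover the original strings.

format(as.Date(paste(x, '1'), format = "%Y.%W %w"), "%Y.%W")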
[R] Migration Analysis?
Is it possible to do Migration Analysis in R? -- View this message in context: http://r.789695.n4.nabble.com/Migration-Analysis-tp3670866p3670866.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] running previous versions of R
I'm running Ubuntu - natty. Forgive me if I sound lost, its been just over a week since I switched over from Windows. I originally installed 2.12 using the Ubuntu software center, but have now switched to using the terminal. 2.9.2 was installed using a .tar.zip file downloaded from cran. Before I switched over to Linux I had both versions installed in Windows, but I had to switch to Linux because Windows is so limiting on RAM use. Upon further poking around I've figured out how to do it. I found the R executable file in /usr/lib/R-2.9.2/bin, so entering $/usr/lib/R-2.9.2/bin/R starts up version 2.9.2. Thanks for your help. I'm running a CRAN package that has not been updated since 2009, and I'm not rerunning an old analysis. The problem is that the software I'm using for the analysis gives me an error saying that it requires R version 2.4 or higher if I try to run it in version 2.10 or higher. I think someone was a little sloppy with some code so that the program only looks at two integers when verifying the version of R. I've let the author know about it, but the program is no longer in development. -- View this message in context: http://r.789695.n4.nabble.com/running-previous-versions-of-R-tp3670587p3670836.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] barplot question
Hi - I would like to make a barplot of my data, but am having issues. An example of my data is:

species              net           pair  pounds  type
Cod                  Control       1     46      kept
Little Skate         Control       1     0       kept
Summer Flounder      Control       1     9       kept
Windowpane Flounder  Control       1     0       kept
Winter Flounder      Control       1     0       kept
Winter Skate         Control       1     0       kept
Yellowtail Flounder  Control       1     76      kept
Cod                  Experimental  1     19      kept
Little Skate         Experimental  1     0       kept
Summer Flounder      Experimental  1     2       kept
Windowpane Flounder  Experimental  1     0       kept
Winter Flounder      Experimental  1     0       kept
Winter Skate         Experimental  1     0       kept
Yellowtail Flounder  Experimental  1     9       kept
Cod                  Control       1     14      discard
Little Skate         Control       1     75      discard
Summer Flounder      Control       1     1       discard
Windowpane Flounder  Control       1     32      discard
Winter Flounder      Control       1     16      discard
Winter Skate         Control       1     225     discard
Yellowtail Flounder  Control       1     7       discard
Cod                  Experimental  1     7       discard
Little Skate         Experimental  1     64      discard
Summer Flounder      Experimental  1     3       discard
Windowpane Flounder  Experimental  1     26      discard
Winter Flounder      Experimental  1     12      discard
Winter Skate         Experimental  1     136     discard
Yellowtail Flounder  Experimental  1     5       discard

I have 9 total pairs. I would like to be able to make a barplot by pair that shows the catch of the control net (kept & discard) stacked alongside the catch of the experimental net, also stacked by species, like the image below that I made in Excel. http://r.789695.n4.nabble.com/file/n3670861/image.jpg I can make barplots by net and pair, but I would like to have both nets on one barplot if possible. Thanks Sally -- View this message in context: http://r.789695.n4.nabble.com/barplot-question-tp3670861p3670861.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
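A base-graphics sketch for a single pair, assuming the data above live in a data frame called catch (an illustrative name) with the columns shown; this pools kept and discard within species, so use interaction(species, type) on the left-hand side of the formula if you want them as separate segments:

p1  <- subset(catch, pair == 1)
tab <- xtabs(pounds ~ species + net, data = p1)   # species x net matrix of summed pounds
barplot(tab, legend.text = rownames(tab), main = "Pair 1", ylab = "Pounds")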
Re: [R] help! kennard-stone algorithm in soil.spec packages does not work for my dataset!!!
I'm also trying to use the kennard-stone algorithm in the soil.spec package for my dataset, (to generate a training and test set from the data, based on this algorithm, because it's the most commonly used and well-performing algorithm in QSAR studies) but it's generating an error: > ken.sto(mydataIN) Error in ken.sto(mydataIN) : subscript out of bounds My data is a 42 row by 6 column all numerical (except header) matrix of the format: id x1 x2 x3 x4y1 2 66.77.710.079 4.58 3.0792 13 79.79.570.100 4.82 2.8451 5 77.73.100.071 1.42 0.4771 6 82.17.580.071 2.08 0.7160 32 98.85.600.143 3.27 1.7160 36 93.34.740.097 4.16 1.7160 ... I cannot find any documentation for the exact format of the data matrix for this function (http://www.inside-r.org/packages/cran/soil.spec/docs/ken.sto does not have this information). Any help would be appreciated!!! -- View this message in context: http://r.789695.n4.nabble.com/help-kennard-stone-algorithm-in-soil-spec-packages-does-not-work-for-my-dataset-tp3031344p3670857.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] scaling advice
Hi, I have a consultant's nightmare -- I was given a project that another consultant did, and I was told to do the same calculations, but there's no documentation on what he did. Basically, I have yes/no answers to survey questions about the effectiveness of product attributes by brands. There are 44 attributes and 13 brands. The other guy scaled the proportion of respondents who said Yes to have mean 0 and variance 1.0, apparently doing this by brand within each attribute. He then created a matrix of 44 rows for the attributes and 13 columns for the brands. No problem with this; I can always replicate this much. But then he apparently rescaled this 44x13 matrix so that the rows all sum to zero and the columns all sum to zero. None of the row and column standard deviations are 1.0. This I can't see how to do. How can I rescale the rows and columns so that they all sum to zero? Any suggestions? Thanks, Walt Walter R. Paczkowski, Ph.D. Data Analytics Corp. 44 Hamilton Lane Plainsboro, NJ 08536 (V) 609-936-8999 (F) 609-936-3733 w...@dataanalyticscorp.com www.dataanalyticscorp.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
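A sketch of one way to get both row and column sums to zero ("double-centring"), assuming M is the 44 x 13 matrix of scaled proportions:

M2 <- sweep(M,  1, rowMeans(M))   # subtract each row's mean: row sums become 0
M2 <- sweep(M2, 2, colMeans(M2))  # subtract column means: column sums become 0,
                                  # and row sums stay 0 because the column means
                                  # of a row-centred matrix themselves sum to 0
range(rowSums(M2)); range(colSums(M2))   # both should be ~ 0 up to rounding error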
[R] Calculate Az (A sub z) with R?
dd2es virginia.edu> writes: > > I am looking for (or interested in writing) a function that calculates Az, > an alternative measure of discriminability from SDT (alternative to d' and A'). > I have written my own functions for d', A', B"d, and am aware of the 'sdtalt' > package, but I have yet to find a way to calculate Az, since it requires the > phi operator. The Phi --- not Psi, see the paper --- function is simply the cumulative normal distribution, so you can use pnorm() instead. -- Hans Werner > For a relevant paper/discussion (and formula), please see Verde and > Macmillan, 2006 (Measures of sensitivity based on a single hit rate and > false alarm rate: The accuracy, precision, and robustness of d', Az, and A') > > Any help on this would be greatly appreciated! > > David Dobolyi > Graduate Student > Cognitive Psychology > University of Virginia > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
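A hedged sketch for the equal-variance case, where Az = pnorm(d'/sqrt(2)); the unequal-variance form discussed in the paper uses d_a in place of d'. The function name and arguments here are only illustrative:

Az <- function(hit, fa) {
  dprime <- qnorm(hit) - qnorm(fa)   # equal-variance d'
  pnorm(dprime / sqrt(2))
}
Az(hit = 0.8, fa = 0.2)   # roughly 0.88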
[R] Add permanently environment variable
Hello everyone, I know how to add a folder to my PATH environment variable, but it only works for the current R session. Is there a way to add it permanently? Here is my code: Sys.setenv(PATH=paste("C:\\Program Files\\Java\\jre1.6.0_13\\bin;", Sys.getenv(x="PATH"), sep="")) Thanks a lot! -- View this message in context: http://r.789695.n4.nabble.com/Add-permanently-environment-variable-tp3670920p3670920.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
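One common approach, as a sketch: R itself cannot change the system-wide PATH permanently, but you can put the Sys.setenv() call in a startup file such as ~/.Rprofile (or R_HOME/etc/Rprofile.site) so that it runs in every new session. The Java path below is simply the one from the example above:

Sys.setenv(PATH = paste("C:\\Program Files\\Java\\jre1.6.0_13\\bin",
                        Sys.getenv("PATH"), sep = ";"))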
Re: [R] running previous versions of R
On Jul 15, 2011, at 2:58 PM, jstevens wrote: > I'm running Ubuntu - natty. Forgive me if I sound lost, its been just over a > week since I switched over from Windows. I originally installed 2.12 using > the Ubuntu software center, but have now switched to using the terminal. > 2.9.2 was installed using a .tar.zip file downloaded from cran. Before I > switched over to Linux I had both versions installed in Windows, but I had > to switch to Linux because Windows is so limiting on RAM use. > > > Upon further poking around I've figured out how to do it. I found the R > executable file in /usr/lib/R-2.9.2/bin, so entering $/usr/lib/R-2.9.2/bin/R > starts up version 2.9.2. Thanks for your help. > > > I'm running a CRAN package that has not been updated since 2009, and I'm not > rerunning an old analysis. The problem is that the software I'm using for > the analysis gives me an error saying that it requires R version 2.4 or > higher if I try to run it in version 2.10 or higher. I think someone was a > little sloppy with some code so that the program only looks at two integers > when verifying the version of R. I've let the author know about it, but the > program is no longer in development. Hi, I am presuming that you obtained the 2.9.2 source tarball for R from CRAN and then compiled it? Otherwise, you would only have the source files from the extracted archive and R should not otherwise run. There are not pre-compiled tar files of binaries for R on CRAN for Linux. A couple of other comments: 1. There is a SIG e-mail list for R on Debian based distros, of which Ubuntu is one. Info here: https://stat.ethz.ch/mailman/listinfo/r-sig-debian I would recommend posting any technical questions related to using R on Ubuntu there. Also, a good *recent* intro book on Ubuntu would not be a bad idea. I was in the same boat about 10 years ago, when I made the transition from Windows to Linux (Red Hat then Fedora), though I have been running OSX for the past two years or so. 2. There are 64 bit versions of R for Windows now, if you have a 64 bit version of Windows running. See the R FAQ for Windows for more info. A final comment, which is that 2.13.1 is the current version of R, so you should look to upgrade from 2.12.??. Regards, Marc __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Adding rows based on column value
I have tried the aggregate command but it shows this error- vars <- paste('Case', c('A', 'C', 'G', 'T'), sep = '') > vars [1] "CaseA" "CaseC" "CaseG" "CaseT" > aggregate(file[vars], by = df['Pos'], FUN = sum) Error in aggregate.data.frame(file[vars], by = df["Pos"], FUN = sum) : arguments must have same length the thing is I cant use the plyr because I want the coding so that I can use it to make a tool. Can you please tell me why aggregate function is showing this error.I am confused. Thanking you, Warm Regards Vikas Bansal Msc Bioinformatics Kings College London From: Dennis Murphy [djmu...@gmail.com] Sent: Friday, July 15, 2011 7:38 PM To: Bansal, Vikas Cc: r-help@r-project.org Subject: Re: [R] Adding rows based on column value Hi: This seems to work: library(plyr) # select the variables to summarize: vars <- paste('Case', c('A', 'C', 'G', 'T'), sep = '') # Alternatively, # vars <- names(df)[grep('Case', names(df))] # One way: the ddply() function in package plyr in # conjunction with the colwise() function > ddply(df, .(Pos), colwise(sum, vars)) Pos CaseA CaseC CaseG CaseT 1 135344110 02448 0 2 135344113 0 024 0 3 13534411448 0 0 0 4 135344116 0 0 024 5 135344118 024 024 6 1353441222424 0 0 7 135344123 048 024 8 135344126 0 024 0 The colwise() function applies the same function (here, sum) to each variable in the variable list given by vars. The wrapper function ddply() applies the colwise() function to each subset of the data defined by a unique value of Pos. Another way is to use the aggregate() function from base R. The following code comes from another thread on this list in the past couple of days due to Bill Dunlap. > aggregate(df[vars], by = df['Pos'], FUN = sum) Pos CaseA CaseC CaseG CaseT 1 135344110 02448 0 2 135344113 0 024 0 3 13534411448 0 0 0 4 135344116 0 0 024 5 135344118 024 024 6 1353441222424 0 0 7 135344123 048 024 8 135344126 0 024 0 HTH, Dennis 2011/7/15 Bansal, Vikas : > Dear all, > > I have one problem and did not find any solution. > I have attached the question in text file also because sometimes spacing is > not good in mail. > > I have a file(file.txt) attached with this mail.I am reading it using this > code to make a data frame (file)- > > file=read.table("file.txt",fill=T,colClasses = "character",header=T) > > file looks like this- > > Chr PosCaseA CaseCCaseG CaseT > 10 135344110 0.00 24.00 0.00 0.00 > 10 135344110 0.00 0.00 24.00 0.00 > 10 135344110 0.00 0.00 24.00 0.00 > 10 135344113 0.00 0.00 24.00 0.00 > 10 135344114 24.00 0.00 0.00 0.00 > 10 135344114 24.00 0.00 0.00 0.00 > 10 135344116 0.00 0.00 0.00 24.00 > 10 135344118 0.00 24.00 0.00 0.00 > 10 135344118 0.00 0.00 0.00 24.00 > 10 135344122 24.00 0.00 0.00 0.00 > 10 135344122 0.00 24.00 0.00 0.00 > 10 135344123 0.00 24.00 0.00 0.00 > 10 135344123 0.00 24.00 0.00 0.00 > 10 135344123 0.00 0.00 0.00 24.00 > 10 135344126 0.00 0.00 24.00 0.00 > > Now some of the values in column Pos are same.For these same positions i want > to add the values of columns 3:6 > I will explain with an example- > The output of first row should be- > > Chr Pos CaseA CaseC CaseG CaseT > 10 135344110 0.00 24.00 48.00 0.00 > > because first three rows have same value in Pos column. 
>
> so the whole output for above input should be:
>
> Chr        Pos  CaseA  CaseC  CaseG  CaseT
> 10  135344110   0.00  24.00  48.00   0.00
> 10  135344113   0.00   0.00  24.00   0.00
> 10  135344114  48.00   0.00   0.00   0.00
> 10  135344116   0.00   0.00   0.00  24.00
> 10  135344118   0.00  24.00   0.00  24.00
> 10  135344122  24.00  24.00   0.00   0.00
> 10  135344123   0.00  48.00   0.00  24.00
> 10  135344126   0.00   0.00  24.00   0.00
>
> Can you please help me.
>
> Thanking you,
> Warm Regards
> Vikas Bansal
> Msc Bioinformatics
> Kings College London
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__ R-help@r-project.org mailin
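One note on the error quoted at the top of that message, offered as a guess since only fragments of the session are visible: the call mixes two objects, file[vars] for the columns to sum and df['Pos'] for the grouping variable. aggregate() needs both arguments to describe the same rows, so if df is a different (or differently sized) object from file, "arguments must have same length" is exactly the complaint you would get. A sketch of the call drawing everything from the one data frame read from file.txt:

# take both the summed columns and the grouping column from the same object
aggregate(file[vars], by = file['Pos'], FUN = sum)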
[R] Convert continuous variable into discrete variable
Dear all, I have a continuous variable that can take on values between 0 and 100, for example: x<-runif(100,0,100) I also have a second variable that defines a series of thresholds, for example: y<-c(3, 4.5, 6, 8) I would like to convert my continuous variable into a discrete one using the threshold variables: If x is between 0 and 3 the discrete variable should be 1 If x is between 3 and 4.5 the discrete variable should be 2 If x is between 4.5 and 6 the discrete variable should be 3 If x is between 6 and 8 the discrete variable should be 4 If x is larger than 8 the discrete variable should be 5 Is there a straightforward way of doing this (besides working with several if statements in a row)? Thanks, Michael [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Adding rows based on column value
I have tried the aggregate command but it shows this error:

vars <- paste('Case', c('A', 'C', 'G', 'T'), sep = '')
> vars
[1] "CaseA" "CaseC" "CaseG" "CaseT"
> aggregate(file[vars], by = file['Pos'], FUN = sum)
Error in FUN(X[[1L]], ...) : invalid 'type' (character) of argument

The thing is, I can't use plyr because I want the code so that I can use it to make a tool. Can you please tell me why the aggregate function is showing this error? I am confused. Thanking you, Warm Regards Vikas Bansal Msc Bioinformatics Kings College London

From: Dennis Murphy [djmu...@gmail.com] Sent: Friday, July 15, 2011 7:38 PM To: Bansal, Vikas Cc: r-help@r-project.org Subject: Re: [R] Adding rows based on column value

Hi: This seems to work:

library(plyr)
# select the variables to summarize:
vars <- paste('Case', c('A', 'C', 'G', 'T'), sep = '')
# Alternatively,
# vars <- names(df)[grep('Case', names(df))]

# One way: the ddply() function in package plyr in
# conjunction with the colwise() function
> ddply(df, .(Pos), colwise(sum, vars))
        Pos CaseA CaseC CaseG CaseT
1 135344110     0    24    48     0
2 135344113     0     0    24     0
3 135344114    48     0     0     0
4 135344116     0     0     0    24
5 135344118     0    24     0    24
6 135344122    24    24     0     0
7 135344123     0    48     0    24
8 135344126     0     0    24     0

The colwise() function applies the same function (here, sum) to each variable in the variable list given by vars. The wrapper function ddply() applies the colwise() function to each subset of the data defined by a unique value of Pos. Another way is to use the aggregate() function from base R. The following code comes from another thread on this list in the past couple of days, due to Bill Dunlap.

> aggregate(df[vars], by = df['Pos'], FUN = sum)
        Pos CaseA CaseC CaseG CaseT
1 135344110     0    24    48     0
2 135344113     0     0    24     0
3 135344114    48     0     0     0
4 135344116     0     0     0    24
5 135344118     0    24     0    24
6 135344122    24    24     0     0
7 135344123     0    48     0    24
8 135344126     0     0    24     0

HTH, Dennis

2011/7/15 Bansal, Vikas :
> Dear all,
>
> I have one problem and did not find any solution. I have attached the question in a text file also because sometimes spacing is not good in mail.
>
> I have a file (file.txt) attached with this mail. I am reading it using this code to make a data frame (file):
>
> file=read.table("file.txt",fill=T,colClasses = "character",header=T)
>
> file looks like this:
>
> Chr        Pos  CaseA  CaseC  CaseG  CaseT
> 10  135344110   0.00  24.00   0.00   0.00
> 10  135344110   0.00   0.00  24.00   0.00
> 10  135344110   0.00   0.00  24.00   0.00
> 10  135344113   0.00   0.00  24.00   0.00
> 10  135344114  24.00   0.00   0.00   0.00
> 10  135344114  24.00   0.00   0.00   0.00
> 10  135344116   0.00   0.00   0.00  24.00
> 10  135344118   0.00  24.00   0.00   0.00
> 10  135344118   0.00   0.00   0.00  24.00
> 10  135344122  24.00   0.00   0.00   0.00
> 10  135344122   0.00  24.00   0.00   0.00
> 10  135344123   0.00  24.00   0.00   0.00
> 10  135344123   0.00  24.00   0.00   0.00
> 10  135344123   0.00   0.00   0.00  24.00
> 10  135344126   0.00   0.00  24.00   0.00
>
> Now some of the values in the column Pos are the same. For these same positions I want to add the values of columns 3:6. I will explain with an example. The output of the first row should be:
>
> Chr        Pos  CaseA  CaseC  CaseG  CaseT
> 10  135344110   0.00  24.00  48.00   0.00
>
> because the first three rows have the same value in the Pos column.
>
> so the whole output for above input should be:
>
> Chr        Pos  CaseA  CaseC  CaseG  CaseT
> 10  135344110   0.00  24.00  48.00   0.00
> 10  135344113   0.00   0.00  24.00   0.00
> 10  135344114  48.00   0.00   0.00   0.00
> 10  135344116   0.00   0.00   0.00  24.00
> 10  135344118   0.00  24.00   0.00  24.00
> 10  135344122  24.00  24.00   0.00   0.00
> 10  135344123   0.00  48.00   0.00  24.00
> 10  135344126   0.00   0.00  24.00   0.00
>
> Can you please help me.
>
> Thanking you,
> Warm Regards
> Vikas Bansal
> Msc Bioinformatics
> Kings College London
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailma
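This second error has a clearer cause: the file was read with colClasses = "character", so the CaseA...CaseT columns are character vectors and sum() refuses them. Converting those columns to numeric first (or declaring proper column classes at read time) should let the aggregate() call go through; a sketch, assuming the column layout shown in the quoted post:

# convert the count columns to numeric, then aggregate as before
file[vars] <- lapply(file[vars], as.numeric)
aggregate(file[vars], by = file['Pos'], FUN = sum)

# or fix it at read time: two id columns plus four numeric count columns
# file <- read.table("file.txt", header = TRUE, fill = TRUE,
#                    colClasses = c("character", "character", rep("numeric", 4)))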
Re: [R] Export Unicode characters from R
Hi, I'm interested in the suggestion to use writeLines( ..., useBytes=TRUE), but how can I use this function on the way to exporting from R? Could you please provide a simple example? The following suggestion worked very well: > funny.g<- "\u1E21" > rawstuff<- charToRaw(funny.g) > writeBin(rawstuff, "funny.g.txt") But the function charToRaw() only allows an object with a single character, and writeBin cannot be used to export data frames. Is there any solution along these lines when I have a data frame with Unicode characters? Best Sverre On Fri, Jul 15, 2011 at 2:38 PM, Duncan Murdoch wrote: > On 15/07/2011 1:42 PM, Sverre Stausland wrote: >> >> >>> >> >>> > funny.g<- "\u1E21" >> >>> > funny.g >> >> >> >> [1] "ḡ" >> >> >> >>> > data.frame (funny.g) -> funny.g >> >>> > funny.g$funny.g >> >> >> >> [1] ḡ >> >> Levels: >> > >> > I think the problem is in the data.frame code, not in writing. >> > Data.frames >> > try to display things in a readable way, and since you're on Windows >> > where >> > UTF-8 is not really supported, the code helpfully changes that >> > character to >> > the "" string. for display. >> >> I thought the data.frame function didn't alter the unicode coding, >> since funny.g$funny.g above still displays the right unicode character >> (although it does list the levels as). >> >> > You should be able to write the Unicode character to file if you use >> > lower >> > level methods such as cat(), on a connection opened using the file() >> > function with the encoding set explicitly. >> >> I'm sorry, but I don't understand what it means "to use cat() on a >> connection opened using the file() function". Could you please clarify >> that? >> > > I just checked on how R does it. We use UTF-8 encodings in the help pages, > regardless of what kind of system you're running on. > > It converts the strings to UTF-8 internally first (your funny.g is already > encoded that way; see Encoding(funny.g)) then uses > > writeLines( ..., useBytes=TRUE) > > to write it. The useBytes argument says not to try to make the file > readable on the local system, just write out the bytes. > > Another way to do it is to get your strings in the UTF-8 encoding, convert > them to raw vectors, and use writeBin() to write those out. For example, > > funny.g<- "\u1E21" > rawstuff<- charToRaw(funny.g) > writeBin(rawstuff, "funny.g.txt") > > > All of this appears hard, because you're thinking of UTF-8 as text, but on > Windows, R thinks of it as a binary encoding. Modern Windows systems can > handle UTF-8, but not all programs on them can. > > Duncan Murdoch > > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
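Following up on the data-frame question with a sketch rather than a definitive recipe (the data frame, column names and file name below are invented): one way to extend the writeLines()/useBytes idea to a whole data frame is to build the lines yourself, force them to UTF-8, and push them through a connection opened in binary mode so nothing gets re-encoded on the way out.

df <- data.frame(g = "\u1E21", x = 1:2, stringsAsFactors = FALSE)

# header plus tab-separated rows, all converted/declared as UTF-8
lines <- c(paste(names(df), collapse = "\t"),
           do.call(paste, c(lapply(df, as.character), sep = "\t")))
lines <- enc2utf8(lines)

con <- file("funny_df.txt", open = "wb")   # binary mode: bytes are written untouched
writeLines(lines, con, useBytes = TRUE)
close(con)

write.table(df, "funny_df.txt", fileEncoding = "UTF-8") may also be worth trying, but the byte-level route above stays closest to the approach described in the reply.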
Re: [R] scaling advice
On Jul 15, 2011, at 23:05 , Data Analytics Corp. wrote: > Hi, > > I have a consultant's nightmare -- I was given a project that another > consultant did and I was told to do the same calculations, but there's no > documentation on what he did. Basically, I have yes/no answers to survey > questions about the effectiveness of product attributes by brands. There are > 44 attributes and 13 brands. The other guy scaled the proportion of > respondents who said Yes to be mean 0 and variance 1.0, apparently doing this > by brand within each attribute. He then created a matrix of 44 rows for the > attributes and 13 columns for the brands. No problem with this; I can always > replicate this much. But then he apparently rescaled this 44x13 matrix so > that the rows all sum to zero and the columns all sum to zero. None of the > row and column standard deviations are 1.0. This I can't see how to do. How > can I rescale the rows and columns so that they all sum to zero? Any > suggestions? > If the _sum_ is zero, there must be both negative and positive elements, so it can't be a pure scaling. sweep()'ing out the row and column means would be the first thing to come to my mind. -- Peter Dalgaard Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
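A small sketch of the sweep() idea with a stand-in matrix (4 x 3 here instead of 44 x 13): remove the row means first, then the column means of the result. Both the row sums and the column sums of the final matrix are then zero, up to rounding error.

set.seed(1)
A  <- matrix(rnorm(12), nrow = 4)    # stand-in for the 44 x 13 matrix of scaled proportions
A1 <- sweep(A,  1, rowMeans(A))      # subtract each row's mean
A2 <- sweep(A1, 2, colMeans(A1))     # then subtract each column's mean of the result
round(rowSums(A2), 12)               # all (numerically) zero
round(colSums(A2), 12)               # all (numerically) zero

Whether this reproduces the other consultant's numbers is of course a separate question; it only demonstrates one transformation with the stated zero-sum property.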
Re: [R] Splitting one column value into multiple rows
Hi, This works when I have a few lines and I type those input lines into the R window. But I want to apply this function to a variable which is part of a dataset, and the dataset is very large. Any help in this aspect will really help me a lot. Regards, Madana -- View this message in context: http://r.789695.n4.nabble.com/Splitting-one-column-value-into-multiple-rows-tp3668835p3671087.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Convert continuous variable into discrete variable
Hi: x<-runif(100,0,100) u <- cut(x, breaks = c(0, 3, 4.5, 6, 8, Inf), labels = c(1:5)) Based on the x I obtained, > table(u) u 1 2 3 4 5 3 2 1 2 92 cut() or findInterval() are the two basic functions for discretizing a numeric variable. HTH, Dennis On Fri, Jul 15, 2011 at 2:29 PM, Michael Haenlein wrote: > Dear all, > > I have a continuous variable that can take on values between 0 and 100, for > example: x<-runif(100,0,100) > > I also have a second variable that defines a series of thresholds, for > example: y<-c(3, 4.5, 6, 8) > > I would like to convert my continuous variable into a discrete one using the > threshold variables: > > If x is between 0 and 3 the discrete variable should be 1 > If x is between 3 and 4.5 the discrete variable should be 2 > If x is between 4.5 and 6 the discrete variable should be 3 > If x is between 6 and 8 the discrete variable should be 4 > If x is larger than 8 the discrete variable should be 5 > > Is there a straightforward way of doing this (besides working with several > if statements in a row)? > > Thanks, > > Michael > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
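The findInterval() route mentioned at the end does the same job without creating a factor; a small sketch using the x and y from the original post (the set.seed() call is only there to make the example reproducible):

set.seed(42)
x <- runif(100, 0, 100)
y <- c(3, 4.5, 6, 8)

# findInterval() returns 0 for x < 3, 1 for 3 <= x < 4.5, ..., 4 for x >= 8,
# so adding 1 gives the categories 1 to 5 described in the question
disc <- findInterval(x, y) + 1
table(disc)

The boundary convention differs slightly from cut()'s default right-closed intervals, which rarely matters for continuous data.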
Re: [R] scaling advice
On Fri, Jul 15, 2011 at 2:05 PM, Data Analytics Corp. wrote: > But then he apparently rescaled this 44x13 > matrix so that the rows all sum to zero and the columns all sum to zero. > None of the row and column standard deviations are 1.0. This I can't see > how to do. How can I rescale the rows and columns so that they all sum to > zero? Any suggestions? Well, he could have used Gower's centering transformation, described for example in this pdf: www.stat.auckland.ac.nz/~mja/prog/PCO_UserNotes.pdf As you can convince yourself very easily, given any matrix A, the matrix G calculated as on page 3 of the document will have rows and columns that sum to zero. HTH, Peter __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
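The double centering that sits inside that construction can also be written directly with centering matrices; a sketch on a stand-in matrix (Gower's method proper works on a transformed distance matrix, as the linked notes describe, but the zero-sum property comes from this step):

A  <- matrix(1:12, nrow = 4)          # stand-in data, not the actual survey matrix
n  <- nrow(A); m <- ncol(A)
Cn <- diag(n) - 1/n                   # I - (1/n) * ones(n, n)
Cm <- diag(m) - 1/m
G  <- Cn %*% A %*% Cm                 # every row and every column of G sums to zero
round(rowSums(G), 12); round(colSums(G), 12)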
[R] help page becomes unavailable after a package is reinstalled
Hi all, I have noticed this problem ever since R changed its static HTML help pages to dynamic help pages: when I reinstall a package and try to view any help page of this package, I always get this error (in the terminal or html page) Error in fetch(key) : internal error -3 in R_decompress1 As a package developer, I often have to reinstall a package again and again, so I wish I do not have to restart R to see the new documentation. Anybody ever met a similar situation and has an idea? Thanks! I use R 2.13.1 under Ubuntu, and it also appears in Windows 7. Regards, Yihui -- Yihui Xie Phone: 515-294-2465 Web: http://yihui.name Department of Statistics, Iowa State University 2215 Snedecor Hall, Ames, IA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Splitting one column value into multiple rows
On Jul 15, 2011, at 6:05 PM, Madana_Babu wrote: Hi, This works when I have a few lines and I type those input lines into the R window. But I want to apply this function to a variable which is part of a dataset, and the dataset is very large. Any help in this aspect will really help me a lot. Define "very large". And provide machine specifics and the full text of any errors you are encountering. There is no reason you cannot offer a column of an R data.frame to the textConnection function. It will behave exactly like a file. -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
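A hedged sketch of that suggestion, since the function and column from the earlier part of the thread are not shown here (the data frame, column name and ";" separator below are all invented):

big <- data.frame(id   = c("a", "b"),
                  vals = c("1;2;3", "4;5"),
                  stringsAsFactors = FALSE)

# the packed column behaves like a file once wrapped in textConnection()
tc <- textConnection(big$vals)
read.table(tc, sep = ";", fill = TRUE)
close(tc)

# for a very large column, a vectorised split plus rep() to expand the other
# columns is usually faster than looping over rows
pieces <- strsplit(big$vals, ";", fixed = TRUE)
long <- data.frame(id  = rep(big$id, sapply(pieces, length)),
                   val = unlist(pieces))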
Re: [R] help page becomes unavailable after a package is reinstalled
I can verify that I get exactly the same error (also R 2.13.1 under Ubuntu). No idea what to *do* about it, though. :-( cheers, Rolf Turner On 16/07/11 11:25, Yihui Xie wrote: Hi all, I have noticed this problem ever since R changed its static HTML help pages to dynamic help pages: when I reinstall a package and try to view any help page of this package, I always get this error (in the terminal or html page) Error in fetch(key) : internal error -3 in R_decompress1 As a package developer, I often have to reinstall a package again and again, so I wish I do not have to restart R to see the new documentation. Anybody ever met a similar situation and has an idea? Thanks! I use R 2.13.1 under Ubuntu, and it also appears in Windows 7. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] summarized data set - how to use an "occurs" field
I have a data set with 22 fields and several thousand records in which one field (count) indicates the number of times that each specific combination of the other 21 fields occurred in a bigger and largely unavailable data set. So each record is unique in its combination of field values and has a field that identifies how many multiples of this record actually occurred. Without resorting to writing a program that re-expands the data set to several million rows by cloning each row by the number of times the "count" field indicated, is there a way in R to use that field to come up with summary stats and bargraphs of the distribution of any one of the other fields? best Matthew __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Placing eps files from R into Adobe InDesign documents: specifying fontfamily
Also try using pdf() instead of postscript(). It seems to keep everything happy, and retain higher resolution. -- View this message in context: http://r.789695.n4.nabble.com/Placing-eps-files-from-R-into-Adobe-InDesign-documents-specifying-fontfamily-tp1012186p3671150.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
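For anyone taking that route, a minimal sketch of the pdf() call (the file name and font family are only examples; nothing here is specific to InDesign on the R side):

pdf("figure_for_indesign.pdf", width = 5, height = 4,
    family = "Helvetica",       # font family, specified as with postscript()
    useDingbats = FALSE)        # avoids the Dingbats substitution for plotting symbols
plot(1:10, pch = 16, main = "Example figure")
dev.off()

If the fonts need to travel with the file, embedFonts("figure_for_indesign.pdf") can embed them afterwards; it relies on Ghostscript being installed.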
[R] Multiple ggplot in a single plot
Hello friends, I have created several ggplots. I have to combine them together into a new plot. Any ideas? I am new to R. I am attaching a sample plot -- http://r.789695.n4.nabble.com/file/n3671184/1A2.jpeg Thanks in advance. -- View this message in context: http://r.789695.n4.nabble.com/Multiple-ggplot-in-a-single-plot-tp3671184p3671184.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] help page becomes unavailable after a package is reinstalled
On Fri, Jul 15, 2011 at 8:14 PM, Rolf Turner wrote: > > I can verify that I get exactly the same error (also R 2.13.1 under Ubuntu). > No idea what to *do* about it, though. :-( > You could check if text help works: options(help_type = "text") ?by or help("by", help_type = "text") -- Statistics & Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] barplot question
Start out with ?barplot. Then please tell us what your "issues" are. The barplot function is pretty flexible. If I guess that you are having difficulty simultaneously plotting one set of stacked bars and another set of non-stacked bars next to them, I would recommend two approaches. One is to play with

> barplot(first_stuff)
> par(new=TRUE)
> barplot(second_stuff)

The other, and probably better, approach is to write your data of interest into a new matrix with some zeroes added to certain columns, and use the "beside=FALSE" argument to barplot(). Carl

Sally Roman wrote: Hi - I would like to make a barplot of my data, but am having issues. An example of my data is:

species              netpair         pounds  type
Cod                  Control 1           46  kept
Little Skate         Control 1            0  kept
Summer Flounder      Control 1            9  kept
Windowpane Flounder  Control 1            0  kept
Winter Flounder      Control 1            0  kept
Winter Skate         Control 1            0  kept
Yellowtail Flounder  Control 1           76  kept
Cod                  Experimental 1      19  kept
Little Skate         Experimental 1       0  kept
Summer Flounder      Experimental 1       2  kept
Windowpane Flounder  Experimental 1       0  kept
Winter Flounder      Experimental 1       0  kept
Winter Skate         Experimental 1       0  kept
Yellowtail Flounder  Experimental 1       9  kept
Cod                  Control 1           14  discard
Little Skate         Control 1           75  discard
Summer Flounder      Control 1            1  discard
Windowpane Flounder  Control 1           32  discard
Winter Flounder      Control 1           16  discard
Winter Skate         Control 1          225  discard
Yellowtail Flounder  Control 1            7  discard
Cod                  Experimental 1       7  discard
Little Skate         Experimental 1      64  discard
Summer Flounder      Experimental 1       3  discard
Windowpane Flounder  Experimental 1      26  discard
Winter Flounder      Experimental 1      12  discard
Winter Skate         Experimental 1     136  discard
Yellowtail Flounder  Experimental 1       5  discard

I have 9 total pairs. I would like to be able to make a barplot by pair that shows the catch of the control net (kept & discard) stacked with the catch of the experimental net also stacked by species, like the image below I did in Excel. http://r.789695.n4.nabble.com/file/n3670861/image.jpg I can make barplots by net and pair, but I would like to have both nets on one barplot if possible. -- - Sent from my Cray XK6 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
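If it helps, here is a minimal sketch of the second idea with made-up numbers (two pairs only; kept and discard are stacked within each bar, and the space argument keeps each control/experimental pair together). The real values would come from reshaping Sally's table, e.g. with xtabs() or tapply().

kept    <- c(46, 19, 55, 30)               # hypothetical kept pounds: C1, E1, C2, E2
discard <- c(14,  7, 20,  9)               # hypothetical discard pounds
m <- rbind(kept, discard)                  # rows of the matrix are stacked within a bar
colnames(m) <- c("Control 1", "Exper 1", "Control 2", "Exper 2")

barplot(m, space = c(0.8, 0.2, 0.8, 0.2),  # small gap within a pair, larger gap between pairs
        legend.text = TRUE, ylab = "Pounds")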
Re: [R] Multiple ggplot in a single plot
Hi, Here is one option, though this may be a bit tricky for you if you are new. ggplot2 is based on grid graphics, so using grid you can obtain more customization. There may be easier ways and even within grid it may be possible to do more simply than I am demonstrating, I am still finding my footing with that package. Example code follows that should reproduce the attached PDF. Cheers, Josh ### require(ggplot2) ## some plots using the built in "mtcars" dataset p1 <- ggplot(melt(as.data.frame(lapply(mtcars, scale))), aes(x = variable, y = value)) + geom_boxplot() + geom_jitter() p2 <- ggplot(melt(abs(cor(mtcars))), aes(x = X1, y = X2, fill = value)) + geom_tile() p3 <- ggplot(mtcars, aes(x = wt, y = disp, colour = factor(am))) + geom_point() + geom_smooth(method = "lm", aes(group = 1), se = FALSE) p4 <- ggplot(mtcars, aes(x = hp, y = mpg, size = factor(cyl))) + geom_point() ## Start a new device with a specified size dev.new(width = 11, height = 8.5) ## use the grid package to customize the layout pushViewport(vpList( viewport(x = 0, y = .45, width = .5, height = .45, just = c("left", "bottom"), name = "p1"), viewport(x = .5, y = .45, width = .5, height = .45, just = c("left", "bottom"), name = "p2"), viewport(x = 0, y = 0, width = .5, height = .4, just = c("left", "bottom"), name = "p3"), viewport(x = .5, y = 0, width = .5, height = .45, just = c("left", "bottom"), name = "p4"), viewport(x = 0, y = .9, width = 1, height = .1, just = c("left", "bottom"), name = "title"))) ## Add the plots from ggplot2 upViewport() downViewport("p1") print(p1, newpage = FALSE) upViewport() downViewport("p2") print(p2, newpage = FALSE) upViewport() downViewport("p3") print(p3, newpage = FALSE) upViewport() downViewport("p4") print(p4, newpage = FALSE) ## add an overall title (note I left space for it and gave it its own viewport) upViewport() downViewport("title") grid.text("Four Plots created by the Excellent ggplot2 package", x = .5, gp = gpar(fontsize = 18)) On Fri, Jul 15, 2011 at 4:46 PM, hrishi wrote: > Hello friends i have to created several ggplots. > I have to combine them together to a new plot. > any ideas ?? I am new to R. > I am attaching a sample plot > -- > http://r.789695.n4.nabble.com/file/n3671184/1A2.jpeg > Thanks in Advance. > > > -- > View this message in context: > http://r.789695.n4.nabble.com/Multiple-ggplot-in-a-single-plot-tp3671184p3671184.html > Sent from the R help mailing list archive at Nabble.com. > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles https://joshuawiley.com/ multiggplot2.pdf Description: Adobe PDF document __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] summarized data set - how to use an "occurs" field
Hi: Your count variable is a frequency associated with a given row of the data set. If you're more specific about what you want and can post a representative sample of (some facsimile of) your data using dput(), the list is likely to be more helpful. See the posting guide linked at the bottom of this message for guidelines. Dennis On Fri, Jul 15, 2011 at 3:10 PM, mloxton wrote: > I have a data set with 22 fields and several thousand records in which > one field (count) indicates the number of times that each specific > combination of the other 21 fields occurred in a bigger and largely > unavailable data set. > So each record is unique in its combination of field values and has a > field that identifies how many multiples of this record actually > occurred. > > Without resorting to writing a program that re-expands the data set to > several million rows by cloning each row by the number of times the > "count" field indicated, is there a way in R to use that field to come > up with summary stats and bargraphs of the distribution of any one of > the other fields? > > best > Matthew > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] summarized data set - how to use an "occurs" field
On Jul 15, 2011, at 6:10 PM, mloxton wrote: I have a data set with 22 fields and several thousand records in which one field (count) indicates the number of times that each specific combination of the other 21 fields occurred in a bigger and largely unavailable data set. So each record is unique in its combination of field values and has a field that identifies how many multiples of this record actually occurred. Without resorting to writing a program that re-expands the data set to several million rows by cloning each row by the number of times the "count" field indicated, is there a way in R to use that field to come up with summary stats and bargraphs of the distribution of any one of the other fields? > dfrm <- expand.grid(A=1:3, B=1:3) > dfrm$counts <- 1:9 > xtabs(counts~A, data=dfrm) A 1 2 3 12 15 18 >barplot(xtabs(counts~A, data=dfrm), xlab="Counts by A level") -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
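The same idea extends to other summaries without ever expanding all 22 fields; at most the single field being summarized needs replicating (continuing with the toy dfrm above):

# weighted mean of a numeric field, using the count column as weights
weighted.mean(dfrm$A, w = dfrm$counts)

# median/quantiles: expand just the one column of interest via rep()
quantile(rep(dfrm$A, times = dfrm$counts))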
Re: [R] Add permanently environment variable
On Fri, Jul 15, 2011 at 1:45 PM, Anna Lippel wrote: > Hello everyone, I know how to add a folder path to my EV path but it only > works for the current R session. Is there a way to add it permanently? Here Yes, you can add it permanently using Windows. If you are on Windows 7, something like this should work: WindowsKey + R (to bring up the run console) powershell RET (to bring up the powershell) # Create a new variable with the path to Java $newpath = "C:\Program Files\Java\jre1.6.0_13\bin;" # add the contents of the machine path to the above $newpath += [environment]::GetEnvironmentVariable("PATH", "Machine") $newpath # verify this is correct # now set the environment variable "PATH" to the contents of $newpath [Environment]::SetEnvironmentVariable("PATH", $newpath, "Process") # Check that things look as they should (again) [Environment]::GetEnvironmentVariable("PATH", "Process") # Note that where I put "Process" you would need to put # "Machine" if you want it to be permanent # but be careful because you could really mess things up # which is why I left it at the Process level which will be trashed # when you exit that session of the powershell Another, perhaps simpler option would be to use the control panel. Searching for windows set environment variable will bring up countless guides. HTH, Josh > is my code: > Sys.setenv(PATH=paste("C:\\Program Files\\Java\\jre1.6.0_13\\bin;", > Sys.getenv(x="PATH"), sep="")) > Thanks a lot! > > -- > View this message in context: > http://r.789695.n4.nabble.com/Add-permanently-environment-variable-tp3670920p3670920.html > Sent from the R help mailing list archive at Nabble.com. > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles https://joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
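If the goal is only that future R sessions see the extra PATH entry (rather than every program on the machine), R's own startup files are another route: ~/.Renviron and ~/.Rprofile are read at the start of each session (see ?Startup). A sketch, reusing the Java path from the question:

## placed in ~/.Rprofile, this runs at the start of every R session
Sys.setenv(PATH = paste("C:\\Program Files\\Java\\jre1.6.0_13\\bin",
                        Sys.getenv("PATH"), sep = ";"))

Equivalently, a single line in ~/.Renviron such as PATH="C:\Program Files\Java\jre1.6.0_13\bin;${PATH}" achieves the same thing without any R code.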
[R] Z-test
Hi, please could you recommend an R package that computes a two-sample z-test? thanks, Bogdan [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] help page becomes unavailable after a package is reinstalled
Unfortunately, no. We can use any package to reproduce this error, e.g. rgl > library(rgl) > options(help_type = "text") > ?rgl.open # works fine > install.packages('rgl') # reinstall it Installing package(s) into ‘/home/yihui/R/x86_64-pc-linux-gnu-library/2.13’ (as ‘lib’ is unspecified) trying URL 'http://streaming.stat.iastate.edu/CRAN/src/contrib/rgl_0.92.798.tar.gz' Content type 'application/x-gzip' length 162 bytes (1.6 Mb) opened URL == downloaded 1.6 Mb * installing *source* package ‘rgl’ ... checking for gcc... gcc ** testing if installed package can be loaded * DONE (rgl) The downloaded packages are in ‘/tmp/Rtmp0NdwY4/downloaded_packages’ > ?rgl.open Error in fetch(key) : internal error -3 in R_decompress1 > sessionInfo() R version 2.13.1 (2011-07-08) Platform: x86_64-pc-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] rgl_0.92.798 loaded via a namespace (and not attached): [1] tools_2.13.1 Regards, Yihui -- Yihui Xie Phone: 515-294-2465 Web: http://yihui.name Department of Statistics, Iowa State University 2215 Snedecor Hall, Ames, IA On Fri, Jul 15, 2011 at 7:33 PM, Gabor Grothendieck wrote: > On Fri, Jul 15, 2011 at 8:14 PM, Rolf Turner wrote: >> >> I can verify that I get exactly the same error (also R 2.13.1 under Ubuntu). >> No idea what to *do* about it, though. :-( >> > You could check if text help works: > > options(help_type = "text") > ?by > > or > > help("by", help_type = "text") > > -- > Statistics & Software Consulting > GKX Group, GKX Associates Inc. > tel: 1-877-GKX-GROUP > email: ggrothendieck at gmail.com > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Z-test
Hi Bogdan, Look at ?pnorm Josh On Fri, Jul 15, 2011 at 9:10 PM, Bogdan Tanasa wrote: > Hi, > > please could you recommend a R package that computes a 2 sample z-test ? > > thanks, > > Bogdan > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles https://joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Z-test
Hi, The Z is basically: (mean(x) - mean(y))/sqrt(var(x)/length(x) + var(y)/length(y)) and pnorm will give you a p-value, if you desire it. If the n - 1 divisior used in var() is a problem for you, it is trivial to work around: X <- cbind(x, y) XX <- crossprod(X - tcrossprod(matrix(1, nrow(X))) %*% X * (1/nrow(X))) * 1/nrow(X) diff(colMeans(X))/sqrt(sum(diag(XX)/nrow(X))) where the last line gives the Z and again, pnorm() will give you a p-value if desired. In most cases a t-test is preferred (and is available using the t.test function). HTH, Josh On Fri, Jul 15, 2011 at 9:56 PM, Bogdan Tanasa wrote: > Hi Josh, > > thanks for your email. I have been looking into pnorm, but hmmm ... it does > not seem to assess the difference between 2 populations, it says > that it works on a vector of quantiles, and sd=1, mean = 0. please let me > know if you have any suggestions. thanks, > > bogdan > > On Fri, Jul 15, 2011 at 9:49 PM, Joshua Wiley > wrote: >> >> Hi Bogdan, >> >> Look at ?pnorm >> >> Josh >> >> On Fri, Jul 15, 2011 at 9:10 PM, Bogdan Tanasa wrote: >> > Hi, >> > >> > please could you recommend a R package that computes a 2 sample z-test ? >> > >> > thanks, >> > >> > Bogdan >> > >> > [[alternative HTML version deleted]] >> > >> > __ >> > R-help@r-project.org mailing list >> > https://stat.ethz.ch/mailman/listinfo/r-help >> > PLEASE do read the posting guide >> > http://www.R-project.org/posting-guide.html >> > and provide commented, minimal, self-contained, reproducible code. >> > >> >> >> >> -- >> Joshua Wiley >> Ph.D. Student, Health Psychology >> University of California, Los Angeles >> https://joshuawiley.com/ > > -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles https://joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
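To wrap that formula up as a reusable function (two-sided p-value; the name z.test2 is arbitrary, and the sketch assumes samples large enough for the normal reference to be reasonable):

z.test2 <- function(x, y) {
  z <- (mean(x) - mean(y)) / sqrt(var(x)/length(x) + var(y)/length(y))
  p <- 2 * pnorm(-abs(z))            # two-sided p-value from the standard normal
  c(z = z, p.value = p)
}

set.seed(1)
z.test2(rnorm(200, mean = 0), rnorm(200, mean = 0.3))

(If memory serves, the BSDA package also ships a ready-made z.test() function, but the few lines above avoid the extra dependency.)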
[R] R sign test for censored data
Does anyone know of a statistical test implemented in R that can do a sign test for a difference of medians, but that can handle censored data? Thanks! [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.